Package uk.ac.starlink.table.join
Class RowMatcher
- java.lang.Object
  - uk.ac.starlink.table.join.RowMatcher
public class RowMatcher extends java.lang.Object
Performs matching on the rows of one or more tables. The specifics of what constitutes a matched row, and some additional intelligence about how to determine this, are supplied by an associated MatchEngine object, but the generic parts of the matching algorithms are done here.
Note that since the LinkSets and other objects handled by this class may be very large when large tables are being matched, the algorithms in this class are coded carefully to use as little memory as possible. Techniques include removing items from one collection as they are added to another. This means that in many cases input values may be modified by the methods.
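The technique mentioned above, removing items from one collection as they are added to another, can be sketched in plain Java. This is an illustration only, not code from RowMatcher: the class and method names (DrainSketch, drain) are hypothetical, and the sketch assumes a source collection whose iterator supports remove().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical illustration; not part of uk.ac.starlink.table.join.
class DrainSketch {

    /**
     * Moves every element of source into a new list, removing each
     * element from source as soon as it has been added to the target,
     * so the two collections never both hold every element at once.
     * The caller's source collection is empty on return.
     */
    public static <T> List<T> drain(Collection<T> source) {
        List<T> target = new ArrayList<>();
        for (Iterator<T> it = source.iterator(); it.hasNext();) {
            target.add(it.next());
            it.remove();  // modifies the input collection
        }
        return target;
    }

    public static void main(String[] args) {
        // LinkedList is used so that iterator removal is cheap.
        Collection<String> input =
            new LinkedList<>(Arrays.asList("a", "b", "c"));
        List<String> output = drain(input);
        System.out.println("output = " + output);
        System.out.println("input empty = " + input.isEmpty());
    }
}
```

The practical consequence for callers is the one the class description warns about: a collection passed to such a method should not be relied on afterwards.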
- Author:
- Mark Taylor (Starlink)
-
-
Constructor Summary
RowMatcher(MatchEngine engine, StarTable[] tables)
Constructs a new matcher with match characteristics defined by a given matching engine.
-
Method Summary
LinkSet createLinkSet()
Constructs a new empty LinkSet for use by this matcher.
LinkSet findGroupMatches(MultiJoinType[] joinTypes)
Returns a set of RowLink objects corresponding to a match performed with this matcher's tables using its match engine.
LinkSet findInternalMatches(boolean includeSingles)
Returns a set of RowLink objects corresponding to all the internal matches in this matcher's sole table using its match engine.
LinkSet findMultiPairMatches(int index0, boolean bestOnly, MultiJoinType[] joinTypes)
Returns a set of RowLink objects, each of which represents matches between one of the rows of a reference table and any of the other tables which can provide matches.
LinkSet findPairMatches(PairMode pairMode)
Returns a set of RowLink objects corresponding to a pairwise match between this matcher's two tables performed with its match engine.
ProgressIndicator getIndicator()
Returns the current progress indicator for this matcher.
void setIndicator(ProgressIndicator indicator)
Sets the progress indicator for this matcher.
-
-
-
Constructor Detail
-
RowMatcher
public RowMatcher(MatchEngine engine, StarTable[] tables)
Constructs a new matcher with match characteristics defined by a given matching engine.
- Parameters:
engine - matching engine
tables - the array of tables on which matches are to be done
-
-
Method Detail
-
setIndicator
public void setIndicator(ProgressIndicator indicator)
Sets the progress indicator for this matcher.
- Parameters:
indicator - new indicator
-
getIndicator
public ProgressIndicator getIndicator()
Returns the current progress indicator for this matcher.
- Returns:
indicator
-
createLinkSet
public LinkSet createLinkSet()
Constructs a new empty LinkSet for use by this matcher. The current implementation returns one based on a SortedSet, but future implementations may provide the option of LinkSet implementations backed by disk.
- Returns:
new LinkSet
-
findPairMatches
public LinkSet findPairMatches(PairMode pairMode) throws java.io.IOException, java.lang.InterruptedException
Returns a set of RowLink objects corresponding to a pairwise match between this matcher's two tables performed with its match engine. Each element in the returned set corresponds to a matched pair with one entry from each of the input tables.
- Parameters:
pairMode - matching mode to determine which rows appear in the result
- Returns:
links representing matched rows
- Throws:
java.io.IOException
java.lang.InterruptedException
-
findMultiPairMatches
public LinkSet findMultiPairMatches(int index0, boolean bestOnly, MultiJoinType[] joinTypes) throws java.io.IOException, java.lang.InterruptedException
Returns a set of RowLink objects, each of which represents matches between one of the rows of a reference table and any of the other tables which can provide matches. Elements of the result set will be instances of PairsRowLink.
- Parameters:
index0 - index of the reference table in the list of tables owned by this row matcher
bestOnly - true if only the best match between the reference table and any other table should be retained
joinTypes - inclusion criteria for output table rows
- Returns:
set of PairsRowLink objects representing multi-pair matches
- Throws:
java.io.IOException
java.lang.InterruptedException
-
findGroupMatches
public LinkSet findGroupMatches(MultiJoinType[] joinTypes) throws java.io.IOException, java.lang.InterruptedException
Returns a set of RowLink objects corresponding to a match performed with this matcher's tables using its match engine. Each element in the returned set corresponds to a matched group of input rows, with no more than one entry from each table. Each input table row appears in no more than one RowLink in the returned set. Any number of tables can be matched.
- Parameters:
joinTypes - inclusion criteria for output table rows
- Returns:
set of RowLinks corresponding to the selected rows
- Throws:
java.io.IOException
java.lang.InterruptedException
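The two guarantees stated for findGroupMatches (no more than one entry from each table per link, and each table row in no more than one link) can be expressed as a small check. The sketch below does not use the RowLink or LinkSet classes: it assumes a stand-in representation in which each link is a list of {tableIndex, rowIndex} pairs, and the names GroupLinkCheck and isValidGroupResult are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; the {tableIndex, rowIndex} layout is a
// stand-in for a row reference, not the library's RowLink class.
class GroupLinkCheck {

    /**
     * Returns true if no link contains two entries from the same table
     * and no (table, row) pair appears in more than one link.
     */
    public static boolean isValidGroupResult(List<List<long[]>> links) {
        Set<List<Long>> seenRows = new HashSet<>();
        for (List<long[]> link : links) {
            Set<Long> tablesInLink = new HashSet<>();
            for (long[] ref : link) {
                // No more than one entry from each table within a link.
                if (!tablesInLink.add(ref[0])) {
                    return false;
                }
                // Each table row appears in no more than one link.
                if (!seenRows.add(Arrays.asList(ref[0], ref[1]))) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<long[]>> links = Arrays.asList(
            Arrays.asList(new long[] {0, 1}, new long[] {1, 5}),
            Arrays.asList(new long[] {0, 2}, new long[] {2, 7}));
        System.out.println("valid = " + isValidGroupResult(links));
    }
}
```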
-
findInternalMatches
public LinkSet findInternalMatches(boolean includeSingles) throws java.io.IOException, java.lang.InterruptedException
Returns a set of RowLink objects corresponding to all the internal matches in this matcher's sole table using its match engine.
- Parameters:
includeSingles - whether to include unmatched (singleton) row links in the returned link set
- Returns:
a set of RowLink objects giving all the groups of matched objects in this matcher's sole table
- Throws:
java.io.IOException
java.lang.InterruptedException
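To illustrate what "groups of matched objects" and the includeSingles flag could mean in practice, the following sketch groups the rows of a single table from a list of pairwise matches using a union-find structure. It assumes that rows connected through a chain of pairwise matches belong to the same group; this is an assumption made for illustration and is not taken from the RowMatcher implementation. The names InternalGroupSketch and group are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the library's algorithm.
class InternalGroupSketch {

    /** Finds the representative of row i, shortening paths as it goes. */
    private static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }

    /**
     * Groups row indices 0..nRow-1 given pairwise matches between them.
     * Unmatched rows are returned as singleton groups only if
     * includeSingles is true.
     */
    public static List<List<Integer>> group(int nRow, int[][] pairs,
                                            boolean includeSingles) {
        int[] parent = new int[nRow];
        for (int i = 0; i < nRow; i++) {
            parent[i] = i;
        }
        for (int[] p : pairs) {
            int a = find(parent, p[0]);
            int b = find(parent, p[1]);
            if (a != b) {
                // Keep the smaller index as the group representative.
                parent[Math.max(a, b)] = Math.min(a, b);
            }
        }
        Map<Integer, List<Integer>> byRoot = new TreeMap<>();
        for (int i = 0; i < nRow; i++) {
            byRoot.computeIfAbsent(find(parent, i), k -> new ArrayList<>())
                  .add(i);
        }
        List<List<Integer>> groups = new ArrayList<>();
        for (List<Integer> g : byRoot.values()) {
            if (g.size() > 1 || includeSingles) {
                groups.add(g);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        int[][] pairs = { {0, 1}, {1, 3} };
        System.out.println(group(5, pairs, false));
        System.out.println(group(5, pairs, true));
    }
}
```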
-
-