Package org.apache.lucene.search.spans
The calculus of spans.
A span is a <doc,startPosition,endPosition> tuple.
The following span query operators are implemented:
- A SpanTermQuery matches all spans containing a particular Term.
- A SpanNearQuery matches spans which occur near one another, and can be used to implement things like phrase search (when constructed from SpanTermQuerys) and inter-phrase proximity (when constructed from other SpanNearQuerys).
- A SpanOrQuery merges spans from a number of other SpanQuerys.
- A SpanNotQuery removes spans matching one SpanQuery which overlap (or come near) another. This can be used, e.g., to implement within-paragraph search.
- A SpanFirstQuery matches spans matching q whose end position is less than n. This can be used to constrain matches to the first part of the document.
- A SpanPositionRangeQuery is a more general form of SpanFirstQuery that can constrain matches to arbitrary portions of the document.
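The span tuple and two of the operators above can be modeled without Lucene in a few lines. The following is a toy sketch, not Lucene's implementation: the class SpanModel, its nested Span type, and the helpers or and not are hypothetical names, and the sketch assumes that end positions are exclusive and that "overlap" means sharing at least one position in the same document.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of the span calculus; NOT Lucene's implementation.
public class SpanModel {

    // A span is a <doc, startPosition, endPosition> tuple.
    // Assumption for this sketch: endPosition is exclusive.
    public static final class Span {
        public final int doc, start, end;

        public Span(int doc, int start, int end) {
            this.doc = doc;
            this.start = start;
            this.end = end;
        }

        // Two spans overlap when they are in the same document and share a position.
        public boolean overlaps(Span other) {
            return doc == other.doc && start < other.end && other.start < end;
        }

        @Override
        public String toString() {
            return "<" + doc + "," + start + "," + end + ">";
        }
    }

    // SpanOrQuery-like: merge spans from several sources,
    // ordered by doc, then start, then end.
    @SafeVarargs
    public static List<Span> or(List<Span>... sources) {
        List<Span> merged = new ArrayList<>();
        for (List<Span> source : sources) {
            merged.addAll(source);
        }
        merged.sort(Comparator.<Span>comparingInt(s -> s.doc)
                .thenComparingInt(s -> s.start)
                .thenComparingInt(s -> s.end));
        return merged;
    }

    // SpanNotQuery-like: keep the included spans that overlap no excluded span.
    public static List<Span> not(List<Span> include, List<Span> exclude) {
        List<Span> kept = new ArrayList<>();
        for (Span candidate : include) {
            boolean clash = false;
            for (Span excluded : exclude) {
                if (candidate.overlaps(excluded)) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

In this model, a within-paragraph search would pass the candidate matches as include and the spans of a paragraph-boundary marker as exclude, so any match that crosses a boundary is dropped.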
For example, a span query which matches "John Kerry" within ten words of "George Bush" within the first 100 words of the document could be constructed with:
    SpanQuery john   = new SpanTermQuery(new Term("content", "john"));
    SpanQuery kerry  = new SpanTermQuery(new Term("content", "kerry"));
    SpanQuery george = new SpanTermQuery(new Term("content", "george"));
    SpanQuery bush   = new SpanTermQuery(new Term("content", "bush"));

    SpanQuery johnKerry =
        new SpanNearQuery(new SpanQuery[] {john, kerry}, 0, true);

    SpanQuery georgeBush =
        new SpanNearQuery(new SpanQuery[] {george, bush}, 0, true);

    SpanQuery johnKerryNearGeorgeBush =
        new SpanNearQuery(new SpanQuery[] {johnKerry, georgeBush}, 10, false);

    SpanQuery johnKerryNearGeorgeBushAtStart =
        new SpanFirstQuery(johnKerryNearGeorgeBush, 100);
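To see why a slop of 0 with in-order matching behaves like a phrase search, it can help to model the proximity check on bare positions. The sketch below is a simplified illustration, not Lucene's algorithm: the class NearModel and the method nearInOrder are hypothetical, each span is a {start, end} pair with an exclusive end, and slop is assumed to bound the total number of positions between consecutive spans. It covers only the ordered case, so it does not model the unordered johnKerryNearGeorgeBush query.

```java
// Toy model of ordered proximity matching; NOT Lucene's implementation.
public class NearModel {

    // Each span is {start, end} within one document, with an exclusive end
    // (an assumption of this sketch).
    // Returns true when the spans appear in the given order without overlapping
    // and the total number of positions between consecutive spans is at most slop.
    public static boolean nearInOrder(int slop, int[]... spans) {
        int gaps = 0;
        for (int i = 1; i < spans.length; i++) {
            // Start of the next span minus end of the previous one.
            int gap = spans[i][0] - spans[i - 1][1];
            if (gap < 0) {
                return false; // out of order or overlapping
            }
            gaps += gap;
        }
        return gaps <= slop;
    }

    public static void main(String[] args) {
        int[] john = {5, 6};   // "john" at position 5
        int[] kerry = {6, 7};  // "kerry" at position 6
        int[] bush = {12, 13}; // "bush" at position 12

        System.out.println(nearInOrder(0, john, kerry)); // adjacent terms: a phrase match
        System.out.println(nearInOrder(0, john, bush));  // gap of 6 positions exceeds slop 0
        System.out.println(nearInOrder(10, john, bush)); // gap of 6 positions fits slop 10
    }
}
```

Under these assumptions, adjacent terms have a gap of 0, which is why the johnKerry and georgeBush queries above match only the exact phrases.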
Span queries may be freely intermixed with other Lucene queries. So, for example, the above query can be restricted to documents which also use the word "iraq" with:
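    // Both clauses are required, so each is added with Occur.MUST.
    BooleanQuery query = new BooleanQuery();
    query.add(johnKerryNearGeorgeBushAtStart, BooleanClause.Occur.MUST);
    query.add(new TermQuery(new Term("content", "iraq")), BooleanClause.Occur.MUST);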
Class Summary
- FieldMaskingSpanQuery: Wrapper to allow SpanQuery objects to participate in composite single-field SpanQueries by 'lying' about their search field.
- NearSpansOrdered: A Spans that is formed from the ordered subspans of a SpanNearQuery where the subspans do not overlap and have a maximum slop between them.
- NearSpansUnordered: Similar to NearSpansOrdered, but for the unordered case.
- SpanFirstQuery: Matches spans near the beginning of a field.
- SpanMultiTermQueryWrapper<Q extends MultiTermQuery>: Wraps any MultiTermQuery as a SpanQuery, so it can be nested within other SpanQuery classes.
- SpanMultiTermQueryWrapper.SpanRewriteMethod: Abstract class that defines how the query is rewritten.
- SpanMultiTermQueryWrapper.TopTermsSpanBooleanQueryRewrite: A rewrite method that first translates each term into a SpanTermQuery in a BooleanClause.Occur.SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query.
- SpanNearPayloadCheckQuery: Only return those matches that have a specific payload at the given position.
- SpanNearQuery: Matches spans which are near one another.
- SpanNotQuery: Removes matches which overlap with another SpanQuery or fall within x tokens before or y tokens after another SpanQuery.
- SpanOrQuery: Matches the union of its clauses.
- SpanPayloadCheckQuery: Only return those matches that have a specific payload at the given position.
- SpanPositionCheckQuery: Base class for filtering a SpanQuery based on the position of a match.
- SpanPositionRangeQuery: Checks to see if the SpanPositionCheckQuery.getMatch() lies between a start and end position.
- SpanQuery: Base class for span-based queries.
- Spans: Expert: an enumeration of span matches.
- SpanScorer: Public for extension only.
- SpanTermQuery: Matches spans containing a term.
- SpanWeight: Expert-only.
- TermSpans: Expert: Public for extension only.