Reverse Index
The reverse index maps each word to the ids of the documents that contain it.
There are two tiers of this index.
- A priority index which only indexes terms that are flagged with priority flags [1].
- A full index that indexes all terms.
The full index also provides access to term-level metadata, while the priority index is a binary index that only offers information about which documents contain a specific word.
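The relationship between the two tiers can be illustrated with a small in-memory sketch. This is not the actual implementation or its storage format: the class `TwoTierIndexSketch`, the `PRIORITY_FLAG` bit, and the use of a single `long` as term metadata are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; not the real index classes or file format.
class TwoTierIndexSketch {
    // Assumed flag bit that marks a term as a priority term.
    static final long PRIORITY_FLAG = 1L;

    // Priority tier: word -> sorted document ids, with no metadata.
    private final Map<String, TreeSet<Long>> priority = new HashMap<>();
    // Full tier: word -> (document id -> term-level metadata).
    private final Map<String, TreeMap<Long, Long>> full = new HashMap<>();

    // Every term goes into the full tier; only flagged terms reach the priority tier.
    void add(long docId, String word, long metadata) {
        full.computeIfAbsent(word, w -> new TreeMap<>()).put(docId, metadata);
        if ((metadata & PRIORITY_FLAG) != 0) {
            priority.computeIfAbsent(word, w -> new TreeSet<>()).add(docId);
        }
    }

    // Binary lookup: which documents contain the word as a priority term.
    List<Long> priorityDocuments(String word) {
        TreeSet<Long> docs = priority.get(word);
        return docs == null ? List.of() : new ArrayList<>(docs);
    }

    // Which documents contain the word at all.
    List<Long> fullDocuments(String word) {
        TreeMap<Long, Long> postings = full.get(word);
        return postings == null ? List.of() : new ArrayList<>(postings.keySet());
    }

    // Term-level metadata, available from the full tier only; null if absent.
    Long metadata(String word, long docId) {
        TreeMap<Long, Long> postings = full.get(word);
        return postings == null ? null : postings.get(docId);
    }

    public static void main(String[] args) {
        TwoTierIndexSketch index = new TwoTierIndexSketch();
        index.add(1L, "index", PRIORITY_FLAG); // flagged term in document 1
        index.add(2L, "index", 0L);            // unflagged term in document 2
        System.out.println(index.priorityDocuments("index"));
        System.out.println(index.fullDocuments("index"));
    }
}
```

In this sketch a lookup against the priority tier returns fewer documents than the same lookup against the full tier, and only the full tier can answer questions about a term's metadata.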
[1] See WordFlags in common/model and KeywordMetadata in features-convert/keyword-extraction.
Central Classes
- ReverseIndexFullConverter constructs the full index.
- ReverseIndexFullReader interrogates the full index.
- ReverseIndexPriorityConverter constructs the priority index.
- ReverseIndexPriorityReader interrogates the priority index.