(sideload) Just index based on first paragraph
This seems like it would make the Wikipedia search results worse, but it drastically improves result quality! Wikipedia has many articles that each touch on a lot of unrelated concepts, so indexing the entire document lets tangentially relevant results displace the most relevant ones.
parent f6fa8bd722
commit faa50bf578
@@ -120,6 +120,7 @@ public class EncyclopediaMarginaliaNuSideloader implements SideloadSource, AutoC
     fullHtml.append("<p>");
     fullHtml.append(part);
     fullHtml.append("</p>");
+    break; // Only take the first part, this improves accuracy a lot
 }
 fullHtml.append("</div></body></html>");
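For illustration, the effect of the added `break` can be sketched in isolation. This is a minimal sketch, not the sideloader's actual code: the class name `FirstParagraphSketch`, the method `buildHtml`, the `firstPartOnly` flag, and the opening `<html><body><div>` markup are all assumptions made for the example. Only the `<p>` wrapping, the `break`, and the closing tags come from the diff.

```java
import java.util.List;

// Hypothetical stand-alone sketch; names and the opening markup are assumptions.
class FirstParagraphSketch {

    // Wraps each part in <p>...</p>. When firstPartOnly is true, the loop
    // stops after the first part, mirroring the `break` added in this commit.
    static String buildHtml(List<String> parts, boolean firstPartOnly) {
        StringBuilder fullHtml = new StringBuilder("<html><body><div>");
        for (String part : parts) {
            fullHtml.append("<p>");
            fullHtml.append(part);
            fullHtml.append("</p>");
            if (firstPartOnly) {
                break; // Only take the first part
            }
        }
        fullHtml.append("</div></body></html>");
        return fullHtml.toString();
    }
}
```

With `firstPartOnly` set to true, an article whose parts are a lead paragraph followed by several loosely related sections yields a document containing only the lead paragraph, so only those terms reach the index.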