Commit graph

620 commits

Author SHA1 Message Date
Viktor Lofgren
616effdb3c The refactoring will continue until morale improves. 2023-03-12 10:04:48 +01:00
Viktor Lofgren
4cec89da91 Fix bug where results would sometimes be presented solely based on the fact that the document is important on the site in general, regardless of whether it's important to the document. 2023-03-11 14:20:32 +01:00
Viktor Lofgren
2e2916cebe Additional code restructuring to get rid of util and misc-style packages. 2023-03-11 13:53:36 +01:00
Viktor Lofgren
6d939175b1 Additional code restructuring to get rid of util and misc-style packages. 2023-03-11 13:48:40 +01:00
Viktor Lofgren
73e412ea5b Clean up search-service and index-api 2023-03-11 12:26:12 +01:00
Viktor Lofgren
c2f9980eba Tidy up. 2023-03-11 12:13:53 +01:00
Viktor Lofgren
0532e8c40e Tidy up. 2023-03-11 11:35:08 +01:00
Viktor Lofgren
919b80b9ab Gradle shouldn't generate dist zips, zipping jar files is slow and also just ridiculous when you realize jar files are zip files and you can't compress a file twice using the same algo. 2023-03-11 11:34:51 +01:00
Viktor Lofgren
1aee6fdc11 Fix docker dependencies warning. 2023-03-10 17:16:44 +01:00
Viktor Lofgren
a62015d5f3 Fix broken test, compiler warning. 2023-03-10 17:12:12 +01:00
Viktor Lofgren
722ff3bffb Word feature bit for words that appear in the URL, new search profile for plain text files, better plain text titles. 2023-03-10 16:46:56 +01:00
Viktor Lofgren
2bc212d65c Refactor DocumentKeyword-related classes 2023-03-09 20:41:38 +01:00
Viktor Lofgren
efb46cc703 Remove count from WordMetadata entirely. 2023-03-09 18:14:14 +01:00
Viktor Lofgren
8fb531c614 Word Metadata's count is hella broken, stopgap fix by bitCounting positions instead as this is messing with the search result ordering very badly. 2023-03-09 17:58:56 +01:00
Viktor Lofgren
9ece07d559 Chasing a result ranking bug 2023-03-09 17:52:35 +01:00
Viktor Lofgren
0ae4731cf1 Add invariant to WordMetadata 2023-03-09 17:27:07 +01:00
Viktor Lofgren
02db999762 Enable assertions in reconvert script. 2023-03-09 17:26:08 +01:00
Viktor Lofgren
2a25b5e8a9 Placeholder screenshots when the domain is missing from the database entirely. 2023-03-08 18:36:41 +01:00
Viktor Lofgren
5c1a59257c Reconvert script broken when code/ moved. 2023-03-08 17:18:04 +01:00
Viktor Lofgren
d4010c76cf Better title extraction for plain text plugin. 2023-03-07 21:53:44 +01:00
Viktor Lofgren
6fb0f77eea Improving search result scoring in index. 2023-03-07 21:53:30 +01:00
Viktor Lofgren
1252f95da5 Fix for valuation bug in index code that wouldn't sort bad-ish items properly. 2023-03-07 21:26:04 +01:00
Viktor Lofgren
f3babde415 Readme for code/ 2023-03-07 17:32:16 +01:00
Viktor Lofgren
ad1be7c835 Move all code to a code directory. 2023-03-07 17:14:32 +01:00
Viktor Lofgren
c47eb25483 Remove refuse pile logic that in practice resulted in a lot fewer results showing up for many queries. 2023-03-07 16:38:33 +01:00
Viktor Lofgren
58fcddedbb Code cleanup ForwardIndexReader 2023-03-07 16:38:03 +01:00
Viktor Lofgren
11af3f3e64 Code cleanup 2023-03-07 16:37:08 +01:00
Viktor Lofgren
549d323f6d Code cleanup 2023-03-07 16:37:05 +01:00
Viktor Lofgren
a2885acdf4 Performance optimization IndexJournalReadEntry.read 2023-03-07 16:36:44 +01:00
Viktor Lofgren
bd84c73e05 Clean up DocumentKeywordExtractor and DocumentKeywordsBuilder 2023-03-07 16:36:12 +01:00
Viktor Lofgren
04f501b8c8 Tidying up the HTML plugin. 2023-03-06 19:41:20 +01:00
Viktor Lofgren
be040419f3 Tidying up the HTML plugin. 2023-03-06 19:39:21 +01:00
Viktor Lofgren
384de2e54b Fixing LSH deduplication bug. 2023-03-06 19:32:37 +01:00
Viktor Lofgren
43f3380cb9 Refactoring converting-process 2023-03-06 19:32:25 +01:00
Viktor Lofgren
bce452fb4f More documentation... 2023-03-06 19:01:36 +01:00
Viktor Lofgren
0553174401 More documentation... 2023-03-06 18:55:28 +01:00
Viktor Lofgren
2d066af5b9 More documentation... 2023-03-06 18:45:01 +01:00
Viktor Lofgren
b945fd7f39 A lot of readmes, some refactoring. 2023-03-06 18:32:13 +01:00
Viktor Lofgren
f19c9a2863 More readmes, cleaning up dead code. 2023-03-05 19:31:43 +01:00
Viktor Lofgren
87767b14bd index service readme refers to index primitives 2023-03-05 19:16:08 +01:00
Viktor Lofgren
fe0d754f2c Make the code run properly without WMSA_HOME set, adding missing test assets. 2023-03-05 14:12:13 +01:00
Viktor Lofgren
fd1b56dbad Make the code run properly without WMSA_HOME set, adding missing test assets. 2023-03-05 13:47:40 +01:00
Viktor Lofgren
ed8ec0990e Cleaning junk from QueryParserTest 2023-03-05 13:19:26 +01:00
Viktor Lofgren
4d94a023c9 More tests for BTree, cleaned up code a bit. 2023-03-05 13:03:55 +01:00
Viktor Lofgren
96f6cd19e9 Repair integration tests 2023-03-05 12:24:12 +01:00
Viktor Lofgren
cf00963e57 Cleaning up the BTree library a bit. 2023-03-05 11:27:56 +01:00
Viktor Lofgren
4464055715 Readme for array 2023-03-04 19:19:47 +01:00
Viktor Lofgren
4972ad4c4f Readme for array 2023-03-04 19:19:12 +01:00
Viktor Lofgren
a0482273e0 Readme for array 2023-03-04 19:17:30 +01:00
Viktor Lofgren
0b7f8e1459 Readme for array 2023-03-04 19:15:51 +01:00