637 Commits

Author SHA1 Message Date
Viktor Lofgren
0682550bd2 Clean up summary extractor module. 2023-03-18 10:33:58 +01:00
Viktor Lofgren
6e89377dea Clean up summary extractor module. 2023-03-18 10:29:25 +01:00
Viktor Lofgren
950c49d80f Clean up summary extractor module. 2023-03-18 10:28:48 +01:00
Viktor Lofgren
8def95e849 Clean up summary extractor module. 2023-03-18 10:24:12 +01:00
Viktor Lofgren
43430728aa Clean up summary extractor module. 2023-03-18 10:21:41 +01:00
Viktor Lofgren
6a20b2b678 Trivial reformatting of code. 2023-03-17 22:11:14 +01:00
Viktor Lofgren
3675c7a090 The search-service doesn't speak REST. 2023-03-17 16:21:52 +01:00
Viktor Lofgren
2eb972dea1 Remove unrelated code, break tools into their own directory. 2023-03-17 16:03:11 +01:00
Viktor Lofgren
449471a076 Yet more restructuring. Improved search result ranking. 2023-03-16 21:35:54 +01:00
Viktor Lofgren
5ef17a2a20 Yet more restructuring. 2023-03-13 23:43:09 +01:00
Viktor Lofgren
0ecab53635 Yet more restructuring. 2023-03-13 23:40:26 +01:00
Viktor Lofgren
d82532b7f1 More restructuring, big bug fixes in keyword extraction. 2023-03-13 17:39:53 +01:00
Viktor Lofgren
281f1322a9 Clean up BTreeWriter 2023-03-12 12:49:49 +01:00
Viktor Lofgren
347f16939c Fix broken setup script 2023-03-12 12:21:37 +01:00
Viktor Lofgren
6e1ddca293 Fix broken mariadb setup 2023-03-12 12:11:33 +01:00
Viktor Lofgren
8b8fc49901 The refactoring will continue until morale improves. 2023-03-12 11:42:07 +01:00
Viktor Lofgren
73eaa0865d The refactoring will continue until morale improves. 2023-03-12 10:50:31 +01:00
Viktor Lofgren
616effdb3c The refactoring will continue until morale improves. 2023-03-12 10:04:48 +01:00
Viktor Lofgren
4cec89da91 Fix bug where results would sometimes be presented solely based on the fact that the document is important on the site in general, regardless of whether it's important to the document. 2023-03-11 14:20:32 +01:00
Viktor Lofgren
2e2916cebe Additional code restructuring to get rid of util and misc-style packages. 2023-03-11 13:53:36 +01:00
Viktor Lofgren
6d939175b1 Additional code restructuring to get rid of util and misc-style packages. 2023-03-11 13:48:40 +01:00
Viktor Lofgren
73e412ea5b Clean up search-service and index-api 2023-03-11 12:26:12 +01:00
Viktor Lofgren
c2f9980eba Tidy up. 2023-03-11 12:13:53 +01:00
Viktor Lofgren
0532e8c40e Tidy up. 2023-03-11 11:35:08 +01:00
Viktor Lofgren
919b80b9ab Gradle shouldn't generate dist zips, zipping jar files is slow and also just ridiculous when you realize jar files are zip files and you can't compress a file twice using the same algo. 2023-03-11 11:34:51 +01:00
Viktor Lofgren
1aee6fdc11 Fix docker dependencies warning. 2023-03-10 17:16:44 +01:00
Viktor Lofgren
a62015d5f3 Fix broken test, compiler warning. 2023-03-10 17:12:12 +01:00
Viktor Lofgren
722ff3bffb Word feature bit for words that appear in the URL, new search profile for plain text files, better plain text titles. 2023-03-10 16:46:56 +01:00
Viktor Lofgren
2bc212d65c Refactor DocumentKeyword-related classes 2023-03-09 20:41:38 +01:00
Viktor Lofgren
efb46cc703 Remove count from WordMetadata entirely. 2023-03-09 18:14:14 +01:00
Viktor Lofgren
8fb531c614 Word Metadata's count is hella broken, stopgap fix by bitCounting positions instead as this is messing with the search result ordering very badly. 2023-03-09 17:58:56 +01:00
Viktor Lofgren
9ece07d559 Chasing a result ranking bug 2023-03-09 17:52:35 +01:00
Viktor Lofgren
0ae4731cf1 Add invariant to WordMetadata 2023-03-09 17:27:07 +01:00
Viktor Lofgren
02db999762 Enable assertions in reconvert script. 2023-03-09 17:26:08 +01:00
Viktor Lofgren
2a25b5e8a9 Placeholder screenshots when the domain is missing from the database entirely. 2023-03-08 18:36:41 +01:00
Viktor Lofgren
5c1a59257c Reconvert script broken when code/ moved. 2023-03-08 17:18:04 +01:00
Viktor Lofgren
d4010c76cf Better title extraction for plain text plugin. 2023-03-07 21:53:44 +01:00
Viktor Lofgren
6fb0f77eea Improving search result scoring in index. 2023-03-07 21:53:30 +01:00
Viktor Lofgren
1252f95da5 Fix for valuation bug in index code that wouldn't sort bad-ish items properly. 2023-03-07 21:26:04 +01:00
Viktor Lofgren
f3babde415 Readme for code/ 2023-03-07 17:32:16 +01:00
Viktor Lofgren
ad1be7c835 Move all code to a code directory. 2023-03-07 17:14:32 +01:00
Viktor Lofgren
c47eb25483 Remove refuse pile logic that in practice resulted in a lot fewer results showing up for many queries. 2023-03-07 16:38:33 +01:00
Viktor Lofgren
58fcddedbb Code cleanup ForwardIndexReader 2023-03-07 16:38:03 +01:00
Viktor Lofgren
11af3f3e64 Code cleanup 2023-03-07 16:37:08 +01:00
Viktor Lofgren
549d323f6d Code cleanup 2023-03-07 16:37:05 +01:00
Viktor Lofgren
a2885acdf4 Performance optimization IndexJournalReadEntry.read 2023-03-07 16:36:44 +01:00
Viktor Lofgren
bd84c73e05 Clean up DocumentKeywordExtractor and DocumentKeywordsBuilder 2023-03-07 16:36:12 +01:00
Viktor Lofgren
04f501b8c8 Tidying up the HTML plugin. 2023-03-06 19:41:20 +01:00
Viktor Lofgren
be040419f3 Tidying up the HTML plugin. 2023-03-06 19:39:21 +01:00
Viktor Lofgren
384de2e54b Fixing LSH deduplication bug. 2023-03-06 19:32:37 +01:00