Commit Graph

810 Commits

Author SHA1 Message Date
Viktor Lofgren
810515c08d Clean up artifact extractor. 2023-04-10 13:07:54 +02:00
Viktor Lofgren
535a51a621 Repair broken year query test. 2023-04-08 12:04:09 +02:00
Viktor
a278fc6296
Increase search result relevance (#8)
* Increase accuracy of the position bits.
* Increase their width to 56.
* Use a rolling position scheme for bits 16-56 to increase the average accuracy.
* Result ranking overhaul
* Optimized queries
* BM25 in the index service's ranking
* Make gui less jank
* Javadocs for ranking parameters.
2023-04-07 20:18:08 +02:00
Viktor
f1c6525a50
Update setup.sh 2023-04-02 14:44:43 +02:00
Viktor
ace0d19973
Update README.md 2023-04-02 14:43:14 +02:00
Viktor
40b8c8c128
Update README.md 2023-04-02 14:08:17 +02:00
Viktor Lofgren
716ab35b4e Search ranking debuggability improvements. 2023-04-02 13:43:24 +02:00
Viktor Lofgren
3fb249758e Adjust result ordering. 2023-04-02 12:05:22 +02:00
Viktor Lofgren
f7a6ef2179 Smarter queries, better logging. 2023-04-02 12:05:09 +02:00
Viktor Lofgren
105d93cd85 Index query builder automatically ignores redundant predicates. 2023-04-02 12:04:26 +02:00
Viktor Lofgren
1e4157017d More helpful descriptions of index queries. 2023-04-02 12:03:58 +02:00
Viktor Lofgren
5fb75adaae Remove antique result scoring adjustment that makes no sense anymore. 2023-04-02 11:58:04 +02:00
Viktor Lofgren
affcf8cf41 Load test tool 2023-04-02 09:43:43 +02:00
Viktor Lofgren
cc4e089a5d Consider average sentence length when selecting search results. This promotes proses over code listings, tabular data, etc. 2023-03-30 15:46:15 +02:00
Viktor Lofgren
32b9c2e671 Fix SentenceExtractor jank 2023-03-30 15:45:04 +02:00
Viktor Lofgren
4d05be4095 Refactor InternalLinkGraph 2023-03-30 15:44:23 +02:00
Viktor Lofgren
137adb9c3c Bitmask calculation improvement. Take sentence length into consideration, not all lines are equal. 2023-03-30 15:42:06 +02:00
Viktor Lofgren
16e37672fc Bugfix crawl plan, doesn't use rewrite() everywhere 2023-03-30 15:41:07 +02:00
Viktor Lofgren
d0c72ceb7e Improve experiment runner, convenient start script. 2023-03-30 15:40:31 +02:00
Viktor Lofgren
0fcb2b534c Polish Names 2023-03-29 16:51:47 +02:00
Viktor Lofgren
dcf6218cdb Fix bugs related to search result selection in the case with multiple search terms.
* A deduplication filter step ran too early, and removed many good results on the basis that they partially, but did not fully fit another set of search terms.

* Altered the query creation process to prefer documents where multiple terms appear in the priority index.
2023-03-29 15:18:52 +02:00
Viktor Lofgren
8f51345a1d Add experiment runner tool and got rid of experiments module in processes. 2023-03-28 16:58:46 +02:00
Viktor Lofgren
03bd892b95 Improve document processing in conversion.
* Add flags for long and short documents.
* Break out common length logic from plugins.
* Cleaning up of related code.
2023-03-28 16:38:00 +02:00
Viktor Lofgren
1e65ac3940 Improve useful-resources.md 2023-03-28 16:35:58 +02:00
Viktor
e622437560
Create FUNDING.yml 2023-03-28 13:13:49 +02:00
Viktor Lofgren
30584887f9 DictionaryMap changes.
Add new flag to change the default size to make prod index boot faster. Remove option to select OffHeapDictionaryHashMap.
2023-03-27 17:28:39 +02:00
Viktor Lofgren
17ca4f9eea Permit search results that are all synthetic to pass relevancy check. 2023-03-27 17:27:35 +02:00
Viktor Lofgren
7fb3db3249 Fix bug where link on front page news listing wouldn't work.
... also changed order of date and source to make the UI more consistent.
2023-03-27 17:26:46 +02:00
Viktor Lofgren
b60fcd0918 Documentation improvements 2023-03-27 17:25:27 +02:00
Viktor Lofgren
862e925d7c "-Dsmall-ram=TRUE" no longer does anything. Remove references to the flag, which previously reduced the memory footprint of the loader and index service. 2023-03-26 21:37:11 +02:00
Viktor Lofgren
a0027ad32b Fix broken diagram links after doc/ restructuring. 2023-03-25 16:32:10 +01:00
Viktor Lofgren
c5f4cb34bf Documentation for DB 2023-03-25 16:14:16 +01:00
Viktor
2e69179f12
Update readme.md 2023-03-25 15:47:45 +01:00
Viktor
19000ab339
Create readme.md 2023-03-25 15:46:19 +01:00
Viktor
be3ba3ef37
Update readme.md 2023-03-25 15:27:11 +01:00
Viktor
ac1ac3ea57
Move database to a separate module
* Move database to a separate project, break apart sql file into separate entities.
* Fix front page news listing.
2023-03-25 15:26:17 +01:00
Viktor
0b505939ed
Update features-convert/readme.md 2023-03-25 12:43:58 +01:00
Viktor
d2a9e1b644
Add processes link to readme.md for code/common 2023-03-25 12:42:44 +01:00
Viktor Lofgren
3464ca514b Fix typeahead suggestions 2023-03-25 10:20:52 +01:00
Viktor Lofgren
2f2c86a9f5 Fix bug where WmsaHome wouldn't look in /var/lib/wmsa as a fallback 2023-03-25 10:20:52 +01:00
Viktor
45dd9fea25
Update readme.md 2023-03-22 17:15:36 +01:00
Viktor
c974d72e7e
Update readme.md 2023-03-22 17:09:48 +01:00
Viktor
e3675d2fa9
Update readme.md 2023-03-22 17:02:03 +01:00
Viktor
c4a6bf7672
Update readme.md 2023-03-22 17:01:34 +01:00
Viktor
5edc0c8d52
Add files via upload 2023-03-22 17:00:01 +01:00
Viktor
cb6865924e
Update readme.md 2023-03-22 16:59:38 +01:00
Viktor Lofgren
ee50f7422d CONTRIBUTING.md 2023-03-22 15:27:20 +01:00
Viktor Lofgren
964014860a Get suggestions working again 2023-03-22 15:11:22 +01:00
Viktor Lofgren
7c58ddce81 readme.md 2023-03-22 15:10:30 +01:00
Viktor Lofgren
611ba2d35a Break apart WordPatterns class 2023-03-22 15:10:17 +01:00