Viktor Lofgren
d82a858491
Don't consider slash to be a sentence separator.
2023-05-31 16:54:30 +02:00
Viktor
7694a15f62
Fix kale's unreasonably high weighting factor
2023-04-22 20:55:09 +02:00
Viktor Lofgren
619fb8ba80
(converter) Adjust the order of the pub-date sniffing heuristics. Checking HTML5 tags too early in the sequence dates some sites too early. Also expanded support for JSON+LD.
2023-04-19 15:28:50 +02:00
Viktor Lofgren
810515c08d
Clean up artifact extractor.
2023-04-10 13:07:54 +02:00
Viktor
a278fc6296
Increase search result relevance (#8)
* Increase accuracy of the position bits.
* Increase their width to 56.
* Use a rolling position scheme for bits 16-56 to increase the average accuracy.
* Result ranking overhaul
* Optimized queries
* BM25 in the index service's ranking
* Make gui less jank
* Javadocs for ranking parameters.
2023-04-07 20:18:08 +02:00
Viktor Lofgren
716ab35b4e
Search ranking debuggability improvements.
2023-04-02 13:43:24 +02:00
Viktor Lofgren
137adb9c3c
Bitmask calculation improvement. Take sentence length into consideration, not all lines are equal.
2023-03-30 15:42:06 +02:00
Viktor Lofgren
0fcb2b534c
Polish Names
2023-03-29 16:51:47 +02:00
Viktor
0b505939ed
Update features-convert/readme.md
2023-03-25 12:43:58 +01:00
Viktor Lofgren
46f81aca2f
Break apart reverse index into a separate full index and priority index. It did this before using the same code. This will make the priority index about half as big since it no longer needs to keep metadata.
2023-03-21 16:12:31 +01:00
Viktor Lofgren
624e8acd41
Remove copy-pasted application plugin from subprojects that define features.
2023-03-20 17:25:58 +01:00
Viktor Lofgren
0682550bd2
Clean up summary extractor module.
2023-03-18 10:33:58 +01:00
Viktor Lofgren
6e89377dea
Clean up summary extractor module.
2023-03-18 10:29:25 +01:00
Viktor Lofgren
950c49d80f
Clean up summary extractor module.
2023-03-18 10:28:48 +01:00
Viktor Lofgren
8def95e849
Clean up summary extractor module.
2023-03-18 10:24:12 +01:00
Viktor Lofgren
43430728aa
Clean up summary extractor module.
2023-03-18 10:21:41 +01:00
Viktor Lofgren
2eb972dea1
Remove unrelated code, break tools into their own directory.
2023-03-17 16:03:11 +01:00
Viktor Lofgren
449471a076
Yet more restructuring. Improved search result ranking.
2023-03-16 21:35:54 +01:00
Viktor Lofgren
0ecab53635
Yet more restructuring.
2023-03-13 23:40:26 +01:00
Viktor Lofgren
d82532b7f1
More restructuring, big bug fixes in keyword extraction.
2023-03-13 17:39:53 +01:00