Commit Graph

136 Commits

Author SHA1 Message Date
Viktor Lofgren
f92d8a0975 EdgeUrl conversion to/from java.net.URL 2023-06-27 10:57:54 +02:00
Viktor Lofgren
5abaf13192 Fix serialization bug with CompressedBigString 2023-06-27 10:57:54 +02:00
Viktor Lofgren
bd2c3855ed Add bits and keywords for generator classes (docs, forum, wiki). 2023-06-23 21:35:28 +02:00
Viktor Lofgren
54c2be893b TRIVIAL: Remove unused import. 2023-06-22 17:21:47 +02:00
Viktor Lofgren
b5ef67ed28 Categorize generators by type
This is a great quality signal!
Add the type as document bitflags by category.
2023-06-22 16:04:37 +02:00
Viktor Lofgren
9455100907 Throw a custom exception when WMSA_HOME isn't found 2023-06-20 11:37:52 +02:00
Viktor Lofgren
d1a004bea6 (minor) Clean up StringPool 2023-06-19 17:58:19 +02:00
Viktor Lofgren
2cda57355a More word metadata tests 2023-05-28 11:57:06 +02:00
Viktor Lofgren
d42ab19166 Issue 5: Fix bug where some IPv6 addresses blew up domain loading. 2023-04-15 14:11:08 +02:00
Viktor Lofgren
2ab26f37b8 Bug fix for document metadata encoding that breaks year based queries. 2023-04-14 16:56:49 +02:00
Viktor
a278fc6296 Increase search result relevance (#8)
* Increase accuracy of the position bits.
* Increase their width to 56.
* Use a rolling position scheme for bits 16-56 to increase the average accuracy.
* Result ranking overhaul
* Optimized queries
* BM25 in the index service's ranking
* Make gui less jank
* Javadocs for ranking parameters.
2023-04-07 20:18:08 +02:00
Viktor Lofgren
716ab35b4e Search ranking debuggability improvements. 2023-04-02 13:43:24 +02:00
Viktor Lofgren
cc4e089a5d Consider average sentence length when selecting search results. This promotes prose over code listings, tabular data, etc. 2023-03-30 15:46:15 +02:00
Viktor Lofgren
03bd892b95 Improve document processing in conversion.
* Add flags for long and short documents.
* Break out common length logic from plugins.
* Cleaning up of related code.
2023-03-28 16:38:00 +02:00
Viktor Lofgren
c5f4cb34bf Documentation for DB 2023-03-25 16:14:16 +01:00
Viktor
be3ba3ef37 Update readme.md 2023-03-25 15:27:11 +01:00
Viktor
ac1ac3ea57 Move database to a separate module
* Move database to a separate project, break apart sql file into separate entities.
* Fix front page news listing.
2023-03-25 15:26:17 +01:00
Viktor
d2a9e1b644 Add processes link to readme.md for code/common 2023-03-25 12:42:44 +01:00
Viktor Lofgren
2f2c86a9f5 Fix bug where WmsaHome wouldn't look in /var/lib/wmsa as a fallback 2023-03-25 10:20:52 +01:00
Viktor Lofgren
964014860a Get suggestions working again 2023-03-22 15:11:22 +01:00
Viktor Lofgren
46f81aca2f Break apart reverse index into a separate full index and priority index. It did this before using the same code. This will make the priority index about half as big since it no longer needs to keep metadata. 2023-03-21 16:12:31 +01:00
Viktor Lofgren
ca22c287a5 Make use of DocumentFlags' flags 2023-03-21 16:03:15 +01:00
Viktor Lofgren
72115e490f Put news into a database table instead of keeping them hardcoded, request counter on front page. 2023-03-19 12:54:58 +01:00
Viktor Lofgren
bdd2b4a43e Put news into a database table instead of keeping them hardcoded. 2023-03-19 11:46:13 +01:00
Viktor Lofgren
2eb972dea1 Remove unrelated code, break tools into their own directory. 2023-03-17 16:03:11 +01:00
Viktor Lofgren
449471a076 Yet more restructuring. Improved search result ranking. 2023-03-16 21:35:54 +01:00
Viktor Lofgren
616effdb3c The refactoring will continue until morale improves. 2023-03-12 10:04:48 +01:00
Viktor Lofgren
4cec89da91 Fix bug where results would sometimes be presented solely based on the fact that the document is important on the site in general, regardless of whether it's important to the document. 2023-03-11 14:20:32 +01:00
Viktor Lofgren
2e2916cebe Additional code restructuring to get rid of util and misc-style packages. 2023-03-11 13:53:36 +01:00
Viktor Lofgren
6d939175b1 Additional code restructuring to get rid of util and misc-style packages. 2023-03-11 13:48:40 +01:00
Viktor Lofgren
722ff3bffb Word feature bit for words that appear in the URL, new search profile for plain text files, better plain text titles. 2023-03-10 16:46:56 +01:00
Viktor Lofgren
efb46cc703 Remove count from WordMetadata entirely. 2023-03-09 18:14:14 +01:00
Viktor Lofgren
8fb531c614 Word Metadata's count is hella broken, stopgap fix by bitCounting positions instead as this is messing with the search result ordering very badly. 2023-03-09 17:58:56 +01:00
Viktor Lofgren
9ece07d559 Chasing a result ranking bug 2023-03-09 17:52:35 +01:00
Viktor Lofgren
0ae4731cf1 Add invariant to WordMetadata 2023-03-09 17:27:07 +01:00
Viktor Lofgren
ad1be7c835 Move all code to a code directory. 2023-03-07 17:14:32 +01:00