Commit Graph

80 Commits

Author SHA1 Message Date
Viktor Lofgren
c130d7cf5f (*) Use traefik instead of nginx for reverse proxy 2023-10-24 14:44:19 +02:00
Viktor Lofgren
81dd3809e9 (*) WIP Add node affinity to EC_DOMAIN
Very messy commit due to fractalline yak shaving
2023-10-19 17:48:34 +02:00
Viktor Lofgren
93122bdd18 (run) Add two nodes to the demo setup 2023-10-16 17:37:26 +02:00
Viktor Lofgren
16e0738731 (*) Get multi-node routing working. 2023-10-15 18:38:30 +02:00
Viktor Lofgren
4baf9527d7 (*) WIP Control GUI redesign, executor-service, multi-node mq
This turned out to be very difficult to do in small isolated steps.

* Design overhaul of the control gui using bootstrap
* Move the actors out of control-service into a new executor-service that can be run on multiple nodes
* Add node-affinity to message queue
2023-10-14 12:08:43 +02:00
Viktor Lofgren
6319b8ef51 (api-service) Improved testability, always set content type to application/json 2023-10-09 15:39:34 +02:00
Viktor
8e1abc3f10
(index-reverse) Parallel construction of the reverse indexes. (#52)
* (index-reverse) Parallel construction of the reverse indexes.

* (array) Remove wasteful calculation of numDistinct before merging two sorted arrays.

* (index-reverse) Force changes to disk on close, reduce logging.

* (index-reverse) Clean up merging process and add back logging

* (run) Add a conservative default for INDEX_CONSTRUCTION_PROCESS_OPTS's parallelism as it eats a lot of RAM

* (index-reverse)  Better logging during processing

* (array) 2GB+ compatible write() function

* (index-reverse) We are logging like Bolsonaro and I will not have it.

* (reverse-index) Self-diagnostics

* (btree) Fix bug in btree reader to do with large data sizes
2023-10-07 10:00:00 +02:00
Viktor Lofgren
4c26674ff4 (setup) Use mirrored lid.176.ftz file that is of a compatible version 2023-10-03 10:29:44 +02:00
Viktor Lofgren
23be648456 (setup) use curl instead of wget for setup.sh 2023-10-02 16:38:23 +02:00
Viktor Lofgren
d160954080 (index) Two useful debug endpoints 2023-09-24 19:39:48 +02:00
Viktor Lofgren
dbe9235f3a (*) Upgrade to JDK21 with preview enabled.
... also move some common configuration into the root build.gradle-file.

Support for JDK21 in Lombok is a bit sketchy at the moment, but it seems to work. This upgrade is kind of important, as the new index construction really benefits from Arena-based lifecycle control over off-heap memory.
2023-09-24 10:38:59 +02:00
Viktor Lofgren
c68d17d482 (keyword-extraction) Fix bug leading to position data missing on some keywords.
This was due to a discrepancy between the KeywordPositionBitmask and WordsTfIdfCounts' concept of a keyword.
2023-09-02 14:48:55 +02:00
Viktor Lofgren
2b00cd632d (process) Propagate environment JVM params to the index constructor 2023-09-01 15:39:42 +02:00
Viktor
bdcbfb11a8
Merge pull request #42 from MarginaliaSearch/no-downtime-upgrades
Zero downtime upgrades, merge-based index construction
2023-08-29 17:05:48 +02:00
Viktor Lofgren
fa87c7e1b7 (process) Automatic flightrecorder runs for processes when run in docker. 2023-08-29 14:12:51 +02:00
Viktor Lofgren
194a6057dd (index,control) Recoverable index backups 2023-08-25 14:57:43 +02:00
Viktor
229c63c46d
Update readme.md 2023-08-24 13:27:24 +02:00
Viktor Lofgren
b958acb76a (file-storage) New File Storage type for linkdb 2023-08-24 09:06:13 +02:00
Viktor Lofgren
e8c0648e04 Fix missing vol/ss dir in setup.sh 2023-08-23 17:59:40 +02:00
Viktor Lofgren
8bd9a00c38 Amend setup instructions with command 2023-08-23 14:02:21 +00:00
Viktor Lofgren
972d03efdf Fix error in run/readme where it suggested local dev environment uses HTTPS 2023-08-23 13:47:39 +00:00
Viktor Lofgren
2656fcfe2c (conf) Remove unnecessary JVM flags for processes 2023-08-17 17:42:47 +02:00
Viktor Lofgren
46d761f34f (language) fasttext based language filter 2023-08-16 15:48:12 +02:00
Viktor Lofgren
c56ee10185 (control) Separate [Process] and [Process and Load] actions for crawl data; all SLOW data is deletable. 2023-08-13 13:39:59 +02:00
Viktor
69b28fd07d
Update readme.md 2023-08-12 18:58:21 +02:00
Viktor
99884c2c7e
Update readme.md 2023-08-12 15:39:28 +02:00
Viktor Lofgren
a42f707b2d (docs) Update readme with up to date instructions 2023-08-11 13:43:00 +02:00
Viktor Lofgren
eef37927ba (docs) Update readme with up to date instructions 2023-08-11 13:42:14 +02:00
Viktor Lofgren
cdfe284f9a (file storage) File Storage Type for EXPORT data
2023-08-05 14:45:03 +02:00
Viktor Lofgren
ba724bc1b2 (scripts|docs) Update scripts and documentation for the new operator's gui and file storage workflows. 2023-08-01 22:47:37 +02:00
Viktor Lofgren
483c2dbb44 (conf) Change default user-agent to not associate it with the project; remove unused disks.properties file. 2023-08-01 17:34:25 +02:00
Viktor Lofgren
58556af6c7 (db) Use flyway for database migrations. 2023-08-01 17:08:42 +02:00
Viktor Lofgren
9786f82220 Fix environment variables to processes so jmc works 2023-07-31 10:32:23 +02:00
Viktor Lofgren
6ff7e9648f (crawler) Use and pass the proper environment variables to the processes. 2023-07-30 16:54:02 +02:00
Viktor Lofgren
a56953c798 (converter, WIP) Refactor converter to not have to load everything into RAM. 2023-07-24 15:25:09 +02:00
Viktor Lofgren
995657c6ce (big-string) Make big-string disable:able 2023-07-21 19:50:35 +02:00
Viktor Lofgren
f91d92cccb (crawler) WIP 2023-07-20 21:05:16 +02:00
Viktor Lofgren
d7ab21fe34 (*) Refactor Control Service and processes 2023-07-17 21:20:31 +02:00
Viktor Lofgren
8b74e3aa0d (*) File Storage WIP 2023-07-14 17:08:10 +02:00
Viktor Lofgren
7087ab5f07 (run) Reduce nginx access log noise for local setup 2023-07-11 23:11:34 +02:00
Viktor Lofgren
77261a38cd (control, WIP) MQFSM and ProcessService are sitting in a tree
We're spawning processes from the MQFSM in control service now!
2023-07-11 17:08:43 +02:00
Viktor Lofgren
2283ceb77d (control) WIP control service 2023-07-10 18:58:43 +02:00
Viktor Lofgren
62cc9df206 Embryo of new control process
* New events and heartbeat tables in mariadb
* Refactored to a cleaner Service interface
2023-07-03 10:40:32 +02:00
Viktor Lofgren
a9fabba407 Tell experiment runner to only process some domains.
Updated the experiment runner, as well as the script.
2023-06-20 14:14:01 +02:00
Viktor
f1c6525a50
Update setup.sh 2023-04-02 14:44:43 +02:00
Viktor Lofgren
d0c72ceb7e Improve experiment runner, convenient start script. 2023-03-30 15:40:31 +02:00
Viktor Lofgren
8f51345a1d Add experiment runner tool and got rid of experiments module in processes. 2023-03-28 16:58:46 +02:00
Viktor Lofgren
862e925d7c "-Dsmall-ram=TRUE" no longer does anything. Remove references to the flag, which previously reduced the memory footprint of the loader and index service. 2023-03-26 21:37:11 +02:00
Viktor Lofgren
964014860a Get suggestions working again 2023-03-22 15:11:22 +01:00
Viktor Lofgren
d82532b7f1 More restructuring, big bug fixes in keyword extraction. 2023-03-13 17:39:53 +01:00