Commit Graph

105 Commits

Author SHA1 Message Date
root
cde3a885cb Add CatBot UA, Expose to Web, Update mariadb to jammy,Add 161.sh domains 2024-03-13 20:38:05 +01:00
Viktor Lofgren
081c7d22bc Fix typo in install.sh 2024-01-25 17:08:18 +01:00
Viktor Lofgren
6aee896657 (*) Add single-node barebones configuration
This adds a single-node barebones configuration to the install script.  It also moves the log4j configuration into system.properties, and sets assertions to disabled by default.
2024-01-25 16:40:28 +01:00
Viktor Lofgren
562012fb22 (doc) Migrate documentation https://docs.marginalia.nu/ 2024-01-22 19:40:08 +01:00
Viktor Lofgren
4c62065e74 (install) Add two separate templates for the install script
One template is for the full Marginalia Search style install, and the other is for a barebones install with no Marginalia-related fluff.
2024-01-13 18:27:42 +01:00
Viktor Lofgren
7c6e18f7a7 (*) Overhaul settings and properties
Use a system.properties file to configure the system.  This is loaded statically by MainClass or ProcessMainClass.  Update the property names to be more consistent, and update the documentations to reflect the changes.
2024-01-13 17:12:18 +01:00
Viktor Lofgren
264e2db539 (control) UX-improvements for control service
This commit overhauls a lot of the UX for the control service, adding a new actions menu to the nodes views.  It has many small tweaks to make the work flow better.

It also adds a new /uploads directory in each index node, from which sideloaded data can be selected.  This is a bit of a breaking change, as this directory needs to exist in each index node.
2024-01-12 12:33:05 +01:00
Viktor Lofgren
734996002c (*) install script for deploying Marginalia outside the codebase
The changeset also makes the control service responsible for flyway migrations.  This helps reduce the number of places the database configuration needs to be spread out.  These automatic migrations can be disabled with -DdisableFlyway=true.

The commit also adds curl to the docker container, to enable docker health checks and interdependencies.
2024-01-11 12:40:03 +01:00
Viktor Lofgren
205e5016e8 (docs) Document barebones config 2024-01-11 09:43:08 +01:00
Viktor Lofgren
116595d218 (prometheus) Add in-docker prometheus instance to exfiltrate metrics from the docker-based services 2024-01-02 14:28:53 +01:00
Viktor
5bd3934d22
Merge pull request #64 from dreimolo/macos_AS_fix
Macos apple silicon fix, and slight improvements to sample downloader
2023-12-18 18:29:14 +01:00
Viktor Lofgren
128f550ee5 (run) Download to a temporary file to avoid corruption from aborted downloads 2023-12-18 18:28:17 +01:00
Viktor Lofgren
c92f1b8df8 (geo-ip) Revert removal of ip2location logic
We do both ip2location and ASN data.

The change also adds some keywords based on autonomous system information, on a somewhat experimental basis.  It would be neat to be able to e.g. exclude cloud services or just e.g. cloudflare from the search results.
2023-12-17 15:03:00 +01:00
Viktor Lofgren
d7bd540683 (*) Replace the ip2location IP geolocation data with ASN information from apnic.net.
Doesn't really make sense to use ip2location as a middle man for information that is already freely available...
2023-12-16 21:55:04 +01:00
dreimolo
62954f98de adds xl to help output 2023-12-16 19:41:41 +01:00
dreimolo
6c7d7427bf Adds check for wget and curl, and valid sample archives 2023-12-16 14:14:58 +01:00
Viktor Lofgren
2f4500be5a (search) New frontend look 2023-12-02 17:06:40 +01:00
Viktor Lofgren
21d6aa421c (docs) Update setup instructions 2023-11-30 21:44:29 +01:00
Viktor Lofgren
8a5b853fae Fix experiment runner 2023-11-15 14:03:17 +01:00
Viktor Lofgren
ef16502159 (doc) Update readme 2023-11-07 16:00:18 +01:00
Viktor Lofgren
3047e2dd7c (screenshot-capture-tool) Make screenshot-capture-tool cooperate with docker 2023-11-01 16:38:55 +01:00
Viktor Lofgren
5d6e0e3790 (log) Clean up logging
Don't log the PROCESS stream to executor's logs, as it will also be logged in the spawned process' log files.

Also tell the spawned process which "service" it is so that it gets a log file with a name that makes sense.
2023-10-29 15:52:17 +01:00
Viktor Lofgren
f6fcb04817 (experiment) Repair the experiment runner 2023-10-27 16:16:50 +02:00
Viktor Lofgren
b8796d825d (docs) Update documentation 2023-10-27 13:24:49 +02:00
Viktor Lofgren
0f637fb722 (logging) Better logging configurations 2023-10-26 12:48:10 +02:00
Viktor Lofgren
c130d7cf5f (*) Use trafeik instead of nginx for reverse proxy 2023-10-24 14:44:19 +02:00
Viktor Lofgren
81dd3809e9 (*) WIP Add node affinity to EC_DOMAIN
Very messy commit due to fractalline yak shaving
2023-10-19 17:48:34 +02:00
Viktor Lofgren
93122bdd18 (run) Add two nodes to the demo setup 2023-10-16 17:37:26 +02:00
Viktor Lofgren
16e0738731 (*) Get multi-node routing working. 2023-10-15 18:38:30 +02:00
Viktor Lofgren
4baf9527d7 (*) WIP Control GUI redesign, executor-service, multi-node mq
This turned out to be very difficult to do in small isolated steps.

* Design overhaul of the control gui using bootstrap
* Move the actors out of control-service into to a new executor-service, that can be run on multiple nodes
* Add node-affinity to message queue
2023-10-14 12:08:43 +02:00
Viktor Lofgren
6319b8ef51 (api-service) Improved testability, always set content type to application/json 2023-10-09 15:39:34 +02:00
Viktor
8e1abc3f10
(index-reverse) Parallel construction of the reverse indexes. (#52)
* (index-reverse) Parallel construction of the reverse indexes.

* (array) Remove wasteful calculation of numDistinct before merging two sorted arrays.

* (index-reverse)  Force changes to disk on close, reduce logging.

* (index-reverse)  Clean up merging process and add back logging

* (run)  Add a conservative default for INDEX_CONSTRUCTION_PROCESS_OPTS's parallelism as it eats a lot of RAM

* (index-reverse)  Better logging during processing

* (array) 2GB+ compatible write() function

* (array) 2GB+ compatible write() function

* (index-reverse) We are logging like Bolsonaro and I will not have it.

* (reverse-index) Self-diagnostics

* (btree) Fix bug in btree reader to do with large data sizes
2023-10-07 10:00:00 +02:00
Viktor Lofgren
4c26674ff4 (setup) Use mirrored lid.176.ftz file that is of a compatible version 2023-10-03 10:29:44 +02:00
Viktor Lofgren
23be648456 (setup) use curl instead of wget for setup.sh 2023-10-02 16:38:23 +02:00
Viktor Lofgren
d160954080 (index) Two useful debug endpoints 2023-09-24 19:39:48 +02:00
Viktor Lofgren
dbe9235f3a (*) Upgrade to JDK21 with preview enabled.
... also move some common configuration into the root build.gradle-file.

Support for JDK21 in lombok is a bit sketchy at the moment, but it seems to work.  This upgrade is kind of important as the new index construction really benefits from Arena based lifecycle control over off-heap memory.
2023-09-24 10:38:59 +02:00
Viktor Lofgren
c68d17d482 (keyword-extraction) Fix bug leading to position data missing on some keywords.
This was due to a discrepancy between the KeywordPositionBitmask and WordsTfIdfCounts' concept of a keyword.
2023-09-02 14:48:55 +02:00
Viktor Lofgren
2b00cd632d (process) Propagate environment JVM params to the index constructor 2023-09-01 15:39:42 +02:00
Viktor
bdcbfb11a8
Merge pull request #42 from MarginaliaSearch/no-downtime-upgrades
Zero downtime upgrades, merge-based index construction
2023-08-29 17:05:48 +02:00
Viktor Lofgren
fa87c7e1b7 (process) Automatic flightrecorder runs for processes when run in docker. 2023-08-29 14:12:51 +02:00
Viktor Lofgren
194a6057dd (index,control) Recoverable index backups 2023-08-25 14:57:43 +02:00
Viktor
229c63c46d
Update readme.md 2023-08-24 13:27:24 +02:00
Viktor Lofgren
b958acb76a (file-storage) New File Storage type for linkdb 2023-08-24 09:06:13 +02:00
Viktor Lofgren
e8c0648e04 Fix missing vol/ss dir in setup.sh 2023-08-23 17:59:40 +02:00
Viktor Lofgren
8bd9a00c38 Amend setup instructions with command 2023-08-23 14:02:21 +00:00
Viktor Lofgren
972d03efdf Fix error in run/readme where it suggested local dev environment uses HTTPS 2023-08-23 13:47:39 +00:00
Viktor Lofgren
2656fcfe2c (conf) Remove unnecessary JVM flags for processes 2023-08-17 17:42:47 +02:00
Viktor Lofgren
46d761f34f (language) fasttext based language filter 2023-08-16 15:48:12 +02:00
Viktor Lofgren
c56ee10185 (control) Separate [Process] and [Process and Load] actions for crawl data; all SLOW data is deletable. 2023-08-13 13:39:59 +02:00
Viktor
69b28fd07d
Update readme.md 2023-08-12 18:58:21 +02:00