
Run

When developing locally, this directory will contain run-time data required for the search engine. In a clean check-out, it only contains the tools required to bootstrap this directory structure.

Requirements

While the system is designed to run on bare metal in production, you're strongly encouraged to use Docker or Podman for local development. These are a bit of a pain to install, but if you follow this guide you're on the right track.
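
Before starting, it can help to confirm that one of the two container runtimes is actually installed. The following is a minimal sketch, not part of the repository: the function name find_container_runtime is made up for illustration, and it only checks that a docker or podman executable is on the PATH, not that the daemon is running or that docker-compose is available.

```shell
# Hypothetical helper (not shipped with the project): print the name of the
# first container runtime found on the PATH, preferring docker over podman.
find_container_runtime() {
  for candidate in docker podman; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  echo "neither docker nor podman found on PATH" >&2
  return 1
}
```

Calling find_container_runtime prints either docker or podman, or returns a non-zero status with a message on stderr if neither is installed.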

Set up

To go from a clean check out of the git repo to a running search engine, follow these steps. The commands assume your working directory is the project root throughout.

  1. Run the one-time setup. It creates the basic runtime directory structure and downloads some models and data that don't come with the git repo, because git deals poorly with large binary files.
$ run/setup.sh
  2. Compile the project and build the docker images.
$ ./gradlew assemble docker
  3. Initialize the database.
$ docker-compose up -d mariadb
$ ./gradlew flywayMigrate
  4. Bring the system online. We'll run it in the foreground in the terminal this time because it's educational to see the logs. Add -d to run in the background.
$ docker-compose up
  5. You should now be able to access the system.
Address                  Description
https://localhost:8080/  User-facing GUI
https://localhost:8081/  Operator's GUI
  6. Download sample data.

TODO: How?
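
The setup commands above (everything except the still-undocumented sample data step) can be chained into one small script. This is a sketch under assumptions, not something the repository provides: the names bootstrap, run and DRY_RUN are invented for illustration, and only the five commands themselves come from the steps above. It stops at the first failing command, and it does not wait for MariaDB to finish starting, so the migration may need to be re-run if the database is not yet accepting connections.

```shell
# Hypothetical wrapper around the documented setup steps.
# Set DRY_RUN=1 to print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# The body runs in a subshell so that 'set -eu' does not leak into the
# calling shell. Must be called from the project root.
bootstrap() (
  set -eu
  run run/setup.sh                  # one-time setup: directories, models, data
  run ./gradlew assemble docker     # compile and build docker images
  run docker-compose up -d mariadb  # start the database in the background
  run ./gradlew flywayMigrate       # initialize the database schema
  run docker-compose up             # bring the system online in the foreground
)
```

To preview what would be executed, set DRY_RUN=1 and call bootstrap; without DRY_RUN the last command stays in the foreground and shows the logs, as described in the bring-online step.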

Experiment Runner

The script experiment.sh is a launcher for the experiment runner, which is useful when evaluating new algorithms for processing crawl data.