CatgirlIntelligenceAgency/run

Run

When developing locally, this directory will contain run-time data required for the search engine. In a clean check-out, it only contains the tools required to bootstrap this directory structure.

Requirements

The system is designed to run on bare metal in production, but for local development you're strongly encouraged to use Docker or Podman. These are a bit of a pain to install, but if you follow this guide you're on the right track.
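
Before going further, it can be worth confirming that one of the two container runtimes is actually on your PATH. The following is a minimal sketch; the helper name `first_available` is made up for illustration and is not part of the repo.

```shell
# Print the first command from the argument list that exists on PATH.
# Returns non-zero if none of them is installed.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example: prefer docker, fall back to podman.
if runtime=$(first_available docker podman); then
    echo "Using container runtime: $runtime"
else
    echo "Neither docker nor podman was found on PATH" >&2
fi
```

Note that the commands further down call `docker-compose` directly, so this check only tells you whether a runtime is installed, not whether the compose tooling is.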

Set up

To go from a clean check-out of the git repo to a running search engine, follow these steps. All commands assume that you stay in the project root the whole time.

  1. Run the one-time setup. It creates the basic runtime directory structure and downloads some models and data that don't come with the git repo.
$ run/setup.sh
  2. Compile the project and build the docker images.
$ ./gradlew assemble docker
  3. Download a sample of crawl data, process it, and load the metadata into the database. The data is only downloaded once, but this step needs to be repeated whenever the crawler or processor has changed. Grab a cup of coffee, this takes a few minutes.
$ docker-compose up -d mariadb
$ run/reconvert.sh
  4. Bring the system online. We'll run it in the foreground in the terminal this time because it's educational to see the logs. Add -d to run it in the background.
$ docker-compose up
  5. Since we've just processed new crawl data, the system needs to construct static indexes. Wait for the line 'Auto-conversion finished!' to appear in the logs.
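
Taken together, the steps above could be scripted roughly as follows. This is only a sketch: the function names `step`, `wait_for_line` and `bootstrap` and the `DRY_RUN` variable are invented for illustration, and it assumes that `docker-compose logs -f` streams the same log lines you would otherwise see in the foreground. Unlike the walkthrough, it starts the system with -d and then blocks until the 'Auto-conversion finished!' line shows up.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical switch).
step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '$ %s\n' "$*"
    else
        "$@"
    fi
}

# Read stdin until a line containing the fixed string $1 appears.
# Returns non-zero if stdin ends without a match.
wait_for_line() {
    grep -q -F -- "$1"
}

bootstrap() {
    # The guide assumes the project root; bail out early otherwise.
    if [ ! -f run/setup.sh ] || [ ! -f gradlew ]; then
        echo "Run this from the project root" >&2
        return 1
    fi

    step run/setup.sh                  || return 1  # one-time setup
    step ./gradlew assemble docker     || return 1  # compile, build images
    step docker-compose up -d mariadb  || return 1  # database for reconvert
    step run/reconvert.sh              || return 1  # fetch and process sample data
    step docker-compose up -d          || return 1  # bring the system online

    if [ "${DRY_RUN:-0}" != "1" ]; then
        # Follow the logs until the static indexes have been constructed.
        docker-compose logs -f | wait_for_line 'Auto-conversion finished!'
    fi
}
```

Calling `DRY_RUN=1 bootstrap` prints the five commands without running anything, which is a cheap way to check the order. In a real run the log-following pipeline may linger briefly after the match, until `docker-compose logs` writes its next line and notices the closed pipe.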

When all is done, it should be possible to visit http://localhost:8080 and try a few searches!
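
If you'd rather check from the terminal than from a browser, something along these lines should work. The helper name `wait_for_http` and its retry defaults are arbitrary choices for this sketch; only the URL comes from the text above.

```shell
# Poll a URL until it answers with a successful HTTP status.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
    url=$1
    attempts=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: fail on HTTP errors, -s: quiet, -o /dev/null: discard the body
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_http http://localhost:8080 && echo "search engine is up"` retries for about a minute with the defaults before giving up.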