diff --git a/README.md b/README.md
index 58f84e55..6a53f817 100644
--- a/README.md
+++ b/README.md
@@ -13,12 +13,21 @@ The long term plan is to refine the search engine so that it provide enough publ
 that the project can be funded through grants, donations and commercial API licenses
 (non-commercial share-alike is always free).
 
+The system can be run either as a copy of Marginalia Search, or as a white-label search engine
+for your own data (either crawled or side-loaded). At present the logic isn't very configurable, and a lot of the judgements
+made are based on the Marginalia project's goals, but additional configurability is being
+worked on!
+
 ## Set up
 
-Start by running [⚙️ run/setup.sh](run/setup.sh). This will download supplementary model data that is necessary to run the code.
+To set up a local test environment, follow the instructions in [📄 run/readme.md](run/readme.md)!
+
+Further documentation is available at [🌎 https://docs.marginalia.nu/](https://docs.marginalia.nu/).
+
+Before compiling, it's necessary to run [⚙️ run/setup.sh](run/setup.sh).
+This will download supplementary model data that is necessary to run the code.
 These are also necessary to run the tests.
-To set up a local test environment, follow the instructions in [📄 run/readme.md](run/readme.md)!
 
 ## Hardware Requirements
 
diff --git a/doc/crawling.md b/doc/crawling.md
deleted file mode 100644
index c1086ac5..00000000
--- a/doc/crawling.md
+++ /dev/null
@@ -1,115 +0,0 @@
-# Crawling
-
-## WARNING
-
-Please don't run the crawler unless you intend to actually operate a public-facing
-search engine! For testing, use crawl sets from [downloads.marginalia.nu](https://downloads.marginalia.nu/) instead;
-or, if you wish to play with the crawler, crawl a small set of domains whose owners are
-OK with it: your own, your friends', or any subdomain of marginalia.nu.
-
-See the documentation in run/ for more information on how to load sample data!
-
-Reckless crawling annoys webmasters and makes it harder to run an independent search engine.
-Crawling from a domestic IP address is also likely to put you on a greylist
-of probable bots. You will solve CAPTCHAs for almost every website you visit
-for weeks, and may be permanently blocked from a few IPs.
-
-## Prerequisites
-
-You probably want to run a local bind resolver to speed up DNS lookups and reduce the amount of
-DNS traffic.
-
-These processes require a lot of disk space. It's strongly recommended to use a dedicated disk for
-the index storage subdirectory; it doesn't need to be extremely fast, but it should be a few terabytes in size.
-
-It should be mounted with `noatime`. It may be a good idea to format the disk with a block size of 4096 bytes. This will reduce the amount of disk space used by the crawler.
-
-Make sure you configure the user-agent properly. This will be used to identify the crawler,
-and is matched against the robots.txt file. The crawler will not crawl sites that don't allow it.
-See [wiki://Robots_exclusion_standard](https://en.wikipedia.org/wiki/Robots_exclusion_standard) for more information
-about robots.txt; the user agent can be configured in conf/properties/system.properties; see the
-[system-properties](system-properties.md) documentation for more information.
-
-## Setup
-
-Ensure that the system is running and go to http://localhost:8081.
-
-With the default test configuration, the system is configured to
-store data in `node-1/storage`.
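The prerequisites above call for configuring the user agent in `conf/properties/system.properties` before crawling. As a minimal, illustrative sketch — the property names come from the system-properties reference that is also removed in this diff, while the values are placeholder assumptions, not recommended defaults:

```properties
# conf/properties/system.properties -- illustrative values only
# Full user agent string sent with every request
crawler.userAgentString=search.example.com crawler (+https://search.example.com/about)
# Identifier matched against User-agent rules in robots.txt
crawler.userAgentIdentifier=search.example.com
```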
-
-## Fresh Crawl
-
-While a running search engine can use the link database to figure out which websites to visit, a clean
-system does not know of any links. To bootstrap a crawl, a crawl specification needs to be created to
-seed the domain database.
-
-Go to `Nodes->Node 1->Actions->New Crawl`
-
-![img](images/new_crawl.png)
-
-Click the link that says 'New Spec' to arrive at a form for creating a new specification:
-
-![img](images/new_spec.png)
-
-Fill out the form with a description and a link to a domain list. The domain list is a text file
-with one domain per line, with blank lines and comments starting with `#` ignored. You can use
-GitHub raw links for this purpose. For test purposes, you can use this link:
-`https://downloads.marginalia.nu/domain-list-test.txt`, which will create a crawl for a few
-of marginalia.nu's subdomains.
-
-If you aren't redirected there automatically, go back to the `New Crawl` page under Node 1 -> Actions.
-Your new specification should now be listed.
-
-Check the box next to it, and click `[Trigger New Crawl]`.
-
-![img](images/new_crawl2.png)
-
-This will start the crawling process. Crawling may take a while, depending on the size
-of the domain list and the size of the websites.
-
-![img](images/crawl_in_progress.png)
-
-Eventually a progress bar will show up, and the crawl will start. When it reaches 100%, the crawl is done.
-You can also monitor the `Events Summary` table on the same page to see what happened after the fact.
-
-It is expected that the crawl will stall out toward the end of the process. This is a statistical effect, since
-the largest websites take the longest to finish and tend to be the ones lingering at 99% or so completion. The
-crawler has a timeout of 5 hours: if no new domains finish crawling within that window, it will stop, to prevent crawler traps
-from stalling the crawl indefinitely.
-
-**Be sure to read the section on re-crawling!**
-
-## Converting
-
-Once the crawl is done, the data needs to be processed before it's searchable. This is done by going to
-`Nodes->Node 1->Actions->Process Crawl Data`.
-
-![Conversion screenshot](images/convert.png)
-
-This will start the conversion process. This will again take a while, depending on the size of the crawl.
-The progress bar will show the progress. When it reaches 100%, the conversion is done, and the data will begin
-loading automatically. A cascade of actions is performed in sequence, leading to the data being loaded into the
-search engine and an index being constructed. This is all automatic, but depending on the size of the crawl data,
-may take a while.
-
-When an event `INDEX-SWITCH-OK` is logged in the `Event Summary` table, the data is ready to be searched.
-
-## Re-crawling
-
-The workflow with a crawl spec was a one-off process to bootstrap the search engine. To keep the search engine up to date,
-it is preferable to do a re-crawl. This will try to reduce the amount of data that needs to be fetched.
-
-To trigger a re-crawl, go to `Nodes->Node 1->Actions->Re-crawl`. This will bring you to a page that looks similar to the
-first crawl page, where you can select a set of crawl data to use as a source. Select the crawl data you want, and
-press `[Trigger Recrawl]`.
-
-Crawling will proceed as before, but this time the crawler will try to fetch only the data that has changed since the
-last crawl, growing the number of fetched documents by a configurable percentage. This will typically be much faster than the initial crawl.
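The crawl specification form described in the Fresh Crawl section above expects a domain list: one domain per line, with blank lines and `#` comments ignored. A small illustrative list might look like this (the domains are placeholders):

```text
# Seed domains for a small test crawl
www.example.com
docs.example.com

# A friend's site that has agreed to be crawled
blog.example.org
```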
- -### Growing the crawl set - -The re-crawl will also pull new domains from the `New Domains` dataset, which is an URL configurable in -`[Top Menu] -> System -> Data Sets`. If a new domain is found, it will be assigned to the present node, and crawled in -the re-crawl. - -![Datasets screenshot](images/datasets.png) diff --git a/doc/images/convert.png b/doc/images/convert.png deleted file mode 100644 index 82ec7343..00000000 Binary files a/doc/images/convert.png and /dev/null differ diff --git a/doc/images/convert_2.png b/doc/images/convert_2.png deleted file mode 100644 index c5adb27d..00000000 Binary files a/doc/images/convert_2.png and /dev/null differ diff --git a/doc/images/crawl_in_progress.png b/doc/images/crawl_in_progress.png deleted file mode 100644 index ceb39056..00000000 Binary files a/doc/images/crawl_in_progress.png and /dev/null differ diff --git a/doc/images/datasets.png b/doc/images/datasets.png deleted file mode 100644 index a5bf0d87..00000000 Binary files a/doc/images/datasets.png and /dev/null differ diff --git a/doc/images/load_warc.png b/doc/images/load_warc.png deleted file mode 100644 index 5e0cedde..00000000 Binary files a/doc/images/load_warc.png and /dev/null differ diff --git a/doc/images/new_crawl.png b/doc/images/new_crawl.png deleted file mode 100644 index ae905cd6..00000000 Binary files a/doc/images/new_crawl.png and /dev/null differ diff --git a/doc/images/new_crawl2.png b/doc/images/new_crawl2.png deleted file mode 100644 index cc85acbe..00000000 Binary files a/doc/images/new_crawl2.png and /dev/null differ diff --git a/doc/images/new_spec.png b/doc/images/new_spec.png deleted file mode 100644 index 8b466e87..00000000 Binary files a/doc/images/new_spec.png and /dev/null differ diff --git a/doc/images/sideload_menu.png b/doc/images/sideload_menu.png deleted file mode 100644 index 6a85d076..00000000 Binary files a/doc/images/sideload_menu.png and /dev/null differ diff --git a/doc/images/sideload_warc.png b/doc/images/sideload_warc.png deleted file mode 100644 index dd763efc..00000000 Binary files a/doc/images/sideload_warc.png and /dev/null differ diff --git a/doc/readme.md b/doc/readme.md index bbc64105..082b14e7 100644 --- a/doc/readme.md +++ b/doc/readme.md @@ -3,12 +3,11 @@ A lot of the architectural description is sprinkled into the code repository closer to the code. Start in [📁 ../code/](../code/) and poke around. +Operational documentation is available at [🌎 https://docs.marginalia.nu/](https://docs.marginalia.nu/). + ## Operations -* [System Properties](system-properties.md) - JVM property flags - ## How-To -* [Sideloading How-To](sideloading-howto.md) - How to sideload various data sets * [Parquet How-To](parquet-howto.md) - Useful tips in working with Parquet files ## Set-up diff --git a/doc/sideloading-howto.md b/doc/sideloading-howto.md deleted file mode 100644 index 93a44981..00000000 --- a/doc/sideloading-howto.md +++ /dev/null @@ -1,211 +0,0 @@ -# Sideloading How-To - -Some websites are much larger than others, this includes -Wikipedia, Stack Overflow, and a few others. They are so -large they are impractical to crawl in the traditional fashion, -but luckily they make available data dumps that can be processed -and loaded into the search engine through other means. - -To this end, it's possible to sideload data into the search engine -from other sources than the web crawler. - -## Index Nodes - -In practice, if you want to sideload data, you need to do it on -a separate index node. Index nodes are separate instances of the -index software. 
The default configuration is to have two index nodes, -one for the web crawler, and one for sideloaded data. - -The need for a separate node is due to incompatibilities in the work flows. - -It is also a good idea in general, as very large domains can easily be so large that the entire time budget -for the query is spent sifting through documents from that one domain, this is -especially true with something like Wikipedia, which has a lot of documents at -least tangentially related to any given topic. - -This how-to assumes that you are operating on index-node 2. - -## Notes on the upload directory - -This is written assuming that the system is installed with the `install.sh` -script, which deploys the system with docker-compose, and has a directory -structure like - -``` -... -index-1/backup/ -index-1/index/ -index-1/storage/ -index-1/uploads/ -index-1/work/ -index-2/backup/ -index-2/index/ -index-2/storage/ -index-2/uploads/ -index-2/work/ -... -``` - -We're going to be putting files in the **uploads** directories. If you have installed -the system in some other way, or changed the configuration significantly, you need -to adjust the paths accordingly. - -## Sideloading - -The sideloading actions are available through Actions menu in each node. - -![Sideload menu](images/sideload_menu.png) - -## Sideloading WARCs - -WARC files are the standard format for web archives. They can be created e.g. with wget. -The Marginalia software can read WARC files directly, and sideload them into the index, -as long as each warc file contains only one domain. - -Let's for example archive www.marginalia.nu (I own this domain, so feel free to try this at home) - -```bash -$ wget -r --warc-file=marginalia www.marginalia.nu -``` - -**Note** If you intend to do this on other websites, you should probably add a `--wait` parameter to wget, -e.g. `wget --wait=1 -r --warc-file=...` to avoid hammering the website with requests and getting blocked. - -This will take a moment, and create a file called `marginalia.warc.gz`. We move it to the -upload directory of the index node, and sideload it through the Actions menu. - -```bash -$ mkdir -p index-2/uploads/marginalia-warc -$ mv marginalia.warc.gz index-2/uploads/marginalia-warc -``` - -Go to the Actions menu, and select the "Sideload WARC" action. This will show a list of -subdirectories in the Uploads directory. Select the directory containing the WARC file, and -click "Sideload". - -![Sideload WARC screenshot](images/sideload_warc.png) - -This should take you to the node overview, where you can see the progress of the sideloading. -It will take a moment, as the WARC file is being processed. - -![Processing in progress](images/convert_2.png) - -It will not be loaded automatically. This is to permit you to sideload multiple sources. - -When you are ready to load it, go to the Actions menu, and select "Load Crawl Data". - -![Load Crawl Data](images/load_warc.png) - -Select all the sources you want to load, and click "Load". This will load the data into the -index, and make it available for searching. - -## Sideloading Wikipedia - -Due to licensing incompatibilities with OpenZim's GPL-2 and AGPL, the workflow -depends on using the conversion process from [https://encyclopedia.marginalia.nu/](https://encyclopedia.marginalia.nu/) -to pre-digest the data. 
- -Build the [encyclopedia.marginalia.nu Code](https://github.com/MarginaliaSearch/encyclopedia.marginalia.nu) -and follow the instructions for downloading a ZIM file, and then run something like - -```$./encyclopedia convert file.zim articles.db``` - -This db-file can be processed and loaded into the search engine through the -Actions view. - -FIXME: It will currently only point to en.wikipedia.org, this should be -made configurable. - - -## Sideloading a directory tree - -For relatively small websites, ad-hoc side-loading is available directly from a -folder structure on the hard drive. This is intended for loading manuals, -documentation and similar data sets that are large and slowly changing. - -A website can be archived with wget, like this - -```bash -UA="search.marginalia.nu" \ -DOMAIN="www.example.com" \ -wget -nc -x --continue -w 1 -r -U ${UA} -A "html" ${DOMAIN} -``` - -After doing this to a bunch of websites, create a YAML file something like this: - -```yaml -sources: -- name: jdk-20 - dir: "jdk-20/" - domainName: "docs.oracle.com" - baseUrl: "https://docs.oracle.com/en/java/javase/20/docs" - keywords: - - "java" - - "docs" - - "documentation" - - "javadoc" -- name: python3 - dir: "python-3.11.5/" - domainName: "docs.python.org" - baseUrl: "https://docs.python.org/3/" - keywords: - - "python" - - "docs" - - "documentation" -- name: mariadb.com - dir: "mariadb.com/" - domainName: "mariadb.com" - baseUrl: "https://mariadb.com/" - keywords: - - "sql" - - "docs" - - "mariadb" - - "mysql" -``` - -|parameter|description| -|----|----| -|name|Purely informative| -|dir|Path of website contents relative to the location of the yaml file| -|domainName|The domain name of the website| -|baseUrl|This URL will be prefixed to the contents of `dir`| -|keywords|These supplemental keywords will be injected in each document| - -The directory structure corresponding to the above might look like - -``` -docs-index.yaml -jdk-20/ -jdk-20/resources/ -jdk-20/api/ -jdk-20/api/[...] -jdk-20/specs/ -jdk-20/specs/[...] -jdk-20/index.html -mariadb.com -mariadb.com/kb/ -mariadb.com/kb/[...] -python-3.11.5 -python-3.11.5/genindex-B.html -python-3.11.5/library/ -python-3.11.5/distutils/ -python-3.11.5/[...] -[...] -``` - -This yaml-file can be processed and loaded into the search engine through the -Actions view. - - -## Sideloading Stack Overflow/Stackexchange - -Stackexchange makes dumps available on Archive.org. These are unfortunately on a format that -needs some heavy-handed pre-processing before they can be loaded. A tool is available for -this in [tools/stackexchange-converter](../code/tools/stackexchange-converter). - -After running `gradlew dist`, this tool is found in `build/dist/stackexchange-converter`, -follow the instructions in the stackexchange-converter readme, and -convert the stackexchange xml.7z-files to sqlite db-files. - -A directory with such db-files can be processed and loaded into the -search engine through the Actions view. \ No newline at end of file diff --git a/doc/system-properties.md b/doc/system-properties.md deleted file mode 100644 index 0c825e7b..00000000 --- a/doc/system-properties.md +++ /dev/null @@ -1,42 +0,0 @@ -# System Properties - -These are JVM system properties used by each service. These properties can either -be loaded from a file or passed in as command line arguments, using `$JAVA_OPTS`. - -The system will look for a properties file in `conf/properties/system.properties`, -within the install dir, as specified by `$WMSA_HOME`. 
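As described just above, these properties can also be passed as command-line arguments through `$JAVA_OPTS` rather than the properties file. A hedged sketch using a few of the flags from the tables that follow; the specific values are assumptions for illustration:

```shell
# Illustrative only: override a few system properties via $JAVA_OPTS
export JAVA_OPTS="-Dcrawler.poolSize=32 -Dblacklist.disable=true -Dconverter.sideloadThreshold=10000"
```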
- -A template is available in [../run/template/conf/properties/system.properties](../run/template/conf/properties/system.properties). - -## Global - -| flag | values | description | -|-------------|------------|--------------------------------------| -| blacklist.disable | boolean | Disables the IP blacklist | -| flyway.disable | boolean | Disables automatic Flyway migrations | - -## Crawler Properties - -| flag | values | description | -|------------------------------|------------|---------------------------------------------------------------------------------------------| -| crawler.userAgentString | string | Sets the user agent string used by the crawler | -| crawler.userAgentIdentifier | string | Sets the user agent identifier used by the crawler, e.g. what it looks for in robots.txt | -| crawler.poolSize | integer | Sets the number of threads used by the crawler, more is faster, but uses more RAM | -| crawler.initialUrlsPerDomain | integer | Sets the initial number of URLs to crawl per domain (when crawling from spec) | -| crawler.maxUrlsPerDomain | integer | Sets the maximum number of URLs to crawl per domain (when recrawling) | -| crawler.minUrlsPerDomain | integer | Sets the minimum number of URLs to crawl per domain (when recrawling) | -| crawler.crawlSetGrowthFactor | double | If 100 documents were fetched last crawl, increase the goal to 100 x (this value) this time | -| ip-blocklist.disabled | boolean | Disables the IP blocklist | - -## Converter Properties - -| flag | values | description | -|-----------------------------|------------|----------------------------------------------------------------------------------------------------------------------------------------------------------| -| converter.sideloadThreshold | integer | Threshold value, in number of documents per domain, where a simpler processing method is used which uses less RAM. 10,000 is a good value for ~32GB RAM | - -# Marginalia Application Specific - -| flag | values | description | -|---------------------------|------------|---------------------------------------------------------------| -| search.websiteUrl | string | Overrides the website URL used in rendering | -| control.hideMarginaliaApp | boolean | Hides the Marginalia application from the control GUI results | diff --git a/docker-compose-barebones.yml b/docker-compose-barebones.yml deleted file mode 100644 index 9f1b3783..00000000 --- a/docker-compose-barebones.yml +++ /dev/null @@ -1,181 +0,0 @@ -# This is the barebones docker-compose file for the Marginalia Search Engine. -# -# It starts a stripped-down version of the search engine, with only the essential -# services running, including the database, the query service, the control service, -# and a single index and executor node. -# -# It is a good starting point for setting up a white-label search engine that does not -# have Marginalia's GUI. The Query Service presents a simple search box, that also talks -# JSON, so you can use it as a backend for your own search interface. 
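To sanity-check a barebones deployment like the one this file describes, something along these lines should work — a sketch, assuming the port mappings defined further down in this file, where Traefik publishes the search entrypoint on 127.0.0.1:8080:

```shell
# Start the stripped-down stack and verify the query service answers
docker-compose -f docker-compose-barebones.yml up -d
curl -s http://localhost:8080/ | head
```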
- - -x-svc: &service - env_file: - - "run/env/service.env" - volumes: - - conf:/wmsa/conf:ro - - model:/wmsa/model - - data:/wmsa/data - - logs:/var/log/wmsa - networks: - - wmsa - depends_on: - - mariadb - labels: - - "__meta_docker_port_private=7000" -x-p1: &partition-1 - env_file: - - "run/env/service.env" - volumes: - - conf:/wmsa/conf:ro - - model:/wmsa/model - - data:/wmsa/data - - logs:/var/log/wmsa - - index-1:/idx - - work-1:/work - - backup-1:/backup - - samples-1:/storage - - uploads-1:/uploads - networks: - - wmsa - depends_on: - - mariadb - environment: - - "WMSA_SERVICE_NODE=1" - -services: - index-service-1: - <<: *partition-1 - image: "marginalia/index-service" - container_name: "index-service-1" - executor-service-1: - <<: *partition-1 - image: "marginalia/executor-service" - container_name: "executor-service-1" - query-service: - <<: *service - image: "marginalia/query-service" - container_name: "query-service" - expose: - - 80 - labels: - - "traefik.enable=true" - - "traefik.http.routers.search-service.rule=PathPrefix(`/`)" - - "traefik.http.routers.search-service.entrypoints=search" - - "traefik.http.routers.search-service.middlewares=add-xpublic" - - "traefik.http.routers.search-service.middlewares=add-public" - - "traefik.http.middlewares.add-xpublic.headers.customrequestheaders.X-Public=1" - - "traefik.http.middlewares.add-public.addprefix.prefix=/public" - control-service: - <<: *service - image: "marginalia/control-service" - container_name: "control-service" - expose: - - 80 - labels: - - "traefik.enable=true" - - "traefik.http.routers.control-service.rule=PathPrefix(`/`)" - - "traefik.http.routers.control-service.entrypoints=control" - - "traefik.http.routers.control-service.middlewares=add-xpublic" - - "traefik.http.routers.control-service.middlewares=add-public" - - "traefik.http.middlewares.add-xpublic.headers.customrequestheaders.X-Public=1" - - "traefik.http.middlewares.add-public.addprefix.prefix=/public" - mariadb: - image: "mariadb:lts" - container_name: "mariadb" - env_file: "run/env/mariadb.env" - command: ['mysqld', '--character-set-server=utf8mb4', '--collation-server=utf8mb4_unicode_ci'] - ports: - - "127.0.0.1:3306:3306/tcp" - healthcheck: - test: mysqladmin ping -h 127.0.0.1 -u $$MARIADB_USER --password=$$MARIADB_PASSWORD - start_period: 5s - interval: 5s - timeout: 5s - retries: 60 - volumes: - - db:/var/lib/mysql - - "./code/common/db/src/main/resources/sql/current/:/docker-entrypoint-initdb.d/" - networks: - - wmsa - traefik: - image: "traefik:v2.10" - container_name: "traefik" - command: - #- "--log.level=DEBUG" - - "--api.insecure=true" - - "--providers.docker=true" - - "--providers.docker.exposedbydefault=false" - - "--entrypoints.search.address=:80" - - "--entrypoints.control.address=:81" - ports: - - "127.0.0.1:8080:80" - - "127.0.0.1:8081:81" - - "127.0.0.1:8090:8080" - volumes: - - "/var/run/docker.sock:/var/run/docker.sock:ro" - networks: - - wmsa -networks: - wmsa: -volumes: - db: - driver: local - driver_opts: - type: none - o: bind - device: run/db - logs: - driver: local - driver_opts: - type: none - o: bind - device: run/logs - model: - driver: local - driver_opts: - type: none - o: bind - device: run/model - conf: - driver: local - driver_opts: - type: none - o: bind - device: run/conf - data: - driver: local - driver_opts: - type: none - o: bind - device: run/data - samples-1: - driver: local - driver_opts: - type: none - o: bind - device: run/node-1/samples - index-1: - driver: local - driver_opts: - type: none - o: bind - 
device: run/node-1/index - work-1: - driver: local - driver_opts: - type: none - o: bind - device: run/node-1/work - backup-1: - driver: local - driver_opts: - type: none - o: bind - device: run/node-1/backup - uploads-1: - driver: local - driver_opts: - type: none - o: bind - device: run/node-1/uploads \ No newline at end of file diff --git a/docker-compose.yml b/docker-compose.yml deleted file mode 100644 index 63c54f7f..00000000 --- a/docker-compose.yml +++ /dev/null @@ -1,315 +0,0 @@ -# This is the full docker-compose.yml file for the Marginalia Search Engine. -# -# It starts all the services, including the GUI, the database, the query service, -# two nodes for demo purposes, as well as a bunch of peripheral services that are -# application specific. -# - -x-svc: &service - env_file: - - "run/env/service.env" - volumes: - - conf:/wmsa/conf:ro - - model:/wmsa/model - - data:/wmsa/data - - logs:/var/log/wmsa - networks: - - wmsa - labels: - - "__meta_docker_port_private=7000" -x-p1: &partition-1 - env_file: - - "run/env/service.env" - volumes: - - conf:/wmsa/conf:ro - - model:/wmsa/model - - data:/wmsa/data - - logs:/var/log/wmsa - - index-1:/idx - - work-1:/work - - backup-1:/backup - - samples-1:/storage - - uploads-1:/uploads - networks: - - wmsa - depends_on: - - mariadb - environment: - - "WMSA_SERVICE_NODE=1" -x-p2: &partition-2 - env_file: - - "run/env/service.env" - volumes: - - conf:/wmsa/conf:ro - - model:/wmsa/model - - data:/wmsa/data - - logs:/var/log/wmsa - - index-2:/idx - - work-2:/work - - backup-2:/backup - - samples-2:/storage - - uploads-2:/uploads - networks: - - wmsa - depends_on: - mariadb: - condition: service_healthy - environment: - - "WMSA_SERVICE_NODE=2" - -services: - index-service-1: - <<: *partition-1 - image: "marginalia/index-service" - container_name: "index-service-1" - executor-service-1: - <<: *partition-1 - image: "marginalia/executor-service" - container_name: "executor-service-1" - index-service-2: - <<: *partition-2 - image: "marginalia/index-service" - container_name: "index-service-2" - executor-service-2: - <<: *partition-2 - image: "marginalia/executor-service" - container_name: "executor-service-2" - query-service: - <<: *service - image: "marginalia/query-service" - container_name: "query-service" - search-service: - <<: *service - image: "marginalia/search-service" - container_name: "search-service" - expose: - - 80 - labels: - - "traefik.enable=true" - - "traefik.http.routers.search-service.rule=PathPrefix(`/`)" - - "traefik.http.routers.search-service.entrypoints=search" - - "traefik.http.routers.search-service.middlewares=add-xpublic" - - "traefik.http.routers.search-service.middlewares=add-public" - - "traefik.http.middlewares.add-xpublic.headers.customrequestheaders.X-Public=1" - - "traefik.http.middlewares.add-public.addprefix.prefix=/public" - assistant-service: - <<: *service - image: "marginalia/assistant-service" - container_name: "assistant-service" - expose: - - 80 - labels: - - "traefik.enable=true" - - "traefik.http.routers.assistant-service-screenshot.rule=PathPrefix(`/screenshot`)" - - "traefik.http.routers.assistant-service-screenshot.entrypoints=search,dating" - - "traefik.http.routers.assistant-service-screenshot.middlewares=add-xpublic" - - "traefik.http.routers.assistant-service-screenshot.middlewares=add-public" - - "traefik.http.routers.assistant-service-suggest.rule=PathPrefix(`/suggest`)" - - "traefik.http.routers.assistant-service-suggest.entrypoints=search" - - 
"traefik.http.routers.assistant-service-suggest.middlewares=add-xpublic" - - "traefik.http.routers.assistant-service-suggest.middlewares=add-public" - - "traefik.http.middlewares.add-xpublic.headers.customrequestheaders.X-Public=1" - - "traefik.http.middlewares.add-public.addprefix.prefix=/public" - api-service: - <<: *service - image: "marginalia/api-service" - container_name: "api-service" - expose: - - "80" - labels: - - "traefik.enable=true" - - "traefik.http.routers.api-service.rule=PathPrefix(`/`)" - - "traefik.http.routers.api-service.entrypoints=api" - - "traefik.http.routers.api-service.middlewares=add-xpublic" - - "traefik.http.routers.api-service.middlewares=add-public" - - "traefik.http.middlewares.add-xpublic.headers.customrequestheaders.X-Public=1" - - "traefik.http.middlewares.add-public.addprefix.prefix=/public" - dating-service: - <<: *service - image: "marginalia/dating-service" - container_name: "dating-service" - expose: - - 80 - labels: - - "traefik.enable=true" - - "traefik.http.routers.dating-service.rule=PathPrefix(`/`)" - - "traefik.http.routers.dating-service.entrypoints=dating" - - "traefik.http.routers.dating-service.middlewares=add-xpublic" - - "traefik.http.routers.dating-service.middlewares=add-public" - - "traefik.http.middlewares.add-xpublic.headers.customrequestheaders.X-Public=1" - - "traefik.http.middlewares.add-public.addprefix.prefix=/public" - explorer-service: - <<: *service - image: "marginalia/explorer-service" - container_name: "explorer-service" - expose: - - 80 - labels: - - "traefik.enable=true" - - "traefik.http.routers.explorer-service.rule=PathPrefix(`/`)" - - "traefik.http.routers.explorer-service.entrypoints=explore" - - "traefik.http.routers.explorer-service.middlewares=add-xpublic" - - "traefik.http.routers.explorer-service.middlewares=add-public" - - "traefik.http.middlewares.add-xpublic.headers.customrequestheaders.X-Public=1" - - "traefik.http.middlewares.add-public.addprefix.prefix=/public" - control-service: - <<: *service - image: "marginalia/control-service" - container_name: "control-service" - expose: - - 80 - labels: - - "traefik.enable=true" - - "traefik.http.routers.control-service.rule=PathPrefix(`/`)" - - "traefik.http.routers.control-service.entrypoints=control" - - "traefik.http.routers.control-service.middlewares=add-xpublic" - - "traefik.http.routers.control-service.middlewares=add-public" - - "traefik.http.middlewares.add-xpublic.headers.customrequestheaders.X-Public=1" - - "traefik.http.middlewares.add-public.addprefix.prefix=/public" - mariadb: - image: "mariadb:lts" - container_name: "mariadb" - env_file: "run/env/mariadb.env" - command: ['mysqld', '--character-set-server=utf8mb4', '--collation-server=utf8mb4_unicode_ci'] - ports: - - "127.0.0.1:3306:3306/tcp" - healthcheck: - test: mysqladmin ping -h 127.0.0.1 -u $$MARIADB_USER --password=$$MARIADB_PASSWORD - start_period: 5s - interval: 5s - timeout: 5s - retries: 60 - volumes: - - db:/var/lib/mysql - - "./code/common/db/src/main/resources/sql/current/:/docker-entrypoint-initdb.d/" - networks: - - wmsa - traefik: - image: "traefik:v2.10" - container_name: "traefik" - command: - #- "--log.level=DEBUG" - - "--api.insecure=true" - - "--providers.docker=true" - - "--providers.docker.exposedbydefault=false" - - "--entrypoints.search.address=:80" - - "--entrypoints.control.address=:81" - - "--entrypoints.api.address=:82" - - "--entrypoints.dating.address=:83" - - "--entrypoints.explore.address=:84" - ports: - - "127.0.0.1:8080:80" - - "127.0.0.1:8081:81" - - 
"127.0.0.1:8082:82" - - "127.0.0.1:8083:83" - - "127.0.0.1:8084:84" - - "127.0.0.1:8090:8080" - volumes: - - "/var/run/docker.sock:/var/run/docker.sock:ro" - networks: - - wmsa - prometheus: - image: "prom/prometheus" - container_name: "prometheus" - command: - - "--config.file=/etc/prometheus/prometheus.yml" - ports: - - "127.0.0.1:8091:9090" - volumes: - - "./run/prometheus.yml:/etc/prometheus/prometheus.yml" - - "/var/run/docker.sock:/var/run/docker.sock:ro" - networks: - - wmsa -networks: - wmsa: -volumes: - db: - driver: local - driver_opts: - type: none - o: bind - device: run/db - logs: - driver: local - driver_opts: - type: none - o: bind - device: run/logs - model: - driver: local - driver_opts: - type: none - o: bind - device: run/model - conf: - driver: local - driver_opts: - type: none - o: bind - device: run/conf - data: - driver: local - driver_opts: - type: none - o: bind - device: run/data - samples-1: - driver: local - driver_opts: - type: none - o: bind - device: run/node-1/samples - index-1: - driver: local - driver_opts: - type: none - o: bind - device: run/node-1/index - work-1: - driver: local - driver_opts: - type: none - o: bind - device: run/node-1/work - backup-1: - driver: local - driver_opts: - type: none - o: bind - device: run/node-1/backup - uploads-1: - driver: local - driver_opts: - type: none - o: bind - device: run/node-1/uploads - samples-2: - driver: local - driver_opts: - type: none - o: bind - device: run/node-2/samples - index-2: - driver: local - driver_opts: - type: none - o: bind - device: run/node-2/index - work-2: - driver: local - driver_opts: - type: none - o: bind - device: run/node-2/work - backup-2: - driver: local - driver_opts: - type: none - o: bind - device: run/node-2/backup - uploads-2: - driver: local - driver_opts: - type: none - o: bind - device: run/node-2/uploads \ No newline at end of file diff --git a/run/download-samples.sh b/run/download-samples.sh deleted file mode 100755 index bbae77e6..00000000 --- a/run/download-samples.sh +++ /dev/null @@ -1,59 +0,0 @@ -#!/bin/bash - -set -e - -# Check if wget exists -if command -v wget &> /dev/null; then - dl_prg="wget -O" -elif command -v curl &> /dev/null; then - dl_prg="curl -o" -else - echo "Neither wget nor curl found, exiting .." - exit 1 -fi - -case "$1" in -"s"|"m"|"l"|"xl") - ;; -*) - echo "Invalid argument. Must be one of 's', 'm', 'l' or 'xl'." - exit 1 - ;; -esac - -SAMPLE_NAME=crawl-${1:-m} -SAMPLE_DIR="node-1/samples/${SAMPLE_NAME}/" - -function download_model { - model=$1 - url=$2 - - if [ ! -f $model ]; then - echo "** Downloading $url" - $dl_prg $model $url - fi -} - -pushd $(dirname $0) - -if [ -d ${SAMPLE_DIR} ]; then - echo "${SAMPLE_DIR} already exists; remove it if you want to re-download the sample" -fi - -mkdir -p node-1/samples/ -SAMPLE_TARBALL=samples/${SAMPLE_NAME}.tar.gz -download_model ${SAMPLE_TARBALL}.tmp https://downloads.marginalia.nu/${SAMPLE_TARBALL} && mv ${SAMPLE_TARBALL}.tmp ${SAMPLE_TARBALL} - -if [ ! -f ${SAMPLE_TARBALL} ]; then - echo "!! Failed" - exit 255 -fi - -mkdir -p ${SAMPLE_DIR} -tar zxf ${SAMPLE_TARBALL} --strip-components=1 -C ${SAMPLE_DIR} - -cat > "${SAMPLE_DIR}/marginalia-manifest.json" < ``` -### 4. Bring the system online. +To install the system, you need to run the install script. It will prompt +you for which installation mode you want to use. The options are: -We'll run it in the foreground in the terminal this time because it's educational to see the logs. -Add `-d` to run in the background. +1. 
Barebones - This will install a white-label search engine with no data. You can + use this to index your own data. It disables and hides functionality that is strongly + related to the Marginalia project, such as the Marginalia GUI. +2. Full Marginalia Search instance - This will install an instance of the search engine + configured like [search.marginalia.nu](https://search.marginalia.nu). This is useful + for local development and testing. + +It will also prompt you for account details for a new mariadb instance, which will be +created for you. The database will be initialized with the schema and data required +for the search engine to run. + +After filling out all the details, the script will copy the installation files to the +specified directory. + +### 4. Run the system ```shell -$ docker-compose up +$ cd install_directory +$ docker-compose up -d +# To see the logs: +$ docker-compose logs -f ``` -There are two docker-compose files available, `docker-compose.yml` and `docker-compose-barebones.yml`; -the latter is a stripped down version that only runs the bare minimum required to run the system, for e.g. -running a whitelabel version of the system. The former is the full system with all the frills of -Marginalia Search, and is the one used by default. +You can now access a search interface at `http://localhost:8080`, and the admin interface +at `http://localhost:8081/`. -To start the barebones version, run: - -```shell -$ docker-compose -f docker-compose-barebones.yml up -``` - -### 5. You should now be able to access the system. - -By default, the docker-compose file publishes the following ports: - -| Address | Description | -|-------------------------|------------------| -| http://localhost:8080/ | User-facing GUI | -| http://localhost:8081/ | Operator's GUI | - -Note that the operator's GUI does not perform any sort of authentication. -Preferably don't expose it publicly, but if you absolutely must, use a proxy or -Basic Auth to add security. - -### 6. Download Sample Data - -A script is available for downloading sample data. The script will download the -data from https://downloads.marginalia.nu/ and extract it to the correct location. - -The system will pick the data up automatically. - -```shell -$ run/download-samples.sh l -``` - -Four sets are available: - -| Name | Description | -|------|---------------------------------| -| s | Small set, 1000 domains | -| m | Medium set, 2000 domains | -| l | Large set, 5000 domains | -| xl | Extra large set, 50,000 domains | - -Warning: The XL set is intended to provide a large amount of data for -setting up a pre-production environment. It may be hard to run on a smaller -machine and will on most machines take several hours to process. - -The 'm' or 'l' sets are a good compromise between size and processing time -and should work on most machines. - -### 7. Process the data - -Bring the system online if it isn't (see step 4), then go to the operator's -GUI (see step 5). - -* Go to `Node 1 -> Storage -> Crawl Data` -* Hit the toggle to set your crawl data to be active -* Go to `Actions -> Process Crawl Data -> [Trigger Reprocessing]` - -This will take anywhere between a few minutes to a few hours depending on which -data set you downloaded. You can monitor the progress from the `Overview` tab. - -First the CONVERTER is expected to run; this will process the data into a format -that can easily be inserted into the database and index. - -Next the LOADER will run; this will insert the data into the database and index. 
- -Next the link database will repartition itself, and finally the index will be -reconstructed. You can view the process of these steps in the `Jobs` listing. - -### 8. Run the system - -Once all this is done, you can go to the user-facing GUI (see step 5) and try -a search. - -Important! Use the 'No Ranking' option when running locally, since you'll very -likely not have enough links for the ranking algorithm to perform well. - -## Experiment Runner - -The script `experiment.sh` is a launcher for the experiment runner, which is useful when -evaluating new algorithms in processing crawl data. +There is no data in the system yet. To load data into the system, +see the guide at [https://docs.marginalia.nu/](https://docs.marginalia.nu/).
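As a quick way to verify that a freshly installed instance came up correctly, the following sketch checks the two interfaces mentioned above; the ports are taken from the readme, and `docker-compose ps`/`logs` are standard Compose commands:

```shell
# Run from the install directory after 'docker-compose up -d'
curl -sI http://localhost:8080/   # user-facing search interface
curl -sI http://localhost:8081/   # operator/admin interface
docker-compose ps                 # services should show as Up / healthy
docker-compose logs -f            # follow the logs if anything looks off
```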