CatgirlIntelligenceAgency/code
2023-08-01 22:33:30 +02:00
api (WIP) Make it possible to sideload encyclopedia data. 2023-07-28 18:14:43 +02:00
common (file-storage) Deprecate mustClean flag 2023-08-01 22:32:30 +02:00
features-convert Make processed data Serializable 2023-07-28 18:11:19 +02:00
features-crawl (crawler) Update URL blocklist 2023-07-10 18:58:43 +02:00
features-index (lexicon) Optimize lexicon by using Murmur3_128's hash function 2023-08-01 15:02:13 +02:00
features-search (search) Fix a bug where space-like characters weren't normalized in query processing. 2023-07-10 18:58:43 +02:00
libraries (controller) Improve the storage interface 2023-07-21 19:56:16 +02:00
process-models (loader) Fix bug where trailing deferred domain meta inserts weren't executed 2023-07-31 14:23:23 +02:00
processes (crawler) Fix rare ConcurrentModificationError due to HashSet 2023-08-01 17:28:29 +02:00
services-core (control) Fix bug where CrawlActor and RecrawlActor would steal each other's mail 2023-08-01 22:33:30 +02:00
services-satellite (db) Use Flyway for database migrations. 2023-08-01 17:08:42 +02:00
tools (converter) Hook crawl job extractor and adjacencies calculator into control service. 2023-07-26 15:46:22 +02:00
readme.md Fix broken diagram links after doc/ restructuring. 2023-03-25 16:32:10 +01:00

Code

This is a pretty large and diverse project with many moving parts.

You'll find a short description in each module of what it does and how it relates to other modules. Module names include terms like "library", "process" or "feature", which have specific meanings; see doc/module-taxonomy.md.

Overview

A map of the most important components and how they relate can be found below.

(image: map of the most important components)

Services

Processes

Processes are batch jobs that deal with data retrieval, processing and loading.

Tools

Features

Features are relatively stand-alone components that serve some part of the domain. They are isolated, but not domain-independent.

Libraries and primitives

Libraries are stand-alone code that is independent of the domain logic.

  • common elements for creating a service, a client etc.
  • libraries containing non-search-specific code.
    • array - library for large memory-mapped areas
    • btree - static btree library
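
To give a rough idea of what a memory-mapped array involves, here is a minimal sketch in plain Java NIO. It is not the API of the array library listed above: the class name `MappedLongArraySketch` and its methods are hypothetical and exist only for illustration. It assumes a single mapping, which Java limits to `Integer.MAX_VALUE` bytes, so a library for genuinely large areas would have to work around that limit in some way.

```java
import java.io.IOException;
import java.nio.LongBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not the project's array library.
final class MappedLongArraySketch {
    private final LongBuffer buffer;

    private MappedLongArraySketch(LongBuffer buffer) {
        this.buffer = buffer;
    }

    /** Maps room for `size` longs from `file`, growing the file if it is smaller. */
    static MappedLongArraySketch open(Path file, int size) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // A mapping remains valid after the channel that created it is closed.
            LongBuffer longs = channel
                    .map(FileChannel.MapMode.READ_WRITE, 0, (long) size * Long.BYTES)
                    .asLongBuffer();
            return new MappedLongArraySketch(longs);
        }
    }

    long get(int index) {
        return buffer.get(index);
    }

    void set(int index, long value) {
        buffer.put(index, value);
    }

    int size() {
        return buffer.capacity();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-array-sketch", ".dat");
        MappedLongArraySketch array = MappedLongArraySketch.open(file, 1024);
        array.set(10, 123L);
        System.out.println(array.get(10));
    }
}
```

Reads and writes go through the operating system's page cache rather than the Java heap, which is the usual reason to reach for memory mapping when data does not fit comfortably in memory.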