CatgirlIntelligenceAgency/code

Code

This is a pretty large and diverse project with many moving parts.

Each module contains a short description of what it does and how it relates to other modules. Modules are named by kind, such as "library", "process" or "feature", and these terms have specific meanings. See doc/module-taxonomy.md.

Overview

A map of the most important components and how they relate can be found below.

[Diagram: map of the most important components]

Services

Processes

Processes are batch jobs that deal with data retrieval, processing and loading.

Tools

Features

Features are relatively stand-alone components that serve some part of the domain. They are not domain-independent, but they are isolated.

Libraries and primitives

Libraries are stand-alone code that is independent of the domain logic.

  • common elements for creating a service, a client etc.
  • libraries containing code that is not specific to search.
    • array - library for large memory-mapped areas
    • btree - static B-tree library