CatgirlIntelligenceAgency/code/common
Viktor Lofgren 0caef1b307 (warc) Toggle for saving WARC data
Add a toggle for saving the WARC data generated by the search engine's crawler.  Normally this is discarded, but for debugging or archival purposes, retaining it may be of interest.

The warc files are concatenated into larger archives, up to about 1 GB each.
An index is also created containing filenames, domain names, offsets and sizes
to help navigate these larger archives.

The warc data is saved in a directory warc/ under the crawl data storage.
2024-01-12 13:45:14 +01:00
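
The index described in the commit message above (file names, domain names, offsets and sizes) suggests that a single domain's data can be pulled out of a concatenated archive by seeking to an offset and reading a fixed number of bytes. The sketch below illustrates that idea only. The `IndexEntry` fields, the `readSlice` helper and the archive file naming are hypothetical, and the crawler's real index format and API may well differ.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Path;

/** Hypothetical sketch; not the crawler's actual index format or API. */
final class WarcIndexSketch {

    /** One assumed index row: which archive holds a domain's data, and where. */
    static final class IndexEntry {
        final String fileName; // name of the concatenated archive file
        final String domain;   // domain the slice belongs to
        final long offset;     // byte offset of the slice within the archive
        final int size;        // length of the slice in bytes

        IndexEntry(String fileName, String domain, long offset, int size) {
            this.fileName = fileName;
            this.domain = domain;
            this.offset = offset;
            this.size = size;
        }
    }

    /** Reads exactly entry.size bytes starting at entry.offset from the archive. */
    static byte[] readSlice(Path archiveDir, IndexEntry entry) {
        Path archive = archiveDir.resolve(entry.fileName);
        try (RandomAccessFile file = new RandomAccessFile(archive.toFile(), "r")) {
            byte[] data = new byte[entry.size];
            file.seek(entry.offset);
            file.readFully(data); // fails with EOFException if the archive is too short
            return data;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned bytes would then be handed to a WARC parser of choice. Keeping the offset and size in the index means the rest of the roughly 1 GB archive never has to be scanned.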
config (warc) Toggle for saving WARC data 2024-01-12 13:45:14 +01:00
db (warc) Toggle for saving WARC data 2024-01-12 13:45:14 +01:00
linkdb (linkdb) Add delegating implementation of DomainLinkDb 2024-01-08 19:56:33 +01:00
model (*) Fix bug in EdgeDomain where it would permit domains with a trailing period, DNS style. 2023-12-29 16:36:01 +01:00
process (*) WIP Add node affinity to EC_DOMAIN 2023-10-19 17:48:34 +02:00
renderer (search) Fix acknowledgement page for domain complaints rendering as plain text 2024-01-10 09:26:34 +01:00
service (warc) Toggle for saving WARC data 2024-01-12 13:45:14 +01:00
service-client (search) Move site information out of the search service and into assistant. 2023-12-09 16:30:06 +01:00
service-discovery (mqapi/control) Repair repartition endpoint, deprecate notify endpoints. 2023-11-27 16:01:12 +01:00
readme.md Update readme.md 2023-03-25 15:27:11 +01:00

Common

These are packages containing the basic building blocks for running a service as well as shared models.

  • db contains SQL code and some database-related utilities.
  • config contains some @Injectables.
  • renderer contains utility code for rendering website templates.
  • service contains the shared base classes for main methods and web services.
  • service-client is the shared base class for RPC.
  • service-discovery contains tools that let the services find each other.
  • process contains boilerplate for batch processes.