CatgirlIntelligenceAgency/code/features-crawl/readme.md
2024-02-06 16:29:55 +01:00

# Crawl Features
These are relatively isolated pieces of search-engine business logic
that benefit from being kept separate from the rest of the crawling code.

* [content-type](content-type/) - Content Type identification
* [crawl-blocklist](crawl-blocklist/) - IP and URL blocklists
* [link-parser](link-parser/) - Code for parsing and normalizing links
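
As a rough illustration of what "parsing and normalizing links" can involve, here is a minimal sketch built only on the JDK's `java.net.URI`. The class and method names (`LinkNormalizerSketch`, `normalize`) are hypothetical and are not taken from the link-parser module; the actual implementation may well apply different or additional rules.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for illustration only; not the link-parser module's API.
final class LinkNormalizerSketch {

    /**
     * Resolves href against the page's base URL, collapses "." and ".."
     * path segments, lowercases the scheme, and drops the fragment.
     * Returns empty for unparseable links and for non-http(s) schemes.
     */
    static Optional<URI> normalize(URI base, String href) {
        try {
            URI resolved = base.resolve(new URI(href.trim())).normalize();

            String scheme = resolved.getScheme();
            if (scheme == null) {
                return Optional.empty();
            }
            scheme = scheme.toLowerCase(Locale.ROOT);
            if (!scheme.equals("http") && !scheme.equals("https")) {
                return Optional.empty(); // e.g. mailto:, javascript:
            }

            // Rebuild the URI without its fragment; the fragment never
            // reaches the server, so it does not identify a distinct document.
            return Optional.of(new URI(scheme,
                                       resolved.getAuthority(),
                                       resolved.getPath(),
                                       resolved.getQuery(),
                                       null));
        } catch (URISyntaxException e) {
            return Optional.empty(); // malformed href, e.g. unescaped spaces
        }
    }
}
```

With this sketch, resolving `../b/./c.html#top` against `https://example.com/a/index.html` gives `https://example.com/b/c.html`, while a `mailto:` link yields an empty result.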