161sh/CatgirlIntelligenceAgency

Viktor Lofgren d82532b7f1 More restructuring, big bug fixes in keyword extraction.

2023-03-13 17:39:53 +01:00

Crawling Process

The crawling process downloads HTML documents and saves them into per-domain snapshots.

Central Classes

CrawlerMain orchestrates the crawling.
CrawlerRetreiver visits known addresses from a domain and downloads each document.
HttpFetcher fetches a URL.
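
As a rough illustration of how these three roles might fit together, here is a minimal sketch. It is not the project's actual code: `CrawlSketch`, `Fetcher`, `Document`, `retrieveDomain` and `crawl` are hypothetical names, the fetcher is an in-memory stand-in rather than a real HTTP client, and snapshots are kept in a map instead of being written to disk.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class CrawlSketch {
    /** Hypothetical stand-in for the HttpFetcher role: fetches a single URL. */
    interface Fetcher {
        Optional<String> fetch(String url);
    }

    /** One downloaded document (hypothetical type). */
    record Document(String url, String html) {}

    /** Stand-in for the CrawlerRetreiver role: visits the known URLs of one domain. */
    static List<Document> retrieveDomain(List<String> knownUrls, Fetcher fetcher) {
        List<Document> snapshot = new ArrayList<>();
        for (String url : knownUrls) {
            // URLs that cannot be fetched are skipped in this sketch.
            fetcher.fetch(url).ifPresent(html -> snapshot.add(new Document(url, html)));
        }
        return snapshot;
    }

    /** Stand-in for the CrawlerMain role: builds one snapshot per domain. */
    static Map<String, List<Document>> crawl(Map<String, List<String>> urlsByDomain,
                                             Fetcher fetcher) {
        Map<String, List<Document>> snapshots = new LinkedHashMap<>();
        urlsByDomain.forEach((domain, urls) ->
                snapshots.put(domain, retrieveDomain(urls, fetcher)));
        return snapshots;
    }

    public static void main(String[] args) {
        // A fake "web" so the sketch runs without network access.
        Map<String, String> fakeWeb = Map.of(
                "https://example.com/", "<html>home</html>",
                "https://example.com/about", "<html>about</html>");
        Fetcher fetcher = url -> Optional.ofNullable(fakeWeb.get(url));

        Map<String, List<String>> urlsByDomain = Map.of(
                "example.com", List.of(
                        "https://example.com/",
                        "https://example.com/about",
                        "https://example.com/missing"));

        crawl(urlsByDomain, fetcher).forEach((domain, docs) ->
                System.out.println(domain + ": " + docs.size() + " documents"));
    }
}
```

Keeping the fetcher behind a small interface is a choice made for this sketch only: it lets the per-domain loop be exercised without touching the network.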

See Also

features-convert