161sh/CatgirlIntelligenceAgency


Crawling Process

The crawling process downloads HTML documents and saves them into per-domain snapshots.

Central Classes

CrawlerMain orchestrates the crawling.
CrawlerRetreiver visits known addresses from a domain and downloads each document.
HttpFetcher fetches a URL.
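
The division of labour between these three classes can be illustrated with a small sketch. It is not the project's actual code: `CrawlSketch`, `Fetcher`, `DomainRetriever` and `crawlAll` are hypothetical names invented for this example, the "snapshot" is just an in-memory map, and the fetcher is a fake so that the example runs without network access.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class CrawlSketch {

    /** Hypothetical stand-in for the role of HttpFetcher: fetches a single URL. */
    interface Fetcher {
        Optional<String> fetch(String url);
    }

    /** Hypothetical stand-in for the role of CrawlerRetreiver: walks one domain's known URLs. */
    static final class DomainRetriever {
        private final Fetcher fetcher;

        DomainRetriever(Fetcher fetcher) {
            this.fetcher = fetcher;
        }

        /** Returns a per-domain snapshot mapping URL to HTML; failed fetches are skipped. */
        Map<String, String> crawl(List<String> knownUrls) {
            Map<String, String> snapshot = new LinkedHashMap<>();
            for (String url : knownUrls) {
                fetcher.fetch(url).ifPresent(html -> snapshot.put(url, html));
            }
            return snapshot;
        }
    }

    /** Hypothetical stand-in for the role of CrawlerMain: runs one retrieval per domain. */
    static Map<String, Map<String, String>> crawlAll(
            Map<String, List<String>> urlsByDomain, Fetcher fetcher) {
        DomainRetriever retriever = new DomainRetriever(fetcher);
        Map<String, Map<String, String>> snapshots = new LinkedHashMap<>();
        urlsByDomain.forEach((domain, urls) -> snapshots.put(domain, retriever.crawl(urls)));
        return snapshots;
    }

    public static void main(String[] args) {
        // A fake "web" so the sketch needs no network; a real fetcher would issue HTTP requests.
        Map<String, String> fakeWeb = Map.of(
                "https://example.com/", "<html>home</html>",
                "https://example.com/about", "<html>about</html>");
        Fetcher fetcher = url -> Optional.ofNullable(fakeWeb.get(url));

        Map<String, List<String>> urlsByDomain = Map.of(
                "example.com",
                List.of("https://example.com/", "https://example.com/about", "https://example.com/missing"));

        crawlAll(urlsByDomain, fetcher).forEach((domain, snapshot) ->
                System.out.println(domain + ": " + snapshot.size() + " documents saved"));
    }
}
```

Keeping the fetcher behind an interface is a choice made for this sketch only; it lets the per-domain logic be exercised with canned responses.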

See Also

features-convert