CatgirlIntelligenceAgency/code/process-models/processed-data
Viktor Lofgren f655ec5a5c (*) Refactor GeoIP-related code
In this commit, GeoIP-related classes are refactored and relocated to a common library as they are shared across multiple services.

The crawler is refactored to enable the GeoIpBlocklist to use the new GeoIpDictionary as the base of its decisions.

The converter is modified to query this data to add a geoip:-keyword to documents to permit limiting a search to the country of the hosting server.

The commit also adds due BY-SA attribution in the search engine footer for the source of the IP geolocation data.
2023-12-10 17:30:43 +01:00
src (*) Refactor GeoIP-related code 2023-12-10 17:30:43 +01:00
build.gradle (build) Move unit test configuration to root build.gradle 2023-10-04 12:46:22 +02:00
readme.md (docs) Update the documentation up-to-date information 2023-09-14 11:33:36 +02:00

The processed-data package contains models and logic for reading and writing parquet files with the output from the converting-process.

Main models:

Since parquet is a column-based format, some of the readable models are projections that read only a subset of the columns in the input file.
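The idea behind a projection can be sketched in plain Java without any parquet dependency. The example below is only an illustration under stated assumptions: `ColumnStore`, `PageRow` and `PageUrlProjection` are hypothetical names invented for this sketch, not classes from this package or from parquet-floor, and the in-memory map merely stands in for a columnar file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColumnProjectionSketch {

    /** Hypothetical stand-in for a columnar file: column name -> one value per row. */
    static final class ColumnStore {
        private final Map<String, List<?>> columns = new LinkedHashMap<>();
        private final List<String> columnsRead = new ArrayList<>();

        void putColumn(String name, List<?> values) {
            columns.put(name, values);
        }

        /** Returns one column and records that it was touched. */
        List<?> readColumn(String name) {
            columnsRead.add(name);
            return columns.get(name);
        }

        /** Names of the columns that have been read so far. */
        List<String> columnsRead() {
            return columnsRead;
        }
    }

    /** Hypothetical full model: needs every column. */
    record PageRow(String url, String title, String body) {}

    /** Hypothetical projection: needs only the url column. */
    record PageUrlProjection(String url) {}

    /** Builds the projection by reading a single column; title and body are never touched. */
    static List<PageUrlProjection> readUrls(ColumnStore store) {
        List<PageUrlProjection> result = new ArrayList<>();
        for (Object url : store.readColumn("url")) {
            result.add(new PageUrlProjection((String) url));
        }
        return result;
    }

    /** Builds the full model by reading all three columns row by row. */
    static List<PageRow> readRows(ColumnStore store) {
        List<?> urls = store.readColumn("url");
        List<?> titles = store.readColumn("title");
        List<?> bodies = store.readColumn("body");
        List<PageRow> result = new ArrayList<>();
        for (int i = 0; i < urls.size(); i++) {
            result.add(new PageRow((String) urls.get(i), (String) titles.get(i), (String) bodies.get(i)));
        }
        return result;
    }

    /** Two rows of made-up sample data. */
    static ColumnStore sampleStore() {
        ColumnStore store = new ColumnStore();
        store.putColumn("url", List.of("https://example.com/a", "https://example.com/b"));
        store.putColumn("title", List.of("Page A", "Page B"));
        store.putColumn("body", List.of("long body text a", "long body text b"));
        return store;
    }

    public static void main(String[] args) {
        ColumnStore store = sampleStore();
        System.out.println(readUrls(store));
        System.out.println("columns read: " + store.columnsRead());
    }
}
```

In a real columnar file the saving comes from skipping the on-disk data of the unread columns, which matters most when the skipped columns hold large values such as document bodies.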

See Also

third-party/parquet-floor