Term Frequency Extractor
Generates a term frequency dictionary file from a batch of crawl data.
Usage:
PATH_TO_SAMPLES=run/samples/crawl-s
export JAVA_OPTS=-Dcrawl.rootDirRewrite=/crawl:${PATH_TO_SAMPLES}
term-frequency-extractor ${PATH_TO_SAMPLES}/plan.yaml out.dat
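
To illustrate what a term frequency dictionary holds, here is a minimal Java sketch. It is not the tool's actual implementation. It assumes the dictionary records how many documents each term appears in, and it uses a simple lowercase, split-on-non-alphanumerics tokenizer. The class name `TermFrequencySketch` and the method `countDocumentFrequencies` are hypothetical, and the real extractor may tokenize, count, and store its output differently.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the term-frequency-extractor code.
public class TermFrequencySketch {

    /**
     * Counts, for each term, the number of documents in which it occurs
     * at least once (assumed meaning of "term frequency" in this sketch).
     */
    public static Map<String, Integer> countDocumentFrequencies(List<String> documents) {
        Map<String, Integer> frequencies = new TreeMap<>();
        for (String document : documents) {
            // Collect distinct terms so each term is counted once per document.
            Set<String> seen = new HashSet<>();
            String[] tokens = document.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
            for (String token : tokens) {
                if (!token.isEmpty()) {
                    seen.add(token);
                }
            }
            for (String term : seen) {
                frequencies.merge(term, 1, Integer::sum);
            }
        }
        return frequencies;
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
                "The quick brown fox",
                "The lazy dog",
                "Quick, quick dog!");
        // Prints one "term<TAB>count" line per term, in alphabetical order.
        countDocumentFrequencies(docs)
                .forEach((term, count) -> System.out.println(term + "\t" + count));
    }
}
```

Counting each term once per document keeps a single long document from dominating the counts. That is a design choice of this sketch, not a documented property of the tool.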