Term Frequency Extractor
Generates a term frequency dictionary file from a batch of crawl data.
Usage:
PATH_TO_SAMPLES=run/samples/crawl-s
export JAVA_OPTS=-Dcrawl.rootDirRewrite=/crawl:${PATH_TO_SAMPLES}
term-frequency-extractor ${PATH_TO_SAMPLES}/plan.yaml out.dat
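
To make the idea of a term frequency dictionary concrete, the following is a minimal, self-contained sketch in Java. It does not reflect the tool's actual implementation or its `out.dat` file format: the class name `TermFrequencySketch`, the method `countDocumentFrequencies`, the tokenization rule, and the choice to count each term once per document are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; not the extractor's real code.
class TermFrequencySketch {

    /** Counts, for each term, the number of documents in which it appears at least once. */
    static Map<String, Integer> countDocumentFrequencies(List<String> documents) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String document : documents) {
            // Assumed tokenization: lowercase, then split on runs of
            // characters that are neither letters nor digits.
            String[] tokens = document.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
            // A set ensures each term is counted at most once per document.
            Set<String> seen = new HashSet<>(Arrays.asList(tokens));
            seen.remove(""); // split() can produce an empty leading token
            for (String term : seen) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in for a batch of crawled documents.
        List<String> documents = List.of(
                "The quick brown fox",
                "The lazy dog",
                "A quick dog");
        // Print one "term<TAB>count" line per term, in alphabetical order.
        countDocumentFrequencies(documents)
                .forEach((term, count) -> System.out.println(term + "\t" + count));
    }
}
```

In this sketch a term is counted once per document regardless of how often it repeats there, so the result is a per-term document count. A dictionary built over a real crawl would likely need a more compact representation than an in-memory `TreeMap` of strings.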