372 B
372 B
Term Frequency Extractor
Generates a term frequency dictionary file from a batch of crawl data.
Usage:
PATH_TO_SAMPLES=run/samples/crawl-s
export JAVA_OPTS=-Dcrawl.rootDirRewrite=/crawl:${PATH_TO_SAMPLES}
term-frequency-extractor ${PATH_TO_SAMPLES}/plan.yaml out.dat