27ffb8fa8a
Previously, loading encyclopedia data into the search engine required first running the encyclopedia.marginalia.nu converter to create a .db file. This isn't very ergonomic, so parts of that code base were lifted in as a third-party library, and conversion from .zim to .db is now done automatically. The output file name is based on the original file name, plus a CRC32 hash and a .db extension, so that the converted data can be reused on repeat loads.
- commons-codec
- count-min-sketch
- encyclopedia-marginalia-nu
- monkey-patch-gson
- monkey-patch-opennlp
- openzim
- parquet-floor
- porterstemmer
- rdrpostagger
- symspell
- uppend
- xz
- README.md
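The output naming scheme described in the commit message at the top of this page could look roughly like the sketch below. The class and method names are hypothetical and not taken from the Marginalia code base. Two details are assumptions, since the commit message does not specify them: that the CRC32 is computed over the contents of the .zim file, and that the pieces are joined as `<original name>.<crc32>.db`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical helper for illustration only; not the project's actual code.
final class ZimDbNaming {

    // Assumption: the hash covers the file contents, streamed in chunks
    // so that large .zim files are never held in memory all at once.
    static String crc32Hex(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                crc.update(buffer, 0, n);
            }
        }
        // Zero-padded, lower-case hex keeps the name a fixed width.
        return String.format("%08x", crc.getValue());
    }

    // Assumption: the .db file is placed next to the source file and is
    // named "<original name>.<crc32>.db".
    static Path dbPathFor(Path zimFile) throws IOException {
        String name = zimFile.getFileName() + "." + crc32Hex(zimFile) + ".db";
        return zimFile.resolveSibling(name);
    }

    // If a .db file with the expected name already exists, the earlier
    // conversion can be reused on a repeat load.
    static boolean needsConversion(Path zimFile) throws IOException {
        return !Files.exists(dbPathFor(zimFile));
    }
}
```

Because the hash is part of the file name, a changed input yields a different .db name and triggers a fresh conversion, while an unchanged input maps to the existing file.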
Third Party Code
This is a mix of code from other projects that has either been aggressively modified to suit the needs of the project, lacks an artifact, or overrides some default that is inappropriate for the type of data Marginalia throws at the library.
Sources and Licenses
Modified
- RDRPosTagger - GPL3
- PorterStemmer - LGPL3
- Uppend - MIT
- OpenZIM - GPL-2.0+
- Commons Codec - Apache 2.0
- encyclopedia.marginalia.nu - GPL 2.0+
Repackaged
- SymSpell - LGPL-3.0
- Count-Min-Sketch - Apache 2.0
Monkey Patched
- Apache OpenNLP - Apache-2.0
- GSON - Apache-2.0