161sh/CatgirlIntelligenceAgency

Viktor Lofgren 1d34224416 (refac) Remove src/main from all source code paths. Look, this will make the git history look funny, but trimming unnecessary depth from the source tree is a very necessary sanity-preserving measure when dealing with a super-modularized codebase like this one. While it makes the project configuration a bit less conventional, it will save you several clicks every time you jump between modules. Which you'll do a lot, because it's modular. The src/main/java convention makes a lot of sense for a non-modular project though. This ain't that.		2024-02-23 16:13:40 +01:00
java/nu/marginalia/language	(refac) Remove src/main from all source code paths.	2024-02-23 16:13:40 +01:00
resources/dictionary	(refac) Remove src/main from all source code paths.	2024-02-23 16:13:40 +01:00
test/nu/marginalia/language	(refac) Remove src/main from all source code paths.	2024-02-23 16:13:40 +01:00
test-resources/html	(refac) Remove src/main from all source code paths.	2024-02-23 16:13:40 +01:00
build.gradle	(refac) Remove src/main from all source code paths.	2024-02-23 16:13:40 +01:00
readme.md	(refactor) Remove features-search and update documentation	2023-10-09 15:12:30 +02:00

readme.md

Language Processing

This library contains various tools used in language processing.

Central Classes

SentenceExtractor - Creates a DocumentLanguageData from a text, containing its words, their stems, their part-of-speech (POS) tags, and so on.
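
To make the shape of that data more concrete, here is a minimal, self-contained sketch of the general idea: split a text into sentences, then record each word alongside a stem and a POS tag. It does not use the real SentenceExtractor or DocumentLanguageData classes, whose signatures are not shown in this readme. Every name in it (ToySentenceData, ToyLanguageData, Token, extract) is hypothetical, and the stemmer and tagger are deliberately crude placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy illustration only; NOT the real nu.marginalia.language API. */
public class ToySentenceData {

    /** One word with its normalized form, a crude stem, and a placeholder POS tag. */
    public record Token(String word, String stem, String posTag) {}

    /** Hypothetical stand-in for the per-document data described above. */
    public record ToyLanguageData(List<List<Token>> sentences) {
        public int totalWords() {
            return sentences.stream().mapToInt(List::size).sum();
        }
    }

    /** Crude suffix stripping; a real stemmer is far more careful. */
    static String stem(String word) {
        if (word.length() > 5 && word.endsWith("ing")) {
            return word.substring(0, word.length() - 3);
        }
        if (word.length() > 3 && word.endsWith("s") && !word.endsWith("ss")) {
            return word.substring(0, word.length() - 1);
        }
        return word;
    }

    /** Placeholder tagger: a tiny lookup table, everything else is "X" (unknown). */
    static String posTag(String word) {
        return switch (word) {
            case "the", "a", "an" -> "DT";
            case "and", "or", "but" -> "CC";
            default -> "X";
        };
    }

    /** Splits text into sentences on ., ! and ?, then into lowercased word tokens. */
    public static ToyLanguageData extract(String text) {
        List<List<Token>> sentences = new ArrayList<>();
        for (String sentence : text.split("[.!?]+")) {
            List<Token> tokens = new ArrayList<>();
            for (String raw : sentence.trim().split("\\s+")) {
                // Keep only letters and digits so punctuation does not become part of a word.
                String word = raw.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}\\p{Nd}]", "");
                if (word.isEmpty()) continue;
                tokens.add(new Token(word, stem(word), posTag(word)));
            }
            if (!tokens.isEmpty()) sentences.add(tokens);
        }
        return new ToyLanguageData(sentences);
    }

    public static void main(String[] args) {
        ToyLanguageData data = extract("The crawler indexes pages. Ranking is separate!");
        for (List<Token> sentence : data.sentences()) {
            System.out.println(sentence);
        }
    }
}
```

The real library presumably relies on trained language models for sentence splitting and tagging rather than lookup tables like these; the sketch only shows the kind of per-word information a consumer such as keyword extraction would read.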

See Also

features-convert/keyword-extraction uses this code to identify which keywords are important.

features-qs/query-parser also does some language processing.