CatgirlIntelligenceAgency/code/libraries
Viktor 0f9b90eb1c
Better fingerprinting (#35)
* Better fingerprinting for server tech
* Many more features in FeatureExtractor
* Blog specialization
* SiteType table
2023-07-10 17:36:12 +02:00
array Fix broken transformation functions in the PagingArray classes. 2023-05-28 13:31:05 +02:00
big-string Use fixed buffers for BigString compression and decompression to reduce GC churn. 2023-06-19 17:58:19 +02:00
braille-block-punch-cards Increase search result relevance (#8) 2023-04-07 20:18:08 +02:00
btree Tools for merging sorted lists, and merging btrees. (#14) 2023-04-20 15:28:09 +02:00
easy-lsh Move all code to a code directory. 2023-03-07 17:14:32 +01:00
guarded-regex Move all code to a code directory. 2023-03-07 17:14:32 +01:00
language-processing Better fingerprinting (#35) 2023-07-10 17:36:12 +02:00
next-prime Additional code restructuring to get rid of util and misc-style packages. 2023-03-11 13:48:40 +01:00
random-write-funnel Move all code to a code directory. 2023-03-07 17:14:32 +01:00
term-frequency-dict Fix typeahead suggestions 2023-03-25 10:20:52 +01:00
LICENSE.txt The refactoring will continue until morale improves. 2023-03-12 10:50:31 +01:00
readme.md The refactoring will continue until morale improves. 2023-03-12 10:50:31 +01:00

Libraries

These are libraries that are not strongly coupled to the search engine's business logic. They must not depend on features, services, processes, models, etc.

NOTE: These libraries are co-licensed under the MIT license.

Libraries

  • The array library is for memory-mapping large memory areas, which Java supports poorly. It is designed to be easily replaceable once Java's Foreign Function and Memory API is released.
  • The btree library offers a static BTree implementation based on the array library.
  • language-processing contains primitives for sentence extraction and POS-tagging.
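
To illustrate why large mappings are awkward in Java: a single `MappedByteBuffer` is indexed by `int`, so a large file has to be split into several mapped pages and addressed through a `long` index. The following is a minimal sketch of that idea under stated assumptions. The class `PagedLongFile` and its methods are hypothetical names chosen for illustration and are not the array library's actual API.

```java
import java.io.IOException;
import java.nio.LongBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: a file of longs addressed by a long index via several mapped pages. */
final class PagedLongFile {
    private final LongBuffer[] pages;
    private final int pageSizeLongs;

    PagedLongFile(Path file, long sizeLongs, int pageSizeLongs) throws IOException {
        this.pageSizeLongs = pageSizeLongs;
        int pageCount = (int) ((sizeLongs + pageSizeLongs - 1) / pageSizeLongs);
        this.pages = new LongBuffer[pageCount];

        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            for (int i = 0; i < pageCount; i++) {
                long startLong = (long) i * pageSizeLongs;
                long lengthLongs = Math.min(pageSizeLongs, sizeLongs - startLong);
                // Each page is its own mapping; the mappings stay valid after the channel closes.
                pages[i] = channel
                        .map(FileChannel.MapMode.READ_WRITE,
                             startLong * Long.BYTES,
                             lengthLongs * Long.BYTES)
                        .asLongBuffer();
            }
        }
    }

    /** Reads the value at a global index by selecting the page, then the offset within it. */
    long get(long index) {
        return pages[(int) (index / pageSizeLongs)].get((int) (index % pageSizeLongs));
    }

    /** Writes the value at a global index. */
    void set(long index, long value) {
        pages[(int) (index / pageSizeLongs)].put((int) (index % pageSizeLongs), value);
    }
}
```

In this sketch a caller would construct `new PagedLongFile(path, sizeLongs, pageSizeLongs)` and then use `get` and `set` with `long` indices. A real implementation would also need to handle bounds checks, bulk transfers, and forcing changes to disk.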

Micro libraries

  • easy-lsh is a simple locality-sensitive hash for document deduplication.
  • guarded-regex makes predicated regular expressions clearer.
  • big-string offers seamless string compression.
  • random-write-funnel is a tool for reducing write amplification when constructing large files out of order.
  • next-prime is a naive brute-force prime sieve.
  • braille-block-punch-cards renders bit masks into human-readable dot matrices using the braille block.
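
As an illustration of the braille idea: the Unicode braille block starts at U+2800, and the low eight bits of each code point select which of the eight dots are raised. One byte of a bit mask can therefore be shown as one braille cell. The sketch below assumes a byte-per-cell mapping with the least significant byte first. The class `BrailleMaskSketch` and its `render` method are hypothetical, and the actual library may lay out the bits differently.

```java
/** Hypothetical sketch: renders a 64-bit mask as eight braille cells, one per byte. */
final class BrailleMaskSketch {
    // First code point of the Unicode braille patterns block (the blank cell).
    private static final int BRAILLE_BASE = 0x2800;

    private BrailleMaskSketch() {}

    static String render(long mask) {
        StringBuilder sb = new StringBuilder(Long.BYTES);
        for (int i = 0; i < Long.BYTES; i++) {
            // Take byte i, counting from the least significant byte.
            int bits = (int) (mask >>> (8 * i)) & 0xFF;
            // Each set bit raises one dot in the corresponding cell.
            sb.append((char) (BRAILLE_BASE + bits));
        }
        return sb.toString();
    }
}
```

Calling `BrailleMaskSketch.render(mask)` always yields eight characters, so masks printed on consecutive lines align and differing bits are easy to spot by eye.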