Commit Graph

634 Commits

Author SHA1 Message Date
Viktor Lofgren
23f0c79fba (control) GUI for data sets/domain types. 2023-10-19 17:48:34 +02:00
Viktor Lofgren
81dd3809e9 (*) WIP Add node affinity to EC_DOMAIN
Very messy commit due to fractalline yak shaving
2023-10-19 17:48:34 +02:00
Viktor Lofgren
2bf0c4497d (*) Tool for unfcking old crawl data so that it aligns with the new style IDs 2023-10-19 17:48:34 +02:00
Viktor Lofgren
978550f809 (executor-service) Retire features-convert and move the corresponding packages into the executor service. 2023-10-16 15:43:46 +02:00
Viktor Lofgren
84fea0fd05 (node) Nodes auto-start their monitor actors. 2023-10-16 15:33:22 +02:00
Viktor Lofgren
2df3e0f881 (node) Nodes auto-configure on start-up instead of requiring manual configuration. 2023-10-16 14:46:35 +02:00
Viktor Lofgren
c98117f69d (actor) FS monitor should pick up stuff in BACKUP as well. 2023-10-16 14:37:36 +02:00
Viktor Lofgren
ede5d1f890 (actor) Give process spawners more easily recognizable names. 2023-10-16 14:19:00 +02:00
Viktor Lofgren
39911e3acd (control) Fix incorrect storage base and clean up GUI for data 2023-10-16 13:30:26 +02:00
Viktor Lofgren
3d1c15ef99 (client) Refactor liveness monitor 2023-10-16 12:34:01 +02:00
Viktor Lofgren
f718482e98 (client) Fix tests 2023-10-16 12:12:16 +02:00
Viktor Lofgren
8dafd13cd7 (client) Fix executor tests 2023-10-16 12:02:57 +02:00
Viktor Lofgren
0b19b28a64 (file-storage) Delete unused code 2023-10-16 12:02:57 +02:00
Viktor Lofgren
c245f7ce3a (control) Bootstrapify review-domains and search-to-ban views. 2023-10-15 22:04:23 +02:00
Viktor Lofgren
607d647483 (control) Remove services listing view 2023-10-15 21:48:55 +02:00
Viktor Lofgren
9a38a455c9 (control/exec) File listings in control GUI 2023-10-15 19:15:44 +02:00
Viktor Lofgren
16e0738731 (*) Get multi-node routing working. 2023-10-15 18:38:30 +02:00
Viktor Lofgren
eacbf87979 (control) New list and form for index nodes. 2023-10-14 21:46:52 +02:00
Viktor Lofgren
108b4cb648 (service) Keep disabled multi-noded services dormant when they are configured to be disabled. 2023-10-14 20:58:55 +02:00
Viktor Lofgren
a9dff407a1 (config/db) Clean up migrations 2023-10-14 20:34:03 +02:00
Viktor Lofgren
9e26109e36 (reverse-index) Don't always POST 2023-10-14 16:48:29 +02:00
Viktor Lofgren
6308a8dfcd (control) Node configuration 2023-10-14 16:47:52 +02:00
Viktor Lofgren
4baf9527d7 (*) WIP Control GUI redesign, executor-service, multi-node mq
This turned out to be very difficult to do in small isolated steps.

* Design overhaul of the control gui using bootstrap
* Move the actors out of control-service into to a new executor-service, that can be run on multiple nodes
* Add node-affinity to message queue
2023-10-14 12:08:43 +02:00
Viktor Lofgren
199c459697 (*) Add node-affinity to services, processes and file storage. 2023-10-10 12:32:22 +02:00
Viktor Lofgren
61288c5e68 (service, client) First steps towards multiple nodedness 2023-10-09 22:13:27 +02:00
Viktor Lofgren
8375237de5 (converter) Add special keyword for websites with a tilde url. 2023-10-09 17:02:32 +02:00
Viktor Lofgren
6319b8ef51 (api-service) Improved testability, always set content type to application/json 2023-10-09 15:39:34 +02:00
Viktor Lofgren
397a85eaa4 (query-service) Apply blacklisting to search results 2023-10-09 15:18:53 +02:00
Viktor Lofgren
3889c4bdd9 (refactor) Remove features-search and update documentation 2023-10-09 15:12:30 +02:00
Viktor Lofgren
c899f1cb85 (docs) Update documentation to reflect new query service 2023-10-09 14:56:59 +02:00
Viktor Lofgren
d8956c51d0 (refactor) Remove api:search-api
Application services should not have an API, but purely act as clients
to the core services (which should always have an API).
2023-10-09 14:42:33 +02:00
Viktor Lofgren
5dd55c7cad (refactor) Rename satellite services to application services
This is a better descriptor, since they now all implement different applications on top of the core services' APIs.
2023-10-09 13:45:45 +02:00
Viktor Lofgren
c0e61d4c87 (refactor) Move search service into services-satellite 2023-10-09 13:40:01 +02:00
Viktor Lofgren
97e17282ab (query-service) Move query parsing from search-service to the new query service. 2023-10-09 13:27:44 +02:00
Viktor Lofgren
94c882af7d (query-service) Provide delegate of IndexApi's query functionality.
This is an intermediate step in the process of introducing the query-service as a proxy between search and index.
2023-10-08 22:22:26 +02:00
Viktor Lofgren
89c6d85f2f (query-service) Create new empty 'query-service' service 2023-10-08 17:31:50 +02:00
Viktor Lofgren
cf366c602f (search) Refactor SearchQueryIndexService in preparation for feature extraction.
Prefer working on DecoratedSearchResultItem in favor of UrlDetails.
2023-10-08 17:15:41 +02:00
Viktor Lofgren
77ccab7d80 (index) Move linkdb to index from search.
This makes index complete in the sense that you can deploy an index instance and build a complete separate application on top of it, without having to go through the Marginalia-laden search service.
2023-10-08 16:48:35 +02:00
Viktor Lofgren
f51ba63742 (search) Remove dead file 2023-10-07 21:05:06 +02:00
Viktor Lofgren
9044518be5 (search) Fix broken link to git repo 2023-10-07 19:43:22 +02:00
Viktor Lofgren
9e0367eef4 (search) Filter blacklisted items in API query service as well 2023-10-07 16:16:04 +02:00
Viktor Lofgren
235bb6c1b9 (control) Administrative QOL improvement, GUI for banning spam 2023-10-07 15:45:50 +02:00
Viktor Lofgren
49344d7ea8 (control) Administrative QOL improvement, GUI for banning spam 2023-10-07 15:43:18 +02:00
Viktor Lofgren
1b418d77ff (search) We got some new IP ranges to work with for the crawler 2023-10-07 13:41:55 +02:00
Viktor Lofgren
80cc302627 (search) We can't in claim to be on PC hardware anymore... 2023-10-07 11:49:29 +02:00
Viktor
8e1abc3f10
(index-reverse) Parallel construction of the reverse indexes. (#52)
* (index-reverse) Parallel construction of the reverse indexes.

* (array) Remove wasteful calculation of numDistinct before merging two sorted arrays.

* (index-reverse)  Force changes to disk on close, reduce logging.

* (index-reverse)  Clean up merging process and add back logging

* (run)  Add a conservative default for INDEX_CONSTRUCTION_PROCESS_OPTS's parallelism as it eats a lot of RAM

* (index-reverse)  Better logging during processing

* (array) 2GB+ compatible write() function

* (array) 2GB+ compatible write() function

* (index-reverse) We are logging like Bolsonaro and I will not have it.

* (reverse-index) Self-diagnostics

* (btree) Fix bug in btree reader to do with large data sizes
2023-10-07 10:00:00 +02:00
Viktor Lofgren
e498c6907a (forward-index) Don't leak off heap memory 2023-10-05 21:22:13 +02:00
Viktor Lofgren
08e8fc6736 (index-journal) Thread safe IndexJournalReadEntry 2023-10-05 19:39:09 +02:00
Viktor Lofgren
f6e9ef6de9 (array) Fix transferFrom() so it survives larger than 2 GB transfers 2023-10-04 13:57:36 +02:00
Viktor Lofgren
c51159672e (build) Move unit test configuration to root build.gradle 2023-10-04 12:46:22 +02:00
Viktor Lofgren
233b51e29e (test) flag DomainTypesTest as Slow to exclude from regular CI 2023-10-04 12:23:10 +02:00
Viktor Lofgren
54c8e13a68 (term-frequency-dict) Fix memory leak in TermFrequencyDict 2023-10-04 11:55:11 +02:00
Viktor Lofgren
405300b4b2 (control) Fix bug where finishing one process ad hoc task would remove all other tasks from the db 2023-10-04 11:44:31 +02:00
Viktor Lofgren
40768e935b (test) Removing /tmp-guardrails as it doesn't hold in CI 2023-10-02 16:52:59 +02:00
Viktor Lofgren
13ee31770a (file storage) Make it possible to override the value returned by getFileStorage(type) with a JVM property. 2023-10-01 12:57:53 +02:00
Viktor Lofgren
93dc80000c (bugfix) Fix NPE in KeywordExtractor due to bad SoftReference handling 2023-09-26 17:16:41 +02:00
Viktor Lofgren
e0cd3cd991 (converter) Alter StackexchangeSideloader's summary length to align with the rest of the system. 2023-09-26 12:19:43 +02:00
Viktor Lofgren
81ae501e73 (converter) Use ThreadLocalSentenceExtractorProvider for PlainText plugin as well 2023-09-25 18:28:34 +02:00
Viktor Lofgren
9b781f8404 (keyoword-extractor) Address very rare race condition in memoization logic 2023-09-25 18:28:04 +02:00
Viktor Lofgren
f797a92f87 (converter, minor) Use domain name in task heartbeat progress 2023-09-25 18:27:04 +02:00
Viktor Lofgren
ec6c9bca62 (common) Fix factual error in comments 2023-09-24 19:40:19 +02:00
Viktor Lofgren
a433bbbe45 (converter) Fix rare sentence extractor bug
It was caused by non-thread safe concurrent memory access in SentenceExtractor.
2023-09-24 19:39:48 +02:00
Viktor Lofgren
8ca20f184d (keyword-extraction) Chasing my tail looking for a bug 2023-09-24 19:39:48 +02:00
Viktor Lofgren
d160954080 (index) Two useful debug endpoints 2023-09-24 19:39:48 +02:00
Viktor Lofgren
14372e0ef0 (index) Slightly reduce alloc churn 2023-09-24 19:36:14 +02:00
Viktor Lofgren
03bffa27ac (search) Add combined id to the search result HTML 2023-09-24 19:34:35 +02:00
Viktor Lofgren
028b5a4f0d (minor performance) Reduce GC churn in index 2023-09-24 12:12:08 +02:00
Viktor Lofgren
cd12f49fc0 (long-array) Return slices SegmentLongArray of itself for range() &c 2023-09-24 11:31:54 +02:00
Viktor Lofgren
1bd146fb8e (minor) Remove dead code 2023-09-24 10:55:20 +02:00
Viktor Lofgren
5f6c3da7a4 (index) Add close methods on the index readers so they clean up their mmaps 2023-09-24 10:54:23 +02:00
Viktor Lofgren
d0aa754252 (long-array) Implement java.lang.foreign.Arena based lifecycle control for LongArray.
Further de-ByteBuffer:ing of these classes is to be done, but this is the smallest most urgently needed benefit.

This commit is a WIP but in a fully working state, pushing due to the importance of the changes to offer lifecycle control over mmaps.
2023-09-24 10:40:06 +02:00
Viktor Lofgren
dbe9235f3a (*) Upgrade to JDK21 with preview enabled.
... also move some common configuration into the root build.gradle-file.

Support for JDK21 in lombok is a bit sketchy at the moment, but it seems to work.  This upgrade is kind of important as the new index construction really benefits from Arena based lifecycle control over off-heap memory.
2023-09-24 10:38:59 +02:00
Viktor Lofgren
d78569986b (backups) Fix bug where backup service would zero the linkdb when restoring. 2023-09-22 18:34:34 +02:00
Viktor Lofgren
95323e6caa (backups) Support restore multi-source load data 2023-09-22 18:34:17 +02:00
Viktor Lofgren
f809d22fc6 (loader) Support simultaneous loading of multiple processed data sets 2023-09-22 13:14:58 +02:00
Viktor Lofgren
10cad3abb2 (dating) Implementing @samstorment's fantastic design polish 2023-09-21 15:19:50 +02:00
Viktor Lofgren
9338f35cd8 (doc) Remove confusingly outdated ER-diagrams 2023-09-21 15:08:27 +02:00
Viktor Lofgren
ad660cf420 (converter) Bugfix: Don't try to Path.of() on optional field 2023-09-21 13:27:09 +02:00
Viktor Lofgren
75f8ae2815 (file-storage) Use human-readable timestamps in the names of file storage directories 2023-09-21 13:22:53 +02:00
Viktor Lofgren
70aa04c047 (converter, stackexchange-xml) Add the ability to sideload stackexchange data 2023-09-21 12:48:33 +02:00
Viktor Lofgren
4aa47e87f2 (blocking-thread-pool) Add isTerminated convenience function 2023-09-21 12:47:41 +02:00
Viktor Lofgren
f8050816ac (search) Don't run LSH deduplication on details with zero lsh to support not calculating this hash. 2023-09-21 12:47:02 +02:00
Viktor Lofgren
5b0a6d7ec1 (stackexchange-converter) Create tool for converting stackexchange 7z-files to digestible sqlite db:s 2023-09-20 15:15:13 +02:00
Viktor Lofgren
3b4d08f52b (stackexchange-integration) Add better comments 2023-09-20 14:43:06 +02:00
Viktor Lofgren
6bbf40d7d2 (stackexchange-integration) Tools for reading stackexchange xml files 2023-09-20 14:17:33 +02:00
Viktor Lofgren
d895f83520 (blocking-thread-pool) Move DumbThreadPool to its own micro-library
Also rename it to SimpleBlockingThreadPool.
2023-09-20 10:11:49 +02:00
Viktor Lofgren
f6b9e8c5eb (converter) JavadocSpecialization should truncate its summary if it gets too long 2023-09-17 16:25:33 +02:00
Viktor Lofgren
98bcdf6028 (converter) DirtreeSideloader now trims /index.html from the URL if present
This is a crawler artifact in 9 cases out of 10, and may lead to bad URLs.
2023-09-17 16:08:16 +02:00
Viktor Lofgren
9b385ec7cc (converter) Make it possible to sideload documents from a directory tree 2023-09-17 14:35:06 +02:00
Viktor Lofgren
5c040f7a46 (crawl-spec) Parquetify crawl spec
* Crawl-specs are now parquet files
* Deprecate the crawl-job-extractor tool
2023-09-17 09:41:34 +02:00
Viktor Lofgren
c67d95c00f (converter) Write dummy processor log when sideloading 2023-09-14 14:13:03 +02:00
Viktor Lofgren
5e5aaf9a7e (converter, control) Re-enable sideloading encyclopedia data 2023-09-14 12:12:07 +02:00
Viktor Lofgren
35996d0adb (docs) Update the documentation up-to-date information 2023-09-14 11:33:36 +02:00
Viktor Lofgren
eaeb23d41e (refactor) Remove converting-model package completely 2023-09-14 11:21:44 +02:00
Viktor Lofgren
c71f6ad417 (converter) Add heartbeats to the loader processes and execute the tasks in parallel for a ~2X speedup 2023-09-14 10:11:57 +02:00
Viktor Lofgren
87a8593291 (work-log) Fix bug where items weren't added to the current batch on logItem 2023-09-14 10:11:04 +02:00
Viktor Lofgren
4799dd769e (converting) WIP begin to remove converting-model and the old InstructionsCompiler 2023-09-13 19:18:58 +02:00
Viktor Lofgren
24b4606f96 (converter,loader) Converter outputs parquet files instead of compressed json. 2023-09-13 16:13:41 +02:00
Viktor Lofgren
064bc5ee76 (processed-data) New parquet-serializable models for converter output 2023-09-11 14:08:40 +02:00
Viktor Lofgren
a52d78c8ee (work-log) New batching work log 2023-09-11 14:08:08 +02:00
Viktor Lofgren
07d7507ac6 (control-service) Move Actions up in storage-details
Papercut fix. If a file storage area has a lot of files, you have to scroll down a long way to get to the actions otherwise.
2023-09-02 15:41:55 +02:00
Viktor Lofgren
c68d17d482 (keyword-extraction) Fix bug leading to position data missing on some keywords.
This was due to a discrepancy between the KeywordPositionBitmask and WordsTfIdfCounts' concept of a keyword.
2023-09-02 14:48:55 +02:00
Viktor Lofgren
9e185e80ce (control-service) Add timestamp to file storages. 2023-09-02 14:01:04 +02:00
Viktor Lofgren
676e7c7947 (keywords) Add Serializable properties that went missing as the record became a class 2023-09-02 09:52:01 +02:00
Viktor Lofgren
04212b2cef (btree) Add more consistent asserts on sortedness 2023-09-01 15:45:02 +02:00
Viktor Lofgren
bafc2a1f30 (reverse-index) Force() final docs after being written
Unlikely to be a problem, but we want to ensure it's on dsik before we go read it later.
2023-09-01 15:43:53 +02:00
Viktor Lofgren
563e388a45 (reverse-index) Fix parallel documents sorting bug
Bug was caused by parallel sorting capturing the iterator rather than the offsets to sort.
2023-09-01 15:42:45 +02:00
Viktor Lofgren
d31d8ec5b0 (index) Log keyword ids on hex format 2023-09-01 15:40:24 +02:00
Viktor Lofgren
2b00cd632d (process) Propagate environment JVM params to the index constructor 2023-09-01 15:39:42 +02:00
Viktor Lofgren
5f427d2b4c (keywords) Clean up leaky abstractions, clean up tests 2023-09-01 13:52:00 +02:00
Viktor Lofgren
8c0ce4fc1d (index journal; minor) Clean up 2023-09-01 11:32:24 +02:00
Viktor Lofgren
10a74f45ea (index journal; minor) Even cleaner separation of concerns. 2023-09-01 11:28:02 +02:00
Viktor Lofgren
320dad7f1a (index journal) Fix leaky abstraction in IndexJournalReader.
The caller shouldn't be required to know the on-disk layout of the file to make use of the data in a performant way.
2023-09-01 11:18:13 +02:00
Viktor Lofgren
88ac72c8eb (journal/reverse index) Working WIP fix over-allocation of documents 2023-08-31 20:16:02 +02:00
Viktor Lofgren
f74b9df0a7 (array) Don't use paging arrays when mapping small files for writing 2023-08-31 20:15:10 +02:00
Viktor Lofgren
a6f1335375 (loader) Fix bugfix where the loader would omit some meta and words. 2023-08-31 17:48:43 +02:00
Viktor Lofgren
f321fa5ad3 (array) Override to Paging...Array$range()
This is a big performance boost in array.range().get().

Without an override, each access will go through pages[page].get(...) for each get()-operation.  This adds up very quickly.  BTreeReader does a bunch of get():s on a range()'d array during traversal in the queryData... methods.
2023-08-31 13:52:29 +02:00
Viktor Lofgren
03d999444d (ldb) Re-add accidentally removed stmt.addBatch that breaks 2023-08-31 12:06:30 +02:00
Viktor Lofgren
763ed260c3 (ldb) Better handling of null pubYear 2023-08-30 23:08:27 +02:00
Viktor Lofgren
764e7d1315 (index) Add more comprehensive integration tests for the index service. 2023-08-30 10:37:24 +02:00
Viktor Lofgren
048f685073 (ldb) add OR IGNORE to insert status query
Otherwise it will sometimes fail because documents may appear more than once in error scenarios.
2023-08-30 10:34:01 +02:00
Viktor Lofgren
e4d7958379 (control) ProcessLivenessMonitorActor shouldn't reap tasks based on service instance liveness 2023-08-29 18:19:04 +02:00
Viktor Lofgren
3f288e264b (minor) Clean up dead endpoints 2023-08-29 17:04:54 +02:00
Viktor Lofgren
dd593c292c (loader) Minor optimizations and bugfixes.
* Reduce memory churn in LoaderIndexJournalWriter, fix bug with keyword mappings as well
* Remove remains of OldDomains
* Ensure LOADER_PROCESS_OPTS gets fed to the processes
* LinkdbStatusWriter won't execute batch after each added item post 100 items
2023-08-29 15:37:52 +02:00
Viktor Lofgren
39c1857c61 (heartbeat, reverse-index) Better heartbeat mocking, improved heartbeats for reverse index construction. 2023-08-29 13:07:55 +02:00
Viktor Lofgren
c57a2d0dc3 (control-service) Remove old index journal files when restoring a backup. 2023-08-29 11:58:01 +02:00
Viktor Lofgren
a2e6616100 (index-reverse) Add documentation and clean up code. 2023-08-29 11:35:54 +02:00
Viktor Lofgren
ba4513e82c (loader) Revert accidental experimental changes that slipped by in an earlier commit 2023-08-28 19:54:56 +02:00
Viktor Lofgren
6525b16e1f (minor) Improved logging and error messages 2023-08-28 19:53:55 +02:00
Viktor Lofgren
b6a92506d1 (index) Hook in missing DocIdRewriter
This enables documents to be ranked properly.
2023-08-28 19:53:43 +02:00
Viktor Lofgren
ffa0366deb (minor) Fix typo in ActorStateMachine's logging 2023-08-28 16:11:52 +02:00
Viktor Lofgren
00c4686ef0 (reverse-index) Fix over-allocation of the count array in merging 2023-08-28 14:36:28 +02:00
Viktor Lofgren
3101b74580 (index) Move to a lexicon-free index design
This is a system-wide change.  The index used to have a lexicon, mapping words to wordIds using a large in-memory hash table.   This made index-construction easier, but it
also added a fairly significant RAM penalty to both the index service and the loader.

The new design moves to 64 bit word identifiers calculated using the murmur hash of the keyword, and an index construction based on merging smaller indices.

It also became necessary half-way through to upgrade guice as its error reporting wasn't *quite* compatible with JDK20.
2023-08-28 14:02:23 +02:00
Viktor Lofgren
194a6057dd (index,control) Recoverable index backups 2023-08-25 14:57:43 +02:00
Viktor Lofgren
e710e057e2 (db) Remove EC_URL and EC_PAGE_DATA from mariadb database 2023-08-25 13:45:03 +02:00
Viktor Lofgren
28188a6e59 (control) Simplify ConvertAndLoadActor 2023-08-25 13:30:20 +02:00
Viktor Lofgren
70a5df96c8 (control) Display progress of process tasks 2023-08-25 13:05:21 +02:00
Viktor Lofgren
460998d512 (index) Move index construction to separate process.
This provides a much cleaner separation of concerns, and makes it possible to get rid of a lot of the gunkier parts of the index service.  It will also permit lowering the Xmx on the index service a fair bit, so we can get CompressedOOps again :D
2023-08-25 12:52:54 +02:00
Viktor Lofgren
e741301417 (search) Remove endpoint flush-search-caches
It's not necessary anymore with the new linkdb.
2023-08-25 09:51:06 +02:00
Viktor Lofgren
5ed5298409 (converter) Update confusing state description
SWAP_LEXICON doesn't instruct the index service to do anything.  It just moves the file.
2023-08-24 18:56:49 +02:00
Viktor Lofgren
b911665691 (index) Clean up and optimize valuator 2023-08-24 18:34:06 +02:00
Viktor Lofgren
56eb83319d (index) Clean up result domain deduplicator 2023-08-24 18:24:55 +02:00
Viktor Lofgren
1e6800565a (system) Remove EdgeId<T> and similar objects
They seemed like a good idea at the time, but in practice they're wasting resources and not really providing the clarity I had hoped.
2023-08-24 17:46:02 +02:00
Viktor Lofgren
c909120ae1 (search) Basic working integration of linkdb in search service 2023-08-24 17:24:56 +02:00
Viktor Lofgren
9894f37412 (index) Implement new URL ID coding scheme.
Also refactor along the way.  Really needs an additional pass, these tests are very hairy.
2023-08-24 16:44:27 +02:00
Viktor Lofgren
6a04cdfddf (loader) Implement new linkdb in loader
Deprecate the LoadUrl instruction entirely. We no longer need to be told upfront about which URLs to expect, as IDs are generated from the domain id and document ordinal.

For now, we no longer store new URLs in different domains.  We need to re-implement this somehow, probably in a different job or a as a different output.
2023-08-24 13:07:54 +02:00
Viktor Lofgren
c70670bacb (common) New UrlIdCodec class
Have a single class responsible for encoding and decoding URL ids, as it's a bit finicky and used all over.
2023-08-24 11:41:07 +02:00
Viktor Lofgren
7bb3e44a76 (common) Deprecate EdgeId and similar 2023-08-24 11:16:28 +02:00
Viktor Lofgren
b958acb76a (file-storage) New File Storage type for linkdb 2023-08-24 09:06:13 +02:00
Viktor Lofgren
b22f4fbb72 (linkdb) New Module for sqlite-backed document db 2023-08-24 09:06:13 +02:00