Viktor Lofgren
ffadfb4149
(control) Use a partial template for the storage types tabs.
2023-10-31 17:12:14 +01:00
Viktor Lofgren
b7e38cfbae
(control) Add exports view
2023-10-31 17:08:48 +01:00
Viktor Lofgren
659743b39c
(executor) Export Data actor allocates its own storage
2023-10-31 17:04:07 +01:00
Viktor Lofgren
69758c5859
(control) Nicer redirects acknowledging actions
2023-10-31 16:26:29 +01:00
Viktor Lofgren
2871a326e6
(ctrl/exe) Clean up UX and code
2023-10-29 14:09:39 +01:00
Viktor Lofgren
abb42f0f36
(crawler) Fix bug in SQL statement
...
Arguments were in the wrong order in inserting fetching sites submitted to be crawled
2023-10-29 13:19:17 +01:00
Viktor Lofgren
88f49834fd
(docs) Update documentation
2023-10-27 12:45:39 +02:00
Viktor Lofgren
c7cb6664b4
(control) Indicate missing services with danger-color instead of having a distracting and constantly updating last-seen number
2023-10-26 18:05:22 +02:00
Viktor Lofgren
79adba9284
(index) Fix bug in dealing with quoted search terms
2023-10-26 16:28:23 +02:00
Viktor Lofgren
f613f4f2df
(array) Fix spurious search results
...
This was caused by a bug in the binary search algorithm causing it to sometimes return positive values when encoding a search miss.
It was also necessary to get rid of the vestiges of the old LongArray and IntArray classes to make this fix doable.
2023-10-26 15:27:02 +02:00
Viktor Lofgren
abbadc92a0
(exdecutor) Prevent TriggerAdjacencyCalculationActor from showing up in the actions tab when it isn't running
2023-10-25 21:25:07 +02:00
Viktor Lofgren
97fcbdd6d9
(control) Move storage actions into the actions tab
...
* Also disable annoying CSS animations
2023-10-25 21:23:56 +02:00
Viktor Lofgren
d7686b665e
Refactoring
...
* Encyclopedia sideloader; permit providing base URL.
* Storage base shows node id in GUI
* ProcessLivenessMonitorActor restarts automatically
* Clean-up of outbox code
2023-10-25 18:51:02 +02:00
Viktor Lofgren
84cdac83d6
(control) Move message queue monitor to control
2023-10-24 16:44:28 +02:00
Viktor Lofgren
313cc2965c
(index-creation) Print whether full or prio is created
...
Previous state of saying reverse index for both was pretty confusing.
2023-10-24 16:23:10 +02:00
Viktor Lofgren
95f74c5ea7
(control) Filter out heartbeats that are stopped
2023-10-24 16:09:28 +02:00
Viktor Lofgren
0406e76889
(api) Remove logging cruft
2023-10-24 13:39:05 +02:00
Viktor Lofgren
c2b28c0f8d
(api) Trial streaming API
2023-10-24 13:26:46 +02:00
Viktor Lofgren
a860f8f1a8
(index/qs) GRPC API for better query peformance
2023-10-24 11:38:07 +02:00
Viktor Lofgren
2ed2f35a9b
(actor) Rewrite of the actor prototype class using record pattern matching
2023-10-23 10:18:20 +02:00
Viktor Lofgren
119151cad3
(converter) Separtion of concerns
2023-10-22 14:35:33 +02:00
Viktor Lofgren
758f9b5aa5
(converter) Get UUID pips out of the models
...
Rendering concerns shouldn't be in the models, it's poor separation of concerns and very difficult to follow.
2023-10-22 14:24:52 +02:00
Viktor Lofgren
eb4158df0b
(control) Fix start/stop FSM endpoints
2023-10-22 14:03:09 +02:00
Viktor Lofgren
12fda1a36b
(control) Temporarily re-writing the data balancer to get it to work in prod
...
Need to clean this up later.
2023-10-22 14:03:09 +02:00
Viktor Lofgren
e927f99777
(control) JSON serializes Map<Integer> to Map<Double> and Java gets confused
2023-10-21 16:24:20 +02:00
Viktor Lofgren
044bcf55bd
(control) Fix SQL in rebalance actor
2023-10-21 16:13:37 +02:00
Viktor Lofgren
e475af9f49
(control) Initialize controlActorService
2023-10-21 16:06:53 +02:00
Viktor Lofgren
c6abcd91fa
(control) Better use of FS states, fix bug with start/stop actors
2023-10-20 16:37:49 +02:00
Viktor Lofgren
d76d926c38
(control/executor) Add new configuration options for node
...
It's now possible to configure prod instance to not retain processed data.
2023-10-20 14:05:19 +02:00
Viktor Lofgren
2b3c167845
(controller) Additional configuration options for node
2023-10-20 13:13:36 +02:00
Viktor Lofgren
584bb3a648
(fs) interface cleanup
2023-10-20 12:24:18 +02:00
Viktor Lofgren
7b5ec6b98f
(executor-service) Embed dist/ in executor-service's docker image
2023-10-19 17:48:34 +02:00
Viktor Lofgren
23526f6d1a
(executor) Executor service now pulls DomainType list for CRAWL on "recrawl"
...
This is an automatic integration with the submit-site repo on github and also
crawl-queue.
2023-10-19 17:48:34 +02:00
Viktor Lofgren
809b3ee023
(control) Update GUI for crawl specs. They are now less important than they were before.
2023-10-19 17:48:34 +02:00
Viktor Lofgren
23f0c79fba
(control) GUI for data sets/domain types.
2023-10-19 17:48:34 +02:00
Viktor Lofgren
81dd3809e9
(*) WIP Add node affinity to EC_DOMAIN
...
Very messy commit due to fractalline yak shaving
2023-10-19 17:48:34 +02:00
Viktor Lofgren
978550f809
(executor-service) Retire features-convert and move the corresponding packages into the executor service.
2023-10-16 15:43:46 +02:00
Viktor Lofgren
84fea0fd05
(node) Nodes auto-start their monitor actors.
2023-10-16 15:33:22 +02:00
Viktor Lofgren
2df3e0f881
(node) Nodes auto-configure on start-up instead of requiring manual configuration.
2023-10-16 14:46:35 +02:00
Viktor Lofgren
ede5d1f890
(actor) Give process spawners more easily recognizable names.
2023-10-16 14:19:00 +02:00
Viktor Lofgren
39911e3acd
(control) Fix incorrect storage base and clean up GUI for data
2023-10-16 13:30:26 +02:00
Viktor Lofgren
8dafd13cd7
(client) Fix executor tests
2023-10-16 12:02:57 +02:00
Viktor Lofgren
c245f7ce3a
(control) Bootstrapify review-domains and search-to-ban views.
2023-10-15 22:04:23 +02:00
Viktor Lofgren
607d647483
(control) Remove services listing view
2023-10-15 21:48:55 +02:00
Viktor Lofgren
9a38a455c9
(control/exec) File listings in control GUI
2023-10-15 19:15:44 +02:00
Viktor Lofgren
16e0738731
(*) Get multi-node routing working.
2023-10-15 18:38:30 +02:00
Viktor Lofgren
eacbf87979
(control) New list and form for index nodes.
2023-10-14 21:46:52 +02:00
Viktor Lofgren
108b4cb648
(service) Keep disabled multi-noded services dormant when they are configured to be disabled.
2023-10-14 20:58:55 +02:00
Viktor Lofgren
6308a8dfcd
(control) Node configuration
2023-10-14 16:47:52 +02:00
Viktor Lofgren
4baf9527d7
(*) WIP Control GUI redesign, executor-service, multi-node mq
...
This turned out to be very difficult to do in small isolated steps.
* Design overhaul of the control gui using bootstrap
* Move the actors out of control-service into to a new executor-service, that can be run on multiple nodes
* Add node-affinity to message queue
2023-10-14 12:08:43 +02:00
Viktor Lofgren
199c459697
(*) Add node-affinity to services, processes and file storage.
2023-10-10 12:32:22 +02:00
Viktor Lofgren
61288c5e68
(service, client) First steps towards multiple nodedness
2023-10-09 22:13:27 +02:00
Viktor Lofgren
6319b8ef51
(api-service) Improved testability, always set content type to application/json
2023-10-09 15:39:34 +02:00
Viktor Lofgren
397a85eaa4
(query-service) Apply blacklisting to search results
2023-10-09 15:18:53 +02:00
Viktor Lofgren
3889c4bdd9
(refactor) Remove features-search and update documentation
2023-10-09 15:12:30 +02:00
Viktor Lofgren
c899f1cb85
(docs) Update documentation to reflect new query service
2023-10-09 14:56:59 +02:00
Viktor Lofgren
d8956c51d0
(refactor) Remove api:search-api
...
Application services should not have an API, but purely act as clients
to the core services (which should always have an API).
2023-10-09 14:42:33 +02:00
Viktor Lofgren
c0e61d4c87
(refactor) Move search service into services-satellite
2023-10-09 13:40:01 +02:00
Viktor Lofgren
97e17282ab
(query-service) Move query parsing from search-service to the new query service.
2023-10-09 13:27:44 +02:00
Viktor Lofgren
94c882af7d
(query-service) Provide delegate of IndexApi's query functionality.
...
This is an intermediate step in the process of introducing the query-service as a proxy between search and index.
2023-10-08 22:22:26 +02:00
Viktor Lofgren
89c6d85f2f
(query-service) Create new empty 'query-service' service
2023-10-08 17:31:50 +02:00
Viktor Lofgren
cf366c602f
(search) Refactor SearchQueryIndexService in preparation for feature extraction.
...
Prefer working on DecoratedSearchResultItem in favor of UrlDetails.
2023-10-08 17:15:41 +02:00
Viktor Lofgren
77ccab7d80
(index) Move linkdb to index from search.
...
This makes index complete in the sense that you can deploy an index instance and build a complete separate application on top of it, without having to go through the Marginalia-laden search service.
2023-10-08 16:48:35 +02:00
Viktor Lofgren
f51ba63742
(search) Remove dead file
2023-10-07 21:05:06 +02:00
Viktor Lofgren
9044518be5
(search) Fix broken link to git repo
2023-10-07 19:43:22 +02:00
Viktor Lofgren
9e0367eef4
(search) Filter blacklisted items in API query service as well
2023-10-07 16:16:04 +02:00
Viktor Lofgren
235bb6c1b9
(control) Administrative QOL improvement, GUI for banning spam
2023-10-07 15:45:50 +02:00
Viktor Lofgren
49344d7ea8
(control) Administrative QOL improvement, GUI for banning spam
2023-10-07 15:43:18 +02:00
Viktor Lofgren
1b418d77ff
(search) We got some new IP ranges to work with for the crawler
2023-10-07 13:41:55 +02:00
Viktor Lofgren
80cc302627
(search) We can't in claim to be on PC hardware anymore...
2023-10-07 11:49:29 +02:00
Viktor
8e1abc3f10
(index-reverse) Parallel construction of the reverse indexes. ( #52 )
...
* (index-reverse) Parallel construction of the reverse indexes.
* (array) Remove wasteful calculation of numDistinct before merging two sorted arrays.
* (index-reverse) Force changes to disk on close, reduce logging.
* (index-reverse) Clean up merging process and add back logging
* (run) Add a conservative default for INDEX_CONSTRUCTION_PROCESS_OPTS's parallelism as it eats a lot of RAM
* (index-reverse) Better logging during processing
* (array) 2GB+ compatible write() function
* (array) 2GB+ compatible write() function
* (index-reverse) We are logging like Bolsonaro and I will not have it.
* (reverse-index) Self-diagnostics
* (btree) Fix bug in btree reader to do with large data sizes
2023-10-07 10:00:00 +02:00
Viktor Lofgren
c51159672e
(build) Move unit test configuration to root build.gradle
2023-10-04 12:46:22 +02:00
Viktor Lofgren
405300b4b2
(control) Fix bug where finishing one process ad hoc task would remove all other tasks from the db
2023-10-04 11:44:31 +02:00
Viktor Lofgren
40768e935b
(test) Removing /tmp-guardrails as it doesn't hold in CI
2023-10-02 16:52:59 +02:00
Viktor Lofgren
d160954080
(index) Two useful debug endpoints
2023-09-24 19:39:48 +02:00
Viktor Lofgren
14372e0ef0
(index) Slightly reduce alloc churn
2023-09-24 19:36:14 +02:00
Viktor Lofgren
03bffa27ac
(search) Add combined id to the search result HTML
2023-09-24 19:34:35 +02:00
Viktor Lofgren
028b5a4f0d
(minor performance) Reduce GC churn in index
2023-09-24 12:12:08 +02:00
Viktor Lofgren
1bd146fb8e
(minor) Remove dead code
2023-09-24 10:55:20 +02:00
Viktor Lofgren
5f6c3da7a4
(index) Add close methods on the index readers so they clean up their mmaps
2023-09-24 10:54:23 +02:00
Viktor Lofgren
dbe9235f3a
(*) Upgrade to JDK21 with preview enabled.
...
... also move some common configuration into the root build.gradle-file.
Support for JDK21 in lombok is a bit sketchy at the moment, but it seems to work. This upgrade is kind of important as the new index construction really benefits from Arena based lifecycle control over off-heap memory.
2023-09-24 10:38:59 +02:00
Viktor Lofgren
d78569986b
(backups) Fix bug where backup service would zero the linkdb when restoring.
2023-09-22 18:34:34 +02:00
Viktor Lofgren
95323e6caa
(backups) Support restore multi-source load data
2023-09-22 18:34:17 +02:00
Viktor Lofgren
f809d22fc6
(loader) Support simultaneous loading of multiple processed data sets
2023-09-22 13:14:58 +02:00
Viktor Lofgren
70aa04c047
(converter, stackexchange-xml) Add the ability to sideload stackexchange data
2023-09-21 12:48:33 +02:00
Viktor Lofgren
f8050816ac
(search) Don't run LSH deduplication on details with zero lsh to support not calculating this hash.
2023-09-21 12:47:02 +02:00
Viktor Lofgren
9b385ec7cc
(converter) Make it possible to sideload documents from a directory tree
2023-09-17 14:35:06 +02:00
Viktor Lofgren
5c040f7a46
(crawl-spec) Parquetify crawl spec
...
* Crawl-specs are now parquet files
* Deprecate the crawl-job-extractor tool
2023-09-17 09:41:34 +02:00
Viktor Lofgren
5e5aaf9a7e
(converter, control) Re-enable sideloading encyclopedia data
2023-09-14 12:12:07 +02:00
Viktor Lofgren
07d7507ac6
(control-service) Move Actions up in storage-details
...
Papercut fix. If a file storage area has a lot of files, you have to scroll down a long way to get to the actions otherwise.
2023-09-02 15:41:55 +02:00
Viktor Lofgren
9e185e80ce
(control-service) Add timestamp to file storages.
2023-09-02 14:01:04 +02:00
Viktor Lofgren
d31d8ec5b0
(index) Log keyword ids on hex format
2023-09-01 15:40:24 +02:00
Viktor Lofgren
2b00cd632d
(process) Propagate environment JVM params to the index constructor
2023-09-01 15:39:42 +02:00
Viktor Lofgren
764e7d1315
(index) Add more comprehensive integration tests for the index service.
2023-08-30 10:37:24 +02:00
Viktor Lofgren
e4d7958379
(control) ProcessLivenessMonitorActor shouldn't reap tasks based on service instance liveness
2023-08-29 18:19:04 +02:00
Viktor Lofgren
3f288e264b
(minor) Clean up dead endpoints
2023-08-29 17:04:54 +02:00
Viktor Lofgren
dd593c292c
(loader) Minor optimizations and bugfixes.
...
* Reduce memory churn in LoaderIndexJournalWriter, fix bug with keyword mappings as well
* Remove remains of OldDomains
* Ensure LOADER_PROCESS_OPTS gets fed to the processes
* LinkdbStatusWriter won't execute batch after each added item post 100 items
2023-08-29 15:37:52 +02:00
Viktor Lofgren
39c1857c61
(heartbeat, reverse-index) Better heartbeat mocking, improved heartbeats for reverse index construction.
2023-08-29 13:07:55 +02:00
Viktor Lofgren
c57a2d0dc3
(control-service) Remove old index journal files when restoring a backup.
2023-08-29 11:58:01 +02:00
Viktor Lofgren
6525b16e1f
(minor) Improved logging and error messages
2023-08-28 19:53:55 +02:00