CatgirlIntelligenceAgency

Author	SHA1	Message	Date
Viktor Lofgren	3fd2a83184	* Extract the search-query function	2024-02-22 15:27:39 +01:00
Viktor Lofgren	66c1281301	(zk-registry) epic yak shaving WIP Cleaning out a lot of old junk from the code, and one thing led to another... * Build is improved, now constructing docker images with 'jib'. Clean build went from 3 minutes to 50 seconds. * The ProcessService's spawning is smarter. Will now just spawn a java process instead of relying on the application plugin's generated outputs. * Project is migrated to GraalVM * gRPC clients are re-written with a neat fluent/functional style. e.g. ```channelPool.call(grpcStub::method) .async(executor) // <-- optional .run(argument); ``` This change is primarily to allow handling ManagedChannel errors, but it turned out to be a pretty clean API overall. * For now the project is all in on zookeeper * Service discovery is now based on APIs and not services. Theoretically means we could ship the same code as either a monolith or a service mesh. * To this end, began modularizing a few of the APIs so that they aren't strongly "living" in a service. WIP! Missing is documentation and testing, and some more breaking apart of code.	2024-02-22 14:01:23 +01:00
Viktor Lofgren	ee8e0497ae	(refac) Move service discovery injection to a separate guice module	2024-02-20 15:41:04 +01:00
Viktor Lofgren	0307c55f9f	(refac) Zookeeper for service-discovery, kill service-client lib (WIP) To avoid having to either hard-code or manually configure service addresses (possibly several dozen), and to reduce the project's dependency on docker to deal with routing and discovery, the option to use [Zookeeper](https://zookeeper.apache.org/) to manage services and discovery has been added. A service registry interface was added, with a Zookeeper implementation and a basic implementation that only works on docker and hard-codes everything. The last remaining REST service, the assistant-service, has been migrated to gRPC. This also proved a good time to clear out primordial technical debt from the root of the codebase. The 'service-client' library has been taken behind the barn and given a last farewell. It's replaced by a small library for managing gRPC channels. Since it's no longer used by anything, RxJava has been removed as a dependency from the project. Although the current state seems reasonably stable, this is a work-in-progress commit.	2024-02-20 11:41:14 +01:00
Viktor Lofgren	8cb5825617	(search) Temporarily disable the Popular filter This filter currently does not distinguish itself very much from the unfiltered results, and lends the impression that the filters don't "do anything". It may come back in some shape or form in the future, with some additional tweaking of the rankings...	2024-02-18 08:02:01 +01:00
Viktor Lofgren	a175b36382	(search) Correct accidental regression of the SmallWeb filter	2024-02-15 18:16:56 +01:00
Viktor Lofgren	16526d283c	(search) Correct accidental regression of the Vintage filter	2024-02-15 18:13:34 +01:00
Viktor Lofgren	752e677555	(search) Expose getSearchTitle in DecoratedSearchResults	2024-02-15 13:56:44 +01:00
Viktor Lofgren	f796af1ae8	(search) Fix failed refactoring	2024-02-15 13:53:19 +01:00
Viktor Lofgren	2515993536	(search) Fix issue where searchTitle setting gets lost when searching again It's important that the field names in SearchParameters match the fields referenced in search-form.hdb, otherwise they will get lost in transit.	2024-02-15 13:52:11 +01:00
Viktor Lofgren	66b3e71e56	(search) Expose more search options This change set updates the query APIs to enable the search service to add additional criteria, such as QueryStrategy and TemporalBias. The QueryStrategy makes it possible to e.g. require that a match is in the title of a result, and TemporalBias enables penalizing results that are not within a particular time period. These options are added to the search interface. The old 'recent results' is modified to use TemporalBias, and a new filter 'Search In Title' is added as well. The vintage filter is modified to add a temporal bias for the past.	2024-02-15 13:39:51 +01:00
Viktor Lofgren	3d54879c14	(API, minor) Clean up comments.	2024-02-14 12:09:16 +01:00
Viktor Lofgren	e17fcde865	(API, minor) Remove unnecessary inject.	2024-02-14 12:05:50 +01:00
Viktor Lofgren	6950dffcb4	(API) Fix result order in API results These results should be presented in the same order as their ranking score.	2024-02-14 11:47:14 +01:00
Viktor Lofgren	7564dfeb7a	(minor) Correct link in documentation for app services	2024-02-12 15:55:06 +01:00
Viktor Lofgren	10bad635a8	(search) Experimental support for clustering search results Improves clustering of results.	2024-02-11 20:00:11 +01:00
Viktor Lofgren	7cc8b0fed5	(search) Experimental support for clustering search results Improves clustering of results.	2024-02-11 19:58:55 +01:00
Viktor Lofgren	a77846373b	(search) Experimental support for clustering search results Improves clustering of results.	2024-02-11 19:48:55 +01:00
Viktor Lofgren	bcd0dabb92	(search) Experimental support for clustering search results Adds experimental support for clustering search results by e.g. domain. At a first stage, this is only enabled for the wiki and forum filters. The commit also cleans up the UrlDetails class, which contained a number of vestigial entries.	2024-02-11 17:31:38 +01:00
Viktor Lofgren	ef261cbbd7	(search) Remove stray spaces in bang commands	2024-02-08 14:46:18 +01:00
Conor Flynn	9d7df87886	(search) Fix broken !ddg handling https://duckduckgo.com/search?q=asdf leads to running a search for the term "search" instead of "asdf". Both https://duckduckgo.com/<query> and https://duckduckgo.com/?q=<query> are accepted, but using GET vars seemed more in keeping with the code.	2024-02-08 13:28:02 +01:00
Viktor Lofgren	a4b2323ca3	(search) Change default search profile to No Filter Recent changes to the result ranking mean the no filter mode returns sufficiently good results for most queries that filtering by default just makes the search results more restricted.	2024-02-08 13:04:05 +01:00
Viktor Lofgren	d83a3bf4e2	(search) Fix broken !w handling Printf format error derp.	2024-02-08 12:11:33 +01:00
Viktor Lofgren	f2b39ad055	(search) Fix broken !bang handling !bang query handling seems to have fallen victim to an overzealous refactoring effort, and broken. It's now repaired, and a test is in place to ensure we know if it breaks again.	2024-02-08 12:05:09 +01:00
Viktor Lofgren	5a62b3058f	(query-api) Make the search set identifier a string value in the API This will free the core marginalia search engine to use arbitrary search set definitions, while the app can use its hardcoded defaults.	2024-01-16 10:55:24 +01:00
Viktor Lofgren	07a916a720	(search) Give the swipe hint on mobile a nicer finish	2024-01-13 18:51:54 +01:00
Viktor Lofgren	7c6e18f7a7	(*) Overhaul settings and properties Use a system.properties file to configure the system. This is loaded statically by MainClass or ProcessMainClass. Update the property names to be more consistent, and update the documentation to reflect the changes.	2024-01-13 17:12:18 +01:00
Viktor Lofgren	708a741960	(test) Clean up test usage of migrations Several tests were manually running migrations in a large copy-paste blob of code. This makes the tests less useful, as it's possible to break the code while keeping the tests green by introducing a new migration that never gets run in the tests, and it's also difficult to reason about what the tests are doing. A new test helper library is introduced with a TestMigrationLoader that can either run Flyway migrations or load specific migrations in cases where a specific set of migrations needs to be loaded. Existing tests are migrated to use the new code.	2024-01-12 15:55:50 +01:00
Viktor Lofgren	734996002c	(*) install script for deploying Marginalia outside the codebase The changeset also makes the control service responsible for flyway migrations. This helps reduce the number of places the database configuration needs to be spread out. These automatic migrations can be disabled with -DdisableFlyway=true. The commit also adds curl to the docker container, to enable docker health checks and interdependencies.	2024-01-11 12:40:03 +01:00
Viktor	fad9575154	Merge pull request #69 from MarginaliaSearch/converter-optimizations Refactor the DomainProcessor to take advantage of the new crawl data format	2024-01-10 09:46:54 +01:00
Viktor Lofgren	97e11e1ac9	(search) Fix acknowledgement page for domain complaints rendering as plain text This was caused by incorrect usage of the renderInto() function, which was always buggy and should never be used. This method is removed with this change.	2024-01-10 09:37:40 +01:00
Viktor Lofgren	e6a1e164b2	(search) Swap swipe direction for more consistent experience	2024-01-10 09:37:40 +01:00
Viktor Lofgren	e4f8f81e89	(search) Mobile UX improvements. Swipe right to show filter menu. Fix CSS bug that caused parts of the menu to not have a background.	2024-01-10 09:37:39 +01:00
Viktor Lofgren	176b3bb526	(search) Toggle for showing recent results Actually persist the value of the toggle between searches too...	2024-01-10 09:37:39 +01:00
Viktor Lofgren	b07752fa9b	(search) Toggle for showing recent results Will by default show results from the last 2 years. May need to tune this later.	2024-01-10 09:37:39 +01:00
Viktor Lofgren	68fd0efbde	(search) Clean up search results template Rendering is very slow. Let's see if this has a measurable effect on latency.	2024-01-10 09:37:39 +01:00
Viktor Lofgren	c80d3eb812	(search) Remove dead code	2024-01-10 09:37:35 +01:00
Viktor Lofgren	f9320995d6	(search) When clicking asn-links, show results from the unfiltered view...	2024-01-10 09:37:13 +01:00
Viktor Lofgren	f592c9f04d	(search) Fix acknowledgement page for domain complaints rendering as plain text This was caused by incorrect usage of the renderInto() function, which was always buggy and should never be used. This method is removed with this change.	2024-01-10 09:26:34 +01:00
Viktor Lofgren	bd7970fb1f	(search) Swap swipe direction for more consistent experience	2024-01-09 13:38:40 +01:00
Viktor Lofgren	c47730f2cc	(search) Mobile UX improvements. Swipe right to show filter menu. Fix CSS bug that caused parts of the menu to not have a background.	2024-01-09 13:30:30 +01:00
Viktor Lofgren	41cccfd2aa	(search) Toggle for showing recent results Actually persist the value of the toggle between searches too...	2024-01-09 11:36:49 +01:00
Viktor Lofgren	aff690f7d6	(search) Toggle for showing recent results Will by default show results from the last 2 years. May need to tune this later.	2024-01-09 11:28:36 +01:00
Viktor Lofgren	d4b0539d39	(search) Clean up search results template Rendering is very slow. Let's see if this has a measurable effect on latency.	2024-01-08 20:57:40 +01:00
Viktor Lofgren	cb55273769	(search) When clicking asn-links, show results from the unfiltered view...	2024-01-08 20:02:19 +01:00
Viktor Lofgren	edc1acbb7e	(*) Replace EC_DOMAIN_LINK table with files and in-memory caching The EC_DOMAIN_LINK MariaDB table stores links between domains. This is problematic, as both updating and querying this table are very slow in relation to how small the data is (~10 GB). This slowness is largely caused by the database enforcing ACID guarantees we don't particularly need. This changeset replaces the EC_DOMAIN_LINK table with a file in each index node containing 32 bit integer pairs corresponding to links between two domains. This file is loaded in memory in each node, and can be queried via the Query Service. A migration step is needed before this file is created in each node. Until that happens, the actual data is loaded from the EC_DOMAIN_LINK table, but accessed as though it was a file. The changeset also migrates/renames the links.db file to documents.db to avoid naming confusion between the two.	2024-01-08 15:53:13 +01:00
Viktor Lofgren	9e3386dbbb	(search) Fetch fewer results per page This is a test to evaluate how this impacts load times.	2024-01-05 13:22:13 +01:00
Viktor Lofgren	343ea9c6d8	(search) Fetch fewer results per page This is a test to evaluate how this impacts load times.	2024-01-04 13:18:07 +01:00
Viktor Lofgren	c70f508ae8	(prometheus) Saner histogram buckets	2024-01-02 17:13:14 +01:00
Viktor Lofgren	72b773f06d	(search) fix search metrics labeling	2024-01-02 15:46:14 +01:00

133 Commits