Commit Graph

1814 Commits

Author SHA1 Message Date
Viktor Lofgren
9415539b38 (docs) Update docs 2024-02-28 12:25:19 +01:00
Viktor Lofgren
84bab2783d (docs) Fix fake news in docs 2024-02-28 12:16:45 +01:00
Viktor
0d6e7673e4
Merge pull request #81 from MarginaliaSearch/service-discovery
Zookeeper for service-discovery, kill service-client lib, refactor everything
2024-02-28 12:15:25 +01:00
Viktor Lofgren
d78e9e715f (misc) Fix broken tests 2024-02-28 12:12:43 +01:00
Viktor Lofgren
a8ec59eb75 (conf) Add migration warning when ZOOKEEPER_HOSTS is not set. 2024-02-28 12:09:38 +01:00
Viktor Lofgren
20fc0ef13c (gradle) Add task alias 'docker' for 'jibDockerBuild'
The change also moves the jib boilerplate to an include.
2024-02-28 11:59:15 +01:00
Viktor Lofgren
37ae8cb33c Migrate the docker compose files 2024-02-28 11:48:16 +01:00
Viktor Lofgren
9f1649636e Clean up documentation and rename domain-links to link-graph 2024-02-28 11:40:39 +01:00
Viktor Lofgren
3a65fe8917 Add offload executor to GrpcChannelPoolFactory 2024-02-27 22:08:39 +01:00
Viktor Lofgren
99a6e56e99 (index-client) Increase thread count in index client
This should be a fair bit larger than the number of index nodes
2024-02-27 22:00:29 +01:00
Viktor Lofgren
e696fd9e92 (docs) Begin un-fucking the docs after refactoring 2024-02-27 21:22:21 +01:00
Viktor Lofgren
c943954bb4 (domain-info) Reduce memory usage 2024-02-27 21:22:21 +01:00
Viktor Lofgren
eaf836dc66 (service/grpc) Reduce thread count
Netty and GRPC by default spawns an incredible number of threads on high-core CPUs, which amount to a fair bit of RAM usage.

Add custom executors that throttle this behavior.
2024-02-27 21:22:21 +01:00
Viktor Lofgren
dbf64b0987 (logs) Add the option for json logging 2024-02-27 21:22:20 +01:00
Viktor Lofgren
8d0af9548b (search) Bot mitigation
Add the ability to indicate to the search service that a request is malicious, and to poison the results by providing randomly reorered old results instead.
2024-02-27 21:22:19 +01:00
Viktor Lofgren
67aa20ea2c (array) Attempting to debug strange errors 2024-02-27 21:22:18 +01:00
Viktor Lofgren
5604e9f531 (query) Bump query length, see what happens :P 2024-02-27 21:22:17 +01:00
Viktor Lofgren
1a51ec2d69 (index) Index optimization 2024-02-27 21:22:17 +01:00
Viktor Lofgren
3eb0800742 (index) Improve granularity of candidate queue polling 2024-02-27 21:22:17 +01:00
Viktor Lofgren
427f3e922f (index) Retire count operation, clean up index code. 2024-02-27 21:22:17 +01:00
Viktor Lofgren
823ca73a3f (domain-ranking) Fix a crash during ranking the edges of the similarity graph doesn't quite match the vertices of the link graph. 2024-02-27 21:22:17 +01:00
Viktor Lofgren
7fc0d4d786 (index) Observability for query execution queues 2024-02-27 21:22:17 +01:00
Viktor Lofgren
b8e336e809 (index) Reduce time allocation a bit 2024-02-27 21:22:17 +01:00
Viktor Lofgren
9429bf5c45 (index) Clean up 2024-02-27 21:22:17 +01:00
Viktor Lofgren
f7f0100174 (build) Make docker image registry and tag configurable in root build.gradle 2024-02-25 11:08:49 +01:00
Viktor Lofgren
fc00701a1e (index) Experimental refactoring of the indexing functionality 2024-02-25 11:05:10 +01:00
Viktor Lofgren
09447f2ad2 (process service) Inherit parent's assertion status 2024-02-24 18:32:37 +01:00
Viktor Lofgren
ff0ef1eebc (cleanup) Minor cleanups 2024-02-24 15:33:56 +01:00
Viktor Lofgren
1d34224416 (refac) Remove src/main from all source code paths.
Look, this will make the git history look funny, but trimming unnecessary depth from the source tree is a very necessary sanity-preserving measure when dealing with a super-modularized codebase like this one.

While it makes the project configuration a bit less conventional, it will save you several clicks every time you jump between modules.  Which you'll do a lot, because it's *modul*ar.  The src/main/java convention makes a lot of sense for a non-modular project though.  This ain't that.
2024-02-23 16:13:40 +01:00
Viktor Lofgren
56d35aa596 (refac) Move execution API out of executor service 2024-02-23 13:26:11 +01:00
Viktor Lofgren
2201b1a506 (refac) Clean up code issues 2024-02-23 11:39:19 +01:00
Viktor Lofgren
5cdb07023b (refac) Clean up unused imports 2024-02-23 11:27:20 +01:00
Viktor Lofgren
6154e16951 (refac) Remove "distPath" 2024-02-23 11:22:02 +01:00
Viktor Lofgren
f4ff7185f0 (refac) Move process-mqapi out of api directory 2024-02-23 11:18:29 +01:00
Viktor Lofgren
6357d30ea0 Clean up docs 2024-02-22 19:53:20 +01:00
Viktor Lofgren
8d4ef982d0 Clean up docs 2024-02-22 19:37:59 +01:00
Viktor Lofgren
4740156cfa Clean up docs 2024-02-22 18:18:58 +01:00
Viktor Lofgren
f8e7f75831 Move index to top level of code 2024-02-22 18:01:35 +01:00
Viktor Lofgren
085137ca63 * Extract the index functionality 2024-02-22 17:31:25 +01:00
Viktor Lofgren
3fd2a83184 * Extract the search-query function 2024-02-22 15:27:39 +01:00
Viktor Lofgren
66c1281301 (zk-registry) epic jak shaving WIP
Cleaning out a lot of old junk from the code, and one thing lead to another...

* Build is improved, now constructing docker images with 'jib'.  Clean build went from 3 minutes to 50 seconds.
* The ProcessService's spawning is smarter.  Will now just spawn a java process instead of relying on the application plugin's generated outputs.
* Project is migrated to GraalVM
* gRPC clients are re-written with a neat fluent/functional style. e.g.
```channelPool.call(grpcStub::method)
              .async(executor) // <-- optional
              .run(argument);
```
This change is primarily to allow handling ManagedChannel errors, but it turned out to be a pretty clean API overall.
* For now the project is all in on zookeeper
* Service discovery is now based on APIs and not services.  Theoretically means we could ship the same code either a monolith or a service mesh.
* To this end, began modularizing a few of the APIs so that they aren't strongly "living" in a service.  WIP!

Missing is documentation and testing, and some more breaking apart of code.
2024-02-22 14:01:23 +01:00
Viktor Lofgren
73947d9eca (zk-registry) Filter out phantom addresses in the registry
The change adds a hostname validation step to remove endpoints from the ZkServiceRegistry when they do not resolve.  This is a scenario that primarily happens when running in docker, and the entire system is started and stopped.
2024-02-20 18:09:11 +01:00
Viktor Lofgren
a69c0b2718 (grpc-client) Fix warmup crash
The warmup would sometimes crash during a cold start-up, because it could not get an API.  Changed the warmup to just create a GrpcSingleNodeChannelPool for the node.
2024-02-20 18:03:57 +01:00
Viktor Lofgren
6c764bceeb (doc) Update documentation for service-discovery 2024-02-20 16:09:49 +01:00
Viktor Lofgren
273aeb7bae (doc) Update documentation with new gRPC service setup 2024-02-20 16:06:05 +01:00
Viktor Lofgren
d185858266 (minor) Add missing query parameter to ServiceEndpoint.toURL 2024-02-20 15:49:43 +01:00
Viktor Lofgren
453bd6064b (minor) Add warm-up to GrpcMultiNodeChannelPool to speed up the initial messages
Without doing this, connections would be created lazily, which is probably never desirable.
2024-02-20 15:45:16 +01:00
Viktor Lofgren
904f2587cd (minor) Add default ZOOKEEPER_HOSTS to service.env 2024-02-20 15:44:26 +01:00
Viktor Lofgren
14172312dc (query-client) Fix query client
The query service delegates and aggregates IndexDomainLinksApiGrpc
messages to the index services.  The query client was accidentally
also doing this, instead of talking to the query client.

Fixed so it correctly talks to the query client and nothing else.
2024-02-20 15:44:07 +01:00
Viktor Lofgren
c600d7aa47 (refac) Inject ServiceRegistry into WebsiteAdjacenciesCalculator 2024-02-20 15:42:32 +01:00