Commit Graph

31 Commits

Author SHA1 Message Date
Viktor Lofgren
46d761f34f (language) fasttext based language filter 2023-08-16 15:48:12 +02:00
Viktor Lofgren
e7192a9cad (mq) Refactor mq and actor library and move it to libraries out of common 2023-08-15 10:53:23 +02:00
Viktor Lofgren
36a23707c1 (control) Control service should be a core service. 2023-08-01 15:49:50 +02:00
Viktor Lofgren
86a5cc5c5f (hash) Modified version of common codec's Murmur3 hash 2023-08-01 14:57:40 +02:00
Viktor Lofgren
f11103d31d (WIP) Make it possible to sideload encyclopedia data.
This is mostly a pilot track for sideloading other large websites.

Also change coverter to produce a more compact output (java serialization instead of json).
2023-07-28 18:14:43 +02:00
Viktor Lofgren
bca4bbb6c8 (*) Refactor MQ and MQSM 2023-07-17 13:57:32 +02:00
Viktor Lofgren
31ae71c7d6 Message queue WIP 2023-07-04 14:28:14 +02:00
Viktor Lofgren
62cc9df206 Embryo of new control process
* New events and heartbeat tables in mariadb
* Refactored to a cleaner Service interface
2023-07-03 10:40:32 +02:00
Viktor Lofgren
e853483ef3 Bump Crawler Commons version 2023-06-29 14:14:18 +02:00
Viktor Lofgren
e7af77e151 Tests for crawler specialization + testdata 2023-06-27 10:57:54 +02:00
Viktor Lofgren
5d862d119c Bump dependency versions. 2023-06-20 12:03:12 +02:00
Viktor Lofgren
266ad2e4de Re-introduce monkey patched GSON to make converter run better.
fixup! Re-introduce monkey patched GSON to make converter run better.

fixup! Re-introduce monkey patched GSON to make converter run better.
2023-06-19 17:58:19 +02:00
Viktor Lofgren
fbbaf584ba Adjustments to screenshot capture tool. 2023-04-16 08:55:57 +02:00
Viktor Lofgren
3e9b37c264 Refactor website screenshot tool and website adjacencies calculator into code/tools. 2023-04-11 16:20:27 +02:00
Viktor Lofgren
affcf8cf41 Load test tool 2023-04-02 09:43:43 +02:00
Viktor Lofgren
8f51345a1d Add experiment runner tool and got rid of experiments module in processes. 2023-03-28 16:58:46 +02:00
Viktor
ac1ac3ea57
Move database to a separate module
* Move database to a separate project, break apart sql file into separate entities.
* Fix front page news listing.
2023-03-25 15:26:17 +01:00
Viktor Lofgren
2eb972dea1 Remove unrelated code, break tools into their own directory. 2023-03-17 16:03:11 +01:00
Viktor Lofgren
449471a076 Yet more restructuring. Improved search result ranking. 2023-03-16 21:35:54 +01:00
Viktor Lofgren
0ecab53635 Yet more restructuring. 2023-03-13 23:40:26 +01:00
Viktor Lofgren
d82532b7f1 More restructuring, big bug fixes in keyword extraction. 2023-03-13 17:39:53 +01:00
Viktor Lofgren
8b8fc49901 The refactoring will continue until morale improves. 2023-03-12 11:42:07 +01:00
Viktor Lofgren
73eaa0865d The refactoring will continue until morale improves. 2023-03-12 10:50:31 +01:00
Viktor Lofgren
616effdb3c The refactoring will continue until morale improves. 2023-03-12 10:04:48 +01:00
Viktor Lofgren
6d939175b1 Additional code restructuring to get rid of util and misc-style packages. 2023-03-11 13:48:40 +01:00
Viktor Lofgren
ad1be7c835 Move all code to a code directory. 2023-03-07 17:14:32 +01:00
Viktor Lofgren
b945fd7f39 A lot of readmes, some refactoring. 2023-03-06 18:32:13 +01:00
Viktor Lofgren
cf1f878a39 Remove old unused protobuf crap 2023-03-04 14:17:13 +01:00
Viktor Lofgren
4fdaaa16ba Restructuring the git repo 2023-03-04 13:19:01 +01:00
vlofgren
3200c36072 Experimental changes for 22-08/09 update. 2022-08-26 16:08:46 +02:00
vlofgren
c24b978c51 first commit 2022-05-19 17:45:26 +02:00