Viktor Lofgren
6f222b9800
(search) Add refresh link to explore mode.
...
This is a QOL improvement for mobile users, who otherwise would have to scroll all the way up to refresh.
Also removed the confusing "this is a random set of domains" message when viewing adjacent websites, as the set is not random.
2023-08-22 12:43:44 +02:00
Viktor Lofgren
c7f0276005
(control) Don't spin on process output printing
...
This is the "correct" way of copying stdout and stderr to the current process's output.
2023-08-22 11:48:54 +02:00
Viktor Lofgren
46df58d28b
(control-service) Use default value for WMSA_HOME if it is not set
2023-08-22 11:11:01 +02:00
Viktor Lofgren
15912f31d0
(control-service) Basic GUI for deleting bad links from exploration mode
2023-08-21 18:35:26 +02:00
Viktor Lofgren
93f49f1fb3
(search-service) RSS feed for the news feed
2023-08-20 12:58:34 +02:00
Viktor Lofgren
704de50a9b
(forward-index, valuator) HTML features in valuator
...
Put it in the forward index for easy access during index-side valuation.
2023-08-18 11:54:56 +02:00
Viktor Lofgren
efee904531
(search) Use the adtech bit instead of ads for ads flag
2023-08-18 11:24:59 +02:00
Viktor Lofgren
46d761f34f
(language) fastText-based language filter
2023-08-16 15:48:12 +02:00
Viktor Lofgren
4598c7f40f
(valuation) Penalize WordPress-style kebab-case URLs
2023-08-16 13:11:24 +02:00
Viktor Lofgren
606db54dc8
(docs) Fix dead links to message-queue after moving it to libraries
2023-08-15 19:26:40 +02:00
Viktor Lofgren
df85468c01
(control) Action for refreshing the blogs definition.
2023-08-15 11:38:52 +02:00
Viktor Lofgren
e7192a9cad
(mq) Refactor mq and actor library and move it to libraries out of common
2023-08-15 10:53:23 +02:00
Viktor Lofgren
019b61b330
(control) Remove message queue listing from actors view.
2023-08-13 13:50:04 +02:00
Viktor Lofgren
f997707049
(control) Move event log out of plumbing
2023-08-13 13:40:50 +02:00
Viktor Lofgren
c56ee10185
(control) Separate [Process] and [Process and Load] actions for crawl data; all SLOW data is deletable.
2023-08-13 13:39:59 +02:00
Viktor Lofgren
8210e49b4e
(control) Helpful tooltips for the Actor table.
2023-08-13 12:55:56 +02:00
Viktor Lofgren
a8f2e9ee2c
(control) Tidy up empty tables, remove actors from index view
2023-08-12 15:18:14 +02:00
Viktor Lofgren
a91b909103
(control) Event log on stop actor
2023-08-12 15:02:53 +02:00
Viktor Lofgren
99e031c529
(control) Remove broken pagination from events and message queue; new "light" events table for some views
2023-08-12 14:57:55 +02:00
Viktor Lofgren
998f239ed9
(control) Filterable event log view
2023-08-12 14:43:11 +02:00
Viktor Lofgren
0961f627b1
(control) Pretty up the nav bar
2023-08-12 14:42:42 +02:00
Viktor Lofgren
4f8048be31
(blacklist) Blacklist management
2023-08-10 15:40:07 +02:00
Viktor Lofgren
ce293029c7
(converter) Treat adtech tracking as advertisement.
2023-08-09 14:23:53 +02:00
Viktor Lofgren
251fc63b42
(*) Fix merge gore
2023-08-09 13:33:28 +02:00
Viktor Lofgren
47f3855a4b
(control) More informative readme.md
2023-08-09 12:42:23 +02:00
Viktor Lofgren
71dfe9f33e
(control) Clean up the ControlService, move mq-related endpoints to MessageQueueService.
2023-08-09 12:42:01 +02:00
Viktor Lofgren
4ab1cd9502
(*) last touches
2023-08-07 12:57:44 +02:00
Viktor Lofgren
be444f9172
(control) New actions view, re-arrange navigation menu
2023-08-05 14:45:04 +02:00
Viktor Lofgren
bf37a3eb25
(search-service) Make flushCaches endpoint a notice and not a request
2023-08-05 14:45:04 +02:00
Viktor Lofgren
00eb8b90dc
(control) Message Queue GUI
2023-08-04 22:05:29 +02:00
Viktor Lofgren
912129311d
(control) Message Queue GUI
2023-08-04 17:54:18 +02:00
Viktor Lofgren
624b78ec3a
(heartbeat) Task heartbeats
2023-08-04 14:40:06 +02:00
Viktor Lofgren
1d0cea1d55
(converter) GUI for dealing with user complaints
2023-08-03 17:59:57 +02:00
Viktor Lofgren
f01f608474
(blacklist) Support blacklists with subdomain
2023-08-03 17:58:52 +02:00
Viktor Lofgren
63e857f7cd
(control) Add basic api key management
2023-08-02 20:14:03 +02:00
Viktor Lofgren
9979c9defe
(search/index) Add blogosphere filter
2023-08-02 20:13:30 +02:00
Viktor Lofgren
8de3e6ab80
(control) Fix bug where CrawlActor and RecrawlActor would steal each other's mail
2023-08-01 22:33:30 +02:00
Viktor Lofgren
867410c66b
(file-storage) Automatic file storage discovery via manifest file
2023-08-01 18:05:43 +02:00
Viktor Lofgren
36a23707c1
(control) Control service should be a core service.
2023-08-01 15:49:50 +02:00
Viktor Lofgren
e22e65eee4
(index) Fix bug related to debug print statements
2023-07-22 14:33:58 +02:00
Viktor Lofgren
d7ab21fe34
(*) Refactor Control Service and processes
2023-07-17 21:20:31 +02:00
Viktor Lofgren
8b74e3aa0d
(*) File Storage WIP
2023-07-14 17:08:10 +02:00
Viktor Lofgren
88b9ec70c6
(control, WIP) Run reconvert-load from converter :D
2023-07-11 18:05:37 +02:00
Viktor
cbbf60a599
Better fingerprinting (#35)
...
* Better fingerprinting for server tech
* Many more features in FeatureExtractor
* Blog specialization
* SiteType table
2023-07-10 18:58:43 +02:00
Viktor Lofgren
96eecc6ea5
Minor: Readability.
2023-07-10 18:58:43 +02:00
Viktor Lofgren
d9e6c4f266
Trial integration of MQ-FSM into index service.
2023-07-06 18:04:16 +02:00
Viktor Lofgren
62cc9df206
Embryo of new control process
...
* New events and heartbeat tables in mariadb
* Refactored to a cleaner Service interface
2023-07-03 10:40:32 +02:00
Viktor Lofgren
0f34beb1aa
Update search front page
2023-06-29 17:14:27 +02:00
Viktor Lofgren
a6a66c6d8a
Improve site info for unknown domains:
...
* Placeholder screenshot should work
* Add a link to git-repo for submitting the site for crawling
2023-06-27 15:32:11 +02:00
Viktor Lofgren
d86e8522e2
Add search profiles for wiki, forum and docs.
2023-06-24 12:17:35 +02:00