Commit Graph

414 Commits

Author SHA1 Message Date
Viktor Lofgren
4e3a977049 Merge branch 'release' into master 2022-08-18 18:41:13 +02:00
vlofgren
340d80f6c7 Don't try to fetch text/css and text/javascript-files. Refactor fetcher to separate content type sniffing logic. Clean up crawler a smidge. 2022-08-18 18:40:34 +02:00
Viktor Lofgren
a915b2d37a Merge pull request 'Don't try to fetch ftp://, webcal://, etc.' (#90) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/90
2022-08-18 18:27:15 +02:00
vlofgren
6b6cd56e3a Don't try to fetch text/css and text/javascript-files. Refactor fetcher to separate content type sniffing logic. Clean up crawler a smidge. 2022-08-18 18:25:12 +02:00
Viktor Lofgren
e5d63d8a61 Merge branch 'release' into master 2022-08-18 17:26:18 +02:00
vlofgren
4afccdc536 Don't try to fetch ftp://, webcal://, etc. 2022-08-18 17:25:22 +02:00
Viktor Lofgren
4435334ebe Merge pull request 'Fix bug where url fragments were considered path elements' (#89) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/89
2022-08-18 16:48:48 +02:00
Viktor Lofgren
579037db05 Merge branch 'release' into master 2022-08-18 16:48:32 +02:00
vlofgren
5cd552458a Fix fragment bug. 2022-08-18 16:47:59 +02:00
vlofgren
2bc81e8e9a Fix fragment bug. 2022-08-18 16:45:51 +02:00
vlofgren
a034e3245e Fix fragment bug. 2022-08-18 16:43:34 +02:00
Viktor Lofgren
a8745d627b Merge pull request 'Fix bug in redirect handling that caused the crawler to not index some documents.' (#88) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/88
2022-08-17 00:52:34 +02:00
vlofgren
0bac422091 Fix bug in redirect handling that caused the crawler to not index some documents. 2022-08-17 00:51:10 +02:00
Viktor Lofgren
8f2485870d Merge branch 'release' into master 2022-08-17 00:49:55 +02:00
vlofgren
ce9abc00dc Fix bug in redirect handling that caused the crawler to not index some documents. 2022-08-17 00:49:32 +02:00
Viktor Lofgren
5f2258d459 Merge pull request 'Prepare for new crawl round' (#87) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/87
2022-08-16 22:53:20 +02:00
Viktor Lofgren
ef97414edb Merge branch 'release' into master 2022-08-16 22:49:26 +02:00
vlofgren
5cfef610b0 Preparations for new crawl round 2022-08-16 22:48:16 +02:00
vlofgren
123675d73b More caching 2022-08-15 15:39:10 +02:00
vlofgren
ceacfa5917 Tune down log spam 2022-08-15 15:37:26 +02:00
Viktor Lofgren
4fc0c59d29 Merge pull request 'Optimize search service by removing weird query spam' (#86) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/86
2022-08-15 15:28:05 +02:00
Viktor Lofgren
fdbb02bcaa Merge branch 'release' into master 2022-08-15 15:27:55 +02:00
vlofgren
f6b3e75cee Optimize search service by removing weird query spam 2022-08-15 15:27:22 +02:00
Viktor Lofgren
0c51cf5116 Merge pull request 'Crawling and processing improvements, index optimization' (#85) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/85
2022-08-15 13:59:49 +02:00
Viktor Lofgren
c800af3a59 Merge branch 'release' into master 2022-08-15 13:59:38 +02:00
vlofgren
beafdfda9c Index optimizations that should reduce small object churn and IOPS a bit. 2022-08-15 13:58:18 +02:00
vlofgren
460dd098b0 Add advertisement Feature to search,
Add adblock simulation to processor,
Add filename and email address extraction to processor.
2022-08-12 17:12:16 +02:00
Viktor Lofgren
02abe498ff master (#84)
Co-authored-by: vlofgren <vlofgren@gmail.com>
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/84
2022-08-12 13:50:57 +02:00
Viktor Lofgren
d039b138a6 Merge branch 'release' into master 2022-08-12 13:50:47 +02:00
vlofgren
30d2a707ff Add advertisement Feature to search,
Add adblock simulation to processor,
Add filename and email address extraction to processor.
2022-08-12 13:50:18 +02:00
vlofgren
0e28ff5a72 Add features to suggestions 2022-08-10 21:32:19 +02:00
Viktor Lofgren
f3a8a20321 Merge pull request 'Add features to suggestions' (#83) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/83
2022-08-10 19:51:03 +02:00
Viktor Lofgren
b826dbc3b5 Merge branch 'release' into master 2022-08-10 19:50:54 +02:00
vlofgren
ba9e0d9829 Add features to suggestions 2022-08-10 19:50:14 +02:00
vlofgren
ffde8c8305 Faster crawling 2022-08-10 18:46:13 +02:00
vlofgren
ce09fce639 Faster crawling 2022-08-10 17:03:58 +02:00
vlofgren
9c6e3b1772 Topical detection (experimental),
Adblock simulation (experimental)
2022-08-10 15:04:29 +02:00
Viktor Lofgren
48cfa9db97 Merge pull request 'Adjust search result sort order to penalize scriptiness a bit' (#82) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/82
2022-08-08 19:01:25 +02:00
Viktor Lofgren
4c9e1fa686 Merge branch 'release' into master 2022-08-08 19:01:17 +02:00
vlofgren
d7167f956e Adjust search result sort order to penalize scriptiness a bit 2022-08-08 18:59:57 +02:00
Viktor Lofgren
dbb5a4d3bf Merge pull request 'Cooking mode' (#81) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/81
2022-08-08 18:09:37 +02:00
Viktor Lofgren
8bd88c7d40 Merge branch 'release' into master 2022-08-08 18:09:30 +02:00
vlofgren
0f59675f7c Clean up preconverter code 2022-08-08 18:08:18 +02:00
vlofgren
2af2c50f34 Clean up preconverter code 2022-08-08 15:29:47 +02:00
Viktor Lofgren
10d8678f63 Merge pull request 'master' (#80) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/80
2022-08-08 15:19:11 +02:00
Viktor Lofgren
31b5742280 Merge branch 'release' into master 2022-08-08 15:19:00 +02:00
vlofgren
2bfde9d030 Recipe detection 2022-08-08 15:18:18 +02:00
vlofgren
0dfcf2f7af Recipe detection 2022-08-08 15:18:07 +02:00
vlofgren
5c952d48f4 Speed up conversion 2022-08-08 15:18:07 +02:00
Viktor Lofgren
1ad1cf75c8 Merge pull request 'Add support for additional random sets' (#79) from master into release
Reviewed-on: https://git.marginalia.nu/marginalia/marginalia.nu/pulls/79
2022-08-07 17:52:08 +02:00