Commit Graph

5 Commits

Author SHA1 Message Date
Viktor Lofgren
7326ba74fe Tweaks to pub date heuristics to make it mostly get the 'historyofphilosophy.net' case right.
Use HTML standard for plausibility checks in the more guesswork-like heuristics. Added more class names to look for date strings.
2023-06-20 14:15:05 +02:00
Viktor Lofgren
21125206b4 Fix some bugs in JSON+LD-heuristics for pub date. 2023-06-19 17:58:19 +02:00
Viktor Lofgren
619fb8ba80 (converter) Adjust the pub-date sniffing heuristics' order. Doing HTML5 tags too early puts some sites too early. Also expanded support for JSON+LD. 2023-04-19 15:28:50 +02:00
Viktor Lofgren
449471a076 Yet more restructuring. Improved search result ranking. 2023-03-16 21:35:54 +01:00
Viktor Lofgren
d82532b7f1 More restructuring, big bug fixes in keyword extraction. 2023-03-13 17:39:53 +01:00