(index) Adjust rank weightings to fix bad wikipedia results

There was a bug where, if the input of ResultValuator.normalize() was negative, it was truncated to zero. This meant that "bad" results always ranked the same. The penalty factor "overallPart" was moved outside of the function and re-weighted to accomplish a better normalization.

Some of the weights were also re-adjusted based on what appears to produce better results.  Needs evaluation.
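The effect of the change can be illustrated with a minimal sketch. Here `normalize()` is a hypothetical stand-in that only reproduces the property described above (negative inputs collapse to zero), not the real implementation, and the sign convention of `overallPart` is assumed to flip between the two versions (a negative additive term before, a positive subtracted penalty after):

```java
public class RankWeightSketch {
    // Hypothetical stand-in for ResultValuator.normalize(): the real method
    // differs internally, but shares the property that negative inputs
    // are truncated to zero.
    static double normalize(double value) {
        return Math.max(0, value);
    }

    // Before: the penalty term sits inside normalize(), so a large penalty
    // drives the argument negative and every "bad" result scores exactly 0.
    static double scoreBefore(double bestTcf, double bestBM25F, double bestBM25P,
                              double bestBM25PN, double overallPart) {
        return normalize(bestTcf + bestBM25F + bestBM25P + bestBM25PN * 0.25 + overallPart);
    }

    // After: the penalty is applied outside normalize() and re-weighted to /4,
    // so heavily penalized results still rank relative to each other.
    static double scoreAfter(double bestTcf, double bestBM25F, double bestBM25P,
                             double bestBM25PN, double overallPart) {
        return normalize(2 * bestTcf + bestBM25F + bestBM25P + bestBM25PN * 0.5) - overallPart / 4;
    }

    public static void main(String[] args) {
        // Two "bad" results with different penalties: before the fix,
        // both collapse to the same score of 0.0 ...
        System.out.println(scoreBefore(1, 1, 1, 1, -10)); // 0.0
        System.out.println(scoreBefore(1, 1, 1, 1, -20)); // 0.0
        // ... after the fix they remain distinguishable (2.0 vs -0.5).
        System.out.println(scoreAfter(1, 1, 1, 1, 10));
        System.out.println(scoreAfter(1, 1, 1, 1, 20));
    }
}
```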
This commit is contained in:
Viktor Lofgren 2024-01-01 17:16:29 +01:00
parent faa50bf578
commit 9330b5b1d9
2 changed files with 6 additions and 1 deletion


@@ -108,7 +108,8 @@ public class ResultValuator {
         }
     }
-    return normalize(bestTcf + bestBM25F + bestBM25P + bestBM25PN * 0.25 + overallPart);
+    return normalize(2* bestTcf + bestBM25F + bestBM25P + bestBM25PN * 0.5) - overallPart / 4;
 }

 private double calculateQualityPenalty(int size, int quality, ResultRankingParameters rankingParams) {


@@ -10,6 +10,8 @@ import nu.marginalia.index.client.model.results.SearchResultItem;
 import nu.marginalia.linkdb.LinkdbReader;
 import nu.marginalia.linkdb.model.LdbUrlDetail;
 import nu.marginalia.ranking.ResultValuator;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;

 import java.sql.SQLException;
 import java.util.ArrayList;
@@ -21,6 +23,8 @@ import java.util.Map;
 @Singleton
 public class IndexResultDecorator {

+    private static final Logger logger = LoggerFactory.getLogger(IndexResultDecorator.class);
+
     private final LinkdbReader linkdbReader;
     private final ResultValuator valuator;