2023-03-06 18:32:13 +01:00
|
|
|
# Domain Ranking
|
|
|
|
|
2024-02-16 18:04:58 +01:00
|
|
|
Contains domain ranking algorithms. The domain ranking algorithms are based on
|
|
|
|
the JGraphT library.
|
|
|
|
|
|
|
|
Two principal algorithms are available, the standard PageRank algorithm,
|
|
|
|
and personalized pagerank; each are available for two graphs, the link graph
|
|
|
|
and a similarity graph where each edge corresponds to the similarity between
|
|
|
|
the sets of incident links to two domains, their cosine similarity acting as
|
|
|
|
the weight of the links.
|
|
|
|
|
|
|
|
With the standard PageRank algorithm, the similarity graph does not produce
|
|
|
|
anything useful, but something magical happens when you apply Personalized PageRank
|
|
|
|
to this graph. It turns into a very good "vibe"-sensitive ranking algorithm.
|
|
|
|
|
|
|
|
It's unclear if this is a well known result, but it's a very interesting one
|
|
|
|
for creating a ranking algorithm that is focused on a particular segment of the web.
|
2023-03-06 18:32:13 +01:00
|
|
|
|
|
|
|
## Central Classes
|
|
|
|
|
2024-02-16 18:04:58 +01:00
|
|
|
* [PageRankDomainRanker](src/main/java/nu/marginalia/ranking/PageRankDomainRanker.java) - Ranks domains using the
|
|
|
|
PageRank or Personalized PageRank algorithm depending on whether a list of influence domains is provided.
|
2023-03-06 18:32:13 +01:00
|
|
|
|
|
|
|
### Data sources
|
|
|
|
|
2024-02-16 18:04:58 +01:00
|
|
|
* [LinkGraphSource](src/main/java/nu/marginalia/ranking/data/LinkGraphSource.java) - fetches the link graph
|
|
|
|
* [InvertedLinkGraphSource](src/main/java/nu/marginalia/ranking/data/InvertedLinkGraphSource.java) - fetches the inverted link graph
|
|
|
|
* [SimilarityGraphSource](src/main/java/nu/marginalia/ranking/data/SimilarityGraphSource.java) - fetches the similarity graph from the database
|
2023-03-20 16:41:34 +01:00
|
|
|
|
2024-02-16 18:04:58 +01:00
|
|
|
Note that the similarity graph needs to be precomputed and stored in the database for
|
|
|
|
the similarity graph source to be available.
|
2023-03-21 17:32:09 +01:00
|
|
|
|
|
|
|
## See Also
|
|
|
|
|
2023-10-09 15:12:30 +02:00
|
|
|
* [result-ranking](../result-ranking) - Ranks search results
|
2023-03-21 17:32:09 +01:00
|
|
|
|
2023-03-20 16:41:34 +01:00
|
|
|
## Useful Resources
|
|
|
|
|
|
|
|
* [The PageRank Citation Ranking: Bringing Order to the Web](http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf)
|