Clean up docs

2024-02-22 18:18:58 +01:00 · 4740156cfa
commit 4740156cfa
parent f8e7f75831
3 changed files with 28 additions and 40 deletions
--- a/code/common/service-discovery/src/main/java/nu/marginalia/service/client/GrpcSingleNodeChannelPool.java
+++ b/code/common/service-discovery/src/main/java/nu/marginalia/service/client/GrpcSingleNodeChannelPool.java
@@ -8,7 +8,6 @@ import nu.marginalia.service.discovery.monitor.ServiceChangeMonitor;
 import nu.marginalia.service.discovery.property.PartitionTraits;
 import nu.marginalia.service.discovery.property.ServiceEndpoint.InstanceAddress;
 import nu.marginalia.service.discovery.property.ServiceKey;
-import org.jetbrains.annotations.NotNull;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;

--- a/code/index/readme.md
+++ b/code/index/readme.md
@@ -1,30 +1,41 @@
 # Index

-These are components that offer functionality for the [index-service](../../services-core/index-service).
+This module contains the components that make up the search index.
+
+It exposes an API for querying the index, and contains the logic 
+for ranking search results.  It does not parse the query, that is
+the responsibility of the [search-query](../functions/search-query) module.

 ## Indexes

 There are two indexes with accompanying tools for constructing them.

-* [index-reverse](index-reverse/) is code for `word->document` indexes. There are two such indexes, one containing only document-word pairs that are flagged as important, e.g. the word appears in the title or has a high TF-IDF. This allows good results to be discovered quickly without having to sift through ten thousand bad ones first. 
+* [index-reverse](reverse-index/) is code for `word->document` indexes. There are two such indexes, one containing only document-word pairs that are flagged as important, e.g. the word appears in the title or has a high TF-IDF. This allows good results to be discovered quickly without having to sift through ten thousand bad ones first. 

-* [index-forward](index-forward/) is the `document->word` index containing metadata about each word, such as its position. It is used after identifying candidate search results via the reverse index to fetch metadata and rank the results. 
+* [index-forward](forward-index/) is the `document->word` index containing metadata about each word, such as its position. It is used after identifying candidate search results via the reverse index to fetch metadata and rank the results. 

-These indices rely heavily on the [libraries/btree](../../libraries/btree) and [libraries/array](../../libraries/array) components.
+Additionally, the [index-journal](index-journal/) contains code for constructing a journal of the index, which is used to keep the index up to date.

-## Algorithms
+These indices rely heavily on the [libraries/btree](../libraries/btree) and [libraries/array](../libraries/array) components.

-* [domain-ranking](domain-ranking/) contains domain ranking algorithms.
-* [result-ranking](result-ranking/) contains logic for ranking search results by relevance.
+---

-# Libraries
+# Result Ranking

-* [index-query](index-query/) contains structures for evaluating search queries.
-* [index-journal](index-journal/) contains tools for writing and reading index data.
+The module is also responsible for ranking search results, and contains various heuristics
+for deciding which search results are important with regard to a query. In broad strokes [BM-25](https://nlp.stanford.edu/IR-book/html/htmledition/okapi-bm25-a-non-binary-model-1.html)
+is used, with a number of additional bonuses and penalties to rank the appropriate search
+results higher.
+
+## Central Classes
+
+* [ResultValuator](src/main/java/nu/marginalia/ranking/results/ResultValuator.java)
+
+---

 # Domain Ranking

-Contains domain ranking algorithms.  The domain ranking algorithms are based on
+The module contains domain ranking algorithms.  The domain ranking algorithms are based on
 the JGraphT library.

 Two principal algorithms are available, the standard PageRank algorithm,
@@ -42,14 +53,14 @@ for creating a ranking algorithm that is focused on a particular segment of the

 ## Central Classes

-* [PageRankDomainRanker](src/main/java/nu/marginalia/ranking/PageRankDomainRanker.java) - Ranks domains using the
+* [PageRankDomainRanker](src/main/java/nu/marginalia/ranking/domains/PageRankDomainRanker.java) - Ranks domains using the
  PageRank or Personalized PageRank algorithm depending on whether a list of influence domains is provided.

 ### Data sources

-* [LinkGraphSource](src/main/java/nu/marginalia/ranking/data/LinkGraphSource.java) - fetches the link graph
-* [InvertedLinkGraphSource](src/main/java/nu/marginalia/ranking/data/InvertedLinkGraphSource.java) - fetches the inverted link graph
-* [SimilarityGraphSource](src/main/java/nu/marginalia/ranking/data/SimilarityGraphSource.java) - fetches the similarity graph from the database
+* [LinkGraphSource](src/main/java/nu/marginalia/ranking/domains/data/LinkGraphSource.java) - fetches the link graph
+* [InvertedLinkGraphSource](src/main/java/nu/marginalia/ranking/domains/data/InvertedLinkGraphSource.java) - fetches the inverted link graph
+* [SimilarityGraphSource](src/main/java/nu/marginalia/ranking/domains/data/SimilarityGraphSource.java) - fetches the similarity graph from the database

 Note that the similarity graph needs to be precomputed and stored in the database for
 the similarity graph source to be available.
@@ -57,14 +68,3 @@ the similarity graph source to be available.
 ## Useful Resources

 * [The PageRank Citation Ranking: Bringing Order to the Web](http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf)
-
-# Result Ranking
-
-Contains various heuristics for deciding which search results are important
-with regard to a query. In broad strokes [BM-25](https://nlp.stanford.edu/IR-book/html/htmledition/okapi-bm25-a-non-binary-model-1.html)
-is used, with a number of additional bonuses and penalties to rank the appropriate search
-results higher.
-
-## Central Classes
-
-* [ResultValuator](src/main/java/nu/marginalia/ranking/ResultValuator.java)
--- a/code/services-core/index-service/readme.md
+++ b/code/services-core/index-service/readme.md
@@ -6,17 +6,6 @@ It is the service that most directly executes a search query.  It does this by
 evaluating a low-level query, and then using the index to find the documents 
 that match the query, finally ranking the results and picking the best matches.

-## Central Classes
+This module only contains service boilerplate. The guts of this service are 
+in the [index](../../index) module.

-* [IndexService](src/main/java/nu/marginalia/index/IndexService.java) is the REST entry point that the internal API talks to.
-* [IndexQueryService](src/main/java/nu/marginalia/index/svc/IndexQueryService.java) executes queries. 
-* [SearchIndex](src/main/java/nu/marginalia/index/index/SearchIndex.java) owns the state of the index and helps with building a query strategy from parameters.
-* [IndexResultValuator](src/main/java/nu/marginalia/index/results/IndexResultValuator.java) determines the best results.
-
-## See Also
-
-The index service relies heavily on the primitives in [features-index](../../features-index):
-
-* [features-index/index-forward](../../features-index/index-forward/)
-* [features-index/index-reverse](../../features-index/index-reverse/)
-* [features-index/index-query](../../features-index/index-query)