diff --git a/code/process-models/crawl-spec/readme.md b/code/process-models/crawl-spec/readme.md
new file mode 100644
index 00000000..63bcec96
--- /dev/null
+++ b/code/process-models/crawl-spec/readme.md
@@ -0,0 +1,16 @@
+# Crawl Spec
+
+A crawl spec is a list of domains to be crawled. It is a parquet file with the following columns:
+
+- `domain`: The domain to be crawled
+- `crawlDepth`: The depth to which the domain should be crawled
+- `urls`: A list of known URLs to be crawled
+
+Crawl specs define the scope of a crawl when there is no pre-existing set of known domains, e.g. when bootstrapping a new crawl.
+
+The [CrawlSpecRecord](src/main/java/nu/marginalia/model/crawlspec/CrawlSpecRecord.java) class is
+used to represent a record in the crawl spec.
+
+The [CrawlSpecRecordParquetFileReader](src/main/java/nu/marginalia/io/crawlspec/CrawlSpecRecordParquetFileReader.java)
+and [CrawlSpecRecordParquetFileWriter](src/main/java/nu/marginalia/io/crawlspec/CrawlSpecRecordParquetFileWriter.java)
+classes are used to read and write the crawl spec parquet files.
diff --git a/code/process-models/crawling-model/readme.md b/code/process-models/crawling-model/readme.md
index d360e80f..ac0d0906 100644
--- a/code/process-models/crawling-model/readme.md
+++ b/code/process-models/crawling-model/readme.md
@@ -1,13 +1,128 @@
 # Crawling Models
 
-Contains models shared by the [crawling-process](../../processes/crawling-process/) and
+Contains crawl data models shared by the [crawling-process](../../processes/crawling-process/) and
 [converting-process](../../processes/converting-process/).
 
+To ensure backward compatibility with older versions of the data, the serialization is
+abstracted away from the model classes.
+
+The new way of serializing the data is to use parquet files.
+
+The old way was to use zstd-compressed JSON. The old way is still supported
+*for now*, but the new way is preferred as it's not only more compact, but also
+significantly faster to read and much more portable. The JSON support will be
+removed in the future.
+
 ## Central Classes
 
 * [CrawledDocument](src/main/java/nu/marginalia/crawling/model/CrawledDocument.java)
 * [CrawledDomain](src/main/java/nu/marginalia/crawling/model/CrawledDomain.java)
 
 ### Serialization
+
+The serialization classes below automatically select the serialization format based on the
+file extension.
+
+Data is accessed through a [SerializableCrawlDataStream](src/main/java/nu/marginalia/crawling/io/SerializableCrawlDataStream.java),
+which is a somewhat enhanced Iterator for reading crawl data.
+
 * [CrawledDomainReader](src/main/java/nu/marginalia/crawling/io/CrawledDomainReader.java)
-* [CrawledDomainWriter](src/main/java/nu/marginalia/crawling/io/CrawledDomainWriter.java)
\ No newline at end of file
+* [CrawledDomainWriter](src/main/java/nu/marginalia/crawling/io/CrawledDomainWriter.java)
+
+### Parquet Serialization
+
+The parquet serialization is done using the [CrawledDocumentParquetRecordFileReader](src/main/java/nu/marginalia/crawling/parquet/CrawledDocumentParquetRecordFileReader.java)
+and [CrawledDocumentParquetRecordFileWriter](src/main/java/nu/marginalia/crawling/parquet/CrawledDocumentParquetRecordFileWriter.java) classes,
+which read and write the parquet files respectively.
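+
+For programmatic access, writing a file looks roughly like the sketch below. This is a
+non-authoritative sketch: the constructor argument order mirrors the field list further
+down, and the exact signatures and field types are assumptions; see the linked classes.
+
+```java
+import nu.marginalia.crawling.parquet.CrawledDocumentParquetRecord;
+import nu.marginalia.crawling.parquet.CrawledDocumentParquetRecordFileWriter;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Path;
+import java.time.Instant;
+
+class WriteExample {
+    public static void main(String[] args) throws IOException {
+        // Sketch only: field types (e.g. Instant timestamp, byte[] body) are
+        // assumptions based on the field list in this document.
+        try (var writer = new CrawledDocumentParquetRecordFileWriter(Path.of("crawl-data.parquet"))) {
+            writer.write(new CrawledDocumentParquetRecord(
+                    "www.marginalia.nu",          // domain
+                    "https://www.marginalia.nu/", // url
+                    "127.0.0.1",                  // ip
+                    false,                        // cookies
+                    200,                          // httpStatus
+                    Instant.now(),                // timestamp
+                    "text/html",                  // contentType
+                    "<!DOCTYPE html>...".getBytes(StandardCharsets.UTF_8), // body
+                    null,                         // etagHeader
+                    null                          // lastModifiedHeader
+            ));
+        }
+    }
+}
+```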
+
+The model classes are serialized to parquet using the [CrawledDocumentParquetRecord](src/main/java/nu/marginalia/crawling/parquet/CrawledDocumentParquetRecord.java) class.
+
+The record has the following fields:
+
+* `domain` - The domain of the document
+* `url` - The URL of the document
+* `ip` - The IP address the document was fetched from
+* `cookies` - Whether cookies were set when the document was fetched
+* `httpStatus` - The HTTP status code of the response
+* `timestamp` - The time the document was fetched
+* `contentType` - The MIME content type of the document
+* `body` - The body of the document
+* `etagHeader` - The ETag header of the response
+* `lastModifiedHeader` - The Last-Modified header of the response
+
+The easiest way to interact with parquet files is to use [DuckDB](https://duckdb.org/),
+which lets you run SQL queries on parquet files (and almost anything else).
+
+e.g.
+```sql
+select httpStatus, count(*) as cnt
+from 'my-file.parquet'
+group by httpStatus;
+┌────────────┬───────┐
+│ httpStatus │  cnt  │
+│   int32    │ int64 │
+├────────────┼───────┤
+│        200 │    43 │
+│        304 │     4 │
+│        500 │     1 │
+└────────────┴───────┘
+```
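+
+The same aggregation can be done from Java by streaming the records. This is a minimal
+sketch that assumes the reader exposes a static stream(Path) method and that the record
+exposes httpStatus as a public field; check the linked reader class for the actual API:
+
+```java
+import nu.marginalia.crawling.parquet.CrawledDocumentParquetRecordFileReader;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+class StatusCountExample {
+    public static void main(String[] args) throws IOException {
+        // Equivalent of the DuckDB query above: count documents per HTTP status.
+        try (var stream = CrawledDocumentParquetRecordFileReader.stream(Path.of("my-file.parquet"))) {
+            Map<Integer, Long> byStatus = stream.collect(
+                    Collectors.groupingBy(rec -> rec.httpStatus, Collectors.counting()));
+            System.out.println(byStatus); // e.g. {200=43, 304=4, 500=1}
+        }
+    }
+}
+```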