# Array Library

The array library offers easy allocation of large [memory-mapped files](https://en.wikipedia.org/wiki/Memory-mapped_file)
with much less performance overhead than the traditional `buffers[pos/size].get(pos%size)`-style constructions
that Java often leads to, given its suffocating 2 GB ByteBuffer size limitation.

It accomplishes this by delegating block operations down to the appropriate page. If the operation
crosses a page boundary, it is not delegated and is a bit slower.
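
As a rough illustration of the idea (not the library's actual code), the sketch below pages a logical `long` array over several fixed-size `long[]` pages. The class name `PagedLongArraySketch`, the page size, and the method names are assumptions made for this example. A range operation that fits within one page is handed to that page in a single call; one that crosses a page boundary falls back to element-by-element access.

```java
// Hypothetical sketch: a logical long array split into fixed-size pages.
// Plain long[] pages stand in for memory-mapped buffers here.
class PagedLongArraySketch {
    private final int pageSize;
    private final long[][] pages;

    PagedLongArraySketch(long size, int pageSize) {
        this.pageSize = pageSize;
        int pageCount = (int) ((size + pageSize - 1) / pageSize);
        this.pages = new long[pageCount][pageSize];
    }

    long get(long pos) {
        return pages[(int) (pos / pageSize)][(int) (pos % pageSize)];
    }

    void set(long pos, long value) {
        pages[(int) (pos / pageSize)][(int) (pos % pageSize)] = value;
    }

    // Fills [start, end) with value. Returns true if the single-page
    // fast path was taken, false if the per-element fallback was used.
    boolean fill(long start, long end, long value) {
        if (start >= end) return true; // nothing to do

        int firstPage = (int) (start / pageSize);
        int lastPage = (int) ((end - 1) / pageSize);

        if (firstPage == lastPage) {
            // The whole range lives on one page: delegate the block operation to it
            int from = (int) (start % pageSize);
            int to = from + (int) (end - start);
            java.util.Arrays.fill(pages[firstPage], from, to, value);
            return true;
        }

        // The range crosses a page boundary: no delegation, one element at a time
        for (long pos = start; pos < end; pos++) {
            set(pos, value);
        }
        return false;
    }
}
```

The boolean return value exists only to make the two paths visible in this sketch.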

The library is written in a fairly unidiomatic way to accomplish diamond inheritance.

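
Java classes cannot inherit from more than one parent class, so a diamond-shaped hierarchy has to be built some other way. One common approach, shown here as an illustration of the general technique and not as a description of this library's source, is to put behavior into interfaces with `default` methods. All names below are hypothetical.

```java
// Hypothetical names throughout; this only illustrates the pattern.

// Top of the diamond: raw element access.
interface BaseArraySketch {
    long get(long pos);
    void set(long pos, long value);
}

// One branch: search operations built on get().
interface SearchOpsSketch extends BaseArraySketch {
    default long linearSearch(long key, long start, long end) {
        for (long pos = start; pos < end; pos++) {
            if (get(pos) == key) return pos;
        }
        return -1;
    }
}

// The other branch: transformations built on get() and set().
interface TransformOpsSketch extends BaseArraySketch {
    default void addToEach(long start, long end, long delta) {
        for (long pos = start; pos < end; pos++) {
            set(pos, get(pos) + delta);
        }
    }
}

// Bottom of the diamond: implements only storage and inherits both branches.
class HeapArraySketch implements SearchOpsSketch, TransformOpsSketch {
    private final long[] data;

    HeapArraySketch(int size) { data = new long[size]; }

    public long get(long pos) { return data[(int) pos]; }
    public void set(long pos, long value) { data[(int) pos] = value; }
}
```

With this layout a new storage backend only has to supply `get` and `set` to pick up every operation defined in the branches.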
## Quick demo:

```java
// Memory-map a file as a writable long array
var array = LongArray.mmapForWriting(Path.of("/tmp/test"), 1<<16);

// Replace values with a hash of their position, then sort and search that range
array.transformEach(50, 1000, (pos, val) -> Long.hashCode(pos));
array.quickSort(50, 1000);
if (array.binarySearch(array.get(100), 50, 1000) >= 0) {
    System.out.println("Nevermind, I found it!");
}

// Operate on a sub-range, then print the first values
array.range(50, 1000).fill(0, 950, 1);
array.forEach(0, 100, (pos, val) -> {
    System.out.println(pos + ":" + val);
});
```

## Query Buffers

The classes [IntQueryBuffer](src/main/java/nu/marginalia/array/buffer/IntQueryBuffer.java)
and [LongQueryBuffer](src/main/java/nu/marginalia/array/buffer/LongQueryBuffer.java) are used
heavily in the search engine's query processing.

They are dual-pointer buffers that offer tools for filtering data.

```java
LongQueryBuffer buffer = new LongQueryBuffer(1000);

// later ...

// Prepare the buffer for filling
buffer.reset();
fillBufferSomehow(buffer);
// length is updated and data is set
// read pointer and write pointer are now at 0

// A typical filtering operation may look like this:

while (buffer.hasMore()) { // read < end
    if (someCondition(buffer.currentValue())) {
        // copy the value pointed to by the read
        // pointer to the write pointer, and
        // advance both
        buffer.retainAndAdvance();
    }
    else {
        // advance the read pointer
        buffer.rejectAndAdvance();
    }
}

// set end to the write pointer, and
// reset the read and write pointers
buffer.finalizeFiltering();

// ... after this we can filter again, or
// consume the data
```
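
To make the pointer bookkeeping concrete, here is a minimal stand-alone sketch of a dual-pointer filter buffer. It mirrors the method names used above but is not the library's `LongQueryBuffer`; the class name `FilterBufferSketch`, its fields, and the `load` helper are assumptions for illustration.

```java
// Hypothetical sketch of a dual-pointer filter buffer.
class FilterBufferSketch {
    final long[] data;
    int read = 0;   // next value to inspect
    int write = 0;  // next slot for a retained value
    int end = 0;    // number of valid values

    FilterBufferSketch(int capacity) { data = new long[capacity]; }

    // Hypothetical helper: copy values in and reset both pointers.
    void load(long... values) {
        System.arraycopy(values, 0, data, 0, values.length);
        end = values.length;
        read = 0;
        write = 0;
    }

    boolean hasMore() { return read < end; }

    long currentValue() { return data[read]; }

    // Keep the current value: copy it to the write pointer, advance both.
    void retainAndAdvance() { data[write++] = data[read++]; }

    // Drop the current value: advance only the read pointer.
    void rejectAndAdvance() { read++; }

    // Retained values now occupy [0, write); make that the new valid range.
    void finalizeFiltering() {
        end = write;
        read = 0;
        write = 0;
    }
}
```

Because the write pointer never overtakes the read pointer, filtering happens in place without a second array.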

Especially noteworthy are the operations `retain()` and `reject()` in
[IntArraySearch](src/main/java/nu/marginalia/array/algo/IntArraySearch.java) and [LongArraySearch](src/main/java/nu/marginalia/array/algo/LongArraySearch.java).
They keep or remove, respectively, all items in the buffer that exist in the referenced range of the array,
which must be sorted.

These are used to offer an intersection operation for the B-Tree with sub-linear run time.
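
One way a `retain`-style operation could avoid scanning the whole array is to binary search the sorted range for each buffer value, resuming from the last position found. The sketch below additionally assumes that the buffer values are sorted; it uses plain `long[]` arrays and hypothetical names, and does not reproduce the library's implementation.

```java
// Hypothetical sketch: keep only the buffer values found in a sorted array range.
class SortedRetainSketch {

    // Index of the first element in sorted[from, to) that is >= key, or 'to' if none.
    static int lowerBound(long[] sorted, int from, int to, long key) {
        int lo = from, hi = to;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < key) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // Compacts buffer[0, length) in place, keeping only values present in
    // sorted[from, to). Returns the new length. Both inputs must be ascending.
    static int retain(long[] buffer, int length, long[] sorted, int from, int to) {
        int write = 0;
        int searchFrom = from;

        for (int read = 0; read < length; read++) {
            long value = buffer[read];

            // Earlier array positions hold smaller values, so never look back
            searchFrom = lowerBound(sorted, searchFrom, to, value);

            if (searchFrom == to) break; // all remaining buffer values are too large
            if (sorted[searchFrom] == value) buffer[write++] = value;
        }
        return write;
    }
}
```

For a buffer of m values and an array range of n values this does on the order of m log n comparisons, which stays well below n when the buffer is small relative to the array. A `reject` variant would copy the value when the comparison fails instead of when it succeeds, and would also have to keep the values left over after the early exit.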