Paper III — LLLV: The Retrieval Atom
Verifiable Memory for Accountable Agents
Normative keywords per RFC 2119/8174 (MUST/SHOULD/MAY) apply.
The Story
January 2025. A legal AI assistant. A malpractice lawsuit.
The AI had recommended a specific legal strategy based on "relevant case law." The case was lost. The client sued. In discovery, the question arose: why did the AI retrieve those specific cases?
The vendor's answer: "The vector search returned them as most relevant."
The lawyer's follow-up: "Prove it."
Silence. The retrieval system was a black box. No one could explain why those three cases were returned instead of the four that would have won the case. No one could prove the index hadn't been tampered with. No one could reconstruct the decision.
$4.2 million settlement. The AI couldn't explain its own memory.
Now imagine a different architecture.
Every retrieval returns an evidence chain:
{
"type": "LLLV_TOPK_EVIDENCE_V1",
"query_cid": "b3:7f3a9b2c...",
"index_pack_cid": "b3:4d5e6f7a...",
"results": [
{
"id": "case:smith_v_jones_2019",
"dist": 0.1831,
"proof": {
"block": "POSTINGS",
"merkle_path": ["b3:a1b2c3...", "b3:d4e5f6..."]
}
}
],
"stats": {
"algo": "hnsw",
"params_cid": "b3:8f9a0b1c...",
"visited": 812
},
"signature": "ed25519:responder_key"
}
This evidence can be verified offline, years later, by anyone with the pack file. The proof is cryptographic. The explanation is deterministic.
The lawsuit becomes a hash comparison.
This is LLLV.
I. The Problem
Modern retrieval systems are black boxes. They return results without explanation. They claim relevance without proof. They mutate state without receipts.
When an agent retrieves information to make a decision:
- How do we know the retrieval was honest?
- How do we know the index wasn't tampered with?
- How do we reconstruct "why these results" after the fact?
- How do we audit memory at scale?
Vector search without verification is not infrastructure. It is faith.
II. The Thesis
**Retrieval becomes infrastructure when answers are provable artifacts, not opaque guesses.**
LLLV defines:
1. Vector Capsule — the atomic, content-addressed unit binding embedding to provenance
2. Index Pack — a portable, merkleized ANN index verifiable offline
3. Top-K Evidence — a proof bundle explaining why these K items were returned
4. Temporal Narrative — an append-only timeline turning time into evidence
When memory can be audited, it can be trusted.