genes.apexpots.com / research source: donto-company-vision-2026-06-01.md

donto — A Strategy for Turning a Knowledge Substrate Into a Company (2026-06-01)

Prepared from 11 areas of landscape research and 5 adversarial thesis stress-tests. Written to be forwarded to your smartest friend, not to flatter you.

1. The one-sentence thesis

donto is the verifiable memory substrate for the agentic era — the only knowledge layer that keeps contradictions alive, anchors every claim to its source byte, replays what was believed at any moment in time, and enforces who is allowed to know what — and the company is built by leading with a governed, evidence-first consumer (memory and contested-evidence research) while keeping the substrate clean underneath.

That sentence already contains a correction the research forced on me. You want donto to be "substrate, never a product." Three of the five stress-tests independently concluded that the literal version of that philosophy is the single most dangerous thing about your strategy — it is the canonical platform-paradox trap, and it is exactly how the semantic web died commercially. So the thesis above keeps your architecture domain-neutral (correct, defensible, beautiful) while explicitly rejecting "substrate-first" as a go-to-market. You sell a product. The substrate emerges as a byproduct — the way AWS emerged from Amazon selling books for twelve years, not the way RDF was sold as "annotate the world and someday it will pay off."

The expanded vision. The first move of the last three years was that LLMs ate the reasoning layer; the open question of the next three is who owns the knowledge layer underneath them. Andrej Karpathy's 2025 framing is the cleanest articulation anyone has given of why donto should exist: the model is the "cognitive core" — the CPU — and it should offload bulk factual knowledge to an external system, because Allen-Zhu & Li showed empirically (ICLR 2025, "Physics of Language Models 3.3") that an LLM stores only ~2 bits of knowledge per parameter. Parametric memory is finite, lossy, un-updatable, and un-citable. The durable disk below the cognitive core is the prize. A whole "agent memory" category — Mem0 ($24M), Zep/Graphiti (YC), Letta ($10M), Cognee (€7.5M), Supermemory — is now racing to be that disk.

But almost every one of them is building a forgetful, opinionated disk: they overwrite on conflict (Mem0), invalidate the older fact and "consistently prioritize new information" (Zep/Graphiti), or pick a winner via LLM judgment. That is fine for "remember the user prefers window seats." It is a catastrophic, often illegal design for the domains where memory actually matters and where money is changing hands under regulatory duress: legal evidence, clinical records, scientific claims, intelligence, journalism, and — your own proving ground — contested native-title genealogy. In those domains the disagreement is the asset. The court needs to know that two sources gave two birth years and which one you believed when. The regulator (EU AI Act, Article 10 / Annex IV, in force August 2026) requires you to trace any output back to its source data and show belief lineage. donto is the only system architected, from the primary key up, for exactly that.

So the ten-year vision is not "a better Mem0." It is: the trust layer that sits between the three currently-disconnected provenance silos — content authenticity (C2PA, which proves a file's origin but says nothing about whether its claims are true), training-data lineage (Collibra/Atlan, which track tables and pipelines, not facts), and inference-time grounding (Vectara/Perplexity, which cite a chunk then throw the provenance graph away) — unifying them at the only granularity that matters for truth: the individual claim. If the next decade demands that AI systems be auditable, contestable, and governable, donto is the substrate that makes a claim itself a first-class, time-stamped, evidence-anchored, policy-bound, contradiction-tolerant object. Nobody else is building that. The hard part is not the architecture (you've built it). The hard part is everything else, and most of this document is about the everything else.
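To make "a claim as a first-class object" concrete, here is a minimal Python sketch of what such a record could look like. It is not donto's actual schema: the `Claim` and `EvidenceAnchor` types, their field names, and the choice of SHA-256 as the content address are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
import hashlib


@dataclass(frozen=True)
class EvidenceAnchor:
    blob_sha256: str   # content address of the source document (assumed: SHA-256)
    start: int         # byte offset where the supporting span begins
    end: int           # byte offset one past the end of the span


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    obj: str
    source: str              # who asserted it
    recorded_at: datetime    # when the store learned of it
    anchor: EvidenceAnchor   # where in the source bytes it comes from
    policy: str              # hypothetical policy label, e.g. "restricted"


def anchor_span(blob: bytes, span: bytes) -> EvidenceAnchor:
    """Locate span in blob and return a content-addressed byte-offset anchor."""
    start = blob.find(span)
    if start < 0:
        raise ValueError("span not found in source blob")
    return EvidenceAnchor(hashlib.sha256(blob).hexdigest(), start, start + len(span))


def verify(blob: bytes, anchor: EvidenceAnchor, expected: bytes) -> bool:
    """Check that blob is the anchored document and the offsets still yield the span."""
    if hashlib.sha256(blob).hexdigest() != anchor.blob_sha256:
        return False
    return blob[anchor.start:anchor.end] == expected
```

The point of the sketch is that the evidence pointer is part of the claim itself, so an auditor can re-check any claim against the original bytes without trusting the store.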

2. Why now (the moment)

Four macro forces converge, and they are unusually well-aligned with what you already have running.

Force 1 — The agent explosion made memory the bottleneck, not the model. The field's own consensus in 2025-2026 is that "memory is the limiting factor, not model capability"; ~65% of enterprise AI failures in 2025 were attributed to context/memory loss (mem0.ai "State of AI Agent Memory 2026"). The agent-memory market is sized ~$6.3B (2025) → ~$28.5B (2030) at ~35% CAGR. MCP (Model Context Protocol) went from ~100K to ~97M monthly SDK downloads in 18 months and was donated to the Linux Foundation's Agentic AI Foundation in December 2025. Critically, MCP defines the socket, not the knowledge backend — and the canonical reference memory server is a flat local JSONL file with nine tools and zero provenance, time, or contradiction model. The slot donto fits is standardized and empty.

Force 2 — The AI-slop / provenance crisis turned "where did this come from?" from idealism into procurement. The Data Provenance Initiative (MIT/Cohere, Nature Machine Intelligence Aug 2024) audited 1,800+ training datasets and found >70% license omission and >50% license error — provenance is empirically broken at scale. Stanford's RegLab found purpose-built legal RAG still hallucinates in 17-34% of queries. C2PA is now the de-facto file-provenance standard (OpenAI, Google, Adobe, Sony on the steering committee; Pixel 10 signs every photo). The market for "content provenance solutions" is ~$1.63B (2025) → ~$5.12B (2030). The whole stack is converging on the demand donto answers — but at the file and dataset level, leaving the claim level wide open.

Force 3 — Regulation makes it mandatory, with a hard date. The EU AI Act's high-risk requirements (Article 10 data governance + Annex IV traceability) carry penalties up to €15M / 3% of global annual turnover (the €35M / 7% ceiling is reserved for prohibited practices) and apply from August 2026. They literally require documented data provenance, data→model→decision lineage, and auditor-traceability of any output back to its source. donto's bitemporal "what did we believe at time T?" plus byte-offset evidence-anchoring is close to a turnkey implementation of Annex IV. ~$281-321M flowed into ~16-20 AI-governance startups in 2025-2026 — the budget line exists, and it is new.

Force 4 — Memory is the agent moat the labs will not neutralize for you. This is the subtle one, and the stress-tests sharpened it. Yes, OpenAI/Anthropic/Google all ship native memory now (Anthropic's is even auditable markdown, which undercuts a naive "we're transparent" pitch). But the labs have no incentive to make memory portable, neutral, multi-model, contradiction-preserving, or governed across providers — those features cut directly against their lock-in. The "Plaid for memory" thesis (Mem0's framing) is structurally sound precisely because the labs won't build the neutral layer. The danger is not that the labs absorb the deep layer; it's that funded independents (Zep especially) absorb it first, because each of donto's legs is individually copyable.

The honest read on timing: you are early enough that the category donto truly occupies ("contradiction-preserving, evidence-first, governed memory") has no name and no leader — that's the land-grab. You are late enough that the adjacent category ("agent memory") has a leader (Mem0), a benchmark regime (LoCoMo/LongMemEval/BEAM), and a near-architectural-twin (Zep). The window is real but it is not wide.

3. What genuinely sets donto apart (the defensible core)

Here is the synthesis of what survived adversarial attack, sorted by how defensible it actually is — not by how proud you are of it.

The one genuinely rare, genuinely defensible thing: paraconsistency + governance, fused

Across all 11 research areas and 5 stress-tests, two capabilities held up as both rare-in-production and hard-to-copy-quickly:

True paraconsistency — contradictory claims both live forever as legal state, with a queryable contradiction frontier and typed argument edges (supports/rebuts/undercuts). Every competitor does the opposite. Zep/Graphiti explicitly "sets t_invalid = t_valid of the invalidating edge" and "consistently prioritizes new information." Mem0 self-edits/overwrites. Supermemory does "contradiction resolution" (picks a winner) and "selective forgetting." A-MEM mutates old notes. This is not a marketing distinction; it is an architectural commitment nobody else has made because the agent-memory market's revealed preference is the opposite — devs want one clean answer. That last clause is the catch, and I'll return to it hard.
The Trust Kernel — 15 action-level policy capsules, fail-closed default, governance that propagates to all derivatives (embeddings, translations, exports inherit source policy), operationalizing FAIR + CARE (indigenous data sovereignty). Torch Capital's portable-memory thesis names the exact white space: "No discussion of data provenance, audit trails, or who validates memory accuracy... governance mechanisms notably absent." No agent-memory competitor implements CARE or policy-inheriting derivatives. This is the leg with both genuine product-space emptiness and a buyer with a compliance budget (EU AI Act, IEEE 2890-2025 Indigenous-provenance standard, GIDA CARE).
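The difference between "keep both" and "pick a winner" is easiest to see in code. The following is a minimal sketch, not donto's implementation: `ParaconsistentStore` and its method names are hypothetical, and it assumes a contradiction means two different values for the same subject and predicate.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    value: str
    source: str


class ParaconsistentStore:
    """Append-only store: conflicting claims coexist and nothing is overwritten."""

    def __init__(self):
        self._claims = []

    def assert_claim(self, claim):
        self._claims.append(claim)  # never replaces an earlier claim

    def claims_about(self, subject, predicate):
        return [c for c in self._claims
                if (c.subject, c.predicate) == (subject, predicate)]

    def contradiction_frontier(self):
        """Return every (subject, predicate) that carries more than one distinct value."""
        grouped = defaultdict(list)
        for c in self._claims:
            grouped[(c.subject, c.predicate)].append(c)
        return {key: cs for key, cs in grouped.items()
                if len({c.value for c in cs}) > 1}
```

An overwrite-on-conflict design would keep one row per key; here the disagreement stays queryable, which is the property the regulated domains need.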

The stress-test verdict was precise: the combination of bitemporality + paraconsistency + evidence-anchoring + policy governance is real and currently unoccupied — but it "partially holds" because it's a feature-bag, not a moat, and the most distinctive leg (paraconsistency) is something the mass market actively does not want. Conclusion: your moat is not "all four together." Your moat is paraconsistency + governance, deployed in domains where picking a winner is itself a defect. Lead with those two. Treat the rest as supporting cast.

What is rare but copyable (use it, don't bet the company on it)

Identity-as-hypothesis (query-time identity lenses: strict/likely/exploratory; non-destructive merge). Genuinely sophisticated — the academic basis is Bhattacharya & Getoor's query-time entity resolution (JAIR 2007), unproductized anywhere. But a funded team that reads the paper can ship a version.
Evidence-as-primary-key (3-tier byte-offset source trace, content-addressed blobs). Stronger than anyone shipping (Cognee's page-level provenance is the nearest, Diffbot's URL+timestamp is coarser). But TierMem (2026) and Microsoft's "Portable Agent Memory" paper (arXiv 2605.11032 — Merkle-DAG/BLAKE3 provenance + Ed25519-signed roots + confidence-scored S-P-O triples) show the idea is now in the water.
Lean 4 overlay that certifies but never gates ingest — structurally the AlphaProof pattern (neural intuition + symbolic verification). A real credibility marker for regulated/scientific buyers. Unique today.
Signed RO-Crate / W3C-PROV / DataCite release machinery — research-grade interop the startups completely lack. A wedge into FAIR/EOSC science markets and a Daubert/admissibility story for legal.
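Of the items above, identity-as-hypothesis is the least familiar, so a sketch helps. It assumes identity is stored as scored "same-as" hypotheses and that each lens is just a confidence threshold applied at query time; the threshold values and the `resolve` function are invented for illustration and are not donto's actual lens definitions.

```python
# Assumed cut-offs; the real lens semantics may differ.
LENS_THRESHOLDS = {"strict": 0.95, "likely": 0.75, "exploratory": 0.4}


def resolve(entities, hypotheses, lens):
    """Group entity ids under a lens without mutating the stored hypotheses.

    hypotheses: iterable of (id_a, id_b, confidence) same-as candidates.
    Returns a sorted list of sorted clusters.
    """
    threshold = LENS_THRESHOLDS[lens]
    parent = {e: e for e in entities}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Union only the pairs this lens is willing to believe.
    for a, b, confidence in hypotheses:
        if confidence >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for e in entities:
        clusters.setdefault(find(e), set()).add(e)
    return sorted(sorted(c) for c in clusters.values())
```

Because the merge happens in a throwaway union-find at read time, the same records can be viewed as one person under "exploratory" and as two under "strict". Note that transitive closure can over-merge long chains of weak links; a production design would need a guard against that.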

What is commodity — stop pitching these as differentiators

Bitemporality. This is the most important correction. Zep/Graphiti already ships a bitemporal temporal knowledge graph for agent memory, with a peer-reviewed paper (arXiv 2501.13956), real benchmarks, MCP server, and 30x usage growth. XTDB v2 (Grid Dynamics/JUXT) ships bitemporal-on-every-row SQL to regulated finance. If you lead with "bitemporal," you are a late, lesser-known entrant on your competitor's marquee feature. Bitemporality is table stakes in your category, not your edge.
Postgres-native storage. Redis, Supermemory (pgvector), and the entire "use the DB you have" crowd share this. It's good plumbing, not a moat — and it's a double-edged one (see §8: Postgres incumbency is exactly what commoditized the vector-DB specialists; "just turn on the extension" can happen to you).
The /memorize → extract → /recall loop. Functionally identical to Mem0, Zep, Cognee, Supermemory. Parity, not advantage.
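For readers who have not worked with bitemporal stores, the whole idea fits in one filter over two time axes. The sketch below is a generic illustration, not donto's or Zep's schema: the `Row` fields and the half-open interval convention are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Row:
    subject: str
    value: str
    valid_from: date               # when the fact is claimed to hold in the world
    valid_to: Optional[date]       # None means open-ended
    recorded_at: date              # when the store first held this belief
    retracted_at: Optional[date]   # when the store stopped holding it (None = still held)


def as_of(rows, subject, valid_at, believed_at):
    """Values held true of valid_at, as the store believed them on believed_at."""
    def covers(start, end, t):
        # Half-open interval [start, end); end=None means unbounded.
        return start <= t and (end is None or t < end)

    return sorted(
        r.value for r in rows
        if r.subject == subject
        and covers(r.valid_from, r.valid_to, valid_at)
        and covers(r.recorded_at, r.retracted_at, believed_at)
    )
```

Retraction closes the belief interval but never deletes the row, which is why "what did we believe at time T?" remains answerable. That is also why the feature is easy to copy: it is a schema convention plus a WHERE clause.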

Where donto is behind (the honest subsection)

Be merciless with yourself here, because buyers and investors will be:

Gap	The brutal version
No published benchmarks	The category is won in comparison tables. Mem0 cites ~92.5 LoCoMo / 94.4 LongMemEval; Zep 94.8% DMR; "Memento" 92.4% LongMemEval. donto has zero public numbers. Until you post one, you are invisible — and "697 facts from 'cat is red'" reads as noise/cost, not quality, to anyone in this field.
No SDK, no MCP server, no framework adapters	Mem0 ships "6 lines of code," 20+ vector-store and 21+ framework integrations, and is the exclusive memory provider for the AWS Agent SDK. donto is HTTP endpoints on one VM. The single most urgent integration gap is an MCP server compatible with Anthropic's reference KG-memory tools.
Scale is small, and 39.5M is not a moat	This was the sharpest hit. Single-node GraphDB/Virtuoso routinely load 8-100 billion triples; AutoSchemaKG/ATLAS is 5.9B edges; Diffbot is 1T facts. 39.5M is ~0.04-0.5% of a routine single-server load. Drop "39.5M statements" as evidence of anything. Your claim is design-proven, not scale-proven — say that.
Conceptual heaviness	21-clause DontoQL + identity lenses + trust kernel + 11×3 predicate alignment + Lean overlay is the exact "built by academics for academics" profile that killed the semantic web. It must be hidden behind a one-line default API or it repels every developer who just wants `memorize(text)`.
No reasoning layer over the contradictions	You store argument edges; you don't compute over them. The entire value of Dung/ASPIC+/Belnap is calculating which arguments are accepted under grounded/preferred semantics. Without that, the contradiction frontier risks being "a messy database" rather than a reasoner. This is your single biggest capability gap and, conveniently, a fundable differentiator.
The consolidation critique	"Contextual Agentic Memory is a Memo, Not True Memory" (arXiv 2604.27707) argues stores that "accumulate notes indefinitely" are lookup, not memory — and your "maximal extraction" is the platonic ideal of hoarding. You have no demonstrated semantic-abstraction/consolidation pathway. The field is moving toward selective memory; you're sprinting the other way.
Extraction trust + economics	"Maximal extraction" optimizes recall, which tanks precision — LLM triple extraction shows ~28-65% hallucination before verification. And the cheap economics rest on running extraction through a GLM coding subscription via OpenCode, which Z.AI now throttles/bans for non-coding use. That is both TOS-violating and an expiring subsidy.
Solo/no-team, no funding, no brand	75% of VC funds made zero solo-founder investments in 2025 (Carta). Pre-revenue + horizontal-infra + solo is close to unfundable institutionally.
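The "no reasoning layer" gap in the table is smaller than it sounds, at least for the simplest semantics. Below is a sketch of Dung's grounded extension, computed by iterating the characteristic function from the empty set. It assumes rebuts and undercuts edges are both flattened into a single attack relation and that supports edges are ignored; handling support properly needs a bipolar framework, which this sketch does not attempt.

```python
def grounded_extension(arguments, attacks):
    """Grounded extension of an abstract argumentation framework (Dung, 1995).

    arguments: iterable of argument ids.
    attacks: set of (attacker, target) pairs.
    """
    arguments = set(arguments)
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    accepted = set()
    while True:
        # Arguments attacked by something already accepted are defeated.
        defeated = {y for (x, y) in attacks if x in accepted}
        # An argument is defended if every one of its attackers is defeated.
        defended = {a for a in arguments if attackers[a] <= defeated}
        if defended == accepted:
            return accepted  # least fixed point reached
        accepted = defended
```

If two claims attack each other and nothing else intervenes, the grounded extension is empty: neither is accepted, which is the formal version of "both stay on the frontier". Preferred semantics would instead return each claim as its own alternative extension.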

4. The competitive landscape

donto competes in four overlapping arenas. The mistake would be to think of them as one. Here is the map.

Arena A — Agentic memory (donto-memory's direct ring)

Player	Funding	Core model	Contradiction handling	Provenance	Governance
Mem0	$24M (Basis Set/Peak XV/YC); AWS Agent SDK exclusive; ~48K stars	Fact extraction → vector+graph+KV, active curation	Overwrites / self-edits	Near-none	None
Zep / Graphiti	YC; Graphiti ~20-27K stars; arXiv 2501.13956	Bitemporal temporal KG	Invalidates older edge, prioritizes new	Episode-level	None
Letta (MemGPT)	$10M @ $70M (Felicis)	OS-tiered self-editing memory blocks	Overwrites blocks	None	None
Cognee	€7.5M (Pebblebed); Bayer, U. Wyoming	ECL pipeline → graph+vector	Has "forget" (destructive)	Page-level (rare exception)	Air-gapped/residency
Supermemory	$2.6M (Susa/Browder; Jeff Dean)	Universal memory API, MCP-native, OpenCode/Claude Code plugins	"Contradiction resolution" (picks winner) + selective forgetting	Shallow	None
LangMem	LangChain distribution	Semantic/episodic/procedural SDK	None	None	None
OpenAI / Anthropic / Google native memory	Effectively unlimited	Built-in, lock-in by design	Picks one answer	Anthropic: auditable markdown	Per-platform
donto-memory	$0, solo	Bitemporal + paraconsistent quad store, evidence-first	Keeps both forever	Byte-offset, primary-key	Trust Kernel / CARE

The story this table tells: donto is uniquely the one that doesn't pick a winner and uniquely the one with real governance. It is behind everyone on distribution, benchmarks, SDK, and team. Supermemory is the most strategically dangerous because it is already MCP-native and ships OpenCode/Claude Code plugins — your exact stack — and markets the opposite philosophy. Zep is the closest architectural cousin and the one most able to bolt on a "keep-both" flag.

Arena B — KG construction / GraphRAG

Microsoft GraphRAG (~33K stars; "From Local to Global"; proved indexing costs ~75% of token budget) and LightRAG (~36K stars, ~6000x cheaper/query) define the cost-conscious mainstream. AutoSchemaKG/ATLAS (HKUST: 5.9B edges, 92% schema alignment) is the closest thing to your "maximal extraction" ambition — at 150× your scale, autonomously. KARMA (NeurIPS 2025) runs 9 agents to reduce conflict edges 18.6% (the philosophical inverse of you). Diffbot (1T facts, per-fact provenance, profitable, bootstrapped) is the existence-proof that web-scale KG-with-provenance is a real business — but it canonicalizes to one entity. The entire field is racing toward less extraction per dollar (Microsoft's LazyGraphRAG: ~0.1% of indexing cost). You are the only one racing the other way. That is either a moat (nobody wants to pay for forensic exhaustiveness) or a trap (it's economically irrational and quality-negative). It depends entirely on choosing the domains where exhaustiveness is the feature.

Arena C — Bitemporal / immutable / provenance databases

XTDB v2 (Grid Dynamics/NASDAQ:GDYN; bitemporal SQL to finance) is your nearest commercial peer on bitemporality and proof the market pays for it — but it's rows, not claims, with no paraconsistency/evidence/identity/governance. Datomic (free, Apache-2.0, owned by Nubank) is the cautionary tale: immutable + Datalog + as-of, even free, stayed niche because of learning curve and thin docs — a direct warning about DontoQL. Amazon QLDB was discontinued (EOL July 2025) — a hyperscaler killed a standalone immutable ledger for insufficient pull. The lesson is written in neon: never sell "an immutable/bitemporal database." Sell the downstream value. Wikidata (qualifiers/references/preferred-normal-deprecated ranks) is your closest data-model peer and proves source-qualified, rank-able claims work at planet scale. The standards to align with rather than reinvent: RDF 1.2 / RDF-star, PROV-O, and nanopublications — whose 2025 proposed 4th "knowledge provenance" graph (supporting + conflicting evidence) maps almost 1:1 onto your contradiction frontier. That's your natural FAIR/science export format.

Arena D — Personal AI / second brain (the graveyard)

Rewind/Limitless (~$33M → Meta acqui-hire, hardware killed, ~$2M ARR), Mem.ai ("$40M second brain failure"), Personal.ai (niche). The lesson is unambiguous: do not build a consumer capture app or hardware. Capture friction and platform absorption kill it. Tana/Pieces could be consumers on donto; they are not your battlefield.

Where donto sits

donto sits in the white space between the silos: claim-level (vs C2PA's file, Collibra's table, Vectara's chunk), contradiction-preserving (vs everyone's resolve-or-overwrite), and governed-with-inheritance (vs nobody). The risk is that white space between silos is precisely where horizontal infra goes to die unless it picks a wedge. Which brings us to the company.

5. The layered company

How the existing portfolio fits

                    ┌─────────────────────────────────────────────┐
   CONSUMERS        │ genes   donto-memory   donto-lang   (+ new)  │
   (products,       │ (vertical) (horizontal API)  (pilot)         │
    where the    ───┼─────────────────────────────────────────────┤
    money is)       │              DontoQL / MCP / SDK             │  ← the seam:
                    ├─────────────────────────────────────────────┤    hide complexity here
   SUBSTRATE        │  donto: bitemporal · paraconsistent ·        │
   (architecture,   │  evidence-first · identity-as-hypothesis ·   │
    not the pitch)  │  Trust Kernel · Lean overlay · RO-Crate      │
                    └─────────────────────────────────────────────┘
                           Postgres (pg_donto) + dontosrv

The portfolio is correct as an architecture. The error is treating all three consumers as equal product bets. They are not:

donto-memory is the horizontal API product — the on-ramp, the developer-distribution motion, the thing that gets stars and an MCP listing. It must hide the substrate behind a dead-simple memorize/recall default with an opinionated "best current answer" lens, and expose the superpowers (AS_OF, contradiction frontier, identity lens, evidence trace) as advanced opt-ins.
genes is the vertical proving-ground and credibility corpus, NOT the revenue engine (the stress-test refuted "genealogy as early revenue" — §8). It hardens every invariant, produces the publishable CARE/contradiction story, and yields the lighthouse case study. It should be firewalled organizationally so it doesn't silently re-domain the company.
donto-lang is a research pilot — keep it as proof of domain-neutrality, don't resource it.
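The "simple default, superpowers as opt-ins" shape for donto-memory could look roughly like this. It is a hypothetical facade, not the current API: the `Memory` class, the `lens` keyword, and the use of a confidence score to choose the "best current answer" are all assumptions.

```python
class Memory:
    """Hypothetical facade: one-line default, advanced views behind a keyword."""

    def __init__(self):
        self._claims = []   # (key, value, source, confidence), append-only

    def memorize(self, key, value, source="unknown", confidence=0.5):
        self._claims.append((key, value, source, confidence))

    def recall(self, key, lens="best"):
        matches = [c for c in self._claims if c[0] == key]
        if lens == "frontier":
            # Advanced: every competing claim, with its source.
            return [(value, source) for _, value, source, _ in matches]
        if lens == "best":
            # Default: one answer chosen at read time; nothing is deleted.
            if not matches:
                return None
            return max(matches, key=lambda c: c[3])[1]
        raise ValueError(f"unknown lens: {lens}")
```

A developer who only wants `memorize` and `recall` gets one clean answer, while the contradiction is still there for the caller who asks for it. The winner is picked at the read path, never at the write path.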

Six new second-layer verticals worth pursuing

For each: the wedge, and specifically why donto's invariants are non-negotiable there (not nice-to-have). The selection criterion is the one the stress-tests kept hammering: pick domains where picking a winner is itself a defect and a buyer has a compliance budget.

1. Clinical evidence & pharmacovigilance adjudication. Wedge: reconciling conflicting studies, conflicting EHR entries, conflicting adverse-event reports into a contradiction-aware, time-stamped, source-anchored evidence record. Why the invariants matter: two studies will disagree on a drug's effect; the regulator (and the clinician) needs both, with provenance and "what did we believe when we made the dosing decision." Bitemporal belief-replay + paraconsistency + byte-offset provenance + fail-closed access governance is a compliance dream, and HIPAA-style audit is a forcing function exactly like the EU AI Act. Strongest leg used: all four.

2. Legal evidence / e-discovery / contradictory-witness modeling. Wedge: a contradiction frontier over depositions, exhibits, and precedents where "what did we know, and when" (AS_OF) is the literal legal question. Why: picking a winner among conflicting witnesses is the opposite of what discovery wants; the typed argument edges (supports/rebuts/undercuts) are AIF/ASPIC+ attack relations the legal-argumentation field already formalized. The signed RO-Crate/DataCite release machinery becomes the Daubert admissibility story (testable, has provenance, auditable belief-history). Risk: novel methodology can be challenged as "not generally accepted" — so partner with the argumentation community (ARG-tech, Chris Reed) for credibility.

3. Scientific claim-curation / research integrity. Wedge: a nanopublication-native store of supporting+conflicting evidence for scientific claims, citable as FAIR Digital Objects. Why: you already emit signed RO-Crate; nanopubs' proposed "knowledge provenance" graph is your contradiction frontier; the Data Provenance Initiative proved the need empirically. Buyers: EOSC, research-data repositories, journals fighting the reproducibility/retraction crisis. Lower margin, high credibility, paper-generating.

4. Intelligence / OSINT / Analysis of Competing Hypotheses. Wedge: ACH is methodologically central to intelligence analysis and academics note it's under-tooled. donto is literally a competing-hypotheses engine with identity-as-hypothesis and a contradiction frontier. Why: analysts must hold contradictory source reports, weight identity assertions ("is this the same person?"), and answer "what did we assess at time T." Risk: sales cycle and clearance gates; but high willingness-to-pay.

5. Regulated AI audit / EU-AI-Act traceability ("evidence pack for high-risk AI"). Wedge: be the verifiable evidence store feeding governance dashboards (Credo AI, OneTrust, Modulos), not the dashboard itself. Why: Annex IV requires data→model→decision lineage and output-to-source traceability; donto's bitemporal + byte-offset design is nearly turnkey. Partner-led GTM into a category that already has buyers and a deadline. This may be the largest near-term commercial surface.

6. Sovereign / indigenous & cultural-heritage memory (the "sovereign memory" flagship). Wedge: the only substrate that computationally enforces CARE / Local Contexts TK Labels and propagates them to embeddings and exports. Why: genes already exercises this; GLAM institutions, land councils, GBIF (which piloted TK/BC Labels 2024-25), and IEEE 2890-2025 give it institutional demand and grant funding. Caveat (critical): this is high-trust, low-margin, reputationally fragile, and — for you personally — conflicted (you are an adverse party in an EKY claim; see §8). Productize it for other communities as custodians; never position yourself as the authority.
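Several of these verticals lean on governance that follows the data into its derivatives. As a rough sketch of what "derivatives inherit source policy" and "fail-closed" mean mechanically, assume each artifact carries a set of explicitly allowed actions and a link to its parents. The `Artifact`, `derive`, and `is_allowed` names are illustrative and are not the Trust Kernel's real interface.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    name: str
    policy: frozenset                     # actions explicitly allowed on this artifact
    parents: list = field(default_factory=list)

    def effective_policy(self):
        """A derivative may do only what it and every ancestor allow."""
        allowed = set(self.policy)
        for parent in self.parents:
            allowed &= parent.effective_policy()
        return frozenset(allowed)


def derive(name, *parents, policy=None):
    """Create a derivative (embedding, translation, export) linked to its sources."""
    if policy is None:
        # Ask for nothing beyond what the parents already permit.
        policy = set().union(*(p.effective_policy() for p in parents))
    return Artifact(name, frozenset(policy), list(parents))


def is_allowed(artifact, action):
    """Fail closed: anything not explicitly allowed is denied."""
    return action in artifact.effective_policy()
```

An export built from a restricted record cannot grant itself rights the record never had, even when its own policy requests them. A derivative with no parents and no policy is allowed nothing.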

The throughline: all six are "memory you can defend" markets, not "memory the chatbot has" markets. That is the whole strategy in one phrase.

6. The wedge & go-to-market — resolving the "substrate, never a product" tension

Three of the five stress-tests were harsh on this point, so let me be direct: "substrate, never a product" is correct as an architecture principle and fatal as a go-to-market. Eric Paley's platform paradox, the vertical-SaaS-grows-2-3x-faster data, Palantir's explicit rejection of neutrality ("a digital twin of the organization... not generic horizontal data infrastructure"), and the semantic web's 25-year commercial failure all point the same way. You must lead with a product. The substrate emerges as a byproduct.

Here is the resolution, sequenced.

The model: open-core, three concentric rings.

Open-source the substrate primitive (pg_donto + dontosrv + a thin client SDK + an MCP server) to drive developer adoption. This is the only proven way a horizontal data primitive has ever won (Postgres, Neo4j, MongoDB). Stars, npm/PyPI pulls, MCP-registry listing, framework adapters (LangChain/LlamaIndex/CrewAI/OpenAI-Agents/Anthropic-ADK). The single highest-leverage build is an MCP memory server API-compatible with Anthropic's reference KG-memory tools so donto is a literal drop-in upgrade into 10K+ MCP hosts.
Monetize a managed multi-tenant cloud (donto Cloud) with usage-based metering (per write/recall/search op, mirroring Mem0's $0/$19/$249 tiers and Supermemory's per-token pricing). The commercial tier is governance: SSO, the full Trust Kernel/governance console, signed RO-Crate export, audit reporting, SLAs, on-prem/VPC. Governance + provenance are the natural paid tier because that is exactly what regulated enterprises pay for — and what the open core deliberately doesn't fully unlock.
Sell into one regulated beachhead vertical with a thin, opinionated product on top — not a horizontal pitch. My recommendation for the first paid wedge: regulated AI audit / EU-AI-Act evidence packs (#5) or clinical/pharmacovigilance adjudication (#1), because both have an external deadline, a named budget line, and a buyer who treats paraconsistency-and-provenance as requirements, not nice-to-haves. Genealogy/native-title is the proving-ground and case study, not the revenue beachhead.

Pricing power lives in governance, not storage. Usage-based memory pricing is racing toward $0.001/op (MemoClaw); undifferentiated storage/recall margins will compress (see Pinecone's revenue decline). Only governance/provenance/assurance features hold pricing power. So: cheap (or free) recall, paid trust.

The benchmark move is non-negotiable and comes first. You cannot sell, raise, or even appear in the category without a number. But you should not race Mem0/Zep on plain LoCoMo recall (you'll lose on a metric that doesn't reward your design). Instead: (a) publish competent LoCoMo/LongMemEval numbers to be legible, and (b) define the benchmark you win — a "contradiction-retention / provenance-recall / belief-at-time-T" eval where the correct answer is "both X and Y are claimed, by sources A and B, valid at T1/T2." IBM's WikiContradict (NeurIPS 2024: all LLMs near-random at contradiction detection) is your ready-made foil; BEAM already has a contradiction-resolution category where pick-newest designs structurally lose. Own the eval the way Zep made temporal a thing.
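A first cut of the "benchmark you win" can be very small. The sketch below scores an answer by how many of the gold (value, source) claims it retains. The metric name and the scoring rule are proposals for illustration, not an existing benchmark, and a real eval would also need to penalize spurious claims and check the T1/T2 validity windows.

```python
def contradiction_retention(gold, predicted):
    """Fraction of gold (value, source) claims that the answer retains.

    gold, predicted: collections of (value, source) pairs for one question.
    """
    gold, predicted = set(gold), set(predicted)
    if not gold:
        raise ValueError("gold answer must contain at least one claim")
    return len(gold & predicted) / len(gold)


def benchmark_score(items):
    """Mean retention over a list of (gold, predicted) pairs."""
    scores = [contradiction_retention(g, p) for g, p in items]
    return sum(scores) / len(scores)
```

On a question whose gold answer holds two conflicting claims, a pick-newest system tops out at 0.5 by construction, while a keep-both system can reach 1.0. That structural gap is what makes the eval worth owning.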

The funding posture: raise a small, angel-heavy pre-seed/seed ($1.5-4M — the Supermemory/Cognee band), not a mega-round. The relevant names are on the Mem0/Letta cap tables: Neo4j's Philip Rathle, Datadog's Olivier Pomel, dbt's Tristan Handy, MotherDuck's Jordan Tigani, plus governance/data angels. And add a co-founder before you raise — solo + pre-revenue + horizontal-infra is near-unfundable, and you cannot run an enterprise-compliance sales motion alone.

7. The visionary horizon — "1M+ facts per text"

I want to engage this seriously because it's the heart of your vision, and then I want to save you from the version of it that kills the company.

What's right about it. The defensive half of the thesis is sound: end-to-end and long-context models did not make explicit knowledge layers obsolete. The market converged on hybrid (GraphRAG, agent memory); long-context-vs-RAG analyses conclude "neither is obsolete, route between them," with external stores winning on freshness, updateability, multi-hop, and a ~1,250× cost advantage at scale. Allen-Zhu & Li's 2-bits-per-parameter ceiling is the quantitative floor under Karpathy's cognitive-core thesis. AlphaProof's IMO silver medal is the proof that neural+symbolic+formal-verification still beats pure scaling on hard reasoning — and that's your Lean-overlay pattern. So "explicit, auditable, external knowledge survives" is a defensible bet.

What's wrong about it. "1M+ facts per text / maximal extraction" optimizes the wrong axis. More facts ≠ more value or more truth:

It is the textbook OpenIE failure mode: maximize recall, tank precision. LLM triple extraction shows ~28-65% subject hallucination before verification.
a16z's data-moat analysis shows value asymptotes — past ~40% coverage, each additional fact costs more and adds almost nothing, while freshness decays.
The "memo not memory" critique (arXiv 2604.27707) and CoALA both say real memory requires consolidation (semantic abstraction, procedural learning), which hoarding actively works against.
And it is the Cyc bet. A multi-decade, two-person-century effort to encode "everything in extreme detail" as explicit knowledge was eclipsed by deep learning that kept knowledge implicit. donto is the LLM-accelerated reprise of exactly that wager. The bitter-lesson camp (Sutton's Turing Award; Silver & Sutton's "Era of Experience") is the genuine intellectual headwind: if agents learn world models end-to-end from grounded experience, painstakingly extracted human-authored facts are a depreciating asset.

The reframe that saves it. The vision is not wrong — it's mis-stated. Change the objective function from volume to verified, governed, contradiction-aware, byte-anchored provenance. The right sentence is not "a million facts from any text." It is:

"Lossless, contradiction-preserving decomposition of a source into claims, each anchored to its evidence, each governed, each replayable through time — so that an AI's understanding of contested reality is auditable in extreme detail."

That reframe does three things: it answers the consolidation critique (add a sleep-time semantic-consolidation pass — Letta's framing — that abstracts the granular facts into higher-level claims without destroying them, which bitemporality makes safe); it answers the precision objection (pair extraction with the Lean overlay + a published precision/recall number, and use maturity/polarity gates so raw extraction is cheap-stored but only evidence-anchored mature claims are queried); and it answers the bitter lesson (anchor in domains where ground truth is contested/legal/cultural and provenance is the product — exactly where Era-of-Experience agents have no answer because there is no reward signal for "what does the law say happened in 1898").
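Two of those answers, the maturity gate and the non-destructive consolidation pass, are simple enough to sketch. The `Fact` fields, the 0-to-2 maturity scale, and both function names are assumptions made for illustration; the only properties that matter are that raw extractions are stored but not served, and that a summary points back at its inputs instead of replacing them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    text: str
    anchored: bool             # has a byte-offset evidence anchor
    maturity: int              # hypothetical scale: 0 = raw extraction, 2 = reviewed
    derived_from: tuple = ()   # indexes of the granular facts a summary abstracts


def queryable(facts, min_maturity=1):
    """Gate reads: everything is stored, only anchored and mature facts are served."""
    return [f for f in facts if f.anchored and f.maturity >= min_maturity]


def consolidate(facts, ids, summary_text):
    """Add a higher-level claim that points at its inputs; the inputs stay in the store."""
    summary = Fact(
        summary_text,
        anchored=all(facts[i].anchored for i in ids),
        maturity=min(facts[i].maturity for i in ids),  # no stronger than its weakest input
        derived_from=tuple(ids),
    )
    return facts + [summary]
```

The summary inherits the weakest maturity of its inputs, so consolidation cannot launder a raw extraction into a trusted claim.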

The 10-year picture if it works. Models become cheap, abundant reasoning cores. The scarce, valuable thing is trustworthy, contestable, governed knowledge state — the substrate that can say "here is what is claimed, by whom, on what evidence, believed when, contested how, allowed to whom." donto is the system of record for contested reality: the place a court, a regulator, a clinician, an analyst, or an agent goes when "the chatbot remembers" is not good enough and "cite the source, show both conflicting claims, prove what we believed when, and enforce who may see it" is the whole job. Inference cost falls ~10×/year (LLMflation), so the economics of high-recall verified extraction improve every quarter — and when extraction is nearly free, donto is the only substrate that can hold the output without collapsing under contradictions or losing provenance. That is a real, large, durable company. But it is built by narrowing the vision into "verified provenance," not by chasing the fact counter.
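
"Prove what we believed when" is a bitemporal query: filter on both the time a claim is said to hold and the time the system recorded it. The sketch below assumes a hypothetical row layout with integer years and half-open intervals; donto's real storage format and its AS_OF syntax are not shown in this document, and the example rows are invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Row:
    claim: str
    valid_from: int          # valid time: when the claim is said to hold in the world
    valid_to: Optional[int]  # None = open-ended
    tx_from: int             # transaction time: when the system recorded the claim
    tx_to: Optional[int]     # set when superseded; the row itself is never deleted


def _covers(start: int, end: Optional[int], t: int) -> bool:
    """Half-open interval test: start <= t < end, with None meaning no end."""
    return start <= t and (end is None or t < end)


def as_of(rows: List[Row], valid_t: int, tx_t: int) -> List[str]:
    """What did the system believe at tx_t about the state of the world at valid_t?"""
    return [
        r.claim
        for r in rows
        if _covers(r.valid_from, r.valid_to, valid_t) and _covers(r.tx_from, r.tx_to, tx_t)
    ]


# Invented example: a belief recorded in 2020 and superseded in 2024.
rows = [
    Row("office located in A", 1990, None, tx_from=2020, tx_to=2024),
    Row("office located in B", 1990, None, tx_from=2024, tx_to=None),
]
belief_2021 = as_of(rows, valid_t=2000, tx_t=2021)
belief_2025 = as_of(rows, valid_t=2000, tx_t=2025)
```

Because superseding a row only closes its `tx_to`, the 2021 belief remains reproducible after the 2024 correction, which is the property an audit or court replay would rely on.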

8. The hard problems you will face

No softening. These are in roughly the order they can kill you.

Founder / execution (the most likely killer).

Solo team. 75% of VC funds made zero solo-founder investments in 2025. You need a co-founder — ideally a GTM/enterprise-sales or distribution-minded partner, because your gap is not technical.
Focus / breadth. The architecture has ~8 deep invariants and four consumers. To an investor and a developer this reads as "no wedge." You must pick one and say no to the rest in public, while keeping them alive in the codebase.
The genealogy-gravity risk. This is real and specific. The corpus is overwhelmingly your own family; the only consumer of donto-memory is the Omega bot; there are zero paying users. The pull of the family corpus will silently convert donto-the-substrate into genes-the-genealogy-app unless you build a hard organizational firewall (separate brand/budget, a "no re-domaining" tripwire).
You are a contested party in your own flagship. You are adverse to CYLC, Jabalbina PBC, and the State of Queensland, trying to insert your ancestor into the EKY schedule. That is advocacy — the literal opposite of a neutral substrate vendor, and it contradicts your own "no authority is ground truth" axiom (you are picking a winner). For genealogy to be a credibility asset rather than a liability, you must exit the role of contested party in your own claim or clearly separate "donto the company" from "Thomas the claimant."

Technical.

Extraction quality vs cost. Maximal extraction = precision collapse + cost blowup. The flat-rate GLM-coding-subscription economics violate Z.AI TOS (non-coding use throttled, banned after 3 violations) and are an expiring subsidy. You need a sustainable, TOS-compliant inference contract and a precision/recall story before scaling volume.
Query planner & scale. Your own notes show /search needs careful index tuning to avoid seq-scanning 39M rows; identity-lens closure over billions of coreference edges is an unproven performance problem competitors sidestep by merging eagerly. The deployment is a single VM with no HA and no concurrency proof.
Predicate proliferation & dedup. Open-world, schema-plural extraction at maximal volume will spawn near-duplicate predicates and entities; the 11×3 predicate alignment helps but the dedup/canonicalization burden grows superlinearly.
No reasoning engine over argument edges. You store supports/rebuts/undercuts but don't compute acceptability. This is the gap between "a warehouse of contradictions" and "a contradiction reasoner."
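
One standard way to "compute acceptability" is the grounded extension from Dung-style abstract argumentation. The sketch below is an assumption about how that gap could be closed, not a description of donto: it treats every rebut or undercut edge as a plain attack and ignores support edges entirely, which is a deliberate simplification.

```python
from typing import Iterable, Set, Tuple


def grounded_extension(arguments: Iterable[str], attacks: Iterable[Tuple[str, str]]) -> Set[str]:
    """Least fixed point of the defence function, iterated up from the empty set.

    attacks holds (attacker, target) pairs. An argument is accepted once every
    one of its attackers is itself attacked by an already accepted argument.
    """
    args = set(arguments)
    attackers = {a: set() for a in args}
    for src, dst in attacks:
        attackers[dst].add(src)

    accepted: Set[str] = set()
    while True:
        defended = {
            a for a in args
            if all(attackers[b] & accepted for b in attackers[a])
        }
        if defended == accepted:
            return accepted
        accepted = defended


# B rebuts A, and C undercuts B: C is unattacked, so it reinstates A.
chain = grounded_extension({"A", "B", "C"}, {("B", "A"), ("C", "B")})

# X and Y attack each other: neither is accepted, and the conflict stays open.
standoff = grounded_extension({"X", "Y"}, {("X", "Y"), ("Y", "X")})
```

The grounded semantics is the most sceptical choice: a mutual attack yields an empty result rather than a picked winner, which fits a substrate that preserves contradictions instead of resolving them.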

Market.

Absorption — from the side, not above. The labs won't build neutral governed memory, but funded independents (Zep, Supermemory) can bolt on "keep-both + cite-sources" faster than you can build distribution.
Education burden. "Provenance" already means C2PA to media buyers and Collibra-lineage to data buyers. Your claim-level/paraconsistent meaning is a third definition with no budget line yet — long, expensive evangelism.
Postgres commoditization-from-below. The same incumbency that ate the vector-DB specialists ("just turn on pgvector") can eat you ("just turn on the bitemporal extension"). Your moat must be the governance/network/relationships, not the code — a16z: "if your moat is code, you don't have a moat."
Benchmark invisibility. Until you post a number, you don't exist in the category.

Business.

Who pays, and what for. Chatbot-memory buyers want cheap single answers (you can't win there). Regulated buyers want governance/audit (slow, sales-heavy, the GTM a small team is worst at). The fast self-serve market is the one you can't win; the market you can win is slow. That gap is the central business problem — bridge it with open-core developer adoption (fast, free) funneling into governed-cloud upsell (slow, paid).
Pricing power. Only governance/assurance holds margin; storage/recall is racing to ~$0.001/op.

Legal / ethical.

Append-only vs GDPR/CPRA right-to-be-forgotten. "Never destructively delete" collides head-on with erasure rights. You need a defensible answer (crypto-shredding of content-addressed blobs while preserving the bitemporal skeleton? policy-capsule-gated tombstoning?) before an enterprise legal review, not after.
CARE / indigenous data sovereignty. Aboriginal genealogy is collectively owned and FPIC-gated; "DNA for Aboriginality" is scientifically rejected and politically toxic in Australia. Commercializing on it invites reputational ruin unless governed by community authority, not just your policy capsules. The trust kernel is necessary but not sufficient — it needs custodian relationships.
Liability of contradiction-preserving evidence in court. Novel bitemporal/paraconsistent methodology can be challenged under Daubert-style reliability tests (testability, known error rate, peer review, general acceptance), where a new method starts at a disadvantage. And a single LLM-extracted hallucination presented with impeccable provenance in a real native-title or clinical context could be existentially damaging: provenance on a wrong fact is worse than no fact. The Lean overlay and a faithfulness certificate must become gating for mature claims, not just shape-checking.
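
The crypto-shredding idea raised for the erasure problem can be illustrated with a toy store. Everything here is an assumption for illustration: the `ShreddableStore` class is hypothetical, and the cipher is a one-time pad built from the standard library so the example runs anywhere. A real system would use a vetted authenticated cipher and a proper key-management service, and whether key destruction satisfies a given erasure obligation is a legal question this sketch does not settle.

```python
import hashlib
import secrets
from typing import Dict, Optional


class ShreddableStore:
    """Append-only ciphertext blobs addressed by hash, plus a deletable key table."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}  # content address -> ciphertext (never deleted)
        self.keys: Dict[str, bytes] = {}   # content address -> key (the only erasable part)

    def put(self, plaintext: bytes) -> str:
        key = secrets.token_bytes(len(plaintext))  # toy one-time pad, illustration only
        ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
        address = hashlib.sha256(ciphertext).hexdigest()
        self.blobs[address] = ciphertext
        self.keys[address] = key
        return address

    def get(self, address: str) -> Optional[bytes]:
        key = self.keys.get(address)
        if key is None:
            return None  # shredded: the address survives, the content does not
        return bytes(c ^ k for c, k in zip(self.blobs[address], key))

    def shred(self, address: str) -> None:
        """Destroy the key; the ciphertext and its address stay in the log."""
        self.keys.pop(address, None)


store = ShreddableStore()
addr = store.put(b"personal detail subject to erasure")
before = store.get(addr)
store.shred(addr)
after = store.get(addr)
```

Because the address is the hash of the ciphertext rather than the plaintext, claims that cite the blob keep a valid, verifiable reference after shredding, so the bitemporal skeleton is preserved while the content becomes unrecoverable.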

9. Modern research & reading map

Grouped by theme, annotated for why it matters to donto. This is the deliverable to bookmark.

A. The case FOR an external knowledge substrate (your raison d'être)

Karpathy, "cognitive core" / 2025 Year in Review — https://karpathy.bearblog.dev/year-in-review-2025/ — "LLM is the CPU, context is the RAM; offload bulk facts externally." The clearest A-list articulation of donto's reason to exist.
Allen-Zhu & Li, "Physics of Language Models 3.3" (ICLR 2025) — https://arxiv.org/abs/2404.05405 — ~2 bits of knowledge/parameter. The quantitative floor under the whole thesis.
DeepMind AlphaProof / AlphaGeometry 2 — https://deepmind.google/blog/ai-solves-imo-problems-at-silver-medal-level/ — neural + Lean verification beats scaling. Validates your Lean-overlay pattern.

B. The case AGAINST (the headwinds you must answer)

Silver & Sutton, "Welcome to the Era of Experience" (2025) — https://storage.googleapis.com/deepmind-media/Era-of-Experience%20/The%20Era%20of%20Experience%20Paper.pdf — the bitter-lesson argument that curated human knowledge is a sunset asset. Your main counter-thesis.
LeCun / JEPA (V-JEPA 2, LeJEPA) — https://www.turingpost.com/p/jepa — implicit latent world models; "don't work on LLMs." Non-inspectable, non-attributable — the opposite of your design, hence different needs.
"Critiques of World Models" (PAN) — https://arxiv.org/html/2507.05169v2 — argues mixed discrete-symbolic + continuous beats pure latent. Supports a role for structured knowledge.
a16z, "The Empty Promise of Data Moats" — https://a16z.com/the-empty-promise-of-data-moats/ — value asymptotes; freshness decays. The strongest argument against "1M facts."
Cyc — https://en.wikipedia.org/wiki/Cyc ; https://venturebeat.com/ai/how-llms-could-benefit-from-a-decades-long-symbolic-ai-project — the cautionary precedent for "encode everything explicitly."

C. Agent memory — the competitive set (read all)

Mem0 (arXiv 2504.19413) — https://arxiv.org/abs/2504.19413 — the production reference architecture and benchmark bar.
Mem0 "State of AI Agent Memory 2026" — https://mem0.ai/blog/state-of-ai-agent-memory-2026 — the leader admits provenance/contradiction/evidence-tracking are unsolved. Your strongest external validation.
Zep / Graphiti (arXiv 2501.13956) — https://arxiv.org/abs/2501.13956 — your closest cousin; bitemporal, picks-newest. The foil that defines your wedge.
MemGPT / Letta (arXiv 2310.08560) — https://arxiv.org/abs/2310.08560 — the category root.
Sleep-time Compute (arXiv 2504.13171) — https://arxiv.org/abs/2504.13171 — frame your durable Temporal extraction as this; turns "5 min/message" into a feature.
HippoRAG / HippoRAG 2 — https://arxiv.org/abs/2405.14831 — the best associative retrieval (Personalized PageRank); the layer you should bolt onto your quad store.
A-MEM (NeurIPS 2025) — https://arxiv.org/abs/2502.12110 — Zettelkasten self-organizing memory; mutates old notes (your non-destructive contrast).
"Contextual Agentic Memory is a Memo, Not True Memory" (arXiv 2604.27707) — https://arxiv.org/abs/2604.27707 — the consolidation critique you must answer.
"Memory for Autonomous LLM Agents" survey (arXiv 2603.07670) — https://arxiv.org/html/2603.07670v1 — names your exact value props as open problems; lists MemoryArena (harder bench).
Microsoft "Portable Agent Memory" (arXiv 2605.11032) — https://arxiv.org/html/2605.11032v1 — a hyperscaler circling your design (Merkle-DAG/Ed25519 provenance + S-P-O). Inspiration and threat.
Vendor landscape 2026 — https://agentmarketcap.ai/blog/2026/04/10/agent-memory-vendor-landscape-2026-letta-zep-mem0-langmem

D. KG construction / GraphRAG

Microsoft GraphRAG — https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/ — the cost lesson (75% of tokens on indexing).
LightRAG (EMNLP 2025) — https://github.com/HKUDS/LightRAG — the cost-cutting champion (your inverse).
AutoSchemaKG / ATLAS (arXiv 2505.23628) — https://github.com/HKUST-KnowComp/AutoSchemaKG — 5.9B edges; the maximal-extraction bar at scale.
KARMA (NeurIPS 2025, arXiv 2502.06472) — https://github.com/YuxingLu613/KARMA — multi-agent extraction that reduces conflicts (your inverse).
KGGen + MINE (NeurIPS 2025) — https://github.com/stair-lab/kg-gen — the emerging text-to-KG quality benchmark; run donto through it.
"Can KGs Reduce Hallucinations? A Survey" (NAACL 2024) — https://aclanthology.org/2024.naacl-long.219/ — grounding helps, but LLM-built KGs hallucinate; cautions maximal extraction.

E. Bitemporal / immutable / provenance DBs

XTDB v2 — https://xtdb.com/ ; bitemporality concepts https://v1-docs.xtdb.com/concepts/bitemporality/ — your nearest commercial peer; market-proof.
Datomic — https://www.datomic.com/ — free, admired, niche. The DontoQL learning-curve warning.
TerminusDB — https://github.com/terminusdb/terminusdb — Git-on-a-graph; design reference.
Wikidata data model — https://www.wikidata.org/wiki/Wikidata:Data_model — closest data-model peer (qualifiers/references/ranks).
(Cautionary) Amazon QLDB EOL — https://aws.amazon.com/qldb/ — never sell "an immutable DB."

F. Paraconsistency, argumentation, belief revision (your theoretical backbone)

"Dealing with Inconsistency for Reasoning over KGs: A Survey" (arXiv 2502.19023) — https://arxiv.org/html/2502.19023v1 — keep-both (you) vs repair (mainstream). Your formal map.
"Queries with Exact Truth Values in Paraconsistent DLs" (arXiv 2408.07283) — https://arxiv.org/pdf/2408.07283 — Belnap four-valued query semantics; cite to argue soundness.
AKReF (arXiv 2506.00713) — https://arxiv.org/pdf/2506.00713 — typed argument edges (undercut/rebut/undermine); your edges' academic mirror.
IBM WikiContradict (NeurIPS 2024) — https://research.ibm.com/publications/wikicontradict... — LLMs near-random at contradiction detection. Your proof-of-need and benchmark foil.
ARG-tech (Chris Reed, Dundee) — https://www.arg.tech — AIF standard; potential partner; also the 20-year "argumentation alone isn't a venture business" cautionary tale.

G. Provenance, trust, governance, standards (your moat layer)

Data Provenance Initiative (Nature MI, 2024) — https://www.nature.com/articles/s42256-024-00878-8 — provenance is empirically broken; your thesis evidence.
W3C PROV-O — https://www.w3.org/TR/prov-o/ ; RDF 1.2 / RDF-star — https://www.w3.org/TR/rdf12-concepts/ ; Nanopublications + "knowledge provenance" — https://nanopub.net/ , https://ceur-ws.org/Vol-3937/paper10.pdf — the standards to align with, not reinvent; nanopubs' 4th graph ≈ your contradiction frontier.
C2PA Content Credentials — https://c2pa.org/ — file-level provenance; interoperate (ingest/emit), don't compete.
CARE Principles (GIDA) — https://datascience.codata.org/articles/10.5334/dsj-2020-043 ; operationalization https://www.nature.com/articles/s41597-021-00892-0 ; Local Contexts / TK Labels — https://localcontexts.org/ — your governance credibility anchor.
RO-Crate Workflow Run profile (PLoS ONE 2024) — https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0309210 — adopted across bioinformatics; you already emit this.
EU AI Act — https://artificialintelligenceact.eu/ ; requirements https://goteleport.com/blog/eu-ai-act-requirements/ — the forcing function and deadline.
Stanford legal-RAG hallucination study — https://dho.stanford.edu/wp-content/uploads/Legal_RAG_Hallucinations.pdf — 17-34% hallucination; "provenance on a wrong fact is a liability."

H. Standards & ecosystem

MCP 2026 roadmap — https://blog.modelcontextprotocol.io/posts/2026-mcp-roadmap/ ; reference servers — https://github.com/modelcontextprotocol/servers — the socket and the thin baseline to obliterate.
GraphRAG-as-grounding (ACL GenAIK 2025) — https://aclanthology.org/2025.genaik-1.6/ — the live narrative to ride; cuts hallucination 30-40%.
Semantic-web post-mortem — https://www.semanticarts.com/the-year-of-the-knowledge-graph-2025/ — why RDF wins only when invisible + immediate payoff. Read it twice.

I. Strategy / GTM

Paley, "The Platform Paradox" — https://techcrunch.com/2015/11/28/the-platform-paradox/ · Tunguz, vertical-SaaS tradeoff — https://tomtunguz.com/vertical-saas-tradeoff/ · Contrary, "The Vertical AI Playbook" — https://research.contrary.com/report/the-vertical-ai-playbook · Carta solo-founders — https://carta.com/data/solo-founders-report/

10. The next 6-18 months

A concrete, sequenced plan. The ordering reflects risk, not comfort.

Phase 0 (Weeks 1-6): Become legible and reframe

Stop saying "39.5M statements" and "1M facts." Rewrite the public framing to "design-proven verifiable memory: contradiction-preserving, evidence-anchored, governed." This is free and removes two of your worst self-inflicted wounds.
Ship the MCP server — API-compatible with Anthropic's reference KG-memory tools (entities/relations/observations + the 9 tools), plus extra tools for AS_OF, contradiction frontier, identity lens, evidence trace. List it in the MCP Registry. This is your distribution on-ramp.
Ship a dead-simple SDK (Python + TS) with memorize(text) / recall(query) defaults and an opinionated "best-current-answer" lens (highest-maturity, best-evidenced claim) so paraconsistency is opt-in depth, not default friction. Add a Mem0-compatible add()/search() shim so you're a drop-in replacement for devs already on a competitor.
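
A sketch of what that simple SDK surface might look like follows. The class name, the `lens` parameter, the scoring tuple, and the substring match standing in for real retrieval are all assumptions. The `add`/`search` aliases only mirror the method names mentioned above; Mem0's real signatures take further parameters that are not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StoredClaim:
    text: str
    maturity: int        # hypothetical scale: higher = more vetted
    evidence_count: int  # number of anchored evidence spans


class MemorySketch:
    def __init__(self) -> None:
        self._claims: List[StoredClaim] = []

    def memorize(self, text: str, maturity: int = 0, evidence_count: int = 0) -> None:
        self._claims.append(StoredClaim(text, maturity, evidence_count))

    def recall(self, query: str, lens: str = "best") -> List[StoredClaim]:
        # Substring match is a placeholder for a real retrieval layer.
        hits = [c for c in self._claims if query.lower() in c.text.lower()]
        if lens == "all" or not hits:
            return hits  # opt-in depth: every stored claim, conflicts included
        # Default lens: the single highest-maturity, best-evidenced claim.
        return [max(hits, key=lambda c: (c.maturity, c.evidence_count))]

    # Name-level aliases for developers used to add()/search()-style clients.
    def add(self, text: str) -> None:
        self.memorize(text)

    def search(self, query: str) -> List[StoredClaim]:
        return self.recall(query)


memory = MemorySketch()
memory.memorize("Contract signed in 1898", maturity=1, evidence_count=1)
memory.memorize("Contract signed in 1901", maturity=3, evidence_count=2)
best = memory.recall("contract signed")
everything = memory.recall("contract signed", lens="all")
```

The default call returns one answer, as a chatbot-memory developer expects, while `lens="all"` exposes both conflicting claims for callers who want the contradiction-preserving view.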

Phase 1 (Weeks 4-14): Prove a number

Publish benchmarks. Post competent LoCoMo/LongMemEval scores to be legible, and standout results on the temporal, knowledge-update, and contradiction subsets where your design wins. Run KGGen's MINE for text-to-KG quality.
Define and publish the eval you own — a "contradiction-retention / belief-at-time-T / provenance-recall" benchmark, using WikiContradict as the foil. A blog post + arXiv note: "the first memory layer that doesn't lose the disagreement." This is your category-defining PR.
Fix extraction economics + faithfulness. Replace the GLM-coding-subscription path with a TOS-compliant inference contract. Add a precision/recall measurement and make the Lean overlay gate mature claims. Reframe extraction as Sleep-time Compute and add a non-destructive consolidation pass.
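
The numbers this phase asks for are simple to compute once a gold set exists. The sketch below shows set-based precision/recall for extracted claims and one possible definition of "contradiction retention". That second definition is an assumption, since the text names the metric without defining it: here it is the fraction of known conflicting pairs whose two sides are both still retrievable.

```python
from typing import Hashable, Iterable, Set, Tuple


def precision_recall(extracted: Iterable[Hashable], gold: Iterable[Hashable]) -> Tuple[float, float]:
    """Exact-match precision and recall of extracted claims against a gold set."""
    extracted_set, gold_set = set(extracted), set(gold)
    true_positives = len(extracted_set & gold_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


def contradiction_retention(
    gold_conflicts: Iterable[Tuple[Hashable, Hashable]],
    stored: Iterable[Hashable],
) -> float:
    """Share of gold conflicting pairs for which both claims survive in the store."""
    pairs = list(gold_conflicts)
    if not pairs:
        return 1.0
    stored_set: Set[Hashable] = set(stored)
    kept = sum(1 for a, b in pairs if a in stored_set and b in stored_set)
    return kept / len(pairs)


# Toy data: 3 of 4 extracted claims are correct, and 3 of 4 gold claims were found.
p, r = precision_recall({"a", "b", "c", "d"}, {"a", "b", "c", "e"})

# One of two known conflicts lost a side (for example, to a pick-newest policy).
retention = contradiction_retention(
    [("p", "not p"), ("q", "not q")],
    {"p", "not p", "q"},
)
```

Exact matching is the strictest scoring choice; a published benchmark would also need to state how near-duplicate predicates and paraphrased claims are aligned before counting.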

Phase 2 (Months 4-9): Land the wedge

Pick ONE regulated beachhead (recommend EU-AI-Act evidence packs or clinical/pharmacovigilance adjudication) and build a thin vertical product on the substrate. Land 2-3 paying design partners who are not you and not your family.
Partner-led GTM: integrate as the verifiable evidence store behind a governance dashboard (Credo AI / OneTrust / Modulos) rather than competing on UI.
Use genes as the lighthouse case study — published, with the CARE/contradiction story — while firewalling it organizationally and resolving your personal conflict-of-interest. Court ARG-tech and the paraconsistency/nanopub communities as credibility partners (a workshop paper at a NeurIPS/ICLR agent-memory workshop).

Phase 3 (Months 6-18): Build the company

Add a co-founder (GTM/distribution). Non-negotiable for fundraising and for the enterprise motion.
Raise a $1.5-4M angel-heavy seed from infra/data angels on: production substrate + an owned benchmark + an MCP-distributed SDK + 2-3 paying design partners + a co-founder. Keep burn low (the capital-efficiency story is an asset).
Stand up donto Cloud (managed, usage-metered, governance-as-the-paid-tier) and resolve append-only vs GDPR-erasure with a defensible crypto-shred/tombstone design.

The 3 riskiest assumptions to test first

"A buyer will pay for contradiction-preservation specifically (vs resolution)." This is the deepest unvalidated bet — the market's revealed preference is the opposite. Test: before building further, get 2-3 regulated buyers (legal/clinical/audit) to articulate, in writing, that "keep both + prove provenance + replay belief" is a requirement with a budget. If you can't, the whole differentiation needs rethinking.
"donto-memory can be adopted by developers despite its complexity." Test: ship the MCP server + simple SDK and measure adoption (stars, MCP-host installs, API calls from non-you accounts) within 90 days. If the simple path isn't getting pulled, the semantic-web fate is materializing.
"Maximal/forensic extraction can be made precise and economical enough to be trustworthy." Test: publish a precision/recall number on a real corpus with the Lean gate on, at a sustainable cost basis. If precision is low or cost is subsidy-dependent, "evidence-grade" collapses — and evidence-grade is the entire value proposition.

The bottom line. You have built something genuinely rare — the only production system that refuses to pick a winner and enforces who may know what — at a moment when regulation, the provenance crisis, and the agent explosion are all converging on exactly that need. But you have built it as a beautiful, domain-neutral, complex substrate with no wedge, no benchmark, no SDK, no team, no paying users, and a flagship corpus that is your own conflicted family history. The path to a company is not to defend the substrate philosophy; it is to narrow: lead with paraconsistency + governance, sell "memory you can defend" into one regulated vertical, distribute via an open-core MCP-native SDK, win the contradiction benchmark you define, fix the extraction economics, firewall genealogy as a proving-ground, add a co-founder, and raise small. Do that, and the substrate you love becomes the byproduct of a company that wins — which is the only way substrates have ever won.