Prepared from 11 areas of landscape research and 5 adversarial thesis stress-tests. Written to be forwarded to your smartest friend, not to flatter you.
donto is the verifiable memory substrate for the agentic era — the only knowledge layer that keeps contradictions alive, anchors every claim to its source byte, replays what was believed at any moment in time, and enforces who is allowed to know what — and the company is built by leading with a governed, evidence-first consumer (memory and contested-evidence research) while keeping the substrate clean underneath.
That sentence already contains a correction the research forced on me. You want donto to be "substrate, never a product." Three of the five stress-tests independently concluded that the literal version of that philosophy is the single most dangerous thing about your strategy — it is the canonical platform-paradox trap, and it is exactly how the semantic web died commercially. So the thesis above keeps your architecture domain-neutral (correct, defensible, beautiful) while explicitly rejecting "substrate-first" as a go-to-market. You sell a product. The substrate emerges as a byproduct — the way AWS emerged from Amazon selling books for twelve years, not the way RDF was sold as "annotate the world and someday it will pay off."
The expanded vision. The first move of the last three years was that LLMs ate the reasoning layer; the open question of the next three is who owns the knowledge layer underneath them. Andrej Karpathy's 2025 framing is the cleanest articulation anyone has given of why donto should exist: the model is the "cognitive core" (the CPU), and it should offload bulk factual knowledge to an external system. The quantitative backing comes from Allen-Zhu & Li (ICLR 2025, "Physics of Language Models 3.3"), whose controlled experiments estimate that an LLM stores only ~2 bits of knowledge per parameter; that is an empirical scaling law, not a formal proof. Parametric memory is finite, lossy, un-updatable, and un-citable. The durable disk below the cognitive core is the prize. A whole "agent memory" category is now racing to be that disk: Mem0 ($24M), Zep/Graphiti (YC), Letta ($10M), Cognee (€7.5M), Supermemory.
But almost every one of them is building a forgetful, opinionated disk: they overwrite on conflict (Mem0), invalidate the older fact and "consistently prioritize new information" (Zep/Graphiti), or pick a winner via LLM judgment. That is fine for "remember the user prefers window seats." It is a catastrophic, often illegal design for the domains where memory actually matters and where money is changing hands under regulatory duress: legal evidence, clinical records, scientific claims, intelligence, journalism, and your own proving ground, contested native-title genealogy. In those domains the disagreement is the asset. The court needs to know that two sources gave two birth years and which one you believed when. The regulator (EU AI Act, Article 10 / Annex IV, with the high-risk obligations applying from August 2026) requires you to trace any output back to its source data and show belief lineage. donto is the only system architected, from the primary key up, for exactly that.
So the ten-year vision is not "a better Mem0." It is: the trust layer that sits between the three currently-disconnected provenance silos — content authenticity (C2PA, which proves a file's origin but says nothing about whether its claims are true), training-data lineage (Collibra/Atlan, which track tables and pipelines, not facts), and inference-time grounding (Vectara/Perplexity, which cite a chunk then throw the provenance graph away) — unifying them at the only granularity that matters for truth: the individual claim. If the next decade demands that AI systems be auditable, contestable, and governable, donto is the substrate that makes a claim itself a first-class, time-stamped, evidence-anchored, policy-bound, contradiction-tolerant object. Nobody else is building that. The hard part is not the architecture (you've built it). The hard part is everything else, and most of this document is about the everything else.
Four macro forces converge, and they are unusually well-aligned with what you already have running.
Force 1 — The agent explosion made memory the bottleneck, not the model. The field's own consensus in 2025-2026 is that "memory is the limiting factor, not model capability"; ~65% of enterprise AI failures in 2025 were attributed to context/memory loss (mem0.ai "State of AI Agent Memory 2026"). The agent-memory market is sized ~$6.3B (2025) → ~$28.5B (2030) at ~35% CAGR. MCP (Model Context Protocol) went from ~100K to ~97M monthly SDK downloads in 18 months and was donated to the Linux Foundation's Agentic AI Foundation in December 2025. Critically, MCP defines the socket, not the knowledge backend — and the canonical reference memory server is a flat local JSONL file with nine tools and zero provenance, time, or contradiction model. The slot donto fits is standardized and empty.
Force 2 — The AI-slop / provenance crisis turned "where did this come from?" from idealism into procurement. The Data Provenance Initiative (MIT/Cohere, Nature Machine Intelligence Aug 2024) audited 1,800+ training datasets and found >70% license omission and >50% license error — provenance is empirically broken at scale. Stanford's RegLab found purpose-built legal RAG still hallucinates in 17-34% of queries. C2PA is now the de-facto file-provenance standard (OpenAI, Google, Adobe, Sony on the steering committee; Pixel 10 signs every photo). The market for "content provenance solutions" is ~$1.63B (2025) → ~$5.12B (2030). The whole stack is converging on the demand donto answers — but at the file and dataset level, leaving the claim level wide open.
Force 3 — Regulation makes it mandatory, with a hard date. The EU AI Act's high-risk requirements (Article 10 data governance + Annex IV traceability) apply from August 2026; the Act itself entered into force in August 2024. Fines for breaching the high-risk obligations run up to €15M / 3% of global turnover, and the headline €35M / 7% ceiling is reserved for prohibited practices. The requirements cover documented data provenance, data→model→decision lineage, and auditor-traceability of any output back to its source. donto's bitemporal "what did we believe at time T?" plus byte-offset evidence-anchoring is close to a turnkey implementation of Annex IV. ~$281-321M flowed into ~16-20 AI-governance startups in 2025-2026: the budget line exists, and it is new.
Force 4 — Memory is the agent moat the labs will not neutralize for you. This is the subtle one, and the stress-tests sharpened it. Yes, OpenAI/Anthropic/Google all ship native memory now (Anthropic's is even auditable markdown, which undercuts a naive "we're transparent" pitch). But the labs have no incentive to make memory portable, neutral, multi-model, contradiction-preserving, or governed across providers — those features cut directly against their lock-in. The "Plaid for memory" thesis (Mem0's framing) is structurally sound precisely because the labs won't build the neutral layer. The danger is not that the labs absorb the deep layer; it's that funded independents (Zep especially) absorb it first, because each of donto's legs is individually copyable.
The honest read on timing: you are early enough that the category donto truly occupies ("contradiction-preserving, evidence-first, governed memory") has no name and no leader — that's the land-grab. You are late enough that the adjacent category ("agent memory") has a leader (Mem0), a benchmark regime (LoCoMo/LongMemEval/BEAM), and a near-architectural-twin (Zep). The window is real but it is not wide.
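To make the byte-offset evidence anchoring from Force 3 concrete, here is a minimal sketch. Every name in it (`EvidenceAnchor`, `anchor_span`, `verify_anchor`) is hypothetical and is not donto's actual schema or API. The only assumption is that a claim stores a hash of the source bytes plus a start/end offset, so an auditor can re-derive the quoted span and detect a source that has changed underneath it.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceAnchor:
    """Hypothetical anchor: which bytes of which source support a claim."""
    source_sha256: str  # hash of the full source document
    start: int          # inclusive byte offset
    end: int            # exclusive byte offset
    span_sha256: str    # hash of the quoted span itself


def anchor_span(source: bytes, start: int, end: int) -> EvidenceAnchor:
    """Record where in the source the supporting bytes live."""
    if not (0 <= start < end <= len(source)):
        raise ValueError("offsets fall outside the source")
    return EvidenceAnchor(
        source_sha256=hashlib.sha256(source).hexdigest(),
        start=start,
        end=end,
        span_sha256=hashlib.sha256(source[start:end]).hexdigest(),
    )


def verify_anchor(source: bytes, anchor: EvidenceAnchor) -> bool:
    """True only if the source is byte-identical and the span still matches."""
    if hashlib.sha256(source).hexdigest() != anchor.source_sha256:
        return False
    span = source[anchor.start:anchor.end]
    return hashlib.sha256(span).hexdigest() == anchor.span_sha256


# Illustrative record; offsets are byte offsets, not character offsets.
record = "Born 1898 at the mission, per the parish register.".encode("utf-8")
start = record.index(b"1898")
anchor = anchor_span(record, start, start + 4)
```

Offsets are taken over the encoded bytes, so a non-ASCII source must be anchored after encoding, not before. Whether a regulator would accept this as Annex IV evidence is a separate question the sketch does not answer.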
Here is the synthesis of what survived adversarial attack, sorted by how defensible it actually is — not by how proud you are of it.
Across all 11 research areas and 5 stress-tests, two capabilities held up as both rare-in-production and hard-to-copy-quickly:
True paraconsistency — contradictory claims both live forever as legal state, with a queryable contradiction frontier and typed argument edges (supports/rebuts/undercuts). Every competitor does the opposite. Zep/Graphiti explicitly "sets t_invalid = t_valid of the invalidating edge" and "consistently prioritizes new information." Mem0 self-edits/overwrites. Supermemory does "contradiction resolution" (picks a winner) and "selective forgetting." A-MEM mutates old notes. This is not a marketing distinction; it is an architectural commitment nobody else has made because the agent-memory market's revealed preference is the opposite — devs want one clean answer. That last clause is the catch, and I'll return to it hard.
The Trust Kernel — 15 action-level policy capsules, fail-closed default, governance that propagates to all derivatives (embeddings, translations, exports inherit source policy), operationalizing FAIR + CARE (indigenous data sovereignty). Torch Capital's portable-memory thesis names the exact white space: "No discussion of data provenance, audit trails, or who validates memory accuracy... governance mechanisms notably absent." No agent-memory competitor implements CARE or policy-inheriting derivatives. This is the leg with both genuine product-space emptiness and a buyer with a compliance budget (EU AI Act, IEEE 2890-2025 Indigenous-provenance standard, GIDA CARE).
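The fail-closed default and the policy inheritance described for the Trust Kernel can be sketched in a few lines. This is an illustration under stated assumptions, not the kernel's real design: `Policy`, `Artifact`, and `is_allowed` are hypothetical names, and the rule chosen here (an action must be granted by every policy on the derivation chain) is one plausible way to guarantee that an embedding or export is never more permissive than its source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy capsule: explicit (role, action) grants."""
    allowed: frozenset


@dataclass
class Artifact:
    """A source, or anything derived from one (embedding, translation, export)."""
    name: str
    policy: Optional[Policy] = None
    derived_from: Optional["Artifact"] = None


def policies_in_chain(artifact: Artifact):
    """Yield every policy from the artifact up to its original source."""
    node = artifact
    while node is not None:
        if node.policy is not None:
            yield node.policy
        node = node.derived_from


def is_allowed(role: str, action: str, artifact: Artifact) -> bool:
    """Fail closed: no policy anywhere on the chain means deny."""
    policies = list(policies_in_chain(artifact))
    if not policies:
        return False
    # A derivative can never be more permissive than anything it came from.
    return all((role, action) in p.allowed for p in policies)


source_policy = Policy(frozenset({
    ("custodian", "read"),
    ("custodian", "export"),
    ("researcher", "read"),
}))
recording = Artifact("oral-history.txt", policy=source_policy)
embedding = Artifact("oral-history.embedding", derived_from=recording)
orphan = Artifact("unlabelled-upload.txt")
```

With these definitions a researcher can read the embedding but cannot export it, and nobody can touch the unlabelled upload until someone attaches a policy.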
The stress-test verdict was precise: the combination of bitemporality + paraconsistency + evidence-anchoring + policy governance is real and currently unoccupied — but it "partially holds" because it's a feature-bag, not a moat, and the most distinctive leg (paraconsistency) is something the mass market actively does not want. Conclusion: your moat is not "all four together." Your moat is paraconsistency + governance, deployed in domains where picking a winner is itself a defect. Lead with those two. Treat the rest as supporting cast.
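For readers who want the "keep both" behaviour in concrete terms, the sketch below shows an append-only claim store with an as-of query and a contradiction frontier. It is a simplification, not donto's data model: `Claim` and `ClaimStore` are hypothetical names, only transaction time is modelled (a bitemporal store would also carry valid time), and a contradiction is assumed to mean two different values for the same subject and predicate.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    value: str
    source: str
    recorded_on: date  # transaction time: when the store learned of the claim


class ClaimStore:
    """Append-only: conflicting claims are never overwritten or invalidated."""

    def __init__(self):
        self._claims = []

    def assert_claim(self, claim: Claim) -> None:
        self._claims.append(claim)

    def as_of(self, when: date):
        """Everything the store had recorded on or before `when`."""
        return [c for c in self._claims if c.recorded_on <= when]

    def contradiction_frontier(self, when: date):
        """(subject, predicate) keys holding more than one distinct value at `when`."""
        grouped = defaultdict(list)
        for c in self.as_of(when):
            grouped[(c.subject, c.predicate)].append(c)
        return {
            key: claims
            for key, claims in grouped.items()
            if len({c.value for c in claims}) > 1
        }


store = ClaimStore()
store.assert_claim(Claim("person:42", "birth_year", "1898", "parish register", date(2024, 1, 10)))
store.assert_claim(Claim("person:42", "birth_year", "1901", "census", date(2024, 3, 2)))

before = store.contradiction_frontier(date(2024, 2, 1))  # only one claim known yet
after = store.contradiction_frontier(date(2024, 4, 1))   # both claims, both kept
```

The contrast with an invalidate-on-conflict design is that the second assertion changes nothing about the first: the February view still replays exactly as it was, and the April view reports the disagreement instead of resolving it.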
Be merciless with yourself here, because buyers and investors will be:
| Gap | The brutal version |
|---|---|
| No published benchmarks | The category is won in comparison tables. Mem0 cites ~92.5 LoCoMo / 94.4 LongMemEval; Zep 94.8% DMR; "Memento" 92.4% LongMemEval. donto has zero public numbers. Until you post one, you are invisible — and "697 facts from 'cat is red'" reads as noise/cost, not quality, to anyone in this field. |
| No SDK, no MCP server, no framework adapters | Mem0 ships "6 lines of code," 20+ vector-store and 21+ framework integrations, and is the exclusive memory provider for the AWS Agent SDK. donto is HTTP endpoints on one VM. The single most urgent integration gap is an MCP server compatible with Anthropic's reference KG-memory tools. |
| Scale is small, and 39.5M is not a moat | This was the sharpest hit. Single-node GraphDB/Virtuoso routinely load 8-100 billion triples; AutoSchemaKG/ATLAS is 5.9B edges; Diffbot is 1T facts. 39.5M is ~0.04-0.5% of a routine single-server load (39.5M / 100B ≈ 0.04%; 39.5M / 8B ≈ 0.5%). Drop "39.5M statements" as evidence of anything. Your claim is design-proven, not scale-proven; say that. |
| Conceptual heaviness | 21-clause DontoQL + identity lenses + trust kernel + 11×3 predicate alignment + Lean overlay is the exact "built by academics for academics" profile that killed the semantic web. It must be hidden behind a one-line default API or it repels every developer who just wants memorize(text). |
| No reasoning layer over the contradictions | You store argument edges; you don't compute over them. The entire value of Dung/ASPIC+/Belnap is calculating which arguments are accepted under grounded/preferred semantics. Without that, the contradiction frontier risks being "a messy database" rather than a reasoner. This is your single biggest capability gap and, conveniently, a fundable differentiator. |
| The consolidation critique | "Contextual Agentic Memory is a Memo, Not True Memory" (arXiv 2604.27707) argues stores that "accumulate notes indefinitely" are lookup, not memory — and your "maximal extraction" is the platonic ideal of hoarding. You have no demonstrated semantic-abstraction/consolidation pathway. The field is moving toward selective memory; you're sprinting the other way. |
| Extraction trust + economics | "Maximal extraction" optimizes recall, which tanks precision — LLM triple extraction shows ~28-65% hallucination before verification. And the cheap economics rest on running extraction through a GLM coding subscription via OpenCode, which Z.AI now throttles/bans for non-coding use. That is both TOS-violating and an expiring subsidy. |
| Solo/no-team, no funding, no brand | 75% of VC funds made zero solo-founder investments in 2025 (Carta). Pre-revenue + horizontal-infra + solo is close to unfundable institutionally. |
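The "no reasoning layer" row is the one gap with a well-known textbook starting point, so a sketch is worth having. The function below computes the grounded extension of a Dung argumentation framework by iterating the characteristic function from the empty set. The mapping onto donto is an assumption: rebuts and undercuts edges are treated as attacks, supports edges are ignored, and the argument labels are invented for illustration.

```python
def grounded_extension(arguments, attacks):
    """Grounded extension of a Dung framework.

    arguments: iterable of argument ids
    attacks:   set of (attacker, target) pairs
    """
    arguments = set(arguments)
    attackers_of = {a: {x for (x, y) in attacks if y == a} for a in arguments}

    def defended(arg, accepted):
        # arg is acceptable if every attacker is itself attacked by an accepted argument
        return all(
            any((d, attacker) in attacks for d in accepted)
            for attacker in attackers_of[arg]
        )

    accepted = set()
    while True:
        nxt = {a for a in arguments if defended(a, accepted)}
        if nxt == accepted:
            return accepted
        accepted = nxt


# A: born 1898 (parish register)   B: born 1901 (census)
# C: the census enumerator estimated ages (undercuts B)
mutual_rebuttal = {("A", "B"), ("B", "A")}

unresolved = grounded_extension({"A", "B"}, mutual_rebuttal)
resolved = grounded_extension({"A", "B", "C"}, mutual_rebuttal | {("C", "B")})
```

With only the two rebutting claims, nothing is accepted and the contradiction stays open, which is the paraconsistent default. Add the undercutting argument and the grounded semantics accepts A and C without deleting B. Preferred semantics and ASPIC+ structure need more machinery than this.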
donto competes in four overlapping arenas. The mistake would be to think of them as one. Here is the map.
| Player | Funding | Core model | Contradiction handling | Provenance | Governance |
|---|---|---|---|---|---|
| Mem0 | $24M (Basis Set/Peak XV/YC); AWS Agent SDK exclusive; ~48K stars | Fact extraction → vector+graph+KV, active curation | Overwrites / self-edits | Near-none | None |
| Zep / Graphiti | YC; Graphiti ~20-27K stars; arXiv 2501.13956 | Bitemporal temporal KG | Invalidates older edge, prioritizes new | Episode-level | None |
| Letta (MemGPT) | $10M @ $70M (Felicis) | OS-tiered self-editing memory blocks | Overwrites blocks | None | None |
| Cognee | €7.5M (Pebblebed); Bayer, U. Wyoming | ECL pipeline → graph+vector | Has "forget" (destructive) | Page-level (rare exception) | Air-gapped/residency |
| Supermemory | $2.6M (Susa/Browder; Jeff Dean) | Universal memory API, MCP-native, OpenCode/Claude Code plugins | "Contradiction resolution" (picks winner) + selective forgetting | Shallow | None |
| LangMem | LangChain distribution | Semantic/episodic/procedural SDK | None | None | None |
| OpenAI / Anthropic / Google native memory | Effectively unlimited | Built-in, lock-in by design | Picks one answer | Anthropic: auditable markdown | Per-platform |
| donto-memory | $0, solo | Bitemporal + paraconsistent quad store, evidence-first | Keeps both forever | Byte-offset, primary-key | Trust Kernel / CARE |
The story this table tells: donto is uniquely the one that doesn't pick a winner and uniquely the one with real governance. It is behind everyone on distribution, benchmarks, SDK, and team. Supermemory is the most strategically dangerous because it is already MCP-native and ships OpenCode/Claude Code plugins — your exact stack — and markets the opposite philosophy. Zep is the closest architectural cousin and the one most able to bolt on a "keep-both" flag.
Microsoft GraphRAG (~33K stars; "From Local to Global"; proved indexing costs ~75% of token budget) and LightRAG (~36K stars, ~6000x cheaper/query) define the cost-conscious mainstream. AutoSchemaKG/ATLAS (HKUST: 5.9B edges, 92% schema alignment) is the closest thing to your "maximal extraction" ambition — at 150× your scale, autonomously. KARMA (NeurIPS 2025) runs 9 agents to reduce conflict edges 18.6% (the philosophical inverse of you). Diffbot (1T facts, per-fact provenance, profitable, bootstrapped) is the existence-proof that web-scale KG-with-provenance is a real business — but it canonicalizes to one entity. The entire field is racing toward less extraction per dollar (Microsoft's LazyGraphRAG: ~0.1% of indexing cost). You are the only one racing the other way. That is either a moat (nobody wants to pay for forensic exhaustiveness) or a trap (it's economically irrational and quality-negative). It depends entirely on choosing the domains where exhaustiveness is the feature.
XTDB v2 (Grid Dynamics/NASDAQ:GDYN; bitemporal SQL to finance) is your nearest commercial peer on bitemporality and proof the market pays for it — but it's rows, not claims, with no paraconsistency/evidence/identity/governance. Datomic (free, Apache-2.0, owned by Nubank) is the cautionary tale: immutable + Datalog + as-of, even free, stayed niche because of learning curve and thin docs — a direct warning about DontoQL. Amazon QLDB was discontinued (EOL July 2025) — a hyperscaler killed a standalone immutable ledger for insufficient pull. The lesson is written in neon: never sell "an immutable/bitemporal database." Sell the downstream value. Wikidata (qualifiers/references/preferred-normal-deprecated ranks) is your closest data-model peer and proves source-qualified, rank-able claims work at planet scale. The standards to align with rather than reinvent: RDF 1.2 / RDF-star, PROV-O, and nanopublications — whose 2025 proposed 4th "knowledge provenance" graph (supporting + conflicting evidence) maps almost 1:1 onto your contradiction frontier. That's your natural FAIR/science export format.
Rewind/Limitless (~$33M → Meta acqui-hire, hardware killed, ~$2M ARR), Mem.ai ("$40M second brain failure"), Personal.ai (niche). The lesson is unambiguous: do not build a consumer capture app or hardware. Capture friction and platform absorption kill it. Tana/Pieces could be consumers on donto; they are not your battlefield.
donto sits in the white space between the silos: claim-level (vs C2PA's file, Collibra's table, Vectara's chunk), contradiction-preserving (vs everyone's resolve-or-overwrite), and governed-with-inheritance (vs nobody). The risk is that white space between silos is precisely where horizontal infra goes to die unless it picks a wedge. Which brings us to the company.
                ┌─────────────────────────────────────────────────┐
CONSUMERS       │ genes       donto-memory     donto-lang (+ new) │
(products,      │ (vertical)  (horizontal API) (pilot)            │
where the ──────┼─────────────────────────────────────────────────┤
money is)       │               DontoQL / MCP / SDK               │  ← the seam:
                ├─────────────────────────────────────────────────┤    hide complexity here
SUBSTRATE       │ donto: bitemporal · paraconsistent ·            │
(architecture,  │ evidence-first · identity-as-hypothesis ·       │
not the pitch)  │ Trust Kernel · Lean overlay · RO-Crate          │
                └─────────────────────────────────────────────────┘
                          Postgres (pg_donto) + dontosrv
The portfolio is correct as an architecture. The error is treating all three consumers as equal product bets. They are not:
memorize/recall default with an opinionated "best current answer" lens, and expose the superpowers (AS_OF, contradiction frontier, identity lens, evidence trace) as advanced opt-ins.
For each: the wedge, and specifically why donto's invariants are non-negotiable there (not nice-to-have). The selection criterion is the one the stress-tests kept hammering: pick domains where picking a winner is itself a defect and a buyer has a compliance budget.
1. Clinical evidence & pharmacovigilance adjudication. Wedge: reconciling conflicting studies, conflicting EHR entries, conflicting adverse-event reports into a contradiction-aware, time-stamped, source-anchored evidence record. Why the invariants matter: two studies will disagree on a drug's effect; the regulator (and the clinician) needs both, with provenance and "what did we believe when we made the dosing decision." Bitemporal belief-replay + paraconsistency + byte-offset provenance + fail-closed access governance is a compliance dream, and HIPAA-style audit is a forcing function exactly like the EU AI Act. Strongest leg used: all four.
2. Legal evidence / e-discovery / contradictory-witness modeling. Wedge: a contradiction frontier over depositions, exhibits, and precedents where "what did we know, and when" (AS_OF) is the literal legal question. Why: picking a winner among conflicting witnesses is the opposite of what discovery wants; the typed argument edges (supports/rebuts/undercuts) are AIF/ASPIC+ attack relations the legal-argumentation field already formalized. The signed RO-Crate/DataCite release machinery becomes the Daubert admissibility story (testable, has provenance, auditable belief-history). Risk: novel methodology can be challenged as "not generally accepted" — so partner with the argumentation community (ARG-tech, Chris Reed) for credibility.
3. Scientific claim-curation / research integrity. Wedge: a nanopublication-native store of supporting+conflicting evidence for scientific claims, citable as FAIR Digital Objects. Why: you already emit signed RO-Crate; nanopubs' proposed "knowledge provenance" graph is your contradiction frontier; the Data Provenance Initiative proved the need empirically. Buyers: EOSC, research-data repositories, journals fighting the reproducibility/retraction crisis. Lower margin, high credibility, paper-generating.
4. Intelligence / OSINT / Analysis of Competing Hypotheses. Wedge: ACH is methodologically central to intelligence analysis and academics note it's under-tooled. donto is literally a competing-hypotheses engine with identity-as-hypothesis and a contradiction frontier. Why: analysts must hold contradictory source reports, weight identity assertions ("is this the same person?"), and answer "what did we assess at time T." Risk: sales cycle and clearance gates; but high willingness-to-pay.
5. Regulated AI audit / EU-AI-Act traceability ("evidence pack for high-risk AI"). Wedge: be the verifiable evidence store feeding governance dashboards (Credo AI, OneTrust, Modulos), not the dashboard itself. Why: Annex IV requires data→model→decision lineage and output-to-source traceability; donto's bitemporal + byte-offset design is nearly turnkey. Partner-led GTM into a category that already has buyers and a deadline. This may be the largest near-term commercial surface.
6. Sovereign / indigenous & cultural-heritage memory (the "sovereign memory" flagship). Wedge: the only substrate that computationally enforces CARE / Local Contexts TK Labels and propagates them to embeddings and exports. Why: genes already exercises this; GLAM institutions, land councils, GBIF (which piloted TK/BC Labels 2024-25), and IEEE 2890-2025 give it institutional demand and grant funding. Caveat (critical): this is high-trust, low-margin, reputationally fragile, and — for you personally — conflicted (you are an adverse party in an EKY claim; see §8). Productize it for other communities as custodians; never position yourself as the authority.
The throughline: all six are "memory you can defend" markets, not "memory the chatbot has" markets. That is the whole strategy in one phrase.
The stress-tests were unanimous and harsh on this point, so let me be direct: "substrate, never a product" is correct as an architecture principle and fatal as a go-to-market. Eric Paley's platform paradox, the vertical-SaaS-grows-2-3x-faster data, Palantir's explicit rejection of neutrality ("a digital twin of the organization... not generic horizontal data infrastructure"), and the semantic web's 25-year commercial failure all point the same way. You must lead with a product. The substrate emerges as a byproduct.
Here is the resolution, sequenced.
The model: open-core, three concentric rings.
Open-source the substrate primitive (pg_donto + dontosrv + a thin client SDK + an MCP server) to drive developer adoption. This is the only proven way a horizontal data primitive has ever won (Postgres, Neo4j, MongoDB). Stars, npm/PyPI pulls, MCP-registry listing, framework adapters (LangChain/LlamaIndex/CrewAI/OpenAI-Agents/Anthropic-ADK). The single highest-leverage build is an MCP memory server API-compatible with Anthropic's reference KG-memory tools so donto is a literal drop-in upgrade into 10K+ MCP hosts.
Monetize a managed multi-tenant cloud (donto Cloud) with usage-based metering (per write/recall/search op, mirroring Mem0's $0/$19/$249 tiers and Supermemory's per-token pricing). The commercial tier is governance: SSO, the full Trust Kernel/governance console, signed RO-Crate export, audit reporting, SLAs, on-prem/VPC. Governance + provenance are the natural paid tier because that is exactly what regulated enterprises pay for — and what the open core deliberately doesn't fully unlock.
Sell into one regulated beachhead vertical with a thin, opinionated product on top — not a horizontal pitch. My recommendation for the first paid wedge: regulated AI audit / EU-AI-Act evidence packs (#5) or clinical/pharmacovigilance adjudication (#1), because both have an external deadline, a named budget line, and a buyer who treats paraconsistency-and-provenance as requirements, not nice-to-haves. Genealogy/native-title is the proving-ground and case study, not the revenue beachhead.
Pricing power lives in governance, not storage. Usage-based memory pricing is racing toward $0.001/op (MemoClaw); undifferentiated storage/recall margins will compress (see Pinecone's revenue decline). Only governance/provenance/assurance features hold pricing power. So: cheap (or free) recall, paid trust.
The benchmark move is non-negotiable and comes first. You cannot sell, raise, or even appear in the category without a number. But you should not race Mem0/Zep on plain LoCoMo recall (you'll lose on a metric that doesn't reward your design). Instead: (a) publish competent LoCoMo/LongMemEval numbers to be legible, and (b) define the benchmark you win — a "contradiction-retention / provenance-recall / belief-at-time-T" eval where the correct answer is "both X and Y are claimed, by sources A and B, valid at T1/T2." IBM's WikiContradict (NeurIPS 2024: all LLMs near-random at contradiction detection) is your ready-made foil; BEAM already has a contradiction-resolution category where pick-newest designs structurally lose. Own the eval the way Zep made temporal a thing.
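A scoring rule for the eval you define could be as small as the sketch below. It is an illustration, not an existing benchmark harness: the function name, the two metric names, and the toy data are all hypothetical. The assumption is that each gold answer is the full set of (value, source) pairs, so a system that returns only the newest claim is penalised by construction.

```python
def score_contradiction_retention(gold, predicted):
    """Score answers where the correct response is every claimed value with its source.

    gold, predicted: dict mapping question id -> set of (value, source) pairs.
    Assumes gold is non-empty and every gold set is non-empty.
    """
    retained = 0
    recall_sum = 0.0
    for qid, gold_claims in gold.items():
        answer = predicted.get(qid, set())
        hits = gold_claims & answer
        recall_sum += len(hits) / len(gold_claims)
        if hits == gold_claims:
            retained += 1
    n = len(gold)
    return {"claim_recall": recall_sum / n, "retention_rate": retained / n}


gold = {
    "q1": {("1898", "parish register"), ("1901", "census")},  # contested
    "q2": {("Smith", "land deed")},                           # uncontested
}
pick_newest = {
    "q1": {("1901", "census")},
    "q2": {("Smith", "land deed")},
}
keep_both = gold

newest_score = score_contradiction_retention(gold, pick_newest)
both_score = score_contradiction_retention(gold, keep_both)
```

On this toy set the pick-newest system scores 0.75 claim recall and 0.5 retention, while the keep-both system scores 1.0 on each. A publishable version would also need a precision term so that returning every claim in the store is not rewarded.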
The funding posture: raise a small, angel-heavy pre-seed/seed ($1.5-4M — the Supermemory/Cognee band), not a mega-round. The relevant names are on the Mem0/Letta cap tables: Neo4j's Philip Rathle, Datadog's Olivier Pomel, dbt's Tristan Handy, MotherDuck's Jordan Tigani, plus governance/data angels. And add a co-founder before you raise — solo + pre-revenue + horizontal-infra is near-unfundable, and you cannot run an enterprise-compliance sales motion alone.
I want to engage this seriously because it's the heart of your vision, and then I want to save you from the version of it that kills the company.
What's right about it. The defensive half of the thesis is sound: end-to-end and long-context models did not make explicit knowledge layers obsolete. The market converged on hybrid (GraphRAG, agent memory); long-context-vs-RAG analyses conclude "neither is obsolete, route between them," with external stores winning on freshness, updateability, multi-hop, and a ~1,250× cost advantage at scale. Allen-Zhu & Li's 2-bits-per-parameter ceiling is the quantitative floor under Karpathy's cognitive-core thesis. AlphaProof's IMO silver medal is the proof that neural+symbolic+formal-verification still beats pure scaling on hard reasoning — and that's your Lean-overlay pattern. So "explicit, auditable, external knowledge survives" is a defensible bet.
What's wrong about it. "1M+ facts per text / maximal extraction" optimizes the wrong axis. More facts ≠ more value or more truth:
The reframe that saves it. The vision is not wrong — it's mis-stated. Change the objective function from volume to verified, governed, contradiction-aware, byte-anchored provenance. The right sentence is not "a million facts from any text." It is:
"Lossless, contradiction-preserving decomposition of a source into claims, each anchored to its evidence, each governed, each replayable through time — so that an AI's understanding of contested reality is auditable in extreme detail."
That reframe does three things: it answers the consolidation critique (add a sleep-time semantic-consolidation pass — Letta's framing — that abstracts the granular facts into higher-level claims without destroying them, which bitemporality makes safe); it answers the precision objection (pair extraction with the Lean overlay + a published precision/recall number, and use maturity/polarity gates so raw extraction is cheap-stored but only evidence-anchored mature claims are queried); and it answers the bitter lesson (anchor in domains where ground truth is contested/legal/cultural and provenance is the product — exactly where Era-of-Experience agents have no answer because there is no reward signal for "what does the law say happened in 1898").
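The first two answers in that reframe, non-destructive consolidation and a maturity gate on queries, can be sketched together. None of this is donto's implementation: `StoredClaim`, the 0-to-2 maturity scale, and the rule that a summary inherits the weakest maturity of its members are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StoredClaim:
    claim_id: str
    text: str
    maturity: int                        # hypothetical scale: 0 = raw extraction, 2 = verified
    evidence: Optional[str] = None       # e.g. an anchor id; None if unanchored
    derived_from: Tuple[str, ...] = ()   # ids of the granular claims a summary abstracts


def queryable(claims, min_maturity=1):
    """Gate: everything stays stored, only mature, anchored claims are served."""
    return [c for c in claims if c.maturity >= min_maturity and c.evidence is not None]


def consolidate(claims, summary_id, summary_text, member_ids):
    """Append a higher-level claim that points at its members; nothing is deleted."""
    members = [c for c in claims if c.claim_id in member_ids]
    if len(members) != len(set(member_ids)):
        raise KeyError("unknown member id")
    summary = StoredClaim(
        claim_id=summary_id,
        text=summary_text,
        # A summary is only as trustworthy as its weakest member.
        maturity=min(c.maturity for c in members),
        evidence="via-members" if all(c.evidence is not None for c in members) else None,
        derived_from=tuple(member_ids),
    )
    return claims + [summary]


raw = [
    StoredClaim("c1", "Arrived in March", maturity=2, evidence="anchor:17"),
    StoredClaim("c2", "Arrived in spring", maturity=1, evidence="anchor:42"),
    StoredClaim("c3", "Owned a red cat", maturity=0),
]
consolidated = consolidate(raw, "s1", "Arrived in the first half of the year", ["c1", "c2"])
served = [c.claim_id for c in queryable(consolidated)]
```

The raw extraction `c3` is still stored but never served, and the summary `s1` keeps pointers to `c1` and `c2` so the abstraction can be audited or undone.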
The 10-year picture if it works. Models become cheap, abundant reasoning cores. The scarce, valuable thing is trustworthy, contestable, governed knowledge state — the substrate that can say "here is what is claimed, by whom, on what evidence, believed when, contested how, allowed to whom." donto is the system of record for contested reality: the place a court, a regulator, a clinician, an analyst, or an agent goes when "the chatbot remembers" is not good enough and "cite the source, show both conflicting claims, prove what we believed when, and enforce who may see it" is the whole job. Inference cost falls ~10×/year (LLMflation), so the economics of high-recall verified extraction improve every quarter — and when extraction is nearly free, donto is the only substrate that can hold the output without collapsing under contradictions or losing provenance. That is a real, large, durable company. But it is built by narrowing the vision into "verified provenance," not by chasing the fact counter.
No softening. These are in roughly the order they can kill you.
Founder / execution (the most likely killer).
Technical.
Market.
Business.
Legal / ethical.
Grouped by theme, annotated for why it matters to donto. This is the deliverable to bookmark.
A concrete, sequenced plan. The ordering reflects risk, not comfort.
memorize(text) / recall(query) defaults and an opinionated "best-current-answer" lens (highest-maturity, best-evidenced claim) so paraconsistency is opt-in depth, not default friction. A Mem0-compatible add()/search() shim so you're drop-in for devs already on a competitor.
The bottom line. You have built something genuinely rare, the only production system that refuses to pick a winner and enforces who may know what, at a moment when regulation, the provenance crisis, and the agent explosion are all converging on exactly that need. But you have built it as a beautiful, domain-neutral, complex substrate with no wedge, no benchmark, no SDK, no team, no paying users, and a flagship corpus that is your own conflicted family history. The path to a company is not to defend the substrate philosophy; it is to narrow: lead with paraconsistency + governance, sell "memory you can defend" into one regulated vertical, distribute via an open-core MCP-native SDK, win the contradiction benchmark you define, fix the extraction economics, firewall genealogy as a proving-ground, add a co-founder, and raise small. Do that, and the substrate you love becomes the byproduct of a company that wins, which is the only way substrates have ever won.