donto is a contradiction-preserving claim/discovery substrate for machines. It ingests evidence, decomposes it into source-grounded typed claims, holds mutually-incompatible claims as permanent legal state, and then generates, ranks, and re-ranks relationship hypotheses from claim combinations — always able to show a human why a hypothesis exists. The product is this claim lifecycle, not the "lens" vocabulary. Lenses are just one way to generate candidate typed claims; a lens that doesn't emit structured claims that change the graph is decorative. Everything below is one product spec for that lifecycle, plus the domains that prove it.
This 8-step loop is the spine. Each step is something donto does to the graph.
| # | Step | What donto does (one line) |
|---|---|---|
| 1 | Ingest evidence | Anchor every source to a byte range; raw artifact + cleaned copy stored, never summarized-and-discarded. |
| 2 | Extract typed claims | Run extraction lenses → emit normalized, typed, falsifiable claims (skill-claim, causal-claim, tlink), each evidence-linked. Volume lives here. |
| 3 | Hold incompatible claims | Store contradictory claims side-by-side as legal state (hypothesis_only, contradiction frontier); never average, overwrite, or drop. |
| 4 | Generate relationship hypotheses | Combine claims into candidate cross-entity edges via shared intermediate facts (Swanson ABC join keys: shared sense/frame/entity-IRI/skill). This step is net-new code. |
| 5 | Attach evidence + counter-evidence | Wire typed argument edges (supports/rebuts/undercuts, AIF-style) to each hypothesis. |
| 6 | Rank | Score = f(novelty, plausibility, evidentiary-support, contradiction-value, verification-cost). This ranker IS the FDR controller. |
| 7 | Re-rank on new evidence | New byte arrives → re-score the existing hypothesis pool (cheap), don't re-generate (expensive). Bitemporal trigger. |
| 8 | Explain | Surface the backing claim-chain, the counter-evidence, the confidence, and a verification path ("confirm by adding repo link"). |
Steps 1–3 are mostly built (genealogy data). Steps 4–7 are the product gap. Step 8 is the differentiator the incumbents structurally can't offer.
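Steps 1 and 3 can be pictured with a small sketch. This is an illustration under stated assumptions, not donto's actual schema: the class names `Anchor`, `Claim`, and `ClaimStore`, the `birth-year` predicate, and the sample values are all hypothetical. It shows a claim anchored to a byte range and two incompatible claims held side by side rather than merged.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    source_id: str
    start: int  # byte offset into the raw artifact, inclusive
    end: int    # byte offset, exclusive


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    value: str
    anchor: Anchor
    status: str = "hypothesis_only"


class ClaimStore:
    """Append-only store: claims are added, never overwritten or merged."""

    def __init__(self):
        self._claims = []

    def add(self, claim):
        self._claims.append(claim)

    def contradiction_frontier(self):
        # Assumption: for a single-valued predicate, two different values
        # for the same (subject, predicate) count as incompatible.
        groups = defaultdict(list)
        for claim in self._claims:
            groups[(claim.subject, claim.predicate)].append(claim)
        return {
            key: claims
            for key, claims in groups.items()
            if len({c.value for c in claims}) > 1
        }


# Invented sample data: two sources disagree about one person's birth year.
store = ClaimStore()
store.add(Claim("person:p1", "birth-year", "1841", Anchor("register-a", 120, 164)))
store.add(Claim("person:p1", "birth-year", "1846", Anchor("register-b", 40, 71)))
frontier = store.contradiction_frontier()
print(frontier)
```

Both claims stay in the store with their own anchors; the frontier is a view over them, not a resolution.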
The founder pushed back on the prior report's de-emphasis of volume. The founder is right — with one condition. The disagreement dissolves once you split volume across two layers.
The founder's intuition is literally measurable. Modern NLP defines ~12 distinct gold-standard, falsifiable annotation layers over the same text, each emitting a different claim type:
Arithmetic for one rich 100-word paragraph: ~100 tokens × ~5 UD claims ≈ 500 syntactic claims; + ~80–120 AMR nodes/edges; + ~10–20 predicates × (sense + args + ~20 UDS scalars) ≈ several hundred semantic claims; + temporal + discourse + entity links. Conservatively 1,000–2,000+ typed falsifiable claims per paragraph, zero prose. The latent typed structure of one text is vast.
This is not aspirational. SemMedDB/SemRep extracted 96–130M typed subject-predicate-object predications from ~29–37M PubMed abstracts — and that database is the substrate of automated biomedical discovery. (It was deprecated Dec 31 2024 — a dated, open market gap donto's extraction layer directly fills.)
Why density enables discovery (the clincher). Swanson's Raynaud↔︎fish-oil discovery worked because the bridge B-concept (blood viscosity) existed as a typed claim in both literatures, even though only 4 of 489 articles co-mentioned A and C. The discovery lived in the intermediate typed facts, not the prose. No B-facts → no bridge → the true link is unrecoverable. Sparse-graph completion provably degrades (real commonsense/biochem KGs avg degree ~2; models "degrade quickly as density is reduced", Malaviya 2019). Under-extraction is a recall ceiling no downstream ranker can lift.
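The ABC pattern described above can be sketched in a few lines. This is a minimal illustration, not donto's generator: the function name `abc_candidates` and the predicate labels in the sample claims are assumptions, and links are treated as undirected for simplicity. It proposes an A–C pair whenever both concepts link to a shared B and no direct A–C claim exists.

```python
from collections import defaultdict
from itertools import combinations


def abc_candidates(claims):
    """claims: iterable of (subject, predicate, object) triples.

    Returns {(a, c): {bridging b concepts}} for pairs that share at least
    one intermediate concept but have no direct claim linking them.
    """
    neighbors = defaultdict(set)
    direct = set()
    for subj, _pred, obj in claims:
        neighbors[subj].add(obj)
        neighbors[obj].add(subj)
        direct.add(frozenset((subj, obj)))

    bridges = defaultdict(set)
    for b, linked in neighbors.items():
        for a, c in combinations(sorted(linked), 2):
            if frozenset((a, c)) not in direct:
                bridges[(a, c)].add(b)
    return dict(bridges)


# Illustrative claims; the predicate names are placeholders.
claims = [
    ("fish oil", "lowers", "blood viscosity"),
    ("Raynaud's syndrome", "associated-with", "blood viscosity"),
]
candidates = abc_candidates(claims)
print(candidates)
```

If the `blood viscosity` claims are removed from either side, the candidate pair disappears, which is the recall ceiling the paragraph describes.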
Where volume hurts. Candidate relationships scale ~O(N²) (worse for typed paths) while true links grow ~linearly. Calude & Longo (Foundations of Science 2017) prove via Ramsey theory that large enough databases must contain arbitrary correlations as a function of size alone — "most correlations are spurious," findable even in random data. So the unverified-relationship layer needs a gate (Benjamini-Hochberg FDR logic; Bonferroni is "too conservative to be useful" at large m).
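A short sketch of the Benjamini-Hochberg step-up procedure shows why it is the less conservative gate. It assumes each candidate relationship comes with a p-value, which is an assumption about the scorer and not something the spec defines; the p-values below are invented for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return indices of hypotheses promoted at false-discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under rank/m * q.
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


def bonferroni(p_values, q=0.05):
    """Return indices passing the stricter family-wise threshold q/m."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= q / m]


# Invented p-values for ten candidate relationships.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh_promoted = benjamini_hochberg(p)
bonferroni_promoted = bonferroni(p)
print(bh_promoted, bonferroni_promoted)
```

On this toy input Benjamini-Hochberg promotes two candidates where Bonferroni promotes one; the gap widens as the number of candidates m grows.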
| | Typed-extraction / decomposition layer | Unverified-relationship-hypothesis layer |
|---|---|---|
| Volume verdict | YES — recall, the raw fuel | GATED — combinatorial false-discovery |
| Why | 1k–2k typed claims/paragraph; Swanson needs B-facts; denser graph = more reachable true links | O(N²) candidates; Calude-Longo: spurious regularities guaranteed by size |
| Failure if ignored | Recall ceiling — true links permanently unrecoverable | Deluge — real findings buried in noise |
| donto state | claim entering graph (evidence-anchored, re-checkable); allowed at any volume | hypothesis promoted hypothesis_only → believed; must pass scorer under an FDR budget |
| donto today | UNDER-fed (4.7% evidence coverage) | EMPTY (the missing generate+gate step) |
Conclusion: The founder is right — extract maximally — under two conditions: (1) lenses emit typed claims, never prose (prose has no join key, no falsifiability, no provenance type — it can neither feed the ranker nor bridge two texts); (2) the relationship layer is gated by the ranker. Volume feeds the verifier; they are sequential, not opposed. The only universal-noise case is free-form prose output.
A lens is a typed-claim emitter with a declared target schema. The objective test for "decorative": does it write rows in its declared predicate family that the ranker can use — i.e. raise recall of true latent links OR sharpen the gate? If neither, cut it.
| Lens | Typed claim it emits | How it changes the graph |
|---|---|---|
| Causal | causal-claim(A causes B, polarity, strength) (SciClaim schema) | New candidate causal edge → enters ranker; can rebut an existing causal claim |
| Temporal | tml:event, tml:tlink(before/during/overlaps) (Allen) | Time-bounds an assertion; enables bitemporal "what was true at T"; detects time contradictions |
| Linguistic | ud:deprel, amr:ARGn, wn:sense, uds:proto-role(scalar) | Adds join keys (shared sense/frame) for cross-text bridges; graded scalars become ranker weights |
| Genealogical | relationship-hypothesis(parent-of, source-specific) | Candidate kinship edge with per-source evidence; competing edges held paraconsistently |
| Legal | treatment-edge cites/distinguishes/overrules/follows | Builds the argument graph; overrules flips valid-time of a precedent without deletion |
| Social (FAN) | co-occurrence/associate-of(witness, sponsor, neighbor) | Cluster edges that triangulate identity and surface hidden links |
| Skills/competency | skill-claim(surface → ESCO/O*NET IRI, confidence), implicit-skill-claim | Resolves to taxonomy node non-destructively; implicit claims enter as hypothesis_only |
| Identity | identity_edge(maybe-same-as, evidence) | Query-time, non-destructive merge candidate; competing identities coexist |
Kill the decorative failure mode: any lens emitting a paragraph of analysis instead of typed rows is deleted. "It explained the text nicely" is not a graph mutation.
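The "declared target schema" test can be made mechanical. The sketch below is an assumption about how such a check might look, not donto's API: `TypedClaim` and `validate_lens_output` are hypothetical names. A lens declares its predicate family up front, and any row that is prose or falls outside that family is rejected before it reaches the graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedClaim:
    predicate: str
    args: tuple
    source_span: tuple  # (start_byte, end_byte) in the source artifact


def validate_lens_output(declared_predicates, rows):
    """Accept only typed rows inside the lens's declared predicate family."""
    accepted = []
    for row in rows:
        if not isinstance(row, TypedClaim):
            # A paragraph of analysis is not a graph mutation.
            raise TypeError("lens emitted prose or an untyped row")
        if row.predicate not in declared_predicates:
            raise ValueError(f"undeclared predicate: {row.predicate}")
        accepted.append(row)
    return accepted


causal_family = {"causal-claim"}
good_rows = [TypedClaim("causal-claim", ("A", "causes", "B"), (10, 48))]
accepted = validate_lens_output(causal_family, good_rows)
print(accepted)
```

Passing a string such as `"The text suggests A may cause B."` in place of a `TypedClaim` raises `TypeError`, which is the decorative failure mode caught at write time.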
This is the cleanest possible proof of the whole thesis. Matching
is literally scored relationship-discovery over typed
claims. The founder owns both sides of the graph and
the distribution channel: JSON Resume (MIT standard, schema
v1.0.0, ~2.4k stars, 10,000+ devs, 400+ themes, Gist-backed registry at
registry.jsonresume.org/<user>) plus a draft
job-schema.json. The resume side is already half-decomposed
into claims
(basics/work/education/skills{name,level,keywords}/projects).
Two entity types: resume (a person's claim-set) and job (a requirement claim-set). Both decompose into the same typed schema, so matching is set-to-set relationship discovery, not cosine similarity between two opaque vectors.
| Resume claims | Job claims |
|---|---|
| skill-claim (surface → ESCO/Lightcast/O*NET IRI) | required-skill-claim |
| role-claim, seniority-claim (time-bounded) | preferred-skill-claim |
| domain-claim (e.g. fintech), tenure-claim | required-seniority-claim |
| education-claim, trajectory-claim (role transitions) | domain-claim |
| implicit/inferred-skill-claim (hypothesis_only) | comp-claim, location/remote-claim |
| identity-variant-claim (duplicate/variant profiles) | |
Anchor vocabulary: ESCO (14,575 skills / 3,039 occupations / 28 languages) + O*NET (1,016 occupations, 277 descriptors) as canonical free/open skill IRIs; Lightcast Open Skills (~33,000 skills, refreshed biweekly from 1B+ postings) as a freshness benchmark. Each skill is three claims: raw surface form + normalized IRI + confidence — the bullet text is never destroyed.
Implicit-skill lens example: bullet
"Led migration of monolith to microservices on AWS for a 5M-user fintech"
→ explicit (AWS, microservices, leadership) and
inferred-as-hypothesis_only (distributed-systems,
observability, PCI/regulatory exposure, team-scale, domain=fintech,
seniority signal). Industrial implicit-skill extraction runs
~80% precision / ~86% recall — real and measured, and
exactly why it must enter as hypothesis, not belief.
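The bullet above can be written out as claim rows. This is a sketch under assumptions: `SkillClaim`, `skill_claims`, and the `example:skill/` IRI prefix are placeholders (real rows would carry ESCO or O*NET IRIs), and the confidence values are illustrative rather than measured. It shows explicit skills tied to a surface form in the bullet and inferred skills entering as hypothesis_only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillClaim:
    surface: str           # raw text from the bullet, or the inferred label
    iri: str               # normalized taxonomy IRI (placeholder scheme here)
    confidence: float
    hypothesis_only: bool
    bullet: str            # source bullet kept verbatim, never destroyed


def placeholder_iri(label):
    # Hypothetical IRI scheme used only for this example.
    return "example:skill/" + label.lower().replace(" ", "-")


def skill_claims(bullet, explicit, inferred, explicit_confidence=1.0):
    """explicit: {surface form in bullet: skill label}; inferred: {label: confidence}."""
    claims = []
    for surface, label in explicit.items():
        if surface.lower() not in bullet.lower():
            raise ValueError(f"surface form {surface!r} not found in bullet")
        claims.append(
            SkillClaim(surface, placeholder_iri(label), explicit_confidence, False, bullet)
        )
    for label, confidence in inferred.items():
        claims.append(SkillClaim(label, placeholder_iri(label), confidence, True, bullet))
    return claims


bullet = "Led migration of monolith to microservices on AWS for a 5M-user fintech"
claims = skill_claims(
    bullet,
    explicit={"AWS": "AWS", "microservices": "microservices", "Led": "leadership"},
    inferred={"distributed systems": 0.8, "observability": 0.8},  # illustrative values
)
for claim in claims:
    print(claim.surface, claim.iri, claim.hypothesis_only)
```

Every row keeps the original bullet, so a promoted match can always be traced back to the text that produced it.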
Each match is an evidence-anchored edge: "requirement R met by skill-claim S extracted from bullet B of resume gist G." Framed as FEVER-style claim verification (skill-claim SUPPORTS / REFUTES / NOT-ENOUGH-INFO a job requirement). The UX leads with:
SOTA admits the gap verbatim. ConFit v2/v3 (ACL 2025/2026): embedding rankers "lack controllability and explainability as the ranking process happens entirely in latent embedding space"; they bolt on LLM re-ranking to recover the WHY (+13.8% recall, +17.5% nDCG over BM25/OpenAI embeddings, though one benchmark is 72% male).
| Capability | Keyword/Embedding | donto |
|---|---|---|
| Why this match | latent / none | claim-chain to source bullet |
| Contradiction | silently averaged | senior claim rebutted by tenure summing 18 months → surfaced as risk panel |
| Skill freshness | static vector | bitemporal: skill-claim valid-from 2018, no recent reinforcement → down-weighted; skill decay |
| Audit / compliance | opaque | bitemporal "replay why we matched" — EU AI Act-style explainability incumbents can't retrofit |
| Hidden candidates | keyword-gated | implicit-skill match even when surface keywords don't |
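The contradiction row in the table can be sketched as a rebuttal check. The 60-month threshold for "senior" is an assumption made for this example, as are the function names and the edge dictionary shape; overlapping jobs would be double-counted by this simple sum.

```python
def months_between(start, end):
    """start and end are (year, month) tuples."""
    (start_year, start_month), (end_year, end_month) = start, end
    return (end_year - start_year) * 12 + (end_month - start_month)


def seniority_rebuttal(claimed_level, work, min_months_for_senior=60):
    """Return a rebuts edge if summed tenure undercuts a 'senior' claim, else None."""
    total = sum(months_between(job["start"], job["end"]) for job in work)
    if claimed_level == "senior" and total < min_months_for_senior:
        return {
            "type": "rebuts",
            "target": "seniority-claim",
            "tenure_months": total,
            "reason": f"tenure sums to {total} months",
        }
    return None


# Invented work history: two roles of nine months each.
work = [
    {"start": (2022, 1), "end": (2022, 10)},
    {"start": (2022, 10), "end": (2023, 7)},
]
edge = seniority_rebuttal("senior", work)
print(edge)
```

The seniority claim is not deleted or down-scored silently; the rebuts edge sits next to it so the risk panel can show both.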
This is where the founder's volume intuition is unambiguously right — it's a genuine network effect, not just recall:
These are emergent typed relationships incumbents charge enterprise prices for; donto generates them as a byproduct of the lifecycle. Volume feeds the verifier: dense claims → denser graph → more candidate matches → ranker promotes only evidence-backed ones.
| Player | ARR / scale | Weakness donto exploits |
|---|---|---|
| Eightfold AI | ~$96.6M ARR, ~$2.1B val, $410M raised | opaque deep-learning talent graph (low explainability) |
| Beamery | ~$112.8M ARR, was $1B; laid off 12% then ~25% | capital-stressed; proprietary graph |
| SeekOut | ~$25.2M ARR, val $1.2B→$435M, 30% layoffs | profile-aggregation alone = fragile moat |
| LinkedIn Skills Graph | 875M people, 200K+ skill links | proprietary, not exportable |
| hiring.cafe | 14,000+ companies, free | ingestion model to copy for the jobs side |
The wedge: incumbents have bigger skills graphs. donto's differentiation is not "we have a skills graph" — it's evidence-anchoring + contradiction-preservation + bitemporal replay + human-readable WHY, on an open standard the founder controls. The likely paid wedge is explainability/fairness compliance (TA software ~$22–26B, AI-recruiting ~$3.2B at ~12% CAGR), not raw accuracy. If donto collapses claims to scores like everyone else, there is no moat.
Harvest the jobs side from open ATS feeds (Greenhouse/Lever/Workday)
the way hiring.cafe does; resume side from the registry (Gist version
history = free bitemporal valid-time). Test: on a held-out set,
beat an ESCO-embedding baseline on precision@k of promoted
matches against a real outcome signal (got-interviewed / hire), while
producing an evidence-anchored, contradiction-flagged explanation for
each. Report the false-discovery proportion among promoted
matches as a first-class number.
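Both numbers named in the test are simple to compute once an outcome signal exists. The sketch below uses invented match IDs and outcomes; `precision_at_k` and `false_discovery_proportion` are illustrative helper names.

```python
def precision_at_k(ranked_ids, positives, k):
    """Fraction of the top-k ranked matches that had a positive outcome."""
    top = ranked_ids[:k]
    if not top:
        return 0.0
    return sum(1 for match_id in top if match_id in positives) / len(top)


def false_discovery_proportion(promoted_ids, positives):
    """Fraction of promoted matches that turned out not to be real."""
    if not promoted_ids:
        return 0.0
    misses = sum(1 for match_id in promoted_ids if match_id not in positives)
    return misses / len(promoted_ids)


# Invented data: m1 and m3 led to an interview.
ranked = ["m1", "m2", "m3", "m4"]
got_interviewed = {"m1", "m3"}
p_at_2 = precision_at_k(ranked, got_interviewed, k=2)
fdp = false_discovery_proportion(["m1", "m2", "m3"], got_interviewed)
print(p_at_2, fdp)
```

Reporting the two together keeps the promotion threshold honest: raising precision@k by promoting fewer matches shows up directly in how many promoted matches remain.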
Template: Domain · Entities · Lenses→claims · Relationships discovered · Load-bearing property · Falsifiable test · Who pays.
1. jsonresume → jobs Domain: talent matching.
Entities: resumes, jobs. Lenses→claims: skill-lens→ESCO skill-claims;
implicit-skill-lens→inferred PCI/distributed-systems
(hypothesis_only); trajectory-lens→career-path edges.
Relationships: candidate↔︎job matches, skill-adjacency, hidden
candidates. Load-bearing: volume / network effects (+
evidence-anchor for the WHY). Test: beat ESCO-embedding
precision@k on got-interviewed. Pays: recruiters, job
platforms, fairness-compliance buyers.
2. Genealogy / native-title (proving ground) Domain:
ancestry/legal apicals. Entities: persons, sources, places.
Lenses→claims: genealogical-lens→parent-of per-source;
identity-lens→maybe-same-as (the 16 distinct "Kittys");
FAN-lens→witness/sponsor edges. Relationships: contested lineage chains
held paraconsistently. Load-bearing: contradiction +
identity-as-hypothesis + bitemporal (every source is an
interpretive witness, not ground truth). Test: surface the EKY #58 vs
Brady 2013 incest-conflict without collapsing either reading. Pays:
native-title corporations, RNTBCs, family researchers.
3. Deep linguistic decomposition / intertextuality (volume showcase) Domain: NLP/corpus. Entities: documents, tokens, predicates. Lenses→claims: UD-lens→5–6 claims/token; UDS-lens→20 scalars/edge; AMR-lens→concept nodes. Relationships: cross-text bridges via shared WordNet sense / FrameNet frame / AMR concept (Swanson ABC join keys). Load-bearing: volume (1k–2k claims/paragraph). Test: recover a held-out cross-document link findable only through a shared B-concept. Pays: research labs, intelligence/analysis, RAG vendors.
4. Biomedical drug-repurposing / LBD (discovery) Domain: biomedicine. Entities: drugs, genes, diseases. Lenses→claims: predication-lens→TREATS/INHIBITS/CAUSES (SemMedDB schema, UMLS-normalized); contradiction-lens→inter-study conflict. Relationships: A→C repurposing candidates via intermediate B (Hetionet ~2.25M edges / DRKG ~5.8M triples). Load-bearing: discovery + contradiction (SemMedDB deprecated 2024 = live gap). Test: rediscover a held-out, later-confirmed drug-disease link from pre-discovery literature. Pays: pharma, biotech, academic discovery.
5. Legal case-law / argument graph Domain: law.
Entities: cases, holdings, statutes. Lenses→claims:
treatment-lens→cites/distinguishes/overrules/follows;
argument-lens→facts/issues/rules (LAMUS-style). Relationships: precedent
chains; superseded-not-deleted law. Load-bearing: contradiction
+ bitemporal ("what was good law on date T" — the killer query;
Caselaw Access Project 6.7M cases). Test: correctly answer
good-law-as-of-date for a precedent overruled at a known date. Pays: law
firms, legal-research vendors (decision-support, never authoritative
ruling).
6. Scientific claim-curation / research-integrity Domain: science. Entities: papers, claims, retractions. Lenses→claims: claim-lens→SciClaim typed associations; provenance-lens→nanopub 3-graph envelope. Relationships: supporting vs conflicting bodies of evidence; re-rank on retraction. Load-bearing: contradiction frontier + bitemporal (Retraction Watch 60k+; "believed at T1, retracted at T2" cascades). Test: detect a contradiction pair Bucur's super-pattern flags, and re-rank dependents when a cited paper is retracted. Pays: publishers, funders, meta-science.
7. OSINT / investigative journalism Domain: investigations. Entities: people, companies, leaks, records. Lenses→claims: entity-lens→FollowTheMoney typed claims; conflict-lens→leak-vs-official disagreement. Relationships: hidden links across contradictory datasets with confidence tiers (confirmed/probable/possible). Load-bearing: identity-as-hypothesis + conflicting sources (OCCRP Aleph; "why do we believe X" survives legal review). Test: reproduce a known cross-leak link with a court-defensible why-trail. Pays: newsrooms, NGOs, due-diligence.
8. Clinical patient-evidence Domain: healthcare. Entities: patients, diagnoses, guideline recs. Lenses→claims: event-lens→OMOP temporal events; guideline-lens→conflicting recommendations under multimorbidity. Relationships: rec conflicts resolved by argumentation; revised diagnoses over time. Load-bearing: bitemporal + contradiction + governance (OHDSI ~800M+ records). Test: flag a guideline conflict for a comorbid patient with each rec traced to its trial. Pays: health systems, CDS vendors (hypothesis/decision-support only).
9. Financial-crime / beneficial-ownership / AML Domain: compliance. Entities: persons, companies, ownership. Lenses→claims: ownership-lens→FtM control claims; sanctions-lens→PEP/sanction status. Relationships: true beneficial owner through shell layers; explainable matches. Load-bearing: entity resolution + conflicting registries (OpenSanctions 2M+ entities / 337 sources; false positives are the #1 pain). Test: surface a true sanctioned beneficial owner through ≥2 shell layers with a regulator-readable explanation, beating an ER baseline on false-positive rate. Pays: banks, fintechs, compliance teams.
10. Personal-AI memory Domain: agent memory. Entities: a person's evolving facts. Lenses→claims: fact-lens→time-bounded life claims; revision-lens→superseding-but-preserved updates. Relationships: belief evolution; re-rank on new context. Load-bearing: evidence-anchor + bitemporal, framed against Mem0 (Mem0 replaces contradictory facts; donto preserves and re-ranks them). Test: answer "what did the user believe at T" correctly after a documented preference reversal. Pays: AI-assistant vendors, agent platforms.
(Properties spread: contradiction → 2,4,5,6,7,8,9; identity → 2,7,9; bitemporal → 2,5,6,8,10; evidence-anchor → 1,7,10; volume → 1,3,4.)
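The bitemporal property that recurs across these domains can be illustrated with the legal "good law on date T" query from domain 5. This is a sketch with a hypothetical `Treatment` record and invented dates: each treatment carries a valid time (when it took legal effect) and a transaction time (when the system recorded it), and an overruling adds a row without deleting the original decision.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Treatment:
    case: str
    kind: str        # "decided" or "overruled"
    effective: date  # valid time: when the treatment took legal effect
    recorded: date   # transaction time: when the system learned of it


def good_law(case, treatments, on, known_at):
    """Was `case` good law on date `on`, given what was recorded by `known_at`?

    Returns None if no treatment of the case is visible.
    """
    visible = [
        t for t in treatments
        if t.case == case and t.recorded <= known_at and t.effective <= on
    ]
    if not visible:
        return None
    latest = max(visible, key=lambda t: t.effective)
    return latest.kind != "overruled"


# Invented example: a precedent decided in 1990 and overruled in 2005.
history = [
    Treatment("Case A", "decided", date(1990, 1, 1), date(1990, 1, 2)),
    Treatment("Case A", "overruled", date(2005, 6, 1), date(2005, 6, 2)),
]
print(good_law("Case A", history, on=date(2000, 1, 1), known_at=date(2024, 1, 1)))
print(good_law("Case A", history, on=date(2010, 1, 1), known_at=date(2024, 1, 1)))
```

Varying `known_at` replays what the system would have answered at an earlier point, which is the same mechanism behind the audit and retraction examples in the other domains.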
Brutally concrete. Two tracks.
Track A — jsonresume matching (the monetizable
proof). Build the held-out benchmark: resumes (registry, with
Gist history as valid-time) × jobs (ATS-scraped). Promote only matches
with evidence-anchored skill→requirement satisfaction above threshold;
hold the rest hypothesis_only.
Track B: discovery via retrospective time-slicing. Cut the corpus at time T, generate hypotheses, check against post-T confirmed truth (drug-repurposing link, or "got-interviewed" outcome). This tests steps 4–7 honestly without waiting for the future.
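A toy version of the time-slicing evaluation is sketched below. The generator here is a simple shared-neighbor heuristic standing in for whatever step 4 becomes; the function names, concept labels, and dates are invented for illustration.

```python
from datetime import date
from itertools import combinations


def shared_neighbor_hypotheses(links):
    """Toy generator: propose unlinked pairs that share a neighbor."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    known = {frozenset(link) for link in links}
    proposed = []
    for bridge in sorted(neighbors):
        for a, c in combinations(sorted(neighbors[bridge]), 2):
            pair = frozenset((a, c))
            if pair not in known and pair not in proposed:
                proposed.append(pair)
    return proposed


def time_sliced_precision(dated_links, cutoff, generate, k):
    """Generate from pre-cutoff links only; score against post-cutoff links."""
    before = [(a, b) for d, a, b in dated_links if d < cutoff]
    confirmed_later = {frozenset((a, b)) for d, a, b in dated_links if d >= cutoff}
    top = generate(before)[:k]
    if not top:
        return 0.0
    return sum(1 for pair in top if pair in confirmed_later) / len(top)


# Invented corpus: the A-C link is only confirmed after the cutoff.
corpus = [
    (date(2000, 1, 1), "A", "B"),
    (date(2001, 1, 1), "B", "C"),
    (date(2002, 1, 1), "B", "D"),
    (date(2010, 1, 1), "A", "C"),
]
precision = time_sliced_precision(corpus, date(2005, 1, 1), shared_neighbor_hypotheses, k=3)
print(precision)
```

The generator never sees post-cutoff links, so a hit means the link was recoverable from the earlier evidence alone.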
Baselines to beat (multi-agent discovery alone is NOT novel — these already exist):
| Baseline | Why it's the bar |
|---|---|
| Plain frontier-LLM prompt | the bitter-lesson null hypothesis — if a single prompt matches you, the substrate adds nothing |
| Vector / embedding search | ConFit shows it works but can't explain; beat it on precision@k and explainability |
| Normal KG link prediction | the structured baseline (Hetionet/DRKG metapaths) |
| Multi-agent discovery (mat2vec, AI Co-Scientist Elo tournament, SciAgents) | DeepMind's Co-Scientist (Nature, May 2026) already ships generate→rank→re-rank — in-memory per session. donto's only claim is the persistent paraconsistent bitemporal substrate underneath. Frame it precisely. |
The number that proves it works: beat the strongest
baseline on precision@k of promoted hypotheses
while (a) attaching an evidence-anchored explanation to
each, (b) surfacing ≥1 real contradiction the baseline silently dropped,
and (c) demonstrably re-ranking the pool when one new evidence byte
arrives. Less is a demo; this is the product.
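Condition (c) can be sketched as re-scoring an existing pool. The linear weights, the smoothed support ratio, and the names `Hypothesis`, `score`, and `on_new_evidence` are assumptions made for this example; the spec only states that the score is a function of novelty, plausibility, evidentiary support, contradiction value, and verification cost. Undercut edges are left out for brevity.

```python
from dataclasses import dataclass

# Hypothetical weights; verification cost counts against a hypothesis.
WEIGHTS = {
    "novelty": 0.2,
    "plausibility": 0.3,
    "support": 0.3,
    "contradiction_value": 0.1,
    "verification_cost": -0.1,
}


@dataclass
class Hypothesis:
    id: str
    novelty: float
    plausibility: float
    contradiction_value: float
    verification_cost: float
    supports: int = 0
    rebuts: int = 0

    def evidentiary_support(self):
        # Smoothed share of supporting edges (an assumption, not donto's formula).
        return (self.supports + 1) / (self.supports + self.rebuts + 2)


def score(h):
    return (
        WEIGHTS["novelty"] * h.novelty
        + WEIGHTS["plausibility"] * h.plausibility
        + WEIGHTS["support"] * h.evidentiary_support()
        + WEIGHTS["contradiction_value"] * h.contradiction_value
        + WEIGHTS["verification_cost"] * h.verification_cost
    )


def rank(pool):
    return sorted(pool, key=score, reverse=True)


def on_new_evidence(pool, hypothesis_id, kind):
    """Attach one new argument edge and re-score the pool without regenerating it."""
    if kind not in ("supports", "rebuts"):
        raise ValueError(f"unknown edge kind: {kind}")
    for h in pool:
        if h.id == hypothesis_id:
            if kind == "supports":
                h.supports += 1
            else:
                h.rebuts += 1
    return rank(pool)


pool = [
    Hypothesis("h1", 0.5, 0.6, 0.0, 0.2, supports=1),
    Hypothesis("h2", 0.5, 0.5, 0.0, 0.2, supports=1),
]
before = [h.id for h in rank(pool)]
after = [h.id for h in on_new_evidence(pool, "h1", "rebuts")]
print(before, after)
```

One rebutting edge is enough to swap the order here, and no hypothesis is regenerated or dropped in the process.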
donto_argument = 2,426 edges, identity_edge = 77, evidence ~4.7%. The contradiction/identity machinery (the differentiator) is barely exercised; this domain will stress-test unbuilt paths.
[...] hypothesis_only + ranker) is not optional; without it the matcher is a spam engine.
[...] precision@k estimate, not a proven bound. Say so.