genes.apexpots.com / research source: donto-claim-substrate-2026-06-02.md

donto as a Claim/Discovery Substrate — Product Spec, jsonresume→jobs & 10 Projects (2026-06-02)

donto as a Claim/Discovery Substrate — Iteration 3 (Product Spec)

1. The reframe, in one paragraph

donto is a contradiction-preserving claim/discovery substrate for machines. It ingests evidence, decomposes it into source-grounded typed claims, holds mutually-incompatible claims as permanent legal state, and then generates, ranks, and re-ranks relationship hypotheses from claim combinations — always able to show a human why a hypothesis exists. The product is this claim lifecycle, not the "lens" vocabulary. Lenses are just one way to generate candidate typed claims; a lens that doesn't emit structured claims that change the graph is decorative. Everything below is one product spec for that lifecycle, plus the domains that prove it.


2. The claim lifecycle (the actual product spec)

This 8-step loop is the spine. Each step is something donto does to the graph.

| # | Step | What donto does (one line) |
|---|------|----------------------------|
| 1 | Ingest evidence | Anchor every source to a byte range; raw artifact + cleaned copy stored, never summarized-and-discarded. |
| 2 | Extract typed claims | Run extraction lenses → emit normalized, typed, falsifiable claims (skill-claim, causal-claim, tlink), each evidence-linked. Volume lives here. |
| 3 | Hold incompatible claims | Store contradictory claims side-by-side as legal state (hypothesis_only, contradiction frontier); never average, overwrite, or drop. |
| 4 | Generate relationship hypotheses | Combine claims into candidate cross-entity edges via shared intermediate facts (Swanson ABC join keys: shared sense/frame/entity-IRI/skill). This step is net-new code. |
| 5 | Attach evidence + counter-evidence | Wire typed argument edges (supports/rebuts/undercuts, AIF-style) to each hypothesis. |
| 6 | Rank | Score = f(novelty, plausibility, evidentiary-support, contradiction-value, verification-cost). This ranker IS the FDR controller. |
| 7 | Re-rank on new evidence | New byte arrives → re-score the existing hypothesis pool (cheap), don't re-generate (expensive). Bitemporal trigger. |
| 8 | Explain | Surface the backing claim-chain, the counter-evidence, the confidence, and a verification path ("confirm by adding repo link"). |

Steps 1–3 are mostly built (genealogy data). Steps 4–7 are the product gap. Step 8 is the differentiator the incumbents structurally can't offer.
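A minimal sketch of steps 5 to 7, under stated assumptions: the spec names the inputs of the scoring function f but not its form, so the linear weights, the `Hypothesis` fields, and the `rerank` helper below are all hypothetical. It only illustrates that new evidence re-scores the existing pool instead of regenerating it; it does not implement FDR control.

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    edge: tuple               # (subject, relation, object)
    novelty: float            # all scalar inputs assumed to lie in [0, 1]
    plausibility: float
    verification_cost: float
    supports: list = field(default_factory=list)  # evidence ids backing the edge
    rebuts: list = field(default_factory=list)    # evidence ids against it
    state: str = "hypothesis_only"

# Hypothetical weights; not taken from the spec.
WEIGHTS = {"novelty": 0.2, "plausibility": 0.3, "support": 0.4,
           "contradiction": 0.1, "cost": 0.2}

def score(h: Hypothesis) -> float:
    total = len(h.supports) + len(h.rebuts)
    support = len(h.supports) / total if total else 0.0
    # Contradiction value peaks when the evidence is evenly split.
    contradiction = 2 * min(len(h.supports), len(h.rebuts)) / total if total else 0.0
    return (WEIGHTS["novelty"] * h.novelty
            + WEIGHTS["plausibility"] * h.plausibility
            + WEIGHTS["support"] * support
            + WEIGHTS["contradiction"] * contradiction
            - WEIGHTS["cost"] * h.verification_cost)

def rerank(pool, edge, evidence_id, stance):
    """Step 7: attach one new piece of evidence, then re-score the existing pool."""
    for h in pool:
        if h.edge == edge:
            (h.supports if stance == "supports" else h.rebuts).append(evidence_id)
    return sorted(pool, key=score, reverse=True)

alice = ("alice", "qualifies-for", "job-17")
bob = ("bob", "qualifies-for", "job-17")
pool = [
    Hypothesis(alice, 0.5, 0.6, 0.2, supports=["e1"]),
    Hypothesis(bob, 0.5, 0.6, 0.2, supports=["e2", "e5"], rebuts=["e3"]),
]
before = [h.edge[0] for h in sorted(pool, key=score, reverse=True)]
after = [h.edge[0] for h in rerank(pool, alice, "e4", "rebuts")]
print(before, after)
```

In this toy pool a single rebutting item moves the first hypothesis below the second without either being deleted or regenerated, and both remain `hypothesis_only` until a separate gate promotes them.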


3. Settling the volume question (honestly)

The founder pushed back on the prior report's de-emphasis of volume. The founder is right, under the two conditions stated in this section's conclusion. The disagreement dissolves once you split volume across two layers.

The founder's intuition is literally measurable. Modern NLP defines ~12 distinct gold-standard, falsifiable annotation layers over the same text, each emitting a different claim type:

Arithmetic for one rich 100-word paragraph: ~100 tokens × ~5 UD claims ≈ 500 syntactic claims; + ~80–120 AMR nodes/edges; + ~10–20 predicates × (sense + args + ~20 UDS scalars) ≈ several hundred semantic claims; + temporal + discourse + entity links. Conservatively 1,000–2,000+ typed falsifiable claims per paragraph, zero prose. The latent typed structure of one text is vast.
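The arithmetic above can be tallied explicitly. This is only a sketch: the per-predicate breakdown (1 sense claim, about 3 argument claims, about 20 UDS scalars) is an assumption made here to turn "several hundred" into a number, and the temporal, discourse, and entity-link layers are left out because the text does not quantify them.

```python
# Per-layer claim counts for one ~100-word paragraph, as low/high estimates.
tokens = 100
ud_claims_per_token = 5
ud = tokens * ud_claims_per_token              # syntactic (UD) claims

amr_low, amr_high = 80, 120                    # AMR nodes/edges

predicates_low, predicates_high = 10, 20
# Assumption: 1 sense + ~3 arguments + ~20 UDS scalars per predicate.
claims_per_predicate = 1 + 3 + 20
semantic_low = predicates_low * claims_per_predicate
semantic_high = predicates_high * claims_per_predicate

low = ud + amr_low + semantic_low
high = ud + amr_high + semantic_high
print(f"subtotal before temporal/discourse/entity layers: {low} to {high}")
```

Under these assumptions the three quantified layers alone give a subtotal of 820 to 1,100 claims, so the 1,000 to 2,000+ figure depends on what the remaining layers contribute.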

This is not aspirational. SemMedDB/SemRep extracted 96–130M typed subject-predicate-object predications from ~29–37M PubMed abstracts — and that database is the substrate of automated biomedical discovery. (It was deprecated Dec 31 2024 — a dated, open market gap donto's extraction layer directly fills.)

Why density enables discovery (the clincher). Swanson's Raynaud↔fish-oil discovery worked because the bridge B-concept (blood viscosity) existed as a typed claim in both literatures, even though only 4 of 489 articles co-mentioned A and C. The discovery lived in the intermediate typed facts, not the prose. No B-facts → no bridge → the true link is unrecoverable. Sparse-graph completion degrades in practice (real commonsense/biochem KGs avg degree ~2; models "degrade quickly as density is reduced", Malaviya 2019). Under-extraction is a recall ceiling no downstream ranker can lift.
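The ABC pattern can be sketched as a join over typed claims. The tuple layout, the `abc_hypotheses` name, and the toy claims and source ids are illustrative assumptions, not donto's schema or real citations.

```python
from collections import defaultdict

def abc_hypotheses(claims):
    """claims: iterable of (subject, predicate, object, source_id).

    Returns candidate A->C links that are never asserted directly but are
    bridged by a shared B that appears in claims from two different sources.
    """
    direct = {(s, o) for s, _, o, _ in claims}
    out_edges = defaultdict(list)
    for s, p, o, src in claims:
        out_edges[s].append((p, o, src))

    bridges = defaultdict(list)
    for a, firsts in out_edges.items():
        for _, b, src1 in firsts:
            # .get avoids inserting keys into the defaultdict while iterating.
            for _, c, src2 in out_edges.get(b, []):
                if c != a and (a, c) not in direct and src1 != src2:
                    bridges[(a, c)].append((b, src1, src2))
    return dict(bridges)

claims = [
    ("fish-oil", "reduces", "blood-viscosity", "nutrition-paper-1"),
    ("blood-viscosity", "aggravates", "raynaud", "vascular-paper-7"),
    ("fish-oil", "reduces", "platelet-aggregation", "nutrition-paper-2"),
]
found = abc_hypotheses(claims)
without_b = abc_hypotheses([claims[0], claims[2]])
print(found)
print(without_b)
```

Dropping the single B-fact from the second literature leaves nothing to join on, which is the recall ceiling the paragraph describes.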

Where volume hurts. Candidate relationships scale ~O(N²) (worse for typed paths) while true links grow ~linearly. Calude & Longo (Foundations of Science 2017) prove via Ramsey theory that large enough databases must contain arbitrary correlations as a function of size alone — "most correlations are spurious," findable even in random data. So the unverified-relationship layer needs a gate (Benjamini-Hochberg FDR logic; Bonferroni is "too conservative to be useful" at large m).
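For reference, a sketch of the standard Benjamini-Hochberg step-up procedure next to Bonferroni. It assumes every candidate relationship comes with a p-value; how donto's ranker scores would be calibrated into p-values is not specified in this document, and the p-values below are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per hypothesis: True if promoted at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under rank/m * q.
        if p_values[i] <= rank / m * q:
            cutoff = rank
    promoted = set(order[:cutoff])
    return [i in promoted for i in range(m)]

def bonferroni(p_values, alpha=0.05):
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
bh = benjamini_hochberg(p, q=0.05)
bf = bonferroni(p, alpha=0.05)
print(sum(bh), "promoted by BH;", sum(bf), "promoted by Bonferroni")
```

On these eight toy p-values Benjamini-Hochberg promotes two hypotheses and Bonferroni one; the gap widens as the number of candidates m grows.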

| | Typed-extraction / decomposition layer | Unverified-relationship-hypothesis layer |
|---|---|---|
| Volume verdict | YES: recall, the raw fuel | GATED: combinatorial false-discovery |
| Why | 1k–2k typed claims/paragraph; Swanson needs B-facts; denser graph = more reachable true links | O(N²) candidates; Calude-Longo: spurious regularities guaranteed by size |
| Failure if ignored | Recall ceiling: true links permanently unrecoverable | Deluge: real findings buried in noise |
| donto state | claim entering graph (evidence-anchored, re-checkable); allowed at any volume | hypothesis promoted hypothesis_only → believed; must pass scorer under an FDR budget |
| donto today | UNDER-fed (4.7% evidence coverage) | EMPTY (the missing generate+gate step) |

Conclusion: The founder is right — extract maximally — under two conditions: (1) lenses emit typed claims, never prose (prose has no join key, no falsifiability, no provenance type — it can neither feed the ranker nor bridge two texts); (2) the relationship layer is gated by the ranker. Volume feeds the verifier; they are sequential, not opposed. The only universal-noise case is free-form prose output.


4. Lenses must emit typed claims (not prose)

A lens is a typed-claim emitter with a declared target schema. The objective test for "decorative": does it write rows in its declared predicate family that the ranker can use — i.e. raise recall of true latent links OR sharpen the gate? If neither, cut it.

| Lens | Typed claim it emits | How it changes the graph |
|---|---|---|
| Causal | causal-claim(A causes B, polarity, strength) (SciClaim schema) | New candidate causal edge → enters ranker; can rebut an existing causal claim |
| Temporal | tml:event, tml:tlink(before/during/overlaps) (Allen) | Time-bounds an assertion; enables bitemporal "what was true at T"; detects time contradictions |
| Linguistic | ud:deprel, amr:ARGn, wn:sense, uds:proto-role(scalar) | Adds join keys (shared sense/frame) for cross-text bridges; graded scalars become ranker weights |
| Genealogical | relationship-hypothesis(parent-of, source-specific) | Candidate kinship edge with per-source evidence; competing edges held paraconsistently |
| Legal | treatment-edge cites/distinguishes/overrules/follows | Builds the argument graph; overrules flips valid-time of a precedent without deletion |
| Social (FAN) | co-occurrence/associate-of(witness, sponsor, neighbor) | Cluster edges that triangulate identity and surface hidden links |
| Skills/competency | skill-claim(surface → ESCO/O*NET IRI, confidence), implicit-skill-claim | Resolves to taxonomy node non-destructively; implicit claims enter as hypothesis_only |
| Identity | identity_edge(maybe-same-as, evidence) | Query-time, non-destructive merge candidate; competing identities coexist |

Kill the decorative failure mode: any lens emitting a paragraph of analysis instead of typed rows is deleted. "It explained the text nicely" is not a graph mutation.
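One way to make the "decorative" test mechanical, as a sketch: treat a lens as a function with a declared predicate family and reject anything it emits that is not a typed row in that family. The `Row`, `Lens`, `run_lens`, and `is_decorative` names and the toy emitters are hypothetical. The check covers only the first half of the test (rows in the declared family); whether those rows raise recall or sharpen the gate still has to be measured downstream.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Row:
    predicate: str
    subject: str
    obj: str
    evidence: tuple  # (source_id, start_byte, end_byte)

@dataclass
class Lens:
    name: str
    predicate_family: frozenset
    emit: Callable[[str, str], list]  # (source_id, text) -> emitted items

def run_lens(lens, source_id, text):
    """Keep only typed rows in the lens's declared predicate family."""
    return [item for item in lens.emit(source_id, text)
            if isinstance(item, Row) and item.predicate in lens.predicate_family]

def is_decorative(lens, samples):
    """True if the lens writes no usable rows on any sample."""
    return all(not run_lens(lens, sid, text) for sid, text in samples)

def toy_causal_emit(source_id, text):
    # Deliberately naive pattern match, for illustration only.
    if " causes " not in text:
        return []
    cause, effect = text.split(" causes ", 1)
    span = (source_id, 0, len(text.encode("utf-8")))
    return [Row("causal-claim", cause.strip(), effect.strip(" ."), span)]

causal_lens = Lens("causal", frozenset({"causal-claim"}), toy_causal_emit)
prose_lens = Lens("summary", frozenset({"causal-claim"}),
                  lambda sid, text: ["This passage discusses causation."])

samples = [("doc1", "Smoking causes cancer.")]
rows = run_lens(causal_lens, *samples[0])
print(rows)
print(is_decorative(causal_lens, samples), is_decorative(prose_lens, samples))
```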


5. Flagship: jsonresume → jobs

This is the cleanest possible proof of the whole thesis. Matching is literally scored relationship-discovery over typed claims. The founder owns both sides of the graph and the distribution channel: JSON Resume (MIT-licensed standard, schema v1.0.0, ~2.4k stars, 10,000+ devs, 400+ themes, Gist-backed registry at registry.jsonresume.org/<user>) plus a draft job-schema.json. The resume side is already half-decomposed into claims (basics/work/education/skills{name,level,keywords}/projects).

(a) Entities

Two: resume (a person's claim-set) and job (a requirement claim-set). Both decompose into the same typed schema, so matching is set-to-set relationship discovery — not cosine similarity between two opaque vectors.
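A sketch of set-to-set matching over a shared typed schema. The class names, fields, and `skill:` identifiers are placeholders (not real ESCO IRIs), and the verdict labels are borrowed from FEVER-style claim verification; a REFUTES verdict would need counter-evidence handling that this sketch omits.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResumeClaim:
    skill_iri: str
    bullet: str   # source text the claim was extracted from
    state: str    # "believed" or "hypothesis_only"

@dataclass(frozen=True)
class Requirement:
    skill_iri: str
    required: bool

def match(resume, job):
    """Return one explainable edge per job requirement."""
    by_skill = {}
    for claim in resume:
        by_skill.setdefault(claim.skill_iri, []).append(claim)
    edges = []
    for req in job:
        backing = by_skill.get(req.skill_iri, [])
        edges.append({
            "requirement": req.skill_iri,
            "required": req.required,
            "verdict": "SUPPORTS" if backing else "NOT-ENOUGH-INFO",
            # The explanation keeps the source bullet and the claim's state.
            "evidence": [(c.bullet, c.state) for c in backing],
        })
    return edges

bullet = "Led migration of monolith to microservices on AWS"
resume = [
    ResumeClaim("skill:aws", bullet, "believed"),
    ResumeClaim("skill:distributed-systems", bullet, "hypothesis_only"),
]
job = [Requirement("skill:aws", True), Requirement("skill:kubernetes", True)]
edges = match(resume, job)
for edge in edges:
    print(edge)
```

Unlike a similarity score between two vectors, each edge names the requirement, the verdict, and the bullet that backs it.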

(b) Typed claims per side

| Resume claims | Job claims |
|---|---|
| skill-claim (surface → ESCO/Lightcast/O*NET IRI) | required-skill-claim |
| role-claim, seniority-claim (time-bounded) | preferred-skill-claim |
| domain-claim (e.g. fintech), tenure-claim | required-seniority-claim |
| education-claim, trajectory-claim (role transitions) | domain-claim |
| implicit/inferred-skill-claim (hypothesis_only) | comp-claim, location/remote-claim |
| identity-variant-claim (duplicate/variant profiles) | |

Anchor vocabulary: ESCO (14,575 skills / 3,039 occupations / 28 languages) + O*NET (1,016 occupations, 277 descriptors) as canonical free/open skill IRIs; Lightcast Open Skills (~33,000 skills, refreshed biweekly from 1B+ postings) as a freshness benchmark. Each skill is three claims: raw surface form + normalized IRI + confidence — the bullet text is never destroyed.

Implicit-skill lens example: bullet "Led migration of monolith to microservices on AWS for a 5M-user fintech" → explicit (AWS, microservices, leadership) and inferred-as-hypothesis_only (distributed-systems, observability, PCI/regulatory exposure, team-scale, domain=fintech, seniority signal). Industrial implicit-skill extraction runs ~80% precision / ~86% recall — real and measured, and exactly why it must enter as hypothesis, not belief.
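The example bullet can be run through a toy extractor to show the two states. The lookup tables, `skill:` identifiers, and confidence values are hypothetical (0.8 merely echoes the ~80% precision figure above); a real lens would resolve against ESCO or O*NET.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SkillClaim:
    surface: str       # raw surface form, kept verbatim
    iri: str           # normalized taxonomy node (placeholder ids here)
    confidence: float
    state: str         # "believed" or "hypothesis_only"

# Hypothetical lookup tables, for illustration only.
EXPLICIT = {
    "AWS": "skill:aws",
    "microservices": "skill:microservices",
    "Led": "skill:leadership",
}
IMPLIED_BY = {
    "skill:microservices": ["skill:distributed-systems", "skill:observability"],
}

def extract(bullet, implicit_confidence=0.8):
    claims = []
    for surface, iri in EXPLICIT.items():
        if surface in bullet:
            claims.append(SkillClaim(surface, iri, 1.0, "believed"))
    # Inferred skills are derived from explicit ones and never enter as belief.
    for explicit in list(claims):
        for iri in IMPLIED_BY.get(explicit.iri, []):
            claims.append(SkillClaim(explicit.surface, iri,
                                     implicit_confidence, "hypothesis_only"))
    return claims

bullet = "Led migration of monolith to microservices on AWS for a 5M-user fintech"
claims = extract(bullet)
for claim in claims:
    print(claim)
```

Each claim carries its surface form, normalized node, and confidence together, so the bullet text is never replaced by the taxonomy id.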

(c) Matching as explainable relationship hypotheses

Each match is an evidence-anchored edge: "requirement R met by skill-claim S extracted from bullet B of resume gist G." Framed as FEVER-style claim verification (skill-claim SUPPORTS / REFUTES / NOT-ENOUGH-INFO a job requirement). The UX leads with:

(d) Why donto beats keyword + embedding matching

SOTA admits the gap verbatim. ConFit v2/v3 (ACL 2025/2026): embedding rankers "lack controllability and explainability as the ranking process happens entirely in latent embedding space", so they bolt on LLM re-ranking to recover the WHY (+13.8% recall, +17.5% nDCG over BM25/OpenAI embeddings, though one benchmark is 72% male).

| Capability | Keyword/Embedding | donto |
|---|---|---|
| Why this match | latent / none | claim-chain to source bullet |
| Contradiction | silently averaged | senior claim rebutted by tenure summing 18 months → surfaced as risk panel |
| Skill freshness | static vector | bitemporal: skill-claim valid-from 2018, no recent reinforcement → down-weighted; skill decay |
| Audit / compliance | opaque | bitemporal "replay why we matched"; EU AI Act-style explainability incumbents can't retrofit |
| Hidden candidates | keyword-gated | implicit-skill match even when surface keywords don't |
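Two rows of the table, contradiction and skill freshness, can be sketched as follows. The 60-month seniority threshold, the 5-year half-life, and the function names are assumptions made for illustration; the document does not fix either parameter.

```python
from datetime import date

def months_between(start: date, end: date) -> int:
    return (end.year - start.year) * 12 + (end.month - start.month)

def seniority_risks(seniority_claims, tenures, min_months=60):
    """Emit a rebuttal record instead of overwriting the seniority claim."""
    total = sum(months_between(start, end) for start, end in tenures)
    return [
        {"claim": claim, "rebutted_by": "tenure-total", "total_months": total}
        for claim in seniority_claims
        if claim == "senior" and total < min_months
    ]

def freshness_weight(valid_from_year, as_of_year, half_life_years=5.0):
    """Exponential down-weighting of a skill claim with no recent reinforcement."""
    age = max(0, as_of_year - valid_from_year)
    return 0.5 ** (age / half_life_years)

tenures = [
    (date(2021, 1, 1), date(2021, 10, 1)),
    (date(2022, 1, 1), date(2022, 10, 1)),
]
risks = seniority_risks(["senior"], tenures)
weight = freshness_weight(2018, 2026)
print(risks)
print(round(weight, 2))
```

The seniority claim stays in the graph; the rebuttal is a separate record that a risk panel can render next to it.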

(e) Network-effect payoff at millions of datapoints

This is where the founder's volume intuition is unambiguously right — it's a genuine network effect, not just recall:

These are emergent typed relationships incumbents charge enterprise prices for; donto generates them as a byproduct of the lifecycle. Volume feeds the verifier: dense claims → denser graph → more candidate matches → ranker promotes only evidence-backed ones.

(f) Competitive landscape

| Player | ARR / scale | Weakness donto exploits |
|---|---|---|
| Eightfold AI | ~$96.6M ARR, ~$2.1B val, $410M raised | opaque deep-learning talent graph (low explainability) |
| Beamery | ~$112.8M ARR, was $1B; laid off 12% then ~25% | capital-stressed; proprietary graph |
| SeekOut | ~$25.2M ARR, val $1.2B→$435M, 30% layoffs | profile-aggregation alone = fragile moat |
| LinkedIn Skills Graph | 875M people, 200K+ skill links | proprietary, not exportable |
| hiring.cafe | 14,000+ companies, free | ingestion model to copy for the jobs side |

The wedge: incumbents have bigger skills graphs. donto's differentiation is not "we have a skills graph" — it's evidence-anchoring + contradiction-preservation + bitemporal replay + human-readable WHY, on an open standard the founder controls. The likely paid wedge is explainability/fairness compliance (TA software ~$22–26B, AI-recruiting ~$3.2B at ~12% CAGR), not raw accuracy. If donto collapses claims to scores like everyone else, there is no moat.

(g) Falsifiable first test

Harvest the jobs side from open ATS feeds (Greenhouse/Lever/Workday) the way hiring.cafe does; resume side from the registry (Gist version history = free bitemporal valid-time). Test: on a held-out set, beat an ESCO-embedding baseline on precision@k of promoted matches against a real outcome signal (got-interviewed / hire), while producing an evidence-anchored, contradiction-flagged explanation for each. Report the false-discovery proportion among promoted matches as a first-class number.
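The two headline numbers of this test are simple to pin down. A sketch, assuming matches are identified by opaque ids and the outcome signal is a set of ids that led to an interview or hire; the ids below are made up.

```python
def precision_at_k(ranked, positives, k):
    """Share of the top-k ranked matches that have a positive outcome."""
    top = ranked[:k]
    return sum(1 for m in top if m in positives) / len(top) if top else 0.0

def false_discovery_proportion(promoted, positives):
    """Share of promoted matches with no positive outcome."""
    if not promoted:
        return 0.0
    return sum(1 for m in promoted if m not in positives) / len(promoted)

ranked = ["m1", "m2", "m3", "m4", "m5"]   # ranker output, best first
positives = {"m1", "m3", "m4"}            # got-interviewed / hired
promoted = ranked[:3]                     # matches that passed the gate

p_at_3 = precision_at_k(ranked, positives, 3)
fdp = false_discovery_proportion(promoted, positives)
print(p_at_3, fdp)
```

When the promoted set is exactly the top k, the false-discovery proportion is one minus precision@k; the two diverge once promotion uses a score threshold instead of a fixed k.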


6. 10 example projects across domains

Template: Domain · Entities · Lenses→claims · Relationships discovered · Load-bearing property · Falsifiable test · Who pays.

1. jsonresume → jobs Domain: talent matching. Entities: resumes, jobs. Lenses→claims: skill-lens→ESCO skill-claims; implicit-skill-lens→inferred PCI/distributed-systems (hypothesis_only); trajectory-lens→career-path edges. Relationships: candidate↔︎job matches, skill-adjacency, hidden candidates. Load-bearing: volume / network effects (+ evidence-anchor for the WHY). Test: beat ESCO-embedding precision@k on got-interviewed. Pays: recruiters, job platforms, fairness-compliance buyers.

2. Genealogy / native-title (proving ground) Domain: ancestry/legal apicals. Entities: persons, sources, places. Lenses→claims: genealogical-lens→parent-of per-source; identity-lens→maybe-same-as (the 16 distinct "Kittys"); FAN-lens→witness/sponsor edges. Relationships: contested lineage chains held paraconsistently. Load-bearing: contradiction + identity-as-hypothesis + bitemporal (every source is an interpretive witness, not ground truth). Test: surface the EKY #58 vs Brady 2013 incest-conflict without collapsing either reading. Pays: native-title corporations, RNTBCs, family researchers.

3. Deep linguistic decomposition / intertextuality (volume showcase) Domain: NLP/corpus. Entities: documents, tokens, predicates. Lenses→claims: UD-lens→5–6 claims/token; UDS-lens→20 scalars/edge; AMR-lens→concept nodes. Relationships: cross-text bridges via shared WordNet sense / FrameNet frame / AMR concept (Swanson ABC join keys). Load-bearing: volume (1k–2k claims/paragraph). Test: recover a held-out cross-document link findable only through a shared B-concept. Pays: research labs, intelligence/analysis, RAG vendors.

4. Biomedical drug-repurposing / LBD (discovery) Domain: biomedicine. Entities: drugs, genes, diseases. Lenses→claims: predication-lens→TREATS/INHIBITS/CAUSES (SemMedDB schema, UMLS-normalized); contradiction-lens→inter-study conflict. Relationships: A→C repurposing candidates via intermediate B (Hetionet ~2.25M edges / DRKG ~5.8M triples). Load-bearing: discovery + contradiction (SemMedDB deprecated 2024 = live gap). Test: rediscover a held-out, later-confirmed drug-disease link from pre-discovery literature. Pays: pharma, biotech, academic discovery.

5. Legal case-law / argument graph Domain: law. Entities: cases, holdings, statutes. Lenses→claims: treatment-lens→cites/distinguishes/overrules/follows; argument-lens→facts/issues/rules (LAMUS-style). Relationships: precedent chains; superseded-not-deleted law. Load-bearing: contradiction + bitemporal ("what was good law on date T" — the killer query; Caselaw Access Project 6.7M cases). Test: correctly answer good-law-as-of-date for a precedent overruled at a known date. Pays: law firms, legal-research vendors (decision-support, never authoritative ruling).

6. Scientific claim-curation / research-integrity Domain: science. Entities: papers, claims, retractions. Lenses→claims: claim-lens→SciClaim typed associations; provenance-lens→nanopub 3-graph envelope. Relationships: supporting vs conflicting bodies of evidence; re-rank on retraction. Load-bearing: contradiction frontier + bitemporal (Retraction Watch 60k+; "believed at T1, retracted at T2" cascades). Test: detect a contradiction pair Bucur's super-pattern flags, and re-rank dependents when a cited paper is retracted. Pays: publishers, funders, meta-science.

7. OSINT / investigative journalism Domain: investigations. Entities: people, companies, leaks, records. Lenses→claims: entity-lens→FollowTheMoney typed claims; conflict-lens→leak-vs-official disagreement. Relationships: hidden links across contradictory datasets with confidence tiers (confirmed/probable/possible). Load-bearing: identity-as-hypothesis + conflicting sources (OCCRP Aleph; "why do we believe X" survives legal review). Test: reproduce a known cross-leak link with a court-defensible why-trail. Pays: newsrooms, NGOs, due-diligence.

8. Clinical patient-evidence Domain: healthcare. Entities: patients, diagnoses, guideline recs. Lenses→claims: event-lens→OMOP temporal events; guideline-lens→conflicting recommendations under multimorbidity. Relationships: rec conflicts resolved by argumentation; revised diagnoses over time. Load-bearing: bitemporal + contradiction + governance (OHDSI ~800M+ records). Test: flag a guideline conflict for a comorbid patient with each rec traced to its trial. Pays: health systems, CDS vendors (hypothesis/decision-support only).

9. Financial-crime / beneficial-ownership / AML Domain: compliance. Entities: persons, companies, ownership. Lenses→claims: ownership-lens→FtM control claims; sanctions-lens→PEP/sanction status. Relationships: true beneficial owner through shell layers; explainable matches. Load-bearing: entity resolution + conflicting registries (OpenSanctions 2M+ entities / 337 sources; false positives are the #1 pain). Test: surface a true sanctioned beneficial owner through ≥2 shell layers with a regulator-readable explanation, beating an ER baseline on false-positive rate. Pays: banks, fintechs, compliance teams.

10. Personal-AI memory Domain: agent memory. Entities: a person's evolving facts. Lenses→claims: fact-lens→time-bounded life claims; revision-lens→superseding-but-preserved updates. Relationships: belief evolution; re-rank on new context. Load-bearing: evidence-anchor + bitemporal, framed against Mem0 (Mem0 replaces contradictory facts; donto preserves and re-ranks them). Test: answer "what did the user believe at T" correctly after a documented preference reversal. Pays: AI-assistant vendors, agent platforms.

(Properties spread: contradiction → 2,4,5,6,7,8,9; identity → 2,7,9; bitemporal → 2,5,6,8,10; evidence-anchor → 1,7,10; volume → 1,3,4.)
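Several of the tests above ("good law as of date T", "what did the user believe at T") reduce to the same bitemporal as-of query. A minimal sketch, assuming an append-only list of assertions with a valid-time interval and a recording date; the `Assertion` fields and the example case are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class Assertion:
    statement: str
    valid_from: date
    valid_to: Optional[date]   # None means still valid
    recorded_at: date          # when this version entered the store

def as_of(assertions, valid_at, known_at):
    """Statements valid at `valid_at`, using only what was recorded by `known_at`."""
    latest = {}
    for a in sorted(assertions, key=lambda a: a.recorded_at):
        if a.recorded_at <= known_at:
            latest[a.statement] = a   # a later version supersedes, nothing is deleted
    return sorted(
        s for s, a in latest.items()
        if a.valid_from <= valid_at and (a.valid_to is None or valid_at < a.valid_to)
    )

store = [
    Assertion("case-X is good law", date(1990, 1, 1), None, date(1990, 1, 1)),
    # Overruling appends two rows; the 1990 row stays in the store.
    Assertion("case-X is good law", date(1990, 1, 1), date(2010, 6, 1), date(2010, 6, 1)),
    Assertion("case-X is overruled", date(2010, 6, 1), None, date(2010, 6, 1)),
]

now_view = as_of(store, valid_at=date(2015, 1, 1), known_at=date(2020, 1, 1))
past_validity = as_of(store, valid_at=date(2000, 1, 1), known_at=date(2020, 1, 1))
past_knowledge = as_of(store, valid_at=date(2015, 1, 1), known_at=date(2005, 1, 1))
print(now_view, past_validity, past_knowledge)
```

The third query replays what the store would have answered in 2005, which is the "replay why we matched" capability in audit form.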


7. First milestone & baselines

Brutally concrete. Two tracks.

Track A — jsonresume matching (the monetizable proof). Build the held-out benchmark: resumes (registry, with Gist history as valid-time) × jobs (ATS-scraped). Promote only matches with evidence-anchored skill→requirement satisfaction above threshold; hold the rest hypothesis_only.

Track B — discovery via retrospective time-slicing. Cut the corpus at time T, generate hypotheses, check against post-T confirmed truth (drug-repurposing link, or "got-interviewed" outcome). This tests steps 4–7 honestly without waiting for the future.
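A sketch of the time-slicing harness under stated assumptions: claims are dicts with a `date` field, the generator is pluggable (a naive two-hop join stands in for step 4 here), and the toy claims are invented. The point is that anything dated on or after the cutoff is invisible to generation and is used only for scoring.

```python
from datetime import date

def two_hop(claims):
    """Stand-in generator: propose A->C wherever A->B and B->C both exist."""
    edges = {(c["subject"], c["object"]) for c in claims}
    return {(a, c) for a, b in edges for b2, c in edges if b == b2 and a != c}

def time_slice_eval(claims, cutoff, generate, confirmed_after):
    pre = [c for c in claims if c["date"] < cutoff]       # no post-T leakage
    known = {(c["subject"], c["object"]) for c in pre}
    hypotheses = set(generate(pre)) - known               # only unseen links count
    hits = hypotheses & confirmed_after
    return {
        "hypotheses": hypotheses,
        "hits": hits,
        "precision": len(hits) / len(hypotheses) if hypotheses else 0.0,
        "recall": len(hits) / len(confirmed_after) if confirmed_after else 0.0,
    }

claims = [
    {"subject": "drug-D", "object": "gene-G", "date": date(2015, 3, 1)},
    {"subject": "gene-G", "object": "disease-X", "date": date(2016, 5, 1)},
    {"subject": "drug-E", "object": "gene-H", "date": date(2016, 7, 1)},
    {"subject": "gene-H", "object": "disease-Y", "date": date(2017, 2, 1)},
    {"subject": "drug-D", "object": "disease-X", "date": date(2021, 9, 1)},
]
confirmed_after = {("drug-D", "disease-X")}
result = time_slice_eval(claims, date(2020, 1, 1), two_hop, confirmed_after)
print(result)
```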

Baselines to beat (multi-agent discovery alone is NOT novel — these already exist):

| Baseline | Why it's the bar |
|---|---|
| Plain frontier-LLM prompt | the bitter-lesson null hypothesis: if a single prompt matches you, the substrate adds nothing |
| Vector / embedding search | ConFit shows it works but can't explain; beat it on precision@k and explainability |
| Normal KG link prediction | the structured baseline (Hetionet/DRKG metapaths) |
| Multi-agent discovery (mat2vec, AI Co-Scientist Elo tournament, SciAgents) | DeepMind's Co-Scientist (Nature, May 2026) already ships generate→rank→re-rank, in-memory per session. donto's only claim is the persistent paraconsistent bitemporal substrate underneath. Frame it precisely. |

The number that proves it works: beat the strongest baseline on precision@k of promoted hypotheses while (a) attaching an evidence-anchored explanation to each, (b) surfacing ≥1 real contradiction the baseline silently dropped, and (c) demonstrably re-ranking the pool when one new evidence byte arrives. Anything less is a demo; this is the product.


8. The honest risks