genes.apexpots.com / research source: donto-abundance-2026-06-02.md

donto — The Substrate for Generative Abundance (unified, 2026-06-02)

A unified vision report · 2026-06-02

1. The thesis

For sixty years, generating typed knowledge about the world was the scarce, expensive, human-bottlenecked step in every knowledge and discovery system. That scarcity is gone. A guided frontier LLM can now emit an essentially unbounded, multi-directional space of properties and relations about any entity — inventing the axes and the predicates as it goes — for a fraction of a cent each. donto is the substrate that turns that firehose into durable, evidence-anchored, contradiction-tolerant, self-densifying knowledge — and discovers the relationships no one ever drew.

The expansion. Every prior system paid a human tax on generation: Cyc paid knowledge engineers per assertion; literature-based discovery rode co-occurrence statistics; formal concept analysis required attributes defined up front. The supply of typed claims was the constraint, so those systems were small, brittle, and slow. The 2024–2026 result that reorients everything: a single cheap model (GPT-4o-mini), pointed at entities and asked to elaborate, materialized 105M typed triples over 2.9M entities using 2,133 distinct relations — ~36 typed properties per entity — at $0.00009 per correct triple (GPTKB). It invented predicates no schema had (historicalSignificance, hasArtStyle, hobbies), and 69.5% of the entities it described don't exist in Wikidata at all. Generation is no longer scarce. It is abundant, multi-directional, and self-extending.

That changes the hard problem. When generation was scarce, the architecture question was "how do we get enough typed knowledge?" Now it is: "where do we put an unbounded, contradictory, evidence-anchored firehose without throwing most of it away?" The standard answers — vector DBs, normal KGs, even 2025's best agent-memory graphs — all collapse: they dedup, pick a winner, and invalidate the loser at write time, destroying exactly the speculative, minority, not-yet-supported claims that are the raw material of discovery. donto does the opposite. It is bitemporal, paraconsistent, and evidence-first: it holds incompatible claims forever as legal state, anchors each to its source, links them with typed argument edges, and re-ranks by reality over time instead of deleting on conflict. Generation-abundance + paraconsistent holding + evidence-anchoring + bitemporal re-ranking = a knowledge base that grows itself in all directions and prunes by reality.

2. What actually changed (why now is not 1986)

The single assumption that defined pre-LLM knowledge engineering — that producing typed properties and relations is slow, costly, and requires experts — is dead. Here is the evidence, in numbers, with the old systems used only as one-line touchstones to mark the constraint that lifted.

Emission is unbounded and multi-directional.

GPTKB: 105M triples / 2.9M entities / 2,133 relations / 367 classes from GPT-4o-mini in 27 hours; the model coined its own axes. (Cyc's touchstone: this is the knowledge-engineer step, done for free, along axes nobody pre-declared.)
GPTKB v1.5 / "Mining the Mind": ~100M beliefs recursively elicited from a single frontier model — and the same model asserts mutually contradictory claims depending on framing. Abundance is real and abundantly contradictory.
AutoSchemaKG / ATLAS: a 900M+ node, 5.9B edge graph from 50M+ documents with zero predefined schema — the LLM induced all entity/event/concept types on the fly, hitting 95% alignment with human-crafted schemas while preserving 93–97% of source information. (FCA's touchstone: predefined attributes are now a generated artifact.) Strikingly, that was done with a small Llama-3-8B extractor — the frontier ceiling (GPT-5.5 at 1M-token context, Claude Opus 4.5) is far above what these papers measured.
The directions are real, not metaphor. Anthropic's sparse-autoencoder work extracted 34M distinct interpretable features (concept-directions) from one mid-size model, ~12M alive. When we say "any direction," the substrate inside the model literally has tens of millions of them. The job is not to manufacture directions — they exist in superabundance — but to elicit the useful ones and anchor each emission to evidence.

Cost collapsed, and is a steering variable now, not a wall.

Inference cost for a fixed capability has fallen ~10×/year for three straight years (a16z "LLMflation"): GPT-3-quality dropped from $60/M tokens (Nov-2021) to ~$0.06/M (Nov-2024) — ~1000×. Epoch AI puts the per-task decline at a median 50×/year, accelerating to ~200×/year on post-Jan-2024 data.
Concretely: emitting ~500 typed properties for one entity (~5K output tokens) now costs ~$0.004–0.04 at floor rates (DeepSeek V4-Pro $0.44/$0.87 per M; Gemini 2.5 Flash $0.30/$2.50). A full 1M-resume corpus, fully decomposed, is a $4K–40K line item — not a knowledge-engineering department-decade.
The honest nuance: headline premium prices bifurcated in May-2026 (GPT-5.5 doubled to $2.50/$15) even as cost-per-capability kept falling 5–10×/year. The builder's move is to route bulk emission to the floor model and spend premium reasoning only on the relationship/re-rank steps. Cost is a dial.
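
A back-of-envelope check on the figures above, as a minimal sketch. It counts output tokens only, at the two floor-model output prices quoted; input tokens, retries, and premium routing are ignored (our simplifying assumptions), which is part of why the quoted range reaches higher. The function and constant names are illustrative.

```python
# Output-token cost only. Input tokens, retries and premium routing are ignored
# (simplifying assumptions), so these are lower-end estimates.
OUTPUT_PRICE_PER_M = {          # USD per 1M output tokens, as quoted in the text
    "DeepSeek V4-Pro": 0.87,
    "Gemini 2.5 Flash": 2.50,
}

TOKENS_PER_ENTITY = 5_000       # ~500 typed properties, per the text
CORPUS_SIZE = 1_000_000         # the 1M-resume corpus


def emission_cost(output_tokens, price_per_million):
    """Cost in USD of emitting `output_tokens` at a given output price."""
    return output_tokens * price_per_million / 1_000_000


for model, price in OUTPUT_PRICE_PER_M.items():
    per_entity = emission_cost(TOKENS_PER_ENTITY, price)
    per_corpus = per_entity * CORPUS_SIZE
    print(f"{model}: ${per_entity:.5f} per entity, ${per_corpus:,.0f} per corpus")
```

Both per-corpus results land inside the $4K–40K range, and routing changes the answer by roughly 3× on output price alone. That is the sense in which cost is a dial.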

What's newly possible (the outside-the-box part):

Point a guided LLM at one entity → a fan of dozens-to-hundreds of typed claims plus freshly-invented predicates, not a fixed feature row.
Build a web-scale graph with no ontology authored first — the predicate set grows itself.
Hold 100% of the firehose instead of paying the collapse tax: prior LLM-KG pipelines need ~70% entity reduction (DEG-RAG) just to stay usable in a collapsing store. A paraconsistent substrate keeps all of it.
Discover relationships at the intersections of many decompositions — skill adjacency, hidden career paths, candidate↔︎role bridges, drug↔︎disease repurposings — at ~$0.0001/claim. Machine serendipity, now with a per-claim cost.
Close the loop with built-in ground truth: got-interviewed/hired, lab-validated, retained — live labels that continuously re-weight which emitted properties actually predict outcomes.

3. The generative-abundance engine

This is the heart of the design, and it carries the founder's late correction as a headline principle: emit free and untyped now; join, type, and align later.

The headline design principle: emit free / untyped now, align at query time

The earlier reports made two mistakes we now correct. The "lens engine" report fixed a rigid lens taxonomy. Iteration 3 required emissions to be pre-typed claims conforming to a schema at generation time. Both cap the abundance. The model's superpower is precisely that it will mint its own predicates and its own directions faster and broader than any taxonomy we could author. So we let it.

Free / untyped / self-invented predicates = YES. Structured-as-an-evidence-anchored-claim = YES. Rigid-schema-at-write-time = NO.

The model emits freely, abundantly, in any direction, coining predicates as it goes — even un-normalized, even near-duplicate (hasHobby vs hobbies vs interests). The hard problems of typing, alignment, identity-resolution, and joining are all DEFERRED to query time. This is not a compromise we tolerate; it is donto's native strength. A bitemporal, paraconsistent quad store with query-time entity-resolution lenses is built to defer these decisions — to hold every variant and resolve per-use, non-destructively, reversibly.

The live proof point, reframed: the store already holds ~938,000 distinct predicates. The pre-LLM literature would call that a "predicate proliferation problem." We call it the signature of free, abundant emission — a feature. Predicate alignment is a typed, scoped, query-time operation (e.g. align the jobs context's predicates to ESCO/O*NET/Lightcast when you query the jobs context), not a write-time gate. Identity is a hypothesis resolved at query time, not a merge committed at ingest.
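
To make "alignment is a query-time operation" concrete, here is a toy sketch. It is not donto's API: the claim tuples, the lens dictionary, and the query function are all hypothetical names. The point it illustrates is that stored claims are never rewritten; the lens is only a lookup applied on read.

```python
# Stored claims keep whatever predicate the model coined. Nothing is normalised
# at write time. (Toy data; identifiers are illustrative.)
CLAIMS = [
    ("person:1", "hasHobby", "chess"),
    ("person:1", "hobbies", "climbing"),
    ("person:2", "interests", "chess"),
    ("person:2", "hasArtStyle", "cubism"),
]

# Hypothetical alignment lens for one context: raw predicate -> canonical name.
HOBBY_LENS = {"hasHobby": "hobby", "hobbies": "hobby", "interests": "hobby"}


def query(claims, lens, canonical):
    """Return claims whose raw predicate maps to `canonical` under `lens`."""
    return [c for c in claims if lens.get(c[1]) == canonical]


aligned = query(CLAIMS, HOBBY_LENS, "hobby")
for subject, raw_predicate, obj in aligned:
    print(subject, raw_predicate, obj)
```

Swapping in a different lens changes the answer without touching the stored rows, which is what makes the alignment reversible.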

The only write-time invariant (lightweight, and it preserves everything)

Each emission must be a CLAIM — some subject / some (possibly brand-new) predicate / some object — anchored to its source evidence. That's it. Not a schema. Not a type. Not a canonical entity. Just: a triple-shaped assertion that points back at what produced it.

This single invariant is what keeps abundance joinable and falsifiable later. Because every claim is shaped as subject/predicate/object, it can be joined at query time even if the predicate was coined a millisecond ago. Because every claim carries an evidence anchor, you can always ask the measurable question — "did this emitted property predict the held-out outcome?" — for any predicate, including one the model just invented. Measurability survives free emission.
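
A minimal sketch of that invariant as a data structure. The class, field, and function names are our illustration rather than donto's actual types; the only thing checked is that the claim is triple-shaped and points back at its source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record: a triple plus an evidence anchor."""
    subject: str
    predicate: str                  # may be brand-new; never checked against a schema
    obj: str
    evidence: str                   # pointer to the source span that produced the claim
    status: str = "hypothesis_only"


def check_invariant(claim):
    """The only write-time check: all four parts are present and non-empty."""
    parts = (claim.subject, claim.predicate, claim.obj, claim.evidence)
    return all(part.strip() for part in parts)


anchored = Claim("person:1", "hobbies", "chess", evidence="resume.json#/interests/0")
unanchored = Claim("person:1", "hasHobby", "chess", evidence="")
print(check_invariant(anchored), check_invariant(unanchored))
```

Note what is absent: no predicate whitelist, no type check, no canonical-entity lookup. A freshly coined predicate passes as long as the evidence anchor is there.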

The directions to emit along (a starting set, not a cage)

We keep the good part of "lenses" — that a lens is a direction of decomposition that changes the graph — but the model generates the directions and the predicates itself. Seed it with a standing set and an open invitation:

attributes · parts/composition · functions/affordances · causes/effects · counterfactuals · comparisons/analogies · temporal trajectory · context/provenance
+ an open "invent new axes" lens: "what other true, evidence-supported properties of this entity matter that I haven't named yet?" — this is where historicalSignificance and skill-adjacency bridges come from.

Each result is written as a hypothesis_only claim, with the lens recorded as provenance so we can later measure which directions pay. With 1M-token context, do whole-corpus single-pass decomposition (a person's full resume history + a job family + the relevant ESCO subtree in one pass) so cross-entity properties are generated jointly, not stitched from chunks.

Target: ≥30 typed claims/entity at ≥90% human-judged faithfulness on a 200-entity audit, with the predicate inventory per ctx:* treated as a first-class growing object — new predicates land as hypotheses, get aligned at re-rank time, and are kept only if they show downstream lift.

4. The substrate as a possibility-space

Abundance needs a home that doesn't punish it. The standard storage targets are structurally hostile to it — and that hostility is measurable:

Vector DBs collapse meaning to one embedding. When the LLM emits a fact that conflicts with a stored one, the closer vector wins and the other is silently lost. (Mem0 demonstrably returns the stale-but-embedding-closer address on LoCoMo.) Recall is capped by a dedup threshold, not by generation.
Normal KGs and even 2025's best agent-memory graphs enforce single-truth. Zep/Graphiti uses an LLM to detect contradicting edges and invalidate the overlapping one. Mem0's update step overwrites. Every collapsing store destroys minority and speculative claims at write time — exactly the raw material of discovery — and LLM-built KGs need ~70% entity reduction (DEG-RAG) just to be usable.

donto inverts these failure modes into assets:

Property	Collapsing store	donto
Recall ceiling	set by dedup threshold	set by generation — nothing emitted is lost
Contradiction	invalidate-on-conflict (loser deleted)	held forever as legal bitemporal state, with supports/rebuts/undercuts edges
Audit	opaque embedding / canonical merge	100% provenance + counter-evidence per claim
New evidence	overwrite	re-rank old hypotheses (bitemporal) — compounding, not resetting
Identity	merged at write (destructive)	hypothesis resolved at query time (non-destructive, reversible)
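
The donto column can be sketched as an append-only store. This is a toy model under stated assumptions: it tracks transaction time only (valid time is omitted for brevity), and the Statement and Store classes are hypothetical, not donto's implementation.

```python
from dataclasses import dataclass, field

EDGE_KINDS = {"supports", "rebuts", "undercuts"}


@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    obj: str
    evidence: str
    tx_time: int          # when the store learned it (transaction time)


@dataclass
class Store:
    statements: list = field(default_factory=list)
    edges: list = field(default_factory=list)    # (from_index, kind, to_index)

    def assert_claim(self, stmt):
        """Append-only: no dedup, no overwrite, no winner picked."""
        self.statements.append(stmt)
        return len(self.statements) - 1

    def link(self, src, kind, dst):
        """Record a typed argument edge between two held statements."""
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        self.edges.append((src, kind, dst))

    def as_of(self, tx_time):
        """Everything the store held at or before `tx_time`."""
        return [s for s in self.statements if s.tx_time <= tx_time]


store = Store()
by_title = store.assert_claim(Statement("person:1", "seniority", "senior", "resume#title", 1))
by_tenure = store.assert_claim(Statement("person:1", "seniority", "junior", "resume#tenure", 2))
store.link(by_tenure, "rebuts", by_title)
print(len(store.statements), len(store.as_of(1)))
```

Both seniority readings stay retrievable, the conflict is itself a queryable edge, and as_of recovers what was believed before the second claim arrived.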

And there's a theorem behind the moat. Model collapse — the nightmare of a self-feeding knowledge base — happens only under replacement (train on synthetic, discard real → error grows ~linearly). Under accumulation (keep all real + synthetic forever) error is provably bounded, independent of iteration count (Gerstgrasser et al. 2024). donto is an accumulation system by construction: bitemporal, never overwrites, never dedups, every claim keeps provenance and counter-evidence. The field's converged collapse-avoidance recipe — accumulate + verify/curate — is donto's claim lifecycle. The contradiction-preserving substrate isn't merely compatible with generative abundance; it is the provably-safe container for it.
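
The shape of that result can be illustrated numerically. As we understand the linear-regression setting analysed in that line of work, test error after n generations carries a factor of n under replacement and a factor of 1 + 1/4 + 1/9 + ... + 1/n² under accumulation, which never exceeds π²/6. The sketch below computes only those two factors; the shared constant prefactor is omitted, and the function names are ours.

```python
import math


def replace_factor(n):
    """Error multiplier after n generations when each round discards prior data."""
    return float(n)


def accumulate_factor(n):
    """Error multiplier after n generations when all data is kept: sum of 1/i^2."""
    return sum(1.0 / (i * i) for i in range(1, n + 1))


BOUND = math.pi ** 2 / 6    # limit of the accumulate series

for n in (1, 10, 100, 1000):
    print(f"n={n:>4}  replace={replace_factor(n):8.1f}  "
          f"accumulate={accumulate_factor(n):.4f}  bound={BOUND:.4f}")
```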

A knowledge base that grows in all directions and prunes by reality. The only architecture where "generate everything, keep everything, let evidence decide what survives" is even expressible.

And it stays usable: donto's POST /search already ranks across the full 39.3M-statement substrate in 270–820ms, even when the query includes stopwords. Abundance is only valuable if it stays queryable, and that part is built.

5. Measurement as the steering wheel

Once generation is free, the scarce resource becomes knowing which emissions are worth keeping — and accuracy alone is the wrong yardstick (a true-but-redundant fact has ~zero value). We keep all the measurable rigor and point it forward: measurement is not a backward judge, it's the active steering wheel that decides what to generate next.

Four measurable signals, each a real 2024–2026 result, become a per-claim value score:

Downstream task-lift (the gold metric). Does adding the structure move a real metric? Reported GraphRAG results include a 72–83% comprehensiveness win rate over vector RAG, +12.8 QA points from better-built graphs, and 3.4× accuracy in enterprise scenarios. Rule: every kept property class must show measurable lift on a held-out task. The flagship has this built in: got-interviewed / hired.
Information gain / Bayesian surprise. How much a claim shifts the posterior over an entity — which doubles as the steering signal. BED-LLM / Uncertainty-of-Thoughts choose what to generate next by maximizing expected information gain. Build an EIG-ordered "what to generate next" queue — turn the firehose into a guided drill.
Novelty & diversity. AI-generated research ideas were rated more novel than ideas from 100+ expert human researchers (p<0.05, Stanford), but showed lower diversity, so we measure and optimize diversity explicitly (distinct-n, embedding-distance, NovelSum). Multi-lens decomposition + diversity-aware sampling makes "serendipity that accumulates" actually span the space.
Coverage / completeness. MINE (Feb-2025) scores how faithfully a KG represents source text; KGGen beat OpenIE/GraphRAG by 18%. Make a MINE-style coverage score a CI metric — regression-test that new prompts/models don't drop coverage.
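
One way the per-claim signals could combine into a value score, as a sketch. We assume each signal is already normalised to [0, 1] and use a plain product, so a zero on any axis zeroes the claim. Coverage is a corpus-level metric and is left out. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredClaim:
    claim_id: str
    info_gain: float    # posterior shift, assumed normalised to [0, 1]
    novelty: float      # distance from already-kept claims, assumed in [0, 1]
    lift: float         # measured downstream task-lift, assumed in [0, 1]

    @property
    def value(self):
        """Product form: a claim must score on every axis to rank at all."""
        return self.info_gain * self.novelty * self.lift


def rank(claims):
    """Highest-value claims first."""
    return sorted(claims, key=lambda c: c.value, reverse=True)


redundant = ScoredClaim("true-but-redundant", info_gain=0.9, novelty=0.0, lift=0.8)
useful = ScoredClaim("novel-and-predictive", info_gain=0.6, novelty=0.7, lift=0.5)
ranked = rank([redundant, useful])
print([c.claim_id for c in ranked])
```

The redundant claim sinks to the bottom despite being accurate and informative in isolation, which is the behaviour the "accuracy alone is the wrong yardstick" argument asks for.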

Faithfulness is an engineerable knob, not a fixed property. VeriFY cuts factual hallucination by 9.7–53.3% at a cost of only 0.4–5.7% recall loss via structured verification traces; Claimify extracts atomic claims at 96.7% precision / 87.6% coverage. Wire VeriFY-style traces into extraction as a first filter; prefer external grounding (source-anchored evidence, the hire signal) over self-consistency as the arbiter.

The synthetic-data regime rule (so abundance doesn't drown reality): weight generated claims down where dense real evidence exists, up for sparse entities. A simple per-entity evidence-count feature implements it.
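
A sketch of what such a feature could look like. The functional form and the half-weight constant k are our assumptions, chosen only so the weight is 1.0 for an entity with no real evidence and decays smoothly as real evidence accumulates.

```python
def generated_claim_weight(real_evidence_count, k=5.0):
    """Weight in (0, 1] applied to generated claims about one entity.

    `k` is a hypothetical tuning constant: the evidence count at which
    generated claims are weighted at one half.
    """
    if real_evidence_count < 0:
        raise ValueError("evidence count cannot be negative")
    return k / (k + real_evidence_count)


for count in (0, 5, 50, 500):
    print(count, round(generated_claim_weight(count), 3))
```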

Targets to commit to: kept-claim set delivers ≥90% of full-firehose task-lift at <20% of the claims; per-entity generation <$0.05 for ~500 properties; <30% of total spend on premium-tier reasoning; measured downstream task-lift flat-or-up across N self-ingestion cycles (the accumulation guarantee, verified empirically).

6. The claim lifecycle (the product spine)

The 8-step loop, now fed by abundance and decoupled into two layers by the hypothesis_only flag — maximize at extraction, gate at the relationship layer.

#	Step	What happens	Layer / gate
1	Ingest	Pull any source (resume, paper, deed, posting)	—
2	Emit (free)	LLM emits unbounded claims along many directions, inventing predicates & axes; only invariant = subject/predicate/object + evidence anchor	Extraction — emit everything, gate nothing
3	Hold incompatible	Contradictory claims stored side-by-side as legal bitemporal state	Paraconsistent hold
4	Generate relationship hypotheses	LLM proposes typed relationships at the intersections of decompositions (the unbuilt core)	Relationship layer
5	Attach evidence ± counter-evidence	supports / rebuts / undercuts argument edges; 100% provenance	Promotion gate
6	Rank	Score by value (info-gain × novelty × downstream-lift), not accuracy alone	Relationship layer
7	Re-rank (bitemporal)	New evidence (e.g. hired) re-scores old hypotheses — compounding	Standing job
8	Explain	LLM explains only what evidence already supports (faithful-by-construction)	Lean-4-checkable shape

The waterline: abundance lives below it as hypothesis_only; reality pulls a vanishing fraction above it. A claim earns .candidate → .proved only via independent supports-edges, with a precision target ≥0.95 (Claimify's bar) and 100% provenance completeness as a CI assertion. The gate, not the firehose, is the contract.
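
The gate can be sketched as a small function. We read "independent" as "backed by a source other than the one that produced the claim", and we assume one such source earns candidate while two earn proved. Both the reading and the thresholds are ours, not a documented donto rule.

```python
def promote(claim_source, support_sources, min_independent=2):
    """Return the status a claim earns from its supports edges.

    `support_sources` lists the source id behind each supports edge.
    Self-support and repeated sources do not count twice.
    """
    independent = {s for s in support_sources if s != claim_source}
    if len(independent) >= min_independent:
        return "proved"
    if independent:
        return "candidate"
    return "hypothesis_only"


print(promote("resume:1", ["resume:1"]))
print(promote("resume:1", ["resume:1", "interview:9"]))
print(promote("resume:1", ["interview:9", "reference:3"]))
```

A claim supported only by the document that emitted it stays below the waterline no matter how many times that document repeats it.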

7. Flagship: jsonresume → jobs

The cleanest proving ground, because abundance here has built-in, falsifiable ground truth: got-interviewed / hired / retained. The founder runs jsonresume.org — the corpus and the network are real.

Why the bottleneck is genuinely gone here. Most required skills in a posting are expressed implicitly, never as keywords — and zero-shot GPT-4 ESCO matching beat the entire prior supervised SOTA by +22.33 / +29.75 pp RP@10 (arXiv:2307.03539). A resume's real competency set is often 3–10× larger than its listed skills. And abundance improves ranking, not just recall: ConFit v3 adds +7.81 pp nDCG@10 over the strongest embedding baseline on a 49K-resume set; the explainable Synapse reports +22% nDCG@10 over embedding-only retrieval.

The abundance extractor (per resume). A guided multi-lens pass emits typed, evidence-anchored claims across ~10 directions — explicit skills · inferred/implicit skills (Docker + Terraform 3yr ⇒ Kubernetes-ready) · seniority-from-trajectory · transferable BRIDGES (competitive StarCraft ⇒ real-time resource allocation; client-facing ops 6yr ⇒ stakeholder-management + incident-comms) · working-style signals · latent traits (low-confidence, opt-in) · identity variants — each anchored to ESCO/O*NET/Lightcast with an evidence_link to the resume span. Per job: required / nice-to-have / implied competencies. Match = relationship discovery at the lens intersections — surfacing the candidate who never held the title, as an auditable evidence chain.

The architectural win, validated externally. 2026's best explainable matcher, JobMatchAI, wins by strictly separating a deterministic scoring layer from a generative explanation layer — "the LLM can explain a ranking but never inflate one" — yielding 100% faithful / 0% unsupported claims / 94.5% weakness-surfacing at 82ms. That is exactly donto's split: emit the firehose, hold contradictions (senior per title vs junior per tenure live side-by-side), gate at ranking with deterministic auditable utility, and let the LLM explain only what evidence supports — enforced as a Lean-4-checkable shape. A vector DB must collapse to one embedding; a normal KG must dedup to one canonical skill. donto is the only home that keeps "claims Kubernetes (inferred)" alongside "no direct Kubernetes evidence" and re-ranks when the interview outcome lands.

Network-effect discovery. Skill-adjacency and career-path edges (RELATED_TO, IMPLIES, NEXT_ROLE, OBSOLETED_BY) accumulate across all resumes into a population skill-graph — the "serendipity that accumulates," grounded to taxonomy (KARRIEREWEGE+: 100K resumes → 3,039 ESCO occupations, MRR 43.58). De-conflate preference vs qualification as two separate claim contexts (ctx:jobs/preference/* vs ctx:jobs/qualification/*) — "wants executive role" and "qualified for executive role" coexist without collapse.

The competitive wedge. LinkedIn (800M members; required skills projected to double by 2027) and Lightcast (32K+ skills from 1B+ postings) prove the demand and the moat — but they are closed and embedding-collapsed. An open jsonresume claim-substrate, anchored to the official ESCO↔︎O*NET crosswalk, does the one thing incumbents cannot: expose why a non-obvious candidate fits, as a checkable evidence chain — compliance-grade, and continuously self-calibrated by hire outcomes. Fairness is measurable and no worse than the status quo (GPT-4 shows no larger demographic group differences than humans across 736 real submissions). Full-firehose extraction over millions of resumes is an O($1K–10K) line item ($72–$9,000 per 1M docs; batch API −50%).

Falsifiable first milestone: extract abundance-claims for a held-out jsonresume cohort, match vs an embedding-only baseline → target ConFit-v3-class lift (+7–8 pp nDCG@10) with 100%-faithful explanations, and inferred-claim precision ≥ explicit-claim precision within 2 outcome cycles. Post to the public TalentCLEF 2025 leaderboard (best title-match MAP 0.534; title→skill MAP 0.360).
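
For reference, the target metric can be computed as follows. This is the standard nDCG formulation with a log2 discount, with the ideal ordering taken from the same judged list; the two example rankings are made up purely to show the calculation.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k positions."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalised by the DCG of the ideal ordering of the same list."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Illustrative relevance labels per ranked position (1 = got interviewed).
embedding_only = [0, 1, 0, 0, 1]
abundance_claims = [1, 1, 0, 0, 0]

lift_pp = 100 * (ndcg_at_k(abundance_claims) - ndcg_at_k(embedding_only))
print(f"nDCG@10 lift: {lift_pp:.2f} pp")
```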

8. Ten example projects

Each makes a different donto property load-bearing, and each names its abundance angle.

jsonresume → jobs (flagship). Property: identity-as-hypothesis + bitemporal re-rank. Abundance: 3–10× the listed skills, plus latent traits & cross-domain bridges. Ground truth: got-interviewed/hired. (Built-in falsifiability.)
Drug repurposing ledger. Property: paraconsistent hold + Elo re-ranking. Abundance: emit mechanism / pathway / contraindication claims per compound; propose drug↔︎disease bridges at lens intersections. Touchstone: Robin found ripasudil for dry-AMD in 2.5 months — but each run starts fresh; donto gives it a persistent, compounding ledger. Metric: serendipity hit-rate rising as evidence accumulates (frontier ceiling today <13%).
Rare-disease differential dx. Property: hold many competing hypotheses + reality-anchored re-rank. Abundance: emit symptom/gene/phenotype claims, hold all differentials. Touchstone: agentic + KG retrieval lifted Top-5 +17%, recall to 41.4%. Metric: held-out diagnosis accuracy vs single-shot.
Science-integrity / contradiction-rank. Property: contradiction as a queryable signal (only possible because both sides are retained). Abundance: emit every claim a paper makes; rank papers/authors by internal inconsistency. Touchstone: "Mining the Mind" — models contradict themselves by framing. Metric: flagged-inconsistency precision vs human audit.
Genealogy (genes). Property: identity-as-hypothesis + paraconsistent witness-attestation. Abundance: emit kinship/event/place claims per source; hold rival readings (e.g. competing parentage) as separate evidence-bearing claims. Metric: 100% provenance completeness as a hard gate (today many kinship triples have empty evidence_links — abundance amplifies this; the gate fixes it).
AML / financial-crime triage. Property: audit-grade provenance + supports/rebuts separation. Abundance: emit entity-relationship & risk-signal claims; resolve identity (sanctions name-variants) at query time. Touchstone: multi-agent ER hits 94.3% on name-variation; AML frameworks require explicit citations. Metric: alert precision with full evidence chain.
Legal claim-graph. Property: typed argument edges (supports/rebuts/undercuts) + Lean-4-certified shapes. Abundance: emit holdings / facts / obligations per document; build argument trees. Metric: 0% unsupported claims in generated briefs (JobMatchAI-style faithfulness as a checkable shape).
Open serendipity engine for R&D. Property: cross-entity relationship generator + accumulation. Abundance: decompose two entities along dozens of axes each, surface relationships at the crossings. Touchstone: SciAgents built a 33K-node graph from ~1000 papers — but ephemeral; donto banks it. Metric: novel adjacency edges/month with downstream support above chance.
OSINT entity-resolution. Property: query-time entity-resolution lenses, non-destructive merge. Abundance: emit alias / affiliation / event claims; keep all variants as competing claims. Touchstone: rule systems over-match, LLMs fail on transliteration/dates — complementary failure modes donto reconciles by holding both. Metric: resolution F1 vs single-strategy baselines.
Self-improving extraction (the loop). Property: accumulation guarantee (Gerstgrasser-bounded). Abundance: claims that survive evidence-anchored re-ranking become high-quality curation signal (the phi-1 / STaR pattern), fed back to improve the extractor. Metric: downstream task-lift flat-or-up across N self-ingestion cycles.

9. First milestones & the numbers that prove it

Falsifiable, baseline-anchored, and chosen so each one proves the substrate earns its place over the obvious cheaper alternative.

Milestone 1 — Abundance extractor works. Per entity, ≥30 typed claims at ≥90% human-judged faithfulness (200-entity audit), >95% syntactically/ontologically valid. Baseline to beat: donto's own live 483-valid-facts-from-one-sentence. Dashboard: facts/min, valid-fact %, downstream-lift-per-fact.

Milestone 2 — The collapse-delta (the headline proof). Ingest the same LLM firehose into (a) donto, (b) a vector DB, (c) a Graphiti-style invalidate-on-conflict graph. Measure % of emitted minority/contradictory claims still retrievable. Target: donto 100% retained vs measurable loss in both collapsing stores (Mem0 demonstrably loses to stale-but-closer). This is the single benchmark that justifies the substrate.
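
The measurement itself is simple. The sketch below defines the retention fraction and runs it against two toy stores: one that keeps everything and one that overwrites on a (subject, predicate) key. Neither toy stands in for a real product; they only show why a write-time winner rule lowers the number.

```python
# (claim_id, subject, predicate, object); c1 and c2 contradict each other.
CLAIMS = [
    ("c1", "person:1", "seniority", "senior"),
    ("c2", "person:1", "seniority", "junior"),
    ("c3", "person:1", "skill", "Kubernetes"),
]


def keep_everything(claims):
    """Toy accumulating store: every emitted claim stays retrievable."""
    return {cid for cid, _subj, _pred, _obj in claims}


def last_write_wins(claims):
    """Toy collapsing store: one surviving claim per (subject, predicate)."""
    winners = {}
    for cid, subj, pred, _obj in claims:
        winners[(subj, pred)] = cid      # the later claim overwrites the earlier one
    return set(winners.values())


def retention(emitted_ids, retrievable_ids):
    """Fraction of emitted claims that can still be retrieved."""
    emitted = set(emitted_ids)
    if not emitted:
        raise ValueError("no claims emitted")
    return len(emitted & set(retrievable_ids)) / len(emitted)


emitted = [cid for cid, *_ in CLAIMS]
print(retention(emitted, keep_everything(CLAIMS)))
print(retention(emitted, last_write_wins(CLAIMS)))
```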

Milestone 3 — Flagship held-out matching. jsonresume cohort, abundance-claims vs embedding-only. Baselines: plain frontier LLM, vector cosine, KG link-prediction. Targets: +7–8 pp nDCG@10 (ConFit-v3 class), hidden-candidate recall up, 100%-faithful explanations, and skill-linking RP@10 ≥ +22 pts over distant supervision.

Milestone 4 — Discovery via time-slicing (the compounding test). Re-run the same query a week apart and show a previously low-ranked hypothesis rise on new evidence — a capability no shipped in-memory discovery agent has (Co-Scientist, Robin, SciAgents all start fresh). Falsifiable: a handful of got-interviewed/hired events visibly move rankings; report calibration.
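
A toy version of the time-slice test, assuming hypothesis scores are a running sum of evidence deltas stamped with the time they were recorded. The event list, identifiers, and additive scoring are illustrative assumptions, not donto's ranking function.

```python
# (tx_day, hypothesis_id, score_delta); all values are made up for illustration.
EVENTS = [
    (1, "h:obvious-candidate", 1.0),
    (1, "h:hidden-candidate", 0.2),
    (8, "h:hidden-candidate", 1.5),    # e.g. a got-interviewed outcome lands
]


def ranking_as_of(events, day):
    """Rank hypotheses using only the evidence recorded on or before `day`."""
    scores = {}
    for tx_day, hypothesis, delta in events:
        if tx_day <= day:
            scores[hypothesis] = scores.get(hypothesis, 0.0) + delta
    return sorted(scores, key=scores.get, reverse=True)


print(ranking_as_of(EVENTS, 1))
print(ranking_as_of(EVENTS, 8))
```

Because nothing is overwritten, the day-1 ranking remains reproducible after the day-8 evidence arrives, so the rise of the buried hypothesis can be shown rather than asserted.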

Milestone 5 — The gate beats keep-everything. Prove the per-claim value score earns its keep: kept-claim set delivers ≥90% of full-firehose task-lift at <20% of the claims, at <$0.05/entity and <30% premium spend.
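
These four thresholds are simple enough to run as a CI assertion. A sketch with a hypothetical function name and argument names; the thresholds themselves are the ones stated in the milestone.

```python
def gate_earns_its_keep(kept_lift, full_lift, kept_claims, total_claims,
                        cost_per_entity_usd, premium_spend_share):
    """True only if all four Milestone 5 targets hold at once."""
    return (
        kept_lift >= 0.90 * full_lift             # >=90% of full-firehose task-lift
        and kept_claims < 0.20 * total_claims     # at <20% of the claims
        and cost_per_entity_usd < 0.05            # <$0.05 per entity
        and premium_spend_share < 0.30            # <30% premium-tier spend
    )


# Illustrative numbers only.
print(gate_earns_its_keep(7.4, 8.0, 150_000, 1_000_000, 0.012, 0.22))
print(gate_earns_its_keep(7.4, 8.0, 250_000, 1_000_000, 0.012, 0.22))
```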

The one metric that proves the substrate earns its place: the collapse-delta combined with the time-slicing compounding test. A vector DB or normal KG can match donto on a single-shot match. Only donto can retain 100% of contradictory abundance AND show a buried hypothesis surface as reality arrives. That pair is the moat.

10. The horizon

Picture donto in ten years as a self-densifying knowledge organism. Generation is free and unbounded, so the substrate grows in every direction at once — every entity decomposed along dozens of axes, every axis the model can invent, every claim anchored to its source and held even when it contradicts its neighbor. Reality flows in continuously as evidence — interviews and hires, lab results, court rulings, new documents — and the bitemporal re-rank lets that reality prune the firehose without ever deleting the record of what was once believed. Old hypotheses rise and fall on new evidence. The base compounds. It never resets.

"Understand everything in extreme detail" becomes coherent and measurable: not a slogan but a curve — properties-per-entity climbing, coverage scores climbing, downstream task-lift climbing across self-ingestion cycles (bounded, by theorem, so it never collapses). The accumulation guarantee means this organism can safely feed on its own and other models' output forever. The cross-entity relationship generator — today's unbuilt core — becomes a standing property of the substrate: a co-scientist that never forgets, mining the intersections of a graph that grows in all directions, banking the serendipity instead of recomputing it each run.

The flagship is the proof that this isn't romance. jsonresume → jobs has a real network, a real corpus, and a built-in reality check, so the abundance thesis gets scored every time someone gets an interview. Win there — explainable, evidence-anchored matches that beat opaque incumbents and tell you why — and the same loop transfers to medicine, law, science, and lineage. The same machine that finds the hidden candidate finds the repurposed drug, the overlooked precedent, the missing ancestor, the contradicted result. A knowledge base that grows in all directions and prunes by reality — and gets measurably, compoundingly smarter every day it runs.

11. The honest edges

Real constraints, framed as engineering targets with numbers — not death-knells.

Per-claim precision isn't free even when breadth is. GPT-5 hits only 0.616 F1 on ChemProt RE; GPTKB v1.5 accuracy is below prior benchmarks. Target: gate at the relationship layer; ≥90% faithfulness on audited claims while keeping 100% of breadth as hypothesis_only.
The cross-entity relationship generator is the unbuilt core. Per-entity emission is solved; generating + ranking relationships across entities at scale is the frontier. Target: a pairwise/cluster generator + promotion gate at precision@10 >0.4 on a gold set, exploiting 1M-token joint passes.
Vocabulary sprawl from invented predicates (hasHobby/hobbies/interests). Target: query-time alignment to ESCO/O*NET/Lightcast; prune only predicates with zero downstream lift — keep the rest as the signature of abundance.
Cost & storage scale with the firehose (10× lenses = real money + real rows on a 39.5M-stmt box). Target: cost-per-promoted-claim as the unit economic; tier hypothesis_only storage cheaply, keep promoted claims hot; extend bounded-candidate query patterns so worst-case latency stays sub-second.
Ground truth is delayed, sparse, and biased (only some matches get hire signals; the funnel itself is biased). Target: treat outcomes as missing-not-at-random in re-ranking; use the 736-resume fairness result as a floor (no worse than humans) and measure group-differential outcomes continuously.