# Does donto Work? — 105 Queries Against an Abundance-Extracted Knowledge Graph

**An empirical stress-test of the donto substrate after abundance extraction: 21 analytical lenses, 105 queries, with full results and an honest good/bad analysis. 2026-06-03.**

> **What this is.** We extracted ~12,500 evidence-anchored RDF-style facts from 7 frontier-conflict event documents (Univ. of Newcastle *Colonial Frontier Massacres* dataset) into donto — a paraconsistent, evidence-first knowledge graph — using abundance extraction (emit free/untyped, ~1,000–2,300 facts and ~1,300 predicates *per event*, defer typing/alignment to query time). Then we asked: **does it actually work, and is the abundance useful or just noise?** We ran **105 distinct queries across 21 lenses** (a 22-agent workflow), verified the strongest claims by hand, and a final judge synthesized the verdict. Every query, its result, and the analysis are below. Companion reports: the extraction-system report and the abundance vision, same `/research/` dir.

## Method

- **Corpus:** ~12,500 live statements, **6,111 distinct predicates**, 725 entity-subjects across **10 contexts** (7 distinct events + 3 re-extractions of event 10690 / 11176). One `donto_statement` table (~39M rows total); queries scoped by indexed `context` equality, live = `upper(tx_time) IS NULL`.
- **Queries:** **105 queries** across **21 lenses**, each lens an independent agent running ~5 queries and capturing the actual rows.
- **Headline pass rate:** **103/105 queries returned real, on-point rows (98%)**; **56** flagged genuinely surprising.
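
The scoping discipline described in the corpus bullet (context equality plus a liveness filter) can be sketched as follows. This is a minimal illustration on an in-memory SQLite stand-in, not the production schema: the real store is PostgreSQL, where liveness is `upper(tx_time) IS NULL` on a range column. Here a hypothetical `tx_hi` column plays the role of that upper bound, and the context names, sample rows, and every column other than `context` and `object_lit` are assumptions.

```python
import sqlite3

# Stand-in for donto_statement. In the real store tx_time is a PostgreSQL
# range and "live" means upper(tx_time) IS NULL; tx_hi below is a
# hypothetical substitute for that upper bound.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE donto_statement ("
    " context TEXT, subject TEXT, predicate TEXT, object_lit TEXT, tx_hi TEXT)"
)
conn.execute("CREATE INDEX idx_stmt_context ON donto_statement (context)")
conn.executemany(
    "INSERT INTO donto_statement VALUES (?, ?, ?, ?, ?)",
    [
        ("ctx:a", "ex:event-1", "dateDisputed", "true", None),            # live
        ("ctx:a", "ex:event-1", "countLowerBound", "100", None),          # live
        ("ctx:a", "ex:event-1", "countLowerBound", "90", "2026-01-01"),   # retracted
        ("ctx:b", "ex:event-2", "dateDisputed", "true", None),            # other context
    ],
)


def live_statements(conn, context):
    """Return live rows for exactly one context (equality, so the index applies)."""
    return conn.execute(
        "SELECT subject, predicate, object_lit FROM donto_statement"
        " WHERE context = ? AND tx_hi IS NULL"
        " ORDER BY predicate",
        (context,),
    ).fetchall()


rows = live_statements(conn, "ctx:a")
print(rows)
```

Scoping by equality rather than a pattern match is the design choice that matters here: it lets the index narrow a 39M-row table to a single event's statements before any predicate filtering happens.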

## Verdict — does donto work?

**YES: donto works as a queryable substrate, and I verified the strongest claims directly rather than trusting the lens self-reports.** Of the 105 scoped queries, 103 returned real, on-point rows in ~1s over a 39M-row table; the context-equality scoping discipline is the thing that makes it usable. The corpus is genuinely a knowledge graph, not a text blob: 12,589 live statements / 6,111 distinct predicates / 725 subjects across 10 contexts, with real entity IRIs, real cross-event recurrence, and real typed edges.

**The headline differentiator, paraconsistent holding of contradictory claims, is unambiguously real and was the single best thing I confirmed.** For one event subject (ex:attack-wilmot-lagoon-near-mt-larcomb-1855) the graph simultaneously holds killedCountPerMurrayLetter=11, killCountPerCapricornian5Sept1925="over 100", countLowerBound=100, AND countDiscrepancyWith="Murray reports 11 killed and 3 wounded; Boles Jnr reports over 100 killed", plus epistemic hedges (killCountUnclearAssignment, killingsOnlyInRetrospectiveAccount, murrayLetterOnlyDescribesRecoveryNotKillings). A vector store would have collapsed these to one chunk; a normal KG would have invalidated-on-conflict and kept one winner. donto keeps all of it as legal state and lets you query the disagreement itself.

**The big honest caveat: it is a paraconsistent EVIDENCE STORE with WEAK entity resolution, not a clean linked-data graph.** (Evidence ANCHORING, by contrast, is sound; see the correction note below: every claim is span-wired to a retrievable source snippet.) It is excellent for forensic/humanities synthesis by a human or an LLM; it is NOT yet ready for fully automated reasoning without a query-time normalization/dedup layer.
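
The difference between invalidate-on-conflict and paraconsistent holding can be illustrated with a toy model. This is only a sketch of the idea, not donto's implementation: both store functions are hypothetical, and the three claims are the ones quoted in the verdict.

```python
from collections import defaultdict

# Claims about one event subject, as quoted in the verdict.
claims = [
    ("killedCountPerMurrayLetter", "11"),
    ("killCountPerCapricornian5Sept1925", "over 100"),
    ("countLowerBound", "100"),
]


def winner_takes_all(claims):
    """Toy single-valued store: every count claim overwrites one 'killed' slot."""
    store = {}
    for _predicate, value in claims:
        store["killed"] = value  # a later claim silently replaces the earlier one
    return store


def paraconsistent(claims):
    """Toy paraconsistent store: every claim stays live under its own predicate."""
    store = defaultdict(list)
    for predicate, value in claims:
        store[predicate].append(value)
    return dict(store)


print(winner_takes_all(claims))
print(paraconsistent(claims))
```

In the first store the disagreement is unrecoverable after ingestion; in the second it remains queryable state.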

> **Correction (2026-06-03, post-publication).** The original adversarial pass reported its single worst finding as *"evidence links are wired at the weakest tier — all point at `target_run_id`, with `target_span_id=0` and `target_document_id=0`, so the source snippet is not dereferenceable."* **That finding was FALSE — a measurement artifact.** `target_span_id`, `target_run_id`, and `target_document_id` are UUID columns; the original query compared a UUID column against the integer literal `0` (and read a NULL UUID as "0"), which inverted the reading. Re-checked directly against live `donto-pg`: across the corpus's evidence links (15,078 live, all `link_type='extracted_from'`), **every link has `target_span_id` SET, ZERO point at `target_run_id`, and ZERO at `target_document_id`.** Every span carries real `surface_text` plus byte offsets and resolves `span → revision → document`. So evidence anchoring is **SOUND — 100% span-wired and dereferenceable** (`claim → span → revision → document`), exactly meeting donto's `fact → evidence_link → span → revision → blob` promise. The corrected sections below replace the original ones; verification SQL is archived at `donto-align/evidence-link-correction/verify-evidence-link-tier.sql`.
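
A tier count that avoids this class of artifact tests for NULL explicitly instead of comparing against a numeric sentinel. The sketch below uses hypothetical in-memory rows to show the idea; it is not the archived verification SQL, and it does not attempt to reconstruct the original faulty query.

```python
import uuid
from collections import Counter

# Hypothetical evidence-link rows:
# (target_span_id, target_run_id, target_document_id).
# A UUID column holds either a uuid.UUID or None (SQL NULL); there is no "0".
links = [
    (uuid.uuid4(), None, None),
    (uuid.uuid4(), None, None),
    (uuid.uuid4(), None, None),
]


def tier_counts(links):
    """Count how many links set each target column, using explicit None checks."""
    counts = Counter(span=0, run=0, document=0)
    for span_id, run_id, document_id in links:
        counts["span"] += span_id is not None
        counts["run"] += run_id is not None
        counts["document"] += document_id is not None
    return dict(counts)


# A sentinel comparison carries no information: no UUID equals the integer 0,
# and neither does None, so this count is zero regardless of the data.
sentinel_hits = sum(1 for span_id, _, _ in links if span_id == 0)

print(tier_counts(links))
print(sentinel_hits)
```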

## Is abundance useful, or noise?

USEFUL, with a real and quantified cost. The abundance is not noise. I sampled the long tail and it holds: of 6,111 predicates, ~81.7% are singletons, but they are well-formed domain predicates (no stop-words, no debug/temp junk, no parse garbage), each carrying a genuine event-specific fact. The payoff is that abundance lets you ask questions a normal KG/vector store cannot: not just "how many died?" but "how did the reported death count drift across 58 years of sources?" The deep extraction (ctx:test/ingest-verify/10690) literally computed deathCountTrendOverTime="increasing (2 in 1855 to 12 in 1913)", earliestSourcesMoreConservative=true, interSourceDisagreementOnDeathCount="severe (1-12)", deathCountRange="1-12". These are DERIVED meta-facts about the historiography, the kind of thing you would normally pay a historian to notice. That is abundance earning its keep.

BUT the cost is concrete and I measured it:

- Predicate fragmentation: the same concept appears under multiple spellings (rdfType=899 / rdf:type=95 / a space-prefixed ' rdfType'; rdfsLabel=829 / rdfs:label=67), so a consumer MUST normalize at query time or miss data.
- Positional-suffix pseudo-arrays: the LLM emits layer1..layer6, level1..level6, link1..link6, nameVariation1..4, associatedEvent/1/2/3 instead of one multi-valued predicate.
- IN-BAND CROSS-SOURCE ATTRIBUTION: *which* of several sources made a given claim is baked into predicate names (killCountPerCapricornian5Sept1925) and literals ("11 men killed per John Murray letter") rather than a reified claim→source edge, so a cross-source toll join cartesian-products (every toll × every source) instead of joining cleanly. (This is NOT an evidence-anchoring problem: every claim IS span-wired to a retrievable source snippet; see the correction note above. It is specifically about per-source attribution.)

Net: abundance is a feature for a smart consumer that defers typing/alignment to query time (exactly donto's stated thesis), and a liability for a naive one.
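
The query-time normalization that the fragmentation cost calls for can be sketched as below. The rules are illustrative assumptions rather than donto's alignment layer: trim whitespace, fold `prefix:local` into camelCase, and strip a single trailing positional digit. The counts are the ones measured above; the space-prefixed variant is left out of the tally because its count is not reported.

```python
import re
from collections import Counter


def normalize_predicate(predicate):
    """Map spelling variants of one predicate onto a single key.

    Assumed rules: trim whitespace, fold 'rdf:type' into 'rdfType', and strip
    one trailing positional digit ('layer3', 'associatedEvent/2'). The
    lookbehind keeps multi-digit endings such as years intact.
    """
    p = predicate.strip()
    if ":" in p:
        prefix, local = p.split(":", 1)
        p = prefix + local[:1].upper() + local[1:]
    return re.sub(r"(?<!\d)/?\d$", "", p)


observed = {"rdfType": 899, "rdf:type": 95, "rdfsLabel": 829, "rdfs:label": 67}
merged = Counter()
for predicate, count in observed.items():
    merged[normalize_predicate(predicate)] += count

print(dict(merged))  # {'rdfType': 994, 'rdfsLabel': 896}
```

A query that counts only `rdfType` would see 899 of the 994 type statements in this tally. The single-digit rule is deliberately conservative: it would also truncate a legitimate predicate that happens to end in one digit, so a real normalizer would need an allow-list.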

## What's GOOD — where donto shines

- Contradiction preservation — the core thesis is real and load-bearing here. Multiple incompatible casualty counts, dates (Christmas Day 1855 vs after 27 Dec 1855), locations (4 miles vs 15 miles), and event terminologies (massacre/attack/outrage/slaughter, one per source) coexist as live facts with explicit countDiscrepancyWith/dateDiscrepancyWith/eventTerminologyBySource edges. This is the one thing a vector store and a winner-takes-all KG structurally cannot do.
- Derived historiographic meta-facts — deathCountTrendOverTime, sourceDepth ('quinary (based on Leith Hay's...)'), sourceTemporalDistance ('uncertain (58 years later, third person recollection)'), reliabilityRating ('not particularly reliable'). The extractor reasons ABOUT the evidence, not just from it.
- Causal/reprisal chains as graph edges — reprisalFor, provokedReprisal, triggeredBy, subsequentReprisal, plus killedToTriggerDeathRatio='at least 100 killed in reprisal for 5 deaths' and full 6-step causationChainDetailed prose. You can traverse trigger→reprisal→secondary-reprisal across events.
- Cross-event entity recurrence — ex:nmp appears across 5 contexts (127 statements), Frederic Charles Urquhart and Burke Pastoral District across 4, building usable prosopographies and command hierarchies (Captain O'Connell → Lt Murray → Sgt Boles → Constables) from independently-extracted documents.
- Decolonial framing as data — euphemismDecoded ('paid the penalty → killed', 'dispersed → killing'), perpetratorsFramedAs='the avengers' vs victimsFramedAs='the black murderers', colonialJustification, framingBias. Period bias is captured as structured fact rather than editorialized away — directly usable for critical historiography.
- Honest uncertainty — possibleEvent=true, dateDisputed=true, allegedlyUnprovoked, certaintyMarker='confirmed by depositions'/'supposed'. The graph marks unknowns visibly instead of fabricating confidence.

### Best findings (verified)

- VERIFIED: deathCountTrendOverTime = 'increasing (2 in 1855 to 12 in 1913)' on ex:attack-rannes-1855 (ctx:test/ingest-verify/10690), alongside earliestDeathCount=2, latestDeathCount=12, deathCountRange='1-12', earliestSourcesMoreConservative=true, interSourceDisagreementOnDeathCount='severe (1-12)' — the extractor performed longitudinal source analysis and emitted derived meta-facts about how the historical record itself drifted. This is the strongest argument that abundance > a flat KG.
- VERIFIED paraconsistency on one subject: killedCountPerMurrayLetter=11 AND killCountPerCapricornian5Sept1925='over 100' AND countLowerBound=100 AND countDiscrepancyWith='Murray reports 11 killed and 3 wounded; Boles Jnr reports over 100 killed'. All are live, all queryable, none collapsed, and they sit alongside self-aware hedges such as murrayLetterOnlyDescribesRecoveryNotKillings=true and killingsOnlyInRetrospectiveAccount=true.
- VERIFIED dense identity hypotheses capturing real OCR/colonial-renaming ambiguity: maconachie/maconochie/maccronerky → john-mcconachie; kliediewarry/kalidawarry/kalidawarry-waterhole → kaliduwarry; james-bowles-nmp/mr-bol/white-sergeant-nmp → james-boles-nmp. These are exactly the messy variants a frontier-records historian fights, surfaced as queryable likelySameAs/possiblySameAs edges (with confidence grading: possiblySameAs for the uncertain ones).
- VERIFIED structured evidence links EXIST, are dense, AND are anchored at the STRONGEST tier — 15,078 live evidence links across the corpus (all link_type='extracted_from'), and **every one points at a `target_span_id`**, with ZERO at `target_run_id` and ZERO at `target_document_id`. Each span carries real `surface_text` plus start/end byte offsets and resolves cleanly `span → revision → document`, so a claim dereferences to its exact source snippet via a join — donto's `fact → evidence_link → span → revision → blob` promise is met at the span tier, not the run tier. (The original pass reported the opposite — `span=0, doc=0, points at run` — which was a UUID-column-vs-integer-`0` measurement artifact, now corrected.)
- MEASURED the abundance cost: predicate fragmentation is real (rdfType=899 vs rdf:type=95 vs a space-prefixed rdfType; rdfsLabel=829 vs rdfs:label=67) and the LLM fakes arrays with positional suffixes (layer1..6, level1..6, link1..6, nameVariation1..4). A cross-source toll join cartesian-products because tolls and sources both hang off the event subject with no claim→source reification.
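
Why the toll join cartesian-products, and what a reified claim→source shape would change, can be shown with a small sketch. The "reified" records are hypothetical (donto does not currently store claims this way); the attributions follow the countDiscrepancyWith literal quoted above.

```python
from itertools import product

subject = "ex:attack-wilmot-lagoon-near-mt-larcomb-1855"

# Current shape: tolls and sources both hang off the event subject.
tolls = [(subject, "11"), (subject, "over 100")]
sources = [(subject, "john-murray-nmp"), (subject, "james-boles-jnr")]

# Joining on the shared subject pairs every toll with every source.
naive_join = [
    (toll, source)
    for (s1, toll), (s2, source) in product(tolls, sources)
    if s1 == s2
]

# Hypothetical reified shape: one claim record carries both value and source.
claims = [
    {"subject": subject, "toll": "11", "source": "john-murray-nmp"},
    {"subject": subject, "toll": "over 100", "source": "james-boles-jnr"},
]
reified_join = [(c["toll"], c["source"]) for c in claims if c["subject"] == subject]

print(len(naive_join))    # 4
print(len(reified_join))  # 2
```

The naive join includes the pairing of 'over 100' with john-murray-nmp, which no source asserts; the reified shape cannot produce it.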

## What's BAD — where it falls short

- Provenance attribution (which SOURCE made a claim) is in-band, not reified. NOTE: this is a narrower point than the original report's #1 "bad" finding, which was withdrawn — evidence ANCHORING is fine (every claim is span-wired to a retrievable snippet via `fact→evidence_link→span→revision`, see the correction note above). What is NOT structured is *cross-source attribution*: a claim resolves to *a* source document, but WHICH of several sources asserted a given fact is encoded in predicate names (killCountPerCapricornian5Sept1925) and literal text ('11 men killed per John Murray letter'). That is human/LLM-readable but not a clean per-source edge, so a "who said which toll?" query can't join on it cleanly (see next bullet).
- No structured claim→source reification, so multi-source facts don't join cleanly. A natural query ('which source said which toll?') cartesian-products because both the toll and the source hang off the event subject as independent predicates. You must parse predicate names/literals to recover who-said-what. (This is the genuine provenance gap; it is about source *reification*, NOT about whether evidence is dereferenceable — it is.)
- Predicate fragmentation forces query-time normalization. Same concept under rdfType / rdf:type / ' rdfType', rdfsLabel / rdfs:label, describedAs / describedAsHaving / etc. Any query that doesn't regex-normalize silently undercounts.
- Positional-suffix pseudo-arrays (layer1..6, level1..6, link1..6, nameVariation1..4, associatedEvent/1/2/3) — the LLM uses numbered predicates as a fake array, which is both fragmentation and a sign of unconstrained emission. These should be one multi-valued predicate.
- Entity resolution is left as hypotheses, not resolved identities. likelySameAs edges are dense and bidirectional but require union-find consolidation at query time; the same entity can be linked by aliasOf and by likelySameAs to different targets. Until resolved, ex:combo-james vs ex:combo-jimmy vs ex:colin-killed remain distinct nodes.
- Subjects are NOT stable across extraction runs: the same Rannes attack is ex:attack-nmp-rannes-1855-09-23 in one context and ex:attack-rannes-1855 / ex:ranges-station in others, and location spellings vary (Rannes vs Ranges). Checking consistency across extractions currently requires loose regex matching plus manual inspection.
- 64.5% of objects are literals, not IRIs (verified ~35.5% IRI), so graph TRAVERSAL is shallow — much of the value is text packed into object_lit that you read, not edges you walk. The in-degree distribution is also dominated by the boolean literal 'true' (2620), which drowns real entity hubs unless filtered.
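
The union-find consolidation that the entity-resolution bullet calls for can be sketched as follows. It is a minimal version that treats every same-as hypothesis as certain, which is an assumption: a real consumer would probably weight or exclude possiblySameAs edges. The edge list reuses variants reported in the verified findings, with one reversed duplicate to mimic bidirectional edges.

```python
def consolidate(edges):
    """Group entities connected by same-as hypotheses using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for node in parent:
        clusters.setdefault(find(node), set()).add(node)
    return sorted(sorted(cluster) for cluster in clusters.values())


likely_same_as = [
    ("maconachie", "john-mcconachie"),
    ("maconochie", "john-mcconachie"),
    ("maccronerky", "john-mcconachie"),
    ("john-mcconachie", "maconachie"),  # reverse direction of the first edge
    ("james-bowles-nmp", "james-boles-nmp"),
    ("mr-bol", "james-boles-nmp"),
]

for cluster in consolidate(likely_same_as):
    print(cluster)
```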

## Killer examples (concrete query → result)

- PARACONSISTENCY (verified, ctx:.../46860): query subject=ex:attack-wilmot-lagoon-near-mt-larcomb-1855 → returns killedCountPerMurrayLetter='11', killCountPerCapricornian5Sept1925='over 100', countLowerBound='100', countDiscrepancyWith='Murray reports 11 killed and 3 wounded; Boles Jnr reports over 100 killed', killCountUnclearAssignment='not clear whether 11 killed were all associated with the second reprisal or with both'. Both numbers held live; the conflict is itself a first-class fact.
- DERIVED HISTORIOGRAPHY (verified, ctx:test/ingest-verify/10690): subject=ex:attack-rannes-1855 → deathCountTrendOverTime='increasing (2 in 1855 to 12 in 1913)', earliestDeathCount='2', latestDeathCount='12', deathCountRange='1-12', earliestSourcesMoreConservative='true', interSourceDisagreementOnDeathCount='severe (1-12)', earliestToLatestSourceSpan='1855 to 1975'. No vector store produces this; it is a computed claim about source drift.
- IDENTITY UNDER OCR/RENAMING (verified): predicate likelySameAs → maconachie/maconochie/maccronerky→john-mcconachie, kliediewarry/kalidawarry/kalidawarry-waterhole→kaliduwarry, james-bowles-nmp/mr-bol/white-sergeant-nmp→james-boles-nmp. Real spelling-variant resolution surfaced as graded hypotheses (likelySameAs vs possiblySameAs).
- TERMINOLOGY-BY-SOURCE (lens-reported, consistent with verified discrepancy machinery): ex:attack-rannes-1855 eventTerminologyBySource → 'massacre (Morning Bulletin 1912)', 'attack (dataset)', 'outrage (Empire Oct 15)', 'slaughter (De Satge 1901)'. Same event, four framings, each attributed — exactly the meta-data a historian needs.
- THE JOIN THAT EXPOSES THE LIMIT (verified): joining toll predicates to source predicates on ctx:.../46860 returns a cartesian product — 'over 100' paired with capricornian-1925-09-05-p60 AND james-boles-jnr AND john-murray-nmp AND capricornian-newspaper indiscriminately, because there is no claim→source reification. The who-said-what is real but only recoverable by reading the predicate name/literal, not by a clean SQL join.
- EVIDENCE-LINK REALITY CHECK (verified, corrected): joining `donto_evidence_link → donto_span → donto_document_revision → donto_document` over the corpus → 15,078 live rows, ALL link_type='extracted_from', **ALL targeting `target_span_id` (ZERO at `target_run_id`/`target_document_id`)**, every span carrying real `surface_text`+offsets that resolve to a source document (e.g. offset 1958–2024 → 'Rockhampton Bulletin 10 July 1873, p3; Telegraph, 16 July 1873, p3'). Evidence wiring is near-total in coverage AND anchored to a retrievable source span — donto is genuinely 'evidence-anchored', not merely 'evidence-aware'. (The original reading of 'span=0, doc=0, points at run' was a UUID-vs-integer-`0` artifact and is withdrawn.)
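
What "dereferenceable" means at the span tier can be sketched with the offsets quoted in the evidence-link example. The revision body here is hypothetical filler built so the snippet sits at the reported position, and the half-open `[start, end)` byte convention is an assumption; it is at least consistent with the quoted numbers, since the snippet is 66 bytes long and 2024 - 1958 = 66.

```python
surface_text = "Rockhampton Bulletin 10 July 1873, p3; Telegraph, 16 July 1873, p3"
start, end = 1958, 2024  # offsets quoted in the example above

# Hypothetical revision body: filler bytes around the quoted span. A real
# revision would hold the full source document.
revision_bytes = b"." * start + surface_text.encode("utf-8") + b"." * 100


def dereference(revision_bytes, start, end):
    """Resolve a span to its snippet, assuming half-open [start, end) byte offsets."""
    return revision_bytes[start:end].decode("utf-8")


snippet = dereference(revision_bytes, start, end)
print(snippet == surface_text)  # True
print(end - start, len(surface_text.encode("utf-8")))  # 66 66
```

Comparing the dereferenced bytes against the stored `surface_text` is a cheap integrity check that could be run across all live evidence links.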

## Per-lens scorecard

| Lens | Queries | Worked | Surprising | Lens verdict |
|---|---|---|---|---|
| contradictions | 5 | 5 | 4 | VERDICT: donto SUCCESSFULLY handles the contradictions lens. It genuinely holds both sides of conflicts: / / STRENGTHS: … |
| cross-event-entities | 5 | 5 | 4 | EXCELLENT SIGNAL. The cross-event-entities lens reveals that donto's graph is HIGHLY QUERYABLE and reveals genuine front… |
| euphemism-decode | 5 | 5 | 2 | The euphemism-decode lens WORKS EXCELLENTLY. Donto captures the scholarly core: period euphemisms (dispersal, lesson, pu… |
| motive-tactics-resistance | 5 | 5 | 3 | STRONG PERFORMANCE. The donto graph successfully extracted and linked: / / 1. **Motive inference**: The model identified… |
| provenance-epistemics | 5 | 5 | 4 | DONTO HANDLES PROVENANCE-EPISTEMICS VERY WELL. The graph is NOT noise. Findings: (1) **Rich attestation network**: 73 at… |
| casualties-quantities | 5 | 5 | 1 | donto handles the casualties-quantities lens well. The graph successfully extracts and stores casualty counts (numberOfP… |
| nmp-structure | 5 | 4 | 1 | donto successfully reconstructs NMP COMMAND HIERARCHY and CAMP TOPOLOGY from frontier-conflict events. Queries 1, 2, 4, … |
| kinship-social | 5 | 5 | 2 | donto's kinship-social lens is HIGHLY QUERYABLE and GENUINELY USEFUL. The graph successfully captures: (1) genealogical … |
| spatial-jurisdiction | 5 | 5 | 3 | donto successfully reconstructs the spatial-colonial hierarchy (site → pastoral run → pastoral district → colony) with c… |
| temporal-sequence | 5 | 5 | 3 | SOLID GRAPH, genuinely useful for temporal analysis. The donto extraction captured: (1) Precise dates at multiple granul… |
| causation-second-order | 5 | 5 | 1 | DONTO HANDLES THIS LENS EXCEPTIONALLY WELL. The causation-second-order graph is QUERYABLE and USEFUL. / / STRENGTHS: / 1… |
| novel-predicates | 5 | 5 | 2 | Donto's novel-predicate long tail is GENUINELY USEFUL and semantically coherent. Of 6,111 unique predicates extracted ac… |
| sources-as-entities | 5 | 5 | 3 | Donto handles the sources-as-entities lens VERY WELL. The graph successfully captures: (1) source entities as first-clas… |
| bias-framing-language | 5 | 5 | 3 | STRONG PASS. Donto successfully captures **period-source bias and framing as structured facts**, not as editorial judgme… |
| identity-hypotheses | 5 | 5 | 2 | VERDICT: **EXCELLENT QUERYABLE GRAPH.** The identity-hypotheses lens is genuine, rich, and immediately actionable. / / *… |
| analytics-topology | 5 | 5 | 2 | Donto IS queryable and captures real semantic structure, but with significant limitations. STRENGTHS: The graph correctl… |
| single-event-deep | 5 | 5 | 3 | donto handles single-event-deep VERY WELL. Event 10690 is fully reconstructable across 5 critical axes: geographic/ident… |
| cross-extraction-consistency | 5 | 5 | 5 | Donto returns REAL data on core facts (event date, victim count, victim names agree across runs), but the graph is sever… |
| weapons-methods | 5 | 4 | 3 | The weapons-methods lens is highly functional and genuinely useful for historical violence research. Donto successfully … |
| surprising-emergent | 5 | 5 | 5 | **VERDICT: donto is GENUINELY USEFUL for this corpus.** The surprise-emergent lens reveals a working, queryable knowledg… |
| adversarial-junk | 5 | 5 | 0 | donto does NOT produce significant junk. The extraction is FAITHFUL rather than hallucinating false confidence. Booleans… |

## Appendix — all 105 queries & answers

Per lens: each query's intent, the actual sample rows donto returned, and whether it worked / surprised.

### contradictions

*VERDICT: donto SUCCESSFULLY handles the contradictions lens. It genuinely holds both sides of conflicts: / / STRENGTHS: / 1. **Paraconsistent by design**: The graph explicitly stores contradictory claims as separate facts. Query 1 returned 20 contradiction predicates (dateDisputed, dateDiscrepancyWith, countDiscrepancyWith, conflictsWith, threeConflictingDates). These are queryable, precise, and a…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 1 | Find all contradiction predicates within events (conflict, discrepancy, dispute, contradiction) | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| dateDisputed \| true / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| dateDiscrepancyWith \| event record says after 27 December 1855 but Capricornian 19 Sept 1925 says Ch… | ✅ | 💡 |
| 2 | Find corroborating claims (sources confirming or aligning with each other) | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| corroboratedBy \| ex:capricornian-1925-09-05-p60 / ex:sydney-mail-1933-11-08 \| corroborates \| ex:sydney-mail-1933-09-20 / ex:silcock-j-2009 \| corroboratedBy \| ex:hercus… | ✅ |  |
| 3 | Find all death count claims for one event to show paraconsistent holding of conflicting numbers | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| countDiscrepancyWith \| Murray reports 11 killed and 3 wounded; Boles Jnr reports over 100 killed / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| countLowerBound \| 100 /… | ✅ | 💡 |
| 4 | Deep extraction context (ingest-verify/10690 with 2334 facts) - check if multiple extractions hold both contradictions and corroborations | ex:attack-rannes-1855 \| countDiscrepancyWith \| ex:queenslander-1892-12-10 / ex:attack-rannes-1855 \| countDiscrepancyWith \| ex:capricornian-1909-04-10 / ex:attack-rannes-1855 \| datasetDeathCount \| 3 / ex:attack-rann… | ✅ | 💡 |
| 5 | Find cases where sources explicitly disagree on distance, toll, or narrative detail | ex:attack-nmp-detachment-beresford-mckinlay-ranges-1883 \| reportedDeathTollPerSource \| 1 / ex:attack-nmp-detachment-beresford-mckinlay-ranges-1883 \| reportedDeathTollPerSource \| 5 / ex:attack-nmp-detachment-beresford… | ✅ | 💡 |

### cross-event-entities

*EXCELLENT SIGNAL. The cross-event-entities lens reveals that donto's graph is HIGHLY QUERYABLE and reveals genuine frontier-war structure: (1) Key institutional entities like ex:nmp (Native Mounted Police) recur across 5 different event contexts with 127 total statements—providing rich, usable profiles of colonial agents. (2) Named individuals like Frederic Charles Urquhart appear in 2+ distinct e…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 6 | Find all subjects appearing in 2+ distinct event contexts to identify recurring entities (people, places, institutions) | ex:nmp \| 5 \| 127 / ex:university-of-newcastle \| 5 \| 18 / ex:burke-pastoral-district \| 4 \| 26 / ex:queensland \| 4 \| 25 / ex:moreton-bay \| 4 \| 15 / ex:brisbane \| 4 \| 9 / ex:empire-1855-10-15 \| 3 \| 185 / ex:de… | ✅ | 💡 |
| 7 | Inspect the Native Mounted Police (ex:nmp) entity across all events where it appears to see what predicates are used | ex:nmp \| abbreviation \| NMP \| ctx:genealogy/frontier-massacres/10615 / ex:nmp \| actedToSuppressAboriginalResistance \| true \| ctx:genealogy/frontier-massacres/10615 / ex:nmp \| colonialCoOptationDynamic \| colonised… | ✅ |  |
| 8 | Find individual people (not institutions) that appear in 2+ events to identify named officers and Aboriginal persons mentioned across differ… | ex:cloncurry-nmp-camp \| 3 / ex:native-mounted-police \| 3 / ex:gladstone-nmp-camp \| 3 / ex:frederic-charles-urquhart \| 3 / ex:brisbane-trooper \| 2 / ex:banana-station \| 2 / ex:don-river \| 2 / ex:brisbane-courier \|… | ✅ | 💡 |
| 9 | Examine Combo James (Aboriginal NMP trooper) across extraction variants to see predicate inconsistency/richness from different passes | ex:combo-james \| ctx:genealogy/frontier-massacres/10690 \| alsoKnownAs \| Combo Jimmy / ex:combo-james \| ctx:genealogy/frontier-massacres/10690 \| killedIn \| ex:attack-nmp-rannes-1855-09-23 / ex:combo-james \| ctx:gen… | ✅ | 💡 |
| 10 | Compare place entity (Burke Pastoral District) across events to see how location semantics and relationships vary (contains vs containsPlace… | ctx:genealogy/frontier-massacres/10622 \| contains \| ex:alexandra-river-swamp / ctx:genealogy/frontier-massacres/10622 \| isProductOfColonialExpansion \| true / ctx:genealogy/frontier-massacres/10622 \| locatedIn \| ex:… | ✅ | 💡 |

### euphemism-decode

*The euphemism-decode lens WORKS EXCELLENTLY. Donto captures the scholarly core: period euphemisms (dispersal, lesson, punish, paid the penalty) paired with decoded meanings (killing, mass killing, reprisal killings). The graph structure includes direct mappings, victim/perpetrator framing reversals, source-anchored quotations, explicit bias annotations, and causal chains linking euphemisms to casu…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 11 | Surface direct euphemism-to-meaning mappings via euphemismFor/euphemismInSource predicates | ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|euphemismInSource\|shot down / ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|euphemismFor\|killed / ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|euphemismDecoded\|paid the p… | ✅ |  |
| 12 | Find decoded meanings attached to euphemisms via euphemismDecoded/decodedAs predicates | ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|euphemismDecoded\|punish the offenders → kill Aboriginal people / ex:kaliduwarry-attack-1879\|euphemismDecoded\|mass killing of Aboriginal people / ex:kaliduwarry-attack-1879… | ✅ | 💡 |
| 13 | Uncover narrative framing reversals: how perpetrators and victims were renamed in colonial discourse | ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|victimsFramedAs\|the black murderers / ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|perpetratorsFramedAs\|the avengers / ex:aboriginal-people\|framedAs\|murderers / ex:kalka… | ✅ | 💡 |
| 14 | Isolate the canonical 'dispersal' euphemism and trace it to actual events and violence | ex:nmp-dispersal-glengyle\|describedAsDispersal\|true / ex:queenslander-1879-05-24\|euphemismUsed\|dispersed / ex:queenslander-1879-05-24\|quotedAsSaying\|Gough and Kaye started after the blacks, found a large camp, and … | ✅ |  |
| 15 | Capture euphemisms of punishment and penalty with casualty counts and causal chains | ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|euphemismInSource\|paid the penalty / ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|euphemismDecoded\|paid the penalty → killed / ex:capricornian-1925-09-05-p60\|quotedAsSayi… | ✅ |  |

### motive-tactics-resistance

*STRONG PERFORMANCE. The donto graph successfully extracted and linked: / / 1. **Motive inference**: The model identified distinct motive types (reprisal, punishment, resistance to encroachment, NMP troop retaliation) for different actors—Aboriginal attackers, settler reprisal parties, and NMP detachments. The "motivationTargetingNmpOnly" statement shows nuanced analysis of intra-colonial tension (…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 16 | Extract stated motives for settler and Aboriginal attacks, including reprisal, punishment, and resistance claims | ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|motiveStated\|Reprisal for the Mt Larcomb killings of the Forans, Smelt, Murray and an Aboriginal man / ex:kaliduwarry-attack-1879\|motiveStated\|punishment for murder of a s… | ✅ | 💡 |
| 17 | Identify tactical patterns: surprise attacks, multi-pronged assaults, deception, disarmament, strategic calculations by Aboriginal attackers | ex:aboriginal-attackers-rannes\|strategicCalculation\|calculated they could easily overpower the reduced NMP guard / ex:attack-nmp-rannes-1855-09-23\|decoyTactic\|Aboriginal women allegedly used to lure NMP troopers befo… | ✅ | 💡 |
| 18 | Capture Aboriginal resistance framing: cattle-killing as economic resistance, opposition to pastoral encroachment, engagement with NMP and c… | ex:kalkadoon\|resistedColonialEncroachment\|true / ex:kalkadoon\|engagedInResistanceAgainst\|ex:native-mounted-police / ex:kalkadoon-people\|knownForResistance\|Kalkadoon people known for armed resistance to colonial set… | ✅ |  |
| 19 | Track reprisal chains: which events triggered which reprisals, multi-level escalation sequences, and linkage between Aboriginal and settler … | ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|reprisalFor\|ex:attack-europeans-mt-larcomb-station-1855 / ex:attack-europeans-mt-larcomb-station-1855\|subsequentReprisal\|ex:attack-aboriginal-people-nankin-creek-1855 / ex… | ✅ |  |
| 20 | Identify provocation, escalation triggers, and causality: deaths that triggered reprisals, trigger-to-response timing, and asymmetric kill r… | ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|triggeredBy\|ex:attack-europeans-mt-larcomb-station-1855 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855\|killedToTriggerDeathRatio\|at least 100 killed in reprisal for 5 deat… | ✅ | 💡 |

### provenance-epistemics

*DONTO HANDLES PROVENANCE-EPISTEMICS VERY WELL. The graph is NOT noise. Findings: (1) **Rich attestation network**: 73 attestedBy statements show witness/source linking; 37 corroboratedBy statements explicitly track cross-validation. (2) **Source taxonomy working**: Predicates like sourceDepth, sourceReliability, sourceTemporalDistance are populated with semantically meaningful values (primary/seco…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 21 | Identify single-witness vs multi-source attestations — which claims have corroborating evidence across independent witnesses or sources | ex:kaliduwarry-attack-1879 \| attestedBy \| ex:brisbane-courier-1879-03-28 / ex:kaliduwarry-attack-1879 \| attestedBy \| ex:sydney-mail-1933-11-08 / ex:kaliduwarry-attack-1879 \| attestedBy \| ex:silcock-j-2009 / ex:atta… | ✅ | 💡 |
| 22 | Assess source type distribution — identify primary sources (newspapers, letters, depositions) vs secondary sources (later compilations, oral… | ex:attack-rannes-1855 \| reportedInNewspaper \| ex:sydney-morning-herald-1855-10-22 / ex:qsa-urquhart-letter-1884-08-21 \| sourceReliability \| primary source - letter from investigating officer / ex:qsa-inquest-259-1883… | ✅ |  |
| 23 | Measure source depth and temporal distance — understand how close sources are to events (primary witness, secondhand, decades later recollec… | ex:capricornian-1913-07-19 \| sourceDepth \| uncertain (58 years later, third person recollection) / ex:letter-leith-hay-holt-1855-10-17 \| sourceDepth \| primary (station owners present at scene) / ex:letter-o-connell-t… | ✅ | 💡 |
| 24 | Extract certainty markers — identify explicit epistemic hedging by sources (claimed, supposedly, confirmed, possibly, unclear) | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| certaintyMarker \| not clear / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| certaintyMarker \| Possible / ex:attack-nmp-detachment-beresford-mckinlay-ranges-1883 \| cert… | ✅ | 💡 |
| 25 | Detect conflicting accounts — identify discrepancies across sources for the same event (divergent death tolls, different event terminology, … | ex:attack-rannes-1855 \| eventTerminologyBySource \| massacre (Morning Bulletin 1912) / ex:attack-rannes-1855 \| eventTerminologyBySource \| attack (dataset) / ex:attack-rannes-1855 \| eventTerminologyBySource \| outrage… | ✅ | 💡 |

### casualties-quantities

*donto handles the casualties-quantities lens well. The graph successfully extracts and stores casualty counts (numberOfPeopleKilled, reportedDeathTollPerSource) with explicit qualifiers (approximate, lowerBound, disputed), tracks source multiplicity (Murray vs Boles Jnr vs Capricornian with differing figures), and resolves body treatment in resolvable chains (buriedBy→agent→further metadata). Coun…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 26 | Find core casualty counts and qualifiers (killed, wounded, casualty toll). | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| numberOfPeopleKilled \| 100 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| killedCountReportedPerSource \| over 100 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| kil… | ✅ |  |
| 27 | Identify count discrepancies, qualifiers (approximate, lower bound, disputed) and disputes between sources. | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| killedCountApproximate \| true / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| countLowerBound \| 100 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| countDisputed \| t… | ✅ |  |
| 28 | Find body treatment predicates: burial, grave marking, body disposal methods (burned, thrown, buried). | ex:marcus-beresford \| buriedBy \| ex:j-bell / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| sheepThrownIntoBonfires \| true / ex:marcus-beresford \| buriedOnSite \| true / ex:marcus-beresford \| graveMarkedWithHeadAnd… | ✅ | 💡 |
| 29 | Query numerical casualty counts by source and context, including counts of troops and livestock. | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| numberOfPeopleKilled \| 100 / ex:john-murray-nmp \| reportedWoundedCount \| 3 / ex:john-murray-nmp \| reportedKillCount \| 11 / ex:nmp-troopers-wilmot-lagoon \| countApprox… | ✅ |  |
| 30 | Search for casualty-related predicates mixed with meta-notes, reliability indicators, and event causation chains. | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| killedCountPerMurrayLetter \| 11 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| causationChainDetailed \| 1. Aboriginal people killed Europeans at Mt Larcomb station → 2… | ✅ |  |
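
Because contradictory counts stay in the graph as legal state, a disagreement query only has to collect every count-bearing claim for one subject and compare the values. The sketch below assumes a hand-picked set of count predicates (`COUNT_PREDICATES` is hypothetical, not a donto feature) and uses rows copied from queries 26 and 30. Parsing the first integer is a deliberate simplification: it reads "over 100" as 100 and drops the qualifier, so the original literal is kept alongside the parsed value.

```python
import re

EVENT = "ex:attack-wilmot-lagoon-near-mt-larcomb-1855"

# Sample rows copied from queries 26 and 30 (subject, predicate, object).
rows = [
    (EVENT, "numberOfPeopleKilled", "100"),
    (EVENT, "killedCountReportedPerSource", "over 100"),
    (EVENT, "killedCountPerMurrayLetter", "11"),
]

# Assumption: predicates treated as kill-count claims are chosen by hand.
COUNT_PREDICATES = {
    "numberOfPeopleKilled",
    "killedCountReportedPerSource",
    "killedCountPerMurrayLetter",
}

def kill_count_claims(rows, subject):
    """Return (predicate, literal, parsed_int_or_None) for each count claim."""
    claims = []
    for s, p, o in rows:
        if s == subject and p in COUNT_PREDICATES:
            match = re.search(r"\d+", o)
            claims.append((p, o, int(match.group()) if match else None))
    return claims

claims = kill_count_claims(rows, EVENT)
distinct_values = sorted({n for _, _, n in claims if n is not None})
is_disputed = len(distinct_values) > 1
```

Nothing is overwritten or resolved here: the output is the set of competing figures, which is the behaviour the lens summary describes.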

### nmp-structure

*donto successfully reconstructs NMP COMMAND HIERARCHY and CAMP TOPOLOGY from frontier-conflict events. Queries 1, 2, 4, 5 (rows 31, 32, 34, 35) returned clean, on-point structure: Captain O'Connell → Lt. John Murray → Sgt. James Boles → Constables; 4 named camps (Gladstone, Rockhampton, Burke River/Boulia, Cloncurry) with stationed officers and geographic containment. Query 3 (row 33, trooper recruitment) FAILED: regNo (regist…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 31 | Discover all NMP-related predicates and core structure (camps, officers, troopers, detachments, ranks, command) | ex:john-murray \| commandedDetachment \| ex:nmp-detachment-rannes / ex:marcus-beresford \| ledTroopers \| ex:larry-trooper-4 / ex:cloncurry-nmp-camp \| suppliedTrooper \| ex:billy-trooper-6 / ex:robert-george-walker \| o… | ✅ |  |
| 32 | Reconstruct officer profiles: John Murray and James Boles career trajectories, ranks, command relations, associations | ex:john-murray \| receivedOrdersFrom \| ex:maurice-o-connell / ex:john-murray \| commandedDetachment \| ex:nmp-detachment-rannes / ex:gladstone-nmp-camp \| rdfType \| ex:NmpCamp / ex:gladstone-nmp-camp \| stationedOffice… | ✅ |  |
| 33 | Query trooper groups, recruitment origins, ethnicity, regNo (registration numbers), deployment | ex:nmp-troopers-wilmot-lagoon \| memberOf \| ex:nmp / ex:nmp-troopers-wilmot-lagoon \| recruitedFrom \| other regions / ex:nmp \| trooperCount \| 12 / ex:nmp-troopers-wilmot-lagoon \| ethnicity \| Aboriginal / ex:nmp-tro… | ❌ | 💡 |
| 34 | Reconstruct command hierarchy: ranks, chain of command (captain → lieutenant → sergeant → constable), commandedBy/commanded relations | ex:john-murray-nmp \| rank \| Lieutenant / ex:james-boles-nmp \| rank \| Orderly-sergeant / ex:captain-oconnell \| rank \| Captain / ex:captain-oconnell \| commanded \| ex:john-murray-nmp / ex:john-murray-nmp \| ledBy \|… | ✅ |  |
| 35 | Map camp locations, geographic containment, stationing of officers, camp typology | ex:burke-river-nmp-camp-boulia \| rdf:type \| ex:NmpCamp / ex:burke-river-nmp-camp-boulia \| locatedAt \| ex:boulia / ex:gladstone-nmp-camp \| rdfType \| ex:NmpCamp / ex:gladstone-nmp-camp \| rdfsLabel \| Gladstone (Port… | ✅ |  |

### kinship-social

*donto's kinship-social lens is HIGHLY QUERYABLE and GENUINELY USEFUL. The graph successfully captures: (1) genealogical parent-child links with correct directional predicates; (2) rich marriage records including dates, locations, and relationship variants; (3) occupational hierarchies with rank and status distinctions; (4) employment chains and land ownership networks showing economic power struct…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 36 | Extract direct parent-child genealogical links (parentOf, childOf, fatherOf, motherOf) | ex:james-boles-jnr \| childOf \| ex:james-boles-nmp / ex:james-boles-snr \| parentOf \| ex:james-boles-jnr / ex:james-boles-nmp \| fatherOf \| ex:james-boles-jnr | ✅ |  |
| 37 | Extract marriage and spousal relations (spouseOf, marriedTo, marriage dates/locations) | ex:richard-h-blamey \| spouseOf \| ex:margaret-louisa-murray / ex:richard-h-blamey \| marriedOnDate \| 1871-09-18 / ex:charleville \| siteOfMarriage \| ex:blamey-marriage / ex:margaret-louisa-murray \| spouseOf \| ex:ric… | ✅ | 💡 |
| 38 | Extract occupational status, rank, and titles (occupation, status, rank, officer, sergeant, etc.) | ex:alfred-a-hart \| title \| J.P. / ex:james-boles-nmp \| rank \| Orderly-sergeant / ex:john-murray-nmp \| rank \| Lieutenant / ex:mr-wilmot \| occupation \| hotelkeeper / ex:william-young \| occupation \| squatter | ✅ |  |
| 39 | Extract employment, subordination, and land ownership hierarchies (servedUnder, employedBy, landownerOf, etc.) | ex:mount-larcomb-station \| landowner \| ex:william-young / ex:james-boles-nmp \| servedUnder \| ex:john-murray-nmp / ex:thomas-patterson-scott \| employedBy \| ex:roderick-robert-urquhart / ex:roderick-robert-urquhart \… | ✅ | 💡 |
| 40 | Extract sibling, kinship, and family relations (brotherOf, siblingOf, uncle, cousin, etc.) | ex:john-murray-nmp \| siblingOf \| ex:murray-brother-other / ex:charles-leith-hay \| siblingOf \| ex:james-leith-hay / ex:norman-leith-hay \| siblingOf \| ex:charles-leith-hay / ex:james-boles-snr \| servedBothMurrayBrot… | ✅ |  |
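
The three rows returned by query 36 state one family relation in three ways, and two of them use different IRIs for the same father. A query-time normalization could look roughly like the sketch below. It assumes the `sameAs` link reported under query 71 (`ex:james-boles-snr` sameAs `ex:james-boles-nmp`) is correct, and picking `ex:james-boles-nmp` as the canonical IRI is an arbitrary choice made for illustration. The predicate groupings and function name are hypothetical.

```python
# Sample rows copied from query 36 (subject, predicate, object).
rows = [
    ("ex:james-boles-jnr", "childOf", "ex:james-boles-nmp"),
    ("ex:james-boles-snr", "parentOf", "ex:james-boles-jnr"),
    ("ex:james-boles-nmp", "fatherOf", "ex:james-boles-jnr"),
]

PARENT_FIRST = {"parentOf", "fatherOf", "motherOf"}  # subject is the parent
CHILD_FIRST = {"childOf"}                            # subject is the child

# Assumption: alias map derived from the sameAs row under query 71.
ALIASES = {"ex:james-boles-snr": "ex:james-boles-nmp"}

def parent_child_edges(rows, aliases=None):
    """Normalize mixed kinship predicates into (parent, child) pairs."""
    aliases = aliases or {}
    edges = set()
    for s, p, o in rows:
        s, o = aliases.get(s, s), aliases.get(o, o)
        if p in PARENT_FIRST:
            edges.add((s, o))
        elif p in CHILD_FIRST:
            edges.add((o, s))
    return edges

raw_edges = parent_child_edges(rows)              # two edges, two father IRIs
merged_edges = parent_child_edges(rows, ALIASES)  # one edge after aliasing
```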

### spatial-jurisdiction

*donto successfully reconstructs the spatial-colonial hierarchy (site → pastoral run → pastoral district → colony) with clean, queryable containment predicates. Station ownership chains are solid. Native title boundary links exist and are valuable for matching colonial events to modern land claims. Aboriginal country assignments (onCountryOf, countryIncludes) are present but signal is weak due to a…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 41 | Find basic location facts: locatedAt, nearPlace, locationType, locatedWithin, containsPlace | ex:burke-river-nmp-camp-boulia \| locatedAt \| ex:boulia / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| nearPlace \| ex:mt-larcomb / ex:wilmot-lagoon \| locationType \| Lagoon/waterhole / ex:wilmot-lagoon \| locatedWi… | ✅ | 💡 |
| 42 | Reconstruct the containment hierarchy: site->run->district->colony, with station ownership and NMP camp locations | ex:wilmot-lagoon \| withinPoliceDistrict \| ex:gladstone-police-district / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| withinPastoralRun \| ex:mount-larcomb-station / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| w… | ✅ | 💡 |
| 43 | Link events to Aboriginal countries and native title claims (onCountryOf, countryIncludes, nativeTitle predicates) | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| onCountryOf \| ex:bayali-language-group / ex:bayali-language-group \| countryIncludes \| ex:wilmot-lagoon / ex:bayali-language-group \| countryIncludes \| ex:mount-larcomb-… | ✅ |  |
| 44 | Extract water features, geographic boundaries, and native title boundary containment (rivers, creeks, bioregion, withinNativeTitleBoundary) | ex:kaliduwarry-attack-1879 \| withinNativeTitleBoundary \| ex:wangkangurru-yarluyandi-native-title-claim / ex:kaliduwarry-attack-1879 \| bioregion \| Simpson Desert / ex:attack-nmp-detachment-beresford-mckinlay-ranges-18… | ✅ | 💡 |
| 45 | Trace pastoral station ownership and spatial hierarchy (landowner, headStation, withinPastoralDistrict) | ex:mount-larcomb-station \| landowner \| ex:william-young / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| withinPastoralRun \| ex:mount-larcomb-station / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| withinPastoralDi… | ✅ |  |

### temporal-sequence

*SOLID GRAPH, genuinely useful for temporal analysis. The donto extraction captured: (1) Precise dates at multiple granularities (ISO dates, years, day-month phrases). (2) Robust event sequencing (precededBy/followedBy) across multi-event conflict chains. (3) Rich reprisal networks (reprisalFor, causedBy, triggeredBy, subsequentReprisal) that link trigger events to responses with full temporal cont…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 46 | Extract direct temporal facts: dates, years, months associated with events | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| occurredOnDate \| 1855-12-25 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| occurredInYear \| 1855 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| dayMonth \| After 27… | ✅ |  |
| 47 | Identify temporal ordering relationships: precededBy, followedBy, and event sequence chains | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| precededBy \| ex:attack-europeans-mt-larcomb-station-1855 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| followedBy \| ex:attack-aboriginal-people-sneaker-creek-1855 / e… | ✅ | 💡 |
| 48 | Extract reprisal chains and causal relationships: reprisalFor, causedBy, triggeredBy, inResponseTo | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| causedBy \| ex:attack-europeans-mt-larcomb-station-1855 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| reprisalFor \| ex:attack-europeans-mt-larcomb-station-1855 / ex:at… | ✅ | 💡 |
| 49 | Join temporal and causal facts: find events with both dates AND reprisal relationships | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| causedBy \| ex:attack-europeans-mt-larcomb-station-1855 \| occurredInYear \| 1855 / ex:kaliduwarry-attack-1879 \| causedBy \| ex:death-of-thomas-patterson-scott \| occurred… | ✅ |  |
| 50 | Identify temporal disputes and uncertainty: dateDisputed, dateUncertain, alternateDateClaim, dateDiscrepancy | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| dateDisputed \| true / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| dateDiscrepancyWith \| event record says after 27 December 1855 but Capricornian 19 Sept 1925 says Ch… | ✅ | 💡 |
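
Query 49 is in effect a self-join: keep only subjects that carry both a date fact and a causal fact, and emit one row per pairing. A minimal sketch of that join, using rows copied from queries 46, 48 and 82 and hypothetical predicate sets, might be:

```python
from collections import defaultdict

WILMOT = "ex:attack-wilmot-lagoon-near-mt-larcomb-1855"
TRIGGER = "ex:attack-europeans-mt-larcomb-station-1855"

# Sample rows copied from queries 46, 48 and 82 (subject, predicate, object).
rows = [
    (WILMOT, "occurredInYear", "1855"),
    (WILMOT, "causedBy", TRIGGER),
    (WILMOT, "reprisalFor", TRIGGER),
    ("ex:attack-rannes-1855", "occurredInYear", "1855"),  # dated, no cause here
]

# Assumption: which predicates count as date or causal is decided by hand.
DATE_PREDICATES = {"occurredInYear"}
CAUSAL_PREDICATES = {"causedBy", "reprisalFor"}

def dated_causal_rows(rows):
    """Join date and causal facts on subject; subjects lacking either drop out."""
    dates, causes = defaultdict(list), defaultdict(list)
    for s, p, o in rows:
        if p in DATE_PREDICATES:
            dates[s].append(o)
        elif p in CAUSAL_PREDICATES:
            causes[s].append((p, o))
    return sorted(
        (s, p, cause, year)
        for s in dates.keys() & causes.keys()
        for p, cause in causes[s]
        for year in dates[s]
    )

joined = dated_causal_rows(rows)
```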

### causation-second-order

*DONTO HANDLES THIS LENS EXCEPTIONALLY WELL. The causation-second-order graph is QUERYABLE and USEFUL. STRENGTHS: (1) EXPLICIT CAUSAL PREDICATES: The extractor invented rich, domain-specific predicates (causedBy, triggeredBy, inResponseTo, provokedReprisal, organisedAfterKillings) that directly encode causal intent. Not generic RDF but tailored to frontier violence narratives. (2) MULTI-HOP NAR…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 51 | Find causal predicates linking events: causedBy, triggeredBy, inResponseTo, causationChain | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| causationChainDetailed \| 1. Aboriginal people killed Europeans at Mt Larcomb station → 2. Punitive party organised to bury victims and punish offenders → 3. Aboriginal par… | ✅ |  |
| 52 | Find provoked/retaliation chains: provokedReprisal, escalation, retaliation predicates | ex:attack-europeans-mt-larcomb-station-1855 \| provokedReprisal \| ex:attack-wilmot-lagoon-near-mt-larcomb-1855 / ex:death-of-john-mcconachie \| provokedReprisal \| ex:kaliduwarry-attack-1879 / ex:attack-nmp-detachment-b… | ✅ | 💡 |
| 53 | Find result/aftermath/consequence predicates showing outcomes of causal events | ex:attack-rannes-1855 \| resultedIn \| ex:shearers-abandoning-work / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| resultedIn \| over 100 Aboriginal people killed / ex:death-of-john-mcconachie \| resultedIn \| ex:kalid… | ✅ |  |
| 54 | Find behavioral second-order effects: fleeing, dispersal, retreat patterns triggered by violence | ex:kaliduwarry-attack-1879 \| victimsWereFleeing \| true / ex:kaliduwarry-attack-1879 \| victimsFleeingTo \| springs in the desert west of the Georgina River / ex:beresford-detachment \| coveredBodyBeforeFleeing \| true … | ✅ |  |
| 55 | Find institutional/investigative second-order effects: investigations, orders, organizational responses to violence | ex:punitive-party \| organisedQuicklyAfterKillings \| true / ex:punitive-party \| organisedAfter \| ex:attack-europeans-mt-larcomb-station-1855 / ex:o-connell-maurice \| authorisedResponseTo \| ex:attack-nmp-rannes-1855-… | ✅ |  |
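
The ordering and causal predicates point in different directions (`precededBy` runs later to earlier, while `followedBy` and `provokedReprisal` run earlier to later), so walking a multi-hop chain first needs them normalized to one direction. The sketch below does that for rows copied from queries 47 and 52. It assumes that a provoking event precedes its reprisal, and it follows only the first successor at each step, which simplifies away real branching.

```python
MT_LARCOMB = "ex:attack-europeans-mt-larcomb-station-1855"
WILMOT = "ex:attack-wilmot-lagoon-near-mt-larcomb-1855"
SNEAKER = "ex:attack-aboriginal-people-sneaker-creek-1855"

# Sample rows copied from queries 47 and 52 (subject, predicate, object).
rows = [
    (WILMOT, "precededBy", MT_LARCOMB),
    (WILMOT, "followedBy", SNEAKER),
    (MT_LARCOMB, "provokedReprisal", WILMOT),
]

def successor_edges(rows):
    """Normalize mixed ordering predicates into earlier -> later edges."""
    nxt = {}
    for s, p, o in rows:
        if p in ("followedBy", "provokedReprisal"):
            nxt.setdefault(s, set()).add(o)  # subject happens first
        elif p == "precededBy":
            nxt.setdefault(o, set()).add(s)  # object happens first
    return nxt

def chain_from(start, nxt):
    """Walk forward from start, guarding against cycles."""
    chain, seen = [start], {start}
    while True:
        candidates = sorted(nxt.get(chain[-1], set()) - seen)
        if not candidates:
            return chain
        chain.append(candidates[0])
        seen.add(candidates[0])

chain = chain_from(MT_LARCOMB, successor_edges(rows))
```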

### novel-predicates

*Donto's novel-predicate long tail is GENUINELY USEFUL and semantically coherent. Of 6,111 unique predicates extracted across 12,589 statements, 4,995 (81.7%) appear exactly once, an extreme abundance phenomenon. However, these are NOT noise: they are carefully formed, domain-specific predicates (aboriginalLaborExploitation, aboriginalPartyTrackedBySheepTrail, accusedOfFencingOutAboriginalPeople) tha…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 56 | Identify the rarest predicates and their frequency distribution across the frontier-massacres dataset | aboriginalPeopleHadBlanketsAndClothing \| 1 / aboriginalPeopleNumberAtLagoon \| 1 / aboriginalPeopleDistracted \| 1 / aboriginalPeopleEasilyTracked \| 1 / abbreviatedCompanionTestimony \| 1 / aboriginalHelpers \| 1 / abo… | ✅ | 💡 |
| 57 | Understand the long-tail distribution: what % of predicates are singletons vs. common | freq=899: 1 predicate (rdfType) / freq=829: 1 predicate / freq=130–72: 1 predicate each / freq=1: 4,995 predicates (81.7% of all unique predicates!) / freq=2: 583 predicates / freq=3: 169 predicates | ✅ | 💡 |
| 58 | Sample singleton (freq=1) predicates to judge whether they encode real meaning or are noise | ex:aboriginal-people-perpetrators \| abandonedPursuitOf \| ex:jack-aboriginal-stockman / ex:shepherds-rannes \| abandonedWorkAfterAttack \| true / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| aboriginalActivityPrecedi… | ✅ |  |
| 59 | Examine medium-frequency predicates (freq=5–10) to see if they form coherent domain ontology | heldBy \| killedAt \| referredToAs \| woundedIn \| describedAsHaving \| motiveStated \| occurredInYear \| surname \| firstName \| provokedReprisal \| sourceReliability \| colonialFraming \| weaponUsed \| datedOn \| narra… | ✅ |  |
| 60 | Sample actual triples for domain predicates to verify semantic coherence | ex:peter-blackboy \| killedAt \| ex:attack-europeans-mt-larcomb-station-1855 / ex:james-foran \| killedAt \| ex:attack-europeans-mt-larcomb-station-1855 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| motiveStated \| R… | ✅ |  |
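
The 81.7% singleton figure in query 57 can be rechecked from the reported counts, and the same frequency-of-frequencies shape can be computed from any list of predicates. The sketch below does both. The function name is hypothetical and the six-item predicate list is a toy input, not corpus data; only the constants 6,111, 4,995, 583 and 169 come from the report.

```python
from collections import Counter

def frequency_of_frequencies(predicates):
    """Count how many distinct predicates occur exactly n times, for each n."""
    per_predicate = Counter(predicates)
    return Counter(per_predicate.values())

# Toy input for illustration only (not the real corpus).
toy = ["rdfType", "rdfType", "rdfsLabel", "killedAt", "killedAt", "aboriginalHelpers"]
toy_distribution = frequency_of_frequencies(toy)

# Figures reported in query 57 and the lens summary.
DISTINCT_PREDICATES = 6111
SINGLETONS = 4995
FREQ_2, FREQ_3 = 583, 169

singleton_share = round(100 * SINGLETONS / DISTINCT_PREDICATES, 1)
# Predicates used four or more times, derived by subtraction.
freq_4_plus = DISTINCT_PREDICATES - SINGLETONS - FREQ_2 - FREQ_3
```

By subtraction, only 364 of the 6,111 predicates occur four or more times, which is the small core a query-time alignment layer would most plausibly start from.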

### sources-as-entities

*Donto handles the sources-as-entities lens VERY WELL. The graph successfully captures: (1) source entities as first-class RDF subjects with metadata (publication dates, names, reliability ratings, source types); (2) verbatim quotes attributed to sources via quotedAsSaying; (3) source-to-event mappings (reports, reportsKillCount, reportsSurvivorCount); (4) attestation links (attestedBy); (5) explic…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 61 | Find newspaper, gazette, and source entities in the graph and their basic metadata | ex:capricornian-1925-09-05-p60 \| rdfsLabel \| Capricornian 5 September 1925, p60 / ex:capricornian-1925-09-05-p60 \| sourceType \| other source / ex:capricornian-1925-09-05-p60 \| publicationName \| Capricornian / ex:ca… | ✅ | 💡 |
| 62 | Extract verbatim quotes attributed to sources (quotedAsSaying predicate) | ex:capricornian-1925-09-19-p59 \| quotedAsSaying \| whilst getting ready for a barbaric feast of mutton roasted in the wool / ex:capricornian-1925-09-05-p60 \| quotedAsSaying \| Over 100 blacks paid the penalty out of th… | ✅ | 💡 |
| 63 | Map sources to events they report on (reports, publishesOn predicates with target event IRIs) | ex:capricornian-1925-09-05-p60 \| reports \| ex:attack-wilmot-lagoon-near-mt-larcomb-1855 / ex:sydney-mail-1933-09-20 \| reports \| ex:kaliduwarry-attack-1879 / ex:capricornian-1925-09-05-p60 \| reportsKillCount \| over … | ✅ |  |
| 64 | Surface source reliability ratings, source types (primary vs. secondary, contemporary vs. retrospective), and attestation | ex:capricornian-1925-09-05-p60 \| sourceType \| other source / ex:capricornian-1925-09-05-p60 \| reliabilityRating \| not particularly reliable / ex:capricornian-1925-09-05-p60 \| attestedBy \| ex:james-boles-jnr / ex:br… | ✅ |  |
| 65 | Detect discrepancies, conflicts, and corroboration chains between sources reporting the same event | ex:sydney-mail-1933-11-08 \| corroborates \| ex:sydney-mail-1933-09-20 / ex:brisbane-courier-1879-03-28 \| corroborates \| ex:queenslander-1879-05-24 / ex:sydney-mail-1933-11-08 \| countDiscrepancyWith \| ex:south-austra… | ✅ | 💡 |

### bias-framing-language

*STRONG PASS. Donto successfully captures **period-source bias and framing as structured facts**, not as editorial judgment. The graph models: (1) Asymmetric characterization: Aboriginal subjects as "murderers," "savage," "wild"; colonials as "avengers," "victims," "performing duty"; (2) Inflammatory vocabulary ("outrage," "depredations," "barbaric"), reliably extracted and attributed to events; (3…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 66 | Find predicates marking how subjects were described or framed (framedAs, describedAs, racialLanguage patterns) | ex:unnamed-aboriginal-people-wilmot-lagoon \| describedAs \| murderers / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| victimsFramedAs \| the black murderers / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| perpetrato… | ✅ | 💡 |
| 67 | Find perpetrator roles and who is attributed blame vs. victimhood in source language | ex:john-murray-nmp \| perpetratorOf \| ex:attack-wilmot-lagoon-near-mt-larcomb-1855 / ex:constable-thribble \| perpetratorOf \| ex:attack-wilmot-lagoon-near-mt-larcomb-1855 / ex:nmp \| perpetratorGroupOf \| ex:attack-wil… | ✅ |  |
| 68 | Capture inflammatory language directed at Aboriginal subjects: miscreant, murderer, outrage, depredation, savagery | ex:sydney-mail-1934-01-03 \| reportsOutrage \| true / ex:attack-nmp-rannes-1855-09-23 \| describedAsOutrageBy \| ex:qsa212594-1855-letter-o-connell-murray-27-sept / ex:attack-nmp-rannes-1855-09-23 \| depredationsBeforeAt… | ✅ | 💡 |
| 69 | Extract colonial justifications: how settlements rationalized reprisals, deterrence, 'necessary' violence | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| colonialJustification \| retaliation for killings at Mt Larcomb station / ex:kaliduwarry-attack-1879 \| colonialJustification \| retribution for death of white man/settlers… | ✅ | 💡 |
| 70 | Find source attribution: link framing/inflammatory language to specific documents, newspapers, witnesses | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| reportedIn \| ex:capricornian-1925-09-05-p60 / ex:capricornian-1925-09-05-p60 \| attestedBy \| ex:james-boles-jnr / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| attested… | ✅ |  |

### identity-hypotheses

*VERDICT: **EXCELLENT QUERYABLE GRAPH.** The identity-hypotheses lens is genuine, rich, and immediately actionable. **STRENGTHS:** (1) ~20 core identity predicates (sameAs, likelySameAs, aliasOf, alsoKnownAs, nameVariant, spellingVariant, nameVariantOf, possiblySameAs, hasSpellingVariant, nameSpellingVariant, spellingVariantOf, etc.) are consistently present and semantically coherent; (2) Identi…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 71 | Discover broad identity links: sameAs, likelySameAs, aliasOf, alsoKnownAs relationships | ex:wangkamana \| likelySameAs \| ex:wangkangurru / ex:maconachie \| likelySameAs \| ex:john-mcconachie / ex:james-boles-snr \| sameAs \| ex:james-boles-nmp / ex:port-curtis-pastoral-district \| sameAs \| ex:gladstone-pol… | ✅ | 💡 |
| 72 | Find name variants and spelling variants (nameVariant, spellingVariant, nameVariantOf) | ex:james-boles-nmp \| spellingVariant \| Bole / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| spellingVariantsInSource \| Larcomb (record) vs Larcombe (Capricornian) vs Larcom (township); Boles vs Bowles vs Bole / ex:r… | ✅ | 💡 |
| 73 | Identify branching identity chains: subjects with multiple targets for same predicate (consolidation candidates) | ex:aboriginal-people-alexandra \| likelySameAs \| 2 \| ex:gkuthaarn-and-kukatj-people \| ex:kukatj / ex:alfred-a-hart-jp \| alsoKnownAs \| 2 \| Mr A.A. Hart \| Mr Hart J.P. / ex:attack-rannes-1855 \| alsoKnownAs \| 7 \| … | ✅ |  |
| 74 | Find highly-connected identity hubs (entities with most identity relationships, needing consolidation) | ex:henry-walker \| 8 \| alsoKnownAs, nameVariant, possiblySameAs / ex:devoncourt-station \| 8 \| alsoKnownAs, hasSpellingVariant, likelySameAs, nameSpellingVariant, spellingVariant / ex:nmp-corps \| 7 \| alsoKnownAs / ex… | ✅ |  |
| 75 | Detect redundancy and multi-predicate divergence: subjects pointing to multiple targets via different identity predicates (confidence mixing… | ex:combo-jimmy \| 2 \| aliasOf, likelySameAs | ✅ |  |
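
Query 73's consolidation candidates are subjects whose identity predicate fans out to more than one target. A minimal sketch of that grouping follows, with rows reconstructed from the sample output of queries 71 and 73. The predicate set is a hand-picked subset of the roughly 20 identity predicates mentioned above, and the function name is hypothetical.

```python
from collections import defaultdict

# Rows reconstructed from queries 71 and 73 (subject, predicate, object).
rows = [
    ("ex:aboriginal-people-alexandra", "likelySameAs", "ex:gkuthaarn-and-kukatj-people"),
    ("ex:aboriginal-people-alexandra", "likelySameAs", "ex:kukatj"),
    ("ex:alfred-a-hart-jp", "alsoKnownAs", "Mr A.A. Hart"),
    ("ex:alfred-a-hart-jp", "alsoKnownAs", "Mr Hart J.P."),
    ("ex:maconachie", "likelySameAs", "ex:john-mcconachie"),
]

# Assumption: a subset of the identity predicates, chosen for illustration.
IDENTITY_PREDICATES = {"sameAs", "likelySameAs", "aliasOf", "alsoKnownAs"}

def branching_identities(rows):
    """Return (subject, predicate) pairs that point at more than one target."""
    targets = defaultdict(set)
    for s, p, o in rows:
        if p in IDENTITY_PREDICATES:
            targets[(s, p)].add(o)
    return {key: sorted(found) for key, found in targets.items() if len(found) > 1}

branching = branching_identities(rows)
```

A fan-out is only a flag for review: two `alsoKnownAs` literals are ordinary name variants, whereas two `likelySameAs` IRIs may be competing hypotheses.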

### analytics-topology

*Donto IS queryable and captures real semantic structure, but with significant limitations. STRENGTHS: The graph correctly identifies focal entities (massacre events, attack IRIs, witnesses), respects domain predicates (discrepancy tracking, quoted testimony, name variants), and shows measurable variance across extraction contexts (ingest-verify is 2.5x denser than agnostic in statements/subject). …*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 76 | Find the most connected subjects (entities by outgoing statement count) within the frontier-massacres dataset — which entities are focal to … | ex:attack-rannes-1855 \| 580 / ex:attack-nmp-rannes-1855-09-23 \| 380 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| 312 / ex:companion-guide \| 262 / ex:attack-powell-jack-mistake-creek-1884 \| 251 / ex:event-10622 \… | ✅ |  |
| 77 | Measure predicate frequency distribution — what semantic relationships dominate the graph? Are there domain-specific predicates or just RDF … | rdfType \| 899 / rdfsLabel \| 829 / describedAs \| 130 / quotedAsSaying \| 122 / alsoKnownAs \| 94 / nameVariant \| 81 / reportedIn \| 77 / attestedBy \| 73 / memberOf \| 72 / locatedWithin \| 71 / likelySameAs \| 64 / p… | ✅ |  |
| 78 | Assess anchor coverage — what fraction of objects are IRIs (entity references) vs literals (text values)? Are entities well-linked or mostly… | 4465 \| 8124 \| 12589 \| 35.5 | ✅ | 💡 |
| 79 | Calculate entity in-degree — which objects appear most frequently as statement targets? Do real entity hubs exist or is the graph dominated … | true \| 2620 / ex:Person \| 165 / ex:attack-rannes-1855 \| 119 / ex:Place \| 104 / ex:Source \| 93 / ex:Group \| 78 / ex:rannes-station \| 64 / ex:attack-powell-jack-mistake-creek-1884 \| 63 / ex:Event \| 60 / ex:kaliduw… | ✅ | 💡 |
| 80 | Examine statement topology by context — which events are dense? Do different extraction strategies (ingest-verify vs agnostic vs compact) pr… | ctx:test/ingest-verify/10690 \| 2334 \| 154 \| 1320 \| 15.00 / ctx:genealogy/frontier-massacres/10622 \| 1807 \| 105 \| 1344 \| 17.00 / ctx:genealogy/frontier-massacres/10615 \| 1623 \| 49 \| 1096 \| 33.00 / ctx:genealog… | ✅ |  |
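
The figures in queries 78 and 80 can be sanity-checked with simple arithmetic. The sketch below reads row 78 as IRI objects, literal objects, total and IRI percentage, and row 80 as statements, subjects, predicates and statements per subject. Both readings are assumptions about unlabeled columns, although the numbers are consistent with them. Treating any object that starts with `ex:` or `ctx:` as an IRI is likewise a heuristic chosen for illustration.

```python
def is_iri(obj, prefixes=("ex:", "ctx:")):
    """Heuristic: objects carrying a known prefix are entity references."""
    return obj.startswith(prefixes)

def iri_share(objects):
    """Return (iri_count, literal_count, iri_percentage) for a list of objects."""
    iri = sum(1 for o in objects if is_iri(o))
    return iri, len(objects) - iri, round(100 * iri / len(objects), 1)

# Objects taken from the sample rows of query 41.
sample = ["ex:boulia", "ex:mt-larcomb", "Lagoon/waterhole"]
sample_share = iri_share(sample)

# Row 78: 4465 | 8124 | 12589 | 35.5
reported_total = 4465 + 8124
reported_share = round(100 * 4465 / reported_total, 1)

# Row 80: (statements, subjects) for the three contexts shown in full.
densities = [round(st / su) for st, su in [(2334, 154), (1807, 105), (1623, 49)]]
```

Read this way, roughly two thirds of objects are literals rather than entity links.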

### single-event-deep

*donto handles single-event-deep VERY WELL. Event 10690 is fully reconstructable across 5 critical axes: geographic/identity anchors, timeline and casualties, people (NMP killed by name and regiment), source provenance (primary letters 1855, 7 newspapers spanning 58 years, secondary sources to 1975, archive reels), and explicit contradictions (6 sources disagree on death count; Christmas Day date e…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 81 | Geographic and event-level identity anchors | ex:rannes-station \| hasCoordinateLongitude \| 150.129817 / ex:rannes-station \| hasCoordinateLatitude \| -24.090367 / ex:banana-station \| regionalProximityTo \| ex:rannes-station / ex:rawbelle \| regionalProximityTo \|… | ✅ |  |
| 82 | Reconstruct timeline, death counts, casualty numbers, date precision | ex:attack-rannes-1855 \| numberOfPeopleWounded \| 4 / ex:combo-james \| killedIn \| ex:attack-rannes-1855 / ex:attack-rannes-1855 \| occurredInMonth \| September / ex:attack-rannes-1855 \| occurredInYear \| 1855 / ex:att… | ✅ | 💡 |
| 83 | Identify killed NMP troopers by name, regiment, camp origin, response dispatch | ex:hamlet-nmp \| hasRegNo \| 53 / ex:hamlet-nmp \| rdfType \| ex:NMPTrooper / ex:hamlet-nmp \| killedIn \| ex:attack-rannes-1855 / ex:colin-nmp \| rdfType \| ex:NMPTrooper / ex:colin-nmp \| killedIn \| ex:attack-rannes-1… | ✅ | 💡 |
| 84 | Source chain: primary letters, newspaper coverage (1855–1913), secondary sources, archive references | ex:letter-o-connell-to-murray-1855-09-27 \| authoredBy \| ex:maurice-o-connell / ex:letter-inspector-general-to-murray-1855-12-05 \| heldInArchive \| ex:nsw-colonial-secretary-letters / ex:nsw-colonial-secretary-letters … | ✅ |  |
| 85 | Track explicit contradictions, discrepancies in death counts, date variants, narrative conflicts | ex:attack-rannes-1855 \| dateDiscrepancyWith \| ex:christmas-day-erroneous-date / ex:attack-rannes-1855 \| countDiscrepancyNoted \| number of troopers killed is often exaggerated / ex:attack-rannes-1855 \| countDiscrepan… | ✅ | 💡 |

### cross-extraction-consistency

*Donto returns REAL data on core facts (event date, victim count, victim names agree across runs), but the graph is severely fragmented by predicate variance and entity deduplication failure. Same facts are encoded with ~10+ different predicate names (killedIn vs killedInEvent vs hasKilledCount vs numberOfPeopleKilled vs killedPerEmpireOct15). Same victims get different IRIs (ex:combo-james vs ex:c…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 86 | Do all three extractions agree on the location of the attack? | ctx:genealogy/frontier-massacres/10690 \| ex:attack-nmp-rannes-1855-09-23 \| locationDescription \| Rannes station, opposite side of the creek to the headstation / ctx:test/agnostic/10690 \| ex:ranges-station \| location… | ✅ | 💡 |
| 87 | Do all three extractions agree on victim counts and death tolls? | ctx:genealogy/frontier-massacres/10690 \| ex:attack-nmp-rannes-1855-09-23 \| numberKilled \| 3 / ctx:test/agnostic/10690 \| ex:attack-rannes-1855 \| hasKilledCount \| 3 / ctx:test/ingest-verify/10690 \| ex:attack-rannes-… | ✅ | 💡 |
| 88 | Are the same three indigenous victims (Combo James, Hamlet, Colin) consistently recognized across all three runs? | ctx:genealogy/frontier-massacres/10690 \| ex:combo-james \| aboriginalPerson \| true / ctx:test/agnostic/10690 \| ex:colin-killed \| killedInEvent \| ex:attack-rannes-1855 / ctx:test/ingest-verify/10690 \| ex:combo-james… | ✅ | 💡 |
| 89 | Do all three runs agree on the event date (1855-09-23)? | ctx:genealogy/frontier-massacres/10690 \| ex:attack-nmp-rannes-1855-09-23 \| eventDate \| 1855-09-23 / ctx:genealogy/frontier-massacres/10690 \| ex:attack-nmp-rannes-1855-09-23 \| eventYear \| 1855 / ctx:genealogy/fronti… | ✅ | 💡 |
| 90 | How consistent are predicate names across the three extractions for semantically identical facts (killing, location, victimhood)? | ctx:genealogy/frontier-massacres/10690 \| aboriginalPerson \| 4 / ctx:genealogy/frontier-massacres/10690 \| killedIn \| 3 / ctx:genealogy/frontier-massacres/10690 \| locationDescription \| 2 / ctx:test/agnostic/10690 \| … | ✅ | 💡 |
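
The fragmentation described above is what a query-time normalization layer would have to absorb: different predicate names and different IRIs for the same fact. The sketch below shows one possible shape for such a layer, using the two complete rows from query 87. Both alias tables are hypothetical and hand-written, and treating `ex:attack-nmp-rannes-1855-09-23` and `ex:attack-rannes-1855` as the same event is an assumption based on their both being extractions of event 10690.

```python
from collections import defaultdict

# Complete sample rows from query 87 (context, subject, predicate, object).
rows = [
    ("ctx:genealogy/frontier-massacres/10690", "ex:attack-nmp-rannes-1855-09-23", "numberKilled", "3"),
    ("ctx:test/agnostic/10690", "ex:attack-rannes-1855", "hasKilledCount", "3"),
]

# Assumption: hand-written alignment tables, not something donto provides.
PREDICATE_ALIASES = {
    "numberKilled": "killedCount",
    "hasKilledCount": "killedCount",
    "numberOfPeopleKilled": "killedCount",
}
ENTITY_ALIASES = {"ex:attack-nmp-rannes-1855-09-23": "ex:attack-rannes-1855"}

def align(rows):
    """Group values by canonical (entity, predicate), keyed by source context."""
    aligned = defaultdict(dict)
    for ctx, s, p, o in rows:
        canonical = PREDICATE_ALIASES.get(p)
        if canonical is None:
            continue  # predicate outside the alignment table
        aligned[(ENTITY_ALIASES.get(s, s), canonical)][ctx] = o
    return aligned

aligned = align(rows)
key = ("ex:attack-rannes-1855", "killedCount")
runs_agree = len(set(aligned[key].values())) == 1
```

The stored statements are untouched; alignment happens only in the view, which matches the report's framing of typing and alignment as deferred to query time.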

### weapons-methods

*The weapons-methods lens is highly functional and genuinely useful for historical violence research. Donto successfully extracted weapon types (Snider rifles, spears, poison), violence methods with source attribution, specific injury descriptions (named individuals with wound locations), and kill counts per event. Weapon entities are modeled as RDF objects with operational properties (stacked nigh…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 91 | Find statements about specific weapons used (spears, firearms, carbines) | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| weaponsUsed \| weapons / ex:johnnycake-miller \| attackedByWeapons \| spears and nulla-nullas / ex:marcus-beresford \| spearWoundInRightThigh \| true / ex:beresford-detachm… | ✅ | 💡 |
| 92 | Find kill counts, victim names, and killing methods | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| numberOfPeopleKilled \| 100 / ex:peter-blackboy \| killedAt \| ex:attack-europeans-mt-larcomb-station-1855 / ex:marcus-beresford \| causeOfDeath \| skull smashed in, spear … | ✅ |  |
| 93 | Find drowning or water-based violence methods | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| killCountMayIncludeNankinCreek \| true / ex:william-henry-carr-boyd \| attestedKayePassedDownRiver \| true / ex:wilmot-lagoon \| hasWaterDepth \| 23 ft / ex:kaye-henry-poll… | ❌ |  |
| 94 | Find specific injury descriptions, poisoning, and weapon types with source attribution | ex:marcus-beresford \| causeOfDeath \| skull smashed in, spear wound through thigh / ex:johnnycake-miller \| notoriousFor \| fenced people out of their lands and poisoned or shot anyone who returned / ex:attack-nmp-detac… | ✅ | 💡 |
| 95 | Explore weapon entities as RDF objects with operational properties and relationships | ex:snider-rifle \| rdfType \| ex:Weapon / ex:snider-rifle \| carriedBy \| ex:beresford-detachment / ex:snider-rifle \| stackedInBeresfordTentEachNight \| true / ex:spear-points \| foundInTroopers \| true / ex:snider-rifl… | ✅ | 💡 |

### surprising-emergent

***VERDICT: donto is GENUINELY USEFUL for this corpus.** The surprising-emergent lens reveals a working, queryable knowledge graph. Strengths: (1) witness reliability chains are captured as semantic relations, not just name-date pairs; (2) perpetrator-victim-location triangulation works well: I can trace who killed whom on whose country; (3) **temporal disputes are captured as first-class facts** (dat…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 96 | Find witness testimony chains and reliability assessments—how sources refer to each other and assess credibility | ex:james-boles-nmp \| reliabilityAsWitness \| not particularly reliable / ex:james-boles-jnr \| accountDerivedFrom \| ex:james-boles-nmp / ex:james-boles-jnr \| providedAccountOf \| ex:attack-wilmot-lagoon-near-mt-larcom… | ✅ | 💡 |
| 97 | Hunt for perpetrator composition—who was blamed, mixing of state forces, settlers, and non-indigenous auxiliaries in single attacks | ex:john-murray-nmp \| perpetratorOf \| ex:attack-wilmot-lagoon-near-mt-larcomb-1855 / ex:constable-thribble \| perpetratorOf \| ex:attack-wilmot-lagoon-near-mt-larcomb-1855 / ex:nmp \| perpetratorGroupOf \| ex:attack-wil… | ✅ | 💡 |
| 98 | Surface ancestral country and language-group victimhood—which Indigenous peoples inhabited attacked places | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| onCountryOf \| ex:bayali-language-group / ex:unnamed-aboriginal-people-wilmot-lagoon \| languageGroup \| ex:bayali-language-group / ex:attack-wilmot-lagoon-near-mt-larcomb-… | ✅ | 💡 |
| 99 | Capture human dimensions—children, survivors, refugees, those who fled or hid during attacks | ex:james-boles-jnr \| childOf \| ex:james-boles-nmp / ex:unnamed-aboriginal-people-wilmot-lagoon \| escapedFrom \| ex:attack-wilmot-lagoon-near-mt-larcomb-1855 / ex:unnamed-aboriginal-people-wilmot-lagoon \| fewEscaped \… | ✅ | 💡 |
| 100 | Detect temporal anomalies and source conflicts—discrepancies in reported dates, name variations, and uncertainty metadata | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| occurredOnDate \| 1855-12-25 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| occurredAfterDate \| 1855-12-27 / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| dateDispute… | ✅ | 💡 |

### adversarial-junk

*donto does NOT produce significant junk. The extraction is FAITHFUL rather than hallucinating false confidence. Booleans are intentional semantic markers (possibleEvent, dateDisputed). Duplicates (3–5 counts) are expected cross-referencing of shared entities. Predicates are well-formed, descriptive, domain-appropriate camelCase (no corruption, no debug/temp prefixes). The most "noisy" finding—ille…*

| # | Question (intent) | donto's answer (sample rows) | ✓ | 💡 |
|---|---|---|---|---|
| 101 | Find long-form sentences or narrative text improperly extracted as single objects (should be predicates or decomposed) | ex:capricornian-1925-09-05-p60 \| quotedAsSaying \| the blacks were very busily engaged in preparing for a huge banquet... / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| eventNarrativeSummary \| Following the killing … | ✅ |  |
| 102 | Detect boolean/nonsense object values and junk predicates (xxx, unk, temp, debug) | ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| possibleEvent \| true / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| victimsUnnamed \| true / ex:attack-wilmot-lagoon-near-mt-larcomb-1855 \| dateDisputed \| true / ex:j… | ✅ |  |
| 103 | Find duplicate facts (same subject, predicate, object) appearing multiple times across contexts | ex:university-of-newcastle \| rdfsLabel \| University of Newcastle \| 5 / ex:brisbane \| rdfsLabel \| Brisbane \| 4 / ex:nmp \| rdfsLabel \| Native Mounted Police \| 4 / ex:gangulu \| rdfsLabel \| Gangulu \| 3 | ✅ |  |
| 104 | Detect malformed predicates: extremely long names (>100 chars), special chars (;, \|, $, %), or debug/temp prefixes | ex:empire-1855-10-15 \| temporalCutoffForInformation \| when the messenger left \| 28 / ex:colonial-frontier-massacres-dataset \| temporalCoverageStart \| 1788 \| 21 / ex:blamey-witnessed-event \| temporalAnchor \| ex:bl… | ✅ |  |
| 105 | Find illegible/uncertain source text or hedging language captured as facts (OCR ????s, question marks, 'unknown', 'unclear', 'disputed') | ex:capricornian-1913-07-19 \| quotedAsSaying \| Up at Rannes on one night they killed twelve native police...????ts... / ex:maurice-o-connell \| quotedAsSaying \| 1. I have the honor to acknowledge...????ts had reached y… | ✅ |  |
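
A duplicate check in the spirit of query 103 can be sketched as a single grouped count. This is an illustration under stated assumptions, not the query that was run: it uses an in-memory SQLite table named `stmt` as a stand-in for `donto_statement`, the column names are hypothetical, and a nullable `tx_upper` column stands in for the `upper(tx_time) IS NULL` liveness test. The rows and context names are toy data.

```python
import sqlite3

# Hypothetical, simplified stand-in for donto_statement (column names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stmt (subject TEXT, predicate TEXT, object TEXT, "
    "context TEXT, tx_upper TEXT)"
)

# Toy rows: tx_upper IS NULL marks a live statement.
rows = [
    ("ex:brisbane", "rdfsLabel", "Brisbane", "ctx:a", None),
    ("ex:brisbane", "rdfsLabel", "Brisbane", "ctx:b", None),
    ("ex:brisbane", "rdfsLabel", "Brisbane", "ctx:c", "2026-06-01"),  # not live
    ("ex:nmp", "rdfsLabel", "Native Mounted Police", "ctx:a", None),
    ("ex:nmp", "rdfsLabel", "Native Mounted Police", "ctx:b", None),
    ("ex:nmp", "rdfsLabel", "Native Mounted Police", "ctx:c", None),
    ("ex:gangulu", "rdfsLabel", "Gangulu", "ctx:a", None),
]
conn.executemany("INSERT INTO stmt VALUES (?, ?, ?, ?, ?)", rows)


def duplicate_facts(conn, contexts):
    """Return live (subject, predicate, object) triples that occur more than once."""
    placeholders = ",".join("?" * len(contexts))
    sql = f"""
        SELECT subject, predicate, object,
               COUNT(*) AS n, COUNT(DISTINCT context) AS n_ctx
        FROM stmt
        WHERE tx_upper IS NULL            -- live statements only
          AND context IN ({placeholders}) -- always scope by context
        GROUP BY subject, predicate, object
        HAVING COUNT(*) > 1
        ORDER BY n DESC, subject
    """
    return conn.execute(sql, list(contexts)).fetchall()


dupes = duplicate_facts(conn, ["ctx:a", "ctx:b", "ctx:c"])
for row in dupes:
    print(row)
```

Comparing `n` with `n_ctx` separates the expected case (one copy per context, as with shared entities such as place labels) from repeats inside a single context, which would be more suspicious.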

## Conclusion & implications

donto **works** as a queryable, paraconsistent, evidence-first substrate, and abundance extraction earns its keep — it answers questions (held contradictions, cross-source death-toll drift, decoded euphemisms, reprisal causation, prosopography) that a vector store or winner-takes-all KG structurally cannot. The honest caveats are concrete and actionable, not fatal:

1. **Evidence anchoring is SOUND: every claim dereferences to its source snippet** (CORRECTED). The original report's #1 "bad" finding, *"evidence links are wired at the weakest tier (point at the run; `span=0`, `doc=0`; not dereferenceable)"*, was **FALSE**: it was a measurement artifact caused by comparing a UUID column against integer `0`. Re-verified on live `donto-pg`: of the 15,078 live evidence links, **every one points at a `target_span_id`** (none at `target_run_id` or `target_document_id`); every span carries real `surface_text` plus byte offsets and resolves `span → revision → document`. So `claim → exact source snippet` is already a join, not a string-parse, and donto's `fact → evidence_link → span → revision → blob` model is met at the span tier. The remaining gap is narrower and is covered in item #3: *cross-source* attribution (which of several sources made a claim) is in-band, not a reified `claim → source` edge.
2. **Predicate fragmentation is real.** The same concept appears under multiple spellings (`rdfType`=899 vs `rdf:type`=95) and as positional pseudo-arrays (`layer1..6`, `nameVariation1..4`). A consumer **must normalize at query time**, which is exactly donto's defer-to-query-time thesis, but it also means a query-time alignment/dedup layer is a hard requirement rather than an optional extra.
3. **Cross-source attribution should be reified.** Evidence anchoring is already sound (item #1), but *which source* made a claim is baked into predicate names (`killCountPerCapricornian5Sept1925`) rather than carried by a reified `claim → source` edge, so cross-source joins ("who reported which toll?") degenerate into Cartesian products. A `claim → source` edge would fix it. **This is now the highest-value provenance fix**: it replaces the original report's withdrawn "weakest-tier" claim as the real provenance gap.
4. **Abundance is a feature for a smart (LLM/human) consumer and a liability for a naive one** — which is precisely donto's stated design point. The 98% query pass-rate and the derived historiographic meta-facts (`deathCountTrendOverTime = 'increasing (2 in 1855 to 12 in 1913)'`) are the proof it's worth it.
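
Items 2 and 3 both come down to rewriting predicate names at query time. The following is a minimal sketch of such a layer, not donto's implementation. The alias table holds only the one pair named above. The two regular expressions are heuristics inferred from the examples in this report (`nameVariation1..4`, `layer1..6`, `killCountPerCapricornian5Sept1925`). The function names are hypothetical.

```python
import re

# Assumed alias table; only this pair is taken from the report.
ALIASES = {"rdfType": "rdf:type"}

# Positional pseudo-arrays: an all-letter base plus a 1-2 digit index (layer6).
_POSITIONAL = re.compile(r"^([A-Za-z:]*[A-Za-z])(\d{1,2})$")

# Source baked into the name: <base>Per<Source> (heuristic, can misfire).
_SOURCED = re.compile(r"^([a-z][A-Za-z]*?)Per([A-Z][A-Za-z0-9]*)$")


def normalize_predicate(predicate):
    """Return (canonical, position, source); the last two are None when absent."""
    predicate = ALIASES.get(predicate, predicate)
    match = _SOURCED.match(predicate)
    if match:
        return match.group(1), None, match.group(2)
    match = _POSITIONAL.match(predicate)
    if match:
        return match.group(1), int(match.group(2)), None
    return predicate, None, None


def reify_sources(triples):
    """Turn (s, p, o) into (s, canonical_p, o, source) so the source is joinable."""
    claims = []
    for subject, predicate, obj in triples:
        canonical, _position, source = normalize_predicate(predicate)
        claims.append((subject, canonical, obj, source))
    return claims


EVENT = "ex:attack-wilmot-lagoon-near-mt-larcomb-1855"
claims = reify_sources([
    (EVENT, "killedCountPerMurrayLetter", "11"),
    (EVENT, "killCountPerCapricornian5Sept1925", "over 100"),
])
for claim in claims:
    print(claim)
```

With the source lifted into its own column, "who reported which toll?" becomes a filter on that column rather than a join across predicate names. The sketch leaves `killedCount` and `killCount` as distinct predicates, so the alias table would still need to grow, and a name such as a hypothetical `countPerPerson` would be split wrongly; a real alignment layer would need curated mappings rather than regular expressions alone.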

*Generated 2026-06-03 from a 22-agent, 105-query workflow over the live donto substrate.*