genes.apexpots.com / research source: donto-status-2026-05-28.md

donto — status snapshot (2026-05-28)

Self-orientation pass. Branch main @ eb23b78, working tree clean, ahead of remote by 1 commit. All four prod services healthy.

One-paragraph framing

donto is a bitemporal, paraconsistent quad store recast in the PRD as an "evidence operating system for contested knowledge." It is a Postgres extension (pg_donto, pgrx) plus a Rust workspace shipping an HTTP sidecar (dontosrv), CLI (donto), TUI, and Lean 4 overlay. The native query language is DontoQL; a SPARQL 1.1 subset is also supported. Every statement is evidence-backed, filed under a context, and carries both valid_time and tx_time; contradictions are preserved as data rather than rejected. Language documentation is the formal first proving domain; genes (~39M statements about North-Queensland genealogy, oral histories, DNA matches) is the live exercise.
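The bitemporal, contradiction-preserving model described above can be illustrated with a toy in-memory store. This is a minimal sketch, not donto's schema or API: the `Statement` fields, the `QuadStore` class, and the `ex:`/`ctx:`/`src:` identifiers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Statement:
    """One quad plus both time axes. Field names are illustrative only."""
    subject: str
    predicate: str
    obj: str
    context: str                 # the context the statement is filed under
    evidence: str                # pointer to the backing source (hypothetical form)
    valid_from: Optional[date]   # valid_time: when the claim holds in the world
    valid_to: Optional[date]
    tx_time: datetime            # tx_time: when the store recorded the claim


class QuadStore:
    """Append-only toy store: conflicting statements are kept side by side."""

    def __init__(self) -> None:
        self._rows: List[Statement] = []

    def assert_statement(self, stmt: Statement) -> None:
        # No uniqueness or consistency check: a contradiction is just more data.
        self._rows.append(stmt)

    def claims(
        self, subject: str, predicate: str, as_of: Optional[datetime] = None
    ) -> List[Statement]:
        """Return every claim for (subject, predicate) recorded at or before `as_of`."""
        return [
            s for s in self._rows
            if s.subject == subject
            and s.predicate == predicate
            and (as_of is None or s.tx_time <= as_of)
        ]


# Two sources disagree about a birthplace; both claims are stored.
jan = datetime(2026, 1, 1, tzinfo=timezone.utc)
feb = datetime(2026, 2, 1, tzinfo=timezone.utc)
store = QuadStore()
store.assert_statement(
    Statement("ex:person/1", "ex:bornIn", "ex:place/a", "ctx:one", "src:1", None, None, jan)
)
store.assert_statement(
    Statement("ex:person/1", "ex:bornIn", "ex:place/b", "ctx:two", "src:2", None, None, feb)
)
```

Querying `store.claims("ex:person/1", "ex:bornIn")` returns both claims, while passing an `as_of` between the two transaction times returns only the earlier one. Filtering on tx_time answers "what did the store know then"; a valid_time filter would be a separate, analogous predicate.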

Live state

donto_statement rows: 39.3M
Distinct predicates: 938.9k
Top contexts: genes/research-db 21.8M → genes/smoketest 4.1M → genes/analysis-db 3.8M → genes-family-trees 1.2M
donto_* tables: 71
Highest migration: 0131 object_iri_trgm.sql
Tripwire tests: 77 files in packages/donto-client/tests/
Services up: dontosrv:7879, donto-api:8000, donto-api-worker, donto-debug:3002

localhost:7879/health → ok. localhost:8000/health → { status: ok, dontosrv: ok }. (Daemon reload pending on a few units: file mtime drift, not an active failure.)
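A health probe for the two endpoints above might look like the following sketch. It assumes the response shapes shown in this snapshot (a bare `ok` body from dontosrv, a JSON object of `ok` values from donto-api); neither shape is a documented contract, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def is_healthy(body: str) -> bool:
    """True if a /health body reports ok.

    Accepts a bare "ok", a JSON string "ok", or a non-empty JSON object
    whose values are all "ok". These shapes are assumptions.
    """
    text = body.strip()
    if text.lower() == "ok":
        return True
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return False
    if payload == "ok":
        return True
    if not isinstance(payload, dict) or not payload:
        return False
    return all(value == "ok" for value in payload.values())


def fetch_body(url: str, timeout: float = 2.0) -> str:
    """Fetch a URL and return its body as text (raises on connection errors)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def probe(url: str) -> bool:
    """Treat an unreachable service as unhealthy rather than raising."""
    try:
        return is_healthy(fetch_body(url))
    except OSError:
        return False
```

Calling `probe("http://localhost:7879/health")` and `probe("http://localhost:8000/health")` would cover both services. Requiring every value in the JSON object to be `ok` means a degraded dontosrv dependency fails the donto-api check as well.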

Milestone position (PRD M0–M9)

Recent trajectory (last 3 weeks)

eb23b78  feat: predicate fragmentation endpoint + cost budgets + align activity rewire
20a158e  feat: predicate alignment + context-spans + conceivable quarantine
5f8d957  docs: ROADMAP-AFTER-MAY18 — deferred items from the infra review
3fb4f45  feat: entity-merge endpoint + data-hygiene polish
2b16a13  trace: Stage D.6 disambiguation pass
3bd2ac7  chore: switch default extraction model to z-ai/glm-5
947963b  feat: GET /context-facts
9cae9bc  fix: /search indexes object_iri too
72969cd  fix: /extract — register source + persist anchors
5928bff  feat: anchor-aware ingest + exhaustive-by-default extraction
31c519b  feat: vocab-aware extraction — stop minting fresh predicates

Theme: predicate alignment + extraction hygiene + provenance trace (Stage D), then a hop to context-spans and cost budgets. HEAD~10 diffstat: 13 files, +1249/-77.

Open items (where the next push lands)

From REVIEW-FINDINGS.md, ROADMAP-AFTER-MAY18.md, ROADMAP-NEXT.md, and EXTRACTION-MAXIMALISM.md:

High-leverage data hygiene (1–2 weeks):

Medium-leverage infrastructure (2–4 weeks):

Trust Kernel HTTP wiring (F-1 follow-on): SQL substrate for donto_register_source + policy_id exists; HTTP middleware that enforces it on write paths does not. Genes has hundreds of unpoliced sources — natural end-to-end testbed.
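The missing enforcement step can be sketched as a pure decision function that an HTTP middleware would call before a write reaches the store. Everything here is an assumption for illustration: the function name, the status codes, and the `registered_policies` mapping, which stands in for whatever donto_register_source records on the SQL side.

```python
from typing import Mapping, Optional, Tuple

# Methods treated as write paths in this sketch (an assumption).
WRITE_METHODS = frozenset({"POST", "PUT", "PATCH", "DELETE"})


def authorize_write(
    method: str,
    source_id: Optional[str],
    registered_policies: Mapping[str, Optional[str]],
) -> Tuple[int, str]:
    """Return (HTTP status, reason) for a request.

    `registered_policies` maps a registered source id to its policy_id,
    or to None when the source is registered but has no policy attached.
    """
    if method.upper() not in WRITE_METHODS:
        return 200, "read path: not policed"
    if not source_id:
        return 400, "write rejected: no source given"
    if source_id not in registered_policies:
        return 403, "write rejected: source is not registered"
    if not registered_policies[source_id]:
        return 403, "write rejected: source has no policy_id"
    return 200, "write allowed"


# Hypothetical registry: one policed source, one unpoliced source.
registry = {"src:policed": "policy:default", "src:unpoliced": None}
```

With this registry, a POST citing `src:policed` is allowed, while one citing `src:unpoliced` or an unknown source is refused. Keeping the decision separate from the web framework makes it straightforward to replay against the existing unpoliced genes sources before enforcement is switched on.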

Domain / overlay:

Workspace shape

packages/ (22): substrate (sql, pg_donto, donto-blob, donto-ingest); query + extraction (donto-query, donto-client, donto-trace); 5 linguistic importers (donto-ling-{cldf,ud,unimorph,lift,eaf}); ops (donto-alert-sink, donto-analytics, donto-release, donto-migrate, donto-synthetic); frontend (client-ts, tsconfig); lean overlay.

apps/ (5): donto-cli (Rust — extract / ingest / query / migrate / release / analyze / cite / bench / man / completions), dontosrv (Rust HTTP gateway, :7879), donto-api (FastAPI + Temporal, :8000), donto-tui (refreshed May 2026), docs.

Non-negotiables (still load-bearing)

Map of docs (where to dig in next)