genes.apexpots.com / research source: donto-lens-engine-appendix-2026-06-01.md

The Lens Engine — Research Appendix (2026-06-01)

The Lens Engine — Research Appendix (raw findings)

Companion to the lens-engine report. Structured output of the 10-area study + 4 adversarial critiques (2026-06-01).


Critiques (adversarial)

PARTIALLY-HOLDS (confidence 0.72)

Claim: Agentic MANY-LENS decomposition (philosophical/linguistic/temporal/causal/ethical/...) discovers genuinely NOVEL and VALUABLE inter-entity relationships that existing methods (KG embedding link-prediction, literature-based discovery, analogy mining) do NOT already find.

PARTIALLY-HOLDS (confidence 0.72)

Claim: "The discovery signal is real and not fatally drowned by combinatorial noise — there is a workable precision/ranking story (novelty × plausibility × value) that surfaces the rare gold rather than producing a hallucinated mess."

PARTIALLY-HOLDS (confidence 0.72)

Claim: donto's paraconsistent + evidence-anchored + bitemporal + identity-as-hypothesis substrate is genuinely the RIGHT home for machine-generated relationship-hypotheses — materially better than a vector DB + reranker — because it can hold contradictory speculative edges, anchor them to evidence, and certify/verify the survivors.

PARTIALLY-HOLDS (confidence 0.6)

Claim: The 'lens engine' vision is fundamentally an ADVANCE over (not a rebrand of) Swanson literature-based discovery and KG completion — the agentic + many-deep-lenses + hold-and-verify-on-a-paraconsistent-substrate combination is genuine white space.


Area findings

literature-based-discovery

Literature-Based Discovery (LBD) is the single most direct intellectual ancestor of the founder's "find connections nobody made" vision, and it is far more developed than most people realize. Its founding insight, Don R. Swanson's 1986 concept of "undiscovered public knowledge," is precisely the founder's premise: knowledge that is logically derivable from the union of two existing bodies of literature, but that no single human ever assembled because no one read both literatures. Swanson formalized this as the ABC syllogism: if one literature links A to B (e.g., Raynaud's disease involves elevated blood viscosity/platelet aggregation) and a separate, non-co-citing literature links B to C (fish oil/eicosapentaenoic acid reduces blood viscosity), then a plausible, untested A-C link (fish oil treats Raynaud's) exists in the "complementary but disjoint" literatures. He published the fish-oil/Raynaud's hypothesis in 1986, and it was clinically confirmed by a 1989 trial (DiGiacomo); his 1988 "Migraine and magnesium: eleven neglected connections" produced 11 indirect links supporting magnesium-deficiency→migraine, later clinically supported. These are existence proofs that the method generates real, non-obvious, testable discoveries from text alone.

The field splits discovery into two modes that map cleanly onto the founder's two use-cases. Open discovery ("serendipity mode"): start from A, find all B intermediates, rank all candidate C's — a fan-out search for unexpected endpoints. Closed discovery ("verification mode"): given a fixed A and C (a hypothesis you already suspect), find and rank the B-paths that would explain/support it. donto's "generate many speculative relationships then verify the valuable few" is exactly open-then-closed discovery. The hard engineering problem LBD has wrestled with for 40 years is ranking: open discovery generates a combinatorially overwhelming candidate list, so the entire literature is essentially a competition over scoring functions for "which of these millions of latent links is worth a human's attention." Classic systems (Swanson & Smalheiser's Arrowsmith, Hristovski's BITOLA, Petric's RaJoLink, Weeber's concept-based DAD-system) rank B/C candidates by frequency, tf-idf, and co-occurrence association measures; LION LBD (Pyysalo/Cambridge, 2019) added a rich menu — Jaccard, normalized PMI, symmetric conditional probability, chi-squared, log-likelihood — over a graph of ~27M PubMed abstracts with NER-grounded entities. The recurring lesson, brutally relevant to donto: even with good ranking, the true target often sits at rank 56–299 (closed) or rank 15–120,000 (open) in LION's own evaluation — i.e., the signal is real but buried, and precision of ranking is the entire game.
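The two modes described above can be sketched on a toy term graph. This is a minimal illustration, not any named system's algorithm: the link list is invented, the function names are hypothetical, and ranking C candidates by the number of distinct B bridges stands in for the richer association measures (Jaccard, normalized PMI, and so on) that real LBD tools use.

```python
from collections import defaultdict

# Toy term links; every entry is illustrative, not real corpus data.
LINKS = [
    ("raynaud", "blood_viscosity"),
    ("raynaud", "platelet_aggregation"),
    ("raynaud", "cold_exposure"),
    ("fish_oil", "blood_viscosity"),
    ("fish_oil", "platelet_aggregation"),
    ("aspirin", "platelet_aggregation"),
]


def neighbours(links):
    """Build an undirected adjacency map from term pairs."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def open_discovery(graph, a):
    """Fan out A -> B -> C; rank unlinked C terms by their number of B bridges."""
    bridges = defaultdict(set)
    for b in graph[a]:
        for c in graph[b]:
            if c != a and c not in graph[a]:
                bridges[c].add(b)
    return sorted(bridges.items(), key=lambda kv: (-len(kv[1]), kv[0]))


def closed_discovery(graph, a, c):
    """Given a suspected A-C pair, list the B terms linked to both."""
    return sorted(graph[a] & graph[c])


graph = neighbours(LINKS)
for candidate, via in open_discovery(graph, "raynaud"):
    print(candidate, sorted(via))
print(closed_discovery(graph, "raynaud", "fish_oil"))
```

Even on six links the open pass returns a candidate ("aspirin") that the closed pass would never have been asked about, which is the fan-out behaviour that makes ranking the dominant cost at corpus scale.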

The modern shift (roughly 2018→present) moved LBD from explicit co-occurrence to learned representations. SemMedDB/Semantic MEDLINE (Kilicoglu, Rindflesch, NLM) replaced raw co-occurrence with ~130M typed semantic predications (subject-predicate-object triples like "Drug-X TREATS Disease-Y") extracted by the SemRep parser, enabling discovery over a typed knowledge graph rather than bag-of-terms — a direct precursor to donto's quad/predicate structure (note: NLM deprecated SemMedDB on 31 Dec 2024). Then knowledge-graph-embedding methods (TransE, RDF2Vec, complex link prediction) and contextual embeddings (BioBERT-based, temporal-difference embeddings) reframed open discovery as link prediction on a literature KG — and crucially adopted time-sliced evaluation: train on literature before year Y, test whether the model predicts links that were actually published after Y. This is the field's hard-won, honest evaluation protocol and donto should adopt it directly. Embedding methods improved recall of plausible links but lost interpretability (you get a score, not a B-path), spawning a tension between ranked-list quality and explainability that remains unresolved.
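A time-sliced evaluation of the kind described above can be sketched as follows. This is a hedged toy: the edge years are illustrative, `hits_at_k` and its helpers are hypothetical names, and a shared-neighbour count stands in for whatever learned scorer is actually being evaluated.

```python
from itertools import combinations

# (term, term, year first reported); years are illustrative only.
EDGES = [
    ("raynaud", "blood_viscosity", 1980),
    ("aspirin", "platelet_aggregation", 1981),
    ("fish_oil", "blood_viscosity", 1982),
    ("raynaud", "platelet_aggregation", 1983),
    ("fish_oil", "platelet_aggregation", 1984),
    ("fish_oil", "raynaud", 1989),
]


def split_by_year(edges, cutoff):
    """Training links are those reported before the cutoff; the rest are held out."""
    train = {frozenset((h, t)) for h, t, year in edges if year < cutoff}
    future = {frozenset((h, t)) for h, t, year in edges if year >= cutoff} - train
    return train, future


def adjacency(pairs):
    """Undirected adjacency map; assumes every pair has two distinct endpoints."""
    adj = {}
    for pair in pairs:
        a, b = sorted(pair)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def hits_at_k(edges, cutoff, k):
    """Fraction of the top-k predicted links that appear after the cutoff."""
    train, future = split_by_year(edges, cutoff)
    adj = adjacency(train)
    candidates = [
        frozenset(p) for p in combinations(sorted(adj), 2) if frozenset(p) not in train
    ]

    def score(pair):
        # Stand-in scorer: shared neighbours in the pre-cutoff graph.
        a, b = sorted(pair)
        return len(adj[a] & adj[b])

    ranked = sorted(candidates, key=lambda p: (-score(p), sorted(p)))
    return sum(1 for p in ranked[:k] if p in future) / k


print(hits_at_k(EDGES, cutoff=1986, k=2))
```

The scorer never sees the post-cutoff link, so a hit is a genuine prediction rather than a lookup; swapping in a different scorer only requires replacing the inner `score` function.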

The frontier (2024-2026) is finally agentic and partially multi-perspective — which is where the founder's vision overlaps most and is least uniquely novel. Markus Buehler's SciAgents (MIT, 2024/2025, Advanced Materials) samples paths through a large ontological knowledge graph (built from ~1,000 papers, 33K nodes/49K edges) and runs a multi-agent team — an Ontologist that defines the concepts on the path, Scientist agents that draft a hypothesis spanning the path, and a Critic that adversarially reviews — to surface cross-domain connections (e.g., silk ↔︎ energy-intensive materials) that classical ABC could never reach because the link is a multi-hop, multi-domain narrative rather than a single B-term. Google DeepMind's AI Co-Scientist (Nature, 2025/2026) runs Generation/Reflection/Ranking/Evolution/Proximity/Meta-review agents with an Elo tournament of "scientific debates" to rank competing hypotheses, grounded in literature + ChEMBL/UniProt, and produced experimentally validated results (a liver-fibrosis drug-repurposing candidate that blocked ~91% of a scarring response at Stanford; an antimicrobial-resistance mechanism that matched years of unpublished lab work at Imperial). These systems have already operationalized "agents break a problem down, propose cross-domain links, then critique/rank them" at impressive quality. What they have NOT done is (a) decompose entities through the full spectrum of human analytical lenses (they are scientifically/biomedically scoped — causal/mechanistic, not philosophical/aesthetic/semiotic/teleological), (b) treat identity as a query-time hypothesis, or (c) persist the millions of rejected/speculative/contradictory links as durable, evidence-anchored legal state. They generate, rank, surface the top few, and discard the rest. That discard is donto's white space.
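The tournament-style ranking mentioned above can be illustrated with the standard Elo update. This is only a sketch of the general rating scheme, not the AI Co-Scientist's implementation: the K-factor of 32, the starting rating of 1200, and all function and hypothesis names are assumptions made for the example.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, a_won, k=32.0):
    """Return the new (A, B) ratings after one pairwise comparison."""
    e_a = expected_score(rating_a, rating_b)
    s_a = 1.0 if a_won else 0.0
    delta = k * (s_a - e_a)
    return rating_a + delta, rating_b - delta


def run_tournament(hypotheses, debates, start=1200.0):
    """debates: iterable of (winner, loser) pairs judged by some critic."""
    ratings = {h: start for h in hypotheses}
    for winner, loser in debates:
        ratings[winner], ratings[loser] = elo_update(ratings[winner], ratings[loser], True)
    return sorted(ratings, key=ratings.get, reverse=True), ratings


order, ratings = run_tournament(
    ["h1", "h2", "h3"],
    [("h1", "h2"), ("h1", "h3"), ("h2", "h3")],
)
print(order)
```

Because every compared hypothesis keeps a rating, the losers remain rankable later, which is the property the paragraph above contrasts with surface-the-top-few-and-discard pipelines.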

Foundational works:

Modern AI systems:

Relevance to the lens engine: BORROW: (1) The open/closed discovery distinction is donto's exact two-mode architecture — open discovery = generate speculative cross-lens links; closed discovery = the verification/curation pass on a suspected link. Name and build both explicitly. (2) Time-sliced evaluation is the field's hard-won, non-gameable validation protocol: train donto on the corpus pre-year-Y, measure whether its top-ranked machine-proposed links were later asserted/published. This is how you prove the engine works without manual labeling, and it is essentially the only honest LBD metric. (3) Convergent multi-path scoring (Swanson's 'eleven connections', LION's path-accumulation functions): a cross-lens link is far more credible when MULTIPLE independent lenses surface it — donto should rank a hypothesis-link by how many of its lenses converge on it, not by any single lens's confidence. (4) Typed predications over bag-of-words (SemMedDB lesson) — donto's quad structure already has this advantage; preserve typed argument edges. (5) Drill-down to the source byte: LION's and Arrowsmith's usability came from letting a human see the originating sentence — donto's evidence-anchoring is the same affordance and is essential for trust. AVOID: (1) The ranking-precision trap — LION honestly shows true links buried at rank 15-120,000 in open mode; raw fan-out without a strong scorer produces an unusable haystack. donto's many-lens combinatorics will be far worse than ABC's single-B-term blowup, so ranking/pruning is THE make-or-break problem, not generation. (2) Co-occurrence-only signal generates spurious links; prefer typed/argumentative edges and require explanatory paths. (3) The reproducibility crisis — pin corpora, seeds, and evaluation splits from day one. (4) Don't discard the rejected candidates the way SciAgents/Co-Scientist do — that persistence IS donto's differentiator (see white space).
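The convergent multi-path idea in BORROW (3) can be sketched as a ranking that counts distinct lenses before it looks at any single lens's confidence. The tuple layout, lens names, and `rank_by_convergence` are hypothetical, chosen only for illustration; donto's actual schema is not assumed here.

```python
from collections import defaultdict


def rank_by_convergence(proposals):
    """proposals: iterable of (lens, source, target, confidence).

    Links are ordered by how many distinct lenses proposed them, with the
    best single-lens confidence used only as a tie-breaker.
    """
    lenses = defaultdict(set)
    best = defaultdict(float)
    for lens, source, target, confidence in proposals:
        key = (source, target)
        lenses[key].add(lens)
        best[key] = max(best[key], confidence)
    return sorted(lenses, key=lambda k: (-len(lenses[k]), -best[k], k))


PROPOSALS = [
    ("causal", "a", "b", 0.40),
    ("temporal", "a", "b", 0.30),
    ("ethical", "a", "c", 0.95),
    ("causal", "a", "b", 0.50),  # same lens again: does not add convergence
]
print(rank_by_convergence(PROPOSALS))
```

Here the link backed by two modest lenses outranks the one backed by a single highly confident lens, which is the behaviour the recommendation asks for.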

Already done vs white space: ALREADY DONE (the founder should NOT assume 'no one has thought to do this'): The core thesis — that machine-assembled connections across disjoint knowledge can constitute genuine, validated discovery — is 40 years old and clinically proven (Swanson 1986/1988). Open/closed discovery, candidate-link ranking, scaling to ~27-37M documents, typed-predication knowledge graphs, KG-embedding link prediction with time-sliced evaluation, and now multi-agent path-sampling + adversarial critique + tournament ranking (SciAgents, Co-Scientist) are all built and, in the agentic case, producing wet-lab-validated discoveries in 2025-2026. 'Agents decompose a problem, propose cross-domain links, and critique/rank them' is effectively the state of the art, not white space. GENUINE WHITE SPACE for donto: (1) MANY-LENS DECOMPOSITION as the link-generation substrate. Every LBD system to date is mono-perspective — biomedical/causal/mechanistic. None decomposes entities through the full spectrum (philosophical, semiotic, teleological, aesthetic, phenomenological, ethical, mereological, ecological) and then mines link candidates at lens INTERSECTIONS. LBD finds A-B-C bridges within one ontology; donto's bet is that the richest unmade connections live between incommensurable lenses, which no LBD system attempts. (2) PERSISTING the rejected/speculative/contradictory candidates as durable, evidence-anchored, paraconsistent legal state. Every LBD/agentic system generates millions of candidates, surfaces the top-k, and throws the rest away. donto's hypothesis_only + contradiction-frontier + supports/rebuts/undercuts edges let the discarded 99.9% remain queryable forever and be re-ranked as the corpus and lenses evolve — a standing 'latent-structure reservoir' no LBD system maintains. (3) IDENTITY-AS-HYPOTHESIS at query time. 
LBD assumes entity grounding/NER is settled before discovery; donto lets the merge itself be a weighted, lens-dependent hypothesis — which is where many false LBD links actually come from (spurious co-occurrence of ambiguously-grounded entities). (4) Lean-4 certification of the rare valuable link — formal shape/rule verification of a discovered relationship has no analog in LBD, which validates only empirically/clinically.

Hard problems:

bisociation-computational-creativity

The founder's intuition — "connections no one thought of, because no one holds all the lenses at once" — is, almost word for word, Arthur Koestler's BISOCIATION. In "The Act of Creation" (1964), Koestler argues every creative act (the comic Haha, the scientific Aha, the artistic Ah) shares one structure: perceiving a situation or idea simultaneously in two self-consistent but HABITUALLY INCOMPATIBLE "matrices" / frames of reference (M1, M2). Ordinary thought is ASSOCIATION — moving within a single plane/matrix. Creativity is BISOCIATION — the collision/fusion of two planes that normally never touch. The single richest precedent: the value is not the facts inside a frame, it is the relation that springs from intersecting two frames. This is the founder's thesis, stated in 1964.

Koestler's idea was operationalized into a real computational program by the EU FP7 BISON project (2008-2012), summarized in Michael Berthold (ed.), "Bisociative Knowledge Discovery: An Introduction to Concept, Algorithms, Tools, and Applications" (Springer LNCS 7250, 2012; 32 chapters; consortium incl. Berthold/Konstanz, Nada Lavrac & Dunja Mladenic/Jozef Stefan Institute, Werner Dubitzky, Christian Borgelt). They formalized bisociation as discovery of bridges between weakly-connected or disjoint "domains" inside a heterogeneous graph called a BisoNet (Bisociative Information Network: nodes are concepts/units from many sources, edges are evidential relations). Three computational TYPES of bisociation were distinguished: (1) bridging CONCEPTS (a single term/node co-occurring in two otherwise-unlinked domains, the classic b-term); (2) bridging by STRUCTURAL SIMILARITY (two subgraphs in different domains share an isomorphic relational pattern, i.e. analogy); (3) bridging GRAPHS (a connecting path/subgraph that links two domains). Crucially they tried to make "bisociativeness" a RANKABLE score, distinguishing a genuinely surprising cross-domain link from a trivially common one.

The concrete, working instantiation is CrossBee (Cross-Context Bisociation Explorer; Jursic, Cestnik, Urbancic, Lavrac, ICCC 2012; http://crossbee.ijs.si). You feed it two document sets from two domains (e.g. two non-interacting literatures); it ranks candidate BRIDGING TERMS ("b-terms") by a BISOCIATION SCORE computed as an ENSEMBLE of text-mining heuristics (frequency, tf-idf, outlier-ness, appearance in both domains, etc.) voting together, then offers side-by-side document inspection so a human EXPERT verifies the link. This is a direct ancestor of donto's lens-engine: machine over-generates candidate cross-context links; human (or downstream certifier) disposes. Its intellectual root is older still — Don Swanson's LITERATURE-BASED DISCOVERY ("Undiscovered Public Knowledge," 1986): two literatures (Raynaud's disease and fish oil; migraine and magnesium) never co-cited, but logically linked through a shared bridging concept B (blood viscosity), yielding a testable A-C hypothesis later clinically confirmed. Swanson's ABC model (A-B + B-C therefore maybe A-C) is the canonical computational template for "relationships no one drew because the two literatures were isolated."
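The ensemble-of-heuristics idea can be sketched as simple voting: each heuristic ranks the shared terms and votes for its top fraction, and terms are ordered by votes received. This is not CrossBee's actual heuristic set or voting rule; the three heuristics, the `top_fraction` parameter, the function name, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter


def bridging_term_scores(docs_a, docs_b, top_fraction=0.5):
    """Rank terms that occur in both domains by votes from simple heuristics."""
    count_a = Counter(w for d in docs_a for w in d.split())
    count_b = Counter(w for d in docs_b for w in d.split())
    shared = sorted(set(count_a) & set(count_b))  # b-term candidates

    # Hypothetical heuristics; real systems use many more (tf-idf, outlier-ness, ...).
    heuristics = {
        "min_freq": lambda w: min(count_a[w], count_b[w]),
        "balance": lambda w: min(count_a[w], count_b[w]) / max(count_a[w], count_b[w]),
        "rarity": lambda w: 1.0 / (count_a[w] + count_b[w]),
    }

    votes = Counter({w: 0 for w in shared})
    n_top = max(1, int(len(shared) * top_fraction))
    for score in heuristics.values():
        # Each heuristic votes for its own top-ranked candidates.
        for w in sorted(shared, key=lambda t: (-score(t), t))[:n_top]:
            votes[w] += 1
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


DOMAIN_A = ["migraine stress magnesium", "migraine serotonin stress"]
DOMAIN_B = ["magnesium serotonin calcium", "magnesium stress calcium"]
print(bridging_term_scores(DOMAIN_A, DOMAIN_B))
```

The output is a ranked list for a human to inspect, not a verdict; the voting step only decides what the expert looks at first.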

Running alongside bisociation is CONCEPTUAL BLENDING / conceptual integration (Gilles Fauconnier & Mark Turner, "The Way We Think," 2002). Where Koestler collides two frames, blending integrates them: two (or more) INPUT mental spaces selectively project into a BLENDED space via a GENERIC space, and the blend develops EMERGENT structure not present in either input (their canonical example: "the Buddhist monk" riddle; "this surgeon is a butcher"). Blending has been computationally modeled: Joseph Goguen's algebraic/category-theoretic Unified Concept Theory; Pereira's DIVAGO (2005, optimality-principle metrics); and the EU COINVENT project (Schorlemmer, Kutz, Confalonieri, Pease, et al., 2014-2016) which models blends as AMALGAMS (knowledge-transfer via colimits in category theory) and applied them to mathematical concept invention and music. The third lens is Margaret Boden ("The Creative Mind," 1990; "Creativity and Art," 2010): creativity comes in three kinds — COMBINATIONAL (novel combinations of familiar ideas — bisociation/blending live here), EXPLORATORY (finding unvisited points inside an existing conceptual space / set of rules), and TRANSFORMATIONAL (changing the rules of the space so previously-impossible ideas become thinkable). The donto vision spans all three: many lenses = many conceptual spaces; cross-lens intersection = combinational; pushing a lens "to the utmost" = exploratory; a lens that rewrites another's assumptions = transformational. 
Modern LLM work has revived all of this: PopBlends (Petridis et al., CHI 2023 — LLM+knowledge-base conceptual blends for design), LiveIdeaBench (2024-25, benchmarking LLM divergent thinking on single-keyword scientific idea generation), Nature/Sci-Reports studies showing LLMs reach population-average creativity but not top-decile humans, and multi-agent hypothesis engines (Google's AI Co-Scientist 2025, SciMON ACL 2024, Sakana AI Scientist) that generate-debate-rank-evolve cross-literature hypotheses — exactly the "agentic many-lens generate-then-verify" loop, but without a paraconsistent substrate to hold the rejected/contradictory candidates.

Foundational works:

Modern AI systems:

Relevance to the lens engine: BORROW: (1) The vocabulary and metrics — frame donto's payoff explicitly as bisociation (Koestler) and conceptual blending (Fauconnier-Turner): a discovered relationship is valuable in proportion to how HABITUALLY INCOMPATIBLE its two source lenses/domains are. Steal CrossBee's idea of a rankable BISOCIATION SCORE computed as an ENSEMBLE of heuristics over a heterogeneous graph — donto already is that graph (a BisoNet by another name). (2) Swanson's ABC bridging template is the cleanest first product: surface A-B and B-C claims sitting in two different ctx:* contexts/lenses that are never co-cited, propose A-C as a hypothesis_only edge, anchor B as the bridge with evidence. (3) Boden's taxonomy gives a roadmap and honest framing: most cross-lens output will be combinational; treat exploratory (push one lens to its limit) and transformational (one lens rewrites another's assumptions) as harder, rarer, higher-value tiers. (4) COINVENT/Divago teach that the bottleneck is SELECTION/optimality among a combinatorial explosion of blends — design for ranking and pruning from day one, not generation. AVOID / be warned: (a) The b-term/blend space explodes combinatorially; CrossBee, LBD tools, and blending systems ALL hit the same wall — far more candidate links than any human can review, and no agreed way to tell signal from noise. Donto's edge must be that its paraconsistent substrate can HOLD the explosion as legal hypothesis_only state forever (where prior systems had to discard), with the Lean-4 overlay + evidence-anchoring + argument edges (supports/rebuts/undercuts) as the eventual VERIFY/prune mechanism — this is genuinely the missing piece in every prior system. (b) LLM creativity clusters/homogenizes (Nature 2025); a single agent run over many lenses risks producing samey 'connections.' 
Force lens-diversity structurally (distinct prompts/personas/conceptual spaces per lens, as the AI Co-Scientist's specialized agents do) and measure diversity, not just count. (c) Resist raw-volume framing — Koestler, Swanson and BISON all insist the win is the RARE high-bisociativeness link across distant frames, not millions of intra-frame facts; the founder's refined 'depth-of-decomposition-then-intersection' view is correct and should be the headline.

Already done vs white space: ALREADY DONE (the founder must not reinvent these): (1) The core idea — "creativity = connecting two habitually-incompatible frames" — is Koestler 1964, named bisociation; it is not new. (2) "Find relationships no human drew because two corpora/frames are isolated" is Swanson's literature-based discovery (1986) and was clinically validated. (3) An entire EU research program (BISON, Berthold ed. 2012) computationalized exactly this: BisoNets, three formal bisociation types, rankable bisociativeness, and a working tool (CrossBee) that over-generates cross-domain bridging links and has a human verify them. (4) "Intersection of frames yields emergent relational structure" is conceptual blending (Fauconnier-Turner 2002), and it has been made algorithmic (Goguen, Divago, COINVENT amalgams). (5) "Agents generate-debate-rank-evolve cross-literature hypotheses" is the 2024-25 AI-co-scientist / SciMON / Sakana wave. So "no one has thought to do this" is, frankly, false at the level of the concept and even of single-pair tooling. GENUINE WHITE SPACE (where donto is actually novel): (a) SCALE + MANY LENSES SIMULTANEOUSLY — every prior system bisociates TWO domains/literatures at a time chosen by a human; nobody runs the FULL spectrum of analytical lenses agentically over ALL entities at once and harvests the combinatorial set of cross-lens intersections. (b) A PARACONSISTENT, CONTRADICTION-PRESERVING SUBSTRATE THAT CAN HOLD THE SPECULATION FOREVER — this is the deepest gap. CrossBee/LBD/COINVENT/AI-Co-Scientist all generate transient candidates that are discarded if not immediately validated; none has a legal, queryable, permanent home for unanchored, mutually-contradictory, hypothesis_only relationship-claims with typed argument edges. 
(c) IDENTITY-AS-HYPOTHESIS + EVIDENCE-FIRST + LEAN-CERTIFICATION as the curation layer — using formal proof to certify the rare valuable shapes out of the speculative frontier is, as far as the literature shows, unattempted. (d) Closing the loop: generate (agents) -> hold (paraconsistent quad store) -> rank (bisociation score) -> certify (Lean) -> promote, as one persistent system rather than a one-shot pipeline. The novelty is NOT the lens idea and NOT bisociation; it is the AGENTIC-MANY-LENS + PERSISTENT-PARACONSISTENT-HOLD + FORMAL-VERIFY combination at substrate scale.

Hard problems:

analogy-structure-mapping

Analogical reasoning is the most directly relevant intellectual tradition to the lens-engine vision, because a "lens comparison across entities" IS structurally an analogy: it asks whether the system of relations holding among one entity's parts also holds among another's, independent of surface features. The field's canonical theory is Dedre Gentner's Structure-Mapping Theory (1983): an analogy maps relational structure from a base domain to a target, and the quality of a mapping is governed by the systematicity principle — people (and good algorithms) prefer to carry over deep, interconnected systems of higher-order relations (causal, mathematical) rather than isolated attributes or surface features. This was operationalized in the Structure-Mapping Engine (SME) (Falkenhainer, Forbus & Gentner, 1986/1989), a local-to-global structural alignment algorithm that, given two predicate-calculus representations, returns correspondences, a structural-evaluation score, and candidate inferences — new claims about the target imported from the base. The candidate-inference output is exactly the "relationship no one thought to draw" the founder wants: SME doesn't just match, it generates novel hypotheses by projecting unmatched base structure onto the target.
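The candidate-inference step can be illustrated in isolation. The sketch below is not SME: it assumes the entity correspondences have already been found and only shows the projection of unmatched base structure onto the target. The relation tuples use the familiar solar-system/atom illustration, and `candidate_inferences` is a hypothetical name.

```python
def candidate_inferences(base, target, mapping):
    """Project base relations through the mapping; keep those the target lacks.

    base: list of (predicate, arg1, arg2, ...) tuples
    target: set of tuples in the same form
    mapping: dict from base entities to target entities (assumed given)
    """
    inferred = []
    for predicate, *args in base:
        if all(a in mapping for a in args):
            projected = (predicate, *(mapping[a] for a in args))
            if projected not in target:
                inferred.append(projected)
    return inferred


BASE = [
    ("attracts", "sun", "planet"),
    ("more_massive", "sun", "planet"),
    ("revolves_around", "planet", "sun"),
]
TARGET = {
    ("attracts", "nucleus", "electron"),
    ("more_massive", "nucleus", "electron"),
}
MAPPING = {"sun": "nucleus", "planet": "electron"}

print(candidate_inferences(BASE, TARGET, MAPPING))
```

The one projected relation is a hypothesis about the target, not a fact; in the terms used elsewhere in this appendix it would enter the store as a speculative edge awaiting evidence.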

The second great lineage is Douglas Hofstadter's Fluid Analogies Research Group (FARG) and its books Fluid Concepts and Creative Analogies (1995) and Surfaces and Essences (Hofstadter & Sander, 2013). Hofstadter's radical claim — "analogy is the core, the fuel and fire, of all thinking" — reframes categorization, perception, and concept-formation themselves as analogy-making. The computational model, Copycat (Mitchell & Hofstadter), differs philosophically from SME: rather than receiving fixed representations and aligning them, Copycat builds its representations fluidly via a Slipnet (a conceptual network whose link-lengths/"conceptual slippage" change dynamically), a Workspace (a blackboard of perceptual structures), a Coderack of stochastic codelets (parallel micro-agents that compete/cooperate), and a global temperature that anneals the search and serves as a quality proxy. The deep lesson for donto: representation is not given, it is constructed under pressure, and the same situation supports many rival construals — which maps cleanly onto donto's "identity-is-a-hypothesis" and paraconsistent-frontier stance.

The third tradition is Fauconnier & Turner's Conceptual Blending / Conceptual Integration (1990s–2002, The Way We Think). Where structure-mapping is asymmetric (base→target), blending is many-to-one: two or more input mental spaces, a generic space of shared structure, and a blend that selectively projects from each input and, crucially, generates emergent structure (via composition, completion, elaboration) present in neither input. This is the theoretical name for "relationships that emerge at the intersection of lenses": the payoff the founder describes is essentially emergent structure in a blend. Computational blending (Goguen's algebraic/category-theoretic approach, the COINVENT project's amalgams, Divago) exists but is brittle and hard to evaluate.

The scale story arrived with Dafna Shahaf, Aniket Kittur, Joel Chan and Tom Hope: "Accelerating Innovation Through Analogy Mining" (KDD 2017, Best Paper) learned purpose and mechanism vector representations from product descriptions (crowdsourcing + RNNs) so that analogies could be mined from messy real-world repositories (the patent/idea corpus) — finding products with the same purpose but different mechanism, or vice versa. SOLVENT / the Analogy Search Engine (Chan, Hope et al., 2018) extended this to scientific papers, annotating background/purpose/mechanism/findings and embedding them so cross-domain research analogies surface that pure IR misses. This is the closest existing relative of donto's vision at the document level — but it uses one coarse facet schema (purpose×mechanism), not a full spectrum of philosophical/linguistic/temporal/ethical/etc. lenses, and it does not hold contradictory mappings as durable state.
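The "same purpose, different mechanism" query can be sketched with two separate embeddings per item and a cosine filter on each. The vectors below are hand-written toy values rather than learned representations, and the thresholds, item names, and `analogies` function are assumptions for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def analogies(query, items, min_purpose=0.8, max_mechanism=0.5):
    """Items whose purpose is close to the query's but whose mechanism is far."""
    q = items[query]
    matches = []
    for name, item in items.items():
        if name == query:
            continue
        purpose_sim = cosine(q["purpose"], item["purpose"])
        mechanism_sim = cosine(q["mechanism"], item["mechanism"])
        if purpose_sim >= min_purpose and mechanism_sim <= max_mechanism:
            matches.append((name, round(purpose_sim, 3), round(mechanism_sim, 3)))
    return sorted(matches, key=lambda m: (-m[1], m[0]))


# Toy facet vectors; in the published work these are learned from text.
ITEMS = {
    "item_a": {"purpose": [1.0, 0.0], "mechanism": [1.0, 0.0]},
    "item_b": {"purpose": [0.9, 0.1], "mechanism": [0.0, 1.0]},
    "item_c": {"purpose": [0.2, 1.0], "mechanism": [1.0, 0.1]},
}
print(analogies("item_a", ITEMS))
```

Generalizing from two facets to N lenses would mean one vector per lens and a filter over lens pairs, which is where the combinatorial cost discussed elsewhere in this appendix comes from.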

The 2023–2026 LLM wave reopened everything. Webb, Holyoak & Lu (Nature Human Behaviour 2023) reported "emergent analogical reasoning" in GPT-3/4 (Raven's-style matrices, letter strings, story analogies) at or above human level zero-shot. This was sharply contested: Lewis & Mitchell (2024) and Hodel & West showed performance collapses on counterfactual variants (permuted/synthetic alphabets) where humans stay robust — evidence the apparent reasoning leans on training-data similarity, not domain-general structure mapping. Webb et al. (2024) replied that with code-execution/tool augmentation the capacity generalizes. Newer work splits the difference and is most useful to donto: hybrid systems like YARN (Khojasteh et al., 2026) explicitly re-fuse Gentner-style structural mapping with LLM-derived multi-level abstractions, finding that pure LLM prompting fails on "far" (low-surface-similarity) analogies and pure SME scores below random, but LLM-abstraction-then-structural-align beats both — a direct template for donto. "Parallelograms Strike Back" (2026) even argues LLMs now generate better analogies than people in some settings. Net: LLMs are excellent at proposing candidate cross-domain relations and at abstracting messy text into mappable structure, but unreliable at certifying whether a mapping is structurally valid and not surface-pattern-matching — precisely the gap donto's evidence-anchoring + Lean-4 certification + paraconsistent hold-without-collapse could fill.

Foundational works:

Modern AI systems:

Relevance to the lens engine: A lens-to-lens comparison across two entities IS an analogy in Gentner's exact sense, so this field hands donto a ready vocabulary and tooling. BORROW: (1) The systematicity principle as a relevance filter — rank machine-proposed relationships by the SIZE and INTERCONNECTEDNESS of the relational system they share under a lens, not by surface attribute overlap; this is the antidote to the combinatorial-noise problem (most cross-lens pairs will be junk). (2) SME's candidate-inference mechanism as the literal generator of 'relationships no one thought to draw' — when two entities align structurally under, say, the teleological lens, project the UNMATCHED base structure onto the target as a new hypothesis_only edge. (3) The Hope/Shahaf/SOLVENT purpose×mechanism schema as proof that faceting documents into structured aspects and embedding each facet separately yields better cross-domain matches than holistic embeddings — donto generalizes this from 2 facets to N lenses. (4) SciAgents' randomized KG-path-sampling between distant nodes as a concrete way to PROPOSE candidate relationships across donto's 39.5M statements without enumerating all pairs. (5) The YARN result that LLM-abstraction-THEN-structural-alignment beats both pure LLM and pure SME — donto should use LLM lenses to ABSTRACT entities, then a structural/Lean-certified aligner to VALIDATE, never trusting the LLM's raw mapping. (6) Copycat's temperature/codelet stochasticity and Fauconnier-Turner's emergent-structure vocabulary to name and rank the payoff. AVOID: (a) treating LLM-proposed analogies as ground truth — the Lewis & Mitchell counterfactual collapse shows they pattern-match; donto must keep them as weighted, evidence-anchored, contestable claims (its native mode). (b) Requiring strict isomorphism (classic SME brittleness) — Holyoak-Thagard multiconstraint and donto's paraconsistency both argue for soft, purpose-weighted, contradiction-tolerant matching. 
(c) One fixed facet schema — donto's many-lens ambition is exactly the generalization SOLVENT stopped short of.

Already done vs white space: ALREADY DONE (the founder should not reinvent these): (1) The core theory that cross-domain relationship discovery = structural analogy, with a working algorithm that emits NOVEL hypotheses (SME candidate inferences) — 40 years old. (2) Analogy MINING AT SCALE over messy real-world repositories using learned facet embeddings — Hope/Shahaf (patents 2017) and SOLVENT (scientific papers 2018) already demonstrated 'find cross-domain analogies humans missed, experts find them useful.' (3) AGENTIC, multi-agent, KG-path-sampling cross-domain hypothesis generation that 'reveals hidden interdisciplinary relationships' — SciAgents (2024-25) is a published, peer-reviewed instance of a large chunk of the founder's pitch, in materials science. (4) LLMs as competent analogy proposers AND as the abstraction layer feeding a structural aligner (YARN 2026). So 'use AI to break entities down and find cross-domain relationships' is, at the level of a single facet schema in a single domain, NOT new. GENUINE WHITE SPACE: (a) The FULL SPECTRUM of lenses — every prior system uses one or a few facets (purpose/mechanism; ontological KG edges). Nobody has run a dozen+ heterogeneous analytical lenses (phenomenological, semiotic, ethical, aesthetic, mereological, teleological...) over the SAME entities and looked for relationships at lens INTERSECTIONS. The intersection-of-many-lenses is real white space. (b) PARACONSISTENT, PERSISTENT HOLDING of speculative/contradictory machine-proposed mappings as durable first-class state with typed argument edges (supports/rebuts/undercuts) — every analogy-mining system today is one-shot retrieval; none KEEPS the rejected and the contradictory mappings as a queryable frontier over time. (c) EVIDENCE-ANCHORING each proposed relationship to a source byte + bitemporal provenance — SOLVENT/SciAgents do not anchor or version their analogies. 
(d) FORMAL CERTIFICATION (Lean-4) of the rare valuable mappings' structural validity — no analogy system formally proves a mapping's shape. The combination (many-lens + agentic-proposal + paraconsistent-hold + evidence-anchor + certify) at substrate scale is, as far as the literature shows, unbuilt.

Hard problems:

Knowledge graph completion (KGC), a.k.a. link prediction, is the machine-learning field devoted to scoring the plausibility of unstated triples (h, r, t) so that missing/latent edges can be inferred from observed ones. It is the single most directly relevant prior art to donto's "discover relationships no one drew" vision — it has been predicting unstated relationships at scale for a decade. The dominant paradigm is the geometric/algebraic EMBEDDING model: TransE (Bordes et al., NeurIPS 2013) treats a relation as a translation h + r ≈ t in real vector space; DistMult uses a bilinear diagonal product; ComplEx (Trouillon et al., ICML 2016) moves to complex-valued embeddings so the Hermitian dot product can score asymmetric relations differently by argument order; RotatE (Sun et al., ICLR 2019) models a relation as a rotation in complex space, letting one model express symmetry/antisymmetry, inversion, AND composition simultaneously. These are scored against a corrupted-negative ranking objective and evaluated by MRR / Hits@k on benchmarks (FB15k-237, WN18RR, YAGO3-10). They are cheap, scalable to tens of millions of edges, and genuinely surface unstated facts — but they are SHALLOW relational pattern-matchers, transductive (no embedding exists for an unseen entity), and they encode only the latent geometry of co-occurrence, not meaning.
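The four scoring functions and the two evaluation metrics named above can be written in a few lines of plain Python. This is a sketch of the published scoring formulas on hand-picked toy vectors, with no training loop or negative sampling; the function names and the choice of the L2 norm for TransE are assumptions of the example.

```python
import cmath
import math


def transe_score(h, r, t):
    """Negative L2 distance between h + r and t (higher is more plausible)."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def distmult_score(h, r, t):
    """Bilinear diagonal product; symmetric in h and t by construction."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def complex_score(h, r, t):
    """Real part of the Hermitian product <h, r, conj(t)> over complex vectors."""
    return sum(hi * ri * ti.conjugate() for hi, ri, ti in zip(h, r, t)).real


def rotate_score(h, phases, t):
    """Negative distance after rotating each coordinate of h by a phase angle."""
    return -sum(abs(hi * cmath.exp(1j * p) - ti) for hi, p, ti in zip(h, phases, t))


def rank_of(true_tail, scores):
    """1-based rank of the true tail among candidate tails (dict: tail -> score)."""
    return 1 + sum(1 for s in scores.values() if s > scores[true_tail])


def mrr_and_hits(ranks, k):
    """Mean reciprocal rank and Hits@k over a list of 1-based ranks."""
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = sum(1 for r in ranks if r <= k) / len(ranks)
    return mrr, hits


# DistMult cannot tell (h, r, t) from (t, r, h); ComplEx can.
h, r, t = [1 + 1j], [1j], [1 + 0j]
print(complex_score(h, r, t), complex_score(t, r, h))
print(mrr_and_hits([1, 2, 4], k=3))
```

The asymmetry check at the end is the practical reason ComplEx and RotatE exist: a model that scores both argument orders identically cannot represent directed relations.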

A second, older and more interpretable tradition is RULE MINING: AMIE / AMIE+ / AMIE3 (Galárraga et al., WWW 2013 onward) mine closed Horn rules ("Datalog") with support/confidence under a partial-completeness assumption; AnyBURL (Meilicke et al., IJCAI 2019, VLDBJ 2023) samples bottom-up paths and generalizes them into rules anytime, and — strikingly — a simple symbolic rule learner MATCHES OR BEATS most embedding models on link prediction while producing human-readable, evidence-bearing explanations. This matters enormously for donto: rules are inherently auditable and map naturally onto donto's typed argument edges and Lean-certifiable shapes. Hybrids now feed embedding-predicted links back to enrich the graph before rule mining (Betz/Meilicke et al., 2024).
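To make the support/confidence vocabulary concrete, here is a minimal sketch of how a single-atom rule could be scored with plain confidence and with confidence under the partial-completeness assumption (PCA). The relation names, the toy triples, and the function `rule_metrics` are hypothetical; this is not AMIE's or AnyBURL's actual code, and real miners handle multi-atom bodies.

```python
def rule_metrics(triples, body_rel, head_rel):
    """Score the rule body_rel(x, y) => head_rel(x, y) on a set of (s, r, o) triples."""
    body = {(s, o) for s, r, o in triples if r == body_rel}
    head = {(s, o) for s, r, o in triples if r == head_rel}

    # Support: predictions of the rule that are already stated in the graph.
    support = len(body & head)

    # Standard confidence treats every unstated prediction as a counterexample.
    std_conf = support / len(body) if body else 0.0

    # PCA confidence only counts a prediction as a counterexample when the
    # subject already has SOME head_rel fact (so the graph is assumed to know it).
    head_subjects = {s for s, _ in head}
    pca_body = {(s, o) for s, o in body if s in head_subjects}
    pca_conf = support / len(pca_body) if pca_body else 0.0
    return support, std_conf, pca_conf

# Toy graph, invented for illustration.
toy_triples = [
    ("ann", "bornIn", "paris"), ("ann", "livesIn", "paris"),
    ("bob", "bornIn", "rome"), ("bob", "livesIn", "berlin"),
    ("cat", "bornIn", "oslo"), ("cat", "livesIn", "oslo"),
    ("dan", "bornIn", "lima"),  # no livesIn fact known for dan
]
support, std_conf, pca_conf = rule_metrics(toy_triples, "bornIn", "livesIn")
print(support, std_conf, pca_conf)
```

The entity with no known `livesIn` fact lowers the standard confidence but is ignored by the PCA variant, which is why PCA confidence is the more forgiving measure on incomplete graphs.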

The frontier moved to two places. (1) GNN / path-based and INDUCTIVE KGC: GraIL (Teru et al., 2020) reasons over enclosing subgraphs so it generalizes to unseen entities; NBFNet (Zhu et al., NeurIPS 2021) reframes link prediction as a learned generalized Bellman-Ford over paths and is a strong SOTA; and ULTRA (Galkin et al., ICLR 2024) is the watershed — a single FOUNDATION MODEL that does zero-shot link prediction on ANY KG with any entity/relation vocabulary, by learning representations of the graph of relations (how relations interact) rather than fixed per-entity embeddings, beating graph-specific baselines across 50+ KGs. This is the closest thing to "one model, every domain" — but it still operates purely on graph topology, not on cross-modal or deep-semantic content. (2) LLM-AUGMENTED KGC (2023-2026): KICGPT (Wei et al., EMNLP 2023 / 2024) couples a structure-aware retriever with an LLM reranker to fix the long-tail problem; KG-LLM, SAT (structure-aware alignment-tuning, 2025), DrKGC (subgraph-retrieval-augmented LLMs, 2025), and ontology-enhanced LLM-KGC (2025) all inject the LLM's world knowledge and natural-language semantics as a second signal. These finally bring lexical/semantic understanding to KGC — the bridge to donto's agentic-lens idea — but they bolt the LLM onto a single structural task, not onto a many-lens decomposition.

Crucially, the field is honest about what it MISSES. (a) Degree/popularity bias: embedding KGC preferentially scores high-degree "rich club" entities, so it amplifies what is already well-studied and overlooks the long tail (Shomer et al., WWW 2023; biological-KG topology study, bioRxiv 2024), which is the opposite of serendipity. (b) Plausibility ≠ novelty ≠ truth: KGC ranks how much a candidate edge resembles the existing distribution, so "best" predictions are often the most obvious/redundant ones, and benchmarks (FB15k/WN18) are inflated by reverse-triple leakage and binarized n-ary relations (Akrami et al.; "On Large-scale Evaluation," 2025). (c) Calibration is poor, especially under the realistic open-world assumption: confidence scores do not equal probabilities of truth (Tabacof & Costabello, ICLR 2020; "Using Model Calibration to Evaluate Link Prediction," WWW 2024; KGE-Calibrator, EMNLP 2025). (d) Contradictions: standard KGC assumes a single consistent truth and cannot natively HOLD a contradiction; even uncertain-KG embeddings (UKGE, Chen et al., AAAI 2019) model confidence but still struggle to represent negative/false links as legal state. Every one of these gaps is something donto's paraconsistent, evidence-first, calibration-agnostic substrate is architecturally built to absorb.
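One common way to quantify the calibration gap in point (c) is expected calibration error (ECE): bin the predicted confidences, then compare each bin's mean confidence with the fraction of its edges that turned out true. The sketch below assumes ECE with equal-width bins as the metric; the cited papers may use other measures, and the scores and labels here are invented.

```python
def expected_calibration_error(scores, labels, n_bins=5):
    """ECE over equal-width confidence bins; scores in [0, 1], labels in {0, 1}."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(scores, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into the last bin
        bins[idx].append((p, y))

    total = len(scores)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(p for p, _ in members) / len(members)
        accuracy = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece

# Four candidate edges scored 0.9, of which only two are true: overconfident.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 0, 0])
# Scores that match outcomes exactly give zero error.
well_calibrated = expected_calibration_error([1.0, 0.0], [1, 0])
print(overconfident, well_calibrated)
```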

Foundational works:

Modern AI systems:

Relevance to the lens engine: BORROW (don't reinvent): (1) Use KGC as the cheap, scalable FIRST PASS that proposes candidate edges over donto's 39.5M statements — ULTRA-style inductive/foundation models and AnyBURL-style path-rules both run at this scale and need no per-entity training, fitting a substrate where 'identity is a hypothesis.' (2) Adopt rule mining (AMIE/AnyBURL/NBFNet paths) as the EXPLAINABLE generator: every proposed edge arrives with a path/rule that can be written as a donto typed argument edge (supports/rebuts) and handed to the Lean-4 overlay for shape certification — turning machine guesses into auditable claims. (3) Treat calibration as a first-class output, not an afterthought: store the model score AND a calibrated open-world probability on each hypothesis_only edge so curation can triage. (4) Use the LLM-KGC wave (KICGPT, SAT, DrKGC) as the bridge from structure to SEMANTICS — they are the existence proof that natural-language meaning improves link prediction. AVOID: (a) Don't let embedding-style scoring be the arbiter of value — it is degree-biased ('rich club') and rewards REDUNDANT, distribution-conforming edges, which is anti-serendipity; the rare valuable cross-domain link will score LOW by construction. donto should explicitly up-weight low-prior, cross-context (cross-ctx:*) candidates rather than top-ranked ones. (b) Don't inherit the single-consistent-truth assumption baked into every KGC loss; donto's paraconsistent frontier is precisely the part KGC cannot do. (c) Don't trust benchmark MRR as a proxy for discovery quality — it is inflated by leakage and measures plausibility, not novelty or truth.

Already done vs white space: ALREADY DONE (the founder should NOT believe 'no one has thought to do this'): Predicting unstated relationships at massive scale is a solved, decade-old industry: TransE→ComplEx→RotatE do it transductively, NBFNet does it inductively, and ULTRA now does it zero-shot across 50+ graphs. Generating NOVEL, actionable cross-entity hypotheses is done in drug repurposing / literature-based discovery (with wet-lab validation). Explainable, evidence-bearing link proposals exist (AMIE, AnyBURL, NBFNet paths). Confidence/uncertainty on proposed edges exists (UKGE, calibration work). And, most pointedly, AGENTS traversing a knowledge graph to surface previously-unrelated interdisciplinary links and auto-generate hypotheses ALREADY EXISTS in SciAgents (Buehler, 2025) and the broader agentic-graph-discovery wave (GraphAgents, cross-domain materials design, 2026). So the 'agents break things down and find links no human drew' core is real and demonstrated. GENUINE WHITE SPACE (the defensible novelty): (1) The MANY-LENS decomposition as the GENERATIVE engine. All prior KGC predicts within a SINGLE relation vocabulary / single ontology / single modality; nobody systematically decomposes each entity through the full spectrum of analytical lenses (mereological, teleological, semiotic, phenomenological, ethical, ecological...) and then mines relationships at the INTERSECTION of lenses. SciAgents traverses one graph; it does not multiplex perspectives. (2) The PARACONSISTENT, contradiction-preserving SUBSTRATE for holding millions of speculative, mutually-incompatible machine-proposed edges forever as legal state: no KGC system can do this; they all collapse to one truth. (3) The EVIDENCE-FIRST byte-anchoring of every speculative edge plus a Lean-4 certification path, giving a generate-hold-verify lifecycle that the ML field has no equivalent for (KGC outputs a ranked list, not a curated, source-anchored, machine-checked claim store). The honest novelty is therefore NOT 'discover unstated links' (done) but 'the AGENTIC + MANY-LENS generation × PARACONSISTENT/EVIDENCE-FIRST holding × certifiable verification PIPELINE at substrate scale': the combination, not any single piece.

Hard problems:

ai-scientific-hypothesis-generation

This field is the single closest existing analogue to donto's "many-lens relationship-discovery engine," and its 40-year arc is essential context. The intellectual root is Don Swanson's Literature-Based Discovery (LBD, 1986): his "undiscovered public knowledge" thesis holds that independently-created literature fragments can be logically related yet never connected, and his ABC model (A relates to B, B relates to C, therefore hypothesize A-C) found the fish-oil/Raynaud's link purely by bridging non-interacting MEDLINE literatures, a link that was later clinically validated. This is EXACTLY the founder's intuition ("relationships no human thought to draw because no human holds all the literatures/lenses at once"), and it predates LLMs by 40 years. LBD's whole premise is that the value is in the intersection/bridge term, not the facts inside either literature, which is identical to the founder's "payoff is at the intersection of lenses."
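The ABC model can be sketched in a few lines: build term co-occurrence from documents, then propose C terms that share bridge terms B with the start term A but never co-occur with A directly. The tiny corpus below reuses the fish-oil/Raynaud's terms from the text plus one placeholder term (`compound_x`); the corpus and the function `abc_candidates` are illustrative assumptions, not Swanson's actual procedure or data.

```python
from collections import defaultdict

def abc_candidates(docs, a_term):
    """Open discovery: rank C terms linked to a_term only through bridge terms B."""
    cooc = defaultdict(set)
    for doc in docs:
        for term in doc:
            cooc[term] |= doc - {term}

    direct = set(cooc[a_term])      # B terms: co-occur with A in some document
    bridges = defaultdict(set)      # C term -> set of B terms linking it to A
    for b in direct:
        for c in cooc[b]:
            if c != a_term and c not in direct:
                bridges[c].add(b)

    # More independent bridge terms gives a higher rank; ties broken by name.
    return sorted(bridges.items(), key=lambda kv: (-len(kv[1]), kv[0]))

# Each set stands for the key terms of one paper (toy data).
toy_docs = [
    {"raynaud", "blood viscosity"},
    {"raynaud", "platelet aggregation"},
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"compound_x", "platelet aggregation"},  # placeholder term, not a real finding
]
ranked = abc_candidates(toy_docs, "raynaud")
for c_term, b_terms in ranked:
    print(c_term, sorted(b_terms))
```

Ranking by the number of distinct bridge terms is only one simple choice; it is used here because it keeps the sketch short.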

The second lineage is closed-loop autonomous science: Ross King's Robot Scientist Adam (Cambridge/Aberystwyth, 2009, first machine to autonomously discover new scientific knowledge — yeast functional genomics) and Eve (drug screening), now Genesis. The critical lesson here is that Adam/Eve close the loop — they generate hypotheses, design discriminating experiments, RUN them with lab robotics, and revise. This is the "verify/curate" half of the founder's vision made physical, and it's the part pure-text systems lack. The third lineage is embedding/representation-based latent-knowledge extraction: Tshitoyan et al. (Nature 2019, "mat2vec") trained Word2vec on 3.3M materials-science abstracts and showed the unsupervised embeddings recommended thermoelectric materials YEARS before their actual discovery — i.e., the "latent structure of future discoveries is already embedded in past text." This is the strongest empirical proof that machine-readable latent relationships exist in a corpus and can be surfaced. The fourth lineage is knowledge-graph link prediction as hypothesis generation: drug-repurposing KGs (DRKG, COVID-19 KGs using GNNs/ComplEx/ensemble KG embeddings) frame a new drug-disease hypothesis literally as predicting a missing edge, validated via AUROC/AUPRC and explanatory paths — this is the paradigm donto's substrate is architecturally nearest to.

The 2024-2026 wave fuses LLM agents with all of the above. SciAgents (Ghafarollahi & Buehler, MIT, arXiv:2409.05556, Advanced Materials 2025) is the MOST relevant single system: it builds a large ontological knowledge graph (~33K nodes / 49K edges from ~1,000 papers), then samples a PATH between two concepts — crucially WITH INJECTED RANDOMNESS / random waypoints to force non-deterministic, exploratory, serendipitous bridges — and hands that path to a multi-agent pipeline (Ontologist → Scientist-1 proposes hypothesis → Scientist-2 adds mechanism/experiment → Critic evaluates → novelty checked against Semantic Scholar). It explicitly claims to reveal "hidden interdisciplinary relationships previously considered unrelated." This is essentially the founder's engine for one domain, minus the paraconsistency and the persistent contradiction-holding substrate. Google DeepMind's AI co-scientist (Feb 2025, Gemini 2.0) is the most mature: a Supervisor orchestrates Generation, Reflection, Ranking, Evolution, Proximity, and Meta-review agents; hypotheses compete in an Elo TOURNAMENT via simulated scientific debate (self-play), and test-time compute scales the search. It produced wet-lab-validated results: AML drug-repurposing candidates that inhibited tumor viability, anti-fibrotic epigenetic targets in liver organoids, and independently re-derived a then-unpublished antimicrobial-resistance mechanism (phage capsid gene transfer). Adjacent recent systems — BioDisco (dual-mode KG+literature evidence, iterative feedback, and a notable TEMPORAL evaluation that tests whether a hypothesis is confirmed by post-cutoff literature), KG-CoI / Knowledge-Grounded LLMs (arXiv:2411.02382), TruthHypo/KnowHD, and Bayes-Entropy collaborative agents — all converge on grounding hypotheses in graphs to fight hallucination and ranking/refining to control quality.
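The Elo-tournament idea can be illustrated with the standard Elo update applied to pairwise hypothesis comparisons. The K-factor of 32, the 400-point scale, the starting rating, and the hard-coded match outcomes are conventional or invented values for this sketch; the source does not state the co-scientist's actual parameters, and in a real system each outcome would come from a debate or judge step.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating_a, rating_b, a_wins, k=32):
    """Return updated (rating_a, rating_b) after one comparison."""
    e_a = expected_score(rating_a, rating_b)
    s_a = 1.0 if a_wins else 0.0
    new_a = rating_a + k * (s_a - e_a)
    new_b = rating_b + k * ((1 - s_a) - (1 - e_a))
    return new_a, new_b

def run_tournament(hypotheses, matches, start=1200.0):
    """Apply (winner, loser) outcomes in order and return hypotheses ranked by rating."""
    ratings = {h: start for h in hypotheses}
    for winner, loser in matches:
        ratings[winner], ratings[loser] = elo_update(ratings[winner], ratings[loser], True)
    return sorted(ratings, key=ratings.get, reverse=True)

# Hypothetical outcomes of three pairwise comparisons.
ranking = run_tournament(["h1", "h2", "h3"], [("h1", "h2"), ("h1", "h3"), ("h2", "h3")])
print(ranking)
```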

The honest verdict on what LLM ideation actually delivers: Si, Yang & Hashimoto (Stanford, arXiv:2409.04109, 100+ NLP researchers) found LLM-generated ideas were judged statistically MORE NOVEL than expert ideas (p<0.05) but slightly less feasible — encouraging for the founder. BUT the follow-up "Ideation-Execution Gap" (arXiv:2506.20803, 2025) had 43 experts actually EXECUTE the ideas (100+ hrs each): after execution, LLM ideas' scores collapsed on every metric and human ideas overtook them. The lesson directly applicable to donto: surface novelty is cheap and machine-abundant; durable value requires execution/verification, which is exactly why a substrate that can HOLD speculative relationships cheaply and selectively VERIFY the rare valuable ones is the right architecture — but the verification step is where all the real difficulty (and value) lives.

Foundational works:

Modern AI systems:

Relevance to the lens engine: BORROW: (1) Swanson's ABC/bridge logic and SciAgents' random-waypoint path sampling are the proven mechanics for generating cross-domain relationships no human posited — donto can run the same over its 39.5M-statement graph, but with MANY analytical lenses as the typed dimensions the bridge can traverse, not just co-occurrence. SciAgents needed only ~1,000 papers; donto already has the substrate. (2) The disposer/loop is non-negotiable: Adam/Eve and the AI co-scientist show value comes only when generation is paired with a ranking + verification mechanism (Elo tournament, critic agents, novelty-vs-Semantic-Scholar, and ultimately experiment). donto's Lean-4 certification + evidence-anchoring + argument edges (supports/rebuts/undercuts) ARE a disposer — wire the agent-proposed relationships into it (mirrors donto's own 'agent-proposes / Lean-disposes' rosie-search pattern). (3) BioDisco's temporal evaluation maps perfectly onto donto's bitemporality: hold a speculative relationship as legal state now, and let later-ingested evidence retro-confirm or rebut it without rewriting history. (4) mat2vec proves a cheap embedding pass over the corpus can pre-rank which speculative edges are worth materializing — use it as a candidate generator before expensive agentic decomposition. AVOID / heed the warnings: the Ideation-Execution Gap is the central caution — machine novelty is abundant and cheap; do NOT treat 'hundreds of speculative relationships per pass' (donto already extracts ~483 facts/pass) as the win condition, because surface novelty evaporates under execution. The differentiated value is the VERIFICATION funnel, not generation volume. 
Also avoid SciAgents/co-scientist's reliance on a single LLM judge as ground truth (your own memory note: 'no authority is ground truth') — donto's paraconsistent, contradiction-preserving design is precisely the antidote, letting you keep rival relationship-claims and their argument edges instead of collapsing to one ranked answer. Combinatorial blow-up (path/lens-pair explosion) is the operational risk SciAgents controls with bounded random sampling and the co-scientist controls with Elo pruning — donto needs an equivalent bounded-candidate + ranking gate (it already has the pattern in its bounded-candidate /search query).

Already done vs white space: ALREADY DONE (founder should not reinvent): The core thesis that valuable relationships live unconnected across literatures/domains and can be surfaced mechanically — Swanson proved it in 1986 and it has been clinically validated. 'Latent future relationships already encoded in the corpus' — Tshitoyan/mat2vec proved it (2019). Multi-agent, ontological-KG, randomized-path, serendipitous cross-domain hypothesis generation with a critic and a novelty check — SciAgents IS this (2024-25), for materials. Tournament-ranked, self-debating, evolving multi-agent hypothesis generation with REAL wet-lab validation — Google's AI co-scientist (2025). Relationship-as-link-prediction over biomedical KGs with explanatory paths — the entire drug-repurposing-KG field (2020-23). Evidence-grounded, hallucination-resisting, KG+literature hypothesis generation — KG-CoI, BioDisco, TruthHypo (2024-25). Temporal/retro-validation of hypotheses — BioDisco. Honest evals of whether any of it produces durable value — Si et al. + Ideation-Execution Gap. So the components 'agentic,' 'many-perspective decomposition,' 'cross-domain bridging,' and 'KG substrate' each individually EXIST. GENUINE WHITE SPACE: (1) The full SPECTRUM of human analytical lenses as first-class, typed, persistent dimensions — every existing system uses ONE implicit lens (semantic similarity / domain ontology / co-citation). Nobody has made philosophical, mereological, teleological, semiotic, phenomenological, ethical, ecological etc. lenses explicit, simultaneous, and cross-indexed so relationships emerge at lens INTERSECTIONS. (2) A PARACONSISTENT, contradiction-PRESERVING substrate that HOLDS millions of mutually-incompatible machine-proposed relationships forever as legal state with typed argument edges — every existing discovery engine collapses to a single ranked hypothesis list and discards rivals; donto's hold-without-collapse + identity-as-hypothesis is genuinely unexplored at scale. 
(3) Domain-GENERALITY — all proven systems are narrow (yeast, materials, biomed); a domain-agnostic engine over a general 39.5M-statement substrate is untested. (4) Evidence-anchoring to source bytes + Lean-4 certification of the rare valuable edge as the disposer is a verification architecture no one has assembled. The combination — agentic many-lens decomposition + paraconsistent hold + formal/evidence verification, domain-general, at substrate scale — is novel even though no individual ingredient is.

Hard problems:

foundational-faceted-ontologies

This tradition supplies the rigorous theory of the lens itself — what an orthogonal analytical dimension is, how to decompose an entity through several at once, and (critically for donto's "relationship discovery" payoff) how implicit structure emerges from the intersection of dimensions. It splits into three lineages that the founder's vision unknowingly braids together.

(1) Upper / foundational ontologies define a small set of top-level categories through which any entity can be viewed: the formal backbone of "lenses." BFO (Barry Smith, Buffalo; the upper ontology of the OBO Foundry and of hundreds of biomedical ontologies) splits reality into continuants (3D enduring things, with independent/dependent/quality/role/disposition/function sub-distinctions) vs occurrents (4D processes), unifying 3D-ist and 4D-ist views in one frame. DOLCE (Gangemi, Guarino, Masolo, Borgo; the LOA in Trento, ~2002) is explicitly "cognitive/linguistic-biased": it carves the categories underlying natural language and common sense (endurants, perdurants, qualities, abstracts), and its Descriptions & Situations (D&S) extension is the single most relevant piece here: it reifies descriptions (roles, concepts, parameters) separately from the situations/states-of-affairs that "satisfy" them, so the same facts can be re-interpreted under many descriptions/perspectives without conflict. That is a near-exact formal analogue of donto's "identity/relationship is a hypothesis queried under a lens." SUMO (Niles & Pease, Teknowledge/Articulate, 2000) is a large, fully axiomatized ontology with first-order reasoning (Sigma/Vampire/E provers) and a complete manual mapping of every WordNet synset to SUMO terms, the best example of bridging a lexical lens to a formal one. Cyc (Lenat, 1984–) is the deepest precedent for the paraconsistent / contextual angle: its microtheories (Mt) scope assertions to assumption-contexts so globally contradictory views (relativistic vs Newtonian physics, fiction vs fact, conflicting economic theories) coexist without exploding. Cyc is deliberately locally consistent but globally contradiction-tolerant, exactly donto's posture.
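The "locally consistent, globally contradiction-tolerant" posture can be sketched as a store that scopes every assertion to a context: a contradiction inside one context is rejected, while the same proposition may carry opposite truth values in different contexts. The class `ContextStore`, its method names, and the context labels are hypothetical; this is neither Cyc's microtheory API nor donto's schema.

```python
from collections import defaultdict

class ContextStore:
    """Assertions are scoped to a context (a microtheory-like label)."""

    def __init__(self):
        self._facts = defaultdict(dict)  # context -> {proposition: truth value}

    def assert_in(self, context, proposition, value=True):
        existing = self._facts[context].get(proposition)
        if existing is not None and existing != value:
            # Local consistency: one context may not hold P and not-P.
            raise ValueError(f"{context} already holds the opposite of {proposition!r}")
        self._facts[context][proposition] = value

    def holds(self, context, proposition):
        """Truth value in that context, or None if the context is silent."""
        return self._facts.get(context, {}).get(proposition)

    def cross_context_conflicts(self):
        """Propositions asserted with different values in different contexts."""
        by_prop = defaultdict(dict)
        for context, facts in self._facts.items():
            for proposition, value in facts.items():
                by_prop[proposition][context] = value
        return {p: c for p, c in by_prop.items() if len(set(c.values())) > 1}

store = ContextStore()
store.assert_in("NewtonianMt", "simultaneity is absolute", True)
store.assert_in("RelativisticMt", "simultaneity is absolute", False)
print(store.cross_context_conflicts())
```

Keeping the conflict map queryable, rather than resolving it, is the design point: the disagreement between contexts is itself data.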

(2) Faceted classification is the methodology of many orthogonal lenses. S. R. Ranganathan's Colon Classification (1933) and its PMEST fundamental categories — Personality (the focal entity), Matter (substance/material), Energy (action/process/operation), Space, Time — were the first analytico-synthetic scheme: you analyze a subject into facets, then synthesize a compound class number by combining foci from independent facets with connecting symbols (the colon). The deep claim is that a small set of orthogonal facets can compose to express an unbounded space of compound subjects no one enumerated in advance — precisely the combinatorial generativity the founder wants, stated in 1933. PMEST is the historical ancestor of (a) modern faceted search/navigation (Pollitt, Shneiderman, Marchionini, and especially Marti Hearst's Flamenco, Berkeley 2000s — multi-dimensional filter UIs everywhere now), and (b) BFO/DOLCE-style category systems. The founder's "philosophical, temporal, causal, mereological, teleological…" list is, structurally, a much larger PMEST.

(3) Frame semantics + Formal Concept Analysis give the emergence engine. Charles Fillmore's frame semantics (1970s–80s) and FrameNet (ICSI Berkeley, 1997–) say a word's meaning is only graspable against a whole frame — a structured scene with frame elements (roles): the COMMERCIAL_TRANSACTION frame binds Buyer, Seller, Goods, Money. Frames are reusable relational "lenses" with typed slots — the conceptual template for any per-lens schema and for relation extraction (semantic role labeling). Formal Concept Analysis (Rudolf Wille, "Restructuring Lattice Theory," 1981; Ganter & Wille's Mathematical Foundations, 1996/1999, on Birkhoff lattice theory and Peirce/Port-Royal logic) is the deepest mathematical realization of the founder's exact payoff. From a binary formal context (objects × attributes table), a Galois connection between extents and intents produces formal concepts (maximal object-set/attribute-set pairs where neither can grow), and ordering them yields a concept lattice — a complete lattice whose nodes are emergent concepts the analyst never named, plus a canonical basis of attribute implications (A→B: "every object with all of A has all of B") computable via attribute exploration. This is literally a machine that surfaces latent concepts and rules from an object×attribute matrix — the founder's "relationships no human thought to draw." Its multi-relational extension, Relational Concept Analysis (RCA) (Rouane-Hacène, Huchard, Napoli, Valtchev, 2013), iterates FCA over several object sorts linked by relations, abstracting links into relational attributes and producing a family of coupled lattices — i.e., discovering cross-entity relations across multiple "kinds" at once, which is structurally what donto's many-lens cross-entity discovery aims at.
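The Galois-connection construction described above can be shown on a toy object × attribute context. This is a brute-force sketch that closes every subset of objects, so it is exponential and only suitable for tiny examples; the helper names and the three-object context are assumptions made for illustration, and production FCA tools use dedicated algorithms instead.

```python
from itertools import combinations

def formal_concepts(context):
    """context: dict mapping each object to its set of attributes.
    Returns the set of (extent, intent) pairs as frozensets."""
    objects = list(context)
    all_attrs = frozenset().union(*context.values()) if context else frozenset()

    def intent(objs):
        # Attributes shared by every object in objs (all attributes if objs is empty).
        result = all_attrs
        for o in objs:
            result = result & frozenset(context[o])
        return result

    def extent(attrs):
        # Objects that carry every attribute in attrs.
        return frozenset(o for o in objects if attrs <= frozenset(context[o]))

    concepts = set()
    for k in range(len(objects) + 1):
        for objs in combinations(objects, k):
            shared = intent(objs)
            concepts.add((extent(shared), shared))  # closing makes the pair maximal
    return concepts

def implication_holds(context, premise, conclusion):
    """True if every object having all of `premise` also has all of `conclusion`."""
    return all(set(conclusion) <= attrs for attrs in context.values() if set(premise) <= attrs)

toy_context = {
    "book": {"physical", "informational", "readable"},
    "ebook": {"informational", "readable"},
    "door": {"physical"},
}
concepts = formal_concepts(toy_context)
for ext, itt in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(ext), sorted(itt))
```

On this context the unnamed concept "things that are informational and readable" emerges with extent {book, ebook}, and the implication informational → readable holds while physical → readable does not.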

Foundational works:

Modern AI systems:

Relevance to the lens engine: BORROW (4 concrete imports): (1) PMEST's analytico-synthetic principle as the design contract for lenses — keep each lens ORTHOGONAL and have the value be in the synthesis (foci combined across facets), not in any single facet. This is the founder's intuition, already formalized in 1933; treat it as a constraint (lenses should be as independent as possible) rather than reinventing it. (2) FCA/RCA as the literal back-end for the 'relationship discovery' step: once agents fill many lenses, project the cross-lens output into formal contexts (object×attribute) and per-relation RCA contexts, then compute the concept lattice + canonical implication basis. The emergent concepts and implications ARE the 'relationships no one thought to draw' — and they come with a provenance-free, deterministic derivation that pairs perfectly with donto's evidence-anchoring and Lean-4 certification (FCA implications are exactly the shape Lean can verify). (3) DOLCE's Descriptions & Situations and Cyc's microtheories as prior art for donto's 'identity/relationship is a hypothesis under a lens' and contradiction-holding — donto should cite these as the lineage it extends, and reuse D&S's description/situation split as the modeling pattern for 'a relationship-claim viewed under lens X.' (4) Frames (FrameNet) as the per-lens schema template — each lens defines typed roles to fill, making extraction structured and relation-ready. AVOID: (a) the upper-ontology trap of forcing one universal, globally-consistent category tree — BFO/SUMO spent two decades on alignment wars; donto's paraconsistent, lens-relative stance is the differentiator, so do NOT collapse lenses into a single canonical ontology. 
(b) Classical FCA's brittleness — it requires exact binary incidence and is noise-sensitive and worst-case exponential in concepts; LLM-extracted attributes are noisy and graded, so use fuzzy/relaxed FCA, bounded candidate generation, and donto's hypothesis_only/contradiction-frontier to absorb noise instead of letting it explode the lattice. (c) The FrameNet/Cyc lesson that hand-curation does not scale — the whole bet must be that AGENTS fill lenses cheaply; that agentic fill is the genuinely new ingredient these classical systems lacked.

Already done vs white space: ALREADY DONE (the founder should not claim these as novel): (1) The idea that a small set of ORTHOGONAL lenses composes to an unbounded analytical space — Ranganathan, 1933 (PMEST). (2) That implicit concepts and relationships can be automatically derived from an object×attribute table — FCA, Wille 1981; and across multiple related kinds — RCA, 2013. This is the founder's core 'discovery engine' payoff, mathematically solved decades ago for clean binary data. (3) Holding mutually-contradictory claims in scoped contexts without explosion — Cyc microtheories, 1984+; locally-consistent/globally-tolerant KBs are old news. (4) Re-interpreting the same facts under many descriptions/perspectives — DOLCE D&S, ~2004; multi-perspective and aspect-oriented ontology development are established sub-fields. (5) Bridging a lexical lens to a formal lens (SUMO×WordNet) and frames-as-relational-lenses (FrameNet) — done. (6) Even the modern hint that LLM embeddings already contain FCA-style concept lattices — Stanford 2026. GENUINE WHITE SPACE: No prior system combines all four legs at once — (i) AGENTIC, LLM-driven population of MANY heterogeneous human-analytical lenses (philosophical/teleological/aesthetic/semiotic/ecological, far beyond PMEST's five or BFO's continuant/occurrent split), at (ii) WEB-SCALE over a (iii) PARACONSISTENT, evidence-anchored, bitemporal substrate that can hold the resulting speculative cross-lens relations FOREVER as legal state, with (iv) a verification layer (FCA-implication mining + Lean-4 certification) to promote the rare valuable hypotheses. FCA/RCA assumed clean curated contexts and tiny scale; Cyc/DOLCE assumed human knowledge engineers; FrameNet assumed manual annotation; upper ontologies assumed one consistent world. The novel claim that survives scrutiny is the integration: agents as the lens-fillers, paraconsistency as the holding-tank for cross-lens serendipity, and FCA/Lean as the disciplined harvester. 
That specific assembly appears genuinely unexplored.

Hard problems:

semantic-decomposition-primitives

The unifying claim of this tradition is that meaning is not atomic — it decomposes into a small, recurring set of deeper, comparable components. Five major frameworks instantiate this in importantly different ways, and together they form the most direct intellectual ancestry for donto's "many lenses on every entity" vision.

(1) Wierzbicka's Natural Semantic Metalanguage (NSM) is the most radical reductionist program: roughly 65 indefinable, cross-linguistically universal "semantic primes" (I, YOU, SOMETHING, GOOD, BAD, DO, HAPPEN, KNOW, WANT, THINK, BECAUSE, IF, NOT, BEFORE, PART, KIND, LIKE...) plus a universal mini-grammar and ~50 "semantic molecules" (man, water, hands). Any concept, however culture-specific, is "explicated" as a paraphrase built only from primes, so two concepts from different cultures become directly comparable at the prime level. NSM is the purest expression of "break meaning to the utmost" — a finite alphabet of thought.

(2) Pustejovsky's Generative Lexicon (GL, 1991/1995) is the single most lens-like framework and the most architecturally relevant. Its QUALIA STRUCTURE assigns every noun FOUR modes of explanation, explicitly derived from Aristotle's four aitiai (via Moravcsik 1975): FORMAL (what kind of thing it is), CONSTITUTIVE (its parts/material — mereology), TELIC (its purpose/function — teleology), and AGENTIVE (how it came into being — origin/causation). A noun like 'book' carries formal=physical object, constitutive=pages/text, telic=read(x), agentive=write(x); 'door' carries telic=pass-through, etc. GL's generative devices — type coercion ("begin a book" coerces to "begin reading"), co-composition ("bake a cake" vs "bake a potato"), selective binding ("fast car" binds to the telic driving event) — and its dot-objects/complex types (book = PHYSICAL•INFORMATION, a single entity legitimately under two types at once) solve LOGICAL POLYSEMY without sense enumeration. This is essentially a four-lens decomposition built into the lexicon, and the dot-object is a near-exact precedent for "one entity, multiple co-present aspects."
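A qualia structure maps naturally onto a small record type, and once two entities are described with the same four roles, shared roles can be read off directly. The sketch below is a simplification made for illustration: real GL qualia are typed event structures rather than strings, the `letter` entry is hypothetical, and for `door` only the telic value comes from the text above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Qualia:
    formal: str              # what kind of thing it is
    constitutive: frozenset  # parts / material
    telic: str               # purpose / function
    agentive: str            # how it came into being

def shared_qualia(a, b):
    """Return the qualia roles on which two entities agree."""
    shared = {}
    for role in ("formal", "telic", "agentive"):
        if getattr(a, role) == getattr(b, role):
            shared[role] = getattr(a, role)
    common_parts = a.constitutive & b.constitutive
    if common_parts:
        shared["constitutive"] = common_parts
    return shared

book = Qualia("physical object", frozenset({"pages", "text"}), "read", "write")
# Hypothetical entries, added only to have something to compare against.
letter = Qualia("physical object", frozenset({"paper", "text"}), "read", "write")
door = Qualia("physical object", frozenset({"panel", "hinges"}), "pass-through", "build")

print(shared_qualia(book, letter))
print(shared_qualia(book, door))
```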

(3) Schank's Conceptual Dependency (CD, late 1960s–70s, Yale) decomposes all event meaning into ~11 primitive ACTs (ATRANS abstract-transfer/give, PTRANS physical-transfer/go, MTRANS mental-transfer/tell, MBUILD, INGEST, EXPEL, MOVE, GRASP, PROPEL, ATTEND, SPEAK) plus conceptual cases and states, so paraphrases ("John gave Mary a book" / "Mary took a book from John") collapse to one canonical, language-independent representation enabling inference. CD scaled up into scripts/plans/goals (SAM, PAM). It is the canonical predicate-decomposition lens and the historical lesson in over-reduction.

(4) Jackendoff's Conceptual Semantics treats meaning as a level of THOUGHT (Conceptual Structure), built from a fixed ontology of categories — Event, State, Thing, Place, Path, Property, Amount — combined by functions like GO, BE, STAY, CAUSE, INCH. Crucially Jackendoff argues decomposition is the cognitive-science method itself: meanings are decomposed into primitives "as the semantic equivalents of phonological features."

(5) The modern, data-driven heirs are Universal Decompositional Semantics (UDS; White, Reisinger, Rawlins, Van Durme, 2016–2020) and Abstract Meaning Representation (AMR; Banarescu et al. 2013). UDS is the most directly transferable to donto: instead of discrete categories it annotates each predicate/argument with many SCALAR, real-valued, confidence-weighted properties across orthogonal dimensions — 18 semantic proto-role properties (volition, sentience, causation, change-of-state, grounded in Dowty 1991), genericity, factuality, time/duration, event aspect (telicity/dynamicity), 26 entity supersenses — over a single graph (PredPatt). That is precisely "many independent lenses, each a graded hypothesis, layered on one graph." AMR is the production-scale graph meaning-representation (rooted DAG over PropBank predicates, "who did what to whom," abstracting away syntax), now with strong LLM parsers (Smatch ~86) and a 52-language MASSIVE-AMR corpus — but it deliberately drops tense, number, quantifier scope, and figurative meaning.
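The "many scalar, confidence-weighted properties" idea lends itself to a simple profile comparison: weight each property value by its confidence, then compare two profiles with cosine similarity. The property names come from the proto-role examples in the text; the numeric values, the (value, confidence) encoding, and the function names are invented for this sketch and are not the UDS dataset's actual format.

```python
import math

def weighted(profile):
    """Scale each property value by its annotation confidence."""
    return {name: value * conf for name, (value, conf) in profile.items()}

def profile_similarity(p, q):
    """Cosine similarity between two confidence-weighted property profiles."""
    wp, wq = weighted(p), weighted(q)
    keys = set(wp) | set(wq)
    dot = sum(wp.get(k, 0.0) * wq.get(k, 0.0) for k in keys)
    norm = math.sqrt(sum(v * v for v in wp.values())) * math.sqrt(sum(v * v for v in wq.values()))
    return dot / norm if norm else 0.0

# property -> (scalar value, confidence); all numbers are made up.
agent_a = {"volition": (0.9, 1.0), "sentience": (0.8, 1.0), "change_of_state": (-0.5, 0.5)}
agent_b = {"volition": (0.8, 0.9), "sentience": (0.9, 0.9), "change_of_state": (-0.4, 0.5)}
patient = {"volition": (-0.9, 1.0), "sentience": (-0.2, 0.5), "change_of_state": (0.9, 1.0)}

print(profile_similarity(agent_a, agent_b))
print(profile_similarity(agent_a, patient))
```

Two agent-like arguments score close to each other and far from a patient-like one, which is the kind of shared-profile query the paragraph describes.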

The throughline for the founder: every one of these is, in effect, a fixed set of LENSES that turn an entity or predicate into deep, comparable atoms. GL's qualia literally are four lenses; UDS's property sheets are dozens of scalar lenses. The relationship-discovery payoff donto wants is exactly what these atoms enable: once two entities are decomposed into the same primitive vocabulary, latent cross-entity relations (shared telic purpose, shared agentive origin, matching proto-role profiles) become computable rather than guessed.

Foundational works:

Modern AI systems:

Relevance to the lens engine: BORROW, concretely: (1) GL's qualia are a ready-made, defensible STARTING lens-set. donto's 6 apertures could be extended with formal/constitutive/telic/agentive, which are entity-level (your current 6 are text-extraction-level) and yield exactly the cross-entity links you want (shared telic purpose, shared agentive origin, part-of overlap). (2) GL's DOT-OBJECT (book = PHYSICAL•INFORMATION) is a near-perfect formal precedent for donto's 'identity is a hypothesis' / one entity legitimately under multiple co-present aspects; cite it, since it gives your design philosophical pedigree. (3) UDS is your closest sibling and the single best model to imitate: store each lens as SCALAR, CONFIDENCE-WEIGHTED, ORTHOGONAL properties on a graph rather than discrete labels. That is exactly what a paraconsistent, hypothesis-weighted substrate wants, and it makes 'relationship at the intersection of lenses' a vector-similarity / shared-profile query. (4) NSM is the right vocabulary-design discipline: a small, comparable atom set is what makes two entities from different domains ALIGNABLE at all; without a shared decompositional alphabet, cross-entity discovery degrades to surface string matching. (5) Swanson/LBD is your proof-of-concept and your evaluation template (ABC model; closed vs open discovery; replicate a known discovery to validate). (6) FrameNet's ~1200 frames are a free situational-lens library. AVOID: (a) Schank's mistake: a fixed, too-small primitive set that loses coverage (CD covered only a fraction of real-event corpora); keep lenses OPEN/extensible, not a closed alphabet. (b) AMR's deliberate amnesia: it drops tense, number, scope, figurative meaning; donto must NOT collapse those, since temporal/modal/figurative differences are often where the novel relation hides (and your bitemporal + paraconsistent design is built precisely to keep them).
(c) Feeding raw symbolic structure into LLMs naively (SR-LLM result: it can hurt) — let agents produce decompositions but mediate the structure carefully. (d) The Fodor trap — make each lens's output FALSIFIABLE and evidence-anchored (your byte-level evidence + Lean certification is the right answer to 'is this decomposition real content or relabeling?').
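
Point (3) above can be made concrete with a small sketch. It assumes each entity carries a profile of scalar lens properties, each with a confidence attached; the property names, the scores, and the `weighted_cosine` helper are all hypothetical and are not part of donto or UDS. Under those assumptions, a shared-profile query reduces to a confidence-weighted cosine similarity over the properties two entities have in common:

```python
import math

# Hypothetical lens profile: property name -> (score in [-1, 1], confidence in [0, 1]).
Profile = dict[str, tuple[float, float]]


def weighted_cosine(a: Profile, b: Profile) -> float:
    """Cosine similarity over shared properties.

    Each shared dimension is weighted by the product of the two confidences,
    so a low-confidence decomposition contributes little to the match.
    Properties present in only one profile are ignored.
    """
    dot = norm_a = norm_b = 0.0
    for key in a.keys() & b.keys():
        (score_a, conf_a), (score_b, conf_b) = a[key], b[key]
        weight = conf_a * conf_b
        dot += weight * score_a * score_b
        norm_a += weight * score_a * score_a
        norm_b += weight * score_b * score_b
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # nothing comparable between the two profiles
    return dot / math.sqrt(norm_a * norm_b)


# Invented example profiles for two entities from different domains.
treaty = {
    "telic:coordinate-behavior": (0.9, 0.8),
    "agentive:negotiated": (0.8, 0.9),
    "formal:document": (0.7, 0.9),
}
protocol = {
    "telic:coordinate-behavior": (0.8, 0.7),
    "agentive:negotiated": (0.6, 0.5),
    "formal:software": (0.9, 0.9),
}

print(round(weighted_cosine(treaty, protocol), 3))
```

Because only shared properties are compared, this sketch rewards overlap and is blind to mismatch; a fuller version would probably also penalize properties that one entity has and the other lacks.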

Already done vs white space: ALREADY DONE (do not reinvent): The four-lens-per-entity idea (GL qualia, 1991), the many-graded-lenses-on-one-graph idea (UDS), the small-universal-atom idea (NSM/CD/Jackendoff), the situation/frame lens (FrameNet), the scalar-multi-property role decomposition (Dowty→UDS), the whole-sentence graph (AMR), and — critically — the 'discover relationships no human drew across decomposed concepts' idea (Swanson's literature-based discovery, 1986, and the entire 2024–2025 LLM-hypothesis-generation field). The founder's belief that 'no one has thought to do this' is FALSE at the level of any single component; relationship-discovery-via-decomposition is a 40-year-old research program. GENUINE WHITE SPACE (the real novelty is the COMBINATION at scale, not any piece): (1) No prior system runs the FULL SPECTRUM of human analytical lenses (philosophical+ethical+aesthetic+economic+ecological+semiotic+phenomenological, far beyond linguistic) — every framework above is linguistic/lexical, narrow by design; an agentic engine that applies dozens of heterogeneous interpretive lenses is genuinely unattempted. (2) No prior decomposition substrate is PARACONSISTENT and contradiction-preserving — UDS/AMR/GL assume one correct analysis; donto can hold mutually contradictory lens-outputs as legal state forever, which is exactly right for 'speculative machine-proposed relations.' (3) No prior LBD/KG system is simultaneously evidence-first-to-the-byte AND formally certifiable (Lean) — this directly answers the Fodor critique and the surveys' #1 complaint (extracted triples lack corroboration). (4) Agentic generation of lenses at 39M-statement scale with HOLD-then-VERIFY is new: classic LBD/KG curate eagerly; donto's 'generate speculative, hold without collapsing, certify the rare valuable few' is an unexplored operating model. 
The honest pitch: the lenses, the atoms, and the discovery goal are all prior art; the AGENTIC + MANY-HETEROGENEOUS-LENS + PARACONSISTENT-EVIDENCE-FIRST-CERTIFIABLE substrate, at this scale, is the defensible novelty.

Hard problems:

network-science-of-discovery

The network-science / science-of-science tradition gives the most rigorous answer to the donto founder's central, unstated question: not "how do I generate connections?" but "which generated connections are VALUABLE?" Its core, empirically-validated finding is that value lives in a specific place — at the BRIDGES between otherwise-disconnected clusters, and in the ATYPICAL recombination of distant elements — but only when that novelty is anchored in convention. This is the field's deepest result and it directly contradicts a naive reading of the founder's intuition. Volume of connections is worthless; positionally-improbable connections are everything.

The lineage runs through three nested layers. (1) Network structure: Granovetter's "strength of weak ties" (1973) showed that novel information flows across bridges (weak, non-redundant ties), not within dense clusters where everyone already knows the same things. Burt formalized this as STRUCTURAL HOLES — a gap between two clusters with non-redundant information — and his "Structural Holes and Good Ideas" (AJS 2004, the Raytheon study) demonstrated empirically that managers whose networks SPAN holes have a "vision advantage": their ideas are disproportionately rated as valuable, less likely to be dismissed. Burt's line "the creative spark on which serendipity depends is to see bridges where others see holes" is almost a literal mission statement for a lens-intersection engine. (2) Combinatorics of discovery: Weitzman's "Recombinant Growth" (1998) and Arthur's "The Nature of Technology" (2009) model innovation as recombination of existing components, with the supply of ideas effectively unbounded — the binding constraint is the R&D/evaluation effort to test combinations, not the combinations themselves. Kauffman's "adjacent possible" and the TAP equation (Cortês, Steel, Kauffman et al.) formalize how the space of possible combinations explodes (a long plateau then a hockey-stick) as each new object opens new adjacent recombinations. (3) Empirical scoring of novelty value: Uzzi, Mukherjee, Stringer & Jones, "Atypical Combinations and Scientific Impact" (Science 2013, 17.9M papers) is the keystone. They measure a paper's combinations by z-scoring every pair of co-referenced journals against a degree-preserving randomized null (how surprising is this pairing vs chance), then take the paper's MEDIAN conventionality and its 10th-percentile TAIL novelty. The hit finding: the highest-impact papers are NOT the most novel — they sit in the high-conventionality / high-tail-novelty quadrant. 
A bedrock of convention with a sharp intrusion of one atypical combination is 2x more likely to be a hit. Pure novelty underperforms.
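
The scoring recipe described above can be sketched in a few lines. This is a simplified illustration, not the authors' pipeline: it assumes the null-model counts for each journal pair have already been produced by some degree-preserving randomization (not shown here), and the cutoffs in `has_hit_profile` are placeholder values chosen only for the example.

```python
import statistics


def pair_z_score(observed: float, null_counts: list[float]) -> float:
    """z-score of an observed pair frequency against counts from randomized networks."""
    mu = statistics.mean(null_counts)
    sigma = statistics.pstdev(null_counts)
    if sigma == 0:
        return 0.0  # no variation in the null sample: treat the pair as unsurprising
    return (observed - mu) / sigma


def paper_profile(pair_z_scores: list[float]) -> tuple[float, float]:
    """Return (median conventionality, 10th-percentile tail novelty) for one paper.

    Needs at least two pair scores. A high median means a conventional scaffold;
    a low (negative) 10th percentile means at least some atypical pairings.
    """
    median_z = statistics.median(pair_z_scores)
    tail_z = statistics.quantiles(pair_z_scores, n=10, method="inclusive")[0]
    return median_z, tail_z


def has_hit_profile(pair_z_scores: list[float],
                    conventionality_cut: float = 0.0,
                    novelty_cut: float = 0.0) -> bool:
    """True for the high-conventionality / high-tail-novelty quadrant.

    The two cutoffs are illustrative assumptions, not values from the paper.
    """
    median_z, tail_z = paper_profile(pair_z_scores)
    return median_z > conventionality_cut and tail_z < novelty_cut


# Invented z-scores: mostly conventional pairings plus two atypical ones.
anchored_leap = [5.0, 4.0, 6.0, 5.0, -3.0, 4.0, 5.0, 6.0, -4.0, 5.0]
all_conventional = [5.0, 4.0, 6.0, 5.0, 3.0, 4.0, 5.0, 6.0, 4.0, 5.0]
all_novel = [-5.0, -4.0, -6.0, -5.0, -3.0, -4.0, -5.0, -6.0, -4.0, -5.0]

for name, scores in [("anchored", anchored_leap),
                     ("conventional", all_conventional),
                     ("novel", all_novel)]:
    print(name, has_hit_profile(scores))
```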

The science-of-science tradition also quantifies the OPPOSITE problem the founder will hit. Foster, Rzhetsky & Evans, "Tradition and Innovation in Scientists' Research Strategies" (ASR 2015), mapped millions of biomedical claims as a network of chemical relationships and showed scientists overwhelmingly play it safe (extending known nodes) because the reward premium for risky bridging strategies, though real (higher expected impact), is insufficient to compensate for the higher chance of being ignored. Wang, Veugelers & Stephan, "Bias Against Novelty in Science" (Research Policy 2017), showed the most novel papers are SYSTEMATICALLY undervalued in short windows, suffer delayed recognition, and are cited mainly in "foreign" fields — precisely because no single evaluator holds all the lenses. This is the strongest external validation of the founder's thesis: there is a real, measurable surplus of value in cross-lens bridges that human, discipline-bounded evaluation leaves on the table. The Funk/Owen-Smith CD-index and the Park et al. (Nature 2023) "disruption is declining" work give an alternative, network-based way to score whether a connection CONSOLIDATES or DISRUPTS its neighborhood.

Crucially for donto's paraconsistent design, Lin, Evans & Wu's "New Directions in Science Emerge from Disconnection and Discord" (arXiv 2103.03398) shows that DISAGREEMENT/contradiction between clusters, not just disconnection, is the strongest predictor of where new scientific directions emerge. Bridges that span a structural hole AND carry discord are disproportionately generative. This is the empirical warrant for holding contradictions as legal state rather than collapsing them: a contradiction frontier IS a map of where novel directions are most likely.

The throughline for a discovery-scoring engine: a discovered relationship should be scored not by plausibility alone but by (a) the network DISTANCE/improbability of the entities it bridges (structural-hole span, z-score atypicality), (b) the CONVENTIONALITY of its surrounding scaffold (Uzzi: anchor the leap in known ground), and (c) the presence of unresolved DISCORD across the bridge. Score for surprise-given-grounding, not for either alone.
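
One way to read "surprise-given-grounding" is as a product of the signals, so that an edge scores zero if either the leap or the anchor is missing. The sketch below is only an assumption about how the three signals might be combined: the `CandidateEdge` fields, the multiplicative form, the discord bonus, and the plausibility gate are all hypothetical, and every input is assumed to be pre-normalized to [0, 1].

```python
from dataclasses import dataclass


@dataclass
class CandidateEdge:
    label: str
    atypicality: float      # (a) improbability of the bridge, normalized to [0, 1]
    conventionality: float  # (b) how well-grounded the surrounding scaffold is
    discord: float          # (c) unresolved disagreement across the bridge
    plausibility: float     # used only as a gate, never as the ranking signal


def surprise_given_grounding(edge: CandidateEdge, plausibility_gate: float = 0.3) -> float:
    """Hypothetical composite: novelty and grounding must both be present."""
    if edge.plausibility < plausibility_gate:
        return 0.0
    # The product is zero when either factor is zero; discord acts as a bonus.
    return edge.atypicality * edge.conventionality * (1.0 + edge.discord)


def rank(edges: list[CandidateEdge]) -> list[CandidateEdge]:
    return sorted(edges, key=surprise_given_grounding, reverse=True)


# Invented candidates illustrating the three regimes.
candidates = [
    CandidateEdge("pure novelty", 0.95, 0.0, 0.2, 0.6),
    CandidateEdge("pure convention", 0.0, 0.95, 0.0, 0.9),
    CandidateEdge("anchored bridge", 0.8, 0.7, 0.5, 0.6),
]

for edge in rank(candidates):
    print(edge.label, round(surprise_given_grounding(edge), 3))
```

A weighted sum would let a very high value on one axis compensate for a zero on another, which is the behaviour the paragraph above warns against; that is the only reason a product is used here.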

Foundational works:

Modern AI systems:

Relevance to the lens engine: This area is donto's scoring layer: it tells the engine how to RANK the relationships its many-lens decomposition proposes. BORROW: (1) Uzzi's exact recipe: for any discovered relationship, compute a z-score atypicality against a degree-preserving randomized null over the entity graph, then favor relationships that pair HIGH conventional scaffolding with a HIGH-novelty TAIL (a single surprising bridge anchored in known ground), not maximal novelty. This converts 'plausible' into 'valuable.' (2) Burt's structural-hole span: score a proposed edge by how many non-redundant clusters it connects and how large the hole it bridges; brokerage betweenness over the entity graph is a directly computable value signal, and donto already has the quad graph to compute it. (3) Sourati-Evans 'avoid the crowd': model which relationships are already cognitively reachable (densely co-occurring, low surprise) and DOWN-weight them; up-weight the 'alien' bridges far from existing co-mention, which is where the unrecovered surplus value sits. (4) Discord+disconnection (Lin, Evans & Wu, "New Directions in Science Emerge from Disconnection and Discord"): donto's contradiction frontier is not a bug to resolve but a PRIORITY MAP: rank candidate relationships highest where they bridge disconnected clusters that also carry argument-edge discord (supports/rebuts). donto's paraconsistent substrate is uniquely able to hold and exploit this signal where a consistency-enforcing store would have destroyed it. (5) Weitzman/Arthur/Kauffman combinatorics: internalize that generation is cheap and unbounded; the engine's entire moat is the triage filter, and the adjacent-possible explosion means you MUST bound exploration (sample paths, cap fan-out) or drown.
AVOID: (a) optimizing for raw novelty or raw volume — Wang/Veugelers/Stephan and Uzzi both show pure novelty is low-value and even penalized; (b) shortest-path / nearest-neighbor relationship discovery — SciAgents found random/distant paths strictly better for creativity; (c) treating LLM-rated plausibility as value — the 2025 benchmarks show LLMs over-produce plausible-invalid hypotheses, so plausibility must be a gate, never the ranking. Net: donto should ship a 'brokerage + atypicality + discord' composite score as the lens it applies at query time to triage machine-proposed hypothesis_only edges, and use Lean-4 certification only on the thin top slice that survives.

Already done vs white space: ALREADY DONE (the founder should not reinvent): (1) The CORE THESIS that valuable connections live at bridges/atypical combinations is not a hunch — it is one of the most replicated results in social science (Granovetter→Burt→Uzzi→Foster/Evans, across millions of papers). (2) The exact MATH to score a connection's value-improbability already exists and is open (Uzzi z-score atypicality, Burt brokerage/effective-size, CD/disruption index, Novelpy package). (3) MANY-LENS / cross-domain GRAPH TRAVERSAL to generate bridging hypotheses is a shipped product category — SciAgents (random-path graph reasoning), Sourati-Evans (human-aware walks), analogy mining, LBD link-prediction, AI co-scientist all do 'find the bridge no one drew.' (4) The empirical proof that latent cross-document relationships exist and are recoverable (mat2vec) is settled. GENUINE WHITE SPACE — the defensible combination: (a) PERSISTENT, PARACONSISTENT HOLDING of speculative relationships as first-class legal state. Every system above generates hypotheses transiently and either validates-or-discards them; NONE holds a durable, contradiction-preserving, evidence-anchored frontier of millions of unresolved machine-proposed edges that can be re-queried, re-scored, and accreted over time as new lenses/entities arrive. donto's bitemporal contradiction store turns one-shot generation into a compounding asset. (b) SCALE + GENERALITY: the discovery systems are domain-locked (materials, biomedicine); donto is a general 39.5M-statement substrate, so it can compute brokerage/atypicality across domains that have never been jointly indexed — exactly the foreign-field surplus Wang/Veugelers/Stephan showed is undervalued. (c) EVIDENCE-ANCHORING + LEAN CERTIFICATION of the survivors: no discovery system byte-anchors every claim AND offers a formal certification overlay, which is precisely what closes the 'plausible-but-invalid' gap the 2025 benchmarks expose. 
(d) IDENTITY-AS-HYPOTHESIS: discovery in these systems assumes fixed entities; donto's queryable-merge-under-a-lens means the SAME substrate can discover relationships under different identity resolutions — a genuinely unexplored degree of freedom. So 'no one has thought of the many-lens bridge idea' is FALSE; 'no one has built a persistent, paraconsistent, evidence-first, cross-domain substrate that holds and compounds the firehose and then certifies the survivors' is essentially TRUE and is the real moat.

Hard problems:

multi-perspective-agentic-reasoning

The founder's vision — decompose any entity through the full spectrum of human analytical lenses (philosophical, temporal, causal, mereological, teleological, ethical, semiotic, etc.) and harvest the RELATIONSHIPS that emerge at the INTERSECTION of lenses — sits at the confluence of a deep philosophical lineage and a very active 2023-2026 AI research front. The intellectual root is PERSPECTIVISM (Nietzsche: knowledge is irreducibly perspectival, and crucially his methodological perspectivism — "the more affects we allow to speak about a thing, the more complete will be our concept of it"; Ortega y Gasset; Wittgenstein's aspect-seeing). The engineering root is Minsky's "Society of Mind" (1986): intelligence as the emergent product of many simple, specialized, non-intelligent agents. The discovery root is Don Swanson's Literature-Based Discovery (1986, fish-oil/Raynaud's): valuable knowledge already exists latently as UNCONNECTED public facts across disciplinary silos (the A-B-C model), and the payoff is connecting them — which is almost exactly the founder's "relationships no human thought to draw because no human holds all the lenses." Conceptual Blending Theory (Fauconnier & Turner) supplies the cognitive mechanism for why cross-frame combination is generative rather than merely additive.

The modern AI realization is the multi-agent / multi-perspective LLM literature. Du, Li, Tenenbaum & Mordatch's "multiagent debate" (2023, ICML 2024) showed multiple LLM instances proposing and critiquing over rounds improves factuality and math/strategic reasoning — explicitly framed as a "society of minds." Tree-of-Thoughts (Yao et al. 2023) and Graph-of-Thoughts (Besta et al. 2023/AAAI 2024) generalize single-chain reasoning to branched/graph search with self-evaluation, lookahead, backtracking, and — in GoT — synergistic recombination of intermediate thoughts, the structural analog of intersecting lenses. Solo-Performance-Prompting (Wang et al., NAACL 2024) is the most direct precursor to the founder's "many lenses on one object": a single LLM dynamically identifies and simulates multiple task-relevant PERSONAS ("cognitive synergy"), and critically finds that DYNAMICALLY-IDENTIFIED, fine-grained personas ("Film Expert") beat fixed generic ones ("Expert") — e.g. 79% vs 38% on Codenames — though synergy only EMERGES at GPT-4-level capability. Mixture-of-Agents (Wang et al. 2024) layers proposer LLMs whose outputs are aggregated, beating GPT-4-Omni on AlpacaEval. CAMEL (Li et al., NeurIPS 2023) and AutoGen operationalize role-based agent ensembles as infrastructure.

The single most important system for this vision is Google DeepMind's AI co-scientist (Gottweis et al., arXiv 2502.18864, Feb 2025; Nature 2026). It is a near-literal instantiation of the generate→hold-many→curate-the-valuable pipeline the founder describes, with named specialized agents: a Generation agent proposes hypotheses; a PROXIMITY agent clusters them specifically so the system does not collapse into a single line of thinking (the anti-redundancy mechanism); a Reflection agent acts as virtual peer reviewer scoring novelty/correctness/rigor; a Ranking agent runs an Elo "idea tournament" of simulated debates; an Evolution agent recombines and refines top hypotheses; a meta-review agent feeds back. It produced experimentally validated novel findings (AML drug-repurposing candidates with in-vitro tumor inhibition; novel epigenetic liver-fibrosis targets validated in human organoids; in-silico rediscovery of an unpublished gene-transfer mechanism). This is concrete evidence that agentic multi-perspective generation-plus-curation yields genuinely novel, valuable relationships, not just redundancy.
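
To make the "Elo idea tournament" concrete, here is a minimal sketch using the standard Elo update rule. It is not the co-scientist's implementation: the `judge` callable stands in for a simulated debate, and the K-factor of 32, the starting rating of 1200, and the round-robin pairing are all assumptions made for the example.

```python
import itertools
from typing import Callable


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool,
               k: float = 32.0) -> tuple[float, float]:
    """Return the two updated ratings after one pairwise comparison."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    return rating_a + delta, rating_b - delta  # zero-sum update


def run_tournament(hypotheses: list[str],
                   judge: Callable[[str, str], bool],
                   k: float = 32.0,
                   start: float = 1200.0) -> list[tuple[str, float]]:
    """Round-robin tournament; judge(a, b) returns True when a wins."""
    ratings = {h: start for h in hypotheses}
    for a, b in itertools.combinations(hypotheses, 2):
        ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], judge(a, b), k)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Stand-in judge: a hidden quality table replaces the simulated debate.
hidden_quality = {"h1": 3, "h2": 2, "h3": 1}
ranking = run_tournament(["h1", "h2", "h3"],
                         judge=lambda a, b: hidden_quality[a] > hidden_quality[b])
print(ranking)
```

For a population too large for round-robin, pairings would have to be sampled, and ratings then depend on which comparisons happened to be run.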

On the founder's central empirical question — does diverse decomposition produce EMERGENT INSIGHT or just REDUNDANCY? — the literature gives a sharp, honest answer: BOTH, and which one you get is a design problem, not a guarantee. The rigorous theory is the bias-variance-DIVERSITY decomposition (Wood, Mu, Brown et al., JMLR 2023): an ensemble's expected error = average bias + average variance − DIVERSITY, where diversity is precisely member DISAGREEMENT. Diversity is provably valuable, BUT only when members are individually competent (if "experts disagree very frequently they are individually poor estimators"). The cautionary 2024-2026 evidence is strong: "Talk Isn't Always Cheap" (2509.05396) shows debate frequently DEGRADES accuracy — agents flip from correct to incorrect under social/peer pressure (conformity dominates truth-seeking), weak agents contaminate strong ones, and accuracy can fall over rounds. "Representational Collapse in Multi-Agent LLM Committees" (2604.03809) measured that 3 same-model agents under different role prompts had mean cosine similarity 0.888 and effective rank 2.17/3 — i.e. nominal "diversity" via persona prompts can be largely ILLUSORY. The "tyranny of the majority" / echo-chamber effect is documented repeatedly. The constructive response is diversity-AWARE design: diversity-aware message retention (2603.20640), structured disagreement analysis for uncertainty (DiscoUQ 2603.20975), and the co-scientist's Proximity-agent clustering — all aimed at PRESERVING genuine divergence instead of letting it collapse.
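
The "error = individual error minus diversity" shape can be checked numerically in its simplest classical form: the ambiguity decomposition for squared loss with a plain averaging ensemble. This is a special case chosen for illustration, not the full bias-variance-diversity decomposition cited above, and the numbers are invented.

```python
def ambiguity_decomposition(predictions: list[float],
                            target: float) -> tuple[float, float, float]:
    """Return (ensemble error, average member error, diversity) under squared loss.

    For an averaging ensemble the identity
        ensemble_error == average_member_error - diversity
    holds exactly, where diversity is the members' mean squared
    disagreement with the ensemble prediction.
    """
    n = len(predictions)
    ensemble = sum(predictions) / n
    ensemble_error = (ensemble - target) ** 2
    average_member_error = sum((p - target) ** 2 for p in predictions) / n
    diversity = sum((p - ensemble) ** 2 for p in predictions) / n
    return ensemble_error, average_member_error, diversity


# Three members that disagree with each other around a target of 5.0.
diverse = ambiguity_decomposition([2.0, 4.0, 6.0], target=5.0)
# Three members that agree exactly: no diversity, so no ensemble gain.
collapsed = ambiguity_decomposition([4.0, 4.0, 4.0], target=5.0)

for name, (ens, avg, div) in [("diverse", diverse), ("collapsed", collapsed)]:
    print(name, round(ens, 3), round(avg, 3), round(div, 3))
```

The identity also shows the competence caveat: diversity only subtracts from the average member error, so if the members are individually poor the ensemble starts from a large number.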

Foundational works:

Modern AI systems:

Relevance to the lens engine: BORROW: (1) The co-scientist topology is the proven recipe for donto's lens engine — a Generation phase (run N lens-agents over an entity/text) feeding a PROXIMITY/clustering step (essential: without it you get redundancy collapse, not emergent relationships), then Reflection/Ranking via an Elo idea-tournament to curate the rare valuable cross-lens relationships, then Evolution to recombine survivors. donto's paraconsistent substrate is the ideal place to HOLD the generated-but-unranked tournament population that co-scientist keeps only in-memory. (2) SPP's strongest, most actionable lesson: DYNAMIC, task-specific lenses beat a fixed generic list — so rather than hard-coding the same 6/N philosophical lenses every time, let an agent pick the fine-grained lenses an entity actually rewards (a treaty rewards 'legal/temporal/diplomatic-game-theory'; a poem rewards 'prosodic/semiotic/phenomenological'). (3) GoT's recombination-of-thoughts is the literal mechanism for 'relationships at the intersection of lenses' — model lens-outputs as graph vertices and explicitly generate edges BETWEEN them; do not just concatenate per-lens fact lists. (4) Swanson LBD + conceptual blending are the right framing for the PAYOFF metric: a valuable output is an A-C relationship surfaced because lens-A and lens-C share a B-term — instrument for that, not for raw fact count. AVOID / GUARD AGAINST: (a) Redundancy/representational collapse — same base model under N persona prompts gives ~rank-2 'diversity' (cosine 0.888); donto must measure semantic diversity of lens outputs (effective rank / pairwise distance) and discount near-duplicates, or genuinely vary models/temperature/tools per lens. 
(b) Conformity & 'tyranny of the majority' — do NOT make lenses debate to consensus; donto's paraconsistent design is a STRENGTH here precisely because it can preserve minority/contradictory lens-claims as legal state (hypothesis_only, supports/rebuts/undercuts edges) instead of collapsing them — this is donto's genuine differentiator over every debate-to-consensus system. (c) Weak-agent contamination — a low-quality lens degrades the pool; gate lens-outputs by an individual-competence check (bias-variance-diversity theory: diversity only helps among competent members). (d) Cost/emergence floor — SPP shows synergy only emerges at frontier capability; budget for strong models on the generation lenses or you'll get redundancy, not insight.
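
Guard (a) asks for a measurement of lens-output diversity. A minimal sketch follows, assuming each lens output has already been embedded as a vector by some embedding model (not shown). The effective-rank proxy used here is the participation ratio of the cosine Gram matrix; that is one of several definitions and is not necessarily the one used in the cited study, so its values should not be compared with the 2.17/3 figure.

```python
import math
from itertools import combinations


def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_pairwise_cosine(vectors: list[list[float]]) -> float:
    """Average cosine similarity over all distinct pairs of lens outputs."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def participation_ratio(vectors: list[list[float]]) -> float:
    """(trace G)^2 / ||G||_F^2 for the cosine Gram matrix G.

    Equals n when the n outputs are mutually orthogonal and 1 when they
    all point the same way, so it behaves like an effective count of
    independent perspectives. Assumes no zero vectors.
    """
    n = len(vectors)
    frobenius_sq = sum(cosine(u, v) ** 2 for u in vectors for v in vectors)
    return (n * n) / frobenius_sq


# Invented embeddings: three genuinely different lenses vs three near-copies.
distinct = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
near_copies = [[1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [1.0, 0.1, 0.1]]

for name, vecs in [("distinct", distinct), ("near_copies", near_copies)]:
    print(name, round(mean_pairwise_cosine(vecs), 3), round(participation_ratio(vecs), 3))
```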

Already done vs white space: ALREADY DONE (the founder should NOT assume 'no one has thought of this'): The core loop — run many specialized perspectives/agents over a problem, hold a population of candidate hypotheses, debate/rank/evolve them, and surface validated novel ones — is fully built and peer-reviewed in the AI co-scientist (Nature 2026), with WET-LAB-validated novel discoveries. 'Many personas/lenses on one object then combine' is done at the prompt level by SPP (NAACL 2024) and at the architecture level by MoA, CAMEL, AutoGen, multiagent debate, and ToT/GoT. The conceptual claim that latent cross-silo relationships are the prize is 40 years old (Swanson LBD) and being actively LLM-ified (Elicit, SKiM, the 2024 MDPI LBD work, hypothesis-generation surveys arXiv 2504.05496). 'Many critical lenses over one text' is standard literary pedagogy and is being studied for LLMs (arXiv 2507.11582). Multi-view/multi-agent KG construction with conflict resolution (CooperKGC, KARMA) overlaps donto's extraction layer. GENUINE WHITE SPACE (donto's defensible novelty is the COMBINATION, not any single piece): (1) PERSISTENCE & SCALE — every system above generates-and-discards within a single session/query; NONE durably HOLDS the full speculative cross-lens relationship population as queryable, bitemporal, evidence-anchored legal state across millions of entities. donto can keep the 99% of machine-proposed relationships that co-scientist throws away, forever, for later re-evaluation as lenses/evidence improve. (2) PARACONSISTENT CO-EXISTENCE — every debate/committee system is consensus-seeking and thus actively destroys the minority and contradictory readings; donto's contradiction-preserving substrate with typed argument edges (supports/rebuts/undercuts) and identity-as-hypothesis is, as far as the literature shows, UNIQUE as a place to let mutually-contradictory cross-lens relationship-claims coexist without collapse. 
(3) CROSS-ENTITY × CROSS-LENS at substrate scale — the systems above run many lenses over ONE object; the founder's distinctive move is harvesting relationships at the intersection of lenses ACROSS millions of entities simultaneously (a global LBD over a 39M-statement graph). That global, always-on, lens-indexed serendipity surface does not exist in the literature. (4) FORMAL CERTIFICATION of the curated survivors — pairing speculative generation with a Lean-4 overlay that can CERTIFY the rare valuable relationship's shape/rule is genuinely unexplored (co-scientist validates in wet labs / Elo, not by formal proof).

Hard problems:

serendipity-novelty-evaluation

This field exists to answer the donto founder's make-or-break question directly: when a machine proposes a vast number of novel relationships, how do you tell a profound connection from pareidolia? Three research traditions converge on it, and all three have already discovered the same hard truth.

(1) Computational serendipity in recommender systems is the most mature. The field's consensus decomposition (Kotkov, Wang & Veijalainen 2016 survey; Murakami 2008; Ge, Delgado-Battenfeld & Jannach 2010; Adamopoulos & Tuzhilin 2014) is that serendipity = relevant AND novel AND unexpected/surprising, where each component is operationalized separately. The standard trick for unexpectedness is the "primitive prediction model" (Murakami/Ge): a recommendation is unexpected iff it would NOT have been produced by an obvious baseline: R_unexp = R \ PM(u). Serendipity score SRDP then multiplies unexpectedness by usefulness (relevance/rating). Adamopoulos & Tuzhilin formalize unexpectedness as distance from a set of expectations E (items the user/system already takes for granted), explicitly separating it from novelty (unknown) and diversity (intra-list dissimilarity). The crucial, sobering lesson from this tradition (Kotkov et al., "The Dark Matter of Serendipity," CHIIR 2024): serendipity is fundamentally a subjective, experienced event, yet ~all systems measure only afforded/observable serendipity via objective proxies, so the metrics are biased and most genuinely serendipitous hits are invisible to them. There is no clean offline ground truth for "valuable surprise."
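
The primitive-prediction-model subtraction is simple enough to sketch directly. The version below follows the general shape described above (set difference, then average usefulness over what survives); the item names and usefulness values are invented, and the exact normalization varies between papers, so this should be read as one plausible form and not as any author's definition.

```python
def unexpected_items(recommended: list[str], primitive: set[str]) -> list[str]:
    """R \\ PM: keep only items an obvious baseline would NOT have produced."""
    return [item for item in recommended if item not in primitive]


def serendipity(recommended: list[str],
                primitive: set[str],
                usefulness: dict[str, float]) -> float:
    """Average usefulness over the unexpected items (0.0 if there are none).

    Items missing from `usefulness` count as not useful.
    """
    unexpected = unexpected_items(recommended, primitive)
    if not unexpected:
        return 0.0
    return sum(usefulness.get(item, 0.0) for item in unexpected) / len(unexpected)


# Invented example: links proposed by a rich model vs a cheap baseline.
proposed = ["link_a", "link_b", "link_c", "link_d"]
baseline = {"link_a", "link_b"}
judged_useful = {"link_a": 1.0, "link_c": 1.0, "link_d": 0.0}

print(unexpected_items(proposed, baseline))
print(serendipity(proposed, baseline, judged_useful))
```

Note that the score depends entirely on where `judged_useful` comes from, which is the subjective part the "dark matter" critique says objective proxies miss.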

(2) Surprise as a formal quantity. Itti & Baldi's Bayesian Surprise (NIPS 2006 / Vision Research 2009) is the canonical operational definition: surprise = KL divergence between an observer's PRIOR and POSTERIOR beliefs after seeing data, D_KL(posterior‖prior). It is provably distinct from Shannon information/rarity (a rare-but-belief-irrelevant event has high Shannon surprisal but zero Bayesian surprise). Empirically it is "the strongest known attractor of human attention" (~72–84% of gaze shifts go to above-average-surprise locations). This has been ported to recommenders (Kim et al., "Topic-Level Bayesian Surprise and Serendipity," RecSys 2023) by tracking KL divergence between a user's prior and posterior topic distributions. Bayesian surprise is the most principled, substrate-friendly metric available for the lens engine: it is exactly "how much does this relationship change the model's beliefs."
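
The distinction between Bayesian surprise and Shannon surprisal can be shown with a toy discrete example. The sketch assumes a finite set of hypotheses with a strictly positive prior and a likelihood for the observed data under each hypothesis; all numbers are invented.

```python
import math


def bayes_update(prior: list[float], likelihood: list[float]) -> list[float]:
    """Posterior over hypotheses after observing data with the given likelihoods."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)
    return [u / evidence for u in unnormalized]


def kl_divergence_bits(p: list[float], q: list[float]) -> float:
    """D_KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def bayesian_surprise(prior: list[float], likelihood: list[float]) -> float:
    """D_KL(posterior || prior): how far the data moved the beliefs."""
    return kl_divergence_bits(bayes_update(prior, likelihood), prior)


def shannon_surprisal(prior: list[float], likelihood: list[float]) -> float:
    """-log2 of the marginal probability of the data: how rare the data was."""
    return -math.log2(sum(p * l for p, l in zip(prior, likelihood)))


prior = [0.5, 0.5]
rare_but_irrelevant = [0.001, 0.001]  # equally unlikely under both hypotheses
informative = [0.9, 0.1]              # favors the first hypothesis

for name, lik in [("rare_but_irrelevant", rare_but_irrelevant),
                  ("informative", informative)]:
    print(name,
          round(shannon_surprisal(prior, lik), 3),
          round(bayesian_surprise(prior, lik), 3))
```

The first observation is rare, so its surprisal is high, but it leaves the posterior equal to the prior and therefore carries no Bayesian surprise. Applying this to a substrate would additionally require deciding what the "beliefs" are, which this sketch leaves open.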

(3) Literature-Based Discovery (LBD) — Swanson's 1986 Raynaud's/fish-oil and migraine/magnesium discoveries via the ABC model (A relates to B, B relates to C, A↔︎C unknown → hypothesize A–C) — is the closest historical analog to "relationships no human drew because no one held all the lenses." Critically, LBD has spent 30 years grappling with exactly the founder's evaluation problem and has NOT solved it. Two evaluation regimes exist, both flawed: replication (rediscover Swanson's 2–3 known cases — cherry-picked, no statistical power) and time-slicing (Yetisgen-Yildiz & Pratt 2009: pick cutoff year t, treat post-t co-occurrences of A–C absent before t as "discoveries," compute precision/recall/F/AUC/MAP/MRR). Sebastian/Moreau (Bioinformatics 2023, "addressing the subpar evaluation methodology") shows time-slicing is "too noisy": the gold standard is dominated by meaningless co-occurrences (Ebolavirus + Professional Burnout), the true-discovery fraction is "unknown and likely low," so the metric rewards co-occurrence prediction, not insight. There is no agreed benchmark, no shared task, no formal definition of "a discovery."
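
Both the ABC step and the time-slicing evaluation are easy to sketch on toy data. The co-occurrence pairs below reuse the Raynaud's example but are a hand-built illustration, not a real corpus; `unrelated_term` is an invented distractor, and the function names are hypothetical.

```python
from collections import defaultdict


def abc_candidates(cooccurrences: list[tuple[str, str]],
                   a_term: str) -> dict[str, set[str]]:
    """Open discovery: C terms reachable from A via some B, but not linked to A.

    Returns a mapping from each candidate C to the set of bridging B terms.
    """
    neighbors: dict[str, set[str]] = defaultdict(set)
    for x, y in cooccurrences:
        neighbors[x].add(y)
        neighbors[y].add(x)
    direct = set(neighbors[a_term])
    candidates: dict[str, set[str]] = defaultdict(set)
    for b in direct:
        for c in neighbors[b]:
            if c != a_term and c not in direct:
                candidates[c].add(b)
    return dict(candidates)


def time_sliced_precision_recall(predicted: set[str],
                                 future_new: set[str]) -> tuple[float, float]:
    """Score predictions against links that first appear after the cutoff."""
    hits = len(predicted & future_new)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(future_new) if future_new else 0.0
    return precision, recall


# Toy pre-cutoff literature (co-occurring term pairs).
pre_cutoff = [
    ("raynaud", "blood viscosity"),
    ("raynaud", "platelet aggregation"),
    ("fish oil", "blood viscosity"),
    ("fish oil", "platelet aggregation"),
    ("unrelated_term", "platelet aggregation"),
]
# Toy post-cutoff "gold standard": A-C links that appeared later.
post_cutoff_new = {"fish oil"}

found = abc_candidates(pre_cutoff, "raynaud")
print(found)
print(time_sliced_precision_recall(set(found), post_cutoff_new))
```

The weakness described above is visible even here: the gold standard is just "co-occurred later", so a meaningless later co-occurrence would count as a discovery and a true but still unwritten link would count as a miss.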

The unifying finding across all three traditions, plus computational creativity (Boden's new-surprising-valuable; Ritchie's novelty/quality/typicality; Lamb et al.'s 2019 survey of evaluation methods — CAT/Amabile, Colton's tripod, Jordanous's SPECS/components) and modern LLM-idea studies (Si, Yang & Hashimoto 2024 — 100+ reviewers found LLM ideas MORE novel but LESS feasible/valid; TruthHypo 2025 — explicit novelty↔︎validity tradeoff, high hallucination): novelty is cheap and mechanizable; value is expensive and resists automation. Generation is solved; discrimination is not. At scale this collides with the statistics of multiple comparisons / false discovery rate: an engine that proposes millions of cross-lens links is running millions of implicit hypothesis tests, so the EXPECTED number of spurious-but-surprising connections is enormous (apophenia by construction). Without FDR control, calibration, or downstream validation, "a connection no human ever drew" and "a connection no human ever drew because it's noise" are indistinguishable.
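
The multiple-comparisons point can be illustrated with the Benjamini-Hochberg procedure, one standard way to control the false discovery rate. The sketch assumes every candidate link already has a p-value from some null model (how to obtain one for a cross-lens link is not specified here and is itself an open design question); the p-values are invented.

```python
def expected_false_positives(num_tests: int, alpha: float) -> float:
    """Expected spurious 'discoveries' if every null is true and no correction is applied."""
    return num_tests * alpha


def benjamini_hochberg(p_values: list[float], alpha: float = 0.05) -> list[int]:
    """Return indices of hypotheses accepted as discoveries at FDR level alpha.

    Sort p-values ascending, find the largest rank k with
    p_(k) <= k * alpha / m, and accept the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * alpha / m:
            largest_passing_rank = rank
    return sorted(order[:largest_passing_rank])


# Uncorrected testing at scale: a million null links at alpha = 0.05.
print(expected_false_positives(1_000_000, 0.05))

# Invented p-values for ten candidate links.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_values, alpha=0.05))
```

In this toy run, five of the ten links pass a naive 0.05 threshold but only the two smallest p-values survive the correction, which is the triage behaviour an explicit false-discovery budget would need.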

Foundational works:

Modern AI systems:

Relevance to the lens engine: BORROW: (1) The decomposition discipline — never score a relationship with one number. Carry relevance, novelty (unknown-ness), unexpectedness (distance from an expectation set E), and value as SEPARATE axes (Kotkov/Adamopoulos-Tuzhilin). donto's bitemporal + evidence-first design already lets you compute novelty cheaply (is this triple absent from the substrate?) and unexpectedness via the 'primitive prediction model' trick (Murakami/Ge): a cross-lens link is surprising iff a cheap single-lens baseline would NOT have produced it — flag exactly the links that survive that subtraction. (2) Bayesian surprise (Itti-Baldi) is the ideal native scorer: D_KL between the substrate's belief BEFORE and AFTER admitting a hypothesis edge measures 'how much does this relationship change what donto believes' — belief-relative, not mere rarity, and it composes naturally with paraconsistency (a contradiction-inducing edge is maximally surprising). (3) SciAgents is the proof-of-concept of your exact mechanism — random paths between distant nodes to manufacture cross-domain surprise — so adopt its agent topology (Ontologist→Scientist→Critic) but FIX its admitted gap: it has no cross-hypothesis ranking/filtering. (4) Robin is the north star for the verification end: the only unambiguous serendipity metric is downstream validation; design donto's 'rare valuable' curation tier around external grounding (the evidence-anchor-to-source-byte and Lean-4 certification overlays are precisely the right substrate for this). (5) Toms' 'poor similarity' and 'analogy' mechanisms and Boden's combinational creativity legitimize the intersection-of-lenses thesis intellectually. AVOID: (1) Treating volume as the goal — every tradition shows novelty is cheap and the bottleneck is value-discrimination; a million unanchored links is the disease, not the cure. Hold them as hypothesis_only (donto already does this) but never surface them un-triaged. 
(2) Believing offline metrics certify value — time-sliced LBD evaluation rewards co-occurrence prediction, not insight; the 'dark matter' critique shows objective serendipity proxies miss most real serendipity. Treat any automatic 'interestingness' score as a triage filter, never a verdict. (3) Ignoring multiple comparisons — at your scale FDR is not optional; an engine proposing millions of links manufactures apophenia by construction. Make the false-discovery budget an explicit, tunable parameter and recycle rejected links as negatives (HITL-KG pattern). (4) LLM self-evaluation as the final gate — Si et al. and TruthHypo show models over-rate their own novel-but-invalid outputs.

Already done vs white space: ALREADY DONE (do not reinvent): (a) The conceptual decomposition of serendipity into relevance/novelty/unexpectedness/value, with formal metrics for each (Kotkov, Murakami, Ge, Adamopoulos-Tuzhilin). (b) A principled, belief-relative surprise metric (Itti-Baldi Bayesian surprise) and its recommender port. (c) The exact generative mechanism the founder describes — sampling random paths between distant concepts in a knowledge graph to surface 'connections no one drew' — is implemented and published (SciAgents). (d) Retrospective time-sliced evaluation of discovery systems (Yetisgen-Yildiz & Pratt) and the multiple-comparisons/FDR machinery. (e) End-to-end agentic discovery with real wet-lab validation (Robin) and large human studies of AI-idea novelty (Si et al.). So 'agents that decompose and propose cross-domain links and score their novelty' is NOT white space; it is a crowded, ~40-year lineage (LBD) plus a 2024-2026 agentic wave. GENUINE WHITE SPACE — the defensible combination: (1) MANY LENSES SIMULTANEOUSLY AS THE GENERATIVE SUBSTRATE. Every prior system uses ONE representation (a citation graph, one ontology, topic vectors). Nobody systematically decomposes each entity through the full spectrum of analytical lenses (mereological, teleological, semiotic, phenomenological, ethical, ecological...) and then mines the INTERSECTIONS across lenses for relationships. The lens-cross-product as the search space is novel. (2) A PARACONSISTENT, CONTRADICTION-PRESERVING HOLDING TANK. Every prior system must commit, prune, or collapse contradictory hypotheses; donto can legally hold mutually-contradictory machine-proposed relationships forever as hypothesis_only with typed supports/rebuts/undercuts argument edges and a contradiction frontier. This dissolves the field's worst constraint: you no longer must decide value at generation time — you can accumulate speculative links and let evidence/curation arrive asynchronously. 
No serendipity, LBD, or creativity system has this. (3) IDENTITY-AS-HYPOTHESIS + EVIDENCE-ANCHORING + LEAN-CERTIFICATION as the verification pipeline that the entire field is MISSING (it is the unsolved 'value' problem). The white space is not generating relationships — it is the principled architecture for HOLDING millions speculatively and VERIFYING the rare valuable ones with byte-level evidence and machine-checkable proof. That triage/verification layer over a many-lens generator is unexplored.

Hard problems: