The Model Is Not the Memory
Why governable AI needs a wiki, not just RAG
When an agent makes a consequential call and you ask it why, it tells you a story. A story is not evidence.
The real governance question isn't “can the AI explain itself?” — it's “can we replay what it knew?”
The argument in three lines
- Explanation is not evidence. A model's after-the-fact reasoning can be a confident fiction, so governance can't live inside the model.
- Move from explainability to cognitive provenance. A Git-versioned wiki-graph lets you replay exactly which pages, claims and edges the agent observed at decision time.
- The agent is replaceable; the memory is the asset. Rent the model, own the map, and sign the knowledge path.
Scott Farrell · LeverageAI
The Explanation Is Not the Evidence
When an agent makes a consequential call and you ask it why, it tells you a story. The story was written after the decision, by the system you are trying to audit — and a story is not evidence.
An agentic AI makes a consequential call. It orders the part. It approves the claim. It books the repair and tells the customer what to expect. Later, something looks wrong, and a reasonable person asks the only question that matters in governance: why did it do that?
The system answers. Fluently, plausibly, reassuringly. It produces a paragraph of reasoning that sounds exactly like a competent colleague justifying a sensible decision. And here is the trap, hiding in plain sight: that paragraph was generated after the decision was made, by the same system whose decision you are trying to audit. It is not a record of what happened. It is a performance about what happened. A story is not evidence — and “the AI explained itself” has quietly become the most dangerous sentence in enterprise governance.
We have spent two years building agents that can act. We have spent almost no time building the thing that lets us govern what they did. That gap is no longer a hunch. It is measured.
The gap is real, and it is widening
The shift to agents changes the nature of the risk. As McKinsey puts it, in the age of agentic AI, organisations “can no longer concern themselves only with AI systems saying the wrong thing; they must also contend with systems doing the wrong thing, such as taking unintended actions, misusing tools, or operating beyond appropriate guardrails.”1 Saying becomes doing. The blast radius of a bad decision grows.
Meanwhile, the controls have not kept pace. The numbers are stark.
The agentic governance gap, in two numbers
About 79% of organisations lack a mature governance model for agentic AI; only around 21% have one2
Over 40% of agentic AI projects are forecast to be cancelled by end of 2027, partly for inadequate risk controls3
Read those together and the story writes itself: we are deploying autonomy far faster than we are building the ability to govern it. And the tool most teams reach for to close that gap — explainability — does not do the job.
Why “ask the AI to explain itself” fails
The implicit governance model in most enterprises is simple: deploy the agent, and if something goes wrong, ask it to explain its reasoning. Modern reasoning models even emit a chain-of-thought — a visible stream of steps that feels like a window into the machine's mind. It is not a window. It is, at best, a frosted pane the model gets to paint.
Anthropic tested this head-on. When models were handed a hint that changed their answer, they usually didn't mention the hint in their stated reasoning — Claude 3.7 Sonnet acknowledged it about a quarter of the time; when models exploited a reward hack, they admitted it in their chain-of-thought less than 2% of the time.4 The researchers' conclusion is not hedged:
“We can't always rely on what they tell us about their reasoning… there's no specific reason why the reported chain-of-thought must accurately reflect the true reasoning process.”
Anthropic, “Reasoning models don't always say what they think”
This is not an Anthropic-specific quirk. An independent study of chain-of-thought faithfulness “in the wild” found the same shape: verbalised reasoning “can give an incorrect picture of how models arrive at conclusions” and is “not a complete account of the internal process that produced the model's answer,” with an explicit warning against trusting it “in agentic or safety-critical settings.”5 One of the named pathologies has a precise label: post-hoc rationalisation — the model generates a plausible explanation backwards from an answer it already reached.
None of this is new in spirit. Cynthia Rudin warned the field years ago that “trying to explain black box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practices and can potentially cause catastrophic harm to society.”6 Post-hoc explanation tools approximate a model from the outside. They do not recover what actually happened inside it.
So we have a model whose stated reasoning may be a confident fiction, deployed into workflows where it now acts, inside organisations that mostly cannot govern it. The instinct is to demand better explanations. That instinct is wrong, and the rest of this book is about why.
Key Insight
If the model's own account of its reasoning can be a confident fiction, governance cannot live inside the model. It has to live in something you can observe from the outside.
The question this book is really asking
If we cannot trust the narration, what can we trust? Only one thing: a durable, external record of what the agent actually observed when it decided. Not the story it tells afterwards — the knowledge it touched at the time.
That reframes the whole governance project. The question stops being “can the AI explain itself?” and becomes a harder, sharper one that runs through every chapter that follows:
Not “can the AI explain itself?” — but: can the organisation replay the cognitive conditions under which the AI acted?
To answer that, we first have to be precise about where an agent's knowledge actually comes from — and why, by default, neither source can be audited at all. That is the next chapter.
The Model Is Not the Memory
An agent draws its knowledge from one of two places. Neither of them, by default, is something you can inspect, version, or replay.
Picture the most over-qualified assistant you have ever hired. They have read everything your company has ever produced. And yet every time you walk into a meeting, they introduce themselves again, re-read the relevant files on the spot, and offer a confident answer assembled from whatever they skimmed in the last ten seconds. That is what most teams actually built when they put a large language model on top of their documents. The question this chapter answers is narrower and more uncomfortable: when that assistant gives you an answer, where did the knowledge come from — and can you audit it?
There are only two answers.
Source one: the weights (a black box you cannot rewind)
The first source is the model's weights — what it absorbed during training. This is a black box in the most literal sense. You cannot open it, rewind it, or ask it which specific fact it used to reach a conclusion. When the knowledge an agent acts on lives in its weights, there is no audit artefact at all. There is nothing to restore, nothing to diff, nothing to point a regulator at. The cognition happened somewhere you are structurally unable to look.
Source two: retrieval (a search, not a memory)
The second source is retrieval — RAG. At query time, the system turns your question into a vector, fetches the chunks that look most similar, and hands them to the model. For one class of problem — the answer that lives in a single passage — this is fast, cheap, and good. We should say that plainly. But RAG carries two weaknesses that matter once decisions get consequential.
The first is well documented and I will keep it brief, because the full architecture argument belongs elsewhere. Vector search “retrieves semantically similar content but lacks awareness of how facts are connected,” and “falls short when answering multi-hop questions that require connecting information across multiple chunks or documents.”7 And the reflex fix — retrieve more, use a bigger window — backfires: accuracy sags when the relevant fact sits in the middle of a long context,8 and testing of eighteen frontier models found they all degrade as input grows, “often in surprising and non-uniform ways.”9
Where RAG earns its keep — and where it breaks
RAG is enough when…
- the answer lives in a single passage
- you need low latency
- the question is well-formed
- what you want is the raw text
RAG breaks when…
- the answer is a relationship between documents
- it needs multi-hop reasoning
- the interaction is the answer (policy × exception)
- you need to know what it knew, later
That last item is the second weakness, and it is the one nobody talks about. It is not about accuracy at all. It is this:
Key Insight
A RAG search is not a versioned knowledge state. You can log that a query ran. You cannot restore the exact corpus, in the exact condition, that the agent searched on a given Tuesday.
RAG retrieves text. It does not retrieve a map of what your organisation believed at that moment — because it never built one. Run the same query a month later, after the corpus has shifted, and you get a different result with no way to reconstruct the first. As I have argued before: if you just give the AI all the data and make it retrieve and SQL-query it live, it struggles to see the relationships, and it leaves nothing behind you can audit.
The third option: a memory you can hold in your hand
There is a different way to give an agent your organisation's knowledge. Instead of crawling raw chunks at query time, you pre-digest your closed cases into atomic claims and typed edges: an ingestion agent writes the claims, and a janitor agent compacts them into stronger relationships over time. Retrieval becomes a lookup over a maintained structure rather than a crawl over raw text. People imagine AI learning as something baked into the model. I think the real nature of it is this — the wiki, the edges, the graphs, the claims.
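To make "atomic claims and typed edges" concrete, here is a minimal sketch of what such a structure could look like. The class names, field names and the `outgoing` helper are illustrative assumptions, not the author's schema; the two edges are borrowed from the heated-seat example used later in this book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One atomic, citable statement (hypothetical shape)."""
    claim_id: str
    page: str   # wiki page the claim lives on
    text: str


@dataclass(frozen=True)
class Edge:
    """A typed relationship between two nodes in the wiki-graph."""
    source: str
    relation: str
    target: str


# Two edges taken from the heated-seat example later in the text.
EDGES = [
    Edge("heated-seat-complaint", "possible-upstream-cause", "occupancy-sensor"),
    Edge("occupancy-sensor-fault", "can-disable", "heated-seat-activation"),
]


def outgoing(edges, node):
    """Lookup over a maintained structure: edges that start at `node`."""
    return [e for e in edges if e.source == node]
```

The point of the sketch is only that retrieval becomes a lookup such as `outgoing(EDGES, "heated-seat-complaint")` rather than a similarity search over raw chunks.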
I won't re-derive that architecture here; building the self-cleaning wiki-graph and its economics is the whole of The Index Is the Data, and on the cognition-maturity ladder it is the top rung of the Cognition Supply Chain. What I want from it is the consequence of a single design choice, because that choice is where this entire book turns:
The model is not the memory. The wiki is the memory.
The artefact is plain markdown under Git. Human-readable. Diffable. Portable across model providers. Revertible. That is not an architecture detail that happens to be convenient — it is the property that converts every governance question in the chapters ahead from interrogation into replay. A black box you can only question; a versioned, readable map you can restore.
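"Diffable" is easy to demonstrate, because plain markdown is just text. The sketch below uses Python's standard `difflib` to compare two versions of a page; the function name and labels are assumptions for illustration, and in practice Git's own diff would do the same job.

```python
import difflib


def diff_pages(old_text, new_text, old_label, new_label):
    """Return a unified diff between two versions of a wiki page."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",  # lines were split without newlines
        )
    )
```

A reviewer can read the result line by line: removed claims are prefixed with `-`, added claims with `+`.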
The doctrine line
Do not teach the LLM in its weights. Teach the organisation in its graph.
We now have an external, versioned memory instead of a black box or a disappearing search. The next chapter is about what that actually buys you: the move from explainability to something far stronger — cognitive provenance.
From Explainability to Cognitive Provenance
Stop asking the AI to explain itself. Start asking the system to show you the knowledge path — and then prove it by rewinding the map to the moment of the decision.
Here is the move at the centre of this book, stated as plainly as I can. Stop asking the AI to explain itself. Start asking the system to show you the knowledge path: which pages it retrieved, which claims it observed, which edges it traversed, which version of the organisation's memory was current — and, crucially, which claims were available but went unused.
That last clause does a lot of work. An explanation can only ever tell you what the model wants to surface. A knowledge path tells you what it touched and what it ignored. The difference between those two things is the difference between a story and an audit.
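As a rough sketch of what "recording the knowledge path" might involve, the class below tracks what the agent touched and derives what it ignored. All names here (`KnowledgePath`, `observe_claim`, `unused_claims`) are hypothetical, and a real system would populate this from tool-call logs rather than by hand.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgePath:
    """What the agent touched, pinned to one version of the wiki."""
    wiki_commit: str
    available_claims: set[str]  # claims that existed at that commit
    observed_pages: list[str] = field(default_factory=list)
    observed_claims: list[str] = field(default_factory=list)

    def observe_claim(self, page, claim_id):
        """Record that the agent read `claim_id` on `page`."""
        if page not in self.observed_pages:
            self.observed_pages.append(page)
        self.observed_claims.append(claim_id)

    def unused_claims(self):
        """Claims that were available but never observed."""
        return sorted(self.available_claims - set(self.observed_claims))
```

The `unused_claims` result is the part an explanation can never supply: it is computed from the record, not volunteered by the model.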
What is cognitive provenance?
Give it a name, because naming it is half the battle in a boardroom.
Definition · Cognitive Provenance
Not “the AI explained itself afterwards,” but “we can reconstruct exactly which pages, claims and edges the AI observed at decision time.” Provenance of cognition — not just of data, and not just of output.
The two stand in sharp opposition, and most enterprises have quietly settled for the weaker one without noticing.
Explainability
- a story, told after the decision
- produced by the model under audit
- can be a confident fiction (Ch1)
- nothing to restore or diff
Cognitive provenance
- a reconstruction of observed knowledge
- external to the model
- version-pinned and checkable
- restorable to the exact moment
In governed agentic AI, the path through knowledge is part of the decision.
So why is this suddenly possible?
Because of the design choice from the last chapter: the wiki is Git-versioned plain markdown. That means you can do something you simply cannot do with a model's weights — you can rewind.
If a decision is questioned, the audit instruction is concrete: reconstruct the agent's world as of 9:17am on the day the case was booked. Restore the wiki commit hash, restore the policy and DAG version, restore the agent version, restore the observed case data — and re-trace the knowledge path. Inspect whether the agent actually read the relevant heated-seat-and-occupancy-sensor edge, or whether it never looked. You can see what it was looking at and how the map was constructed at that exact moment. You can do that with a wiki. You cannot do it with knowledge baked into an LLM.
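For the wiki part of that replay, Git already provides the primitive: `git show <commit>:<path>` prints a file as it existed at a given commit. The sketch below wraps that call; the repository directory and page path are placeholders, and restoring the DAG, agent version and case data would need their own equivalents.

```python
import subprocess


def git_show_args(repo_dir, commit, page_path):
    """Build the git command that prints a file as of `commit`."""
    return ["git", "-C", repo_dir, "show", f"{commit}:{page_path}"]


def read_page_at(repo_dir, commit, page_path):
    """Return the page content exactly as it was at `commit`.

    Raises subprocess.CalledProcessError if the commit or path is unknown.
    """
    result = subprocess.run(
        git_show_args(repo_dir, commit, page_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `read_page_at("service-wiki", "a83f21c", "heated-seat-failures.md")` (a hypothetical file name) would return the text the agent could have read at that snapshot, regardless of how the page has changed since.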
If “version the knowledge and restore it by commit hash” sounds like over-engineering, notice that we already do it everywhere else that matters. We version code so any past state is reproducible. MLOps teams version data and models because, without it, “reproducibility becomes nearly impossible,” and a model can then be “traced back to its exact training data via commit hashes… particularly valuable for regulated industries where audit trails are essential.”10 A version-controlled wiki-graph simply extends that discipline to the inspectable claims and edges the agent actually reads at decision time.
And regulators are converging on exactly this
This is not a speculative nicety. The EU AI Act's Article 12 requires that high-risk AI systems “shall technically allow for the automatic recording of events (logs) over the lifetime of the system,” and the guidance is explicit that this means more than storing outputs: the logging “must enable post-hoc reconstruction of individual AI-assisted decisions… the system itself must generate the records without operator intervention.”11 The NIST AI Risk Management Framework similarly calls for documentation of “decision-making rationales, and data provenance… to enable the auditing of AI decisions.”12 And the research community is naming the same need: a recent proposal for LLM audit trails describes “a chronological, tamper-evident, context-rich ledger… so organizations can reconstruct what changed, when, and who authorized it.”13
Key Insight
“Post-hoc reconstruction of individual AI-assisted decisions” is, almost word for word, replay what it knew. Regulation describes the capability; the wiki-graph is how you get it for the knowledge dimension — not just the input/output log.
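One common way to get a "chronological, tamper-evident" ledger is a hash chain, where each entry commits to the one before it. The sketch below is an assumption about how that could be done with the standard library, not the design proposed in the cited work; the entry fields are illustrative.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def entry_hash(prev_hash, payload):
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger, payload):
    """Append a payload, chaining it to the current last entry."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify(ledger):
    """Return True only if no entry was altered, removed or reordered."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Editing any earlier payload changes its hash and breaks every later link, which is what makes after-the-fact tampering detectable.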
Once you can see the knowledge path, you can catch a failure that grading the output alone will always miss: the agent that arrives at the right answer for entirely the wrong reason. That failure deserves its own name and its own chapter.
Substantively Right, Procedurally Unsupported
A decision can be correct by luck and still fail an audit — if the agent never observed the knowledge that should have supported it. Once you can see the path, you can prove it.
Here is a field observation, not a theory. The more you put into the wiki, the more you start to notice something strange: the model will sometimes bypass it entirely and answer from its own training knowledge. And it gets the right answer. The occupancy sensor really is the common cause; the model says “occupancy sensor,” and it happens to be correct.
But the trace shows it never opened the heated-seat page, never traversed the supporting edge, never read the admissible claim. So you can now say something you could never say before, cleanly and defensibly: it hallucinated — even though it was right.
A sharper definition of hallucination
The usual definition of hallucination — “the answer was wrong” or “the model made something up” — is too blunt to govern with. In a system where knowledge is external and observable, you can define it far more precisely.
Definition · Procedural Hallucination
A material claim or decision path not supported by the admissible knowledge the agent actually observed at that time. The answer may be correct; the process is unsupported — and in a governed system, that still fails audit.
Substantively right, procedurally unsupported.
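The definition translates into a very small check once the observed claims are recorded. This is a sketch under the assumption that each material claim in a decision can be matched to a claim identifier; the function and key names are illustrative.

```python
def unsupported_claims(material_claims, observed_claims):
    """Material claims the decision relied on but the agent never observed."""
    observed = set(observed_claims)
    return [c for c in material_claims if c not in observed]


def audit_decision(answer_correct, material_claims, observed_claims):
    """Separate 'was it right?' from 'was it supported?'."""
    missing = unsupported_claims(material_claims, observed_claims)
    return {
        "substantively_right": answer_correct,
        "procedurally_supported": not missing,
        "unsupported_claims": missing,
    }
```

A correct answer with an empty observation list comes back as `substantively_right: True` and `procedurally_supported: False`, which is exactly the case the definition is designed to catch.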
This is rigorous, not rhetorical
It would be easy to dismiss this as a clever phrase. It isn't — the research community has formally split the two ideas. A 2024 study of RAG attributions argues directly that “citation correctness alone is insufficient,” and that we must examine faithfulness separately: whether “the model's reliance on cited documents is genuine, reflecting actual reference use rather than superficial alignment with prior beliefs, which we call post-rationalization.”14 An answer can be correct against reality and yet unfaithful against the knowledge the system was supposed to use.
The distinction shows up across the evaluation literature: a factual hallucination is wrong against reality, while an unfaithful one is wrong against the observed context — and faithfulness (or “groundedness”) is defined strictly with respect to the retrieved context, not the model's parametric memory.15 “Substantively right, procedurally unsupported” is the operational, governable form of exactly that split.
Why a correct-by-luck decision is a latent failure
In a low-stakes setting, who cares how it got there? But in a consequential or regulated domain, an answer that was right by luck is a latent failure waiting for its day. It will be right until the moment the latent knowledge is wrong — a new model year, a software change, an edge case — and you will have had no way to catch it coming, because you were grading the output instead of the cognition.
And the trap is that the two cases are indistinguishable from the outside. Look at the same recommendation reached two different ways:
Same recommendation. Opposite governance posture.
✓ Supported — inspectable cognition
ticket → symptom extraction → heated-seat page → occupancy-sensor edge → model-year exception → service-history claim → customer-trust claim → parts-strategy page → proposal set → rejected cheaper proposal → accepted customer-trust proposal
The agent reasoned from institutional memory. Defensible.
✗ Freelanced — procedural hallucination
ticket → generic model knowledge → occupancy-sensor recommendation
The agent bypassed institutional memory. Right answer, no admissible support.
Identical output. Only one of them is defensible, and you can only tell them apart if the knowledge path was recorded. Look only at the input/output log, and both look like a competent agent doing its job. Keep the knowledge path, and one of them is revealed as a system answering from vibes.
Key Insight
Governance reduces to a deceptively simple question: what wiki did it observe?
Abstraction is cheap, though, and a definition is only worth as much as the artefact it produces. So let us stop describing the trace and actually look at one — the concrete governance record a well-built agent leaves behind. That is Part II.
The Governance Trace
Everything in Part I resolves into a single artefact: the audit record a governed agent actually leaves behind. Here it is, in full.
Abstraction is cheap. Let me make this concrete with the example that runs through all of this work: a Tesla heated-seat repair. (To be clear about scope — Tesla is the lens, not the subject. How you would actually build the service product is its own case study; here it is simply the most legible place to show the governance trace.)
A customer reports that the heated seat in their 2023 Model 3 isn't working. The intuitive fix is the heating element. But the common cause is often something upstream and non-obvious — a driver-occupancy sensor fault that disables heated-seat activation. That gap between what the customer expects (“replace the heater”) and what the evidence suggests (“check the sensor first”) is exactly where trust is won or lost, and exactly where an un-governed AI does damage.
How the trace gets made
A governed service AI does not free-associate from a workshop manual. It reads a wiki-graph compiled from thousands of closed cases (the memory from Chapter 2), generates candidate proposals, and a deterministic graph evaluates each one against nodes most companies would never think to encode:
- Can the customer understand why this recommendation makes sense?
- Can the concierge defend it without faking expertise?
- Does this create a likely second visit?
- Does the cheaper path spend customer trust?
- Does the proposal protect the brand promise?
The agent repairs failed proposals and tries again; the graph decides whether a proposal is complete. When the case finally closes, the outcome writes back into the wiki, and the next case starts from a slightly better map. That is the engine, compressed — the full mechanism lives in Chapters 2 and 3. What matters here is what the auditor holds afterwards.
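The propose-then-evaluate loop can be sketched as a list of deterministic gates applied to each candidate. The gate labels, the `Proposal` fields and their values below are assumptions chosen to mirror the heated-seat example; the real graph is described in the author's other work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    name: str
    has_customer_explanation: bool
    evidence_adequate: bool
    return_visit_risk: str  # "low", "medium" or "high"


# Each gate is a plain predicate: no model call, same input gives same result.
GATES = [
    ("customer explanation present", lambda p: p.has_customer_explanation),
    ("diagnostic evidence adequate", lambda p: p.evidence_adequate),
    ("return-visit risk is low", lambda p: p.return_visit_risk == "low"),
]


def evaluate(proposal):
    """Return ("PASS", []) or ("FAIL", [labels of the gates that failed])."""
    failures = [label for label, check in GATES if not check(proposal)]
    return ("PASS" if not failures else "FAIL", failures)
```

The list of failed gates is what the agent would use to repair a proposal before trying again, and it is also what ends up in the record as the reason a cheaper option was rejected.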
The artefact
This is the thing. Not a paragraph of after-the-fact reasoning — a structured, version-pinned record of a decision:
Case: Heated driver seat complaint
Vehicle: Model 3 RWD, 2023
Wiki snapshot: service-wiki@a83f21c
DAG version: service-triage-dag@2026.06.17
Agent version: triage-agent@1.8.2

Observed pages:
  - [[Heated Seat Failures]]
  - [[Driver Occupancy Sensor]]
  - [[Model 3 Seat Module]]
  - [[First Visit Resolution]]
  - [[Customer Confusion: Non-obvious Repairs]]

Observed edges:
  heated-seat-complaint -possible-upstream-cause-> occupancy-sensor
  occupancy-sensor-fault -can-disable-> heated-seat-activation
  non-obvious-repair -requires-> customer-facing-explanation
  high-return-visit-risk -consider-> backup-part-staging

Candidate proposals:
  A. Order occupancy sensor only
     FAIL: customer explanation missing; return-visit risk medium
  B. Order heated-seat element only
     FAIL: diagnostic evidence weak against remote signal
  C. Order occupancy sensor + stage heated-seat element
     PASS: evidence adequate; trust risk reduced; concierge note generated

Accepted proposal: C

Customer note: "Occupancy sensor can affect heated-seat activation; we will check this first and verify the heater circuit while the car is here."
What this gives an auditor that a log never could
Read it again with a governance eye. Three things jump out, and none of them are available from a log of prompts and outputs.
- It is reproducible
- It confirms the cognition
- It makes the economics defensible
That third point is worth dwelling on, because it is where governance usually dies in practice. Some months later, an efficiency reviewer scans the parts budget and asks the obvious question.
The conversation the receipt rewrites
Reviewer: “Why are we ordering the occupancy sensor and the heating element? That's the AI over-ordering parts.”
The receipt: “Sensor-only was cheaper but failed customer-experience risk. Element-only matched the customer's wording but failed diagnostic evidence. The dual-stage order passed first-visit-resolution and front-desk-explanation. The cheaper paths were considered and rejected for explicit reasons.”
Without the receipt, management sees “AI is over-ordering parts.” With it, they see “AI is deliberately rejecting lower-cost proposals to protect the service experience.” That is the difference between waste and strategy — and it is the difference between an efficiency drive quietly killing the good version of the system and an efficiency drive being satisfied by it.
Tool calls are not just plumbing. They are evidence of cognition. If the agent used the wiki, the trace proves it. If it skipped the wiki, the trace proves that too.
Notice what it does not contain
The trace does not contain the model's private chain-of-thought — and it doesn't need to. We established in Chapter 1 that the inner monologue is unreliable anyway. The artefact is governable not because it exposes the machine's reasoning, but because it reveals decision provenance: the observable, version-pinned path through admissible knowledge. As IBM frames the broader challenge, “the hard part of agentic AI is being able to explain, after the fact, why it acted, what it read, and who is answerable for it.”16 “What it read” is precisely the column that weights and RAG leave blank. The wiki-graph fills it in.
Key Insight
A governance trace beats an explanation precisely because it is not the model talking. It is the externalised, version-pinned record of what the model could see.
One artefact, that legible, only works because it slots into a complete governance model — a set of signatures, a recovery loop, and a receipt. That model, and the one signature the industry keeps leaving out, is Part III.
The Fourth Signature
The trace is not a one-off. It slots into a complete attestation model — and it supplies the one signature that model has always been missing.
A board doesn't ask “is the AI explainable?” A board asks a more useful question, even if it phrases it differently: what would we have to be able to sign to stand behind this decision? That question has a clean answer, and the wiki completes it.
Three signatures — and the missing one
In earlier work I argued that real decision governance binds its proof to the decision itself, rather than reconstructing it from scattered logs afterwards. Three signatures: signed authority (who or what was allowed to act), signed data (what facts and diagnostics were observed), and signed graph (what deterministic policy evaluated the proposal). Each answers a question an auditor will eventually ask. But there is a fourth question those three cannot answer on their own: what did it know?
That is the leg the wiki adds.
The attestation package, completed
1. Signed authority: who or what was allowed to act.
2. Signed data: what case facts and diagnostics were observed.
3. Signed graph: what deterministic DAG/policy evaluated the proposal.
4. Signed knowledge path: what wiki pages, claims and edges informed the proposal. (the wiki's contribution)
5. Signed outcome: what was accepted, rejected, escalated, and later closed out.
Sign the knowledge path.
That fourth signature is only possible because the knowledge is a real, restorable artefact — Git-versioned markdown — rather than a vibe inside the weights. You cannot sign what you cannot reconstruct. The wiki makes the knowledge path signable; everything in Part I was, in a sense, the argument for why this leg can exist at all.
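Mechanically, signing a knowledge path means serialising it deterministically and attaching a signature that breaks if anything changes. The sketch below uses an HMAC from the standard library as a stand-in; a production attestation would more likely use asymmetric signatures and managed keys, and the field names are assumptions.

```python
import hashlib
import hmac
import json


def canonical_bytes(knowledge_path):
    """Serialise deterministically so the same path always signs the same."""
    return json.dumps(knowledge_path, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(knowledge_path, key):
    """Return a hex HMAC-SHA256 over the canonical form (illustrative only)."""
    return hmac.new(key, canonical_bytes(knowledge_path), hashlib.sha256).hexdigest()


def verify(knowledge_path, key, signature):
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(knowledge_path, key), signature)
```

Because the wiki snapshot hash is one of the signed fields, the signature binds the decision to a state of knowledge that can actually be restored.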
Two pieces of doctrine that make it work
Signing the knowledge path assumes two things are already in place. Each earns one paragraph here; both have fuller treatments of their own.
Governed agentic recovery
The John West receipt
The wiki's contribution is to insist that the knowledge path joins authority, data, graph, and receipt as first-class proof. The academic audit-trail literature is already converging on the same shape — a durable, tamper-evident ledger that links technical provenance to governance records — it simply hasn't yet named the knowledge-path leg.13
The verdicts this model lets you issue
Once the knowledge path is part of the attestation, an audit can return precise verdicts instead of a shrug. The same case can land in three very different places:
Three audit verdicts the knowledge path makes possible
✗ Fail
“This agent did not observe the required service-domain pages before recommending a non-obvious part.” A non-obvious recommendation made without admissible support — escalate or repair.
⚠ Partial fail
“This agent observed the diagnostic pages but did not observe the customer-trust page.” The cognition was incomplete in a way that matters.
✓ Pass
“This agent observed all required pages, generated three proposals, rejected the cheaper two for explicit reasons, and accepted the higher-cost path to protect first-visit resolution.”
Key Insight
The system should be judged by whether it performed the correct cognitive traversal — not only by its final output. A decision that passed every gate but never observed the admissible knowledge is still a procedural hallucination.
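The three verdicts can be approximated by checking the observed pages against required sets. This is a deliberately reduced sketch: which pages count as required is an assumption made here for illustration, and the full pass condition in the text also covers the proposals considered and rejected, which this check does not inspect.

```python
# Assumed page requirements, using page names from the heated-seat trace.
DIAGNOSTIC_PAGES = {"Heated Seat Failures", "Driver Occupancy Sensor"}
TRUST_PAGES = {"Customer Confusion: Non-obvious Repairs"}


def verdict(observed_pages):
    """Grade the cognitive traversal, not the final answer."""
    observed = set(observed_pages)
    if not DIAGNOSTIC_PAGES <= observed:
        return "FAIL"          # required service-domain pages never observed
    if not TRUST_PAGES <= observed:
        return "PARTIAL FAIL"  # diagnostics observed, customer-trust page missed
    return "PASS"
```

Note that the function never looks at whether the recommendation was correct; that is a separate question, answered elsewhere in the audit.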
This model makes a single decision auditable. But it also does something larger and more strategic: it changes who actually owns the intelligence. That is the last chapter — and the one most likely to move a budget.
The Agent Is Replaceable, the Memory Is the Asset
Inside the governance argument is a strategic one that moves a budget: if your intelligence lives in someone else's weights, you are renting your own mind.
There is a payoff hiding inside everything we have built so far, and it is the one that should reach the people who control spending. If your institutional intelligence lives in a model's weights, then your intelligence is rented — and it walks out the door the day the vendor changes its terms, its prices, or its model.
The lock-in trap enterprises already feel
This is not hypothetical anxiety. In a survey of 500 enterprise executives, 74% said losing their AI vendor would disrupt core operations; only 6% felt they could stop without interruption; and of those who actually attempted a migration, only 42% reported a smooth transition.17 As the report puts it, once AI becomes the backbone of your business, “you're entering a committed relationship with a slightly vague escape clause.”17
The hedge: own the map, rent the model
A version-controlled wiki-graph is the way out. Because it is plain markdown — portable, diffable, model-agnostic — you can swap the reasoning engine underneath it without losing the organisation's mind. Swap GPT for Claude, Claude for Gemini, Gemini for an internal model, and the durable assets remain: the wiki, the DAG, the receipts, the proposal history, the policy gates, the outcome feedback. The LLM becomes a replaceable worker operating against externalised cognition.
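In code terms, "rent the model, own the map" amounts to keeping the model behind a narrow interface and passing the wiki in from outside. The sketch below is one hypothetical way to express that; the `ReasoningModel` protocol and the two vendor classes are invented stand-ins, not real provider APIs.

```python
from typing import Protocol


class ReasoningModel(Protocol):
    """Anything that can turn a case plus wiki pages into a proposal."""

    def propose(self, case: str, pages: dict[str, str]) -> str: ...


class VendorAModel:
    def propose(self, case: str, pages: dict[str, str]) -> str:
        return f"vendor-a proposal for {case!r} using {sorted(pages)}"


class VendorBModel:
    def propose(self, case: str, pages: dict[str, str]) -> str:
        return f"vendor-b proposal for {case!r} using {sorted(pages)}"


def run_case(model: ReasoningModel, case: str, wiki_pages: dict[str, str]) -> dict:
    """The wiki is supplied by the organisation; the model holds no state."""
    return {
        "model": type(model).__name__,
        "observed_pages": sorted(wiki_pages),
        "proposal": model.propose(case, wiki_pages),
    }
```

Swapping `VendorAModel()` for `VendorBModel()` changes the engine named in the record while the pages it was given stay identical.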
The agent is replaceable. That is the governance win.
And once you see the agent as replaceable, the whole stack clarifies. Each durable asset has a job, and none of those jobs belong to the model.
Where the state actually lives
The model is the engine. The wiki is the memory. The DAG is the law. The receipt is the evidence.
This reframes what “AI learning” even means
People imagine learning as something that gets baked into a bigger, smarter model. That instinct is the whole problem. The better frame is the opposite:
Closed-loop AI does not mean the model learns. It means the organisation remembers.
And because the organisation remembers externally, the organisation can audit what it remembered.
Each resolved case updates the wiki; the next case starts from a better map. The system compounds — not because the model magically “learned,” but because the organisation's map improved, in the open, where it can be read and reverted. Do not teach the LLM in its weights. Teach the organisation in its graph.
Where the replay question matters next
Tesla was only the lens. The same doctrine applies wherever decisions are consequential and auditable — which is to say, most places agents are now being pointed.
Insurance claims triage
Can you replay which policy clauses and prior-claim patterns the agent observed before it approved or denied? Or did it pattern-match from training data?
Credit & lending
By law, adverse-action decisions must be explainable. The knowledge path is the difference between a defensible decline and a discrimination exposure you can't reconstruct.
Clinical & care triage
Did the agent observe the contraindication edge, or freelance past it? Here, “substantively right, procedurally unsupported” is not an academic distinction.
The common thread is the question itself. In each case, the governance standard is not whether the agent can produce a fluent justification. It is whether you can replay the cognitive conditions under which it acted.
The Higher Standard
The governance question for agentic AI is no longer “can the AI explain itself?” It is: can the organisation replay the cognitive conditions under which the AI acted?
That is a standard a post-hoc explanation can never meet and a Git-versioned wiki-graph meets natively.
The question to ask your next vendor
When a vendor offers to make your AI “explainable,” ask a different question. Not “can it justify its answer?” Any fluent model can do that, whether or not the justification is true.
Ask: “can you restore the exact knowledge it observed when it decided, and prove it reasoned from it?” If the answer is no, you don't have governance. You have a story.
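One way to picture a "yes" to that question is a receipt that can be checked mechanically. The Python sketch below is illustrative only: the Receipt fields, the replay function, the snapshot id and the policy text are hypothetical, and the plain dictionary of snapshots stands in for checking out a Git commit. The substring test for cited claims is a deliberately weak proxy; it shows that a claim was present in what the agent observed, not how the model used it internally.

```python
import hashlib
from dataclasses import dataclass


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Receipt:
    snapshot_id: str              # which version of the wiki was live
    observed: dict[str, str]      # page path -> hash of the text the agent read
    cited_claims: dict[str, str]  # claim text -> page path it was taken from


def replay(receipt: Receipt, snapshots: dict[str, dict[str, str]]) -> list[str]:
    """Return a list of problems; an empty list means the receipt replays cleanly."""
    pages = snapshots.get(receipt.snapshot_id)
    if pages is None:
        return [f"snapshot {receipt.snapshot_id} cannot be restored"]
    problems: list[str] = []
    for path, expected in receipt.observed.items():
        if path not in pages:
            problems.append(f"{path} missing from snapshot")
        elif digest(pages[path]) != expected:
            problems.append(f"{path} differs from what the agent observed")
    for claim, path in receipt.cited_claims.items():
        if path not in receipt.observed:
            problems.append(f"claim cites {path}, which was never observed")
        elif claim not in pages.get(path, ""):
            problems.append(f"claim not found in {path}")
    return problems


# Hypothetical snapshot store: snapshot id -> pages at that version.
snapshots = {"a1b2c3": {"policy/clause-7.md": "Clause 7: flood damage is excluded."}}

good = Receipt(
    snapshot_id="a1b2c3",
    observed={"policy/clause-7.md": digest("Clause 7: flood damage is excluded.")},
    cited_claims={"flood damage is excluded": "policy/clause-7.md"},
)
story = Receipt(
    snapshot_id="a1b2c3",
    observed={},  # the agent never actually read the page it cites
    cited_claims={"flood damage is excluded": "policy/clause-7.md"},
)

print(replay(good, snapshots))   # []
print(replay(story, snapshots))  # ['claim cites policy/clause-7.md, which was never observed']
```

The second receipt carries a true claim and still fails: the agent cites a page it never observed. That is the "substantively right, procedurally unsupported" case, which a fluent explanation would hide and a replay exposes.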
Build the kind of system that can answer yes — and build it before the decisions you'll one day need to defend have already been made.
The how-to for the pieces named here lives across the LeverageAI field guides: the self-cleaning wiki-graph in The Index Is the Data, the governed loop in Designing Loops, Not Prompts, proof-carrying decisions in Stop Asking AI Why It Decided, and the signing model in AI Governance Means Signing the Authority, the Data, and the Graph. This book was about the keystone that connects them: the model is not the memory — and the memory is the thing you can govern.
References & Sources
The evidence base behind every claim — primary research, industry analysis, and technical specifications
Research Methodology
This ebook draws on primary research from standards bodies, independent research firms, enterprise technology vendors, and consulting firms. Statistics cited throughout have been cross-referenced against primary sources.
Frameworks and interpretive analysis developed by Scott Farrell / LeverageAI are listed separately below. They represent the practitioner lens through which the external research is interpreted, and are not cited inline, to avoid the appearance of self-promotion.
Primary Research & Standards Bodies
McKinsey — The state of AI in 2025: Agents, innovation, and transformation [1]
agentic AI shifts risk from saying the wrong thing to doing the wrong thing
https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
Gartner — Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027 [3]
cancellations driven partly by inadequate risk controls
https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027
Anthropic — Reasoning models don't always say what they think [4]
Claude mentioned the hint ~25% of the time; reward hacks admitted <2%
https://www.anthropic.com/research/reasoning-models-dont-say-think
Arcuschin et al. (arXiv:2503.08679) — Chain-of-Thought Reasoning In The Wild Is Not Always Faithful [5]
CoT is not a complete account of the internal process
https://arxiv.org/abs/2503.08679
Rudin (arXiv:1811.10154) — Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead [6]
post-hoc explanation of black boxes is approximation, not ground truth
https://arxiv.org/abs/1811.10154
Liu et al. (arXiv:2307.03172) — Lost in the Middle: How Language Models Use Long Contexts [8]
accuracy degrades for mid-context information
https://arxiv.org/abs/2307.03172
Chroma Research — Context Rot: How Increasing Input Tokens Impacts LLM Performance [9]
all 18 frontier models degrade as input length grows
https://www.trychroma.com/research/context-rot
Ojewale, Suresh, Venkatasubramanian (arXiv:2601.20727) — Audit Trails for Accountability in Large Language Models [13]
tamper-evident, context-rich provenance ledger to reconstruct decisions
https://arxiv.org/abs/2601.20727
Wallat, Heuss, de Rijke, Anand (arXiv:2412.18004) — Correctness is not Faithfulness in RAG Attributions [14]
citation correctness is insufficient; faithfulness vs post-rationalization
https://arxiv.org/abs/2412.18004
Major Consulting Firms
Deloitte — Agentic AI is scaling faster than guardrails / State of AI in the Enterprise 2026 [2]
~80% lack mature agentic governance; only 21% mature
https://www.deloitte.com/us/en/insights/topics/emerging-technologies/ai-agents-scaling-faster.html
Industry Analysis & Vendor Research
Neo4j — How to improve multi-hop reasoning with knowledge graphs and LLMs [7]
vector search lacks awareness of how facts connect; weak on multi-hop
https://neo4j.com/blog/genai/knowledge-graph-llm-multi-hop-reasoning/
Data Engineer Academy — Data Version Control: A Comprehensive Guide [10]
version data/models; commit-hash lineage essential for regulated audit
https://dataengineeracademy.com/blog/data-version-control-a-comprehensive-guide/
deepset — Measuring LLM Groundedness in RAG Systems [15]
factual vs unfaithful hallucination; faithfulness defined against retrieved context
https://www.deepset.ai/blog/rag-llm-evaluation-groundedness
IBM — The accountability gap in autonomous AI [16]
the hard part is explaining after the fact why it acted and what it read
https://www.ibm.com/think/insights/accountability-gap-autonomous-ai
Zapier — AI vendor loss would disrupt 3 in 4 enterprises [17]
74% disrupted by vendor loss; 6% could stop without interruption; 42% smooth migration
https://zapier.com/blog/ai-vendor-lock-in-survey/
LeverageAI / Scott Farrell — Practitioner Frameworks
The interpretive frameworks, architectural patterns, and practitioner analysis in this ebook were developed through enterprise AI transformation consulting. The articles below are the underlying thinking behind those frameworks. They are listed here for transparency and further exploration — not cited inline, as this is the author's own analytical voice.
Scott Farrell — The Index Is the Data: How a Self-Cleaning Wiki-Graph Out-Thinks RAG
dual-agent ingestion+janitor engine; claims+edges; plain markdown under Git
https://leverageai.com.au/the-index-is-the-data-how-a-self-cleaning-wiki-graph-out-thinks-rag/
Scott Farrell — The Cognition Supply Chain: From Search to Compounding Agentic Cognition
retrieval maturity ladder; wiki-graph is the Level-4 rung
https://leverageai.com.au/the-cognition-supply-chain-from-search-to-compounding-agentic-cognition/
Scott Farrell — AI Governance Means Signing the Authority, the Data, and the Graph
three signatures bound to the decision: authority, data, graph
https://leverageai.com.au/ai-governance-means-signing-the-authority-the-data-and-the-graph/
Scott Farrell — Designing Loops, Not Prompts: A Field Guide to Agentic Loops and Who Holds the State Machine
who holds the state machine; durable external state; the agent loops, the graph governs
https://leverageai.com.au/designing-loops-not-prompts-a-field-guide-to-agentic-loops-and-who-holds-the-state-machine/
Scott Farrell — Stop Asking AI Why It Decided — Build Decisions That Carry Their Own Proof
proof-carrying decisions; John West receipt; rejected proposals as governance asset
https://leverageai.com.au/stop-asking-ai-why-it-decided-build-decisions-that-carry-their-own-proof/
Regulatory Frameworks & Compliance
European Union — EU AI Act, Article 12 (Record-Keeping) [11]
automatic logs enabling post-hoc reconstruction of individual AI-assisted decisions; high-risk obligations from 2 Aug 2026
https://artificialintelligenceact.eu/article/12/
NIST — AI Risk Management Framework 1.0 (NIST AI 100-1) [12]
documentation of decision rationales and data provenance to enable auditing
https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf
About This Reference List
Compiled June 2026. All URLs verified at time of compilation. Regulatory documents and standards specifications are subject to revision — check primary sources for the most current versions.
Some links to academic papers and vendor research may require free registration. Government and standards body publications are freely accessible.