The Wiki Is the Kernel
Boot Profiles and Demand Paging for a Worldview That Stopped Being a Document
I wrote the kernel as a monolithic document — then couldn’t answer my own question: if the kernel is the kernel, why do I have three of them?
The three were never a design. They were a workaround for a document that couldn’t be queried. The wiki is the kernel after the refactor — already decomposed, entered from anywhere, recomposed on the fly. And every AI failure you can name is the same monolith, seen from a different corner.
The argument in three lines
- The kernel was a monolith. Named kernels were a clone-per-task workaround for a document with no query interface. A wiki demotes them to boot profiles and demand-pages the rest.
- Every copilot is myopic. Vendor AI sees only its own fragment; four locks keep it there. The compiled worldview can only live on your side of the boundary.
- The substrate underneath. The wiki is the missing memory tier, the territory your prompt was a photograph of, and the compounding layer your perishable tuning kept evaporating off.
Scott Farrell · LeverageAI
Why Do I Have Three Kernels?
I built a kernel for my AI. Then a taste kernel. Then a brand kernel. And then I stalled on a question I couldn’t answer: if the kernel is the kernel, why do I have three of them?
TL;DR
- Multiple named “kernels” (taste, brand, a frameworks file) were never a design. They were a workaround for a document that couldn’t be queried.
- A flat kernel forces write-time resolution: one compilation of your worldview for every future task. So when a new task needs different emphasis, you clone the document and specialise the copy, and the copies drift.
- This chapter openly supersedes my own earlier kernel doctrine. The fix isn’t a better document. It’s a different substrate: the wiki is the kernel, already decomposed, and you recompose the parts you need on the fly.
Let me start with an admission, because it’s the honest way into this whole book. For a long time I wrote about giving an AI a kernel — the small, always-resident document that carries who you are, how you judge, what you value, so that everything the AI produces inherits that judgement instead of needing to be corrected one output at a time. It worked. So I did the natural thing and made more of them. I read about a taste kernel for aesthetic judgement, so I wrote one. A brand kernel for voice and positioning, so I wrote that too. A frameworks file. A constraints file. A little constellation of kernels, each doing real work.
And then I got stuck on my own question, the one I couldn’t argue my way past: if the kernel is the kernel, why are there several of them? A kernel is supposed to be the single, central thing — the one source of truth an agent boots from. Having three of them is a contradiction sitting in plain sight. I had built the constellation without noticing that its existence was evidence something upstream was wrong.
The reframe
The three kernels were never a design. They were a workaround for a document that couldn’t be queried — and the moment the substrate could be queried, the workaround became pure liability.
This is a correction to my own earlier writing, and I want to make it plainly rather than smuggle it in. I have argued before that the durable asset in AI work is the generation recipe — the kernel — and that you should nuke the output and regenerate it rather than nurse it. I stand by that instinct completely. What I got wrong was the shape of the kernel. I pictured it as a monolithic document — a single, carefully maintained markdown file — and everything that felt awkward afterwards flowed from that one picture being incomplete. The candour here is deliberate: this book is a refactor of a doctrine I published, and I’d rather show the refactor than pretend the first version was already finished.
Watch two kernels drift
The clearest way to see the problem is to watch it fail. Here are two of my own kernel files, or near enough, on the single question of how blunt the writing should be.
One claim, two files, quietly contradicting each other
brand_kernel.md — written in March
“Concede the trade-offs out loud. The register is fit, not superiority. Never oversell; the honesty is the credibility.”
taste_kernel.md — edited in May
“State beliefs flat. Don’t hedge. Name the enemy. Take the position and hold it.”
Both are true; both are me. But an agent that boots the brand kernel writes a careful, conceding paragraph, and an agent that boots the taste kernel writes a blunt, adversarial one. Same author, two personalities, decided entirely by which file happened to load. Edit one in isolation — as you inevitably do, because they’re separate documents — and the gap widens. This is a DRY violation with a nervous system: the same idea living in two places, drifting until the day it ships a contradiction.
Any engineer knows this smell. When the same fact lives in two files, the files will disagree eventually, because maintenance never touches both at the same instant. We spent decades learning to never do this in code — single source of truth, don’t repeat yourself — and here I was cheerfully doing it with the most important document in my whole AI stack, the one that decides what the machine believes about me.
Why the workaround existed in the first place
Here is the part that turns a confession into a design principle. I didn’t make three kernels out of carelessness. I made them because the substrate forced me to. A flat document has exactly one resolution, chosen once, when you write it. It cannot be too detailed for one task and just-right for another; it is whatever you froze it as, for every question you will ever ask, including the ones you haven’t thought of yet. Engineers have a name for baking a decision in before you know the inputs: this is write-time resolution, and it is the original sin of the monolithic kernel.
So what do you do when a document’s single frozen resolution doesn’t fit a new kind of task? You do the only thing the substrate allows: you copy the document and specialise the copy. Aesthetic work needs a version tuned for taste, so you fork off a taste kernel. Marketing work needs voice front-and-centre, so you fork off a brand kernel. Each fork is a locally sensible move and a globally corrosive one, because now you own three frozen compilations of one worldview, maintained by hand, guaranteed to diverge. The multiplicity wasn’t richness. It was the document’s inability to answer “what do you need for this task?” leaking out as duplicated files.
Every extra kernel is a confession that the first one couldn’t be queried. You don’t clone a database per report; you clone a document per task shape only because a document has no query interface.
And that reframes the whole puzzle. “Why do I have three kernels?” has the same answer as “why does this codebase have the same constant copied into four files?” — not because four constants were wanted, but because nobody gave it one home the others could point at. The kernels multiplied to route around a missing capability. Give the worldview a query interface — make it a graph you can ask, rather than a page you must load whole — and the reason to fork evaporates. One task wants a shallow overview; another wants to drill to the canonical source; both are served from the same single source of truth, at the resolution the task asks for, chosen at read time instead of frozen at write time.
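One way to picture read-time resolution is the minimal sketch below. It assumes each claim page stores a one-line summary next to its full body. The `Page` type, the `read` function, the page id, and the page text are all hypothetical illustrations, not an existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    summary: str  # one-line overview of the claim
    body: str     # the canonical, full statement


# One home per claim (hypothetical page id and contents).
WIKI = {
    "stance.bluntness": Page(
        summary="State beliefs flat, but concede real trade-offs.",
        body="State beliefs flat and name the enemy. Concede trade-offs "
             "out loud; the honesty is the credibility.",
    ),
}


def read(page_id: str, resolution: str = "overview") -> str:
    """Return the same claim at the resolution the task asks for."""
    page = WIKI[page_id]
    if resolution == "overview":
        return page.summary
    if resolution == "canonical":
        return page.body
    raise ValueError(f"unknown resolution: {resolution!r}")


# Two tasks, one source of truth, resolution chosen at read time.
shallow = read("stance.bluntness")
deep = read("stance.bluntness", resolution="canonical")
```

Both callers hit the same page, so there is nothing to fork: the choice between overview and canonical detail is an argument to the read, not a second file.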
The substrate changed under the doctrine
None of this was wrong to do when the kernel could only be a flat file. Given a document, cloning-and-specialising is the best available move; I’d make it again under the same constraint. What changed is the constraint. Once your worldview lives as a wiki — a graph of small claim pages joined by edges, the substrate I’ve argued for elsewhere as the way to out-think retrieval — the whole premise of the monolithic kernel dissolves. The wiki is queryable by construction. It has one home for each claim and edges to everything related. Write-time resolution is gone, and with it the only reason the three kernels ever existed.
So the sentence I couldn’t previously finish finishes itself. The wiki is the kernel. It has already been decomposed for you — not into three rival documents, but into hundreds of small pages you can enter from anywhere and recompose on demand for the job in front of you. The taste kernel and the brand kernel don’t need to be separate documents at all; as the next two chapters show, they become views onto one graph. But before we get there, it’s worth being precise about why a graph succeeds exactly where the monolith failed — and the cleanest way to see it is to borrow a forty-year-old idea from operating systems. A kernel that compiles everything in is a statically linked binary. The fix has a name, and the industry has known it since before AI existed.
That’s the next chapter. For now, hold onto the diagnosis, because every later problem in this book — vendor AI that stays useless about your life, the memory tier the industry skipped, the mega-prompt that keeps failing, the tuning that evaporates on every model release — turns out to be this same monolith, seen from a different corner.
Statically Linked vs Loadable
The monolithic kernel is a statically linked binary: every capability compiled into one image, the whole thing loaded on every boot. Operating systems solved this before AI existed — and the fix has a name.
TL;DR
- A flat kernel is statically linked: everything compiled in, loaded whole, and every addition bloats every boot. A wiki is loadable modules: boot a tiny core, mount a filesystem, demand-load pages as the task touches them.
- The piece that makes the difference is the page table, and in a wiki, the edges are the page table. A pile of markdown files without edges is not a kernel; it’s just a folder.
- Decompose, make navigable, recompose. That’s how we already write code; it’s what the wiki does to a worldview; and it’s the underlying agentic pattern.
Reach back to a distinction every systems programmer carries around. You can build a kernel two ways. You can statically link it — compile every driver, every capability, every subsystem into one monolithic image that loads in full at boot. Or you can build a small core that boots fast and loads modules on demand — mount a filesystem, and pull in each page of code only when something actually touches it. Both produce a working system. They have completely different economics as they grow.
The statically linked kernel has one fatal property: every capability you add inflates every boot, forever, whether or not this particular run needs it. Support one more device and the image gets bigger for everyone, including the machine that will never use that device. There is no way to say “load only what this task needs,” because loading is all-or-nothing. That is precisely the monolithic worldview kernel from Chapter 1: everything about you compiled into one document, loaded whole on every single call, so that adding your detailed pricing policy makes every unrelated task carry the pricing policy in its context, taxing attention it will never spend.
Key Insight
The monolith isn’t wrong because it’s big. It’s wrong because it can’t be partial. Loading is all-or-nothing, so every addition is a tax on every unrelated task.
The wiki adds the missing page table
Loadable modules only work because of one unglamorous piece of machinery: the page table. It’s the map that lets the system resolve “I need this” into “fetch that page from over there” without holding everything resident. Demand paging — loading a page precisely when it’s referenced — is impossible without it. Take the page table away and you’re back to loading the whole image, because you have no way to find the part you want.
This is the exact thing a folder of markdown files lacks, and the exact thing a wiki adds. Dumping your worldview into a hundred separate notes does not give you a loadable kernel; it gives you a hundred things you still have to read in full to find anything, which is the monolith with extra steps. What converts a pile of pages into a kernel is the edges — the links that say this claim depends on that one, this policy has these exceptions, this decision was superseded by that page. The edges are the page table. They’re what let an agent resolve “what do I need for this pricing email?” into a short walk that pages in the pricing policy, the exceptions, and this customer’s history — and nothing else.
A pile of notes is not a kernel. The edges are. Without them you can only load everything; with them you can load exactly what the task references — demand paging for your worldview.
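The pricing-email walk can be sketched as a bounded breadth-first traversal. This is an illustration under stated assumptions: the graph, the page ids, and the `page_in` helper are hypothetical, and the edges are stored as a plain dictionary rather than in any particular wiki tool.

```python
from collections import deque

# Hypothetical graph: page id -> ids it links to. The edges play the
# role of the page table.
EDGES = {
    "task.pricing-email": ["pricing.policy", "voice.principles"],
    "pricing.policy": ["pricing.exceptions"],
    "pricing.exceptions": ["partner.history"],
    "partner.history": [],
    "voice.principles": [],
    "hr.leave-rules": ["hr.payroll"],
    "hr.payroll": [],
}


def page_in(entry: str, edges: dict[str, list[str]], max_depth: int = 3) -> list[str]:
    """Breadth-first walk from `entry`, loading only reachable pages."""
    loaded, seen = [entry], {entry}
    queue = deque([(entry, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow edges past the depth budget
        for target in edges.get(page, []):
            if target not in seen:
                seen.add(target)
                loaded.append(target)
                queue.append((target, depth + 1))
    return loaded


resident = page_in("task.pricing-email", EDGES)
```

The HR pages exist in the graph but are never touched, because no edge from the task leads to them. Delete the `EDGES` dictionary and the only remaining option is to read every page, which is the folder-of-notes case.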
I’ve made the memory-management argument for the context window itself before — that building agents today feels like programming a machine with almost no RAM, where you must page data in and out deliberately. This chapter applies the same move one level up: not just paging data into the window, but paging your kernel into the window. The worldview stops being something you load and starts being something you demand-page from.
What legitimately stays resident: the boot ROM
Demand paging doesn’t mean nothing is resident. A real system keeps a tiny core always in memory — the boot ROM — and pages in everything else. Your kernel has an equivalent, and naming it precisely is what stops “move it to the wiki” from becoming “the agent knows nothing until it goes looking.” The always-resident core is small and stable: identity (who this agent is acting as), the North Star (what it’s ultimately for), the schema (the shape of the world it operates in), and toolbelt conventions (how it acts and how it walks the wiki).
Everything else — the actual knowledge, the policies, the histories, the examples — demand-loads. The test for whether something belongs in the resident core is simple: does every task need it to orient itself? Identity and North Star pass. Your third-quarter pricing exception does not; it’s a page, fetched when a pricing task references it. Get this split right and the resident footprint stays tiny no matter how vast your worldview grows, because growth happens in the paged region, not the boot ROM. That’s the property the monolith could never have: capability that scales without inflating the boot.
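A small sketch of the resident/paged split, assuming the context handed to the model is just a dictionary of named text blocks. `BOOT_ROM`, `build_context`, and the page ids are hypothetical names used for illustration.

```python
# Hypothetical always-resident core: the four items named above.
BOOT_ROM = {
    "identity": "Who this agent is acting as.",
    "north-star": "What it is ultimately for.",
    "schema": "The shape of the world it operates in.",
    "toolbelt": "How it acts and how it walks the wiki.",
}


def build_context(wiki: dict[str, str], needed: list[str]) -> dict[str, str]:
    """Resident core first, then only the pages this task referenced."""
    paged = {page_id: wiki[page_id] for page_id in needed}
    return {**BOOT_ROM, **paged}


small_wiki = {"pricing.q3-exception": "Placeholder text for the Q3 exception."}
large_wiki = {
    **small_wiki,
    **{f"page.{i}": "Unrelated placeholder page." for i in range(10_000)},
}

ctx_small = build_context(small_wiki, ["pricing.q3-exception"])
ctx_large = build_context(large_wiki, ["pricing.q3-exception"])
```

The two contexts are identical even though one wiki is ten thousand pages larger, which is the scaling property in miniature: growth lands in the paged region and never in the boot ROM.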
One task, two architectures
Task: “Draft a reply to this partner about the pricing they queried.”
Statically linked kernel
- Loads the entire worldview into context
- Pricing policy, voice, and this relationship arrive, alongside HR rules, unrelated projects, everything else
- Attention spread thin across mostly-irrelevant detail
- Every future addition makes this reply’s context heavier
Loadable modules + page table
- Boots the tiny core (identity, North Star, how to walk)
- Follows edges: pricing policy → exceptions → this partner’s history → voice
- Pages in four relevant pages; ignores the rest of the graph
- The worldview can grow to any size without touching this footprint
Decompose, make navigable, recompose
There’s a reason this OS metaphor feels natural rather than clever, and it’s that we already write everything else this way. Nobody ships a thousand-line function anymore. You decompose it into small units, name them so they’re navigable, and recompose them at the call site. I learned this viscerally with AI coding agents: the smaller and more self-contained the artefact, the better the model reasoned over it — early models would start losing the thread and mangling patches the moment a file got large, because they were being asked to hold too much at once. The move that fixed it was never “bigger context.” It was “smaller pieces, made navigable.”
The same is true of the ebook you’re reading: it exists as a set of small per-chapter files, each written in its own focused context, recomposed by a single index at build time. And it’s true of an agent’s tools, which are best kept as plain, inspectable, single-purpose files rather than one tangled program. Look across all three — code, documents, tools — and the same pattern is doing the work: decompose the complex thing into small units, make the units navigable, recompose on demand for the job at hand.
The pattern under all of it
Decompose, make navigable, recompose is not a filing convention. It’s the agentic pattern — the same reason small functions beat monoliths is the reason a wiki beats a kernel document.
That’s what the wiki is doing to your worldview: taking a genuinely complex thing and making it composable and navigable, so an agent can focus on one part at a time and then pull the relevant parts together. Which lands us on the obvious next question. If the taste kernel and the brand kernel shouldn’t be separate documents, and everything is now pages-plus-edges, then what happens to them? They don’t vanish — taste and brand are real, useful distinctions. They just stop being rival documents and become something lighter and more powerful: boot profiles.
Boot Profiles, Not Rival Documents
So what happens to the taste kernel and the brand kernel? They don’t die. They get demoted — from rival documents to boot profiles: an entry point plus a selection bias, with no duplicated content to drift.
TL;DR
- A named kernel becomes a boot profile: a hub page whose edges point into task-relevant regions. An entry point plus a selection bias, not a copy of the doctrine.
- You can boot in anywhere; the edges guarantee you can still reach everything the task needs. The kernel stopped being a thing you load and became a field you stand in.
- Any flat kernel.md a tool still wants is a build output: regenerated from the graph, disposable, never hand-edited. And “one wiki or many?” becomes a governance question, not an architecture one.
We left Chapter 2 with a real question, not a rhetorical one. Taste and brand are genuine distinctions — aesthetic judgement really is a different lens from market positioning — so if they shouldn’t be separate documents, what are they? The answer is the load-bearing idea of Part I, so let’s be exact about it. A named kernel becomes a boot profile: a single hub page whose edges point into the regions of the graph a given kind of work tends to need. It carries almost no content of its own. What it carries is a starting point and a bias about what to surface first.
Anatomy of a boot profile
Concreteness helps here, so here is what a “brand voice” boot profile actually is. It is not a document restating how I write. It is a hub with about a dozen edges and a one-line bias.
boot-profile: brand-voice (a hub, not a document)
Selection bias: when writing for an audience, page in voice and positioning first; prefer conceding trade-offs; drill to worked examples before asserting.
Edges:
- → voice.principles · → register.fit-not-superiority · → stance.concede-tradeoffs
- → examples.before-after (three) · → pattern.cta-band · → north-star
- → named-concepts.index · → constraints.claims-must-be-verifiable
Every one of those edges points at a page that also belongs to other boot profiles. There is exactly one register.fit-not-superiority page. The taste profile can point at it too. Nothing is copied, so nothing can drift.
Now watch it run. The task is “write the closing call-to-action for this ebook.” The agent boots the brand-voice profile, which doesn’t hand it a wall of doctrine — it hands it a place to start and a bias. Following the edges the task actually pulls on, it pages in the voice principles and the CTA-pattern page, glances at one before/after example, and stops. It did not load the taste profile, the pricing policy, or nine-tenths of the graph. It recomposed exactly the context the job needed, from a shared single source of truth, at read time.
Compare that to the old world, where “brand voice” was a 2,000-word document that both restated the principles and slowly diverged from the taste document restating the same principles. The boot profile holds the same useful distinction — “this is voice work, start here” — while owning none of the content that could contradict another profile. The distinction survives; the duplication doesn’t.
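As a data structure, a boot profile can be very small. The sketch below is one possible shape, not a prescribed format: the `BootProfile` type, the `boot` function, the placeholder page bodies, and the edges chosen for the taste profile are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootProfile:
    name: str
    selection_bias: str     # one-line hint on what to surface first
    edges: tuple[str, ...]  # page ids, in the order the bias prefers


# Shared pages: each claim exists once (bodies are placeholders).
PAGES = {
    "voice.principles": "placeholder: voice principles",
    "register.fit-not-superiority": "placeholder: register",
    "pattern.cta-band": "placeholder: CTA pattern",
    "examples.before-after": "placeholder: worked examples",
}

BRAND_VOICE = BootProfile(
    name="brand-voice",
    selection_bias="voice and positioning first; concede trade-offs",
    edges=("voice.principles", "register.fit-not-superiority",
           "pattern.cta-band", "examples.before-after"),
)
TASTE = BootProfile(
    name="taste",
    selection_bias="aesthetic judgement first",
    edges=("register.fit-not-superiority", "examples.before-after"),
)


def boot(profile: BootProfile, wanted: set[str]) -> list[str]:
    """Page in only the edges this task pulls on, in bias order."""
    return [PAGES[page_id] for page_id in profile.edges if page_id in wanted]


cta_context = boot(BRAND_VOICE, {"voice.principles", "pattern.cta-band"})
shared = set(BRAND_VOICE.edges) & set(TASTE.edges)
```

Neither profile stores any doctrine. Each holds page ids and a bias, so the overlap in `shared` is a pair of pointers to the same pages rather than two copies of the same text.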
Boot anywhere; the edges guarantee reachability
The deeper shift is what “boot” even means now. With a monolith, booting is loading the document: a single, fixed act. With a graph, you can boot into any page and still reach everything the task needs, because the edges guarantee reachability. Enter from the pricing page and you can walk to the customer, the policy, the voice. Enter from the customer page and you can walk to the pricing. There is no privileged front door, because the graph is connected. The entry point only sets where you start and what you see first; it never limits what you can reach.
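The reachability claim can be checked mechanically. The sketch below assumes links can be walked in both directions, as with backlinks; if links were strictly one-way, "boot anywhere" would need the graph to be strongly connected. The page ids and helper names are hypothetical.

```python
def neighbours(links: dict[str, list[str]]) -> dict[str, set[str]]:
    """Treat every link as walkable in both directions (backlinks)."""
    both: dict[str, set[str]] = {}
    for src, targets in links.items():
        for dst in targets:
            both.setdefault(src, set()).add(dst)
            both.setdefault(dst, set()).add(src)
    return both


def reachable(entry: str, links: dict[str, list[str]]) -> set[str]:
    """Every page that can be walked to from `entry`."""
    adjacency = neighbours(links)
    seen, stack = {entry}, [entry]
    while stack:
        for nxt in adjacency.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical four-page graph.
LINKS = {
    "pricing": ["policy", "customer"],
    "customer": ["voice"],
}
EVERYTHING = {"pricing", "policy", "customer", "voice"}

same_from_anywhere = all(reachable(p, LINKS) == EVERYTHING for p in EVERYTHING)
```

Entering at `voice` or at `pricing` yields the same reachable set; only the order of discovery differs, which is what the entry point is meant to control.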
The kernel stopped being a thing you load and became a field you stand in. You don’t open it; you enter it — anywhere — and recompose what you need from where you’re standing.
This is why the drift failure from Chapter 1 simply cannot recur. Drift needed two copies of a claim, maintained separately, sliding apart. In a graph there is one page per claim and many profiles pointing at it. Edit the register page once and every profile that touches it — brand, taste, anything — sees the change immediately, because they were never holding their own copy. Single source of truth isn’t a discipline you have to enforce here; it’s the shape of the thing.
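The difference between the two worlds is copies versus references, which a few lines can make concrete. This is a toy sketch with hypothetical names and placeholder text, not a model of any real kernel file.

```python
# Monolith style: each kernel file holds its own copy of the claim.
brand_kernel = {"register": "fit, not superiority"}
taste_kernel = {"register": "fit, not superiority"}
taste_kernel["register"] = "state beliefs flat"  # edited in isolation
drifted = brand_kernel["register"] != taste_kernel["register"]

# Graph style: one page, and profiles hold only its id.
pages = {"register.fit-not-superiority": "fit, not superiority"}
brand_profile = ["register.fit-not-superiority"]
taste_profile = ["register.fit-not-superiority"]


def resolve(profile: list[str]) -> list[str]:
    """Look pages up at read time, so there is no copy to go stale."""
    return [pages[page_id] for page_id in profile]


# Edit the page once; both profiles see it on their next read.
pages["register.fit-not-superiority"] = "fit, not superiority; concede trade-offs"
in_sync = resolve(brand_profile) == resolve(taste_profile)
```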
Any flat kernel.md is a build output
Objection, and a fair one: some tool in your stack genuinely wants a flat kernel.md — a single file it can read at startup, because that’s how it was built. Fine. Give it one. The point is where the file comes from. You do not hand-author it and maintain it alongside the graph — that just reintroduces the monolith and its drift. You generate it: take a boot profile, walk its reachable closure over the edges, flatten those pages into markdown, and emit the file.
kernel.md as a compiled artefact (conceptual)
build_kernel(profile):
    pages   = closure(profile, follow=edges, depth=policy)
    ordered = topo_sort(pages, by=selection_bias)
    return flatten_to_markdown(ordered)   # emit kernel.md

# kernel.md is CACHE, not source.
# Regenerate on change. Never hand-edit the output.
The generated kernel.md is disposable — the same nuke-and-regenerate discipline I’ve argued for outputs generally, now pointed at the kernel itself. If it’s wrong, you don’t patch the file; you fix the page or the edge and rebuild.
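For readers who want something executable, here is a minimal Python rendering of that build step. It is a sketch under assumptions: the graph, page ids, and page bodies are hypothetical, the depth policy is a plain integer, and pages are emitted in breadth-first order from the profile as a simple stand-in for the bias-driven ordering in the conceptual version.

```python
from collections import deque

# Hypothetical graph and page bodies; the names are illustrative only.
EDGES = {
    "profile.brand-voice": ["voice.principles", "pattern.cta-band"],
    "voice.principles": ["register.fit-not-superiority"],
    "pattern.cta-band": [],
    "register.fit-not-superiority": [],
    "pricing.policy": [],
}
BODIES = {
    "profile.brand-voice": "Selection bias: voice and positioning first.",
    "voice.principles": "Placeholder voice principles.",
    "pattern.cta-band": "Placeholder CTA pattern.",
    "register.fit-not-superiority": "Placeholder register note.",
    "pricing.policy": "Placeholder pricing policy.",
}


def closure(entry, edges, max_depth):
    """Pages reachable from `entry` within `max_depth` hops, in BFS order."""
    order, seen = [entry], {entry}
    queue = deque([(entry, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue
        for target in edges.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order


def build_kernel(profile, edges=EDGES, bodies=BODIES, max_depth=2):
    """Flatten the profile's reachable closure into one markdown string."""
    header = "<!-- GENERATED from the wiki graph. Do not hand-edit. -->"
    sections = [f"## {pid}\n\n{bodies[pid]}" for pid in closure(profile, edges, max_depth)]
    return "\n\n".join([header, *sections]) + "\n"


kernel_md = build_kernel("profile.brand-voice")
```

Because the function is pure over the graph, running it twice on an unchanged graph gives byte-identical output, which is what makes the file safe to delete and rebuild. Writing `kernel_md` to disk is left to the caller.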
This closes the loop on my own earlier doctrine. I used to treat the kernel as the durable, hand-tended artefact. The refactor is that the graph is the durable artefact and any flat kernel is a build output of it — regenerated, never nursed. The recipe was always the asset; I just had the recipe and the output the wrong way round.
One wiki or many? That’s governance, not architecture
The last thing Part I has to settle is the question that used to feel architectural: do I have one big wiki, or many connected ones? Under the monolith it felt like a design decision with real stakes. Under the graph it stops being one. Once the kernel is defined as “the reachable closure from your entry point,” the number of underlying graphs is invisible to the agent — it boots, it walks edges, it recomposes. Whether those edges cross into one store or three changes nothing about how the kernel behaves.
What the graph count does decide is governance: scoping, allowlists, blast radius, who is permitted to edit which region, what an agent in one context is allowed to reach. You split into multiple wikis for the same reasons you split a codebase into services with boundaries — to contain permissions and limit what a compromised or confused agent can touch — not because the architecture demands it. That’s a genuine question with a real answer, but it’s a policy answer, decided by who should be allowed to see and change what — not an existential one about the shape of the kernel.
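Governance then shows up as a filter on the walk rather than a change to it. The sketch below assumes every page is tagged with the store it lives in and that each agent carries an allowlist of stores. The store names, page ids, and `scoped_closure` helper are hypothetical.

```python
# Hypothetical layout: two stores, with one edge crossing between them.
STORE_OF = {
    "pricing.policy": "shared",
    "partner.history": "shared",
    "finance.payroll": "finance",
}
EDGES = {
    "pricing.policy": ["partner.history", "finance.payroll"],
    "partner.history": [],
    "finance.payroll": [],
}


def scoped_closure(entry: str, allowed_stores: set[str]) -> list[str]:
    """Walk edges as usual, but never enter a store outside the allowlist."""
    if STORE_OF[entry] not in allowed_stores:
        return []
    seen, stack, order = {entry}, [entry], []
    while stack:
        page = stack.pop()
        order.append(page)
        for target in EDGES.get(page, []):
            if target not in seen and STORE_OF[target] in allowed_stores:
                seen.add(target)
                stack.append(target)
    return order


drafting_agent = scoped_closure("pricing.policy", {"shared"})
owner_agent = scoped_closure("pricing.policy", {"shared", "finance"})
```

The traversal code is the same for both agents; only the allowlist differs. That is the sense in which the number of stores is a policy decision: it bounds what a confused or compromised agent can reach without changing how the kernel is assembled.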
Part I in one line
The wiki is the kernel: decomposed into pages, navigated by edges, entered through boot profiles, and flattened to a document only as a disposable build output. The monolith is dead; what replaces it is a field you stand in.
That is the spine. Now Part II puts it under load in the place people feel the pain most sharply — not their personal kernel, but the parade of vendor AI features that demo beautifully and stay useless about their actual life and business. The reason is the same monolith, wearing a very different disguise: every copilot can see only its own fragment of your world, and the compiled understanding that would actually help can only live on your side of the boundary.
The Four Locks
Ask your email’s AI for the June invoice and it works. Ask it for “that person I talked to years ago about a thing we might build” and it’s helpless. Same inbox, opposite outcome — and the reason is structural, not a maturity gap.
TL;DR
- Lookups inside a silo work: that’s the demo. Judgement across silos fails: that’s the job. Every vendor copilot sits on the wrong side of that line.
- Four locks keep it there and make it permanent: unit economics, liability, the silo boundary, and vendor incentive. You only need one to hold. All four do.
- The same failure appears in your inbox and at your dentist and, in the next chapter, across an enterprise. Structural, not anecdotal.
Part I fixed your kernel. This part is about everyone else’s — the vendors who keep shipping AI features into the tools you already pay for, each one demoing beautifully and staying stubbornly useless about your actual life or business. It is the same monolith from a different corner: a copilot can see only its own fragment of your world, and the compiled understanding that would actually help is on the wrong side of a boundary it cannot cross. Once you see why, you can evaluate any AI feature you’re ever offered with a single question — but first the mechanism.
Start where everyone lives, the inbox. The AI built into your mail client can find the electricity invoice from June without trouble. Ask it instead for “that person I had a long conversation with a few years ago about something we were thinking of building — I can’t remember their name,” and it has nothing. Both questions are “about your email.” They are not the same kind of question at all.
The argument, once
Here is the whole of Part II in two sentences; everything after is proof at increasing scale. Lookups inside a silo work; cross-silo judgement fails. The invoice is a keyword lookup over data the silo already holds. The relationship question is a judgement that requires a model of your whole world — who those people are, what “something we might build” means for you, why it mattered — and that model is a synthesis across boxes the vendor cannot see. Judgement needs a place to stand that sees everything at once. That place is your side of the boundary, or nowhere.
The reframe
The demo is the lookup. The job is the judgement. Vendors ship the demo and describe it as the job — and the demo works precisely because it only ever asks what the silo already contains.
Why no vendor ships the layer that would help
The obvious question is why. These vendors have enormous engineering budgets and every commercial reason to sell you more capability. If the cross-silo layer is what you actually need, why doesn’t someone just build it? Because four independent forces all push the same way, and you only need one of them to hold for the boundary to be permanent. All four hold at once.
The four locks
1. Unit economics. A genuine model of your world means paying comprehension costs across years of your data, recurring as your world changes: a real capital expense per user. A vendor on a flat monthly seat cannot eat that for millions of users, so it ships the cheap thing: a lookup over data it already stores.
2. Liability. The cross-silo layer is a synthetic dossier of you: relationships, health, finances, inferred and cross-referenced. No vendor wants to hold that, and increasingly no regulator wants them to. The safe product forgets between sessions; the useful product never forgets. Those pull in opposite directions.
3. The silo boundary. Your email vendor cannot see your accounting system; your practice software cannot see your phone system or your reviews. Each is structurally blind past its own edge, by architecture, not choice. The one view that would help is the one no single vendor can assemble.
4. Vendor incentive. Even where a vendor could see across, its AI will never recommend against its own product, question its own module, or tell you the process it automates shouldn’t exist. A compiled worldview has to be able to say “cancel this, it isn’t working.” A vendor’s AI structurally cannot.
Hold those four, because the point of the escalation ahead is that the same four explain the failure whether you’re one person with an inbox or a five-thousand-seat enterprise. That invariance is what makes it structural rather than a story about immature software.
Scale one: your inbox
The AI in your mail client, and the “personal agent” the whole industry is promising, both reach your email through a thin, deterministic bridge (search, fetch, list), the kind of interface open connector standards were built to standardise.[1] Those connectors are genuinely useful. They ship the reliable bottom rungs of the ladder (list, read, search, call): deterministic operations over a single system.[2] What they do not ship, and were never meant to ship, is the map above those rungs: the compiled model of who matters, what’s in flight, what you’d want to know.
So the invoice lookup works and the relationship judgement doesn’t, and neither is a bug. One is inside the silo; the other is a diff against your whole worldview, and that worldview lives on your side of the boundary or nowhere. This is exactly why a personal agent bolted onto vendor connectors underwhelms the moment you ask it anything that matters.
A personal agent without your wiki is a genius stranger with your password. Competent, credentialed, and useless about you.
And notice lock 2 doing its quiet work. Even if your mail vendor wanted to compile that worldview, it would have to build and permanently hold a separate, cross-referenced corpus of everything you’ve ever discussed — the privacy liability alone is a reason not to, and the compute is another on top. I know this because I run the alternative: a triage agent over my own email that was hopeless on raw access and became a genius the moment it had a compiled worldview of a few years of my correspondence, built and held on my side. Nothing about the model changed — the input did — and that result is the spine of the previous field guide in this series, so I’ll name it rather than re-derive it here.
The inbox, two ways
Vendor AI on connectors
- Finds the June electricity invoice instantly
- Answers “emails from Priya last week”
- Blank on “that person, years ago, about that idea”
- Judges each message in isolation, with no model of you
An agent over a worldview you own
- Knows who matters and what’s in flight
- Resolves the fuzzy relationship query by walking edges
- Interrupts only for the genuinely novel or urgent
- The compiled understanding lives on your side of the line
Scale two: your dentist
Now give the problem a profit-and-loss statement. A dental practice runs on a practice-management system, and that vendor now ships an AI feature. It answers questions about the recall list and the appointment book beautifully — those live inside the silo. Ask it the question the owner actually has at six o’clock on a Tuesday — is this practice healthy, and what should I change this month? — and it goes blind, because that answer spans the marketing spend, the Google reviews, the payroll, the phone system, and the owner’s own inbox, none of which the practice software can see.
What the PMS copilot answers
- • Who’s due for recall next month?
- • Which slots are open on Thursday?
- • What’s outstanding on this account?
- • How many hygiene visits last quarter?
What running the practice needs
- • Are we winning or losing new patients, and from where?
- • Is the front desk converting the calls marketing paid for?
- • Which services should we stop offering?
- • Is this month’s dip seasonal, or the start of a trend?
The right-hand column is a whole-practice operating review, and it is a cross-silo synthesis by definition. I’ve built exactly this shape — deterministic collectors pulling each system, layered summaries, and a model at the top that reads the compressed whole and finds the moves a single-system view can’t. The practice-software vendor will never build it, and here lock 4 becomes the binding one: its AI will never tell the practice that a service it bills for should be dropped, or that the recall process it was sold to run shouldn’t exist in that form. It can optimise the recall list. It structurally cannot ask whether the recall list is the point.
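The shape described above (collectors, layered summaries, a model at the top) can be sketched in a few lines. This is a minimal illustration and not the system referred to in the text: the collector names, the figures, and the `model` callable are all hypothetical placeholders.

```python
from typing import Callable

# Hypothetical collectors: each one pulls a single system deterministically.
# All figures are placeholders, not real practice data.
def collect_marketing() -> dict:
    return {"spend": 1200, "leads": 40}

def collect_phones() -> dict:
    return {"inbound_calls": 40, "booked": 18}

def collect_pms() -> dict:
    return {"new_patients": 15, "recalls_due": 220}

COLLECTORS: dict[str, Callable[[], dict]] = {
    "marketing": collect_marketing,
    "phones": collect_phones,
    "pms": collect_pms,
}

def summarise(name: str, data: dict) -> str:
    """Compress one system's pull into a single line."""
    facts = ", ".join(f"{key}={value}" for key, value in sorted(data.items()))
    return f"{name}: {facts}"

def build_review_input() -> str:
    """Layer the per-system summaries into one compressed whole."""
    return "\n".join(
        summarise(name, pull()) for name, pull in sorted(COLLECTORS.items())
    )

def operating_review(model: Callable[[str], str]) -> str:
    """Hand the cross-silo summary to whatever model sits at the top."""
    return model(build_review_input())
```

The point of the split is that everything below `operating_review` is deterministic and testable without a model; only the last call involves judgement.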
Two scales, one shape. The inbox and the dental practice fail identically: the useful question is a synthesis the vendor can’t assemble, blocked by the same four locks. If the pattern holds when we add three more zeros — when “the vendor” becomes a whole portfolio of enterprise copilots — then it isn’t a story about small software being immature. It’s architecture. That’s the next chapter, and it’s where the most serious counterargument lives: the suite vendor who insists that they are the cross-silo layer.
Five Copilots Don’t Compose
The enterprise doesn’t have an AI. It has a portfolio of copilots — one per silo — that each demo well and never answer the question the leader actually has. Adding zeros doesn’t change the shape; it confirms it.
TL;DR
- •Five fragments don’t compose into a picture, because composition needs a shared map none of the five can hold. A real question like “why is churn up here?” is the join across systems no copilot can see.
- •The suite-vendor counterclaim — “we ARE the cross-silo layer” — fails twice: coverage (even the biggest suite is a fraction of the estate) and lock-in (renting your worldview back is the worst trade there is).
- •The fix is identical at every scale: own the compiled layer; demote vendor AI to connectors. One question evaluates any copilot — which side of the boundary does the understanding live on?
Scale the pattern up and the disguise gets more convincing, which is exactly why it’s worth taking apart carefully. A modern enterprise doesn’t buy an AI. It accumulates a portfolio of copilots — one in the CRM, one in the support desk, one in the code host, one in the productivity suite, one in the data warehouse. Each demos well inside its own silo. The implicit pitch is that together they cover the business. They don’t, and the reason is precise: five fragments don’t compose into a picture, because composition needs a shared map that none of the five can hold.
One question, five copilots, no answer
Take the question a leader actually asks in a Monday meeting: why is churn up in this segment? Watch it decompose across the portfolio and die in the gaps between the boxes.
Where the fragments die
- CRM copilot sees the accounts that churned and the sales notes — but not why the product frustrated them.
- Support copilot sees the ticket spike — but not that billing changed the plan underneath it.
- Billing copilot sees the plan migration — but not the feature that quietly stopped working.
- Product-telemetry copilot sees a feature’s usage collapse — but not that it maps onto the exact segment sales flagged.
- The answer is the join across all four, and no copilot can join across systems it cannot see.
Every one of those copilots gives a competent, confident answer about its own column, and the sum of those competent column-answers is not the row that explains the churn. The row (the causal chain that runs billing → feature → telemetry → support → cancellation) exists only in a view that spans every system in the chain, and no vendor can build that view because each is blind past its own edge. This is lock 3 at enterprise scale, and it doesn’t soften with budget; it hardens, because the number of silos grows faster than any one vendor’s reach.
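A toy sketch of the column-versus-row distinction, assuming four hypothetical silo extracts that happen to share an invented `segment` key. The record shapes and field names are illustrative only; real systems rarely share a key this conveniently, which is part of the problem.

```python
# Hypothetical per-silo extracts; every field name here is invented.
crm = {"smb-annual": {"churned_accounts": 14}}
support = {"smb-annual": {"ticket_spike": True}}
billing = {"smb-annual": {"plan_migrated": True}}
telemetry = {"smb-annual": {"feature_usage_collapsed": "bulk-export"}}


def column_answer(silo: dict, segment: str) -> dict:
    """What a single copilot can say: only its own column."""
    return silo.get(segment, {})


def row_answer(segment: str, *silos: dict) -> dict:
    """The join: merge every silo's facts about one segment into one row."""
    row: dict = {}
    for silo in silos:
        row.update(silo.get(segment, {}))
    return row


row = row_answer("smb-annual", crm, support, billing, telemetry)
print(row)
```

`column_answer` is correct and complete for its silo. Only `row_answer` holds the facts side by side, and it can only run somewhere that is allowed to see all four inputs.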
The serious objection: “we ARE the cross-silo layer”
Here is the counterargument that deserves a real answer, because a real vendor will make it and it’s the most persuasive thing in the room: “Our suite already spans CRM, support, and productivity — so buy more of us, standardise, and the fragments compose inside our platform.” It sounds like the resolution. It’s the trap. Answer it twice, because it fails on two independent grounds and either one is fatal.
First, coverage. Even the broadest suite covers a fraction of a real enterprise’s estate. The billing system is someone else’s. The data warehouse is someone else’s. The phone system, the shadow-IT spreadsheets that run half the operation, the industry-specific tools, the email threads where the actual decisions were made — all outside the suite. A cross-silo layer that can’t see most of the silos is just a larger silo, and the churn question still spans its boundary. You’ve consolidated five blind spots into one bigger one and called it integration.
Second, and worse, lock-in. Suppose a vendor genuinely could see everything. Handing that vendor the compiled model of how your business actually works is the most complete lock-in in commercial history — not lock-in on a database you could migrate, but on the understanding of your own company. You would be renting your own worldview back from a supplier, on precisely the asset a business must own outright. This is the own-versus-rent argument I’ve made for software carried one level deeper, to understanding: buying generic product was already the weaker move; renting your comprehension of yourself is worse.
An AI strategy that is the union of vendor roadmaps is a strategy for owning nothing. Their copilots are features of their products; your worldview is a property of your business.
The fix is the same at every scale
Because the failure is one structure repeated — inbox, dentist, enterprise — the fix is one structure repeated. Assemble the compiled cross-silo layer on your side of the boundary, as a worldview you own, and demote every vendor AI to what it is genuinely excellent at: deterministic connectors into its own system, operating underneath your map. The mail vendor stays the fetch-the-message tool. The practice software stays the pull-the-recall-list tool. The five enterprise copilots become five reliable adapters. The synthesis — the join, the judgement, the “should this exist” — happens above them, in a substrate that sees across because you built it to.
This is not anti-vendor, and reading it as a grudge misses the point. Vendor AI at the bottom rungs is fast, cheap, and reliable, and you should use it there without hesitation. What you must not do is wait for it to grow the top of the ladder, because the four locks guarantee it won’t. If the boundary is structural for the largest platform vendors on earth — the ones with the best engineers and the deepest pockets — then your vertical-software supplier is not quietly closing it in the v2 on their roadmap.
The one-question evaluation rule
For any copilot you’re offered, ask: which side of the silo boundary does the compiled understanding live on — theirs or mine? If “theirs,” you’re buying a better lookup and renting your own judgement. If “mine,” the vendor is a connector and you own the asset. Buy the connector; never rent the worldview.
Run the rule on the copilots you already have. The mail assistant: understanding lives on their side — connector, fine, don’t expect judgement from it. The practice-software feature: their side — connector, fine, but the operating review is yours to build. The enterprise suite promising to become your cross-silo brain: their side — and that is the one to refuse, because it’s the asset you can least afford to rent.
That closes the flagship. Every vendor will keep shipping the impressive in-silo demo, and every demo will keep being the lookup, not the job. The part that compounds, the part competitors can’t clone, the part that can say “stop” — is the compiled understanding you assemble on your own side of the line. But saying “own the compiled layer” raises the question Part III exists to answer: what is this thing, architecturally? It isn’t a database, and it isn’t a bigger prompt. It’s a tier of memory the industry’s standard diagram left out — and that’s where we go next.
The Missing Memory Tier
Weights, context window, KV cache — the memory hierarchy everyone draws stops one tier too early. The tier it skips is the only one you own, and the only one that appreciates.
TL;DR
- •The full hierarchy is weights → context → KV cache → wiki. Everything below the wiki depreciates — models commoditise, sessions evaporate. The wiki compounds.
- •The wiki is a compile, not a cache. A cache re-derives the identical thing cheaply; the wiki adds structure — edges — that was never in the source.
- •The model is the interchangeable CPU; the wiki is the disk and the identity. Swap the frontier model underneath and it’s still your system. That’s why the wiki becomes the AI.
Part II ended with an instruction — own the compiled layer — and a debt: what is that layer, architecturally? People reach for “it’s a database” or “it’s a big prompt” or “it’s a fancy cache,” and every one of those undersells it in a way that matters. The clean way to place it is on the memory hierarchy the field already uses to think about LLMs — and to notice that the standard diagram is missing its most important row.
The hierarchy, completed
Everyone draws three tiers. There is a fourth, and it changes the picture.
The LLM memory hierarchy
| Tier | Lifetime | Owner | Direction |
|---|---|---|---|
| Weights | Baked at training | The vendor | Depreciates — commoditises with each release |
| Context window | One turn | Nobody | Evaporates — paid again every call |
| KV cache | One session | Nobody | Evaporates — gone at session end |
| Wiki | Durable | You | Appreciates — compounds under maintenance |
Read down the ownership column and the strategic asymmetry jumps out. The three tiers everyone talks about are either owned by the vendor or owned by nobody, and all three depreciate — the weights commoditise as every lab ships a comparable model, the window and the cache vanish the moment the work is done. The wiki is the only tier you own, and the only one that gets more valuable with use. Everything the industry obsesses over lives in the depreciating region. The appreciating asset is the row it forgot to draw.
A compile, not a cache
The most common way to undersell the wiki is to call it a cache. It’s the natural analogy — a store of things you’d otherwise recompute — but it’s wrong in the way that matters. A cache implies cheap re-derivation of the identical thing: you keep the answer so you don’t redo the computation, but the cached value is exactly what the computation would have produced. The wiki isn’t that. It’s a transformation. The edges — this claim depends on that, this decision superseded that, this person connects to that project — are added structure that was never present in the source. You cannot re-derive them cheaply from a fresh read of the raw material, because they weren’t in the raw material; they were synthesised across it.
What the edges add: source vs page
The source (an email thread)
A long back-and-forth about a delivery date, full of pleasantries, a changed figure, and an off-hand line about a future project. Large, unstructured, self-contained. Says nothing about how it relates to anything else.
The compiled page
Small, self-describing: the decision that was reached, with edges to the project it affects, the person who made it, the prior decision it revised, and the future idea it seeds. It holds relationships the thread never stated.
This is why comprehension is paid once and reused forever, and why the wiki out-thinks plain retrieval that merely fetches the nearest chunks — the intelligence is baked into the structure before any question is asked, which is the whole argument for a graph over a cache. How you actually build that structure — the ingestion, the self-cleaning, the edge-wiring — is its own subject that I’ve treated at length elsewhere; here I’ll take the built asset as given and focus on where it sits. A page, once compiled, is worth more than the source it came from: smaller, self-describing, and — crucially — describing its own relationship to everything else. It’s a neuron in a map, not a note in a pile.
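One way to picture a compiled page is as a small record plus typed edges. The sketch below is an assumption about shape, not a description of any particular wiki engine: the `Page` and `Edge` classes, the relation names, and every identifier in the example are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    relation: str  # e.g. "affects", "decided_by", "revises", "seeds"
    target: str    # id of the page this edge points to


@dataclass
class Page:
    page_id: str
    summary: str
    source: str    # pointer back to the canonical raw material
    edges: list[Edge] = field(default_factory=list)

    def neighbours(self, relation: str) -> list[str]:
        """Ids of the pages reachable over one kind of edge."""
        return [e.target for e in self.edges if e.relation == relation]


# A compiled page for the email thread described above (placeholder ids).
page = Page(
    page_id="decision/delivery-date",
    summary="Delivery date was changed; the revised figure was accepted.",
    source="mail/thread-placeholder",
    edges=[
        Edge("affects", "project/placeholder"),
        Edge("decided_by", "person/placeholder"),
        Edge("revises", "decision/earlier-delivery-date"),
        Edge("seeds", "idea/future-project"),
    ],
)
```

None of the four edges could be read off the thread itself. They are the added structure, and `source` is the drill path back to the raw material.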
The wiki is the identity
Now the claim that sounds grand until you make it precise: the wiki becomes the AI. Take it apart. A wiki without a model is a filing cabinet — structure, no cognition. A model without a wiki is an amnesiac genius, brilliant and re-introducing itself to your world on every single call. Neither is the system. The system is the pairing, and in that pairing the wiki is the part that is you. The model is the interchangeable CPU — swap it for a faster one and the machine runs the same programs. The wiki is the disk and the identity — swap it and you have a different system entirely.
The model-swap test
Change the frontier model underneath a wiki-backed agent and the behaviour survives, because the behaviour lived in the wiki, not the weights. Change nothing but remove the wiki, and a genius resets to a stranger. That’s the tell for where the identity actually lives.
I’ve watched exactly this happen: the same triage behaviour riding intact across a model change, because everything that made it “mine” was in the compiled worldview, and the model was just the processor executing against it. It is the literal version of a line I’ve used loosely before — that the wiki is a set of “soft weights” you own, a cached state of your whole worldview that biases every model you pour through it. Every wiki-less personal agent resets to a stranger on each migration; a wiki-backed one carries its identity across.
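The model-swap test can be stated as a tiny harness. This is a sketch under stated assumptions: the two "models" are stub functions standing in for real ones, and the wiki is a one-entry dictionary with invented content.

```python
from typing import Callable

# A model here is any callable (question, context) -> answer.
Model = Callable[[str, str], str]

# Hypothetical compiled worldview; the content is a placeholder.
WIKI = {"people/priya": "Priya is a key client contact; her mail is urgent."}


def model_a(question: str, context: str) -> str:
    return f"A says: {context}" if context else "A says: unknown sender"


def model_b(question: str, context: str) -> str:
    return f"B says: {context}" if context else "B says: unknown sender"


def triage(model: Model, wiki: dict, sender: str) -> str:
    """Look the sender up in the wiki, then let the model phrase the call."""
    context = wiki.get(f"people/{sender}", "")
    return model(f"How urgent is mail from {sender}?", context)
```

Swapping `model_a` for `model_b` changes the wording and keeps the judgement. Passing an empty wiki to either one resets it to "unknown sender".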
There is a sibling to this argument worth naming so it isn’t confused with it. Placing the wiki as the memory tier is an architecture and identity claim — where durable understanding lives, and why swapping models leaves the system intact. A separate line of the same doctrine is about governance: why an auditable wiki of knowledge, rather than opaque model memory, is what makes an AI system governable at all. Same substrate, different lens — and the governance lens is exactly what the last two chapters of this part turn on, because once knowledge is a durable, ownable tier, the question becomes: why is anyone still stuffing it into the system prompt?
Your System Prompt Is a Photograph
The carefully engineered mega-prompt keeps failing because it froze the resolution of your knowledge at write time — simultaneously too much and too little — for every question you hadn’t been asked yet.
TL;DR
- •A mega-prompt freezes knowledge at write-time resolution: too much (an attention tax every turn) and too little (the needed detail fell below the cutoff) at the same time.
- •A wiki chooses resolution at read time, per question, with a drill path to canonical source — and carries staleness control and provenance the prompt can’t.
- •What survives in the prompt is orientation, not knowledge. The prompt is a boot loader; the wiki is the filesystem it mounts.
The monolithic kernel from Part I has a twin that lives one layer out, in production, and it’s where most teams actually feel the pain: the enormous system prompt. The instinct is ancient and reasonable — pack everything the agent might need into the prompt. Here’s what to do. Here’s how we work. Here’s how to tell good from bad. Here’s the exception for enterprise accounts. And it fails, reliably, in a way that feels like it should be fixable by writing a better prompt — which is the trap, because the failure is structural, the same write-time-resolution problem wearing production clothes.
Too much and too little, at once
The author of a mega-prompt has to guess, in advance, what detail every future task will need — and the guess is impossible to get right, because it’s not one guess, it’s a single frozen answer to a thousand different future questions. Freeze the resolution high, packing in every detail, and you pay an attention tax on every single turn, most of which don’t need most of the detail — and attention is finite, so density degrades the output before you even hit a token limit. Freeze it low, keeping the prompt lean, and the one task that needed the specific exception finds it fell below the cutoff and was never included. The mega-prompt fails at both ends simultaneously, and adding more text only moves you further up one failure while deepening the other.
The photograph
A system prompt is a photograph of the territory, taken once, at write time. The demo works because the demo only asks about the part of the territory the photograph happened to capture. The real task asks about what fell outside the frame.
A wiki dissolves both failure ends at once, because it doesn’t freeze resolution at all. It chooses resolution at read time, per question: the task that needs a one-line overview gets the overview; the task that needs the third-quarter enterprise exception drills an edge and pages it in, all the way back to canonical source if it has to. Nobody had to predict, at write time, which of those two a future question would want. This is the knowledge half of a problem whose instruction half I’ve argued separately — that over-specifying a capable model wastes its attention, and you should give it a North Star instead of a spec. That piece owns the instruction half. This is the same disease in the knowledge you stuff alongside the instructions.
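Read-time resolution can be sketched as a reader that takes a depth argument. The page ids and policy wording below are invented placeholders, and the dictionary layout is an assumption for illustration.

```python
# Hypothetical pages: each has a short overview and edges to more detail.
WIKI = {
    "policy/refunds": {
        "overview": "Placeholder: the general refund rule.",
        "edges": ["policy/refunds/enterprise"],
    },
    "policy/refunds/enterprise": {
        "overview": "Placeholder: the enterprise exception to the refund rule.",
        "edges": [],
    },
}


def read(page_id: str, depth: int = 0) -> list[str]:
    """Return overviews, following edges up to `depth` hops.

    The caller picks the resolution per question; nothing is fixed at write time.
    """
    page = WIKI[page_id]
    lines = [page["overview"]]
    if depth > 0:
        for target in page["edges"]:
            lines.extend(read(target, depth - 1))
    return lines
```

A task that needs the one-line overview calls `read("policy/refunds")`. A task that needs the exception calls `read("policy/refunds", depth=1)` and pays for the extra page only then.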
Two things the prompt can’t carry: freshness and provenance
Even set the resolution problem aside and two more failures remain, both fatal at scale. The first is staleness. A mega-prompt is a hand-maintained monolith, and like any hand-maintained monolith it drifts into confidently wrong — a policy changed, nobody updated the paragraph, and now the agent asserts last quarter’s rule with perfect fluency. A wiki page has a janitor, a lint pass, and a git history; you can see when a claim was last touched and by whom. The second is provenance. A flat assertion in a prompt is unauditable vibe — the agent believes it because it’s in the prompt, full stop. The same knowledge as a wiki page carries an edge to its canonical source, so “why did the agent believe that?” becomes an answerable question with a link at the end of it. (That answerability is the hinge of the whole next chapter.)
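A lint pass for both failures can be very small. The sketch below assumes each claim records a source pointer and a last-reviewed date; the `Claim` class, the 180-day threshold, and the example data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Claim:
    text: str
    source: Optional[str]   # edge to canonical source, or None if missing
    last_reviewed: date


def lint(claims: list[Claim], today: date, max_age_days: int = 180) -> list[str]:
    """Flag claims with no provenance or with a stale review date."""
    problems = []
    for claim in claims:
        if claim.source is None:
            problems.append(f"no provenance: {claim.text}")
        if (today - claim.last_reviewed).days > max_age_days:
            problems.append(f"stale: {claim.text}")
    return problems


claims = [
    Claim("Rule A", "docs/rule-a-placeholder", date(2024, 12, 1)),
    Claim("Rule B", None, date(2024, 1, 1)),
]
problems = lint(claims, today=date(2025, 1, 1))
```

A sentence in a prompt has neither field, so there is nothing for a check like this to inspect.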
The treadmill tell
Here is how you know a team is trapped in this without their realising it: they iterate on the mega-prompt forever. Every incident looks like a missing sentence, so every fix adds a sentence, and each sentence adds tokens, dims attention across every unrelated turn, and helps cause the next incident, which looks like another missing sentence. The prompt grows, gets slower, and stays wrong in places that keep moving.

A support agent’s treadmill
- • Agent issues a refund it shouldn’t have. Fix: add a sentence about refund limits.
- • Next month it refuses a valid refund on an enterprise contract. Fix: add the enterprise exception.
- • Then it mishandles a partial refund on a legacy plan. Fix: add another clause.
- • The prompt is now four thousand tokens, slower on every call, and still wrong on the case nobody’s hit yet — while its size degrades the ninety per cent of turns that were never about refunds.
Each clause was a policy that wanted to be a page with edges — owned, dated, drillable — not a sentence competing for attention with everything else the agent does.
Knowledge that keeps demanding prompt space is knowledge asking to be a page with edges. The treadmill is the symptom; write-time resolution is the disease.
What survives: orientation, not knowledge
None of this means the system prompt disappears. It means it gets demoted to what it’s actually good at, exactly as the named kernels were demoted to boot profiles in Part I. What stays in the prompt is orientation: who this agent is, what it’s ultimately for, the conventions of its toolbelt, and how to walk the wiki. What leaves the prompt is knowledge: the policies, the exceptions, the company’s specifics — all of it moved to pages the agent pulls in when a task references them.
“How our company works”, dissected
Stays — the boot loader
- • Identity and role
- • The North Star / purpose
- • Toolbelt conventions
- • How to search and walk the wiki
Leaves — becomes pages
- • Refund and pricing policy (+ exceptions)
- • Account-tier rules
- • Product specifics and edge cases
- • Anything that changes, or has an owner
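The split above can be sketched as two stores and one assembly function. Everything named here is hypothetical: the boot entries, the page ids, the `read_page` tool mentioned in the boot text, and the placeholder page bodies.

```python
# Orientation only: small, stable, the same for every task.
BOOT = {
    "identity": "You are the support agent for the company.",
    "north_star": "Resolve the customer's issue fairly.",
    "tools": "Use read_page(page_id) to load a wiki page.",
}

# Knowledge: lives in pages, loaded only when a task references them.
PAGES = {
    "refund-policy": "Placeholder text of the refund policy page.",
    "account-tiers": "Placeholder text of the account-tier page.",
}


def boot_prompt() -> str:
    """The part that stays in the system prompt."""
    return "\n".join(f"{key}: {value}" for key, value in BOOT.items())


def context_for(task_refs: list[str]) -> str:
    """Boot loader plus only the pages this task actually references."""
    pages = [f"[{pid}]\n{PAGES[pid]}" for pid in task_refs if pid in PAGES]
    return "\n\n".join([boot_prompt(), *pages])
```

`boot_prompt()` never grows when a policy changes. Only the page does, and only the tasks that reference it ever see it.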
The line is clean and it’s the same line as everywhere else in this book: the prompt is a boot loader; the wiki is the filesystem it mounts. A boot loader is small, stable, and its job is to get you oriented and hand off to the real storage — not to contain the operating system. Which raises the question that decides whether any of this is worth the trouble: once knowledge lives in pages instead of prompts, what happens to all the work — the prompt-tuning, the test harnesses, the ownership — that used to pile up around the mega-prompt? That’s the last, and most consequential, chapter of this part: the tuning was perishable, and the curation compounds.
The Perishable Layer
Prompt-tuned agents spend engineering effort on the perishable layer — behaviour-coupled calibration nobody owns and every model release invalidates. Substrate agents spend the same effort on the layer that compounds.
TL;DR
- •A prompt is a program written against an undocumented, unstable API. Every model release invalidates the calibration — N agents × M migrations, forever. Move knowledge to the wiki and the coupling shrinks to the boot loader.
- •Regression transforms with it: test the path, not the output shape. And ownership decomposes along domain lines — changing what the AI believes becomes a reviewable diff with an approver.
- •Show me the bits you read moves repair into user-land. The burden doesn’t vanish — it changes direction: the old work evaporated per release; the new work compounds and has owners.
This is the chapter that decides whether the whole refactor pays for itself, because it’s about where your effort goes and whether it survives. There are two halves — a technical one about the treadmill of model-coupled tuning, and an organisational one about who owns what the AI believes — and they turn out to be the same story told at two scales.
The technical half: a prompt is a program against an unstable API
Be precise about what a tuned prompt actually is. It is a program — and the machine it’s written for is a specific model’s behaviour, which is undocumented and changes without notice. You can’t develop that program analytically, because there’s no spec for the target; you develop it empirically, a thousand iterations against output-shape harnesses, feeling for the phrasing that lands. And then a new model ships, the behaviour shifts underneath you, and the calibration you paid for is invalidated. Multiply it out and it’s brutal: N agents, each hand-tuned, times M model migrations, each forcing a re-tune, forever. The effort is real, and it is perishable by construction.
Moving the intelligence to the wiki cuts the coupling at the root. Knowledge becomes declarative — a page states what’s true, in plain language, independent of which model reads it. What stays coupled to a specific model shrinks to the boot loader: identity, orientation, how to walk the graph. So a migration stops being a re-tuning project and becomes a bake-off — point the new model at the same substrate and see if it walks the wiki competently. Nearly all of your agents converge on the same shape, too: tools, a North Star, and the shared substrate, with only the North Star and inputs differing. Every agent starts to look like the same agent, which is exactly the architecture-over-prompts direction I’ve argued the whole cognition pipeline should take — context architecture, not model choice, as the binding constraint.
One agent, across a model migration
Prompt-calibrated build
- • Thousands of tuning iterations against output harnesses
- • Knowledge and behaviour welded together in one blob
- • New model → re-run the harness, re-tune the phrasing
- • Cost recurs on every migration, per agent
Substrate build
- • Knowledge declared once as pages; behaviour is a thin loader
- • New model → a bake-off: does it walk the wiki well?
- • Curation effort carries forward, model to model
- • Cost paid into an appreciating asset, not a perishable one
Test the path, not the output shape
Regression testing transforms in the same motion, and this is the part teams miss. An output-shape test asserts on the string the model produced — it’s a property of the model, so it’s brittle across a swap and dies when phrasing shifts. A substrate lets you test something durable instead: which pages did the agent walk to reach its answer? That’s a property of the map — stable across models, because the right knowledge path is the right knowledge path whoever is reading. A refund decision that walked the refund policy, the account-tier page, and the exceptions is correct-by-construction even if the wording changed; an output-shape test on the exact refusal sentence would have failed the migration for no real reason. Test the path, and your regression suite stops dying every release.
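The difference between the two kinds of test fits in a dozen lines. This sketch assumes the agent run exposes the list of pages it walked; the page ids, the `walked` field, and both recorded runs are invented for illustration.

```python
# Pages a correct refund decision is expected to consult (hypothetical ids).
REQUIRED_PATH = {"policy/refunds", "accounts/tiers", "policy/refunds/exceptions"}


def path_ok(run: dict) -> bool:
    """Durable check: did the agent walk the pages the decision depends on?"""
    return REQUIRED_PATH <= set(run["walked"])


def output_shape_ok(run: dict, expected_sentence: str) -> bool:
    """Brittle check: did the model emit this exact string?"""
    return run["answer"] == expected_sentence


# Two recorded runs of the same task on different models (made-up data).
old_run = {
    "answer": "I can't refund this order.",
    "walked": ["policy/refunds", "accounts/tiers", "policy/refunds/exceptions"],
}
new_run = {
    "answer": "This order is not eligible for a refund.",
    "walked": ["accounts/tiers", "policy/refunds", "policy/refunds/exceptions"],
}
```

Both runs pass `path_ok`. The new run fails `output_shape_ok` against the old sentence even though the decision and the pages behind it are unchanged.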
Follow the economics out and it becomes a platform shift, not a tooling tweak. When agent number N costs a paragraph — a North Star and a toolbelt over the shared substrate — the wiki is the filesystem and the agents are processes running against it. That is the organisational form I’ve sketched as the Org Brain: a shared substrate the whole fleet reads from, rather than N hand-built minds.
The organisational half: who owns what the AI believes?
Now the question that quietly sinks enterprise deployments. The mega-prompt has no natural owner. It interleaves HR’s rules, finance’s thresholds, legal’s exceptions, and operations’ conventions in one blob that’s cross-functional by accident — so nobody wants to own it, everyone’s afraid to touch it, and it drifts. Prompting looks like something an end user could do for themselves right up until it’s technical and load-bearing, at which point no end user will go near it, and it falls into an ownership vacuum.
Decompose the knowledge into a wiki and ownership decomposes with it. The refund policy is a page finance owns; the legal exceptions are pages legal owns; each domain edits its own pages in plain English. Changing what the AI believes becomes a reviewable diff with an approver and a history — CODEOWNERS, but for beliefs. And notice you invented no new process: this is document control, the reviewed-change discipline organisations have run for a century, pointed at the AI’s knowledge.
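A minimal sketch of page ownership, loosely modelled on how a CODEOWNERS file resolves (the last matching pattern wins). The patterns, team names, and the `can_approve` rule are assumptions for illustration, not a real configuration.

```python
from fnmatch import fnmatch

# Hypothetical ownership rules, most general first.
OWNERS = [
    ("policy/*", "finance"),
    ("legal/*", "legal"),
    ("policy/hr/*", "hr"),
]


def owner_of(page_id: str):
    """Return the owning team for a page; later rules override earlier ones."""
    owner = None
    for pattern, team in OWNERS:
        if fnmatch(page_id, pattern):
            owner = team
    return owner


def can_approve(page_id: str, approver_team: str) -> bool:
    """A change to a page needs sign-off from the team that owns it."""
    return owner_of(page_id) == approver_team
```

With rules like these, a diff to `policy/refunds` routes to finance and a diff to `policy/hr/leave` routes to HR, and neither team has to read the other's pages.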
Epistemic status a prompt can’t carry
- • Current — the live rule.
- • Deprecated but binding — no longer default, still governs old contracts.
- • Genuinely contested — two domains disagree; the disagreement is recorded, not flattened.
- • Tried and abandoned, with reasons — so every new agent and new hire stops re-litigating a settled failure.
A flat prompt can only hold “the current rule.” A wiki page holds its own status — and “deprecated but visible” is the one that quietly saves the most time, because it stops the organisation re-deciding things it already decided.
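The four statuses map naturally onto an enumeration. The sketch below is one possible reading of how an agent might treat them; the `usable_as_rule` function and its handling of legacy contracts are assumptions, and it does not model conflicts between a current page and a deprecated one.

```python
from enum import Enum


class Status(Enum):
    CURRENT = "current"
    DEPRECATED_BINDING = "deprecated but binding"
    CONTESTED = "genuinely contested"
    ABANDONED = "tried and abandoned"


def usable_as_rule(status: Status, *, legacy_contract: bool) -> bool:
    """Decide whether a page may be applied as the governing rule."""
    if status is Status.CURRENT:
        return True
    if status is Status.DEPRECATED_BINDING:
        # No longer the default, but still governs old contracts.
        return legacy_contract
    # Contested and abandoned pages inform the agent; they do not decide.
    return False
```

The value of the last two statuses is that they stay visible. The agent can read why something was abandoned without ever applying it.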
Repair moves to user-land: show me the bits you read
Here is where the provenance from Chapter 7 pays off as a whole operating loop. Because every wiki-grounded answer can show the pages it walked, repair becomes something a domain expert can do without a prompt-engineering ticket. Wrong answer → open the citation → read the plain-English page the agent relied on → the expert who owns that page edits it → every agent in the fleet behaves differently on the very next call. Contest, trace, correct, propagate — in the language of documents, which is a language organisations already know how to govern.
“Show me the bits you read” drags AI repair out of the black box and back into user-land — where a person who understands the domain, not the model, fixes what the machine believes.
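The loop (contest, trace, correct, propagate) can be sketched with an answer function that returns its citations. The wiki content, page id, and `answer` function are hypothetical; a real agent would generate text instead of concatenating pages.

```python
# Hypothetical one-page wiki; the rule text is a placeholder.
wiki = {"policy/refunds": "Placeholder: old wording of the refund rule."}


def answer(question: str, wiki: dict, page_ids: list[str]) -> dict:
    """Return the answer together with the pages it was built from."""
    cited = [pid for pid in page_ids if pid in wiki]
    text = " ".join(wiki[pid] for pid in cited)
    return {"answer": text, "citations": cited}


# 1. A wrong answer comes back with its citation attached.
before = answer("Can I refund this?", wiki, ["policy/refunds"])

# 2. The page owner opens the cited page and corrects it in plain English.
wiki[before["citations"][0]] = "Placeholder: corrected wording of the refund rule."

# 3. The very next call reflects the edit, with no prompt change.
after = answer("Can I refund this?", wiki, ["policy/refunds"])
```

The citation is what makes step 2 possible: the expert is told which page to open and never has to touch the prompt.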
One boundary to keep clean: this loop governs beliefs — what the AI knows and why — not actions. Stopping an agent from doing something it shouldn’t at the moment it acts is a runtime-authority problem, a separate discipline with its own machinery. The wiki repair loop and runtime authority are siblings — one owns beliefs, the other owns actions — and confusing them is how governance ends up covering neither.
The honest boundary
Fit, not hype, so mark the edge of the claim clearly. Closed-world, tool-shaped agents — a CI fixer, a form-filler, anything whose entire world lives in its APIs — do fine without a substrate; there’s no compiled worldview to miss, because the difficulty was never context-depth. The wiki claim holds wherever difficulty is context-depth wearing an intelligence costume, which is most of the interesting work, but not all of it. And the burden doesn’t vanish when you move it; it changes character. Prompt drama becomes knowledge governance — you trade a thousand tuning iterations for the ongoing work of curating pages, statuses, and owners. That’s real work. The difference is direction: the old work evaporated on every model release, and the new work compounds and has a name on it.
The thesis, stated flat
Prompt engineering spends effort on the perishable layer — behaviour-coupled, unowned, invalidated every release. Wiki curation spends the same effort on the compounding layer — declarative, owned, and appreciating. Same budget, opposite direction.
That completes the case. Part I refactored the kernel; Part II showed the myopia the refactor cures; Part III placed the substrate, replaced the mega-prompt, and moved the work from the perishable layer to the compounding one. All that’s left is to put it together — to see that all four problems were one problem, and to name the concrete first moves — which is the final chapter.
Standing in the Field
Four problems, one refactor. The exciting artefact — the frontier model, the hand-perfected kernel — matters less than the unglamorous substrate underneath it. Here’s how to start standing in the field.
TL;DR
- •Every problem in this book was the monolith from a different corner: three drifting kernels, vendor myopia, the missing memory tier, the frozen prompt, the perishable tuning. One refactor answers all of them.
- •The move is small to start: fix a tiny static core, compile the pages one high-frequency task touches, give the agent a bootloader and walk tools, regenerate any kernel.md as a build output.
- •The frontier model keeps the terminal judgement. The wiki is what it, and every cheap reader beneath it, reads from. Own the field.
I’ll close where the source conversation for this book closed — with me telling the powerful frontier model I was talking to that it wasn’t the important release. The one that changed everything for me was the boring one: the cheap, fast, cached model that made compiling a whole worldview affordable. The exciting artefact mattered less than the dull substrate it ran against. This whole book is that same demotion, applied to the kernel. The hand-perfected kernel document was the exciting artefact. The field you stand in — the graph — is the dull substrate that turned out to matter more.
One monolith, four corners
Step back and the four parts collapse into one claim seen from four angles. Each was the monolith wearing a different disguise, and each dissolves in the same refactor.
The same problem, four times
- Part I — three kernels drifting: the monolith as a document you clone per task shape, because it can’t be queried.
- Part II — every copilot is myopic: the monolith as a fragment you rent, stranded on the wrong side of a silo boundary.
- Part III — the missing memory tier: the monolith as a worldview crammed into weights and windows instead of a durable, owned layer.
- Part III — the photograph & the perishable layer: the monolith as a mega-prompt you re-tune forever, its knowledge frozen and its calibration invalidated each release.
The refactor is one move in every case: stop maintaining a document; stand up a graph and boot into it. Decompose, make navigable, recompose — at read time, per task, from wherever you’re standing.
How to start — the build order
None of this requires boiling the ocean, and the fastest way to stall is to try to compile your entire world before shipping anything. The move is incremental, and it looks like this.
Six steps to standing in the field
- Fix the boot ROM. Write the tiny static core: identity, North Star, schema, toolbelt conventions, and how to walk the wiki. Nothing that changes; nothing task-specific.
- Pick one high-frequency, context-shaped task. Triage, a specific recurring report, one kind of proposal. Frequency is what makes the compile pay back.
- Compile just the pages that task touches. The smallest wiki that covers it — claims and edges, not the whole corpus. Resist the urge to model everything.
- Give the agent a bootloader and walk/read tools. Let it enter from any page and demand-page the rest by following edges.
- Regenerate any flat kernel.md a tool wants as a build output. Walk the closure, flatten, emit, and never hand-edit the output.
- Add epistemic status and “show the pages you read.” Give each page a status; surface the citations, so repair lands in user-land from day one.
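The last three steps can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the in-memory WIKI dictionary, the page names, the status labels, and the functions read_page, walk, closure and build_kernel are all hypothetical, invented here to show the shape of the move. A real wiki would live in files or a store, and the walk would likely carry a budget.

```python
from collections import deque

# Hypothetical in-memory wiki: page id -> epistemic status, body, outgoing edges.
# Page names and statuses are illustrative assumptions, not a real schema.
WIKI = {
    "boot": {"status": "settled", "body": "Identity, North Star, how to walk the wiki.", "edges": ["triage"]},
    "triage": {"status": "settled", "body": "How inbound requests are triaged.", "edges": ["pricing", "tone"]},
    "pricing": {"status": "provisional", "body": "Current pricing rules.", "edges": []},
    "tone": {"status": "settled", "body": "Voice for client replies.", "edges": ["boot"]},
    "recruiting": {"status": "draft", "body": "Not reachable from the triage entry.", "edges": []},
}


def read_page(page_id):
    """Read tool: return a single page by id."""
    return WIKI[page_id]


def walk(page_id):
    """Walk tool: return the ids this page links to."""
    return list(WIKI[page_id]["edges"])


def closure(entry_ids):
    """Breadth-first closure of every page reachable from the entry pages."""
    seen, order, queue = set(), [], deque(entry_ids)
    while queue:
        page_id = queue.popleft()
        if page_id in seen:
            continue  # cycles are normal in a wiki; visit each page once
        seen.add(page_id)
        order.append(page_id)
        queue.extend(walk(page_id))
    return order


def build_kernel(entry_ids):
    """Walk the closure, flatten, emit. The result is a build output."""
    pages = closure(entry_ids)
    lines = ["<!-- GENERATED: do not hand-edit; change the wiki pages and rebuild -->"]
    for page_id in pages:
        page = read_page(page_id)
        lines.append(f"## {page_id} [{page['status']}]")  # status travels with the page
        lines.append(page["body"])
    lines.append("Pages read: " + ", ".join(pages))  # "show the pages you read"
    return "\n".join(lines), pages


kernel_md, pages_read = build_kernel(["boot"])
print(kernel_md)
```

The point of the design is the direction of flow: the flat file is derived from the graph, so a fix made on a page shows up in every kernel.md rebuilt from it, and the list of pages read tells you which page to repair when an output is wrong.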
Then do it again for the next task, sharing pages as they overlap. The graph grows by accretion, each task reusing what the last one compiled, and the moment two tasks share a page you’ve stopped maintaining two things and started maintaining one.
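The accretion claim is simple set arithmetic, and it can be made concrete. The sketch below assumes two hypothetical tasks and made-up page sets; accretion_report is an illustrative helper, not part of any tool named in this book. It compares maintaining one shared graph against maintaining a cloned document per task.

```python
# Hypothetical page sets compiled for two tasks; the names are illustrative only.
task_pages = {
    "triage": {"boot", "triage", "pricing", "tone"},
    "proposal": {"boot", "proposal", "pricing", "tone", "case-studies"},
}


def accretion_report(task_pages):
    """Compare one shared graph against one cloned document per task."""
    page_sets = list(task_pages.values())
    all_pages = set().union(*page_sets)       # each page maintained once
    shared = set.intersection(*page_sets)     # pages every task reuses
    cloned = sum(len(pages) for pages in page_sets)  # copies if each task had its own document
    return {
        "maintained_once": len(all_pages),
        "if_cloned": cloned,
        "shared": sorted(shared),
    }


report = accretion_report(task_pages)
print(report)
```

With these assumed inputs the graph holds six pages where two cloned documents would hold nine copies, and the three shared pages are exactly the places where drift could otherwise have started.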
What stays the frontier model’s job
This is not an argument that intelligence stopped mattering — it’s an argument about where to spend it. The cheap layers read; the frontier model judges; and the wiki is what both read from. That division is the spine of the two field guides that precede this one. One split a single agent along a time seam — a cheap scout that explores, a frontier senior that decides. The other put a price on it — a utility model plus a compiled worldview capturing the spread the market keeps paying for cognition. The frontier model keeps the highest-leverage tokens — the terminal judgement, the design conversation, the gardener pass that adds the edges the readers will later walk. What this book adds is the object all of that reads from: the field itself.
And the field compounds, which is the quiet reason to start now rather than wait. Every task you compile makes the next one cheaper and the whole graph richer — the audience for what you build is you and your next agent, and each turn of that loop raises the floor. The kernel you were nursing depreciated the day the model changed; the field you stand in appreciates every time you use it.
The kernel was never a document you perfect. It’s a field you stand in and recompose from — already decomposed, entered from anywhere, paged in on demand. Stop maintaining the monolith. Own the field.
Build the field, not another monolith
If your agents keep needing bigger prompts, your copilots keep answering the wrong question, and every model release resets your tuning — you don’t have a model problem. You have a substrate problem, and the fix is buildable today. That’s the work we do at LeverageAI: compile the worldview onto your side of the boundary, demote the vendors to connectors, and put your effort on the layer that compounds.
Read more at leverageai.com.au — or start with the two field guides this one builds on: The Scout and the Senior and Context Arbitrage.
References & Sources
The evidence base behind every claim — primary research, industry analysis, and technical specifications
Research Methodology
This ebook draws on primary research from standards bodies, independent research firms, enterprise technology vendors, and consulting firms. Statistics cited throughout have been cross-referenced against primary sources.
Frameworks and interpretive analysis developed by Scott Farrell / LeverageAI are listed separately below — these represent the practitioner lens through which external research is interpreted, and are not cited inline to avoid self-promotional appearance.
LeverageAI / Scott Farrell — Practitioner Frameworks
The interpretive frameworks, architectural patterns, and practitioner analysis in this ebook were developed through enterprise AI transformation consulting. The articles below are the underlying thinking behind those frameworks. They are listed here for transparency and further exploration — not cited inline, as this is the author's own analytical voice.
Scott Farrell — Stop Nursing Your AI Outputs: Nuke Them and Regenerate
the durable asset is the generation recipe (the kernel), not the output; once patching cost exceeds regeneration cost, regenerate from a fixed kernel
https://leverageai.com.au/stop-nursing-your-ai-outputs-nuke-them-and-regenerate/
Scott Farrell — Worldview Recursive Compression
a person's accumulated knowledge and decisions compile into reusable frameworks and an AI operating system; feeding outputs back into the substrate compounds reasoning quality over time
https://leverageai.com.au/worldview-recursive-compression-how-to-better-encompass-your-worldview-with-ai/
Scott Farrell — The Index Is the Data: How a Self-Cleaning Wiki-Graph Out-Thinks RAG
pre-process the corpus off-cycle into a self-maintaining markdown wiki-graph of claims and edges, baking intelligence into structure before any question is asked
https://leverageai.com.au/the-index-is-the-data-how-a-self-cleaning-wiki-graph-out-thinks-rag/
Scott Farrell — Context Engineering: Why Building AI Agents Feels Like Programming on a VIC-20 Again
manage the LLM context window like OS memory: context is attention, not just capacity, so load only what the current task demands and treat the window as a tiered memory hierarchy
https://leverageai.com.au/context-engineering-why-building-ai-agents-feels-like-programming-on-a-vic-20-again/
Scott Farrell — The North Star Prompt: Stop Writing Specs for Models That Can Think
give a capable model orientation and intent rather than exhaustive specification; what stays resident is direction, not knowledge
https://leverageai.com.au/the-north-star-prompt-stop-writing-specs-for-models-that-can-think/
Scott Farrell — A Blueprint for Future Software Teams
composable, canon-driven architecture lets AI agents reason over small navigable units and recompose them, rather than holding one monolith in context
https://leverageai.com.au/a-blueprint-for-future-software-teams/
Scott Farrell — Compliance Cosplay: Why AI Governance Without Runtime Authority Is Theatre
scoping and enforcement are governance concerns handled at an authority boundary, distinct from the knowledge architecture beneath
https://leverageai.com.au/compliance-cosplay-why-ai-governance-without-runtime-authority-is-theatre/
Scott Farrell — Context Arbitrage: Turn Intelligence from Opex into Capex
a compiled worldview lets a utility-class model outperform a frontier model without one; judgement is a diff and the wiki is what you diff against
https://leverageai.com.au/context-arbitrage-turn-intelligence-from-opex-into-capex/
Scott Farrell — The Terminal Value Doctrine: Stop Optimising the Horse
select AI investments by whether they defend terminal value; the compiled worldview is a terminal-value asset, and renting it back from a vendor is maximum lock-in on the thing you must own
https://leverageai.com.au/the-terminal-value-doctrine-stop-optimising-the-horse/
Scott Farrell — Don't Buy Software. Don't Hire Experts. Build AI Instead.
AI-era economics make the specification the appreciating asset while generic vendor product depreciates; own the durable artefact
https://leverageai.com.au/dont-buy-software-dont-hire-experts-build-ai-instead/
Scott Farrell — RAG Was Built for Chatbots, Agents Need a Wiki
retrieval fetches raw nearby text (a cache); a wiki-graph adds edges and structure ahead of time (a compile), which is what agents actually need
https://leverageai.com.au/rag-was-built-for-chatbots-agents-need-a-wiki/
Scott Farrell — The Model Is Not the Memory: Why Governable AI Needs a Wiki, Not Just RAG
governable AI needs durable, auditable knowledge paths in a wiki rather than opaque model or retrieval memory; the governance sibling to the architecture argument
https://leverageai.com.au/the-model-is-not-the-memory-why-governable-ai-needs-a-wiki-not-just-rag/
Scott Farrell — The Cognition Supply Chain: From Search to Compounding Agentic Cognition
model capability is overrated for domain work; context architecture is the binding constraint, and a compounding substrate beats per-agent prompt tuning
https://leverageai.com.au/the-cognition-supply-chain-from-search-to-compounding-agentic-cognition/
Scott Farrell — The Scout and the Senior: Swap the Brain, Keep the Transcript
split an agent along a time seam so a cheap cached scout explores read-only and a frontier senior emits one governed decision; the Model Barbell allocates cheapest and smartest with nothing between
https://leverageai.com.au/the-scout-and-the-senior-swap-the-brain-keep-the-transcript/
Scott Farrell — The AI Learning Flywheel: 10X Your Capabilities in 6 Months
consistent engagement compounds capability; each pass feeds the substrate and raises the baseline of what good looks like
https://leverageai.com.au/the-ai-learning-flywheel-10x-your-capabilities-in-6-months/
Industry Analysis & Vendor Research
Anthropic — Introducing the Model Context Protocol [1]
open standard connecting AI assistants to external systems (email, drives, business tools) through deterministic connectors; the hands layer, with no compiled model of the user's world
https://www.anthropic.com/news/model-context-protocol
Model Context Protocol — Model Context Protocol Specification [2]
the primitives an MCP server exposes are deterministic operations over one system; a cross-source compiled worldview lives in the client, not the connector
https://modelcontextprotocol.io/
About This Reference List
Compiled July 2026. All URLs verified at time of compilation. Regulatory documents and standards specifications are subject to revision — check primary sources for the most current versions.
Some links to academic papers and vendor research may require free registration. Government and standards body publications are freely accessible.