Stop Nursing Your AI Outputs
Why the Recipe is the Real Asset
The economics of ephemeral outputs and durable kernels in AI-assisted systems
By Scott Farrell
What You'll Learn
- ✓ Why regeneration is now cheaper than maintenance — and what that changes
- ✓ The Brand Kernel concept — the durable recipe that encodes your judgment
- ✓ Two-stage compilation — first compile worldview, then apply to contexts
- ✓ How model upgrades become free value when outputs are ephemeral
Part I
The Ephemeral Economics Doctrine
Chapters 1–3
The Inversion: Why Regeneration Beats Patching
I built a proposal compiler that worked. Then I spent three days making it worse—one patch at a time.
Every builder knows this feeling: the output works at 7 out of 10, you "just need to tweak a few things." Twelve patches later, you're not sure what the system does anymore. New team members ask "why is this like this?" and you don't have a good answer.
This isn't a story about incompetence. It's about operating with muscle memory from a world that no longer exists—a world where patching was rational and regeneration was expensive. That world ended somewhere between 2024 and 2025, but our habits haven't caught up yet.
The Pre-AI World: Patching Was Rational
For fifty-plus years of software development, the economics were clear: rebuilding from scratch required expensive human labour. Design documents were afterthoughts describing existing code. Incremental improvement was the only economical option. "Don't touch what works; just add to it" wasn't just caution—it was economic common sense.
This shaped everything about how we work:
- Treat outputs as precious because they took effort
- Code review focused on the code, not the design document
- Proposal feedback went into the proposal, not the framework
- Workflow improvements were patched into workflows, not blueprints
The 2024-2025 Inflection Point
Something fundamental changed. AI regeneration crossed the "cheap enough" threshold. What used to take weeks of expensive human labour now takes minutes of cheap computational power.
Agentic coding tools like Claude Code and Cursor make full system regeneration practical, not theoretical. The capability frontier has moved: regenerated systems are often higher quality than accumulated patches, and each regeneration benefits from model improvements automatically.
"The rise of AI coding tools in 2024 and 2025 mirrors the early days of cloud migration. Just as physical servers gave way to Infrastructure-as-Code, software development is shifting toward AI-assisted and AI-native creation."— Matt Baldwin, "The Premise: Code Is Ephemeral"
McKinsey's research validates this shift with hard numbers: organisations using AI regeneration approaches are seeing cost reductions of over 50% in modernisation projects.
"Code has always been temporary. We deploy it, replace it and rewrite it sometimes within hours or days. Yet most organizations still treat it like a permanent asset, investing in its preservation instead of its adaptability."— Matt Baldwin, "The Premise: Code Is Ephemeral"
The Evidence: What 153M+ Lines of Code Reveal
GitClear's landmark research analysed over 153 million lines of code across thousands of repositories, tracking what happened as AI coding assistants became mainstream. The findings are stark.
Code churn—the percentage of lines that are reverted or updated less than two weeks after being authored—doubled from 2021 to 2024. This isn't a small uptick. It's a fundamental shift in how code is being written and maintained.
The GitClear Numbers
| Metric | Change |
|---|---|
| Code churn | 2x increase (2021-2024) |
| Code cloning | 4x increase |
| Copy/paste vs moved code | Copy/paste exceeds moved code for the first time |
| AI duplication rate | 2-3x higher than human-written code |
Source: GitClear, "Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality"
The alarming patterns are clear: a 4x increase in code cloning, copy/paste now exceeding moved code for the first time in history, and refactoring activity in commit histories dropping sharply. GitClear summed it up well: AI-generated code resembles "an itinerant contributor, prone to violate the DRY-ness of the repos visited."
But here's the crucial interpretation: this isn't evidence against AI. It's evidence that we're using AI wrong. We're generating outputs and then nursing them, patch after patch. Every iteration adds to the complexity debt. The right response isn't to stop using AI—it's to fix the specification upstream and regenerate clean.
The 100x Cost Multiplier
A Forbes article on technical debt revealed a number that should stop every engineering leader cold: bugs fixed at the planning or specification stage cost around $100 to resolve. The same bug, left as technical debt until production, costs around $10,000.
That's a 100x cost multiplier from specification to production patching.
"If software bugs are addressed early in the SDLC, such as in the planning stage, they can cost as little as $100 to fix. However, if that same bug is left in the system as technical debt, the cost to fix it can escalate to $10,000."— Software Improvement Group
Why does this multiplier exist? Early fixes are about intent: "What do we actually want?" Late fixes are archaeology: "Why does this behave this way?" Each patch obscures the original design intent further. New patches must work around previous patches, creating tangled dependencies that no one fully understands.
The aggregate cost is staggering. Technical debt costs US businesses $2.41 trillion annually. McKinsey estimates that technical debt can amount to as much as 40% of a company's technology estate. Over half of businesses now spend a quarter or more of their IT budgets just managing debt.
This isn't just a sustainability problem. It's a compounding crisis.
Why Habits Haven't Caught Up
The economics have inverted, the evidence is clear, and yet teams everywhere are still nursing AI outputs like precious artifacts. Why?
The sunk cost trap is powerful. "I invested effort in this output, so it must be valuable." People conflate effort to create with value of the artifact. The output is tangible, reviewable, something you can point to. The recipe—the specification, the design intent—feels abstract, harder to justify.
Then there's institutional inertia. Traditional software development treated code as the artifact. Entire processes were built around preserving and extending code. Code review, testing, deployment—all code-centric. Design documents were relegated to afterthought status, updated (if at all) long after the code was written.
And historically, this made sense. Before cheap AI regeneration, patching was the only economical option. The habit was rational. It just isn't anymore. But rational habits forged over decades are the hardest to break.
The Anti-Pattern Named: "Hacking on the Output Too Many Times"
Let's give this behaviour a name so we can recognise it when it happens: "hacking on the output too many times."
Here's what it looks like in practice:
Patch 1-2
Reasonable fixes, low overhead. Small tweaks to get the output working in context. This is fine.
Patch 3-5
Judgment is accumulating in the output instead of the recipe. You're encoding "how things should be" in patches rather than updating the upstream specification. Warning signs appear.
Patch 6+
You're nursing a patient that should have been regenerated. The system works, but no one knows why. Fear of touching legacy patches sets in. You've crossed the threshold.
The diagnostic questions that reveal this anti-pattern:
- "Why is this like this?" → No one can answer without archaeological investigation
- "Can we change X?" → "Careful, that breaks the workaround for Y"
- "How do I add a feature?" → "First, understand these 12 historical patches"
The emotional experience is unmistakable: fear of touching legacy patches, inconsistencies between parts of the system, defensive coding to avoid breaking unknown dependencies, new team members getting warnings like "don't touch that file."
Even knowing this pattern intellectually, the pull toward patching is strong. When I built my proposal compiler and forgot to encode my frameworks upstream, I iterated on the output for days before recognising what I was doing. The confession in my notes: "I've iterated on it now a few times and it's probably okay. But I'm just saying for next time how I should run it."
Even experts fall into this trap. The muscle memory runs deep.
The New Economics in Plain Terms
Here's the inversion, laid out clearly:
The Inversion Table
| Dimension | Old Pattern | New Pattern |
|---|---|---|
| Primary artifact | Code / output | Design document / recipe |
| Documentation role | Afterthought describing output | Input to generation |
| Source of truth | What the output does | What the spec specifies |
| When they diverge | Update documentation to match output | Fix spec, regenerate output |
| Bug response | Patch the output | Fix the recipe, regenerate |
Source: "A Blueprint for Future Software Teams" (LeverageAI)
The economic logic is straightforward once you see it:
- Human judgment is expensive to encode, but once encoded, can be applied infinitely at near-zero marginal cost
- AI regeneration is cheap and getting cheaper with every model release
- Therefore: invest your effort in encoding judgment (specifications, design documents, frameworks), and let AI handle regeneration
"Now intelligence is cheap, and as a result good judgement is 100x more valuable."— Nate's Newsletter, "Good Judgement is a Million Dollar Skill"
This is the fundamental shift. Intelligence—the ability to follow instructions, generate variations, apply patterns—has become commodity infrastructure. Judgment—knowing what to build, why it matters, what to reject, how to navigate trade-offs—remains scarce and expensive.
The new pattern invests heavily in encoding that judgment into durable specifications and lets AI handle the cheap part: turning specifications into outputs.
What This Chapter Establishes
Key Takeaways
- • The economics of artifact maintenance have inverted (2024-2025)
- • Regeneration is now cheaper than accumulated patching
- • But human judgment remains expensive—that's where value lives
- • The evidence is clear: 153M+ lines show the patching approach creates debt
- • The 100x cost multiplier makes spec-first dramatically cheaper
- • Our habits haven't caught up—we still treat outputs as precious
If outputs are ephemeral, what's durable? The answer: the recipe—the "brand kernel" that encodes your judgment. In the next chapter, we'll explore what goes in that recipe, and why it compounds.
The Brand Kernel
The durable asset is the kernel — the set of meta-files that encode your worldview and judgment.
In Chapter 1, we established that outputs should be treated as regenerable — code, proposals, workflows are ephemeral artifacts that can be recreated on demand. But regeneration needs a source. Something must persist between iterations, something that captures not just what was built, but why it was built that way.
What is that thing? And why does it matter so much?
The answer: the brand kernel — the set of meta-files (marketing.md, frameworks.md, constraints.md, style.md) that encode your worldview and judgment. AI can cheaply regenerate artifacts, but human judgment is expensive to re-encode. The kernel is where human judgment lives.
What a Brand Kernel Contains
The brand kernel comprises four canonical files, each encoding a different dimension of your expertise:
marketing.md — Who You Are
Your positioning, your audience, your differentiation. The "why us" that should permeate every output.
Not marketing copy — marketing DNA.
frameworks.md — Your Thinking Tools
The diagnostic sequences you use, the mental models you apply, the patterns you recognize and the patterns you reject.
This is your compressed strategic judgment.
constraints.md — What You Never Recommend
The anti-patterns you avoid, the projects you won't take, the approaches that don't fit your risk posture.
Your "no-go zones" are as valuable as your expertise.
style.md — How You Communicate
Voice and tone guidelines, document structure patterns, what "done" looks like for your outputs.
Consistency that scales without micromanagement.
Beyond the Core: Meta-Scaffolding
The four canonical files form the minimum viable kernel. But sophisticated AI-assisted systems often expand the scaffolding to include:
- • patterns.md — Your go-to solution shapes and implementation approaches
- • anti_patterns.md — What you actively avoid and why
- • diagnostic_flow.md — The standard questions you ask about any business
- • engagement_shapes.md — Preferred implementation patterns like "pilot + internal champion"
Why call this "meta-scaffolding"? Because these files exist before any client work. They're the scaffolding your outputs are built on. Without them, every output starts from scratch. With them, every output inherits your accumulated judgment.
"You need to start with some very macro meta sort of Markdown files before you even start."
Why Kernels Are Expensive (And Why That's the Point)
Human judgment costs time to encode. Frameworks are "compressed judgment in language," and building them requires:
What Makes Kernel Creation Hard
- • Articulating tacit knowledge ("I know it when I see it" → explicit rules)
- • Resolving contradictions in your own thinking
- • Testing against edge cases and revising
- • Distilling years of experience into compact form
The Economics of Judgment
- • Intelligence = Commodity: AI provides cheap, scalable intelligence
- • Judgment = Scarce: humans encode experience into frameworks
- • Result = 100x Value: good judgment is now exponentially more valuable
"Now intelligence is cheap, and as a result good judgement is 100x more valuable." — Good Judgement is a Million Dollar Skill in the Age of AI
The investment calculation is straightforward once you understand the asymmetry:
- • Time spent encoding judgment into the kernel: substantial, but paid once
- • Value extracted from that judgment: captured on every generation
- • Time spent re-explaining judgment per output (after encoding): near zero
Why Kernels Compound
Once encoded, AI applies your judgment infinitely. One hour refining frameworks.md improves every future output. One insight captured in constraints.md prevents mistakes forever.
This creates a flywheel:
- 1. You do work → capture insight → update kernel. Every project reveals patterns worth encoding.
- 2. Next output benefits → reveals new insight → update kernel. Each iteration strengthens the kernel.
- 3. Compounding begins — each output gets better "for free". The kernel becomes an appreciating asset.
The Recipe Is the IP, the Bread Is Today's Product
Think of your AI workflow like a bakery. Bakeries don't guard loaves — they guard recipes. The loaf is consumed; the recipe persists.
Applied to AI-Assisted Systems
- • Your proposals are loaves: consumed by clients, specific to context
- • Your kernel is the recipe: makes infinite loaves, improves with use
- • Your Markdown files are the IP: the real asset that compounds over time
Investing in loaf quality (output polishing) delivers limited returns — each polished output benefits only that one client. Investing in recipe quality (kernel improvement) delivers compounding returns — every kernel improvement benefits all future outputs.
The business implication: your Markdown files become the IP. The real asset isn't what you shipped yesterday. It's what lets you ship again, better, tomorrow.
Kernel vs Output: Where to Invest Your 80%
Right now, most teams misallocate effort: 80% on output polish, 20% on recipe refinement. They over-invest in visible outputs because outputs are reviewable and demonstrable. The kernel feels abstract.
The recommended inversion: 80% effort on recipe refinement, 20% on output polish. Why? Because recipe improvements benefit all future outputs. Spending 10% extra time in recipe maintenance saves 40% time on every future generation.
| Activity | Current Allocation | Recommended |
|---|---|---|
| Output polish (patching, refinement) | 80% | 20% |
| Recipe refinement (kernel updates) | 20% | 80% |
What does "investing in kernel" look like in practice?
The Investment Shift
🛑 Stop Doing
- • Polishing individual outputs excessively
- • Patching without capturing the insight
- • Treating each project as standalone
- • Making one-off decisions repeatedly
✅ Start Doing
- • Spend first hour of projects reviewing kernel
- • After each project: "What did we learn?"
- • Version-control kernel files like source code
- • Treat kernel updates as high-leverage work
Making the Invisible Asset Tangible
Why do outputs feel more real than kernels? Because outputs are concrete — you can point to a proposal, a codebase, a deployed workflow. The kernel is "just documents," abstract and invisible to stakeholders.
The reframe: treat the recipe as the real product and outputs as proof that the product works. The kernel is the product. Outputs are demos. Demos prove the product works; they're not the product itself.
Making the Kernel Tangible
- • Give it a name: "Brand Kernel" or "Strategic Kernel"
- • Put it in version control with clear commit messages
- • Review kernel changes like you'd review code changes
- • Track improvements over time — measure kernel quality, not just output quality
What the Kernel Enables Downstream
A well-maintained brand kernel unlocks several strategic advantages that are impossible with the output-nursing approach:
Consistency at Scale
Every output inherits the same DNA. New team members produce on-brand work immediately. You achieve "brand-consistent mass-customisation."
Shorter, Cleaner Prompts
You don't need to restate the philosophy each time. Reference frameworks by name: "Use the Market Maturity Ladder (see frameworks.md)."
Coherence Across Outputs
Not just "custom" but "distinctly us." File structures mirror your frameworks. Research steps align with your diagnostic sequence. Headings, tone, examples default to your style.
Onboarding Acceleration
New team members read the kernel, not archaeological dig sites. They understand "why" before diving into "what." They can produce aligned outputs from day one.
The Kernel as Governance
The brand kernel enforces architectural constraints, quality standards, risk postures, and brand voice — without requiring constant human oversight. This matters profoundly when working with AI.
AI doesn't have judgment — it applies yours. Without a kernel, AI applies generic internet judgment. With a kernel, AI applies your distilled expertise. The kernel becomes your governance layer.
"Your frameworks are doing compression (hundreds of hours of thinking into clear lenses), selection (filtering out nonsense), and alignment (every part agrees with your view of value, risk, what 'good' looks like)."
Key Takeaways
- 1. The brand kernel is the durable asset — marketing.md + frameworks.md + constraints.md + style.md
- 2. Kernels are expensive to create because human judgment is expensive to encode
- 3. Kernels compound — every improvement benefits all future outputs
- 4. The recipe is the IP, the bread is today's product
- 5. Invest 80% in kernel refinement, 20% in output polish
- 6. Make the kernel tangible: version-control it, review it, name it
Coming Next: Two-Stage Compilation
We know what the kernel is. But how does it get applied to produce outputs?
The answer: two-stage compilation. First compile your worldview into a builder, then the builder compiles outputs. That's Chapter 3.
Two-Stage Compilation: How the Pattern Works
Every effective AI-assisted system is a two-stage compiler — most people only run stage 2.
The Pattern Most People Miss
The common approach looks simple: take a task, give it to AI, get an output, ship. This is "Stage 2" only. You provide task requirements, AI generates an artifact, done.
But there's a fundamental piece missing: the worldview that should shape the output.
Why this matters: Stage 2 alone produces generic outputs shaped by internet averages. Stage 1 + Stage 2 produces "you-shaped" outputs that apply your judgment to specific contexts. The difference is brand coherence, judgment application, and compounding quality.
Stage 1: Compile You
In Stage 1, you encode your worldview into a generator or builder. The generator becomes "biased" toward your philosophy—not in a negative sense, but deliberately aligned with your values, frameworks, and approach.
Stage 1 Inputs & Outputs
Inputs:
- • marketing.md — your positioning and differentiation
- • frameworks.md — your thinking tools and lenses
- • patterns.md — go-to solution shapes
- • anti_patterns.md — what you avoid
- • style.md — how you communicate
- • Maybe a few canonical past outputs as exemplars
Output:
A builder that now thinks in your dialect.
Not a neutral AI—a you-shaped AI, ready to apply your judgment to any specific context.
This is the "value-biased compiler" concept. The builder inherits your philosophy about risk, quality, what counts as "good work," and what to reject. It becomes capable of producing outputs that feel like you wrote them—because it's applying the judgment you've encoded.
Stage 2: Compile Them
In Stage 2, you apply the now-biased builder to specific contexts. The builder brings your worldview; the context brings their situation.
The inputs to Stage 2 are situational: client data, company financials, project requirements, industry specifics, organizational context. The builder ingests their information, applies your frameworks, and generates outputs shaped by both.
The outputs might be proposals, code, content, workflows, reports—any generated artifact. But the critical difference is this:
Stage 2 Without Stage 1
Generic output shaped by internet averages. The AI applies broad patterns but no specific judgment.
Stage 2 With Stage 1
Your-shaped output applied to their context. Not just "custom" but "distinctly us"—brand-consistent, framework-aligned, judgment-encoded.
The Two-Stage Compilation Flow
Stage 1: Compile you. Inputs: the kernel files (marketing.md, frameworks.md, patterns.md, anti_patterns.md, style.md) plus a few canonical exemplars. Output: a builder that thinks in your dialect.
Stage 2: Compile them. Inputs: that biased builder plus the client's context (company data, financials, requirements). Output: proposals, code, content, or workflows shaped by both your worldview and their situation.
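As a rough sketch of how the two stages separate in practice, the snippet below assumes the `load_kernel` helper sketched in the previous chapter and a generic `call_llm` placeholder standing in for whatever model API you use. It illustrates the order of operations, not a reference implementation.

```python
def call_llm(system: str, prompt: str) -> str:
    """Placeholder for a real model call (Anthropic, OpenAI, etc.)."""
    raise NotImplementedError

def stage_1_compile_you(kernel: str) -> str:
    """Stage 1 (done once): fold the kernel into a builder system prompt."""
    return (
        "You are a builder that designs and generates deliverables.\n"
        "Apply the worldview, frameworks, constraints and voice below to "
        "everything you produce.\n\n" + kernel
    )

def stage_2_compile_them(builder_prompt: str, client_context: str) -> str:
    """Stage 2 (done many times): apply the biased builder to one context."""
    task = (
        "Using the frameworks above, produce a proposal for this company, "
        "including the ideas you rejected and why.\n\n" + client_context
    )
    return call_llm(system=builder_prompt, prompt=task)
```

Stage 1 rarely changes; Stage 2 runs on every engagement. Keeping them as separate steps is what lets the kernel compound.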
The Shallow Take vs. Deep Take
The shallow take: "Use design docs before writing code." Everyone nods, nobody changes behavior. Design docs remain afterthoughts.
The deep take: "Every AI-assisted system is a two-stage compiler. Stage 1 compiles your worldview into a generator. Stage 2 applies the generator to specific contexts. Most people are only running Stage 2 and wondering why their outputs lack coherence."
Why the Deep Take Works
- • Compilation is a familiar, powerful metaphor that explains the process
- • Makes the order of operations explicit—you must do Stage 1 before Stage 2
- • Explains the "why"—coherence, compounding quality, brand consistency
Design Documents as Gospel
The industry is shifting. As documented in "A Blueprint for Future Software Teams," the design document is no longer an afterthought. It's the input. The specification. The source of truth.
"Production code still matters. It runs. It's tested. It's deployed. It serves customers. What 'ephemeral' means is: code can be regenerated from design if needed."— A Blueprint for Future Software Teams
The Inversion Table
| Dimension | Old Pattern | New Pattern |
|---|---|---|
| Primary artifact | Code/Output | Design document/Recipe |
| Documentation role | Afterthought describing code | Input to generation |
| Source of truth | What the code does | What the design specifies |
| When they diverge | Update docs to match code | Fix design, regenerate code |
| Bug response | Patch the output | Fix recipe, regenerate |
Source: A Blueprint for Future Software Teams
Spec-Driven Development: The Industry Recognizes the Pattern
As Martin Fowler documents, "Spec-driven development means writing a 'spec' before writing code with AI. The spec becomes the source of truth for the human and the AI."
This isn't just one practitioner's opinion—the industry is converging on this pattern independently. Multiple sources arriving at the same conclusion: "Specs should live longer than the code. Code becomes a by-product of well-written specifications."
The Workflow Contrast
Two Paths Forward
❌ Without Two-Stage Compilation
You → Prompt → AI → Code → Bugs → More prompts → Fixes → Hope
- • Each step is ad-hoc
- • No accumulated judgment
- • Every project starts from scratch
✓ With Two-Stage Compilation
You → Spec → AI → Plan → Tasks → Implementation → Tests (that pass)
- • Structured, predictable process
- • Accumulated judgment in Stage 1
- • Every project benefits from prior learning
The deeper pattern: "Spec-driven development flips the script: write specifications before code. Your spec becomes the single source of truth defining what you're building and why, without dictating how."
Infrastructure as Code: The Same Pattern
This pattern isn't new—it's been proven at massive scale through Infrastructure as Code (IaC). Terraform, AWS CloudFormation, and similar tools have demonstrated the economic and operational advantages of treating configuration as durable and infrastructure instances as ephemeral.
Infrastructure as Code is "the practice of managing and provisioning IT infrastructure using machine-readable configuration files instead of manually configuring hardware." Configuration files are the recipe; infrastructure instances are the ephemeral output.
"When you change certain arguments in a Terraform resource, it can't just update the existing component. It has to destroy the old one and create a new one. It forces replacement."— The Complete Idiot's Guide to Immutable Infrastructure
The declarative approach is key: "You write code that describes your infrastructure, and Terraform makes it happen. It's declarative, meaning you define the end state you want, not the steps to get there."
The kernel describes desired end state. AI figures out how to get there. This is the same pattern scaled across domains.
Why Two Stages, Not One?
A reasonable question: why not just encode everything in a single prompt and generate outputs directly?
The Compression Question
"The context window is tiny compared to 'the world', so what you choose to compress into it is everything." You can't put everything into every prompt. Stage 1 pre-compresses your judgment into a reusable form.
The Separation of Concerns
Stage 1: What do I believe? (stable, slow to change)
Stage 2: What do they need? (varies per project, fast to change)
Mixing them creates confusion and inconsistency.
The Economic Logic
Stage 1 is expensive (encodes judgment) but done once. Stage 2 is cheap (applies encoded judgment) and done many times. You want expensive work done once, cheap work done repeatedly.
Applying the Pattern Beyond Code
This isn't just about software. Any AI-generated system benefits from two-stage compilation. The pattern is domain-agnostic—it's about the economics of encoding judgment once and applying it many times.
- • Proposals: Kernel (methodology + brand) → Builder → Client-specific proposals
- • Content: Kernel (voice + frameworks) → Builder → Articles, ebooks, posts
- • Workflows: Kernel (patterns + constraints) → Builder → Process automation
- • Research: Kernel (questions + methods) → Builder → Investigation outputs
The unifying insight: prior "ephemeral code" coverage focused on software development. This extends the principle to ANY AI-generated system. The economics are the same; the domains vary.
Key Takeaways
- • Two-stage compilation: Stage 1 compiles you, Stage 2 compiles them
- • Most people only run Stage 2—they skip encoding their worldview
- • Without Stage 1, outputs are generic (shaped by internet averages)
- • With Stage 1, outputs are you-shaped (apply your judgment)
- • The industry is converging: spec-driven development, design-as-source
- • Infrastructure as Code proves the pattern at scale: config is durable, instances are ephemeral
- • The pattern applies beyond code—to proposals, content, workflows, research
We've covered the theory: what the kernel is, how two-stage compilation works. Now let's see it in practice. The Marketplace-of-One proposal compiler provides a concrete example of building and using a compiler in the real world.
Part II
The Proposal Compiler Story
Chapters 4–6
Building a Marketplace-of-One Compiler
The Marketplace-of-One is a proposal-generating machine—and the machine itself is just the output of a compilation process.
Part II: The Flagship Story
Why a flagship example: Part I established the doctrine—kernels, two-stage compilation, ephemeral outputs. Now we need to see it in practice: a concrete, end-to-end worked example of building a compiler, warts and all.
Why this example: Fresh, lived experience that demonstrates the full pattern including failure modes. Directly relevant to consultants, agencies, and technical founders. Shows the doctrine working in a non-software domain.
The Marketplace-of-One Strategy Explained
Traditional market segmentation says: pick a niche, build one offer, optimise around averages. The problem? You're customised for averages, not individuals. Your team turns up as if you were built for that customer—but you're only optimised for statistical patterns, not their specific reality.
The Marketplace-of-One strategy inverts this completely:
- Don't pick a niche; pick a company
- Assume everyone wants AI but can't implement it
- Use cheap intelligence to deeply research that one company
- Hand them a speculative, fully customised playbook as your opening line
Why this works now: the cheapness of AI means you can do it on spec. You can produce hundreds or thousands of these. Industrial-scale bespoke.
Marketplace of One vs. Traditional Segmentation
| Traditional Approach | Marketplace-of-One |
|---|---|
| Pick a niche | Pick a company |
| Build one offer for many | Build custom proposal per company |
| Optimise for averages | Optimise for this specific company |
| High cost per customisation | Low cost per company (AI-driven) |
| Focus and efficiency through targeting | Scale and precision through automation |
The 30-Page Opening Wedge
The opening line to the customer is simple: Here's what we think you should do. But it's delivered as a 30-page PDF, fully customised for them.
What's in the proposal:
- Research on the actual people who work there
- Analysis of their company and financial reports
- Specific recommendations using your frameworks
- Ideas you rejected and why—proof you didn't guess
- The research itself demonstrates your AI capability
"The 30-page report is based on our frameworks and our ideas and what we've discovered... we actually write in there what are the ideas we rejected."
The strategic purpose is threefold: prove capability before the first meeting, show that this isn't guesswork, and turn the proposal itself into a demo of what you deliver. The document is both the pitch and the proof.
What the Proposal Compiler Does
The technical flow works like this:
- Ingests company data: website, financials, staff profiles, prior initiatives
- Queries your frameworks and patterns
- Generates structure for the 30-page PDF
- Populates it using your lenses and their context
- Annotates with rejected paths and reasoning
The architecture is deliberately lightweight. Built on a Markdown OS—one Markdown file calls another via chained English instructions. Almost no traditional code: just two small Python programs for data extraction. Most research runs through Tavily. The whole system is built around your frameworks, not around code.
Proposal Compiler Components
Inputs
- • Company data
- • Your frameworks
- • Web research
- • Financial reports
Process
- • Markdown instructions
- • Framework-based analysis
- • Rejection documentation
- • Structure generation
Outputs
- • 30-page custom proposal
- • Annotated rejections
- • Proof of capability
- • Opening wedge artifact
The Markdown OS Approach
What does "Markdown OS" mean? The workflow is defined in Markdown files. One file calls another via instructions. The LLM interprets and executes. Almost zero traditional code.
Why this matters:
- Inspectable: You can read exactly what the workflow does
- Editable: You can change it without writing code
- Regenerable: The Markdown OS itself is an output of a higher-level prompt
- Transparent: The workflow IS the compiler—no hidden logic
The meta realisation: the OS itself is ephemeral; the kernel is durable. The Markdown OS can be rebuilt from its builder prompt. The builder prompt is the real asset.
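To show what "one Markdown file calls another" can look like, here is a hedged sketch of a tiny orchestrator. The `Next:` convention, the file layout, and the injected `call_llm` function are invented for illustration; the real system chains instructions however the builder prompt designed it.

```python
from pathlib import Path

def run_workflow(entry: Path, context: str, call_llm) -> str:
    """Walk a Markdown OS: run each step, then follow its 'Next:' links.

    Assumes each step file contains a line such as
    'Next: 20_research.md, 30_outline.md' naming the files to run next.
    """
    queue = [entry]
    while queue:
        step = queue.pop(0)
        instructions = step.read_text()
        # The LLM interprets the English instructions against the current context.
        context = call_llm(system=instructions, prompt=context)
        for line in instructions.splitlines():
            if line.startswith("Next:"):
                names = [n.strip() for n in line.removeprefix("Next:").split(",")]
                queue.extend(step.parent / name for name in names if name)
    return context
```

The orchestrator itself is trivial and disposable; all of the behaviour lives in the Markdown files it walks, which is exactly why the OS can be regenerated from the builder prompt.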
Initial Quality: 7-8 Out of 10
When the first version ran, it worked. It produced reasonable outputs. It got the company context right, applied the frameworks (mostly), and produced something shippable. Call it 7 or 8 out of 10.
But something was missing:
- Didn't feel "distinctly us"—generic engineer voice, not brand voice
- Structure didn't deeply mirror the frameworks
- Research steps weren't aligned with the diagnostic sequence
- It worked—but felt like anyone could have built it
"It works pretty good. I'd sort of score it 7 out of 8 out of 10. But what I didn't do was some of the things that I've done before that are fairly obvious."
The troubling symptom: the outputs lacked the DNA that makes them recognisably yours. The builder behaved like a generic engineer, not a "you-shaped engineer."
Why This Example Matters
It's not software
Proposals, not code. Shows the pattern generalises beyond development. Relevant to consultants, agencies, advisors—anyone who produces custom deliverables at scale.
It's honest
Includes the mistake, not just the polished outcome. Shows the learning process. More useful than a clean success story because it reveals the failure modes.
It's economically significant
Mass-custom proposals change the economics of business development. Hyper-personalised consulting as a productised workflow. The product isn't consulting—it's a machine that manufactures consulting-grade artifacts on spec.
The Two Layers Revealed
What this example exposes is the fundamental two-layer architecture:
Design-Time vs. Run-Time Architecture
Design-Time Layer
The builder that designs the Markdown OS
- • Takes marketing.md + frameworks.md as input
- • Generates file structure that mirrors your thinking
- • Bakes your DNA into every workflow step
Run-Time Layer
The OS that generates company-specific proposals
- • Takes company data as input
- • Applies your frameworks automatically
- • Produces branded, customised outputs
The initial architecture made a subtle mistake: the generic LLM designed the OS at design-time, then we tried to infuse marketing and frameworks at run-time. Result: generic structure with late-stage customisation.
The better architecture: feed marketing.md and frameworks.md into the builder first. The builder designs the OS with your DNA baked in. Everything downstream inherits: file naming, section structure, tone, diagnostic sequence, examples. This is brand-first compilation.
Key Takeaways
- 1 Marketplace-of-One: Mass-custom proposals on spec, not niche segmentation. Industrial-scale bespoke is now economically viable.
- 2 The 30-page wedge: The proposal is both deliverable and demo of capability. It proves you can do the work before you're hired.
- 3 Rejected paths as proof: The graveyard is social proof of rigour, not waste. Document what you didn't do and why.
- 4 Markdown OS architecture: The compiler is almost no traditional code—just chained instructions that are inspectable and regenerable.
- 5 Quality ceiling at 7-8/10: Good but not distinctive. The outputs lacked brand DNA because the kernel came in too late.
- 6 Ephemeral OS, durable kernel: The Markdown OS itself can be regenerated. The builder prompt is the real asset.
- 7 Two-layer architecture: Design-time builds the builder; run-time applies the builder. Feed the kernel upstream, not downstream.
Coming Up: Chapter 5
We built a working compiler that produced 7-8/10 outputs. But something was wrong—the outputs lacked DNA.
Next: The specific mistake that held quality back. What happens when you compile without the kernel? And how iterating on the output instead of fixing the recipe creates the "hacking on it too many times" anti-pattern.
The Mistake: Compiling Without the Kernel
The builder didn't know the worldview — so the outputs lacked brand DNA.
The Confession
I need to tell you about a mistake I made. Not a small oversight — a fundamental architectural error that I knew better than to make.
I built a proposal compiler for my Marketplace-of-One strategy. One big prompt, comprehensive instructions, designed to build a complete Markdown operating system that would generate 30-page custom proposals for companies on spec.
I hit run.
It worked. The system generated reasonable proposals. Quality was around 7-8 out of 10. Good enough to iterate on. Good enough to ship.
And that's when I realized: I forgot to compile marketing.md and frameworks.md into the builder.
"Mistakes I made in building it. I built up one big large prompt and told it to build it, and it worked pretty good... But what I didn't do was some of the things that I've done before that are fairly obvious."
If someone who writes about these patterns can miss this, anyone can. The pull toward "just build it" is strong. Kernel-first is counter-intuitive even when you intellectually know it's right.
What Went Wrong: The Diagnosis
There are two layers in this system: a design-time layer (the builder that designs the Markdown OS) and a run-time layer (the OS that generates company-specific proposals).
The architecture I actually built injected the kernel at run-time, when generating proposals. The correct architecture feeds the kernel into the builder at design-time. Because I skipped that step, the builder didn't know my worldview when it was designing the OS structure.
Result? The builder behaved like a generic engineer, not a "you-shaped engineer."
Reasonable outputs. Missing the DNA. 7-8 out of 10 instead of 9-10.
What Was Missing Downstream
When you skip the kernel at design-time, specific symptoms appear downstream. Here's what I saw:
File Structure
Didn't mirror the frameworks. Generic folder organisation instead of framework-aligned structure. New team members couldn't navigate by framework logic.
Research Steps
Didn't align with my diagnostic sequence. Generic research questions instead of framework-guided investigation.
Headings, Tone, Examples
Defaulted to generic. Internet-average voice instead of brand voice. Examples were reasonable but not distinctively "us."
Overall Feel
The outputs were custom — but not distinctly us. Anyone could have built this. It worked but didn't compound the brand.
The Iteration Trap
Here's what happened next.
Outputs were 7-8 out of 10. Good enough to iterate on. So I started patching: fix this heading, adjust that section, tweak the research flow.
Each patch improved one thing. Each change was visible, demonstrable. I was making progress.
But overall quality stayed at 7-8 out of 10.
The Patching Paradox
❌ What I Was Doing (The Trap)
- • Patch 1: Fix heading structure
- • Patch 2: Adjust research questions
- • Patch 3: Improve voice consistency
- • Patch 4-7: More incremental fixes...
Result: Complexity increased. Quality ceiling remained. Structural problem persisted.
✓ What I Should Have Done
- • Recognise the structural issue
- • Update the builder prompt with kernel files
- • Regenerate the entire OS fresh
- • Inherit brand DNA from the foundation
Result: Higher quality ceiling. Consistent brand DNA. Scalable architecture.
This is the trap: iteration feels productive. You're fixing visible problems. But when the structural problem is upstream, no amount of downstream patching breaks through the quality ceiling.
"I've iterated on it now a few times and it's probably okay. But I'm just saying for next time how I should run it is do the meta prompting on the builder."
Translation: I was hacking on the output too many times. The right fix was upstream, not downstream.
The Underlying Mechanism
Why did I make this specific mistake? Three reasons:
1. The Kernel Feels Like Documentation
Marketing.md and frameworks.md look like "meta" files. They feel like overhead, like supporting materials. Not like the foundation of the system.
2. "Just Build It" Is Faster in the Short Term
Writing one big prompt and running it feels efficient. Encoding the kernel first feels like a detour. The cost appears later, diffusely — in the quality ceiling, in the iteration trap, in the missing brand coherence.
3. The Model Did Its Best — But Its Best Was Average
I let the system architecture be designed by a model that didn't yet know my worldview, brand, or frameworks. The builder behaved like a competent generic engineer. It produced internet-average outputs.
The insight: Your worldview is what makes outputs better than average. If the builder doesn't know your worldview, the outputs will be average.
Stage 1 isn't optional overhead. It's the whole point.
The Contrast: What Should Have Happened
Here's what the correct architecture looks like:
Better Architecture: Brand-First Compilation
Step 1: Feed marketing.md + frameworks.md + "what we reject" + "how we write" into the builder
Step 2: Ask that now-biased builder to design the Markdown OS
Result: Everything downstream inherits your DNA
What would have been different:
- • File naming reflects framework structure
- • Section structure mirrors diagnostic sequence
- • Tone defaults to your voice
- • Research standards are defined by your criteria for "good enough"
- • Rejected ideas follow your anti-patterns automatically
Why This Pattern Recurs
This isn't just my mistake. It's a pattern I see everywhere in AI-assisted work:
The Universal Anti-Pattern
Surface Symptom:
Teams "hack on the output" of AI systems — iteratively patching code, proposals, or workflows — accumulating cruft and losing alignment with original intent.
Root Cause:
They treat generated artifacts as precious because they invested effort in them, when the effort should be invested upstream in the generation recipe.
Cost of Inaction:
- • Technical debt (in systems)
- • Design drift (in workflows)
- • Quality ceiling (7-8/10 instead of 9-10/10)
- • Each output starts from scratch rather than benefiting from improved recipes
When does this pain spike? If any of these costs feel familiar, you're in the same trap I was.
The Saving Grace
Here's why this wasn't a disaster:
Three Things That Made Recovery Possible
1. The original builder prompt still existed
I hadn't lost the recipe — just used it incorrectly
2. The mistake was identified before too many iterations
I caught it around patch 3-5, not patch 30
3. The fix was clear: update the prompt, not the OS
My prior doctrine provided the language to diagnose the problem
Even in making the mistake, I knew what to do. I'd written about ephemeral code. I'd documented the Design-Compiler pattern. The frameworks I'd built gave me the tools to recognize and fix the error.
"Realistically I should know that myself and my marketplace-of-one build is ephemeral and I should go back, have the original prompt, update the prompt for what I didn't get right and re-run it."
The lesson isn't "don't make mistakes." The lesson is: when you have the doctrine and the original builder prompt, you can recover.
Update the prompt with kernel files. Regenerate. This is the doctrine in action.
Key Takeaways
- 1. The mistake: I forgot to compile the kernel (marketing.md + frameworks.md) into the builder
- 2. The result: The builder designed a generic OS, not a you-shaped OS
- 3. The ceiling: Outputs hit 7-8/10 quality and stayed there despite iteration
- 4. The trap: Patching outputs instead of fixing the recipe compounds complexity without breaking the ceiling
- 5. Symptoms: File structure, research steps, tone — all defaulted to generic because the brand DNA was missing
- 6. Root cause: Treating the kernel as optional overhead instead of foundational architecture
- 7. Saving grace: The original prompt still existed; prior doctrine provided the diagnosis and the fix
We've diagnosed the mistake. Now comes the question: what's the right fix?
Hint: it's not "patch the OS more carefully."
It's: nuke and rebuild.
→ Continue to Chapter 6: The Recovery — Nuke and Rebuild
The Recovery: Nuke and Rebuild
The right fix wasn't to patch more carefully. It was to treat the OS as ephemeral, update the kernel, and regenerate from scratch.
I'd made the mistake. I'd built a proposal compiler without compiling the kernel into the builder first. The outputs hit a 7-8/10 quality ceiling. I'd spent three days iterating, patching, tweaking—and nothing fundamentally changed.
And then I remembered something I'd written before.
The Recognition Moment
In a previous framework, I'd discussed that the code of a normal software project isn't the artifact—it's the prompt that built it.
When you're using agentic coding agents that do the final coding, if you've got rework to do, you don't rework the code every time. You go back to the prompt, the design document, the specification. You describe what you want better. Then you treat the code as ephemeral—delete it and recreate it.
The same principle applies here. My Marketplace-of-One build should be ephemeral too.
The doctrine I'd written about for code applied to the very system I'd just built. The Markdown OS wasn't sacred. It was output. The builder prompt was the real asset.
"I should go back, have the original prompt, update the prompt for what I didn't get right and re-run it and treat the marketplace-of-one markdown operating system as ephemeral as well."
The Right Fix: Rebuild, Don't Patch
The path forward was clear, even if it felt emotionally wrong at first.
❌ What NOT to Do
- • Continue hacking on the OS
- • Add more patches to fix symptoms
- • Try to inject brand DNA into existing structure
- • Keep iterating on outputs
✓ What TO Do
- • Keep the original builder prompt
- • Update it with kernel files (marketing.md, frameworks.md)
- • Delete the old OS
- • Regenerate fresh from updated builder
Why Rebuild Beats Patch
Regeneration isn't just faster—it produces fundamentally better results.
Consistency
Every part of the system inherits the same DNA. No "this file was patched after, so it's different." No gradual drift between components. The entire architecture reflects your worldview from the ground up.
Example: File structure mirrors frameworks automatically. Research steps align with your diagnostic sequence by default.
Auditability
Changes live in the kernel, not scattered across outputs. You can see exactly what worldview the system embodies. New team members read the kernel, not archaeological layers of patches.
Example: "Why does the system do X?" Answer: "Read frameworks.md section 3." Not: "Well, there were 12 patches and..."
Upgradability
New AI models produce better outputs from the same kernel. Your investment in the recipe compounds with every model release. You're not locked to the capability ceiling of when you started.
Example: Claude Sonnet 4.5 reads the same kernel that Sonnet 4 read, but produces 25% more accurate outputs.
Simplicity
One place to make changes (kernel). One regeneration to propagate changes (rebuild). No patch interaction bugs. No "careful, that change might break the workaround."
Example: Update style.md heading guidelines once. Regenerate. All outputs inherit the change.
The Threshold Question
Not every small fix demands a rebuild. But there's a point where accumulated patches signal it's time.
Diagnostic Signals: When Have You Patched Too Many Times?
✓ Acceptable (1-2 patches)
- • Small cosmetic fixes
- • Low overhead adjustments
- • True one-off situations
Response: Patch is fine, but document in kernel if pattern emerges
⚠️ Warning Zone (3-5 patches)
- • Judgment accumulating in output instead of kernel
- • Patches starting to reference other patches
- • Questions like "why is this like this?" emerge
Response: Seriously consider regeneration
🚨 Critical (6+ patches)
- • Fear of touching certain files
- • New features require understanding patch history
- • "Don't touch that file" warnings to new team members
Response: You're nursing a patient that should have been regenerated
The rule of thumb: after 3-5 patches, the crossover point hits. Regeneration becomes cheaper than continued nursing. We'll explore this threshold in detail in the next chapter.
PRs Are Now for Intent, Not Just Edits
The discipline shift goes deeper than just "regenerate when needed." It changes how you think about improvements themselves.
Old PRs vs. New PRs: The Workflow Shift
Old Pattern
PR: "Fix heading format in proposal.md"
Scope: One file, one output
Future outputs: Must remember to fix again manually
Result: Patches accumulate, knowledge scatters
New Pattern
PR: "Update style.md heading guidelines"
Scope: Kernel file (intent layer)
Future outputs: All inherit the fix automatically
Result: Improvement compounds, knowledge centralizes
The workflow becomes:
- Find bug or improvement opportunity in an output
- Ask: Is this a symptom of missing/unclear kernel guidance?
- Update the kernel to prevent it system-wide
- Regenerate to propagate the change
- Verify the bug is gone in all outputs, not just one
PRs become about intent. "I want the system to behave differently" gets captured in a kernel change. All outputs inherit the new intent after regeneration. No manual propagation. No hoping you remember next time.
The Infrastructure as Code Parallel
This pattern isn't new—it's just newly applicable to knowledge work.
| Layer | IaC (Terraform) | AI-Assisted Systems |
|---|---|---|
| Durable | Configuration files (.tf files) | Kernel files + builder prompt |
| Ephemeral | Server instances | Generated OS / code / proposals |
| Update pattern | Update config → destroy old → create new | Update kernel → delete old → regenerate |
| Anti-pattern | SSH into server and patch manually | Edit outputs directly, accumulate patches |
The forcing function in Terraform is built into the tool: certain changes must destroy and recreate. The discipline is automated.
For AI workflows, we don't have the forcing function yet—but we can adopt the discipline manually. Ask yourself: what would it take to make regeneration the default response to bugs, not patching?
The Emotional Resistance
Even knowing the economics favor rebuild, it feels wrong.
"But I put work into this output!"
This is the sunk cost fallacy in action. Outputs are tangible—you can see them, review them, point to them. Rebuilding feels like throwing away work.
But here's the reframe: What if the mistake was investing in the output, not failing to invest in the recipe?
The Work in the Output Is Already Lost Value
Every patch you made to an output only benefits that output. The judgment you exercised—"this heading should be formatted differently," "this section needs more context"—is trapped in that one file.
If you'd encoded that judgment in the kernel instead, it would apply to all future outputs automatically. The work would compound.
The practical relief: Rebuilding is fast—AI does the work. The kernel improvements aren't lost. You're not starting from scratch; you're starting from a better recipe.
"Instead of going back and hacking on it and hacking on it too many times, which is what I've done."— The honest admission
What the Recovery Enables
Once you've rebuilt from an updated kernel, the benefits cascade.
Immediate Benefits
- • OS now has brand DNA baked in at structural level
- • File structure mirrors frameworks automatically
- • Research steps align with your diagnostic sequence
- • Tone defaults to your voice, not internet-average
Ongoing Benefits
- • Every future proposal benefits from the corrected kernel
- • Model upgrades automatically improve all outputs
- • New team members onboard faster—they read kernel, not archaeology
- • Changes propagate cleanly without patch interaction bugs
The Compound Effect
- • This recovery isn't just about this project
- • Every future project using this compiler inherits the fix
- • The kernel improvement is permanent
- • The rebuild cost is one-time; the benefit is ongoing
The Retrospective Insight
The recovery wasn't just about fixing this Marketplace-of-One compiler. It was about encoding a process change for every future system.
Every AI-assisted system should ask three questions:
- Before building: What's my kernel? Is it compiled into the builder?
- During building: Does the structure reflect my worldview, or generic best practices?
- After building: If I find a bug, do I fix the kernel or patch the output?
Key Takeaways
- • The right fix: update kernel, regenerate OS—not "patch OS more carefully"
- • Rebuild beats patch on four dimensions: consistency, auditability, upgradability, simplicity
- • The threshold: 3-5 patches is warning zone; 6+ means rebuild overdue
- • PRs are now for intent: change the kernel, not the outputs directly
- • The IaC parallel: don't patch servers, update config and regenerate—same logic applies
- • Emotional resistance is real, but the economics decisively favor rebuild
- • This one recovery improves every future project using this compiler—the benefit compounds
Bridge to Part III: Applying the Doctrine
We've told the flagship story: build, mistake, recovery. We've established the core pattern—two-stage compilation, brand kernel, ephemeral outputs.
But questions remain: How do we know when to patch vs. regenerate in practice? Where else does this pattern apply beyond proposal compilers? What happens when we upgrade AI models?
Part III explores variants of the same doctrine—content pipelines, the crossover point, model upgrades as free value. The pattern is transferable. Let's see where it leads.
Part III
Applying the Doctrine
Chapters 7–9
The Crossover Point: When to Patch vs. Regenerate
There's a threshold where regeneration becomes cheaper than continued nursing. The question isn't "if" you'll hit it—it's "when," and whether you'll recognize it in time.
Not every change requires a full rebuild. Small isolated fixes are perfectly reasonable: typos, minor bugs, quick adjustments that don't encode judgment or create precedent. These are the exceptions, not the rule.
But here's the danger: "small fixes" accumulate. Each one feels reasonable in isolation. Together they become technical debt. The frog boils slowly.
The Crossover Economics
At roughly 3-5 patches, judgment starts accumulating in the output. That judgment should be in the kernel. Beyond this threshold, you're not maintaining code—you're nursing a patient that should have been regenerated.
| Patches | Status | What's Happening |
|---|---|---|
| 1-2 | Probably fine | Low overhead, isolated fixes |
| 3-5 | Warning zone | Judgment accumulating in output, should move to kernel |
| 6+ | Rebuild territory | Nursing a patient that should be regenerated |
By patch 3, you're making decisions about how the output should behave. Those decisions are judgment. Judgment belongs in the kernel. If it's only in this output, it won't be in the next one.
"The threshold is 3-5 patches; after that, you've encoded enough judgment in the output that it should be in the recipe."
Recognising the Threshold: Warning Signs
Certain symptoms tell you the crossover point has passed. Here are the red flags:
"Why is this like this?" gets archaeological answers
New team members ask questions you can't answer without excavating patch history. "It's like this because we patched X, then patched Y..."
Inconsistencies between parts of the system
"This section uses one format, that section uses another." Patches applied unevenly create patchwork behavior.
Fear of touching legacy patches
"Don't change that file." "That's load-bearing code." Defensive behavior around patch zones.
Patches have started interacting
Fix A breaks fix B. The archaeology problem: understanding the system requires understanding its patch history.
The Migration Cost Reality
Every patch increases rebuild cost. Patches encode decisions. To rebuild, you need to recover those decisions—either re-discover them or lose them. This creates a perverse incentive: the more patches, the more expensive rebuild seems, so you add another patch instead. Death spiral.
The Compounding Trap
Patch 1
Easy to understand. Decision is fresh. Rebuild cost: minimal.
Patch 5
Requires context from patches 1-4. Rebuild cost: growing. Perverse incentive starts.
Patch 10
Archaeological expedition required. Each patch makes the next rebuild more expensive. You're trapped.
"The migration cost grows with every patch; better to start the recipe discipline now."
The Objection: "My System Is Too Complex to Regenerate"
You'll hear this: "My system is too complex to regenerate—it would take forever to get it back to current state."
Here's the truth: This is evidence you've been patching too long.
The complexity you're describing isn't feature complexity; it's patch archaeology. The accumulated decisions, encoded in layers of fixes, create the illusion of irreducible complexity.
The Practical Approach
Start with new outputs
Kernel-first for anything new. Don't replicate the patch pattern.
Gradually rebuild legacy
As legacy outputs need significant work, rebuild from kernel instead of patching.
Don't try to migrate everything at once
Big-bang migrations fail. Incremental rebuilds succeed.
The reframe: "My system is too complex to regenerate" really means "I've accumulated too much un-encoded judgment." The fix isn't to keep patching—it's to start encoding.
The Practical Rule
For Repeated Outputs
Strategy: Kernel-first, always
Why: Every improvement to the kernel benefits all outputs
ROI: The extra upfront investment pays off across repetitions
Proposals, code patterns, content pipelines—anything you'll do more than once.
For True One-Offs
Strategy: Patching is fine
Why: If you'll never generate something similar, no compounding to capture
ROI: Just fix it and move on
Genuinely unique artefacts with no reuse potential.
The Decision Framework
Before you patch, ask these questions:
1. How many patches has this output already received?
2. Am I encoding judgment that should be in the kernel?
3. Will this decision need to be repeated in other outputs?
4. Can I explain this change without referencing prior patches?
The Decision Tree
Patch count > 2?
→ YES: Consider rebuild
→ NO: Continue
Encoding judgment?
→ YES: Update kernel first
→ NO: Continue
Would this help other outputs?
→ YES: Update kernel first
→ NO: Continue
✓ Patch is fine
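The decision tree above is simple enough to write down. Here is a minimal sketch in Python; the thresholds and questions come from this chapter, but the function itself is illustrative, not a prescribed tool:

```python
# Minimal sketch of the patch-vs-rebuild decision tree. Thresholds come from
# this chapter; the structure and names are illustrative only.
from dataclasses import dataclass

@dataclass
class ProposedChange:
    patch_count: int           # patches this output has already received
    encodes_judgment: bool     # does the fix decide how things *should* work?
    helps_other_outputs: bool  # would the same fix improve other outputs?

def decide(change: ProposedChange) -> str:
    if change.patch_count > 2:
        return "consider rebuild from the kernel"
    if change.encodes_judgment or change.helps_other_outputs:
        return "update the kernel first, then regenerate"
    return "patch is fine"

print(decide(ProposedChange(patch_count=1, encodes_judgment=False, helps_other_outputs=False)))
# -> patch is fine
```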
When to Patch
- ✓ Patch count: 1-2
- ✓ No judgment being encoded
- ✓ True one-off fix
- ✓ Isolated, simple correction
When to Rebuild
- ⚠ Patch count: 3+ (especially if trending up)
- ⚠ Fix encodes judgment about how things should work
- ⚠ Same fix would improve other outputs
- ⚠ Explaining the fix requires patch history
The Time Horizon Effect
Your discount rate determines your decision. Short-term thinking favours patching. Medium- and long-term thinking make the kernel investment obvious.
Short-term thinking
Logic: "It's faster to just fix it here"
True timeframe: The next 30 minutes
Hidden cost: Accumulates across all future outputs
Medium-term thinking
Logic: "If I update the kernel, I fix this everywhere"
True timeframe: Next 5-10 outputs
ROI: More effort now, less effort across all future outputs
Long-term thinking
Logic: Patches accumulate complexity; kernels compound quality
True timeframe: The life of the system
ROI: The math is clear; only the discount rate makes patching look attractive
The 10% Rule
What if investing 10% extra time in recipe maintenance saved 40% time on every future generation?
That's the compound interest of kernel-first development. The investment pays for itself within 3-4 outputs, then continues compounding forever.
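As a worked example, assume one output takes 10 hours without a kernel, the kernel itself takes roughly one output's worth of effort to set up, and each generation thereafter costs 40% less plus 10% recipe maintenance. Only the 10%/40% ratio comes from the rule; the rest are illustrative assumptions to make the arithmetic concrete:

```python
# Worked arithmetic for the 10% rule. Only the 10% overhead and 40% saving come
# from the text; the baseline hours and setup cost are illustrative assumptions.
baseline = 10.0                                         # hours per output, no kernel
kernel_setup = 10.0                                     # assumed one-off kernel investment
with_kernel = baseline * (1 - 0.40) + baseline * 0.10   # 7.0 hours per output

cumulative = -kernel_setup
for n in range(1, 7):
    cumulative += baseline - with_kernel                # 3 hours saved per output
    print(f"after output {n}: cumulative saving = {cumulative:+.1f} hours")
```

Under these assumptions the break-even lands between outputs 3 and 4; everything after that is pure compounding.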
Applying This to the Flagship
In the proposal compiler story, patches were accumulating. The system worked at 7-8 out of 10 but wasn't improving. Each patch fixed one thing without elevating overall quality. The crossover point had been passed.
By the time the mistake was recognised, rebuild was already overdue. The patches had encoded judgment—about structure, about format, about what good proposals look like. That judgment belonged in the kernel: marketing.md, frameworks.md, style.md.
This is the pattern to watch for in your own systems. When patches start encoding "how we do things," the crossover point is behind you.
Key Takeaways
- • Not everything needs a rebuild—small isolated fixes are fine
- • The crossover is 3-5 patches; beyond that, you're accumulating un-encoded judgment
- • Warning signs: archaeological explanations, inconsistencies, fear of touching files
- • Every patch increases rebuild cost—the incentives are perverse
- • "Too complex to regenerate" is evidence you've patched too long
- • Kernel-first for repeated outputs, patching OK for true one-offs—but most things aren't one-offs
Next: We've established when to patch vs. regenerate. Now let's see the kernel pattern applied to another domain—content pipelines, where ebooks and articles follow the same doctrine.
Content Pipelines: Kernels for Ebooks and Articles
The same two-stage compilation applies to content production — kernel + context = output.
The Same Pattern, Different Domain
Everything we've discussed about ephemeral code and durable kernels applies beyond software. Content production faces exactly the same economics: regeneration versus patching, specifications versus outputs, kernels versus artifacts.
The mapping is direct:
Software Development
- • Output: Code
- • Recipe: Design document
- • Process: Compiler
- • Quality: Tests, type checking
Content Production
- • Output: Article/Ebook
- • Recipe: Pre-think + Framework library
- • Process: Content generation workflow
- • Quality: Editorial standards, research validation
Many readers produce content, not code. The doctrine transfers completely.
The Ebook Flywheel as Compilation
The content production workflow is a series of compilation stages, each narrowing and refining:
1. World → You: Read, experiment, talk, build mental models
2. You → Frameworks: Articles, checklists, "we reject these because..."
3. Frameworks → Builder: Embed doctrine into workflow
4. Builder → Outputs: Per-topic content generation
5. Outputs → Frameworks: Good outputs feed back into the framework library
Each stage is a compression step: raw thought → structured thought → researched thought → published thought. And crucially, outputs become inputs to future outputs. Articles become frameworks. Frameworks improve future articles.
The Content Flywheel
Idea / Riff
Raw mental model, unstructured thinking
Pre-Think
Scope definition, editorial kill list, reframings
Research
External validation, citations, supporting evidence
Draft / Synthesis
Structure + frameworks + research → article
Published Output
Ebook, article, framework document
Framework Library (feeds next cycle)
Output indexed, searchable, influences future content
"That is now a framework, that next document. I feed that into the framework list, and then... I'm also trying to increasingly give it a list of articles that it can read directly."
What Lives in the Content Kernel
The content kernel is your durable asset. It's where you encode judgment about how to produce content. Four main components:
Voice and Style Guidelines
Contains: How you talk, how you structure documents, what "done" looks like
Purpose: Ensures brand consistency across all content
Example: "Use active voice, avoid jargon, end sections with takeaways"
Framework Library
Contains: Vector DB for semantic search + direct access to full articles
Purpose: Your accumulated thinking tools, not just RAG snippets
Example: "Fast/Slow framework, Brand Kernel concept, Crossover Point analysis"
Kill List of Excluded Topics
Contains: What NOT to write about, already covered elsewhere, out of scope
Purpose: Prevents duplication and scope creep
Example: "Software-specific patterns → see Blueprint article; Markdown OS architecture → separate piece"
Rejection Patterns
Contains: What makes an idea not worth pursuing, anti-patterns in content
Purpose: Explicit exclusion criteria, audit trail of decisions
Example: "Avoid generic 'AI is transforming X' pieces; need specific mechanisms and data"
The Pre-Think as Design-Time Compilation
Pre-think is "thinking about the thinking" — a meta-cognitive step that happens before main content generation. It's your design document for the content piece.
Why this is design-time compilation: it happens before the main generation, encodes judgment about this specific piece, and creates an audit trail of decisions. Explicit rejections before execution.
The output of pre-think is a "brief" for the content generation: narrowed scope, clear intent, documented exclusions. Then the main workflow can execute without thrashing.
"The pre-think just does thinking about the thinking, and thinking about what to reject and what to use, and what voice, and many different things like that."
Frameworks Influence Voice Without Being Quoted
There's a subtle but important distinction in how frameworks and external sources are used:
Framework Influence vs. Citation
Your Frameworks Influence:
- • How ideas are framed
- • What trade-offs are considered
- • Which patterns are recognised
- • The voice and structure
- • Not cited explicitly
External Citations Provide:
- • External validation
- • Third-party gravitas
- • Research backing
- • Data and benchmarks
- • Formally cited with sources
The principle: you don't quote yourself, because self-citation lacks gravitas. Your frameworks shape how the content is written — the analysis, the structure, the voice — but they're not presented as citations.
External authorities (McKinsey, BCG, research papers, industry data) provide the formal citations. Your frameworks shape the analysis; their authority validates conclusions.
Content as Ephemeral
The same principle applies: content can be regenerated from a better kernel when frameworks update. The kernel is durable; the content is regenerable.
When to regenerate content:
- Frameworks have significantly evolved — your thinking has sharpened, old pieces feel stale
- New research invalidates old conclusions — better data available, claims need updating
- Voice/style has matured — early pieces don't match current brand
- Better model available — new AI capabilities produce richer outputs from same kernel
The practical reality: you probably won't regenerate every article. But high-value pieces can be refreshed, and NEW content benefits from improved kernels automatically. That's the compounding advantage.
The Meta-Editor Pattern
Conversations are braided — multiple frameworks, examples, tangents all intermingled. The meta-editor solves the decomposition problem.
Meta-Editor Workflow
Raw Conversation
Multiple ideas intermingled, tangents, examples
Meta-Editor (Identify Themes)
AI + Python: extract distinct ideas, cluster like concepts
Per-Theme Extracts
Relevant quotes, context, supporting examples per topic
Content Workflow (Per Theme)
Pre-think → research → draft for each extracted theme
Clean Articles (Per Theme)
One focused piece per framework/idea, no value lost
The architecture: Python owns the loop (enumerate topics, call workflow per topic). AI owns judgment (extract relevant bits, classify, outline). Markdown files are interfaces between stages.
Why this matters: single conversation → multiple clean articles. No value lost to "intermingled ideas." Systematic extraction rather than manual cherry-picking.
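A sketch of that architecture, with the AI steps stubbed out. `extract_themes` and `run_content_workflow` are hypothetical stand-ins for the LLM-backed steps; the point is the shape of the loop, with Markdown files as the interface between stages:

```python
# Sketch of the meta-editor loop: Python owns the loop, AI owns judgment.
# extract_themes() and run_content_workflow() are hypothetical stand-ins for
# the LLM-backed steps; here they are stubbed so the loop actually runs.
from pathlib import Path

def extract_themes(conversation: str) -> dict:
    """AI step (stubbed): cluster the raw conversation into per-theme extracts."""
    return {"example_theme": conversation[:500]}

def run_content_workflow(theme: str, extract: str) -> str:
    """AI step (stubbed): pre-think -> research -> draft for one theme."""
    return f"# {theme}\n\n{extract}"

def meta_edit(conversation_path: Path, out_dir: Path) -> None:
    conversation = conversation_path.read_text()
    themes = extract_themes(conversation)                 # judgment: which distinct ideas are here?
    out_dir.mkdir(parents=True, exist_ok=True)
    for theme, extract in themes.items():                 # Python owns the loop
        (out_dir / f"{theme}_extract.md").write_text(extract)   # markdown as the interface
        (out_dir / f"{theme}_article.md").write_text(run_content_workflow(theme, extract))
```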
The Worldview Compression at Work
This is recursive compression in action: take a view, compress it into a framework, use that framework as input to the next cycle.
Why frameworks beat raw research: you could tell an LLM "go research the latest thinking on X, then advise." But it pulls a small slice of the internet, averages the tone and ideas, and mixes in its older, baked-in priors. Result: generic consultant soup.
Your frameworks do three things raw research can't:
Compression
Hundreds of hours of thinking → a small number of clear lenses
Selection
Filtering out the "everyone should have a chatbot" tier of nonsense
Alignment
Every part of your framework agrees with your view of value, risk, what "good" looks like
"Without the frameworks... 'do a chatbot'. With frameworks... ideas of replacing processes and augmenting humans and a whole bunch of different stuff, which is just the absolute spot on right answers."
The frameworks are the difference between generic and stellar. They're your compressed worldview, applied at scale.
The Self-Reference Phenomenon
Something interesting happens when you write about AI frameworks using those frameworks: the AI starts referencing the frameworks and noting that the writing process itself proves the point.
This is valid — carefully. It's not circular reasoning ("this is good because I say it is"). It's demonstration: "this article is produced by a system that uses these principles — you are experiencing the argument being enacted."
The medium can embody the message. Just don't let it slip into pure circularity.
Applying the Doctrine to This Ebook
This ebook is itself a content pipeline output. Let's make the structure explicit:
The Kernel + Context = Output Pattern
Kernel:
article_prethink.md (scope, exclusions, reframings) + brief.md (thesis, mechanism, artefact) + frameworks library (accumulated thinking tools)
Context:
content.md (raw conversation material) + research.md (external citations and validation)
Output:
This ebook — structured, researched, validated content in professional HTML/CSS
What makes it work:
- Pre-think defined scope and exclusions (no software details, no Markdown OS architecture)
- Frameworks shaped the structure and voice (two-stage compilation, crossover point, ephemeral outputs)
- Research provided external validation (GitClear data, McKinsey economics, Anthropic benchmarks)
- The kernel + context = output pattern executed as described
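A compressed sketch of that compilation step. The file names match the ones listed above; `generate()` is a stand-in for whatever model call the real workflow uses:

```python
# Sketch of kernel + context = output for this pipeline. File names come from
# the text above; generate() is a placeholder for the actual model call.
from pathlib import Path

KERNEL_FILES = ["article_prethink.md", "brief.md"]   # plus the frameworks library
CONTEXT_FILES = ["content.md", "research.md"]

def read_all(paths) -> str:
    return "\n\n---\n\n".join(Path(p).read_text() for p in paths)

def generate(prompt: str) -> str:
    """Placeholder: swap in a real LLM client here."""
    return f"[generated output for a {len(prompt)}-character prompt]"

def compile_output() -> str:
    kernel = read_all(KERNEL_FILES)    # durable: judgment, scope, exclusions, frameworks
    context = read_all(CONTEXT_FILES)  # per-piece: raw material and research
    return generate(kernel + "\n\n" + context)
```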
The meta-insight: you're reading an example of the pattern being described. The ebook about kernels was generated using a kernel. Not proof, but demonstration.
Key Takeaways
- • Content pipelines follow the same two-stage compilation pattern as software: compile kernel → compile outputs
- • The content kernel has four components: voice/style guidelines, framework library, kill list, rejection patterns
- • Pre-think is design-time compilation for content — explicit scope, rejections, and narrative choices before execution
- • Frameworks influence voice without being quoted; external authorities (McKinsey, research) provide formal citations
- • Content is ephemeral: regenerate from improved kernel when frameworks evolve or models upgrade
- • The meta-editor pattern decomposes braided conversations into clean, focused articles — no value lost to intermingling
- • Frameworks beat raw research through compression, selection, and alignment — your worldview applied at scale
What's Next
We've seen the ephemeral output pattern in proposals (Chapters 4–6) and content production (this chapter). One more variant remains: what happens when AI models themselves upgrade?
Same kernel + better model = better outputs at same cost. The compounding benefit of treating outputs as ephemeral and kernels as durable. That's Chapter 9.
Model Upgrades as Free Value
Same Recipe, Better Outputs
When the recipe is durable and outputs are ephemeral, model upgrades become compounding advantages. This is the payoff — the reason why treating outputs as disposable isn't just a maintenance strategy, it's a growth strategy.
The Upgrade Economics
- • Same kernel + better model = better outputs — No retraining. No migration project. Just regenerate.
- • Your investment: kernel quality (already made). The model vendor's investment: improved capabilities (continuous). Your benefit: better outputs at the same cost.
- • No additional work required — this is what "free value" means.
The Data From Anthropic
When Anthropic released Claude Sonnet 4.5, they published specific performance improvements. These aren't marketing claims — they're measurable benchmarks from production use cases.
44% faster vulnerability intake
Reduced average vulnerability intake time for Hai security agents
25% more accurate
Improved accuracy while maintaining the same pricing
"Claude Sonnet 4.5 reduced average vulnerability intake time for our Hai security agents by 44% while improving accuracy by 25%, helping us reduce risk for businesses with confidence."— Anthropic, "Introducing Claude Sonnet 4.5"
At the same price. That's the key part. This isn't an upsell to a premium tier. This is a drop-in replacement that delivers meaningfully better performance for the exact same cost.
The Drop-In Replacement Reality
Anthropic describes Sonnet 4.5 as "a drop-in replacement that provides much improved performance for the same price." What does that mean practically?
- → No code changes required
- → No prompt rewrites required
- → Just switch the model endpoint
Pricing remains stable at $3 per million input tokens and $15 per million output tokens — identical to Claude Sonnet 4. You get meaningfully better real-world coding performance, stronger long-horizon autonomy, improved system and tool use, and ASL-3 safety guarantees, all without changing Sonnet-tier pricing.
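In practice the "drop-in" claim reduces to changing one string. A hedged sketch using the Anthropic Python SDK (the client call shape is the published one, but the model identifier strings here are illustrative; check the vendor's current model names before relying on them):

```python
# Sketch of the drop-in upgrade: the kernel and prompt stay the same, only the
# model identifier changes. Model name strings are illustrative.
import anthropic

MODEL = "claude-sonnet-4-5"           # the only line that changes on upgrade

client = anthropic.Anthropic()        # reads ANTHROPIC_API_KEY from the environment

def regenerate(kernel: str, context: str) -> str:
    response = client.messages.create(
        model=MODEL,                  # same kernel, same prompt, newer model
        max_tokens=4096,
        messages=[{"role": "user", "content": kernel + "\n\n" + context}],
    )
    return response.content[0].text
```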
Model Upgrade Economics
| Factor | Old Model | New Model |
|---|---|---|
| Price | $X | $X (same) |
| Speed | Baseline | +44% faster |
| Accuracy | Baseline | +25% better |
| Your kernel | Same | Same |
| Work required | — | None |
This is compounding value. Your kernel stays the same. Your work required is zero. The outputs get better automatically.
Why This Only Works With Ephemeral Outputs
The upgrade economics only deliver if you can actually regenerate. If outputs are precious, you're locked out of the compounding loop.
Condition 1: Outputs Must Be Regenerable
If outputs are precious, you can't regenerate. If you've patched 15 times, regeneration loses those patches. The judgment encoded in those patches is trapped in the output instead of living in the kernel.
This is why the nursing anti-pattern is so costly — it prevents you from capturing future model improvements.
Condition 2: Kernel Must Be Rich
If the kernel is thin, the new model doesn't have enough to work with. Model upgrades amplify whatever you give them. Thin kernel + better model = slightly better generic output. Rich kernel + better model = significantly better you-shaped output.
The kernel is the multiplier. Better models multiply what's already there.
The Formula
Rich kernel + ephemeral output = full upgrade capture
Thin kernel OR precious outputs = upgrade value left on table
The Version Upgrade Benefit
When architecture is right, model upgrades are smooth. The new model reads the same design document and produces better code. The kernel is the constant; model capability is the variable that keeps improving.
"The design document persists across changes that would normally require major rework: Model upgrades — New model reads the same design doc, produces better code."— "A Blueprint for Future Software Teams"
Your investment compounds. Time spent on kernel quality is fixed (with occasional updates). Model improvements are continuous (vendor investment). Quality trajectory moves upward without your effort.
Contrast: The Patch-Nursing Approach
What happens when outputs are precious? When you've invested heavily in patching and can't afford to regenerate?
- ✗ Accumulated patches don't transfer to new models — The judgment is encoded in old outputs, not in reusable kernel files
- ✗ You can't just "regenerate" — Because you'd lose the patches and all the work that went into them
- ✗ Model upgrade becomes a migration project — Manual transfer, rewriting, testing against new behaviour
- ✗ You're locked to the capability ceiling — Stuck at 2024 quality in 2025 and beyond
The migration problem is real. Old patches are encoded in old outputs. The new model doesn't know about those patches. You must manually transfer or re-create them. The work that was supposed to be "done" becomes work again.
Two Paths, Two Futures
✓ Path A: Ephemeral Outputs
- 2024: Generate with Model v1
- 2025: Regenerate with Model v2 → Better outputs (44% faster, 25% more accurate)
- 2026: Regenerate with Model v3 → Even better outputs
Outcome: Continuous quality improvement at no additional cost. Your kernel investment compounds with every model release.
✗ Path B: Precious Outputs
- 2024: Generate with Model v1, start patching
- 2025: Model v2 available, but outputs are precious → Stuck with v1 quality
- 2026: Migration project required to benefit from v3
Outcome: Locked to capability ceiling. Each model generation widens the gap. Migration becomes increasingly expensive.
The Strategic Implication
If model upgrades deliver free value — but only when outputs are ephemeral and kernels are rich — then the strategic move is clear:
1. Invest in Kernel Quality
Kernel quality determines how much upgrade value you capture. Rich kernel = full benefit from model improvements. Time spent on kernel is high-leverage investment.
Every hour spent enriching the kernel × every model upgrade = compounding quality improvement
2. Let Model Improvements Flow Through
Don't fight the models; ride them. Your job: encode judgment (kernel). Model vendors' job: improve execution (capability). Division of labour that compounds.
You invest once in kernel quality; vendors invest continuously in model capability. Both improvements stack.
3. Treat Outputs as Disposable
If you can't regenerate, you can't capture upgrades. Nursing outputs locks you out of the compounding loop. Disposability is a feature, not a bug.
The ability to nuke and regenerate is what makes model upgrades valuable.
Real Numbers on Quality Improvement
Beyond the headline 44% and 25% improvements, the benchmarks show meaningful gains across the board:
Claude Sonnet 4.5 Benchmark Improvements
| Metric | Improvement |
|---|---|
| Vulnerability intake time | -44% |
| Accuracy | +25% |
| Planning performance (Devin) | +18% |
| End-to-end eval scores (Devin) | +12% |
| Price | Same ($3/$15 per million tokens) |
What this means practically:
- → Same spec → better code
- → Same proposal kernel → better proposals
- → Same content kernel → better articles
The improvement is measurable, not theoretical. And the trajectory is clear: these improvements happen every few months. Each release compounds on the last. If you can capture them (ephemeral outputs), you keep improving. If you can't (precious outputs), you fall behind.
The Early Adopter Advantage
The teams that adopt the ephemeral output doctrine early don't just get better quality today. They get compounding quality improvement over time.
The competitive gap widens with each model generation. Teams treating outputs as ephemeral are iterating 3–5x faster than teams nursing patched outputs. And every new model release widens that gap further.
The Cost of Delay
6 more months of patches that encode judgment in outputs instead of recipes
Migration cost grows linearly with each patch — the longer you wait, the more expensive the fix
Competitors who get this right will have compounding systems; you'll still have one-off outputs
Closing the Loop
This chapter connects everything we've covered:
- Ch 1: Regeneration is now cheaper (the economic inversion)
- Ch 2: Kernel is the durable asset (where to invest)
- Ch 3: Two-stage compilation (the mechanism)
- Ch 4-6: Flagship example (proof it works in practice)
- Ch 7: When to rebuild (practical threshold)
- Ch 8: Content pipelines (the pattern in another domain)
- Ch 9: Model upgrades (why it compounds — the strategic reason to adopt the doctrine)
The Full Picture
1. Invest in kernel → Outputs improve immediately
2. Treat outputs as ephemeral → Can regenerate when needed
3. Models improve continuously → Regeneration produces better outputs over time
4. Your kernel investment compounds with model improvements → Free value, continuously
This is the strategic reason to adopt the ephemeral output doctrine. It's not just about maintenance convenience. It's about capturing compounding value from an industry-wide trend you don't control.
Key Takeaways
1. Same kernel + better model = better outputs — Free value through drop-in model upgrades at stable pricing
2. Works because: no retraining, no migration, just regenerate — The entire benefit comes from treating outputs as disposable
3. Requires: ephemeral outputs AND rich kernel — Both conditions must be met; missing either blocks the compounding loop
4. Nursing approach fails — Patches don't transfer, capability ceiling locks in, migration becomes expensive
5. Strategic implication — Invest in kernel quality, let model improvements flow, ride the upgrade trajectory
6. Early adopters compound; late adopters fall behind — The gap widens with each model generation
7. This is the ultimate argument for ephemeral outputs — Future improvements are free if you can capture them
Final Synthesis
- ✓ The durable asset is the kernel, not the output
- ✓ Two-stage compilation is the mechanism
- ✓ Ephemeral outputs are the practice
- ✓ Model upgrades are free value — the reward for getting this right
References & Sources
This ebook draws on primary research from major consulting firms, industry analysts, and practitioner publications. External sources are formally cited throughout the text. The author's own frameworks and interpretive analysis (LeverageAI) are presented as practitioner perspective and listed here for transparency.
Primary Research: McKinsey & Company
AI for IT modernization: faster, cheaper, and better
Source for >50% cost reduction in IT modernisation using AI regeneration approaches. Key evidence for the economic inversion thesis.
https://www.mckinsey.com/capabilities/quantumblack/our-insights/ai-for-it-modernization-faster-cheaper-and-better
Overcoming two issues sinking gen AI programs
Infrastructure as code + policy as code patterns for AI systems.
https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/overcoming-two-issues-that-are-sinking-gen-ai-programs
Primary Research: GitClear
AI Copilot Code Quality: 2025 Data
Analysis of 153M+ lines of code. Source for code churn doubling (2021-2024), 4x code duplication increase, and copy/paste exceeding refactoring for first time in history.
https://www.gitclear.com/ai_assistant_code_quality_2025_research
Coding on Copilot: 2023 Data Shows AI's Downward Pressure on Code Quality
Original research on code churn trends in AI-assisted development.
https://www.gitclear.com/coding_on_copilot_data_shows_ais_downward_pressure_on_code_quality
Primary Research: Anthropic
Claude Sonnet 4.5
Source for model upgrade benefits: 44% faster vulnerability intake, 25% improved accuracy, 18% planning performance increase—at same price point.
https://www.anthropic.com/claude/sonnet
Introducing Claude Sonnet 4.5
Drop-in replacement economics—same specification, better output, same cost.
https://www.anthropic.com/news/claude-sonnet-4-5
Industry Analysis & Commentary
The Messy Cost Of AI Code - Forbes
Analysis of complexity debt accumulation from AI-generated code.
https://www.forbes.com/sites/kolawolesamueladebayo/2025/12/03/the-messy-cost-of-ai-code/
Technical debt and its impact on IT budgets - Software Improvement Group
Source for the 100x cost multiplier: $100 fix at planning stage vs $10,000 in production.
https://www.softwareimprovementgroup.com/technical-debt-and-it-budgets/
How to Manage Technical Debt - Netguru
Source for $2.41 trillion annual technical debt cost in US businesses.
https://www.netguru.com/blog/managing-technical-debt
AI Generated Code: Revisiting the Iron Triangle - AskFlux
Developer trust declining (29% down from 40%); 66% spending more time fixing AI code than saved.
https://www.askflux.ai/blog/ai-generated-code-revisiting-the-iron-triangle-in-2025
Ephemeral Code & Specification-Driven Development
The Premise: Code Is Ephemeral - Matt Baldwin
Foundational article on treating code as output, not solution. Source for "the value is in the context, design, and guardrails" thesis.
https://medium.com/@matt.b.baldwin/the-premise-code-is-ephemeral-context-value-and-guardrails-matter-565e52005613
Understanding Spec-Driven Development - Martin Fowler
Three forms of spec-first: spec-first, spec-anchored, spec-as-source.
https://martinfowler.com/articles/exploring-gen-ai/sdd-3-tools.html
Spec-Driven Development: AI-Centered Future
"Specs should live longer than the code. Code becomes a by-product of well-written specifications."
https://medium.com/@geisonfgfg/spec-driven-development-a-deep-dive-into-the-ai-centered-future-of-software-engineering-db2d15fa882e
How to Vibe Code like a Google Engineer - Carly Taylor
Practical spec-driven development workflow comparison.
https://carlytaylor.substack.com/p/ai-spec-driven-development
Infrastructure as Code & Immutable Patterns
IaC with Terraform: A Practical Guide - Codefresh
Immutable infrastructure pattern: "servers are never modified after they're deployed."
https://codefresh.io/learn/infrastructure-as-code/iac-with-terraform-a-practical-guide/
Terraform Tutorial for Beginners 2025 - K21Academy
Introduction to configuration-as-recipe patterns.
https://k21academy.com/terraform/terraform-beginners-guide/
The Complete Idiot's Guide to Immutable Infrastructure - Scalr
Declarative infrastructure and forced replacement patterns.
https://scalr.com/learning-center/the-complete-idiots-guide-to-immutable-infrastructure/
Human Judgment & AI Economics
Good Judgement is a Million Dollar Skill in the Age of AI - Nate's Newsletter
"Intelligence is cheap. Judgment is scarce."
https://natesnewsletter.substack.com/p/in-the-age-of-ai-good-judgement-is
AI won't make the call: Why human judgment drives innovation - Harvard Business School
Research on AI's limitations in distinguishing good ideas from mediocre ones.
https://www.hbs.edu/bigs/artificial-intelligence-human-jugment-drives-innovation
Machine Intelligence and Human Judgment - IMF
Economic analysis of judgment-intensive vs prediction-intensive tasks.
https://www.imf.org/en/publications/fandd/issues/2025/06/machine-intelligence-and-human-judgement-ajay-agrawal
AI Code Security & Quality
2025 GenAI Code Security Report - Veracode
45% of AI code samples failed security tests with OWASP Top 10 vulnerabilities.
https://www.veracode.com/blog/genai-code-security-report/
State of AI code quality in 2025 - Qodo
65% of developers report at least 25% of commits are AI-generated or shaped.
https://www.qodo.ai/reports/state-of-ai-code-quality/
LeverageAI / Scott Farrell
Practitioner frameworks and interpretive analysis developed through enterprise AI transformation consulting. These frameworks are presented as author voice throughout the ebook (not formally cited inline) and listed here for readers who wish to explore the underlying thinking.
A Blueprint for Future Software Teams
Source for the Design-Compiler Pattern, inversion table (primary artifact shifts from code to design document), and "ephemeral code" thesis applied to software development.
https://leverageai.com.au/a-blueprint-for-future-software-teams/
Markdown as an Operating System
Foundation for the Markdown OS architecture used in the proposal compiler example.
https://leverageai.com.au/markdown-as-an-operating-system/
SiloOS: The Agent Operating System for AI You Can't Trust
Security-first AI execution patterns referenced in context hygiene discussion.
https://leverageai.com.au/siloos-the-agent-operating-system-for-ai-you-cant-trust/
Why Code-First Agents Beat MCP
Background for the Code-First framework discussion.
https://leverageai.com.au/why-code-first-agents-beat-mcp/
Discovery Accelerators: The Path to AGI Through Visible Reasoning Systems
Foundation for the Discovery Accelerator and Negamax algorithm discussion.
https://leverageai.com.au/discovery-accelerators/
Note on Research Methodology
Research for this ebook was compiled in December 2025. Sources were prioritised based on: (1) credibility of the publishing organisation, (2) recency of data, and (3) relevance to the specific thesis of ephemeral outputs and durable specifications.
External sources (consulting firms, research organisations, industry publications) are formally cited throughout the text. The author's own frameworks (LeverageAI) are presented as practitioner interpretation and not formally cited inline—this maintains the distinction between "external proof" and "author's lens."
Some URLs may require subscription access. Where possible, key statistics and quotes have been included in the text for readers without access to the full source material.
Summary of Key Metrics
| Metric | Value | Source |
|---|---|---|
| AI code security failures | 45% | Veracode 2025 |
| Code churn increase (2021-2024) | 2x | GitClear |
| Code duplication: AI vs human | 2-3x higher | GitClear |
| Code cloning increase | 4x | GitClear 2025 |
| Technical debt annual cost (US) | $2.41T | Netguru |
| Cost multiplier: planning vs production | 100x | Forbes/SIG |
| Developer trust in AI outputs | 29% (↓ from 40%) | AskFlux |
| McKinsey modernisation cost reduction | >50% | McKinsey |
| Claude 4.5 improvement (same price) | 44% faster, 25% more accurate | Anthropic |
Ready to Apply the Doctrine?
The durable asset is the kernel, not the output. Start encoding your judgment today.
© 2025 Scott Farrell / LeverageAI. All rights reserved.