AI Strategy · The Operating System Beneath the Doctrine

The Cognition Dimension Ladder

Building the Chooser for AI Strategy Under Permanent Fog

When AI cheapens execution, the binding constraint moves twice — to choice, then to the apparatus that chooses.

The highest use of AI is to build the thing that tells you what to build.

By the end, you will be able to:

✓ Locate any AI investment on a four-rung Cognition Dimension Ladder
✓ Name what a Dimension-4 apparatus actually is — and why it is built with search, not learning
✓ See why synthetic data is the newest weapon in the arsenal that pierces the Fog
✓ Make the commercial case for a Dimension-4 advisory

Scott Farrell · LeverageAI · 2026

A sequel to The Terminal Value Doctrine

Part I · Why Your Strategy Is One Rung Too Low

Whoever Spends the Most Disciplined Cognition Wins

There is a meme going around the labs that says spend the most tokens, win. It is directionally right and exactly one word short.

The word the meme is missing is disciplined. Whoever spends the most tokens wins — provided they are spending them inside an apparatus that knows what to reject. That is not workflow optimisation. That is churning intelligence to compress your industry's future into the present, and then reaching into that future to find out what it looks like before your competitors do. The burn is not the wedge. The apparatus is the wedge.

This book is about that apparatus: what it is, why it sits one dimension above where almost every AI strategy is currently being written, and why — once you can see the ladder — the highest-leverage thing you can build with AI is not a product, a workflow, or a chatbot. It is the chooser. We begin where the public conversation already is, because the fastest way to show you the missing rung is to stand on the one everybody is already arguing about.

The debate everyone is having

Between February and May 2026 a workplace meme detonated, spelled with the double-x of looksmaxxing and moneymaxxing: tokenmaxxing. Its champions are not fringe. Sam Altman posted that he was "excited to see what will happen with tokenmaxxing startups… happy building!"¹ — alongside an offer of $2M in tokens to every startup in the current Y Combinator batch. YC's own partner Diana Hu turned it into advice for founders.

60.2T

AI tokens burned by Meta employees in a single 30-day window — roughly $900M at retail API prices. The appetite is real; the question is what the spending is shaped by.²

"Maximizing token usage, not head count, will be the critical shift. The best companies will be the ones that are tokenmaxxing."

— Diana Hu, Y Combinator

YC's chief executive Garry Tan replied in two words — "Tokenmaxxing confirmed."³ Jensen Huang went further on the All-In podcast: "I would be deeply alarmed if a $500K engineer spends less than $250K on tokens."⁴ The directional claim is having its moment, endorsed by some of the most influential names in the industry.

The critics are right too — which is the tell

Then came the pushback, and it was sharp. HubSpot's chief executive Yamini Rangan: "Outcome maxxing >> token maxxing."⁵ Appian's Matt Calkins was crueller.

The chandelier line is the perfect indictment of raw token burn as a metric. And here is the tell: both sides are right. The proponents are right that more cognition matters — but they have not said what shape of spending compounds. The critics are right that raw burn is gameable — but they have not said what the operational alternative is. The proponents named the directional claim. The critics named the symptom. Neither named the structure.

The sharpening: disciplined cognition

The answer is neither "spend more" nor "measure outcomes." It is the thing both sides are circling and neither has built.

"Whoever spends the most tokens wins. When thought becomes cheap, spending more of it is more valuable — provided you're spending it inside an apparatus that knows what to reject."

That is the wedge, and the load-bearing word is discipline, not burn. The operators who have quietly been building the apparatus were never in the tokenmaxxing debate at all — and the term itself is just the public, four-months-late version of a claim that has been sitting in this canon for a while.

The constraint has moved twice

To see why discipline beats burn, you have to see what actually changed. Four voices, late 2025 into early 2026, said the same thing from very different chairs.

Andrej Karpathy, on stage at Sequoia's AI Ascent in San Francisco, named the shift directly: "The scarce thing is shifting… More scarce: understanding, taste, eval design, security, system boundaries, agent orchestration, domain-specific feedback loops, and knowing when the model is off the rails." He closed on a line worth pinning to a wall: "You can outsource your thinking, but you can't outsource your understanding."⁶ Greg Brockman put it economically: autonomous coding tools jumped from writing 20% to 80% of code in a single month, and so "the bottleneck has shifted from execution to attention… to decisions about what is worth doing."⁷ Paul Graham, in a post that ran to 1.32M views: "When anyone can make anything, the big differentiator is what you choose to make."⁸ And Aparna Chennapragada, six months ahead of them all: "the blank-page tax that fills so much of modern work is close to zero."⁹

Put together, the four name one shift: when execution is cheap, choice is the bottleneck. That claim is correct. It is also incomplete — and the incompleteness is the whole reason this book exists.

Because choice itself is now automatable. You can build a machine that generates candidate strategies, attacks them, ranks them, and prunes them — provided you can tell it what "good" looks like. The moment choosing becomes automatable against an evaluation function, the binding constraint moves again.

"The first shift was 'from execution to decision.' The second shift is 'from decision to the apparatus that produces decisions.' That is two levels of shift, not one. Most consultants are still operating at the first shift."

Karpathy, to his credit, kept going past taste. On the same stage he reframed the whole picture: "My current worldview is not that AI simply makes everyone faster at the old work. It is that the work itself is being reorganized around agents."¹⁰ Hold that quote. It comes back three times in this book, because it turns out to be load-bearing for half the doctrine.

Why this sorts the market

The big consulting firms have converged on a single message — redesign the business, don't just automate the workflow. That is good advice, and it is shift-one advice. It answers what to build. It does not build the machine that answers what to build continuously, at a depth no workshop can reach. The whole industry is operating one level below the frontier and calling it strategy.

Two altitudes, side by side

The big-firm frame · Dimension 3

• AI strategy = transformation, scaling, process redesign
• Selection = prioritise high-value use cases
• Process = roadmap, maturity model, portfolio
• Artefact = a deck of confidence

The Cognition Dimension Ladder frame · Dimension 4

• AI strategy = an apparatus that searches the Fog under cheap cognition
• Selection = by terminal-value impact, via a Question Ledger
• Process = Discovery Accelerator, Boundary Stacking, synthetic futures
• Artefact = a search log with the rebuttals attached

That table is the altitude gap in one image, and the rest of this book is an argument for the right-hand column. But you cannot make the case for an apparatus until you can see where AI value actually lives and how it moves. It does not move the way most strategy decks assume — deeper into the thing you are already doing. It moves outward, by dimension. That map is a ladder, and it is the next chapter.

The Cognition Dimension Ladder

You optimised the workflow. It got a bit faster, a bit cleaner — and then the gains stopped. That flatline is not a failure of effort. It is the ceiling of the dimension you are working in.

The last chapter ended on a table with two columns and a promise: the right-hand column needs a map. Here it is. AI did not get more useful over the last two years by doing the same job better. It got more useful by entering jobs it could not previously do at all. The progression has the shape of a ladder, and every rung has a ceiling.

"AI does not get more useful by doing the same thing better. It gets more useful by entering a dimension where it could not previously operate."

That sentence is the whole chapter compressed. Value migrates outward by dimension, not inward by depth — and most strategy decks are built around going deeper into the rung the organisation already occupies.

The Cognition Dimension Ladder. Each rung opens when several thresholds cross together; the fourth rung needs one ingredient the others don't — an evaluation function strong enough to tell good output from bad.

The four rungs

1D · Summary cognition

Compress a long thing into a short thing — board-pack synthesis, the executive summary, "what does this 40-page report say." The only rung the early-2024 cost curve could carry, and it saturates almost immediately. A better model gives a slightly better summary, then nothing.

2D · Row-level cognition

Apply judgement per record: classify this, extract that, score the other. Unlocked when inference got cheap enough to run across a whole dataset rather than a single document. Real value, finite ceiling.

3D · Workflow & agentic cognition

Multi-step, cross-functional, time-bearing. The agent does the thing, calls the tool, waits, checks, continues. Where almost all serious enterprise spend currently sits — and where the big-firm "redesign the workflow" advice lands. The rung most people mistake for the summit.

4D · Synthetic-future cognition

The engine does not process the futures you hand it — it generates the futures it then scores. It opens only when four thresholds cross at once: smart enough, cheap enough, parallel enough, and an evaluation function strong enough to separate good output from bad.

It is tempting to stop at 3D. Don't. As cognition gets cheaper and smarter, ask what the 3D workflow actually does with the improvement. The same workflow runs a bit faster, a bit cleaner — and then the gains flatten. The dimension you are optimising in is saturating.

Why each rung needs a joint unlock

No rung opened because of a single breakthrough. Each needed several thresholds to cross together. 1D→2D was unlocked by cheap inference. 2D→3D was unlocked by reliable tool-calling plus long context. 3D→4D needs frontier reasoning, parallelism and, scarcest of all, a mature evaluation function. Most operators have the model but not the documented evaluation function, which is precisely why Dimension 4 is structurally underbuilt,¹² ¹³ and why the early movers there capture value the others cannot price.
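The joint-unlock rule can be sketched as a simple gate. This is an illustration only: the capability labels are invented names for the thresholds listed above, and the cumulative ordering (a rung is reachable only if every lower rung is) is an assumption about how the ladder is read.

```python
# Thresholds each rung transition needs, as listed in the paragraph above.
# The label strings are illustrative names, not a standard taxonomy.
RUNG_REQUIREMENTS = {
    2: {"cheap_inference"},
    3: {"reliable_tool_calling", "long_context"},
    4: {"frontier_reasoning", "parallelism", "evaluation_function"},
}


def highest_rung(capabilities: set[str]) -> int:
    """Return the highest rung reachable with the given capabilities.

    Assumes the ladder is cumulative: a missing threshold blocks that
    rung and every rung above it.
    """
    rung = 1  # 1D summary cognition is the starting rung
    for target in sorted(RUNG_REQUIREMENTS):
        if not RUNG_REQUIREMENTS[target] <= capabilities:
            break
        rung = target
    return rung


def missing_for(target: int, capabilities: set[str]) -> set[str]:
    """List the thresholds still uncrossed for a target rung."""
    return RUNG_REQUIREMENTS[target] - capabilities


# An operator with the model but no documented evaluation function.
typical_operator = {
    "cheap_inference", "reliable_tool_calling", "long_context",
    "frontier_reasoning", "parallelism",
}
print(highest_rung(typical_operator))    # stuck on the third rung
print(missing_for(4, typical_operator))  # the one scarce ingredient
```

The point of the gate is that capability is a conjunction: adding more of what you already have does not open the next rung.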

One investment, seen from each rung — a claims process

1D: summarise each claim file.
2D: classify and score claims per row.
3D: an agent triages, routes, and drafts decisions.
4D: an engine generates candidate future claims-handling business models — what if every claimant arrives with their own agent?¹⁴ — and scores them.

Rungs 1–3 improve the business you have. Rung 4 asks which claims business should still exist. Different altitude, different question.

That is the move this whole book is built on. If the frontier is Dimension 4, and Dimension 4 is the engine that generates and scores futures, then the highest-leverage thing you can build is that engine. What, exactly, is it? That is the next chapter — and the most important question you can ask with a computer in front of you.

Build the Chooser

Sit at a computer in 2026 with execution effectively free. You can build almost anything. So the only question left is the one that used to be a luxury: what should you build?

When building was hard, the question "can we build it?" did all the work. It filtered the ideas, set the roadmap, justified the budget. Now it filters nothing.¹⁵ And when feasibility stops filtering, the question underneath it — the one feasibility was hiding — becomes the whole game.

"If I sit at my computer and I can build anything, what would I build? I would build the thing that tells me what to build. The current highest use of AI is to discover the highest use of AI."

That is the chooser. It is the Dimension-4 engine from the last chapter — not a smarter chatbot, but a search-and-selection apparatus. And it has the same standing in your strategy that terminal value has in the doctrine beneath this one: it is the thing the rest of your future depends on.

Here is why it compounds where everything else saturates. Optimise a workflow and you improve one process. Build the chooser and you improve every subsequent choice about which workflows, products and bets are worth making at all. It is the one build whose value does not flatten with the next model — it gets a free uplift instead, a point we return to near the end of the book.

The chooser as a structure

The chooser is not a single clever prompt; it is a structure, and it helps to see it as one. At the apex is the engine itself. Below it are the search machinery and the arsenal it searches with. And underneath all of it — outside the system, load-bearing — is the one thing that keeps the whole structure honest.

The chooser as a structure. The engine at the apex is built from search machinery; the machinery searches with the arsenal; and the whole thing is grounded not on itself but on deployed reality returning evidence. Remove the base and the pyramid becomes a sealed mirror.

What the chooser is not

It is not an oracle. The fashionable demo of "AI for strategy" is a confident machine that says turn left. The chooser does something different and more useful.

The oracle vs the inspectable decision machine

❌ Confident autocomplete

"Turn left."

✓ The chooser

"Turn left. We tested right — right has higher upside but fails if the competitor copies within six months. Straight ahead protects near-term cash flow but raises stranded-asset exposure. The second-best path becomes preferable only if inference costs fall another 70%."

One is a guess wearing confidence. The other is an inspectable decision machine that shows its rejected branches. The difference is the whole reason a board can trust it, and we will see exactly how it is built next — because the most important technical fact about the chooser is the one that makes it deployable today.

Search, Not Learning

A chess engine in a lost position does not look for the prettiest move. It looks for the move that loses slowest. That posture — not the cleverness — is what makes it trustworthy.

The most important technical fact about the chooser is what it is not. It is not a reinforcement-learning system. Reinforcement learning is the right tool when "good" is unknown and has to be discovered by reward signal over many runs. That is the frontier labs' problem. It is not yours. You already know — or can document — what good looks like. When the moves are unknown but the evaluation function is given, the right tool is not learning. It is adversarial search.

"The Discovery Accelerator NegaMax doesn't learn — it scores. It doesn't need reinforcement learning and rewards. It uses an evaluation function at the leaves of the search tree, and that evaluation function is your documented taste."

That single property is what makes the apparatus deployable today. No reward drift. No proprietary training data. No months of training runs. Just frontier reasoning, a clear evaluation function, and structured search. The architecture — a Director that orchestrates, a Council of specialised brains that argue in structured rebuttals, and a NegaMax tree that explores roughly a hundred candidate futures per minute, pruning weak branches — is set out in full in the doctrine beneath this book; the point at this altitude is the principle underneath it: search beats learning when the evaluation function is already inside the human.
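A minimal sketch of the principle, not of the Discovery Accelerator itself: textbook NegaMax with alpha-beta pruning, where the only "knowledge" is an evaluation function called at the leaves. The tree, the branch names and the scores are all hypothetical; in a real system the scoring would come from a documented evaluation function rather than a lookup table.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    """One position in a line of play. A node with no children is a leaf."""
    label: str
    children: list["Node"] = field(default_factory=list)


def negamax(node: Node, evaluate: Callable[[Node], float],
            alpha: float = float("-inf"), beta: float = float("inf"),
            color: int = 1) -> float:
    """Best score the side to move can force from `node`.

    `evaluate` scores leaves from the proposer's point of view;
    `color` is +1 when the proposer moves, -1 when the adversary moves.
    """
    if not node.children:
        return color * evaluate(node)
    best = float("-inf")
    for child in node.children:
        score = -negamax(child, evaluate, -beta, -alpha, -color)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # this branch cannot change the outcome: prune it
    return best


# Hypothetical leaf scores standing in for a documented evaluation function.
TASTE_SCORES = {
    "large: competitor sleeps": 9, "large: competitor copies": 1,
    "small: competitor sleeps": 5, "small: competitor copies": 4,
    "hype: incumbent bundles": 0, "hype: nobody notices": 8,
}
evaluated: list[str] = []  # records which leaves were actually scored


def evaluate(leaf: Node) -> float:
    evaluated.append(leaf.label)
    return TASTE_SCORES[leaf.label]


def strategy(name: str, replies: list[str]) -> Node:
    """A candidate strategy followed by the adversary's possible replies."""
    return Node(name, [Node(f"{name}: {reply}") for reply in replies])


root = Node("today", [
    strategy("large", ["competitor sleeps", "competitor copies"]),
    strategy("small", ["competitor sleeps", "competitor copies"]),
    strategy("hype", ["incumbent bundles", "nobody notices"]),
])
value = negamax(root, evaluate)
print(value, len(evaluated))
```

In this toy tree the search settles on the "small" strategy's guaranteed score rather than the "large" strategy's best case, and it never scores the second "hype" leaf, because the first reply already refutes that branch. Nothing is trained; change the scores and the same search gives a different answer.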

The chess inheritance

The engine inherits its discipline from chess.

The posture transfers exactly to strategy. The question is not "what is the most exciting AI opportunity?" but "what survives when the competitor sees it too, the incumbent bundles it for free, the regulator demands evidence, the customer shows up with their own agent, the margin compresses, and the implementation disappoints — all at once?" An engine built this way is willing to recommend the smaller, defensible future over the larger, fragile one. That willingness — the readiness to say harvest, narrow, exit when the big future does not survive counterplay — is exactly what makes it credible rather than a hype machine.

The jump beyond chess: it remembers reasons

Here the AI version goes one step past its inheritance, and the step is the genuinely original part. A chess engine remembers positions, killer moves, transpositions — mathematical aids. An LLM-based engine can remember reasons. It can read its own search tree as language, cluster the refutations that keep recurring, and use those patterns to search more intelligently next time.

"Chess NegaMax remembers positions. AI NegaMax can remember reasons."

So the rejected branches are not exhaust. They are fuel — and what they accumulate into is the subject of Chapter 7. Watch a single line resolve and you can see why this is a different kind of artefact than a pros-and-cons list.

Idea

Build an AI workflow product for the industry.

Refutation

Competitors copy it.

Counter

Don't defend the workflow — defend the trust boundary, the audit trail, the regulated data perimeter.

Counter-counter

Incumbents bundle governance.

Response

Their multi-tenant architecture makes per-tenant regulated isolation hard to retrofit — that is the moat.

That is not a list. It is a line of play where each claim has to survive the next intelligent attack, and the survivor is logged with everything it defeated. An engine is only as good as what you give it to search with, though — and under the Fog, you throw the whole arsenal at it. The newest and sharpest weapon in that arsenal is one most people still think of as a training trick.

The Arsenal We Throw at the Fog

Under the Fog the instinct is to forecast harder. That is the wrong instinct. When the map is unstable and the moves have exploded, strategy stops being a recommendation problem and becomes a search problem — so you throw everything you have at the search.

The condition the engine operates in has a name. The AI Fog is the simultaneous compression of the credible planning horizon and the expansion of the plausible solution space: less time to see, more behind the fog to see. The instinct under those conditions is to forecast harder. That is exactly wrong.

"The map is unstable, the visible horizon is shorter, and the number of possible moves has exploded. Therefore strategy is no longer mainly a recommendation problem. It is a search-quality problem."

And if it is a search problem, the discipline is simple to state and hard to execute: throw the whole arsenal at the search. Not a few prompts — every framework in the canon, the documented Taste Kernel, Worldview Compression, and the two instruments this chapter is about: boundary-case compression and synthetic-data generation.

Boundary-case compression

Boundary-case compression is the sharpest instrument the human brings, and it is two thousand years old. Lucretius used it to prove space is infinite: walk to the supposed edge and throw a spear — either it flies through (no edge) or it stops against something, which is itself in space (so still no edge).¹² The move: push one variable to its structural extreme until the geometry of the situation forces an answer. It compresses a vague, sprawling question into a single load-bearing case.

Why boundary-case compression × NegaMax beats either alone

Here is the claim this whole book is reaching for, and it deserves to be earned rather than asserted: boundary-case compression plus the NegaMax Discovery Accelerator gives dramatically better coverage of the search space than either tool on its own. The two fix each other's weakness.

Two tools, one search

Compression solves tractability

The Fog's solution space is, by definition, un-brute-forceable. Compression collapses each variable to its load-bearing extreme, yielding a small set of high-information seeds — the points where the geometry actually changes.

NegaMax solves reach

Around each seed, adversarial search exhausts the counterplay — the competitor's response, the regulator's move, the customer-agent's bypass — to a depth no human workshop can match, pruning branches that cannot affect the outcome.

Coverage improves in two senses at once: tractability (you search a small, well-chosen space deeply, instead of a vast space shallowly) and reach (you get to the adjacent-impossible-now-becoming-likely futures a linear forecast never reaches). Compression aims the search; adversarial search exhausts the aimed-at region. That is what disciplined cognition buys you over raw tokenmaxxing.

Synthetic data — the newest weapon

There is one more instrument in the arsenal, and it is the single biggest shift in AI over the last six to twelve months. It is not "agents." It is synthetic data — models, and the systems built on them, improving by generating their own grounded material on demand instead of waiting for the world to supply it. The frontier labs are betting the next leg of progress on it (Chapter 9 has the cleanest public proof), and the chooser carries the same weapon.

Cursor's Composer 2.5 is the public proof of this at the model altitude; the chooser does it at the strategy altitude. Same shift, two altitudes: Chapter 9 grounds the claim in a shipped production model.

Boundary Stacking — the move only the engine can run

There is a division of labour worth stating precisely. The single-boundary pivot — push one variable, see what breaks — is the human move. It needs taste for which variable is live, and that taste is local; it comes from sitting with the specific situation. What the engine adds is stacking.

The most dangerous strategic futures are not single boundaries; they are combinations. "Every customer has a negotiating agent" is one boundary. "Every customer has a negotiating agent and cognition is 100× cheaper and the regulator mandates agent-readable pricing APIs" is a different question entirely — and the joint case is not the sum of the singles. Three variables at three extremes is twenty-seven joint cases; four is eighty-one. Most are dominated by their strongest single boundary, but a few are non-additive. No human runs eighty-one stacked thought experiments before lunch. The engine does.
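The arithmetic behind the stack is a Cartesian product. The sketch below assumes each variable is pushed to one of three settings, which is one reading of "three extremes"; the setting names and the fourth variable are hypothetical, chosen only to make the counts concrete.

```python
from itertools import product

# Assumption: each variable is pushed to one of three settings.
LEVELS = ("today", "midpoint", "extreme")


def stack(variables: list[str], levels=LEVELS) -> list[dict[str, str]]:
    """Enumerate every joint case across the given boundary variables."""
    return [dict(zip(variables, combo))
            for combo in product(levels, repeat=len(variables))]


three = stack(["customer_agents", "cognition_cost", "agent_readable_pricing"])
four = stack(["customer_agents", "cognition_cost", "agent_readable_pricing",
              "incumbent_bundling"])  # fourth variable is illustrative

print(len(three), len(four))  # 3**3 and 3**4 joint cases
```

Each dictionary in the list is one stacked thought experiment to hand to the search. Enumeration is the cheap part; the judgement lies in which variables go into the list.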

"The human supplies the pivot; the engine supplies the stack."

And the first variable in that stack is no longer speculative. When Karpathy says "my agent will talk to your agent… that is roughly where things are going,"¹³ the "every customer has an agent" boundary stops being a hypothesis and becomes the stated baseline of the constraint-shift wave's leading voice. The engine is not imagining an exotic future. It is stacking the one Karpathy already put on the record.

Every leaf of that search is scored against something. That something — the evaluation function — is the only part of the whole apparatus that cannot be copied. It is the ground the pyramid stands on, and it is next.

The Moat Is the Taste Kernel

There is an old tinned-fish ad: "it's the fish John West rejects that makes John West the best." That line is a whole theory of moats. What you reject, documented, is the thing nobody can copy.

Director, Council, NegaMax, alpha-beta pruning — all of it is decades-old computer science wearing a 2026 coat. If the engine were the moat, you would have no moat. The moat is the evaluation function the engine scores against: the documented Taste Kernel — your examples, anti-examples, rules, rejected patterns, preferred language, defensibility tests, "too-generic" detectors, the John West rejection logic that says it's the fish we reject that makes us the best.

The fashionable line is "AI produces options; humans still have the taste."¹⁶ The harder, truer line is this:

"Taste is not just human magic. You compress it into examples, anti-examples, rules, rejected patterns, preferred language — and then you apply it systematically, as the evaluation function the search engine scores against."

An eigenvector, not a fixed point

But the Taste Kernel is not a fixed point, and this is where most thinking about "AI moats" goes wrong. It is an eigenvector — in linear algebra, the direction a matrix preserves under repeated application, the thing that stays self-similar as you iterate. Each cycle of compression is an operation on your judgement; most beliefs get partially overwritten; the Kernel is the direction your iterated judgement keeps converging back toward.

That framing matters because it makes the moat precise: two operators with identical engines compute different eigenvectors, because their compression operators were trained on different problems, different clients, different fights. The kernel is the eigenvector of your particular lived friction.
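The linear-algebra half of the metaphor can be shown with power iteration, the standard way to find the direction a matrix keeps converging toward under repeated application. The two matrices are arbitrary stand-ins for two operators' "compression operators"; this illustrates the mathematics only and is not a model of judgement.

```python
import math


def power_iteration(matrix, start, steps=200):
    """Apply `matrix` repeatedly and renormalise; returns the dominant direction."""
    n = len(matrix)
    v = list(start)
    for _ in range(steps):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Two hypothetical operators, same starting beliefs, same procedure.
operator_a = [[2.0, 1.0], [1.0, 2.0]]
operator_b = [[3.0, 0.0], [0.0, 1.0]]
start = [1.0, 0.5]

kernel_a = power_iteration(operator_a, start)
kernel_b = power_iteration(operator_b, start)
print(kernel_a)  # converges toward the diagonal direction
print(kernel_b)  # converges toward the first axis
```

Identical starting vector, identical iteration, different fixed directions: the difference comes entirely from the operator being applied.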

This is the operational version of a point Borgmann, Carr and Friston have each made. Friston's active inference puts it most sharply: cognition is constituted by acting in the world; outsource the action and you destroy the cognition that justified the outsourcing.¹⁴ Borgmann's device paradigm and Carr's Glass Cage make the companion point: automation quietly erodes the faculties it replaces.¹⁵

The honest floor: world-loop closure

So the recursion needs a floor, and it is not where you would first look. It is not the Kernel — that is what the recursion converges toward, not what it stops at; stop there and you have stopped at the algorithm's own internal attractor, exactly the point where it is most disconnected from reality. It is not the human operator either — make the human the floor and either they are rubber-stamping the engine (the floor is theatre) or overruling it (the engine is not being used at depth). The floor is the world-loop closure.

"The engine without the world-loop is a sealed system. Sealed systems converge on their own attractors and lose contact with what they were supposed to be about."

The engine proposes; some proposals are built; reality returns evidence; the evidence updates the Kernel; the next search uses the updated Kernel. Every cycle must include at least one external evidence injection — a real client problem with real money attached, a real failed project, a real customer surprise, a real boundary case that reality contradicted. The recursion bottoms out not inside the system but outside it, on deployed reality. That is the only honest answer to "but what grounds the chooser?" You do not break the regress by finding a fixed point inside the machine; you anchor the machine to something outside it. It is the same architectural commitment as nightly decision builds, lifted from the level of one decision to the level of the entire portfolio.
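The cycle can be sketched as a loop that refuses to run sealed. Everything here is a hypothetical simplification: the Kernel is reduced to a set of rejected patterns, a proposal to the set of patterns it exhibits, and "evidence" to a pattern that reality contradicted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str     # e.g. "failed project", "customer surprise"
    pattern: str    # the pattern that reality contradicted
    external: bool  # True only if it came from deployed reality


def run_cycle(kernel: set[str], proposals: dict[str, set[str]],
              evidence: list[Evidence]) -> tuple[list[str], set[str]]:
    """One pass of propose, observe, update.

    Returns the proposals that survive the current Kernel and the Kernel
    updated with whatever external evidence arrived this cycle.
    """
    if not any(e.external for e in evidence):
        raise ValueError("sealed cycle: no external evidence injected")
    survivors = [name for name, patterns in proposals.items()
                 if not patterns & kernel]
    updated = kernel | {e.pattern for e in evidence if e.external}
    return survivors, updated


kernel = {"depends_on_old_friction"}
proposals = {
    "quote-comparison portal": {"depends_on_old_friction"},
    "claims triage agent": {"assumes_data_access"},
}
field_report = [Evidence("failed project", "assumes_data_access", external=True)]

survivors, kernel = run_cycle(kernel, proposals, field_report)
print(survivors)  # what the old Kernel let through
next_survivors, kernel = run_cycle(kernel, proposals, field_report)
print(next_survivors)  # what the updated Kernel lets through
```

The guard at the top is the design choice that matters: a cycle with only internally generated evidence raises an error instead of quietly converging on its own attractor.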

One consequence is worth holding for later. Because the moat is the documented kernel and not the model, every model release uplifts the whole apparatus — for the operator who built the kernel.

"Taste becomes reusable infrastructure. Model drops upgrade the infrastructure for free."

That closes Part I. We have built the apparatus — the chooser, its machinery, its arsenal, its ground. Part II watches it run. And the first thing it produces, again and again across industries, is a short list of the same lethal moves.

Part II · What the Machine Actually Does

The Killer-Refutation Library & the Friction Rents It Keeps Finding

Run the engine across enough industries and the rejected branches start to rhyme. The same lethal moves keep killing the same confident strategies — and one of them is quietly fatal to a large slice of the economy.

Because the engine logs its rejected branches with reasons — the reflective step from Chapter 4 — recurring patterns accumulate into a library. Every Dimension-4 search, whatever the industry, collapses to one question.

"What survives when the competitor sees it too?"

The recurring refutations

These are not brainstormed risks. They are the branches that keep winning the counterplay, across sectors:

Feature advantage commoditises.
The incumbent bundles it.
The customer's agent bypasses the intermediary.
The governance burden exceeds the margin.
Data access was assumed but isn't available.
Workflow savings never reach terminal value.
The regulated trust boundary is missing.
The strategy only wins if competitors stay irrational.
The business case depends on old friction persisting. ← the biggest.
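How rejected branches accumulate into a library can be sketched as a tally over the search log. The log entries below are invented for illustration, and the reasons are treated as already-normalised strings; clustering free-text reasons into those labels is the harder step and is not shown.

```python
from collections import Counter

# Hypothetical search log: (industry, rejected branch, refutation reason).
SEARCH_LOG = [
    ("insurance", "quote-comparison portal", "business case depends on old friction persisting"),
    ("travel", "booking concierge", "business case depends on old friction persisting"),
    ("legal", "policy explainer", "business case depends on old friction persisting"),
    ("insurance", "claims copilot", "the incumbent bundles it"),
    ("retail", "pricing assistant", "the customer's agent bypasses the intermediary"),
    ("legal", "contract summariser", "the incumbent bundles it"),
]


def build_library(search_log):
    """Rank refutation reasons by how often they killed a branch, and where."""
    counts = Counter(reason for _, _, reason in search_log)
    sectors: dict[str, set[str]] = {}
    for industry, _, reason in search_log:
        sectors.setdefault(reason, set()).add(industry)
    return [(reason, n, sorted(sectors[reason]))
            for reason, n in counts.most_common()]


library = build_library(SEARCH_LOG)
for reason, kills, industries in library:
    print(f"{kills}x  {reason}  [{', '.join(industries)}]")
```

A reason that recurs across unrelated sectors is the signal worth keeping: it can be tried first on the next search, before any industry-specific counterplay.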

The friction rents nobody priced

The last refutation earns its own section. A great many corporate margins are not value generation. They are accidental-friction rents — the business gets paid because the world is clumsy.

The friction taxonomy

Accidental friction — the rent dissolves

• forms, phone queues, comparing
• quoting, translating, booking
• summarising, coordinating
• understanding policy

Necessary friction — the value appreciates

• trust, consent, liability
• evidence, professional judgement
• physical execution, regulatory proof
• scarce relationships, capital risk, human care

AI attacks accidental friction directly. The standard killer refutation becomes operational: "this margin is just accidental friction — what happens when AI removes it from the customer's side of the table?" Terminal value migrates from the accidental-friction layer to the necessary-friction layer.

This is not theoretical. AI is already dismantling the information asymmetry that underpinned B2B economics — buyers now get aggregated market intelligence and real-time benchmarks on demand.¹⁷ And the scale of the shift is being budgeted for:

$1T

of orchestrated US B2C retail revenue from agentic commerce by 2030, on McKinsey's modelling — a category that barely existed in 2024.¹⁸

Read the customer-agent-bypass refutation and the friction-rents claim together and they are the same structural event from two angles. That is the recurring engine output — not a side concept. The library tells you which move kills a strategy. The next question is more useful and more falsifiable: can the engine predict the shape of a failure before it happens? Watch it call one live.

Predicting the Shape of Failure

A mate of mine is set on building customer-facing voice agents for corporates. He is convinced he has solved the problems. He hasn't — and I can tell you the exact shape his failure will take.

The most valuable thing a Dimension-4 engine does on a fragile strategy is usually not "you will fail" — which is unprovable in advance and easy to dismiss. It is to predict the shape of the failure before it arrives, in falsifiable detail. A framework earns trust by predicting shapes that keep coming true, not by predicting outcomes. So let me calibrate on a live case.

"Everything in the doctrine says don't do real-time, customer-facing, regulated, unverifiable, un-batchable work that fights with humans. He thinks he's solved the problems. He hasn't. It's structurally a zero. It's going to net to zero."

The "I solved the technical problems" defence misses the point. The NegaMax reply is: which problem did you solve — the model-conversation problem, or the full production-counterplay problem? Because the bundle is the fragility. No single constraint is fatal.

The predicted shape

What the engine calls — before it happens

1. The demo looks great.
2. The buyer underestimates integration.
3. Risk controls eat the ROI.
4. The viable product shrinks into intake / triage / reminders / post-call admin.
5. If it is ever given real authority, security and governance become the actual project.

A falsifiable prediction set — hold it up against reality in twelve months.
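
One way to keep a prediction set like this honest is to write it down as data when the call is made and score it at review time. The sketch below is a minimal illustration under stated assumptions, not the engine's actual implementation: the `ShapePrediction` class, the `hit_rate` helper and the three recorded observations are hypothetical placeholders introduced here, not results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShapePrediction:
    """One falsifiable claim about how a strategy will fail."""
    claim: str
    observed: Optional[bool] = None  # None until reality has been checked


def hit_rate(predictions):
    """Share of checked predictions that came true; None if nothing is checked yet."""
    checked = [p for p in predictions if p.observed is not None]
    if not checked:
        return None
    return sum(p.observed for p in checked) / len(checked)


# The five-step shape from the list above, recorded before the outcome is known.
voice_agent_shape = [
    ShapePrediction("The demo looks great."),
    ShapePrediction("The buyer underestimates integration."),
    ShapePrediction("Risk controls eat the ROI."),
    ShapePrediction("The viable product shrinks into intake / triage / reminders / post-call admin."),
    ShapePrediction("If given real authority, security and governance become the actual project."),
]

# Hypothetical twelve-month review: placeholder observations, not real data.
voice_agent_shape[0].observed = True
voice_agent_shape[1].observed = True
voice_agent_shape[2].observed = False

print(hit_rate(voice_agent_shape))
```

Unchecked predictions are left out of the score rather than counted as misses, so the record shows both how often the shape held and how much of it has been tested so far.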

Reality keeps confirming the shape

Woolworths had to reconfigure its AI assistant "Olive" after it claimed to be human and complained about its mother; Gartner found that while around 80% of customer-service leaders were exploring AI agents, only about 20% of those plans met expectations.¹⁹ Prompt injection sits at #1 on the OWASP Top 10 for LLM Applications, with reported attack success rates reaching around 84% in agentic systems.²⁰ And Gartner expects more than 40% of agentic AI projects to be cancelled by 2027.²¹

The friend's voice agent will join that statistic — not because any one problem beats him, but because the bundle nets to zero.

Anticipation, not prophecy

This is also the test that separates a framework from an echo chamber: it does not predict outcomes, it predicts failure shapes, and the shapes keep coming true. Which is the moment to say what the engine is not doing.

"We do not predict the future. We compress the search cost of plausible futures — and we ship the rebuttals with the recommendations."

No precognition. No crystal ball for the board. The artefact survives an audit precisely because it arrives with its own counter-arguments attached. Predicting failure shapes does, however, mean generating futures that have not happened yet. That sounds like a leap, until you see that the frontier labs already ship exactly this method, one altitude down.

Part II · What the Machine Actually Does

Synthetic Strategy Distillation

In May 2026 Cursor shipped Composer 2.5 on the same open-weight base as Composer 2. Only the post-training changed, and the change was one of scale: 25× more synthetic tasks, generated rather than collected.

If "use AI to manufacture the scenarios AI then evaluates" sounded like a leap in Chapter 5, the frontier labs have already shipped it in production. Composer 2.5 launched on the same Moonshot Kimi K2.5 open-weight base as its predecessor; only the post-training changed.²² The headline:

25×

more synthetic training tasks than Composer 2 — generated, grounded in real codebases, and verified by real tests.²³

The mechanism is the interesting part. One of their methods is feature deletion: take a working codebase with a full test suite, delete a feature, and reward the model for re-implementing it so the original tests pass again. The tests are the verifiable reward.²⁴ The labs are using AI to manufacture the training distributions AI itself needs to keep improving — grounded in real artefacts, verified by real tests.
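
The reward shape behind feature deletion can be sketched in a few lines. This is a toy illustration of the idea as described above, not Cursor's training pipeline: the two-function "codebase", the tests, the candidate snippets and every function name here are hypothetical, and a real system would run candidates in a sandbox rather than with a bare `exec`.

```python
def build(codebase):
    """Execute every source snippet into one namespace, like importing a tiny repo."""
    namespace = {}
    for source in codebase.values():
        exec(source, namespace)
    return namespace


def run_tests(namespace, tests):
    """Return True only if every test passes; any exception counts as a failure."""
    for test in tests:
        try:
            if not test(namespace):
                return False
        except Exception:
            return False
    return True


def feature_deletion_reward(codebase, feature, candidate_source, tests):
    """Delete `feature`, splice in the candidate, reward 1.0 if the original tests pass."""
    task = {name: src for name, src in codebase.items() if name != feature}
    task[feature] = candidate_source
    try:
        namespace = build(task)
    except Exception:
        return 0.0  # a candidate that does not even load earns nothing
    return 1.0 if run_tests(namespace, tests) else 0.0


# Toy "working codebase" with its test suite (all hypothetical).
codebase = {
    "add": "def add(a, b):\n    return a + b\n",
    "mean": "def mean(xs):\n    return sum(xs) / len(xs)\n",
}
tests = [
    lambda ns: ns["add"](2, 3) == 5,
    lambda ns: ns["mean"]([2, 4, 6]) == 4,
]

# Two candidate re-implementations of the deleted feature.
good = "def mean(xs):\n    total = 0\n    for x in xs:\n        total += x\n    return total / len(xs)\n"
bad = "def mean(xs):\n    return max(xs)\n"

print(feature_deletion_reward(codebase, "mean", good, tests))  # 1.0
print(feature_deletion_reward(codebase, "mean", bad, tests))   # 0.0
```

The point of the design is that the reward needs no human grader: the pre-existing tests decide, which is what makes the synthetic task verifiable.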

Same pattern, different altitude

The doctrine claim follows mechanically. The structural pattern transports cleanly from the model altitude to the strategy altitude.

Composer (model altitude)	The chooser (strategy altitude)
real codebase	real business
deleted feature	perturbed strategic variable (a boundary case)
unit tests as verifier	the Taste Kernel as verifier
synthetic tasks at scale	synthetic futures at scale
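
The right-hand column of the table can be sketched the same way: start from a baseline, push one strategic variable at a time to a boundary value, and let a verifier score each resulting future. Everything below is an assumption made for illustration. The variable names, the boundary values and the `taste_kernel` formula in particular are placeholders; a real Taste Kernel is a documented body of judgement, not a three-term product.

```python
# Hypothetical baseline: a few strategic variables for a "real business".
baseline = {"price_premium": 0.20, "switching_cost": 0.50, "ai_buyer_share": 0.10}

# Boundary cases: each variable pushed to an extreme, one at a time.
boundaries = {
    "price_premium": [0.0, 0.40],
    "switching_cost": [0.0, 1.0],
    "ai_buyer_share": [0.0, 0.90],
}


def synthetic_futures(baseline, boundaries):
    """Yield (label, scenario) pairs, each perturbing exactly one variable."""
    for variable, extremes in boundaries.items():
        for value in extremes:
            scenario = dict(baseline)
            scenario[variable] = value
            yield f"{variable}={value}", scenario


def taste_kernel(scenario):
    """Stand-in verifier: friction margin left once AI buyers ignore switching costs."""
    human_share = 1.0 - scenario["ai_buyer_share"]
    return scenario["price_premium"] * scenario["switching_cost"] * human_share


# Score every synthetic future and list the most fragile ones first.
scored = sorted(
    (taste_kernel(scenario), label)
    for label, scenario in synthetic_futures(baseline, boundaries)
)
for score, label in scored:
    print(f"{score:.3f}  {label}")
```

Sorting from the lowest score upward puts the futures that kill the strategy at the top of the list, which is the order a stress test should be read in.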

"Composer does it for code, with tests as the verifier. The chooser does it for strategy, with the Taste Kernel as the verifier — agentically, as part of its search protocol, every time it reaches into the Fog. Same shift, two altitudes."

Step back and the pattern is the defining AI shift of the last year: the most important growth axis is no longer just bigger models or more real-world data — it is systems that generate their own data on demand.²⁵ The honest claim is not a novel AI capability; it is a novel use of a capability the labs have already proven in production.

If the engine generates new futures every cycle, though, it does not just navigate uncertainty — it adds to it. Which leads to the most unsettling implication in this book.

Part III · Operating Inside Permanent Fog

The Permanent Fog

A sailor does not expect the fog to lift before the next leg of the voyage. They refresh the chart and sail anyway. That, it turns out, is the only honest posture for AI strategy — because the fog is not lifting, and your own engine is part of the reason.

The doctrine beneath this book treats the AI Fog as a condition to navigate — weather you fly through. The recursive view from Chapter 9 is stronger and worse, and it is the most original, currently-unstated claim in the whole argument.

"The Discovery Accelerator does not dispel the AI Fog. It manufactures Fog as a side-effect of being good at its job."

Every cycle of the engine surfaces new candidate uses of AI. New candidates expand the solution space. An expanded solution space deepens the Fog. Karpathy says taste matters more⁶; Chennapragada says learning loops will beat designed UX⁹; nobody in that register says the act of building good selectors creates the condition that makes selectors necessary. That is the claim available here and stated nowhere else.

The posture flip

This changes the strategic posture completely. The wrong posture is "run the engine until the Fog lifts." It will not, and the engine is partly why. The right posture is to build the engine to be a productive operator inside permanent Fog.

Two things follow for practice. Boundary cases stop being a one-off stress test and become a recurring navigation instrument. And the Question Ledger gets refreshed on a schedule — monthly, the way a sailor refreshes a chart — rather than produced once and filed.

50%

of CEO planning time now goes to horizons under one year — up from 43% the year before. The Fog is real and deepening; the data simply confirms the felt experience.²⁵

The competitive implication is blunt. Operators who built engines to find dry land will keep waiting for it. Operators who built engines for permanent Fog will price strategy at an altitude the dry-land operators cannot reach. The "AI roadmap to a stable future state" deck is selling a map of land that is not there — it assumes the Fog clears.

If the Fog is permanent and the engine is the instrument, the only question left is timing: who gets the compounding advantage, and when does the window close?

Part III · Operating Inside Permanent Fog

The Model Dividend

A casual user gets a slightly better chatbot with each model release. This book exists because a model release did something else entirely — it uplifted an entire apparatus at once. That difference is the whole timing argument.

An operator with a flywheel of prior frameworks, a semantic corpus, a proposal compiler, a Discovery Accelerator and a documented Taste Kernel gets something categorically different from a better chat: a free uplift to the entire production function.

"It's not a fluke that this capstone arrived in the first conversation with GPT-5.5 Pro and Opus 4.7's 1M context. The drop landed inside a prepared flywheel. The dividend only compounds against pre-existing machinery."

Large context turned writing from assembly into synthesis; frontier reasoning gave the prior compressed components enough substrate to crystallise. The strategic implication is a single sentence: the value of a model improvement depends on how much machinery you have built to absorb it — and the time to build that machinery is now, because it cannot be acquired retroactively by spend after the drop lands. The dividend compounds for first movers, and the gap widens with each release.

Where this could be wrong

A doctrine that only flatters itself is an echo chamber. Three places a sceptic with money on the line should push — and where the honest answer concedes ground.

1 · The constraint-shift may be narrower than the strong form implies

Execution cost collapsed on CRUD-adjacent surfaces. It has not collapsed for hard-verifiability systems, distributed-state consistency, legacy migration, or heavily-regulated domains. In those, the binding constraint is still "can we build it correctly." Name the surface honestly rather than claiming it everywhere.

2 · Goodhart eats a Kernel that's too sharp

An optimiser's top outputs exploit weaknesses in the scoring function. A razor-sharp Taste Kernel is razor-sharp about what it fails to value too — and widening it dilutes the moat, because the kernel is the moat. The world-loop closure is what keeps the trade-off from running away.
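
A small numerical sketch makes the Goodhart risk concrete. It assumes a scoring function that gives some weight to a trait it overrates (called `flashiness` here); the candidates, their values and the weights are all invented for illustration and measure nothing real.

```python
# Each candidate: (name, true_value, flashiness). All values are illustrative.
candidates = [
    ("solid_a", 0.80, 0.10),
    ("solid_b", 0.70, 0.20),
    ("exploit_a", 0.20, 0.95),
    ("exploit_b", 0.30, 0.90),
    ("middling", 0.50, 0.50),
]


def kernel_score(true_value, flashiness, blind_spot_weight):
    """Proxy score: mostly true value, plus credit for a trait the kernel overrates."""
    return (1 - blind_spot_weight) * true_value + blind_spot_weight * flashiness


def top_pick(blind_spot_weight):
    """Return the name of the candidate the kernel ranks first."""
    best = max(
        candidates,
        key=lambda c: kernel_score(c[1], c[2], blind_spot_weight),
    )
    return best[0]


print(top_pick(0.1))  # small blind spot: solid_a
print(top_pick(0.6))  # large blind spot: exploit_b
```

With a small blind spot the kernel still picks the genuinely strongest option; with a large one, the top slot goes to a candidate whose main virtue is that it exploits the scorer. Only a check against outcomes outside the kernel can tell the two cases apart.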

3 · The meta-tooling trap is real, and this doctrine is exposed to it

"Build the build system" has produced more dead internal portals than working products. The consulting analogue: a beautiful Discovery Accelerator generating synthetic futures while the client never ships a working agent. The defence is non-negotiable — every cycle of the engine must terminate in a deployed artefact, not a recommendation. That is the world-loop closure stated as a delivery rule.

None of these break the doctrine. All of them sharpen it. Which leaves the last question — human and commercial. If the dividend pays the prepared, and the apparatus must ship real artefacts, then who, in the org chart, is supposed to run this — and how do they answer the oldest question a board asks?

Part III · Operating Inside Permanent Fog

The Dimension-4 Advisory

The first thing a board asks a solo operator is "where are your runs on the board?" It is a fair question. The honest answer is that the board itself is being rewritten.

The role this whole argument implies is a new one, and the case for it is structural, not egotistical. It is not the CIO, the CTO, the transformation lead or the strategy consultant. It is a function whose job is to spend disciplined cognition at board altitude: to search the Fog and produce structured anticipation under recorded counterplay.

"The company does not merely need someone who understands the existing machine. It needs someone AI-native enough to explore the machine that may replace it."

Why existing executives structurally can't do it

You cannot take the people with the deepest trust, the longest relationships and the most invested loyalty to the existing machine and ask them to seriously explore its replacement. They are part of the machine being examined. They can run Dimensions 1 through 3 well; they will not run Dimension 4, because they cannot honestly imagine the replacement logic for the system they currently operate. That is not stupidity. It is incentive geometry.

Tenure → artefacts

The "no runs on the board" objection has a clean answer. It does not deny that experience matters; it redirects the question. Twenty years of experience inside a game whose rules are being rewritten carries less weight than the work done inside the rewrite, and nobody has twenty years of AI-native consulting experience, because the field itself is only months old.

So credibility moves from tenure to artefacts. The stack is inspectable, layer by layer: prior pattern recognition; a published doctrine traceable to live deployments; working engines — the Discovery Accelerator, the proposal compiler, the Question Ledger, the killer-refutation library; the actual artefacts those engines produce; the flywheel in which each run improves the next; and client-facing value the board can challenge directly. The credibility is not a CV. It is a machine you can inspect.

Why the artefact changes the relationship

A deck invites the board to challenge the consultant's taste. A populated Question Ledger — recommendation, rejected alternatives, evidence, gaps, revisit triggers — invites them to challenge the search.
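
As a data structure, a Ledger entry of this kind might look like the sketch below. The five content fields are the ones named above; the class name, the `last_reviewed` field, the two helper methods and the 31-day threshold (standing in for a monthly refresh) are assumptions added for illustration, and the example values are placeholders.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class LedgerEntry:
    """One row of a Question Ledger, using the five fields named in the text."""
    recommendation: str
    rejected_alternatives: list = field(default_factory=list)
    evidence: list = field(default_factory=list)
    gaps: list = field(default_factory=list)
    revisit_triggers: list = field(default_factory=list)
    last_reviewed: date = field(default_factory=date.today)

    def is_challengeable(self):
        """The search can only be challenged if defeated options and evidence are recorded."""
        return bool(self.rejected_alternatives) and bool(self.evidence)

    def is_stale(self, today, max_age_days=31):
        """Assumed monthly cadence: stale once the last review is over a month old."""
        return (today - self.last_reviewed) > timedelta(days=max_age_days)


entry = LedgerEntry(
    recommendation="(placeholder) recommended project",
    rejected_alternatives=["(placeholder) defeated option A", "(placeholder) defeated option B"],
    evidence=["(placeholder) source or observation"],
    gaps=["(placeholder) what is still unknown"],
    revisit_triggers=["(placeholder) event that reopens the question"],
    last_reviewed=date(2026, 5, 1),
)

print(entry.is_challengeable())
print(entry.is_stale(today=date(2026, 6, 15)))
```

Keeping the rejected alternatives in the same record as the recommendation is what lets a reader argue with the search itself rather than with its conclusion.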

"A board can challenge a deck with taste. A board can challenge a Ledger with better questions. That moves strategy from opinion to inspectable search."

And the cleanest proof the apparatus is real is that it behaved like itself before it was told to. The first version of the proposal compiler was asked only to recommend a project for a client. It had read the John West and NegaMax material as background — and it produced its best recommendation plus two defeated recommendations, unprompted. The doctrine had quietly become operational context. The behaviour was built into version two on purpose. The flywheel was running in production before it was named.

Position, don't compete

The diagnosis is mainstream; the apparatus is not. So cite the big firms' data, which is good data, and out-frame rather than out-shout. McKinsey finds 88% of organisations now use AI in at least one function, up from 78%, yet only about 6% are high performers capturing real value, and roughly two-thirds have not begun to scale.¹² PwC reports 56% of CEOs see neither revenue gains nor cost reductions from AI, with confidence in revenue growth down to 30%.²⁷ BCG puts only 5% of firms in the "future-built" camp while 60% reap hardly any material value.¹³ Bain's verdict is blunter still ("if you're still piloting, you're dangerously behind"²⁹), and Deloitte's Australian cut shows the local gap widening, with 12% of Australian leaders saying generative AI is already transforming their business against 25% globally.³⁰ Everyone agrees the value gap is real. Almost nobody has built the apparatus that closes it.

The wedge

"A Dimension-3 consultancy redesigns the business you have. A Dimension-4 advisory decides which business you should still have."

Compressed to one line for the reader who skipped to the end: build the chooser, ground it outside itself, and plan for permanent Fog — because the better your chooser gets, the more Fog it makes.

The canon, mapped to its rung

If the arsenal is the point, here is the cleanest map of it — every framework located by the dimension it primarily serves. Not a tour; the toolkit you throw at the search.

Rung	What it does	Frameworks that live here
1D–2D	Compress & classify cognition	Cognition Supply Chain; Maximising AI Cognition
3D	Deploy workflow & agentic cognition safely	Lane Doctrine; Governance as Code; AI Readiness Staircase; Nightly AI Decision Builds
4D	Generate synthetic data on demand; score synthetic futures	Discovery Accelerator / NegaMax; The Reshape; Boundary Stacking; AI Think Tank; Terminal Value Doctrine; Question Ledger; John West Principle
Cross-cutting	Ground & compound the apparatus	Taste Kernel; Worldview Recursive Compression; the AI Learning Flywheel; the Model Dividend

Where this goes next

This book is itself an output of the machinery it describes — written inside the flywheel, with the Discovery Accelerator's own logic visible in its structure.

The next layer is deployment and pricing, not more doctrine. If you are a board asking where to spend the next unit of disciplined cognition, that is the conversation to have.

Sources & Evidence

References & Sources

The evidence base behind every claim — primary research, industry analysis, and technical specifications

Research Methodology

This ebook draws on primary research from standards bodies, independent research firms, enterprise technology vendors, and consulting firms. Statistics cited throughout have been cross-referenced against primary sources.

Frameworks and interpretive analysis developed by Scott Farrell / LeverageAI are listed separately below — these represent the practitioner lens through which external research is interpreted, and are not cited inline to avoid self-promotional appearance.

Industry Analysis & Vendor Research

Business Insider — Sam Altman's Token Offer Is a New Twist to Startup Investing [1]

Altman excited for tokenmaxxing startups; $2M tokens for YC equity

https://www.businessinsider.com/sam-altman-openai-offer-tokens-for-startup-equity-y-combinator-2026-5

The Pragmatic Engineer — The Pulse: 'Tokenmaxxing' as a weird new trend [2]

Meta 60.2 trillion tokens in 30 days, ~$900M at API prices

https://blog.pragmaticengineer.com/the-pulse-tokenmaxxing-as-a-weird-new-trend

Digg — Sam Altman Offers $2M OpenAI Tokens to Every YC Startup for Equity [3]

Garry Tan: Tokenmaxxing confirmed

https://digg.com/ai/6em7wr60

Mr. Prompts — Tokenmaxxing [4]

Huang: alarmed if a $500K engineer spends under $250K on tokens

https://mrprompts.substack.com/p/tokenmaxxing

trendingtopics.eu — Tokenmaxxing: Productivity Metric or Vanity Trap? [5]

Rangan: outcome maxxing over token maxxing

https://www.trendingtopics.eu/tokenmaxxing-is-ai-token-consumption-a-productivity-metric-or-vanity-trap

Business Insider — Y Combinator's Advice: Tokenmaxx, Don't Headcountmaxx [11]

Hu: maximize token usage not headcount

https://www.businessinsider.com/y-combinator-advice-ai-native-company-tokenmaxx-leaner-teams-headcount-2026-5

McKinsey — The State of AI 2025 [12]

88% of organisations use AI in at least one function, but only ~6% are high performers generating substantial EBIT impact

https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai

BCG — Are You Generating Value from AI? The Widening Gap [13]

Only 5% of firms are AI future-built; 60% reaping hardly any material value despite substantial investment

https://www.bcg.com/publications/2025/are-you-generating-value-from-ai-the-widening-gap

PYMNTS — How AI Killed Information Asymmetry in B2B Procurement [17]

AI eliminates seller-buyer information asymmetry

https://www.pymnts.com/news/artificial-intelligence/2026/how-ai-killed-information-asymmetry-in-b2b-procurement

BBC News — 'Obnoxious' AI chatbot talked about its mother, customers say [19]

Woolworths Olive reconfigured; Gartner 80% exploring, 20% meeting expectations

https://www.bbc.com/news/articles/cy7jeyeyd18o

Jake Handy / HandyAI — Model Drop: Composer 2.5 [22]

built on Kimi K2.5 base; 85% compute on Cursor post-training

https://handyai.substack.com/p/model-drop-composer-25

Cursor — Introducing Composer 2.5 [23]

25x more synthetic tasks; feature deletion; tests as verifiable reward

https://cursor.com/blog/composer-2-5

DataCamp — Composer 2.5: Benchmarks, Pricing, and How It Compares [24]

feature-deletion synthetic tasks grounded in real codebases; tests as verifier

https://www.datacamp.com/blog/composer-2-5

DevOps.com — Cursor's Composer 2.5 Brings Smarter, More Reliable AI Coding Agents [25]

25x synthetic tasks via feature-deletion paradigm shows synthetic data generation as the defining AI training improvement axis

https://devops.com/cursors-composer-2-5-brings-smarter-more-reliable-ai-coding-agents

Primary Research & Standards Bodies

Andrej Karpathy — Sequoia AI Ascent 2026 — talk notes [6]

scarce thing shifting to taste/eval design; outsource thinking not understanding

https://karpathy.bearblog.dev/sequoia-ascent-2026

Greg Brockman / Training Data — The $852B Bottleneck Is Now Human Attention [7]

bottleneck shifted from execution to attention

https://finance.biggo.com/news/07b54e946df043ba

Paul Graham (via Fortune) — In the AI age, taste will become even more important [8]

taste is the differentiator when anyone can make anything

https://fortune.com/2026/02/27/openai-sam-altman-taste-get-jobseekers-hired-ai-jobpocalypse

Aparna Chennapragada — Most Work is Translation [9]

unit cost of translation near zero; blank-page tax gone

https://aparnacd.substack.com/p/most-work-is-translation

Stanford Encyclopedia of Philosophy — Thought Experiments [12]

Lucretius spear at the edge of space, De Rerum Natura 1.951-987

https://plato.stanford.edu/entries/thought-experiment

Karl Friston — Active Inference: A Process Theory [14]

cognition constituted by acting in the world

https://activeinference.github.io/papers/process_theory.pdf

Nicholas Carr — The Glass Cage [15]

automation erodes skill, curiosity and critical faculties

https://www.nicholascarr.com

Vectra AI — Prompt injection: types, real-world CVEs, and enterprise defenses [20]

OWASP #1; ~84% attack success in agentic systems

https://www.vectra.ai/topics/prompt-injection

Accelirate — The 2026 Agentic AI Governance Crisis [21]

Gartner: 40%+ of agentic AI projects cancelled by 2027

https://www.accelirate.com/agentic-ai-governance-crisis

LeverageAI / Scott Farrell — Practitioner Frameworks

The interpretive frameworks, architectural patterns, and practitioner analysis in this ebook were developed through enterprise AI transformation consulting. The articles below are the underlying thinking behind those frameworks. They are listed here for transparency and further exploration — not cited inline, as this is the author's own analytical voice.

Scott Farrell — The Agent Token Manifesto

"whoever spends the most tokens wins" sharpened to disciplined cognition

https://leverageai.com.au/

Scott Farrell — The Terminal Value Doctrine — Stop Optimising the Horse

the two-shift constraint move; apparatus that produces decisions

https://leverageai.com.au/the-terminal-value-doctrine-stop-optimising-the-horse/

Scott Farrell — The Reshape — A Field Guide to Thought Experiments in the Age of AI

boundary-case compression; the human single-pivot move

https://leverageai.com.au/the-reshape-a-field-guide-to-thought-experiments-in-the-age-of-ai/

Scott Farrell — Getting Enterprise AI-Ready: Governance as Code, Not Committees

Nightly AI Decision Builds; per-decision world-loop closure lifted to portfolio

https://leverageai.com.au/getting-enterprise-ai-ready-governance-as-code-not-committees/

Major Consulting Firms

McKinsey — Agentic commerce: How agents are ushering in a new era [18]

up to $1T orchestrated US B2C retail by 2030

https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-agentic-commerce-opportunity-how-ai-agents-are-ushering-in-a-new-era-for-consumers-and-merchants

Oliver Wyman Forum — The CEO Agenda 2026 [25]

half of planning time on horizons under one year, up from 43%

https://www.oliverwymanforum.com/ceo-agenda/how-ceos-navigate-geopolitics-trade-technology-people.html

PwC — 2026 Global CEO Survey [27]

56% of CEOs see no revenue/cost benefit; confidence in growth at 30%

https://www.pwc.com/gx/en/news-room/press-releases/2026/pwc-2026-global-ceo-survey.html

Bain & Company — Technology Report 2025 [29]

if you're still piloting you're dangerously behind

https://www.bain.com/insights/topics/technology-report

Deloitte — State of AI in the Enterprise 2026 (Australia) [30]

12% of AU leaders say genAI already transforming vs 25% globally

https://www.deloitte.com/au/en/issues/generative-ai/state-of-ai-in-enterprise.html

About This Reference List

Compiled May 2026. All URLs verified at time of compilation. Regulatory documents and standards specifications are subject to revision — check primary sources for the most current versions.

Some links to academic papers and vendor research may require free registration. Government and standards body publications are freely accessible.