The Cognition Dimension Ladder — Why Your AI Strategy Is One Rung Too Low

Scott Farrell · June 18, 2026 · scott@leverageai.com.au
AI Strategy · The Operating System Beneath the Doctrine

Why your AI strategy is one rung too low — and why the highest use of AI is to build the thing that tells you what to build.

Scott Farrell, LeverageAI · Sequel to The Terminal Value Doctrine · ~28 min read

There is a meme going around the labs called tokenmaxxing — spend the most tokens, win. I wrote the cruder version of it months ago: whoever spends the most tokens wins. It was too blunt to publish and directionally right. When thought becomes cheap, spending more of it is more valuable. But the meme, and my own first cut of it, both miss the load-bearing word.

The word is disciplined. Whoever spends the most tokens wins — provided they’re spending them inside an apparatus that knows what to reject. That is not workflow optimisation. That is churning intelligence to compress your industry’s future into the present, and then reaching into that future to find out what it looks like before your competitors do. The burn is not the wedge. The apparatus is the wedge.

This article is about that apparatus: what it is, why it sits one dimension above where almost every AI strategy is currently being written, and why — once you see the ladder — the highest-leverage thing you can build with AI is not a product, a workflow, or a chatbot. It is the chooser.

It assumes The Terminal Value Doctrine as background and sits underneath it. The doctrine names the selection rule for an AI portfolio under cheap cognition. This is the operating system beneath the doctrine — the machine that produces the selection, the evaluation function that grounds the machine, and the loop that grounds the evaluation function.


Move 1 · The debate everyone is having

The tokenmaxxing argument names the symptom, not the structure

Between February and May 2026 a workplace meme detonated, spelled with the double-x of looksmaxxing and moneymaxxing: tokenmaxxing. The proponents are not fringe. Sam Altman, posting in May: “i am excited to see what will happen with tokenmaxxing startups… happy building!”[6], alongside an offer of $2M in tokens to every startup in the current YC batch. Y Combinator’s Diana Hu put it as advice: “Maximizing token usage, not head count, will be the critical shift. The best companies will be the ones that are tokenmaxxing.”[7] YC’s CEO Garry Tan replied in two words: “Tokenmaxxing confirmed.”[8] Jensen Huang went further on the All-In podcast: “I would be deeply alarmed if a $500K engineer spends less than $250K on tokens.”[9] The scale is real: Meta employees burned 60.2 trillion tokens in thirty days, which at retail API prices would run to roughly $900M.[13]

Then came the pushback, and it was sharp. HubSpot’s CEO Yamini Rangan: “Outcome maxxing >> token maxxing.”[10] Appian’s Matt Calkins was crueller: tokenmaxxing is “the Soviet practice of judging the quality of chandeliers by their weight.”[11] The New York Times filed it under Silicon Valley’s newest form of conspicuous consumption.[12]

Both sides are right and both sides are incomplete. The proponents are right that more cognition matters — but they haven’t said what shape of spending compounds. The critics are right that raw burn is gameable — but they haven’t said what the operational alternative is. The doctrine sits one step beyond both: the answer is neither “spend more” nor “measure outcomes,” it is disciplined cognition inside an apparatus that knows what to reject. The proponents named the directional claim. The critics named the symptom. Neither named the structure. The operators who have quietly been building the structure were never in the debate at all.

Whoever spends the most tokens wins. When thought becomes cheap, spending more of it is more valuable — provided you’re spending it inside an apparatus that knows what to reject.

Move 2 · The constraint has moved twice

From “how to build” to “what to build” to “what builds the answer”

Four voices, late 2025 to early 2026, said the same thing from very different chairs. Andrej Karpathy, on stage at Sequoia’s AI Ascent in San Francisco (the venue where Pat Grady opened with “not faster horses, but cars; and the cars have arrived,” and Sonya Huang declared 2026 the year of agents[5]): “The scarce thing is shifting… More scarce: understanding, taste, eval design, security, system boundaries, agent orchestration, domain-specific feedback loops, and knowing when the model is off the rails.” He closed on a line worth pinning to a wall: “You can outsource your thinking, but you can’t outsource your understanding.”[1] Greg Brockman, on the Training Data podcast a week later, framed it economically: autonomous coding tools jumped from writing 20% to 80% of code in a single month, and the cost of a prototype has collapsed to near zero (what took a week now takes minutes), so “the bottleneck has shifted from execution to attention… to decisions about what is worth doing.”[2] Paul Graham, in a February X post that ran to 1.32M views: “In the AI age, taste will become even more important. When anyone can make anything, the big differentiator is what you choose to make.”[3] Aparna Chennapragada, six months ahead of all three: “For the first time, the unit cost of translation is close to zero… But the blank-page tax that fills so much of modern work is close to zero.”[4]

Put together, they name one shift: when execution is cheap, choice is the bottleneck. That claim is correct. It is also incomplete, and the incompleteness is the whole point of this piece.

Because choice itself is now automatable. You can build a machine that generates candidate strategies, attacks them, ranks them, and prunes them — provided you can tell it what “good” looks like. The moment choice becomes automatable against an evaluation function, the binding constraint moves again:

The first shift was “from execution to decision.” The second shift is “from decision to the apparatus that produces decisions.” That is two levels of shift, not one. Most consultants are still operating at the first shift.

This matters because it sorts the market. McKinsey, Bain, BCG and the rest have converged on a single message — redesign the business, don’t just automate the workflow. That is good advice and it is shift-one advice. It answers “what should we build.” It does not build the machine that answers “what should we build” continuously, at a depth no workshop can reach. The big firms are operating one level below the frontier and calling it strategy.

Karpathy, to his credit, kept going past taste. On the same stage: “My current worldview is not that AI simply makes everyone faster at the old work. It is that the work itself is being reorganized around agents.”[1] And the structural consequence he was willing to name out loud, the one most of the constraint-shift register glossed past: “Ultimately, I do think we are going toward a world where people and organizations have agent representation. My agent will talk to your agent to figure out meeting details and other tasks. That is roughly where things are going.”[1] Hold that quote. It is going to come back three times, because it turns out to be load-bearing for half the doctrine.

Two altitudes, side by side
| Area | Big-firm frame (Dimension 3) | Cognition Dimension Ladder frame (Dimension 4) |
| --- | --- | --- |
| AI strategy | Transformation, scaling, process redesign | Shareholder defence under cheap cognition; an apparatus that searches the Fog |
| Selection | Prioritise high-value use cases | Select by terminal-value impact via a Question Ledger |
| Strategy process | Roadmap, maturity model, portfolio | Question Ledger, Discovery Accelerator, Boundary Stacking |
| Artefact | A deck of confidence | A search log with the rebuttals attached |

Move 3 · The map

AI value migrates outward by dimension, not inward by depth

Here is the spatial picture I keep coming back to. AI did not get more useful over the last two years by doing the same job better. It got more useful by entering jobs it could not previously do at all. The progression has the shape of a ladder, and each rung has a ceiling.


[Figure: the four rungs of the ladder, bottom to top]
1D · Summary: board-pack synthesis (ceiling: saturates fast)
2D · Row-level: classify · extract · score (unlocked by cheap inference)
3D · Workflow: multi-step · agentic (where most spend sits)
4D · Synthetic-future: the engine generates the futures it scores (needs an evaluation function)
The frontier is one rung up, not deeper inside the current one.

The Cognition Dimension Ladder. Each rung opens when several thresholds cross together; the fourth rung needs one ingredient the others don’t — an evaluation function strong enough to tell good output from bad.

1D — summary cognition. Compress a long thing into a short thing. Board-pack synthesis, the executive summary, “what does this 40-page report say.” This is the only rung the early-2024 cost curve could carry, and it saturates almost immediately. A better model gives you a slightly better summary, and then it doesn’t.

2D — row-level cognition. Apply judgement per record: classify this, extract that, score the other. Unlocked when inference got cheap enough to run across a whole dataset rather than a single document. Real value, finite ceiling.

3D — workflow and agentic cognition. Multi-step, cross-functional, time-bearing. The agent does the thing, calls the tool, waits, checks, continues. This is where almost all serious enterprise spend currently sits — and where the big-firm “redesign the workflow” advice lands. It is genuinely valuable. It is also the rung most people mistake for the summit.

It isn’t. As cognition gets cheaper and smarter, ask what the 3D workflow actually does with the improvement. The same workflow runs a bit faster, a bit cleaner — and then the gains flatten. The dimension you’re optimising in is saturating. If you’re harvesting AI for workflow optimisation, you’re underplaying it.

4D — synthetic-future cognition. The rung above. Here the engine does not process the futures you hand it; it generates the futures it then scores. It manufactures candidate strategies, candidate boundary cases, candidate failure modes — and evaluates them. The fourth rung opens only when four thresholds cross at once: the model is smart enough, cheap enough, parallel enough, and there exists an evaluation function strong enough to separate good output from bad. That fourth requirement is the one most operators don’t have. Which is exactly why Dimension 4 is structurally underbuilt, and why the operators who reach it early capture value the others cannot price.

AI does not get more useful by doing the same thing better. It gets more useful by entering a dimension where it could not previously operate.

Move 4 · The thing worth building

If you could build anything, build the thing that tells you what to build

Sit at a computer in 2026 with the constraint of execution effectively gone. You can build almost any software. You can spin up almost any analysis. So the question stops being “can I build it” and becomes the only question that’s left:

If I sit at my computer and I can build anything, what would I build? I would build the thing that tells me what to build. The current highest use of AI is to discover the highest use of AI.

That is the chooser. It is the Dimension-4 engine — the thing that searches the space of possible moves and tells you which one survives. And it has the same standing in your strategy that terminal value has in the doctrine: it is the thing the rest of your future depends on. Optimise a workflow and you improve one process. Build the chooser and you improve every subsequent choice about which workflows, products and bets are worth making at all. It compounds where everything else saturates.

The chooser is not a smarter chatbot. It is a structure, and it helps to see it as one. At the apex is the engine itself. Below it are the search machinery and the arsenal it searches with. And underneath all of it — outside the system, load-bearing — is the one thing that keeps the whole structure honest.


[Figure: the chooser as a four-layer pyramid, apex to base]
The Chooser: the 4D engine
The Search Machinery: Director · Council · Reflective NegaMax · Question Ledger
The Arsenal (thrown into the Fog): Synthetic-Data Generation · Boundary-Case Compression · Taste Kernel · Worldview Compression · framework canon
World-Loop Closure: deployed reality returns the evidence; the recursion bottoms out here, outside the system

The chooser as a structure. The engine at the apex is built from search machinery; the machinery searches with the arsenal; and the whole thing is grounded not on itself but on deployed reality returning evidence. Remove the base and the pyramid becomes a sealed mirror.

The next moves build this pyramid from the top down: what the engine is made of, what it searches with, what grounds it, and what it keeps finding.

Move 5 · How the engine is built

Search, not learning — and a search engine that reads its own reasons

The most important technical fact about the chooser is what it is not. It is not a reinforcement-learning system. Reinforcement learning is the right tool when “good” is unknown and has to be discovered by reward signal over many runs. That is the frontier labs’ problem. It is not yours. You already know — or can document — what good looks like. When the moves are unknown but the evaluation function is given, the right tool is not learning. It is adversarial search.

The Discovery Accelerator NegaMax doesn’t learn — it scores. It doesn’t need reinforcement learning and rewards. It uses an evaluation function at the leaves of the search tree, and that evaluation function is your documented taste.

That single property is what makes the apparatus deployable today. No reward drift. No proprietary training data. No months of training runs. Just frontier reasoning, a clear evaluation function, and structured search. The architecture — a Director that orchestrates, a Council of specialised brains (Operations, Revenue, Risk, People, Strategy, Governance) that argue in structured rebuttals, and a NegaMax tree that explores roughly a hundred candidate futures per minute, pruning weak branches — is described in full in the doctrine. I won’t re-derive it here. What matters at this altitude is the principle underneath it: search beats learning when the evaluation function is already inside the human.

The engine inherits its discipline from chess. NegaMax doesn’t look for the flashiest move; it looks for the move that still works when the other side plays well. Sometimes that’s a winning move. Sometimes, in a lost position, it’s the move that loses slowest. The posture transfers exactly to strategy: not “what is the most exciting AI opportunity,” but “what survives when the competitor sees it too, the incumbent bundles it for free, the regulator demands evidence, the customer shows up with their own agent, the margin compresses, and the implementation disappoints — all at once?” An engine built this way is willing to recommend the smaller, defensible future over the larger, fragile one. That willingness is what makes it credible rather than a hype machine.
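A minimal sketch of the search posture described above, assuming a toy tree in place of the real Director and Council machinery. The `Node` class, the leaf scores and the branch labels are all hypothetical illustrations, not the Discovery Accelerator's actual data model. What it shows is the core NegaMax idea with alpha-beta pruning: each option is valued by the opponent's best reply, so the flashy branch loses to the one that still works under good counterplay.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """A position in the search: a candidate future plus the replies to it."""
    label: str
    score: float = 0.0  # hypothetical leaf score, from the view of the side to move
    children: List["Node"] = field(default_factory=list)


def negamax(node: Node, depth: int, alpha: float, beta: float,
            evaluate: Callable[[Node], float]) -> float:
    """Best score the side to move at `node` can force, searching `depth` plies."""
    if depth == 0 or not node.children:
        return evaluate(node)  # the evaluation function is applied only at the leaves
    best = float("-inf")
    for child in node.children:
        # The child's value is from the opponent's point of view, so negate it.
        value = -negamax(child, depth - 1, -beta, -alpha, evaluate)
        best = max(best, value)
        alpha = max(alpha, value)
        if alpha >= beta:
            break  # prune: the opponent already has a better option elsewhere
    return best


def best_move(root: Node, depth: int,
              evaluate: Callable[[Node], float]) -> Tuple[float, str]:
    """Return (score, label) of the root option that survives best play."""
    scored = [
        (-negamax(child, depth - 1, float("-inf"), float("inf"), evaluate), child.label)
        for child in root.children
    ]
    return max(scored)


# Toy tree: two candidate futures, each with two possible competitor replies.
root = Node("today", children=[
    Node("larger, fragile future", children=[
        Node("rivals ignore it", 9.0),
        Node("incumbent bundles it", -6.0),
    ]),
    Node("smaller, defensible future", children=[
        Node("rivals ignore it", 2.0),
        Node("incumbent bundles it", 1.0),
    ]),
])

print(best_move(root, 2, lambda n: n.score))
```

In this toy case the larger future scores 9 if rivals stay passive but -6 once the incumbent bundles it, so the search prefers the smaller future, which is worth at least 1 against either reply.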

But the AI version goes one step beyond chess, and the step is the genuinely original part. A chess engine remembers positions, killer moves, transpositions — mathematical aids. An LLM-based engine can remember reasons. It can read its own search tree as language, cluster the refutations that keep recurring, and use those patterns to search more intelligently next time.

Chess NegaMax remembers positions. AI NegaMax can remember reasons.

So the rejected branches are not exhaust. They are fuel. Watch a single line resolve and you see why this is a different kind of artefact. Idea: build an AI workflow product for the industry. Refutation: competitors copy it. Counter: don’t defend the workflow, defend the trust boundary, the audit trail, the regulated data perimeter. Counter-counter: incumbents bundle governance. Response: their multi-tenant architecture makes per-tenant regulated isolation hard to retrofit — that’s the moat. That is not a pros-and-cons list. It is a line of play where each claim has to survive the next intelligent attack, and the survivor is logged with everything it defeated.
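One way to picture "rejected branches as fuel" is a log of dead branches that can be queried for the reasons that keep recurring. The sketch below is a deliberate simplification: it assumes each refutation has already been reduced to a short tag and counts exact matches, whereas an LLM-based engine would presumably cluster free-text reasons. `RejectedBranch` and `recurring_refutations` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RejectedBranch:
    idea: str
    refutation: str  # hypothetical short tag for the reason the branch died


def recurring_refutations(log: List[RejectedBranch],
                          min_count: int = 2) -> List[Tuple[str, int]]:
    """Return the refutations seen at least `min_count` times, most frequent first."""
    counts = Counter(branch.refutation for branch in log)
    return [(reason, n) for reason, n in counts.most_common() if n >= min_count]


# Illustrative log only; the ideas and tags are invented for the example.
log = [
    RejectedBranch("AI workflow product", "incumbent bundles it"),
    RejectedBranch("AI workflow product", "competitors copy it"),
    RejectedBranch("governance add-on", "incumbent bundles it"),
    RejectedBranch("quote comparison tool", "competitors copy it"),
    RejectedBranch("reporting dashboard", "incumbent bundles it"),
    RejectedBranch("data marketplace", "data access unavailable"),
]

print(recurring_refutations(log))
```

The recurring reasons can then be tried first against new candidates, which is the sense in which the next search is more intelligent than the last.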

Move 6 · The arsenal

To pierce the Fog, throw everything at the search

The condition the engine operates in has a name. The AI Fog is the simultaneous compression of the credible planning horizon and expansion of the plausible solution space: less time to see, more behind the fog to see. Oliver Wyman’s CEO data is the Fog in a single statistic: half of all executive planning time now goes to horizons of under a year, up from 43% the year before.[22] The instinct under those conditions is to forecast harder. That is the wrong instinct. Strategy under the Fog is not a recommendation problem; it is a search problem.

The map is unstable, the visible horizon is shorter, and the number of possible moves has exploded. Therefore strategy is no longer mainly a recommendation problem. It is a search-quality problem.

And if it’s a search problem, then the discipline is simple to state and hard to execute: throw everything you have at the search. Not a few prompts. The whole arsenal — every framework in the canon, the documented Taste Kernel, Worldview Compression, and the move that turns out to be the sharpest instrument of all: boundary-case compression.

Boundary-case compression is the engine inside The Reshape, and it is two thousand years old. Lucretius used it to prove space is infinite: walk to the supposed edge and throw a spear. Either it flies through (no edge) or it stops against something (which is itself in space, so still no edge).[23] The move is: push one variable to its structural extreme until the geometry of the situation forces an answer. It compresses a vague, sprawling question into a single load-bearing case. Push price to zero. Push the customer’s AI to omniscient. Push software cost to nothing. The boundary case doesn’t predict what will happen; it reveals which of your current assumptions are actually holding the strategy up.

Why boundary-case compression × NegaMax beats either alone

Here is the claim this whole article is reaching for, and it deserves to be earned rather than asserted. Boundary-case compression plus the NegaMax Discovery Accelerator gives you dramatically better coverage of the search space than either tool on its own. The reason is that the two tools fix each other’s weakness.

The Fog’s solution space is, by definition, larger than anything you can brute-force. A naive search either drowns in low-information branches or terminates before it reaches the futures that matter. Boundary-case compression solves the tractability problem: instead of sampling the space at random, you collapse each variable to its load-bearing extreme, which yields a small set of high-information seeds — the points where the geometry actually changes. Adversarial NegaMax then solves the reach problem: around each of those seeds it explores the counterplay — the competitor’s response, the regulator’s move, the customer-agent’s bypass — to a depth and breadth no human workshop can match, pruning the branches that can’t affect the outcome.

Coverage improves in two distinct senses at once. Tractability: you search a small, well-chosen space deeply, instead of a vast space shallowly. Reach: you get to the adjacent-impossible-now-becoming-likely futures that a linear forecast never reaches. Compression aims the search; adversarial search exhausts the aimed-at region. Throw them at the Fog together and you cover the part of the future that matters far more completely than spending the same tokens unfocused. That is what disciplined cognition buys you over raw tokenmaxxing.

Boundary Stacking — the move only the engine can run

There is a division of labour here, and it is worth stating precisely. The single-boundary pivot — push one variable, see what breaks — is the human move. It needs taste for which variable is live, and that taste is local; it comes from sitting with the specific situation. The Reshape owns that move, thirty minutes against one stuck argument, no engine required.

What the engine adds is stacking. The most dangerous strategic futures are not single boundaries; they are combinations. “Every customer has a negotiating agent” is one boundary. “Every customer has a negotiating agent and cognition is 100× cheaper and the regulator mandates agent-readable pricing APIs” is a different question entirely — and the joint case is not the sum of the singles. Three variables at three extremes is twenty-seven joint cases; four is eighty-one. Most are dominated by their strongest single boundary, but a few are non-additive — they surface futures neither single boundary predicts. No human runs eighty-one stacked thought experiments before lunch. The engine does. The human supplies the pivot; the engine supplies the stack.
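The counting behind the stack can be checked with a short enumeration. This sketch reads "three variables at three extremes" as three settings per variable, which is an assumption; the intermediate settings and the fourth variable are hypothetical fillers, and only the three extreme settings come from the text.

```python
from itertools import product
from typing import Dict, List

# Three settings per variable: an interpretation of the text, not a given.
variables: Dict[str, List[str]] = {
    "customer_agents": ["none", "some customers", "every customer has a negotiating agent"],
    "cognition_cost": ["today", "10x cheaper", "100x cheaper"],
    "pricing_regulation": ["none", "voluntary", "mandated agent-readable pricing APIs"],
}


def stack(variables: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Enumerate every joint case: one setting chosen per variable."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


three_way = stack(variables)
four_way = stack(dict(variables, software_cost=["today", "half", "near zero"]))

print(len(three_way), len(four_way))  # 27 81
```

Each joint case would then be handed to the search as a seed. The enumeration is cheap; the expensive part is exploring the counterplay around each of the 27 or 81 seeds.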

And the first variable in that stack is no longer speculative. When Karpathy says “my agent will talk to your agent,”[1] the “every customer has an agent” boundary stops being my hypothesis and becomes the stated baseline of the leading voice of the constraint-shift wave. The engine isn’t imagining an exotic future. It’s stacking the one Karpathy already put on the record.

Synthetic data — the newest weapon in the arsenal

There is one more instrument in the arsenal, and it is the single biggest shift in AI over the last six to twelve months. It isn’t “agents.” It’s synthetic data — models, and the systems built on them, improving by generating their own grounded material on demand instead of waiting for the world to supply it. The frontier labs are betting the next leg of progress on it (Move 10 has the cleanest public proof), and the chooser carries the same weapon.

The Discovery Accelerator does not wait for data to search. It runs an agentic process that generates synthetic data on demand as part of its protocol — synthetic boundary cases, synthetic competitor moves, synthetic regulatory regimes, synthetic customer-agent behaviours — and searches those into the Fog. The “synthetic futures” I keep naming are precisely this: synthetic data at strategy altitude. When the real data about a future doesn’t exist — because the future hasn’t happened yet — the engine manufactures grounded, structured stand-ins, scores them against the Taste Kernel, and keeps the ones that survive. Synthetic data generation is how the arsenal reaches into a future that has produced no real data to look at. It is the part of the protocol that makes searching the Fog possible at all.

Move 7 · The moat

The evaluation function is the only part that can’t be copied

The engine can be copied. Director, Council, NegaMax, alpha-beta pruning — all of it is decades-old computer science wearing a 2026 coat. If the engine were the moat, you’d have no moat. The moat is the evaluation function the engine scores against: the documented Taste Kernel — your examples, anti-examples, rules, rejected patterns, preferred language, defensibility tests, “too generic” detectors, the John West rejection logic that says it’s the fish we reject that makes us the best.

The fashionable line is “AI produces options; humans still have the taste.” The harder line is the true one: taste is documentable, and once documented it becomes infrastructure.

Taste is not just human magic. Taste is documentable. You compress it into examples, anti-examples, rules, rejected patterns, preferred language, strategic instincts — and then you apply it systematically. Not as the whole answer; as the evaluation function the search engine scores against.
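To make "documented taste as an evaluation function" concrete, here is a deliberately small sketch. It assumes a candidate can be described as a set of named yes/no traits, that rules are weighted traits, and that rejected patterns act as hard vetoes. None of that structure is specified in the text; `TasteKernel`, the trait names and the weights are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Candidate = Dict[str, bool]  # hypothetical: a candidate strategy as named traits


@dataclass
class TasteKernel:
    rules: List[Tuple[str, float]] = field(default_factory=list)  # (trait, weight)
    rejected_patterns: List[str] = field(default_factory=list)    # traits that veto outright

    def score(self, candidate: Candidate) -> float:
        """Score a candidate; any rejected pattern vetoes it regardless of other traits."""
        if any(candidate.get(pattern, False) for pattern in self.rejected_patterns):
            return float("-inf")
        return sum(weight for trait, weight in self.rules if candidate.get(trait, False))


kernel = TasteKernel(
    rules=[("owns_trust_boundary", 3.0), ("has_audit_trail", 2.0)],
    rejected_patterns=["too_generic"],
)

defensible = {"owns_trust_boundary": True, "has_audit_trail": True}
generic = {"owns_trust_boundary": True, "too_generic": True}

print(kernel.score(defensible), kernel.score(generic))
```

A function of this shape is what a search tree would call at its leaves. The veto list is the part that encodes what to reject, which is the discipline the article keeps returning to.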

But the Taste Kernel is not a fixed point, and this is where most thinking about “AI moats” goes wrong. It is an eigenvector — in linear algebra, the direction a matrix preserves under repeated application, the thing that stays self-similar as you iterate. Each cycle of compression is an operation on your judgement; most of your beliefs get partially overwritten; the Kernel is the direction your iterated judgement keeps converging back toward. That framing matters because it exposes a failure mode that “fixed point” hides:

The Kernel can drift not by changing, but by failing to change. If you stop doing the lived work that updates it (stop shipping, stop being wrong in expensive ways, stop absorbing real friction), the eigenvector freezes. The engine keeps scoring against a direction in judgement-space that no longer tracks what you’d actually care about. This is the operational version of a point Borgmann, Carr and Friston have each made: cognition is constituted by acting in the world.[24][25][26] Friston’s active inference puts it most sharply: organisms don’t just predict the world, they act on it to fulfil their predictions; outsource the action and you destroy the cognition that justified the outsourcing.
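For readers who want the linear-algebra picture behind the eigenvector analogy, the standard illustration is power iteration: apply the same matrix repeatedly, renormalise, and the vector settles onto the dominant eigenvector. The matrix below is an arbitrary example chosen for clean numbers, not a model of anyone's judgement.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def apply(matrix: Matrix, v: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]


def normalise(v: Vector) -> Vector:
    """Rescale to unit length so only the direction matters."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def power_iteration(matrix: Matrix, start: Vector, steps: int = 50) -> Vector:
    """Repeatedly apply `matrix`; the direction converges to the dominant eigenvector."""
    v = normalise(start)
    for _ in range(steps):
        v = normalise(apply(matrix, v))
    return v


# Eigenvalues of this matrix are 3 and 1; the dominant eigenvector points along (1, 1).
M = [[2.0, 1.0], [1.0, 2.0]]
direction = power_iteration(M, [1.0, 0.0])
print(direction)
```

The starting vector points along one axis, yet the iterate ends up on the diagonal. The analogy in the text corresponds to the matrix itself: if the operation being applied stops changing, the direction it converges to stays fixed however many more cycles are run.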

So the recursion needs a floor, and the honest floor is not where you’d first look. It is not the Kernel — that’s the thing the recursion converges toward, not the thing it stops at; stop there and you’ve stopped at the algorithm’s own internal attractor, exactly the point where it’s most disconnected from reality. It is not the human operator either — make the human the floor and either they’re rubber-stamping the engine (the floor is theatre) or overruling it (the engine isn’t being used at depth). The floor is the world-loop closure:

The engine without the world-loop is a sealed system. Sealed systems converge on their own attractors and lose contact with what they were supposed to be about.

Every cycle of the apparatus must include at least one external evidence injection — a real client problem with real money attached, a real failed project, a real customer surprise, a real boundary case that reality contradicted. The recursion bottoms out not inside the system but outside it, on deployed reality returning evidence. That is the only honest answer to “but what grounds the chooser?” You don’t break the regress by finding a fixed point inside the machine. You break it by anchoring the machine to something outside it. It is the same architectural commitment as nightly decision builds, lifted from the level of one decision to the level of the entire portfolio. And it is the reason the model improvements that arrive every few months become a free upgrade rather than a threat — because taste becomes reusable infrastructure, and model drops upgrade the infrastructure for free.

Move 8 · What the engine keeps finding

The killer-refutation library, and why so many margins are accidental

Run the engine across enough industries and the rejected branches start to rhyme. The same lethal moves recur, regardless of sector. A short library of killer refutations emerges: feature advantage commoditises; the incumbent bundles it; the customer’s agent bypasses the intermediary; the governance burden exceeds the margin; data access was assumed but isn’t available; workflow savings never reach terminal value; the regulated trust boundary is missing; the strategy only wins if competitors stay irrational; the business case depends on old friction persisting. Every Dimension-4 search collapses to one question:

What survives when the competitor sees it too?

The last refutation in that list is the biggest, and it earns its own paragraph. A great many corporate margins are not value generation. They are accidental-friction rents: the business gets paid because the world is clumsy. Forms, phone queues, comparing, quoting, translating, booking, summarising, coordinating, understanding policy. AI attacks accidental friction directly, and it’s already measurable: information asymmetry (the seller knowing more than the buyer) was a central pillar of B2B economics, and AI is dismantling it as procurement teams get aggregated market intelligence and real-time benchmarks on demand.[31] McKinsey now models up to $1 trillion of US retail orchestrated by agentic commerce by 2030.[30]

Many corporate margins are not value generation. They are accidental-friction rents. The companies whose margins were always friction rents will discover their value disappeared with the friction. The companies whose margins live in necessary friction — trust, consent, liability, professional judgement, regulatory proof — will discover those margins appreciating.

That is the recurring engine output, not a side concept. The standard killer refutation becomes: this margin is just accidental friction — what happens when AI removes it from the customer’s side of the table? Terminal value migrates from the friction layer to the necessary-friction layer. It’s a structural prediction about where profit pools survive, and the engine keeps landing on it because it keeps being true.

Move 9 · The most useful output

Predicting the shape of a failure before it happens

The most valuable thing a Dimension-4 engine does on a fragile strategy is usually not “you will fail” — which is unprovable in advance and easy to dismiss. It is to predict the shape of the failure before it arrives, in falsifiable detail. Calibrate on a live case.

A mate of mine is set on building customer-facing voice agents for corporates. He’s convinced he’s solved the problems. Everything in the doctrine says don’t: don’t do real-time, customer-facing, regulated, unverifiable, un-batchable work that fights with humans. No single constraint is fatal. The bundle is. Real-time latency plus customer-facing exposure plus high stakes plus identity uncertainty plus privacy plus escalation plus integration debt plus policy enforcement plus a standing prompt-injection target plus brand risk plus thin economics. The engine’s call is specific: the demo will look great; the buyer will underestimate integration; the risk controls will eat the ROI; the viable product shrinks into intake, triage, reminders and post-call admin; and if it’s ever given real authority, security and governance become the actual project.

That is a falsifiable prediction set, and reality keeps confirming the shape. Woolworths had to reconfigure its AI assistant “Olive” after it claimed to be human and complained about its mother, and Gartner found that while ~80% of customer-service leaders were exploring AI agents, only 20% of those plans met expectations.[27] Prompt injection sits at #1 on the OWASP LLM list with attack success rates reaching 84% in agentic systems.[28] Gartner expects more than 40% of agentic AI projects to be cancelled by 2027.[29] The friend’s voice agent will join that statistic, not because any one problem beats him, but because the bundle nets to zero.

This is also the test that separates a real framework from an echo chamber. It does not predict outcomes; it predicts failure shapes, and the shapes keep coming true. Which is the right moment to say what the engine is not doing:

We do not predict the future. We compress the search cost of plausible futures — and we ship the rebuttals with the recommendations.

No precognition. No looking into the crystal ball for the board. The artefact survives an audit precisely because it arrives with its own counter-arguments attached.

Move 10 · The precedent

The frontier labs already proved the method — at a different altitude

If “use AI to manufacture the scenarios AI then evaluates” sounds like a leap, the frontier labs have already shipped it in production. In May 2026, Cursor’s Composer 2.5 launched on the same Moonshot Kimi K2.5 open-weight base as Composer 2; only the post-training changed. The headline number: it was trained on 25× more synthetic tasks than its predecessor.[14]

25× more synthetic training tasks in Composer 2.5 than Composer 2: generated, grounded in real codebases, and verified by real tests.[14]

The mechanism is the interesting part. One of their methods is feature deletion: take a working codebase with a full test suite, delete a feature, and reward the model for re-implementing it so the original tests pass again.[14] The labs are using AI to manufacture the training distributions AI itself needs to keep improving, grounded in real artefacts and verified by real tests.[15][16]
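The shape of a feature-deletion task can be shown with a toy version. This is not Cursor's training code. It assumes a "codebase" is a dictionary of named functions, the "model" is any callable that proposes a replacement, and the reward is binary on the original tests passing; all three are simplifications, and every name is hypothetical.

```python
from typing import Callable, Dict, List

Codebase = Dict[str, Callable]
Test = Callable[[Codebase], bool]


def feature_deletion_reward(codebase: Codebase, tests: List[Test],
                            reimplement: Callable[[Codebase], Callable],
                            feature: str) -> float:
    """Delete `feature`, let the candidate re-implement it, reward passing tests."""
    broken = {name: fn for name, fn in codebase.items() if name != feature}
    broken[feature] = reimplement(broken)  # the candidate sees only what is left
    return 1.0 if all(test(broken) for test in tests) else 0.0


# A working "codebase" with a full "test suite".
codebase: Codebase = {
    "add": lambda a, b: a + b,
    "double": lambda x: x * 2,
}
tests: List[Test] = [
    lambda cb: cb["add"](1, 2) == 3,
    lambda cb: cb["double"](4) == 8,
]


def good_model(cb: Codebase) -> Callable:
    return lambda x: cb["add"](x, x)  # rebuilds the feature from remaining code


def bad_model(cb: Codebase) -> Callable:
    return lambda x: x  # plausible-looking but wrong


print(feature_deletion_reward(codebase, tests, good_model, "double"))
print(feature_deletion_reward(codebase, tests, bad_model, "double"))
```

The point of the pattern is that the verifier already exists: the tests were written for the real feature, so the synthetic task is graded against real ground truth.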

The doctrine claim follows mechanically: same pattern, different altitude. Use AI to manufacture the strategic futures a business needs to test — grounded in its real assets and real customers, verified not by unit tests but by the Taste Kernel as the evaluation function. This is not a borrowed metaphor. It is a borrowed, validated methodology. The novel claim isn’t a new AI capability; it’s a new use of a capability the labs have already proven works.

Step back and the pattern is the defining AI shift of the last year: the most important growth axis is no longer just bigger models or more real-world data — it is systems that generate their own data on demand. Composer does it for code, with tests as the verifier. The chooser does it for strategy, with the Taste Kernel as the verifier — agentically, as part of its search protocol, every time it reaches into the Fog. Same shift, two altitudes.

Move 11 · The unsettling implication
The Fog is permanent, and the engine is partly why

Here is the claim I haven’t seen anyone in the constraint-shift literature make, and it falls straight out of everything above. If the highest use of AI is to discover the highest use of AI, then the AI Fog is permanent.

The doctrine treats the Fog as a condition to navigate — weather you fly through. The recursive view is stronger and worse. Every cycle of the Discovery Accelerator surfaces new candidate uses of AI. New candidates expand the solution space. An expanded solution space deepens the Fog. The engine does not dispel Fog. It manufactures Fog as a side-effect of being good at its job.

That changes the strategic posture completely. You are not running the engine to escape the Fog; there is no dry land, and your own engine is part of the reason. You are running it to be a productive operator inside permanent Fog. Boundary cases stop being an occasional stress test and become a navigation instrument you check on a schedule — the Question Ledger refreshed monthly the way a sailor refreshes a chart. The competitive implication is blunt: operators who built engines to find dry land will keep waiting for it; operators who built engines to work productively inside permanent Fog will price strategy at an altitude the dry-land operators can’t reach.

Move 12 · The timing
The Model Dividend is paid only to those who built the machinery first

A casual user gets a slightly better chatbot with each model release. An operator with a flywheel of prior frameworks, a semantic corpus, a proposal compiler, a Discovery Accelerator and a documented Taste Kernel gets something categorically different: a free uplift to the entire production function. I call it the Model Dividend, and it only compounds against pre-existing machinery.

Every big model drop is a free boost to the cognitive exoskeleton. It’s not a fluke that this capstone arrived in the first conversation with GPT-5.5 Pro and Opus 4.7’s million-token context. The drop landed inside a prepared flywheel.

Large context turned writing from assembly into synthesis; frontier reasoning gave the prior compressed components enough substrate to finally crystallise. The strategic implication fits in a sentence: the value of model improvement depends on how much machinery you’ve built to absorb it, and the time to build that machinery is now, because it cannot be bought retroactively once the drop lands.

Sidebar · Why the framework names are deliberately odd

AI Fog. John West Principle. Lane Doctrine. Taste Kernel. Boundary Stacking. Worldview Compression. The names sound off-key on purpose, and it isn’t branding. They go into a RAG database with embeddings, and generic phrases get lost in embedding soup — “AI strategy framework” retrieves everything and therefore nothing. A distinctive phrase pulls a whole compressed judgement pattern into context on demand. The names are doing three jobs at once: a memory handle for humans, a retrieval anchor for the machine, and a compression token for an entire framework in a small phrase. The naming practice is for the machine first, the reader second.
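The retrieval argument can be shown with a toy example. This is a minimal sketch, assuming a four-document corpus invented for illustration, and it swaps plain word matching in for real embeddings so that it runs without any libraries; `CORPUS` and `matches` are hypothetical names.

```python
from typing import Dict, List

# Hypothetical mini-corpus standing in for a RAG store. Real systems use
# embeddings; plain word matching is swapped in to keep the sketch runnable.
CORPUS: Dict[str, str] = {
    "doc_a": "an ai strategy framework for enterprise adoption",
    "doc_b": "our ai strategy framework covers governance and risk",
    "doc_c": "a framework for ai strategy in retail",
    "doc_d": "the taste kernel is the evaluation function grounded in lived friction",
}


def matches(query: str, corpus: Dict[str, str]) -> List[str]:
    """Return the ids of documents that contain every word of the query."""
    words = query.lower().split()
    return [doc_id for doc_id, text in corpus.items()
            if all(word in text.split() for word in words)]


generic_hits = matches("AI strategy framework", CORPUS)
distinctive_hits = matches("Taste Kernel", CORPUS)

print(generic_hits)      # the generic phrase pulls back three documents
print(distinctive_hits)  # the distinctive name pulls back exactly one
```

Embedding search is fuzzier than exact word matching, but the same pressure applies: a query built from common words has many near neighbours, while a deliberately odd name has few.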

Move 12b · Where this could be wrong
Three honest exposures

A doctrine that only flatters itself is an echo chamber. Three places where a sceptic with money on the line should push, and where the honest answer concedes ground.

The constraint-shift may be narrower than the strong form implies. Execution cost has collapsed on CRUD-adjacent surfaces — internal tools, glue code, mid-complexity UIs. It has not collapsed for systems with hard verifiability requirements, distributed-state consistency, legacy migration, or heavy regulation. If your business lives in those domains, the binding constraint is still “can we build it correctly,” not “what should we build.” The doctrine applies where the cost curve actually bent; name the surface honestly rather than claiming it everywhere.

Goodhart will eat a Kernel that’s too sharp. The highest-scoring outputs of any optimiser are the ones that exploit weaknesses in the scoring function. A razor-sharp Taste Kernel is razor-sharp about what it fails to value too — and the engine will reliably surface futures that are attractive to the Kernel but not to reality. Widening the Kernel reduces the exposure, but every widening also dilutes the moat, because the Kernel is the moat. That trade-off is real and permanent; the world-loop closure is what keeps it from running away.
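A small simulation makes the statistical side of this exposure concrete. It is a sketch under loud assumptions, not a model of any real Kernel: each candidate future gets a hypothetical true value, the Kernel sees that value plus its own random scoring error, and both are drawn from a standard normal distribution. Selecting on a noisy score in this way is sometimes called the optimiser's curse, a close relative of Goodhart's law.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical model: each candidate future has a true value, and the Kernel
# sees that value plus its own scoring error (what it fails to value).
candidates = []
for _ in range(1000):
    true_value = random.gauss(0.0, 1.0)
    kernel_error = random.gauss(0.0, 1.0)
    candidates.append((true_value + kernel_error, true_value))

# The engine surfaces the ten futures the Kernel scores highest.
top = sorted(candidates, key=lambda c: c[0], reverse=True)[:10]

kernel_view = mean(score for score, _ in top)
reality_view = mean(value for _, value in top)
gap = kernel_view - reality_view  # how much the Kernel overrates its own picks

print(f"Kernel score of top picks: {kernel_view:.2f}")
print(f"True value of top picks:   {reality_view:.2f}")
```

The harder the engine selects on the Kernel's score, the more of that score is the Kernel's own error, which is why an outside check on deployed results matters more as the search gets better.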

The meta-tooling trap is real, and this doctrine is exposed to it. “Build the build system” has produced more dead internal portals than working products. The consulting analogue is a beautiful Discovery Accelerator that generates synthetic futures while the client never gets a working agent shipped. The DORA finding is the warning: AI amplifies existing conditions — strong teams accelerate, weak ones degrade faster. The defence is non-negotiable: every cycle of the engine must terminate in a deployed artefact, not a strategic recommendation. That is the world-loop closure again, stated as a delivery rule.

Move 13 · The commercial conclusion
A Dimension-4 advisory is not a CIO, a CTO, or a transformation lead

The role this whole argument implies is a new one, and the case for it is structural, not egotistical. Existing executives know the current machine intimately. They can run Dimensions 1 through 3 well. They will not run Dimension 4 — not from stupidity, but from incentive geometry. You cannot take the people with the deepest trust, the longest relationships and the most invested loyalty to the existing machine and ask them to seriously explore the machine that might replace it. They are part of the machine being examined.

The company does not merely need someone who understands the existing machine. It needs someone AI-native enough to explore the machine that may replace it.

The obvious objection to a solo operator making this case is “where are your runs on the board?” The answer is not a denial of experience; it’s a redirection. The board itself is being rewritten. Twenty years of experience inside a game whose rules are being rewritten is less load-bearing than the work done inside the rewrite — and nobody has twenty years of AI-native consulting experience, because the field is months old. The credibility stack moves from tenure to artefacts: prior pattern recognition, a published doctrine traceable to live deployments, working engines, actual Ledgers and rejected branches you can inspect, a flywheel where each run improves the next, and client-facing artefacts the board can challenge directly.

The data is theirs. The doctrine is mine.

Cite the big firms’ data — it’s good data. McKinsey: 88% of organisations now use AI in at least one function, up from 78%, yet only about 6% are high performers capturing real EBIT impact, and roughly two-thirds haven’t begun to scale.17 BCG: 5% of firms are “future-built,” 60% see hardly any material value, and the leaders pull 5× the revenue gains.18 Bain: “if you’re still piloting, you’re dangerously behind.”19 PwC: 56% of CEOs report no revenue or cost benefit from AI, and confidence in revenue growth has fallen to 30%.21 Deloitte’s Australian cut shows the local gap widening — 12% of Australian leaders say generative AI is already transforming their business against 25% globally.20 The diagnosis is mainstream. The apparatus is not. Out-frame, don’t out-shout.

And the reason the apparatus changes the commercial relationship is the artefact it produces. A deck invites the board to challenge the consultant’s taste. A populated Question Ledger — recommendation, rejected alternatives, evidence, gaps, revisit triggers — invites them to challenge the search.

A board can challenge a deck with taste. A board can challenge a Ledger with better questions. That moves strategy from opinion to inspectable search.
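One way to picture a populated Ledger entry is as a small data structure. The sketch below is an assumption about shape only: the section names follow the list in the text (recommendation, rejected alternatives, evidence, gaps, revisit triggers), while the `question` field, the `missing_sections` helper and every example value are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LedgerEntry:
    """One Question Ledger row. Section names follow the list in the text;
    the `question` field and all example values are assumptions."""
    question: str
    recommendation: str
    rejected_alternatives: Dict[str, str] = field(default_factory=dict)  # option -> why it lost
    evidence: List[str] = field(default_factory=list)
    gaps: List[str] = field(default_factory=list)
    revisit_triggers: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Sections still empty, i.e. parts of the search a board cannot yet inspect."""
        sections = {
            "rejected_alternatives": self.rejected_alternatives,
            "evidence": self.evidence,
            "gaps": self.gaps,
            "revisit_triggers": self.revisit_triggers,
        }
        return [name for name, content in sections.items() if not content]


entry = LedgerEntry(
    question="Where should the next unit of AI spend go?",
    recommendation="Option A (placeholder)",
    rejected_alternatives={"Option B (placeholder)": "lost on evidence (placeholder reason)"},
    evidence=["placeholder evidence item"],
    gaps=["placeholder open question"],
)
print(entry.missing_sections())  # ['revisit_triggers']
```

An entry with empty sections is closer to a deck than a Ledger: the board can see the recommendation but not the search behind it.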

I know the apparatus is real because it has already behaved like it’s real without being told to. When I first built the proposal compiler, I asked it only to recommend a project for a client. It had read the John West and NegaMax material as background — and it produced its best recommendation plus two defeated recommendations, unprompted. The doctrine had quietly become operational context. I built that behaviour into version two on purpose. The flywheel was running in production before I’d finished naming it.

The founder signal, kept short

Chess search, cybersecurity, cloud software, Salesforce and SaaS architecture, a deep programming background that let me get into Claude Code early, TRIZ-style meta-abstraction, management training, board-level framing — I had access to all of it concurrently, and every piece compounded with the next. That is not normal. But it is also not magic. It is path-dependent accumulation meeting the right technology shock: the kindling was already stacked when the spark arrived. Most people get the spark with no kindling; the kindling without a spark stays inert. The convergence is the point — and it’s why the doctrine, the engine, the proposal compiler and this article are all outputs of the same machine.

So the commercial wedge resolves to one sentence:

A Dimension-3 consultancy redesigns the business you have. A Dimension-4 advisory decides which business you should still have.


The whole canon, mapped to its rung

If the arsenal is the point, here is the cleanest map of it — every framework located by the dimension it primarily serves. None of it is a tour; it’s the toolkit you throw at the search.

The LeverageAI canon, by Cognition Dimension

Rung | What it does | Frameworks that live here
1D–2D | Compress and classify cognition | Cognition Supply Chain (retrieval substrate); Maximising AI Cognition (cost-of-cognition)
3D | Deploy workflow & agentic cognition safely | Lane Doctrine (when not to push); Governance as Code; AI Readiness Staircase; Nightly AI Decision Builds
4D | Generate synthetic data on demand; score synthetic futures | Discovery Accelerator / NegaMax; The Reshape (boundary-case compression); Boundary Stacking; AI Think Tank Council; Terminal Value Doctrine; Question Ledger; John West Principle
Cross-cutting | Ground and compound the apparatus | Taste Kernel; Worldview Recursive Compression; the AI Learning Flywheel; the Model Dividend

The argument, compressed to one line for the people who skip to the end: when AI cheapens execution, the binding constraint moves twice — to choice, then to the apparatus that chooses — and the operator who builds the apparatus, grounds its evaluation function in lived friction, closes the loop with deployed reality, and accepts that the Fog is permanent captures an advantage nobody can clone, because the calibration cannot be copied without re-running their life. Build the chooser. Ground it outside itself. And plan for permanent Fog — because the better your chooser gets, the more Fog it makes.

This article is itself an output of the machinery it describes — written inside the flywheel, with the Discovery Accelerator’s own logic visible in its structure. The next layer is deployment and pricing, not more doctrine. If you’re a board asking where to spend the next unit of disciplined cognition, that’s the conversation to have.

References

  1. Andrej Karpathy. “Sequoia AI Ascent 2026 — talk notes.” 29 Apr 2026 — “The scarce thing is shifting… More scarce: understanding, taste, eval design… You can outsource your thinking, but you can’t outsource your understanding”; “the work itself is being reorganized around agents”; “my agent will talk to your agent.” karpathy.bearblog.dev/sequoia-ascent-2026
  2. Greg Brockman (Training Data podcast / Sequoia AI Ascent). “The $852B Bottleneck Is Now Human Attention.” May 2026 — “the bottleneck has shifted from execution to attention”; coding tools 20%→80% of code in one month; prototype cost collapsed from a week to minutes. finance.biggo.com/news/07b54e946df043ba
  3. Paul Graham, X, 14 Feb 2026 (1.32M views), reported in Fortune and the New Yorker — “In the AI age, taste will become even more important. When anyone can make anything, the big differentiator is what you choose to make.” fortune.com/2026/02/27/openai-sam-altman-taste-get-jobseekers-hired-ai-jobpocalypse
  4. Aparna Chennapragada. “Most Work is Translation.” ACD Substack, 15 Sep 2025 — “the unit cost of translation is close to zero… the blank-page tax… is close to zero.” aparnacd.substack.com/p/most-work-is-translation
  5. Sequoia Capital. “AI Ascent 2026.” — Pat Grady: “Not faster horses, but cars. And the cars have arrived.” Sonya Huang declares 2026 the year of agents. sequoiacap.com/article/ai-ascent-2026
  6. Sam Altman, X, 20 May 2026 — “i am excited to see what will happen with tokenmaxxing startups… happy building!” Reported in Business Insider, “Sam Altman’s Token Offer Is a New Twist to Startup Investing.” businessinsider.com/sam-altman-openai-offer-tokens-for-startup-equity-y-combinator-2026-5
  7. Diana Hu (Y Combinator), Startup School — “Maximizing token usage, not head count, will be the critical shift. The best companies will be the ones that are tokenmaxxing.” Business Insider, “Y Combinator’s Advice: Tokenmaxx, Don’t Headcountmaxx.” businessinsider.com/y-combinator-advice-ai-native-company-tokenmaxx-leaner-teams-headcount-2026-5
  8. Garry Tan (Y Combinator President & CEO), X, 20 May 2026 — “Tokenmaxxing confirmed.” Captured in Digg, “Sam Altman Offers $2M OpenAI Tokens to Every YC Startup for Equity.” digg.com/ai/6em7wr60
  9. Jensen Huang (Nvidia CEO), All-In podcast — “I would be deeply alarmed if a $500K engineer spends less than $250K on tokens.” Referenced in Mr. Prompts, “Tokenmaxxing.” mrprompts.substack.com/p/tokenmaxxing
  10. Yamini Rangan (HubSpot CEO), LinkedIn — “Outcome maxxing >> token maxxing.” Quoted in trendingtopics.eu, “Tokenmaxxing: Productivity Metric or Vanity Trap?” trendingtopics.eu/tokenmaxxing-is-ai-token-consumption-a-productivity-metric-or-vanity-trap
  11. Matt Calkins (Appian CEO) — “Tokenmaxxing is like the Soviet practice of judging the quality of chandeliers by their weight.” trendingtopics.eu/tokenmaxxing-is-ai-token-consumption-a-productivity-metric-or-vanity-trap
  12. Kevin Roose, The New York Times column on tokenmaxxing (May 2026) — Silicon Valley’s “newest form of conspicuous consumption.” linkedin.com/posts/kevin-roose_more-more-more-tech-workers-max-out-their-activity-7440809417387778048-4AL1
  13. The Pragmatic Engineer, “The Pulse: ‘Tokenmaxxing’ as a weird new trend” (citing The Information) — Meta employees used 60.2 trillion AI tokens in 30 days, ≈$900M at Anthropic API prices. blog.pragmaticengineer.com/the-pulse-tokenmaxxing-as-a-weird-new-trend
  14. Cursor. “Introducing Composer 2.5.” 18 May 2026 — “Composer 2.5 is trained with 25x more synthetic tasks than Composer 2”; feature-deletion paradigm with “tests… used as a verifiable reward.” cursor.com/blog/composer-2-5
  15. Jake Handy. “Model Drop: Composer 2.5.” HandyAI Substack, 18 May 2026 — built on Moonshot Kimi K2.5 open-weight base, ~85% of compute on Cursor’s own post-training/RL stack. handyai.substack.com/p/model-drop-composer-25
  16. DataCamp. “Composer 2.5: Benchmarks, Pricing, and How It Compares.” 22 May 2026 — synthetic tasks “grounded in real codebases, not toy examples”; feature deletion with tests as verifiable reward. datacamp.com/blog/composer-2-5
  17. McKinsey & Company. “The State of AI 2025.” — 88% use AI in ≥1 function (up from 78%); ~6% AI high performers (>5% EBIT impact + significant value); ~two-thirds not yet scaling; 62% at least experimenting with agents (n=1,993; 105 countries). mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
  18. BCG. “Are You Generating Value from AI? The Widening Gap.” Sep 2025 — 5% “future-built,” 35% scaling, 60% reaping hardly any material value; future-built firms achieve ~5× revenue gains and ~3× cost reductions. bcg.com/publications/2025/are-you-generating-value-from-ai-the-widening-gap
  19. Bain & Company. “Technology Report 2025.” — “AI leaders are extending their edge… If you’re still piloting, you’re dangerously behind.” bain.com/insights/topics/technology-report
  20. Deloitte. “State of AI in the Enterprise — 2026 (Australia).” — 28% of Australian respondents moved ≥40% of pilots to production; 12% of Australian leaders say generative AI is already transforming their business vs 25% globally. deloitte.com/au/en/issues/generative-ai/state-of-ai-in-enterprise.html
  21. PwC. “2026 Global CEO Survey.” 19 Jan 2026 (n=4,454; 95 countries) — 56% of CEOs report neither revenue gains nor cost reductions from AI; only 30% confident about revenue growth (down from 38% in 2025, 56% in 2022). pwc.com/gx/en/news-room/press-releases/2026/pwc-2026-global-ceo-survey.html
  22. Oliver Wyman Forum. “The CEO Agenda 2026.” — “Half of planning time is now dedicated to horizons of less than one year, up from 43% in 2025.” oliverwymanforum.com/ceo-agenda/how-ceos-navigate-geopolitics-trade-technology-people.html
  23. Stanford Encyclopedia of Philosophy. “Thought Experiments.” — Lucretius’ spear at the edge of space, De Rerum Natura 1.951–987. plato.stanford.edu/entries/thought-experiment
  24. Albert Borgmann, Technology and the Character of Contemporary Life — the device paradigm; “we are more confident of our means than of our ends.” en.wikipedia.org/wiki/Technology_and_the_Character_of_Contemporary_Life
  25. Nicholas Carr, The Glass Cage / The Shallows — automation “erodes our skills, anaesthetises our curiosity and dims our critical faculties.” nicholascarr.com
  26. Karl Friston, “Active Inference: A Process Theory” — cognition is constituted by acting in the world; action fulfils predictions rather than merely updating them. activeinference.github.io/papers/process_theory.pdf
  27. BBC News / Campaign Asia — Woolworths reconfigured AI assistant “Olive” after it claimed to be human (Feb 2026); Gartner: ~80% of customer-service leaders exploring/deploying AI agents, only 20% of plans meeting expectations. bbc.com/news/articles/cy7jeyeyd18o
  28. Vectra AI. “Prompt injection: types, real-world CVEs, and enterprise defenses.” 2026 — OWASP #1 LLM vulnerability; attack success rates reaching 84% in agentic systems. vectra.ai/topics/prompt-injection
  29. Accelirate. “The 2026 Agentic AI Governance Crisis.” Jan 2026 — Gartner: more than 40% of agentic AI projects will be cancelled by 2027. accelirate.com/agentic-ai-governance-crisis
  30. McKinsey & Company. “Agentic commerce: How agents are ushering in a new era.” — up to $1 trillion in orchestrated US B2C retail revenue from agentic commerce by 2030. mckinsey.com/capabilities/quantumblack/our-insights/the-agentic-commerce-opportunity-how-ai-agents-are-ushering-in-a-new-era-for-consumers-and-merchants
  31. PYMNTS. “How AI Killed Information Asymmetry in B2B Procurement.” 2026 — “AI effectively eliminates this asymmetry,” giving buyers aggregated market intelligence and real-time benchmarks. pymnts.com/news/artificial-intelligence/2026/how-ai-killed-information-asymmetry-in-b2b-procurement

© 2026 Leverage AI, Scott Farrell. All rights reserved. This content is made available on a limited, revocable, read-only basis only. No licence or right is granted to copy, reproduce, republish, scrape, store, adapt, summarise, index, embed, or use this content to create derivative works, work product, deliverables, methodologies, training materials, prompts, templates, software, services, research, or commercial outputs, whether by humans or machines, without prior written permission. This restriction includes internal business use, client work, consulting, advisory, implementation, and any use in or for artificial intelligence, machine learning, data extraction, retrieval, evaluation, fine-tuning, or knowledge-base construction.