Keep Up Your End of the Bargain
Why AI Moves Work Instead of Removing It
AI doesn't remove work. It moves work.
From execution to thinking. From doing to directing.
And that changes everything about how you should use it.
What You'll Learn
- ✓ Why AI has disappointed you (and what to do about it)
- ✓ The director/crew mental model for effective AI use
- ✓ How effort redistributes from 80/20 to 10/90
- ✓ The specification mistakes that produce mediocre outputs
- ✓ Practical frameworks to unlock AI's true leverage
By Scott Farrell
December 2025
The Lie You Were Sold
You stare at the ChatGPT response. It's not wrong, exactly. It's just… not what you meant. Not what you needed. You try again. Rephrase. Add context. Hit enter. Same feeling. You've paid for a magic trick and received a card shuffle.
The promise was simple: AI will do the work for you. Just ask, and it delivers. Minimum effort in, maximum result out. "10x your productivity." "Work smarter, not harder." "Your AI assistant that understands you."
The expectation was a telepathic genius—a smart genie that could read between the lines, fill in the gaps, and hand back exactly what you imagined but hadn't quite articulated. You'd mutter something vague, and it would deliver perfection.
But that's not what happened.
The Gap Between Promise and Reality
What you expected was a magic button. Click it, and the work is done. Type a few words, receive polished output. AI as the ultimate shortcut—less work, better results, instant gratification. The demos made it look effortless. The marketing told you it would change everything.
What you got was something else entirely. Generic outputs that feel "vibey" but not useful. Endless iterations trying to coax out what you actually wanted. Bland, both-sides, hedge-everything responses that are technically correct but somehow miss the point completely. You asked for a comparison, but you wanted it in a table—you just didn't say so. You asked which option was "best," and AI gave you a paragraph about how "it depends," because you didn't specify best for whom, or by what criteria.
The Magic Button Problem
The "magic button" is a user experience anti-pattern. It promises instant results without revealing the work required. Clicking feels exciting at first, but rapidly becomes disappointing after a few tries.
Why? Because it violates a core principle of user experience: managing expectations. When AI is positioned as magic, the gap between expectation and reality creates frustration instead of value.
The sneaking suspicion grows: maybe AI is overrated. Maybe everyone else is getting better results. Maybe you're doing something wrong. But what?
The Gap Is Measurable
This isn't just anecdote. The disappointment is real, widespread, and documented by research.
In 2025, METR conducted a rigorous study of experienced software developers working on real issues in live repositories—not simplified benchmark tasks, but actual production work with all its complexity, ambiguity, and context. The results shocked the industry.
"When developers are allowed to use AI tools, they take 19% longer to complete issues—a significant slowdown that goes against developer beliefs and expert forecasts."— METR Research, 2025
Read that again. Not 19% faster. Nineteen percent slower.
And here's the truly revealing part: these developers expected AI to speed them up by 24%. Even after experiencing the slowdown in the study, they still believed AI had made them 20% faster. The gap between perception and reality is striking—and it explains why so many people keep trying AI despite mounting frustration.
Magic in the Lab, Struggle in the Field
Synthetic Benchmarks
- • Simplified, controlled tasks
- • Clear requirements and scope
- • No production complexity
- • Often tested with beginners
- • Result: "55% faster completion" (GitHub claims)
Real-World Production
- • Complex, ambiguous requirements
- • Live codebases with legacy constraints
- • Experienced developers
- • Full production context
- • Result: "19% slower completion" (METR study)
The difference: complexity, ambiguity, and context that benchmarks don't capture.
The Scale of Disappointment
The METR study isn't an outlier. It's part of a pattern visible across industries and roles.
A 2025 Upwork study found that 77% of workers say AI has either increased their workload or decreased their productivity. Not "stayed the same"—actually made things worse. These aren't people who've never tried AI. These are workers actively using AI tools in their jobs, and most are reporting negative outcomes.
Stack Overflow's annual survey of nearly 50,000 developers tracks sentiment over time. The results show a steep decline in trust. In 2024, 43% of developers trusted the accuracy of AI tool outputs. By 2025, that number had fallen to 33%. In parallel, how favorably developers view adding AI tools to their workflow dropped from 72% to 60% in a single year.
This decline isn't happening because the models got worse. Claude 4.5, GPT-5.1, and Gemini 3.0 are more capable than ever. The decline is happening because real-world use is exposing the gap between what AI was marketed as and what AI actually requires.
The Cost of Disappointment
The consequences of this gap aren't trivial. They compound over time.
At the individual level: wasted subscription fees (20–50 dollars per month for tools that don't deliver the promised value), time spent iterating on outputs that never quite get there, missed opportunities to learn the actual leverage points because early frustration blocks deeper exploration.
At the organizational level: failed AI adoption initiatives where teams conclude "AI doesn't work for us," pilot projects that never ship because outputs are too unreliable, growing resistance to AI tools from burned teams who tried and were disappointed.
The greatest cost is invisible: never discovering the actual power of AI because the first few attempts created learned helplessness. The tool gets blamed. The approach never gets questioned.
Why Early Praise Was Misleading
If AI is so problematic, why did it look so impressive at first? Why the hype?
Much of the early AI praise came from tests using simplified, synthetic tasks. Benchmarks designed to showcase what models can do under ideal conditions. Demos that show best-case scenarios—often with significant hidden prompt engineering behind the scenes. The people running these demos know how to coax good outputs from AI. But that expertise is invisible to the viewer.
In controlled lab environments with clear requirements and well-defined scope, AI genuinely looks like magic. Code completion on common patterns? Extraordinary. Summarizing a document with clear structure? Impressive. Generating variations on a well-specified theme? Powerful.
But production work isn't a benchmark. Real tasks come with ambiguity, legacy constraints, unstated assumptions, organizational context, and constantly shifting requirements. This is where the "magic" story falls apart.
"Much of the early praise around coding assistants came from tests using simplified, synthetic tasks, or beginner developers. In those environments, AI looks like magic. But this study focused on experienced developers solving real issues in live repositories. In that context, AI often didn't help at all. In fact, it sometimes made things worse."— Diginomica: AI Tools Slow Down Experienced Developers
The other factor: the novice effect. AI shows bigger productivity gains for beginners working on basic tasks. If you're learning to code, Copilot can feel transformative—it's giving you syntax you don't yet know. If you're an experienced developer working on a complex architecture problem, AI often becomes cognitive overhead rather than leverage.
The demos weren't lies. They were incomplete truths. They showed what's possible without showing what's required.
The Contractor Analogy
Imagine you need to hire a contractor to build something. You call them up and say:
"Just build me… y'know… like… an app? For booking stuff? Make it good."
The contractor asks clarifying questions. What kind of bookings? Who are the users? What devices? Do you need payments? What's your budget? What's the timeline?
You wave them off. "You're the expert. Just make it work. I'll know it when I see it."
Three weeks later, the contractor delivers something. It's functional. It handles bookings. But it's not what you imagined. It doesn't match the unspoken vision in your head—the one you assumed was obvious but never articulated.
You'd call that contractor incompetent, right? Except the problem wasn't the contractor. The problem was the brief. It was impossible. No amount of expertise can read your mind.
Yet this is exactly how most people approach AI.
"Don't offload your critical thinking to a tool that doesn't understand your environment. And definitely don't expect it to be your shortcut when you haven't done the hard thinking first."— Medium: AI Made It Worse
The Self-Reinforcing Cycle
Here's the pattern that traps most people:
- You try AI with high expectations (the marketing promised magic)
- The first output is underwhelming or misses the mark
- You iterate a few times with diminishing hope
- You conclude: "AI is overhyped" or "AI doesn't work for me"
- You either abandon the tool or use it reluctantly, never investing the effort to learn what actually works
The cycle is self-reinforcing. Vagueness leads to disappointment, which leads to abandonment, which prevents discovery of the actual leverage points. The failure mode creates learned helplessness: "I tried AI and it didn't work, therefore AI isn't useful."
The problem compounds when everyone around you seems to be getting results. You start to wonder: what am I missing? Why does it work for them but not for me?
The answer isn't what you think.
The Market Correction
We're living through a necessary market correction. The initial phase of "wow"—the demos, the hype, the promises—is being replaced by a demand for "how." How do we actually make this work? How do we get reliable results? How do we move from pilot to production?
Early adopters are beginning to feel let down as results fail to match once-optimistic expectations. AI projects often fail not because the technology is bad, but because they're untethered from reality, attached to unrealistic or uninformed expectations, or lacking the foundational understanding of what AI actually requires.
From Wow to How
2023: Experimenting with AI
"Wow, it can do that! This is going to change everything!" Early demos, viral moments, breathless headlines.
2024: Adopting AI
"Let's add this to everything. Every team gets ChatGPT Plus. Let's build pilots." Widespread adoption without widespread understanding.
2025: The Reckoning
"Why isn't this working the way we expected? Why are we slower? Why don't the outputs match production quality?" Real-world complexity meets unrealistic expectations.
Now: Understanding
"Here's the actual contract. Here's what AI requires. Here's how to use it properly." Maturity begins.
This correction is healthy. It's the moment when hype meets reality and forces a reckoning. The people who get through it will unlock genuine leverage. The people who don't will conclude AI was oversold and move on.
This Isn't About AI Being Bad
Before we go further, let's be clear about what this book is not arguing.
What This Book Will NOT Argue
- AI is useless. It's not. When used correctly, AI is extraordinarily powerful. The models available in 2025 are capable of remarkable things.
- You need to become a prompt engineer. You don't. Mindset matters far more than technique. Understanding the contract matters more than memorizing tricks.
- AI requires more total effort. Not necessarily. Effort is redistributed, not always increased. But it's redistributed in a way most people don't expect.
- You made a mistake in trying AI. You didn't. The problem is the expectation that was set, not your decision to experiment.
What This Book WILL Argue
- The expectation was wrong, not the tool. AI isn't a magic genie. It's something else—something powerful, but different.
- AI doesn't remove work—it moves work. The effort shifts from execution to specification. That shift is profound and poorly understood.
- There's a specific pattern to AI success. It's not luck. It's not magic. It's a learnable approach that most people haven't internalized.
- Your frustration is data, not failure. The gap between what you expected and what you got reveals something important about how AI actually works.
- You can fix this, starting immediately. The shift isn't complicated. But it does require rethinking the contract.
The Promise of This Book
By the time you finish this book, you will understand:
- Why AI disappointed you—not just "try harder," but the actual mechanism behind vague inputs producing vague outputs
- A new mental model that works: the director-crew relationship, not the magic genie myth
- The specific shifts that unlock AI's power—shifts in how you think about the work, not just how you phrase prompts
- How to start seeing different results with your very next prompt, because you'll understand the bargain AI actually offers
This isn't a book about prompt engineering tricks. It's not a comparison of tools. It's not a prediction about the future of work.
It's a book about the uncomfortable truth that nobody wants to hear but everyone needs to understand: AI doesn't reduce the work. AI moves the work. And once you understand where the work moves to, everything changes.
What Comes Next
The lie you were sold was that AI is a shortcut to effort. That you can do less and get more. That the hard work is over.
The truth is more interesting—and more useful.
In the next chapter, we'll unpack the core insight that changes everything: AI doesn't remove work, it moves work. From execution to specification. From doing to thinking. From implicit to explicit.
That single shift explains the 19% slowdown, the 77% dissatisfaction, the declining trust, and—most importantly—the path forward.
Because once you understand where the work moves to, you stop fighting AI and start working with it.
The problem isn't that AI is overhyped. The problem is that we approached it with the wrong mental model.
And nobody told us the real contract.
AI Doesn't Remove Work. It Moves Work.
The single insight that explains virtually all AI frustration—and the pathway to real leverage.
Here's the uncomfortable truth that the marketing never mentioned:
AI doesn't remove work. It moves work.
This isn't a pessimistic take. It's the mechanism that explains both why AI disappoints and why it delivers. Once you understand where the work moves to, everything changes.
Let me show you.
The Economic Intuition is Wrong
When people imagine using AI, they picture something like this:
- Before AI: 100 units of work
- After AI: 10 units of work
- Net result: 90% reduction in effort
That's the promise. Less work, better results, more time for strategy or coffee or going home early.
But that's not what's happening.
Here's the actual pattern:
- Before AI: 100 units of work (80 execution + 20 thinking)
- After AI: Still ~100 units of work (10 execution + 90 thinking)
The total effort hasn't disappeared. The type of effort changed. Dramatically.
"You're not going from 100 units of work to 10 units of work. You're going from 80 units execution + 20 units thinking to 10 units execution + 90 units thinking."
This ratio shift is the mechanism behind both AI success and AI frustration. When you understand it, the pattern becomes obvious across every domain.
Code, video, prose, analysis, design—same pattern. The execution becomes trivial. The specification becomes everything.
The Research Confirms the Redistribution
This isn't speculation. It's documented across multiple studies from 2024-2025.
KPMG research found that work is shifting from execution to orchestration. Humans are becoming designers, verifiers, and supervisors of intelligent agents. This requires redesigning job descriptions, decision rights, and accountability frameworks. Fifty-two percent of leaders now rank job redesign as their top workforce priority.
— KPMG: Rethinking Strategic Workforce Planning with AI Agents, 2025
BCG research arrived at the same conclusion from a different angle: work is being redistributed, not eliminated. Teams find new ways to integrate AI into execution and value creation. Support-heavy roles are shrinking as AI handles execution, but organizations are shifting to cross-functional pods powered by AI assistants. The work transforms rather than disappears.
— BCG: AI Is Moving Faster Than Your Workforce Strategy, 2025
McKinsey's research pushed this further: rather than replacing entire job functions, AI transforms discrete components of work. Companies are moving toward task-based planning models, analyzing work at granular levels to determine which tasks remain uniquely human and which can be offloaded to machines.
— McKinsey: Skill Shift - Automation and the Future of the Workforce, 2025
Same pattern, different angles. The work still exists—just at a different granularity, requiring different skills.
The Three Hidden Layers of AI Work
Every "AI-assisted" workflow hides three forms of invisible human effort. These weren't part of the original AI promise. But they're very much part of the reality.
1. Verification Work
Checking whether outputs are correct and compliant. Did the AI produce accurate information? Does this meet quality standards? Is this appropriate for the context? This work didn't exist before AI—someone just did the task correctly the first time.
2. Correction Work
Editing, reframing, or sanitizing content before use. Fixing hallucinations, errors, and off-tone outputs. Adapting generic AI output to specific context. Making AI-ese sound like human communication. Every output becomes a draft that needs polish.
3. Interpretive Work
Deciding what AI's suggestions actually mean for your context. Translating AI output into actionable decisions. Understanding when AI is confident versus guessing. Knowing which parts to trust and which to verify. This is cognitive work masquerading as automation.
This is why 77% of workers report that AI has either increased their workload or decreased their productivity.
— Upwork Research: AI and the Future of Work, 2025
"In many cases, instead of cutting effort, AI just stacks a second layer of work on top of the first—reviewing outputs, bridging system limitations, handling exceptions."— Shep Bryan: Cognitive Load Research, 2025
Cognitive Supervision: The New Skill
We are witnessing the emergence of a new human skill: cognitive supervision.
It's the ability to guide, critique, and interpret machine reasoning without doing the work manually. It's the corporate equivalent of teaching someone to manage a team they don't fully understand.
— Wharton: The AI Efficiency Trap, 2025
Here's why it's harder than it sounds: You need to understand the domain well enough to verify AI outputs. But if you could do the work easily yourself, why use AI? The catch is that AI is most useful for tasks you're less expert in—where verification is hardest.
This creates what I call the verification paradox:
- AI is most valuable for tasks you can't easily do yourself
- But tasks you can't easily do yourself are hardest to verify
- Result: the more leverage you want, the more cognitive supervision you need
The very situations where AI promises the biggest gains are the situations where the hidden cognitive work is highest.
The Cognitive Shift Research
A 2025 study from Microsoft Research and Carnegie Mellon University surveyed 319 professionals who regularly use AI tools in their work. They documented three major shifts in how people approach cognitive tasks when using AI.
— Microsoft/CMU: Critical Thinking and AI Tools in the Workplace, 2025
The Three Cognitive Shifts
| Work Type | Pre-AI Effort | Post-AI Effort |
|---|---|---|
| Information Work | Gathering data | Verifying AI outputs |
| Problem-Solving | Direct solving | Integrating AI suggestions |
| Analysis & Evaluation | Hands-on execution | Oversight & quality control |
For information-related work, effort moves from gathering data to verifying AI outputs. In problem-solving scenarios, focus shifts from direct solutions to integrating AI suggestions. For analysis and evaluation tasks, workers transition from hands-on execution to oversight and quality control.
There's an important exception: For basic recall and comprehension tasks, 72% of participants reported decreased effort when using AI tools. AI genuinely reduces grunt work.
But for complex, judgment-intensive work? Effort shifts—it doesn't disappear. AI reduces the doing but increases the thinking.
The Skills Shift Projection
McKinsey's research projects what this means for skill demand over the coming years:
Growing skill categories:
- Social and emotional skills (AI can't replicate these)
- Technological skills (operating and integrating AI systems)
- Advanced cognitive skills (judgment, synthesis, creativity)
Declining skill categories:
- Basic cognitive skills (data input, basic processing)
- Hours spent on these skills decline by 15%, their share of hours worked falling from 18% to 14%
- These are exactly the tasks AI handles well
The implication is clear: If your job was mostly "basic cognitive skills," it's shrinking. If your job was mostly "advanced cognitive skills," it's growing. The shift rewards thinking work over doing work.
Why This Feels Like a Betrayal
People were sold "AI as shortcut." What they actually got was something different:
- A force multiplier on structured thought
- A brutally honest mirror for vague thought
You expected to do less. Instead, you're doing different—and sometimes more. The "more" was invisible in the marketing. When the hidden work becomes visible, it feels like a broken promise.
There's also a timing problem. AI value takes time to emerge. Microsoft research found that 11 weeks are required to fully realize productivity gains. Most people give up before the investment pays off. They experience the adjustment period—the cognitive load, the learning curve, the verification work—without staying long enough to see the compounding returns.
— Microsoft: Realizing Productivity Gains from AI Tools, 2025
The Reframe: Leverage, Not Shortcut
AI is not a shortcut to effort. It's not a way to avoid work.
It's a way to change what kind of work you do—and amplify the results of that work.
Here's the reframe that makes everything click:
AI is a shortcut to leverage.
Same input of effort, but different output potential. Your thinking work can now scale in ways it couldn't before. One hour of clear specification can produce outputs that would take days to create manually.
But here's the catch: the hour of specification can't be skipped.
Before AI, you could hide fuzzy thinking behind activity. You'd start writing, figure it out as you went, iterate through drafts. The messiness was private. The thinking happened mixed in with the doing.
With AI, fuzzy thinking produces fuzzy outputs—at scale. The quality of your thinking is externalized and reflected back at you, wrapped in fluent language. You can't hide behind busywork anymore.
This is the uncomfortable truth: AI reveals thinking quality you could previously conceal.
What This Means Practically
For individuals:
- Stop expecting AI to reduce your workload
- Start expecting AI to change your workload
- Invest in the thinking work that produces AI leverage
- Recognize that verification and interpretation are now core skills
For teams:
- AI adoption requires thinking work, not just tool access
- Redesign workflows for the new effort distribution
- Build cognitive supervision capabilities deliberately
- Measure the right things—not just "time saved"
For leaders:
- The 19% slowdown in the METR study isn't AI failure—it's an adjustment period
- 11 weeks to fully realize gains isn't failure—it's a learning curve
- Investment in specification and thinking skills has higher ROI than tool selection
- Job redesign is not optional—52% of leaders already recognize this
The Core Truth About Effort
AI doesn't make work go away. It transforms what work means.
- From hands to brain
- From doing to directing
- From execution to specification
If you approached AI expecting less work, you misunderstood the contract. If you're willing to do different work—thinking work—AI delivers real leverage.
The Question Isn't "Does AI Work?"
The question is: Are you doing the work that makes AI work?
That "work" isn't typing. It isn't clicking. It isn't installing tools or subscribing to services.
It's the cognitive work of clarity. Specification. Direction. Verification. Interpretation. Judgment.
The 90% of effort that moved upstream when AI took over the 80% downstream.
And here's the thing: that upstream work has a name, a shape, and a learnable structure.
It's captured in a single mental model that transforms how you approach AI.
In the next chapter, we'll reveal it:
You're the Director. AI Is the Crew.
You're the Director. AI Is the Crew.
Picture a film set. There's a talented cinematographer, sound engineer, lighting tech, grips. Professionals who can execute at a high level. The crew. Now imagine the director walks in and says:
"Just make it look good."
What happens?
The cinematographer has no idea what shot to set up. Wide or tight? Handheld or locked off? Warm or cold light? High key or low key? They're paralysed by infinite options. So they default to something safe and generic. The result looks like stock footage.
Now imagine the director says instead:
"I want a slow push-in on her face."
"Natural light from the window."
"She's realising something painful but trying not to show it."
"Stay tight enough that we see the moment her eyes change."
Now the cinematographer can deliver something powerful. The specification unlocked the craft.
AI works exactly the same way.
You're not the audience expecting magic. You're the director providing vision. The AI is the crew—talented, capable, but waiting for direction.
The Method Actors Research
Recent research from arXiv demonstrates this principle with startling precision. Under what researchers call the "Method Actors" mental model, large language models should be thought of as actors; prompts are scripts and cues; and LLM responses are performances. Prompt engineering is playwriting and directing.
"Under this mental model, LLMs should be thought of as actors; prompts as scripts and cues; and LLM responses as performances. Prompt engineering is playwriting and directing."— ArXiv: LLMs as Method Actors
The same AI model. The same puzzle dataset. Different prompting approaches—and radically different results.
Let that sink in. The AI's capability didn't change between tests. Only the quality of direction changed. A vanilla approach solves 27% of puzzles. The strongest Method Actor approach solves 86%. That's more than a three-fold improvement from better direction alone.
This is the director/crew effect in action.
Joint Cognition: What You Bring, What AI Brings
The relationship between human and AI is genuinely symbiotic. But symbiosis doesn't mean mind-reading. It means complementary contributions. Each side brings something the other cannot.
The Division of Labour
What You Bring
- •Intent: What are you trying to achieve?
- •Taste: What feels right, what feels wrong?
- •Constraints: What must be true, what can't happen?
- •Context: What's the broader situation?
- •Judgement: When to push forward, when to stop?
What AI Brings
- •Speed: Generate in seconds what would take hours
- •Breadth: Explore many options simultaneously
- •Pattern matching: Recognise structures across domains
- •Generation: Produce raw material at scale
- •Scale: Multiply outputs without fatigue
Neither is sufficient alone. Both are necessary together.
The combined system is powerful—but only if your half is doing serious lifting. If you provide vague intent, you get generic output. If you provide clear direction, you get focused execution.
The quality of the output tracks the quality of your direction.
How Advanced AI Systems Use This Pattern
Anthropic's multi-agent research system demonstrates this at scale. It uses an orchestrator-worker pattern, where a lead agent coordinates the overall process while delegating to specialised subagents that operate in parallel. The orchestrator delegates, integrates results, and decides next steps.
What's revealing is how Anthropic describes effective prompting for these systems:
"The best prompts for these agents are not just strict instructions, but frameworks for collaboration that define the division of labor, problem-solving approaches, and effort budgets."— Anthropic Engineering Blog: Multi-agent research system
Notice the language: "frameworks for collaboration." You're not just writing a command. You're establishing a working relationship. You're defining how you and the AI will collaborate on this task.
The prompt is the briefing document for your AI crew.
Chain-of-Thought as Cognitive Collaboration
Chain-of-Thought prompting isn't just a technical trick. Research describes it as "a cognitive collaboration framework" that leverages how human working memory processes information. Advanced AI systems assign roles like "devil's advocate" or "synthesiser" to simulate team cognition.
This is prompt-driven collaborative psychology. The director doesn't just give a single instruction. The director designs a cognitive process:
- •Multiple perspectives on the same problem
- •Multiple passes through the material
- •Multiple checks before finalising
The director decides how the team works together. That's your role.
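To make this concrete, here is a minimal sketch of what a multi-perspective, multi-pass process might look like in code, using the OpenAI Python client. The role labels, model name, and prompt wording are illustrative choices of mine, not prescriptions from the research cited above.

```python
from openai import OpenAI

client = OpenAI()  # assumes an API key is configured in your environment

draft = "...your draft text here..."  # the material you want the crew to work on

# The director designs the cognitive process: same material, several passes,
# each pass playing a different role. Roles and wording are illustrative.
passes = [
    ("devil's advocate", "Identify the three weakest claims in this draft and explain why they fail."),
    ("synthesiser", "Combine the strongest surviving points into one coherent argument."),
    ("final checker", "List anything still unsupported or unclear before this goes out."),
]

material = draft
for role, instruction in passes:
    response = client.chat.completions.create(
        model="gpt-4o",  # any capable chat model; swap for whatever you use
        messages=[
            {"role": "system", "content": f"You are acting as the {role} on a small editorial team."},
            {"role": "user", "content": f"{instruction}\n\n---\n{material}"},
        ],
    )
    material = response.choices[0].message.content  # feed each pass into the next

print(material)
```

The point isn't the specific roles—it's that the director, not the model, decides how many passes happen and what each one is for.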
Prompting as a New Form of Programming
We are moving into a world where collaboration with machines is the norm. Where prompting becomes a form of programming. Research on prototyping with AI found that designers—people with expertise in crafting interfaces and designing input/output flows—were particularly drawn to using prompts as "templates" during the prototyping process.
Prompts are interface design for cognition. The prompt shapes the collaboration, just as a user interface shapes user behaviour.
Prompting isn't just "asking questions." It's designing interactions. It's structuring collaboration. It's directing a performance.
What the Director/Crew Model Changes
For Your Expectations
Stop expecting AI to guess what you want.
Start expecting to tell AI what you want, clearly.
The quality of output is your responsibility, not just AI's.
For Your Process
Before you prompt: Know what you're asking for
During prompting: Provide the direction a crew needs
After output: Evaluate like a director watching a take
Iterate: "That's good, but try it with more X"
For Your Relationship with AI
❌ Master and servant (AI does what you say)
❌ User and magic box (AI figures it out)
✓ Director and crew: professional collaboration with clear roles
The Director's Briefing Checklist
Before every significant AI interaction, ask yourself these six questions: What am I trying to achieve? Who is the output for? What format should it take? What constraints apply? What does "good" look like? What context does the crew need that it can't guess?
The test: Could a talented human professional execute from this brief? If not, the AI can't either. If a cinematographer couldn't shoot from your direction, the AI can't generate from it.
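One way to keep yourself honest is to treat the brief as a small data structure and refuse to prompt until every field is filled. This is a sketch under my own field names, not a standard; the fields simply mirror the dimensions this chapter keeps returning to.

```python
from dataclasses import dataclass, fields

@dataclass
class DirectorBrief:
    """What a crew -- human or AI -- needs before it can execute."""
    intent: str            # what you're trying to achieve
    audience: str          # who the output is for
    output_format: str     # table, memo, narrative, code...
    constraints: str       # length, tone, what must not happen
    success_criteria: str  # what "good" looks like here
    context: str           # background the crew can't guess

    def missing(self) -> list[str]:
        """Every blank field is an invitation for the AI to guess."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

brief = DirectorBrief(
    intent="Persuade leadership to fund the pilot",
    audience="CFO with three minutes of attention",
    output_format="One-page memo with a short cost table",
    constraints="Under 500 words, semi-formal, no jargon",
    success_criteria="A yes/no decision without a follow-up meeting",
    context="",  # left blank on purpose to show the check firing
)

gaps = brief.missing()
if gaps:
    print("Brief fails the director's test -- a human couldn't execute it either:", gaps)
```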
Common Director Mistakes
Four Ways Directors Fail Their AI Crew
❌ The Absent Director
Giving a task with no direction at all. "Write me something about marketing."
Result: AI has infinite options, picks something generic. No director, no vision, no quality.
❌ The Micromanaging Director
Specifying every word, every comma. Leaving no room for AI's strengths.
Result: Might as well write it yourself. Wastes AI's leverage completely.
❌ The Impatient Director
One prompt, expect perfection. No iteration, no "try it another way."
Result: Directors watch multiple takes. AI users should too. The first output is rarely the final cut.
❌ The Unclear Director
Knows what they want but can't articulate it. "Make it... you know... better."
Result: The AI reflects back the vagueness. Clarity is a learnable skill.
The Mindset Shift: From User to Director
| User Mindset (doesn't work) | Director Mindset (works) |
|---|---|
| "I'll ask and AI will figure it out" | "I'll provide clear direction and AI will execute" |
| "AI should understand what I mean" | "I need to communicate what I mean clearly" |
| "If the output is bad, the AI is bad" | "If the output is bad, I should clarify my direction" |
| Passive consumption | Active collaboration |
The key insight: The director owns the result. Not because they did the work, but because they provided the vision. You are accountable for the quality of your direction.
What This Unlocks
The director/crew mental model transforms your relationship with AI from frustrating to productive:
- ✓A new relationship with AI: collaborative, not magical
- ✓A clear focus: improve your direction, not just find better AI
- ✓A path forward: learn to direct better, get better results
- ✓Accountability: you own the quality of your direction
Remember
- • You're the director. AI is the crew.
- • Good direction produces good results. Vague direction produces stock footage.
- • The capability is there—27% to 86% on the same model proves it.
- • Your direction is the variable you control.
But this raises a question: Why does vague direction produce vague outputs? What's the actual mechanism at play?
In the next chapter: The Laziness Mirroring Effect
The Laziness Mirroring Effect
"If I'm lazy, it'll be lazy."
It's a phrase that captures something weirdly accurate about working with AI. The model doesn't literally decide to slack off, but the effect is real and measurable. When you provide vague input, AI seems to "phone it in." Understanding why this happens changes everything about how you approach AI.
This isn't magic or telepathy. It's statistics. And once you understand the mechanism, you can stop falling into the trap.
The Vagueness Cascade
The mechanism behind vague inputs producing vague outputs follows a predictable four-step pattern:
- You provide a vague input
- The space of valid responses becomes enormous
- The AI samples the safe, generic middle of that space
- The output feels useless
This isn't the AI being lazy. It's the AI not knowing what you want. Without strong signals, it samples from "likely to be acceptable"—which is, by definition, the middle of the distribution. And the middle of the distribution is generic.
AI Is Reflecting Your Ambiguity Back at You
Here's what makes it particularly uncomfortable: AI doesn't just produce bad output from vague input. It produces plausible-sounding bad output, wrapped in fluent language and confident phrasing.
So it feels like the AI tried. The sentences flow smoothly. The tone is professional. But the content is still the safe middle—avoiding anything that might be wrong by also avoiding anything particularly useful.
"AI models, despite their sophistication, don't inherently understand context or implied information the way humans do. They operate based on patterns and probabilities derived from their training data. When we provide vague or ambiguous prompts, we're essentially asking the AI to make assumptions or interpretations, which can lead to outputs that don't align with our intentions."— PromptPanda: AI Prompt Optimization
This creates a moral discomfort. You're forced to confront how fuzzy your own thinking was. The frustration isn't really at the AI—it's at yourself. AI just made visible what you could previously hide when you were the one doing the execution.
The Research on Ambiguity
The Nielsen Norman Group, experts in user experience research, documented this pattern clearly: "AI struggles with ambiguity and is unable to deliver thoughtful results within a broad context." One of the biggest mistakes users make is providing prompts that are overly general.
A vague prompt like "Tell me about marketing" will yield a generic, unfocused answer. Not because the AI doesn't know about marketing, but because there are ten thousand valid angles on "marketing" and no signal about which one you actually want.
Research shows that models like Qwen1.5-7B and Flan-PaLM 2 perform poorly with vague prompts but improve significantly with clear wording. Same model. Same capability. Different clarity. Different results.
GIGO 2.0: The AI Version Is Worse
"Garbage in, garbage out" has been a computing principle since 1957, when IBM programmer George Fuechsel used it to explain that computers produce erroneous output when given erroneous input.
With traditional software, bad input causes obvious errors: crashes, error messages, obviously broken outputs. You know immediately that something went wrong.
With AI, bad input causes plausible-sounding bad output. The failure mode is subtler and more dangerous.
GIGO 1.0 vs GIGO 2.0
Classic GIGO (1957)
- • Bad data in → Error message or crash
- • Failure is obvious and immediate
- • Easy to spot the problem
- • System won't proceed with garbage
- • Forces you to fix the input
AI GIGO (2025)
- • Vague input → Fluent, confident, wrong output
- • Failure is subtle and delayed
- • Sounds good, might be garbage
- • System happily proceeds with ambiguity
- • Much harder to detect and fix
The principle is simple: the quality of your input determines the quality of your output. With AI, that principle intensifies because the "input" isn't just data—it's your clarity of thought.
Beyond Data Quality: Thinking Quality Matters
GIGO isn't just about data quality. It also applies to incorrect thinking, incorrect assumptions, and bias. In the AI context, sources of "garbage" include:
- Poor understanding of causality
- Incomplete or missing documentation of what you actually want
- Wrong hypotheses about what the task requires
- Inadequate research before prompting
- Miscommunication of goals
- Misunderstanding of what you're asking for
- Erroneous judgments about what "good" means
- Relying on human intuition without verification
With AI, the "input" isn't just data—it's the clarity of your thinking. Fuzzy thinking produces fuzzy AI output. Clear thinking produces clear AI output. AI amplifies thinking quality, for better or worse.
Why Vagueness Produces Generic Outputs: The Statistical Mechanism
AI doesn't "choose" to be generic. It samples from a probability distribution based on patterns it learned during training.
When your input is vague, there's no strong signal about which part of the distribution to sample from. So AI defaults to the "peak"—the most likely outputs given the prompt. And the most likely outputs are, by definition, the most common patterns. Which are generic.
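A toy illustration makes the mechanism visible. The completions and probabilities below are invented purely to show the shape of the problem—they are not measured from any model. With no signal in the prompt, the probability mass sits on the most common framing, and sampling keeps landing there.

```python
import random

# Invented toy distribution over framings of "tell me about marketing".
# With a vague prompt, the generic framing dominates the probability mass.
completions = {
    "generic overview of marketing basics":        0.55,
    "balanced both-sides summary of channels":     0.25,
    "B2B demand-generation playbook":              0.08,
    "retail loyalty-programme economics":          0.07,
    "attribution pitfalls for finance leaders":    0.05,
}

# Sampling mirrors what a model does with no signal: most draws are generic.
for _ in range(5):
    print(random.choices(list(completions), weights=list(completions.values()))[0])

# A specific prompt ("...for a Series B SaaS CMO planning next quarter's budget")
# effectively collapses this distribution onto the framing you actually want.
```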
A Concrete Example: PyTorch vs. TensorFlow
Here's a real example from Medium research on improving ChatGPT's ability to understand ambiguous prompts. Asked to compare PyTorch and TensorFlow with nothing more than "What's the difference between them?", the model flounders—because "them" is ambiguous and "difference" is multidimensional. A question that names its subject and a single dimension—such as "What are the advantages of GPT-4?"—gives the model something it can actually answer.
Same AI, same topic, different clarity, different quality.
What changed? Not the AI's capability. Not the subject matter. Only the specification of what was being asked. Clarity unlocked quality.
The Mirror You Didn't Ask For
AI externalizes and reflects the quality of your thinking. It's an uncomfortable mirror that reveals fuzzy thinking you could previously hide.
In manual work, you could muddle through with vague thinking. Your execution would gradually clarify your intent. The doing would reveal the thinking. Nobody else saw the fuzzy starting point.
With AI, you have to articulate the thinking before execution. There's no gradual discovery through doing. Fuzzy input produces fuzzy output immediately. The quality of your thinking is instantly visible.
This is the moral discomfort: AI forces you to confront intellectual laziness you could previously mask with activity.
Solutions: Specificity and Structure
The research on prompt optimization is consistent across sources: vague prompts lead to generic or irrelevant outputs. The solution is to be specific and detailed when framing your requests.
"Clearly define what you need. Vague prompts lead to ambiguous results. For instance, instead of saying 'Write a blog post,' specify the topic, tone, and key points. Use precise words and clear formatting. Add context to eliminate confusion. Break complex tasks into smaller steps."— Hatchworks: Expert's Guide to Generative AI Prompts
Clarity matters more than length. As Gen AI & Prompting research notes: "Verbosity without precision hurts clarity—a concise, well-chosen keyword can often produce better results than a long but vague description."
It's not about writing more. It's about writing clearer.
Practical Specificity Tactics
Define What You Need
Instead of: "Write a blog post"
Specify: Topic, tone, length, key points, target audience, desired action
Example: "Write a 1,200-word blog post for mid-market B2B decision-makers explaining why AI projects fail. Use a direct, evidence-based tone. Include three specific failure patterns with research citations. End with a checklist."
Use Precise Words and Clear Formatting
Instead of: "Make it professional"
Specify: The exact format and structure you want
Example: "Use a comparison table with three columns: Feature, Option A, Option B. Include 5-7 rows covering cost, implementation time, ongoing maintenance, skill requirements, and vendor lock-in risk."
Add Context to Eliminate Confusion
Instead of: "Summarize this"
Specify: Who the summary is for and what they'll do with it
Example: "Summarize this 50-page technical report in 3 bullet points for a CFO who needs to decide on budget allocation. Focus only on financial implications and timeline risks. Skip technical implementation details."
Break Complex Tasks into Smaller Steps
Instead of: "Create a marketing strategy"
Specify: The sequence of discrete subtasks
Example: "Step 1: List 5 customer pain points our product addresses. Step 2: For each pain point, draft a one-sentence value proposition. Step 3: Rank them by urgency. Step 4: Write a positioning statement that leads with the top-ranked pain point."
The Laziness Test
Before you hit enter on your next AI prompt, ask yourself:
- Am I giving direction or hoping for telepathy? If you're waiting for AI to "figure out" what you mean, you're hoping for telepathy.
- Could a talented professional execute from this brief? If a human expert couldn't produce what you want from your prompt, neither can AI.
- Have I specified the dimensions of "good"? Format, audience, constraints, tone, success criteria—are they clear?
- Am I being lazy and hoping AI will compensate? If yes: AI will reflect your laziness back. Not as punishment—as statistics.
Vague input produces vague output. Always. This is the laziness mirroring effect.
Chapter Summary
Key Takeaways
- • AI doesn't "get lazy"—it samples from probability distributions. Vague input creates infinite valid responses, so AI picks the safe, generic middle.
- • The vagueness cascade: vague input → huge output space → AI picks "generic and safe" → feels useless. This is statistics, not laziness.
- • AI is an uncomfortable mirror that externalizes the quality of your thinking. Fuzzy thinking you could hide in manual work is immediately visible in AI output.
- • GIGO 2.0 is worse than classic GIGO: instead of obvious errors, you get plausible-sounding bad outputs that are harder to detect.
- • Specificity isn't optional—it's the input. Clarity matters more than length. Be precise about topic, tone, format, audience, and constraints.
- • The laziness test: Am I hoping for telepathy? Could a human expert execute from this brief? If not, fix your input before blaming the AI.
The uncomfortable truth: you can't hide fuzzy thinking anymore. AI will reflect it back immediately, wrapped in fluent language that sounds good but delivers nothing useful.
The empowering truth: once you understand the mechanism, you can fix it. Specificity unlocks quality. Every time.
Next: This mechanism is easiest to see in a specific example. The question "What's the best?" perfectly demonstrates multidimensional ambiguity and why AI can't answer questions you haven't fully asked.
The "What's Best?" Problem
One of the most common AI requests—and one of the worst: "What's the best?" It seems simple. It's actually unanswerable. And understanding why transforms your AI use.
Picture this: you have two pieces of text. Maybe two different drafts. Maybe outputs from two different LLMs. Maybe two articles on the same topic. You ask your AI assistant:
"Which is the best?"
AI flounders. Gives you a generic, both-sides-have-merits answer. Doesn't help at all. You think: "AI is useless."
But the real problem isn't the AI. The question seems clear to you, but "best" contains at least ten hidden dimensions. Without knowing which dimensions matter, there's no answer. AI isn't failing—you're asking an impossible question.
The Ten Dimensions of "Best"
When you ask "which is best?" you might mean any of these—and they often conflict:
1. Best Written
Prose quality, elegance, flow. The craft of the writing itself.
2. Most Interesting
Captures attention, provokes thought, keeps you reading.
3. Most Informative
Density of useful information per word.
4. Most Accurate
Factual correctness, no errors or hallucinations.
5. Most Logically Sound
Logical validity, coherent arguments, no contradictions.
6. Most Detailed
Comprehensive coverage, thoroughness.
7. Most Viral
Most likely to be interesting on social media, shareability.
8. Most Accessible
Most likely to interest someone unfamiliar with the topic.
9. Most Expert-Level
Most likely to interest someone with in-depth understanding.
10. Clearest
Ease of understanding, no jargon, simple structure.
These dimensions often conflict. Most detailed ≠ most accessible. Most viral ≠ most accurate. Most expert-level ≠ most clear.
There's no universal "best"—only best for a purpose. When you don't specify which dimension matters, AI defaults to "balanced" across all dimensions. Balanced = generic. Generic = useless to you.
Why AI Flounders on "Best"
Without dimension specification, AI does the only rational thing: it covers its bases. You get responses like:
"Both have merits..."
"It depends on your purpose..."
"Option A is better for X while Option B is better for Y..."
This is technically correct—and completely unhelpful. But the hedge isn't AI failure. It's accurate acknowledgment that "best" is undefined. AI is correctly reflecting that you haven't asked a real question. The problem is the question, not the answer.
Same Content, Totally Different Cognitive Act
Watch what happens when we transform the question from vague to specific:
Prompt Comparison: Vague vs. Specific
The Vague Way
"Which of these two articles is best?"
Result:
Generic comparison covering everything, highlighting nothing. AI defaults to safe, balanced summary with no actionable insight.
The Specific Way
"Compare these two articles on:
• Clarity for a technical audience
• Actionable takeaways
• Evidence quality
• Readability for a 3-minute skim"
Result:
Precise comparison on the dimensions that matter to you. Immediately actionable. Tells you exactly what you need to know.
Same content domain. Totally different cognitive act. Both questions are about comparing two articles. But the second is actually answerable. The first asks AI to guess what matters to you. The second tells AI what matters.
The Specificity Premium
Research confirms this isn't just preference—it's fundamental to how AI works. The quality of AI output is heavily dependent on the structure and specificity of the input prompt.
"Clear structure and context matter more than clever wording—most prompt failures come from ambiguity, not model limitations."— Lakera: The Ultimate Guide to Prompt Engineering in 2025
A Harvard study found that marketing professionals using structured prompts completed 12% more assignments and reduced task time by 25%. Not because the AI got smarter—because the requests got clearer.
— Harvard Business Review: AI Productivity Study 2024
Prompting Is Not Chatting—It's Designing
Most people still talk to AI as if it's a slightly magic search box. Type a few words, get a good answer. This works for Google (which has context from billions of queries). It doesn't work for generation—because AI needs your context.
See prompts differently: as spec writing, not chatting. A prompt is a micro-design document about:
- Structure: How should the output be organised? Table, list, narrative, comparison?
- Audience: Who is this for? Expert or novice? Executive or practitioner?
- Purpose: What is this meant to achieve? Inform, persuade, decide, document?
- Constraints: What are the limits? Word count, tone, depth, formality?
- Success criteria: What makes this "good" vs. "bad" in your context?
Designers don't start with "make it look good." They start with: who's using this, for what purpose, with what constraints. Apply the same thinking to prompts. Design the output before requesting it.
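If a prompt really is a micro-design document, you can treat those five headings literally as a template and fill them in before you write a single sentence of the request. The field names and sample values below are illustrative, not a standard format.

```python
# The five dimensions of the micro-design document, rendered as a reusable template.
SPEC_TEMPLATE = """\
Purpose: {purpose}
Audience: {audience}
Structure: {structure}
Constraints: {constraints}
Success criteria: {success_criteria}

Task: {task}
"""

prompt = SPEC_TEMPLATE.format(
    purpose="Help leadership decide whether to extend the AI pilot",
    audience="Non-technical executives, five-minute read",
    structure="Short intro, three-row comparison table, single recommendation",
    constraints="Under 600 words, plain language, no vendor names",
    success_criteria="A reader can repeat the recommendation and its main reason",
    task="Summarise the pilot results and recommend extend, pause, or stop.",
)

print(prompt)  # paste into any chat model, or send through your client of choice
```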
The Five Dimensions You Probably Forgot
Before asking "which is best?" specify these five dimensions:
1. Best for whom? (the audience)
• Technical expert or general audience?
• Decision-maker or implementer?
• Familiar with the topic or brand new?
Example: "For a CFO with 3 minutes to decide on budget allocation"
2. Best by what measure? (the metric)
• Accuracy? Clarity? Engagement? Completeness?
• Speed of understanding? Memorability?
• Persuasiveness? Actionability?
Example: "Measured by clarity and immediate actionability"
3. Best in what context? (the situation)
• Quick skim or deep read?
• Standalone or part of a series?
• High-stakes decision or casual exploration?
Example: "For a 5-minute pre-meeting briefing"
4. Best for what purpose? (the goal)
• To inform? To persuade? To entertain?
• To enable action? To satisfy curiosity?
• To document decisions? To explore options?
Example: "To persuade senior leadership to approve the project"
5. Best given what constraints? (the limits)
• Word count? Time to consume?
• Formality level? Technical depth?
• Medium (email, presentation, report)?
Example: "Under 500 words, suitable for email, semi-formal tone"
Practical Application: The Comparison Case
Let's walk through transforming a vague comparison request into a specific, answerable one:
Step 1: Define Your Evaluation Criteria
Before asking AI to compare, decide what you're comparing on. What dimensions matter to you? What would make one "better" than another in your context?
Step 2: Specify the Context
Who is this comparison for? What decision will it inform? What constraints apply?
Step 3: Request with Criteria Explicit
Make the comparison dimensions, format, and purpose crystal clear.
Example: "Compare these two drafts on:
• Clarity for a non-technical executive
• Persuasiveness of the main argument
• Likelihood to be read completely
• Actionable next steps for the reader
Format: 2-column table with ratings (Low/Medium/High) and 1-sentence explanation per criterion."
Same question. Same content. Radically different request. The first is unanswerable. The second produces exactly what you need.
When Specificity Feels Excessive
You might be thinking: "This is so much work just to ask a simple question. Shouldn't AI figure this out? I don't have time for all this specification."
Here's the truth: if you don't specify, AI guesses. AI's guess is probably wrong. You'll spend time iterating anyway—"That's not what I meant... try again..." The specification happens either way. The question is when and how.
The Hidden Work: Specification vs. Iteration
❌ The Vague Path
- • 30 seconds to ask vague question
- • 2 minutes to read generic response
- • 1 minute to realise it's not what you wanted
- • 45 seconds to refine the request
- • 2 minutes to read second attempt
- • 1 minute to realise it's still not quite right
- • Repeat 2-3 more times...
Total: 15-20 minutes of frustration
✓ The Specific Path
- • 3 minutes to think through what you actually want
- • 2 minutes to write specific request
- • 2 minutes to read precisely targeted response
- • Done. First time.
Total: 7 minutes, first output usable
Specification upfront < iteration after. Always. Not specifying doesn't save work—it moves it to a more frustrating place.
The "What's Best?" Lesson
This leads to a deeper question: If AI can only respond to what you specify... what is AI actually doing? The answer: it's compiling your intent. Next chapter: AI as Intention Compiler.
AI as Intention Compiler
Most people think of AI as a smart assistant, a magic genie, a search engine on steroids, or a creative partner. But there's a more accurate model—one that changes everything about how you use it:
AI is an intention compiler.
Understanding this single metaphor unlocks AI in a way that no amount of prompt tips ever will.
What Is a Compiler?
If you've never written code, think of a compiler this way:
- You write high-level code in a language humans can read (like Python or JavaScript)
- The compiler translates it into low-level machine instructions the computer can execute
- The key constraint: the compiler can only compile what you actually write
If your code is incomplete, the traditional compiler throws an error: Syntax error on line 47. The feedback is immediate and clear. You can't proceed until you fix it.
If your code is vague? Traditional compilers don't accept vague code. Code must be precise. There's no room for "sort of this" or "you know what I mean." The language enforces clarity.
AI as the Intention Compiler
AI works the same way—but with one critical difference:
- AI takes your high-level intent (expressed in natural language)
- Translates it into outputs (text, code, images, video)
- The key constraint: AI can only compile what you specify
"Generative AI is the most ambitious compiler yet because it translates from the language of thought."— Prompt Engineering: Generative AI - The New Compiler
The Critical Difference
Traditional compiler: Vague input → Error message ("I don't understand, be more specific")
AI compiler: Vague input → Plausible output that might be completely wrong
AI doesn't throw errors; it makes something up. And it sounds confident doing it.
This difference is everything:
- Traditional compilers force clarity (you have no choice)
- AI compilers reward clarity (but don't require it)
- You can be vague with AI—you'll just get generic results
- The discipline must come from you, not the system
Your Specification Is the Source Code
In programming, your code is the source; the compiled output is the result.
With AI, your specification is the source; AI output is the result.
What counts as "specification"?
- The prompt (explicit request)
- The context you provide (background information)
- The constraints you mention (boundaries and limits)
- The format you request (structure of the output)
- The audience you define (who this is for)
- The success criteria you articulate (what "good" means)
What you don't specify, AI invents. Every unspecified dimension gets filled with AI's defaults. Defaults equal generic patterns from training data. Your silence is permission for AI to guess—and AI guesses toward the safe middle.
The Vague Intent Problem
Here's how traditional compilers handle vagueness:
Here's how the AI compiler handles the same vagueness:
Traditional compiler: "I can't do this, be more specific." AI compiler: "I'll make something, hope it's what you wanted." AI never says no—it always produces something. That something may or may not match your actual intent.
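The contrast can be sketched in a few lines. The "strict" path below behaves like a traditional compiler and refuses to proceed with a missing specification; the "lenient" path behaves like the AI compiler and silently fills every gap with a default. The function names and defaults are mine, invented for illustration.

```python
REQUIRED = ["topic", "audience", "format", "constraints"]

def compile_strict(spec: dict) -> str:
    """Traditional-compiler behaviour: refuse to proceed until the input is complete."""
    missing = [k for k in REQUIRED if not spec.get(k)]
    if missing:
        raise ValueError(f"Cannot compile: missing {missing}. Be more specific.")
    return f"Write about {spec['topic']} for {spec['audience']} as {spec['format']} ({spec['constraints']})."

def compile_lenient(spec: dict) -> str:
    """AI-compiler behaviour: never say no -- quietly fill every gap with a generic default."""
    defaults = {"topic": "something general", "audience": "everyone",
                "format": "a few paragraphs", "constraints": "no particular limits"}
    merged = {**defaults, **{k: v for k, v in spec.items() if v}}
    return f"Write about {merged['topic']} for {merged['audience']} as {merged['format']} ({merged['constraints']})."

vague = {"topic": "productivity"}          # everything else unspecified
# compile_strict(vague)                    # raises: the system forces clarity
print(compile_lenient(vague))              # proceeds happily -- generic output guaranteed
```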
Research has identified what's called "the instruction gap"—generative models are highly sensitive to language precision, but human language tolerates variants to communicate similar meaning. This gap is where outputs fail.
— ACM CHI 2025: Brickify - Enabling Expressive Design Intent Specification
Context Engineering: The New Design Work
Here's where this gets interesting: you're not just "using AI." You're now doing something called context engineering—the design work that scaffolds intelligent behavior and ensures AI agents remain aligned with human intent.
"Context engineering is the design work that scaffolds intelligent behavior, ensuring AI agents remain aligned with human intent."— StartupHub.ai: Context Engineering AI - The New Design Frontier
Context engineering includes:
- How the system understands your intent
- How it handles ambiguous instructions
- How it transitions between tasks
- How much it remembers across interactions
- What it trusts, what it verifies, what it rejects
The quality of your context engineering determines the quality of outputs. This is the new design frontier.
Intent to Prompt Mapping
The focus of interaction design has evolved from hierarchical navigation (menus, buttons) to intent-driven interactions (express what you want, get it).
The central challenges now are:
- How clearly can users express their intent?
- How accurately can the system interpret it?
- How effectively can intent be translated into meaningful actions?
You're responsible for expressing intent clearly. AI is responsible for interpreting and executing. The quality of the mapping determines the quality of the result. Clear intent compounds into effective action and good results. Fuzzy intent compounds into scattered guesses and generic outputs.
AI Success Is Not About the Model
Here's the mistake most people make: "I need a better AI model. GPT-4 isn't good enough; I'll try Claude. Maybe Gemini will understand me better."
Model capability matters—but within a range. Most failures aren't model limitations. Most failures are specification limitations. A better model doesn't fix vague intent.
AI success is about the clarity of your thinking—not the sophistication of the model. The same model produces 27% success or 86% success depending on direction quality (Method Actors research from Chapter 3). Your specification is the variable you control.
What You Control (and What You Don't)
You Don't Control
- • Model architecture
- • Training data quality
- • Capability limits
- • Which model is "best"
You Do Control
- • Specification quality
- • Context richness
- • Clarity of intent
- • Success criteria definition
Focus your energy on what you control.
AI as the Uncomfortable Mirror
AI externalizes and reflects the quality of your thinking. It's an uncomfortable mirror that reveals fuzzy thinking you could previously hide.
What you could hide before:
- Vague ideas that "made sense to me"
- Implicit assumptions never articulated
- Half-formed thoughts that felt complete
- Fuzzy logic that worked in your head but not on paper
What AI exposes:
- Every gap in your specification
- Every assumption you didn't state
- Every dimension you didn't define
- The actual precision (or lack thereof) in your thinking
The gift in the discomfort: AI forces you to think more clearly, to articulate what you actually mean, to make implicit assumptions explicit. This is a feature, not a bug.
What AI Can't Compile (Because You Must Provide It)
The intention compiler metaphor reveals an important truth: there are inputs only you can provide.
Intent
What are you actually trying to achieve? AI can generate options but can't decide which option matches your unspoken goals.
Taste
What feels right? What feels wrong? AI can suggest possibilities but can't know which fits your aesthetic or brand voice.
Judgement
When to push forward? When to stop? AI can continue indefinitely; you decide when it's good enough.
Constraints
What must be true? What can't happen? AI explores infinite possibility space; you define the boundaries.
Context
What's the broader situation? AI has training data; you have lived experience and domain knowledge.
These are your inputs to the compiler. They can't come from AI. They must come from you.
The Compiler in Action: Vague vs. Clear
Let's see the intention compiler at work with a common scenario:
Two Approaches, Two Outcomes
❌ Vague Compilation
Input: "Write me a blog post about productivity"
Output: Generic 5-tips listicle with platitudes about time management, could apply to anyone or no one, feels like it was written by committee.
âś“ Clear Compilation
Input: "Write a 1200-word blog post for mid-level managers in tech companies (Series B-C startups) explaining why traditional productivity advice fails in high-interrupt environments. Format as: opening anecdote, 3 research-backed insights, practical framework they can implement this week. Tone: evidence-based but conversational, skeptical of hype. Cite at least 2 studies."
Output: Specific, targeted content addressing a defined audience with clear structure, evidence requirements, and tone guidance. AI can execute with precision because intent is clear.
Same capability. Same AI model. The variable is specification quality—the "source code" you fed to the intention compiler.
The Empowering Reframe
Here's why the intention compiler metaphor is so powerful: it shifts responsibility—and control—to where it belongs.
The old framing (disempowering):
- "AI isn't smart enough to understand me"
- "AI keeps getting it wrong"
- "I need a better AI tool"
- → You're stuck waiting for AI to improve
The intention compiler framing (empowering):
- "I haven't specified my intent clearly enough"
- "My source code (specification) has gaps"
- "I need to improve my context engineering"
- → You can fix this immediately
This is good news. The constraint isn't the AI's capability—it's the clarity of your specification. And that's something you control completely.
Key Takeaways: The Intention Compiler Model
- • AI translates intention into outputs, just like a compiler translates code into instructions. It can only compile what you specify.
- • Traditional compilers force precision through error messages. AI compilers allow vagueness but punish it with mediocrity.
- • Your specification is your source code. Gaps in specification become defaults in output. Defaults are generic.
- • Context engineering is now your job: defining how the system understands intent, handles ambiguity, and aligns with your goals.
- • AI success isn't about model choice—it's about specification quality. The same model produces 27% or 86% depending on direction.
- • AI externalizes the quality of your thinking. Fuzzy thinking produces fuzzy output. This is a feature: it forces clarity.
Next: What Happens When Specification Is Skipped Entirely?
The intention compiler model helps us understand why AI outputs fail. But what happens when developers skip specification altogether—when they "vibe code" their way to production?
There's a perfect case study demonstrating the catastrophic consequences of treating AI like a magic compiler that doesn't need source code. It's called "vibe coding"—and it's failing spectacularly in real-world production environments.
In the next chapter: The Vibe Coding Catastrophe.
The Vibe Coding Catastrophe
What happens when the specification problem is taken to its extreme—complete avoidance of human thinking
In February 2025, AI researcher Andrej Karpathy popularised a term that would become infamous in engineering circles: "vibe coding." It describes a particular approach to AI-assisted software development that sounds appealing in theory—and catastrophic in practice.
The Vibe Coding Process
Here's how vibe coding typically works:
- Developer describes a project or task to an LLM—usually in vague terms
- LLM generates code based on the prompt
- Developer does NOT review or edit the code itself
- Developer only evaluates using tools and execution results
- Developer asks the LLM for improvements based purely on outcomes
Unlike traditional AI-assisted development or pair programming, the human developer deliberately avoids examining the code. Accept what AI produces. Evaluate outputs. Skip understanding. Iterate based on vibes, not specifications.
It sounds efficient. It sounds modern. It sounds like the future.
Instead, it became a case study in what happens when the specification principle is abandoned entirely.
The Security Catastrophe
A cybersecurity firm analysed Fortune 50 companies using AI-assisted development. The findings were stark:
10x More Security Issues
AI-assisted developers produced 3–4 times more code but generated 10 times more security issues than traditional development.
— Wikipedia: Vibe coding
More code, faster. But riddled with vulnerabilities that traditional development would have caught in code review.
The Lovable Vulnerability
In May 2025, Lovable—a Swedish vibe coding application—made headlines for the wrong reasons. Security researchers discovered systematic vulnerabilities in the code it generated:
- 170 out of 1,645 Lovable-created web applications had exploitable issues
- These vulnerabilities would allow personal information exposure
- 10% of apps created with the tool had serious security flaws
The security issues weren't random flukes. They were systematic—patterns emerging from vague inputs producing code that "worked" in testing but failed under real-world security scrutiny.
Production Disasters
In August 2025, Final Round AI surveyed 18 CTOs about their experiences with vibe coding.
16 out of 18 reported production disasters directly caused by AI-generated code.
Nearly 90% of engineering leaders surveyed had experienced significant failures from accepting AI outputs without proper specification or review.
Specific Incidents
Google's AI Assistant
Erased user files while attempting simple folder reorganisation. A basic task turned destructive.
Replit's AI
Deleted code despite explicit instructions not to modify it. The AI "hallucinated" successful operations, then built subsequent actions on those false premises—creating what researchers call a "confabulation cascade."
The "Works Until It Doesn't" Problem
"Vibe coding's most dangerous characteristic is code that appears to work perfectly until it catastrophically fails."— Quartz: AI Vibe Coding Has Gone Wrong
This is actually worse than obviously broken code. Obviously broken code gets caught in testing and never reaches production. "Works in testing" code gets deployed—and failures only appear at scale, under edge cases, in production environments.
By then, the damage is done.
The Spaghetti Code Effect
Rapid AI-generated code frequently becomes what developers call "spaghetti code":
- Tangled, inconsistent source code lacking clear structure
- AI solving similar problems differently each time
- Patchwork of heterogeneous styles with minimal documentation
- Impossible to debug, nearly impossible to maintain
One CTO observed that each vibe-coded component worked in isolation, but the system as a whole was architecturally incoherent. No human had ever understood the codebase as a unified system—because no human had designed it as a unified system.
Enterprise Reality Check
Across the software industry, seasoned engineering leaders issued a clear consensus:
"The 'vibe coding' approach completely falls apart when dealing with the sheer, terrifying scale and complexity of real-world, mission-critical enterprise systems."
AI-assisted vibe coding may rapidly create more problems than it solves in production codebases.
— DEV Community: The Vibe Check Failed
"No, you won't be vibe coding your way to production—not if you prioritise quality, safety, security and long-term maintainability at scale."— Brendan Humphreys, CTO, Canva
Why Vibe Coding Fails: The Specification Gap
Vibe coding fails for a simple reason: it skips everything that makes software development rigorous.
What Gets Skipped in Vibe Coding
Planning & Design
- • Technical design documents
- • Architecture decisions
- • Edge case handling
- • Error recovery strategies
Quality & Security
- • Security considerations
- • Performance requirements
- • Maintainability planning
- • Integration requirements
AI cannot compensate for missing specifications because:
- AI doesn't know your domain constraints
- AI doesn't know your integration requirements
- AI doesn't know your security posture
- AI doesn't know what "working" means in your context
All of this must come from human specification. When developers skip specification and "let AI figure it out," they're asking AI to guess at critical system design decisions. And AI guesses based on patterns from training data—not your actual requirements.
Vibe coding is the specification problem at its extreme: maximum avoidance of human thinking, maximum trust in AI guessing, maximum consequence when guessing fails.
Vibe Coding ≠ AI-Assisted Engineering
It's critical to understand: vibe coding is not the same as AI-assisted engineering. Conflating the two devalues rigorous engineering practices and gives newcomers a dangerously incomplete picture of what it takes to build production-ready software.
Real AI-Assisted Engineering Includes
- Technical design documents created before AI writes code
- Stringent code reviews where humans examine AI output line by line
- Test-driven development with explicit verification of correctness
- Clear specifications defining requirements, constraints, and success criteria
- Human understanding sufficient to explain every decision to colleagues
When AI is a tool in the hands of an engineer who maintains rigorous practices, it's powerful. When AI replaces rigorous thinking entirely, it's dangerous.
The Spec-Driven Alternative
The alternative to vibe coding isn't abandoning AI—it's using AI within a specification-driven framework.
"Enter spec coding, where you, the human conductor, lay out a clear score (specifications) to guide the AI ensemble toward harmonious, reliable results."— Red Hat Developer: How Spec-Driven Development Improves AI Coding Quality
1. Define Specifications First
Document requirements, constraints, and success criteria before AI generates any code.
2. AI Generates Based on Clear Spec
Provide specifications as context; let AI produce implementation aligned with explicit requirements.
3. Human Reviews Against Specification
Verify that generated code actually meets documented requirements, handles edge cases, and follows architecture.
4. Iterate on Specification, Not Just Output
When outputs don't match intent, refine the specification rather than just regenerating code.
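Here's a minimal sketch of that loop. The Spec fields, generate_code(), and review_against_spec() are hypothetical placeholders standing in for "AI generates" and "human reviews", not a real framework:

```python
from dataclasses import dataclass

@dataclass
class Spec:
    requirements: list[str]
    constraints: list[str]        # e.g. "use the existing billing API"
    success_criteria: list[str]   # e.g. "handles zero-invoice months"

def generate_code(spec: Spec) -> str:
    # Placeholder for the AI call that receives the full spec as context.
    return "..."

def review_against_spec(spec: Spec, code: str) -> list[str]:
    # Placeholder for human review: return the spec items the code fails to meet.
    return []

spec = Spec(
    requirements=["export monthly invoices as CSV"],
    constraints=["use the existing billing API", "no new dependencies"],
    success_criteria=["handles zero-invoice months", "tests cover date boundaries"],
)

code = generate_code(spec)
gaps = review_against_spec(spec, code)
while gaps:
    spec.requirements.extend(gaps)   # step 4: iterate on the specification itself
    code = generate_code(spec)
    gaps = review_against_spec(spec, code)
```

The point is the last three lines: when output misses, the fix goes into the specification, so the next generation starts from a clearer source.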
The Results
Research shows spec-driven development produces dramatically better outcomes:
- 90% of code can be AI-generated successfully with proper specifications
- Specification overhead: 20–40% extra time upfront per feature
- ROI: Upfront specification hours save days or weeks of debugging and rework
Signs You're Vibe Coding (Without Realising It)
You don't need to be a developer to fall into vibe-coding patterns. The same dynamics apply to any AI use. Check yourself:
The Vibe Coding Checklist
- You accept outputs you don't fully understand
- You judge results only by whether they "seem to work", not against a written specification
- You iterate by regenerating rather than by refining your request
- You couldn't explain the result, point by point, to a colleague
If you checked multiple boxes, you're in vibe territory—and the risks apply to you.
The Deeper Lesson
Vibe coding isn't just about software development. It's the purest example of what happens when you outsource thinking to AI.
The pattern appears everywhere:
- Vague emails that sound fine until recipients ask clarifying questions you can't answer
- Vague reports that pass initial review but fail under executive scrutiny
- Vague analyses that look plausible but contain fundamental errors you didn't catch
It "works" in the moment—sounds fluent, looks polished—but fails when tested against reality.
The specification principle is universal: vibes in, vibes out. Specification in, quality out.
Whether you're writing code, writing emails, or writing strategy documents, the same truth applies: AI amplifies your clarity. If you don't provide clarity, AI amplifies your vagueness.
And vagueness at scale—whether it's 10x security vulnerabilities or a presentation that falls apart under questions—is worse than doing nothing at all.
The Choice
You can vibe—rely on AI to guess at your intent, accept outputs you don't fully understand, hope it works out. The data shows where that leads: 10x security issues, catastrophic failures, spaghetti systems.
Or you can specify—invest 20–40% more time upfront defining what you actually want, review outputs against those specifications, maintain understanding throughout. The data shows where that leads: 90% AI-generated success, 50% faster delivery, systems you can maintain and trust.
In the next chapter, we'll explore what this all leads to: The Bargain. The real contract between you and AI—what you must provide, what AI will deliver in return, and why this is actually good news.
The Bargain
Using AI well requires a bargain. Both parties have to keep up their end. When both do, you get genuine leverage. When one doesn't, you get frustration, waste, and failure.
Throughout this book, we've explored why AI disappoints so many people. We've seen how effort doesn't disappear—it redistributes from execution to thinking. We've examined the director/crew mental model, the laziness mirroring effect, the multidimensional ambiguity of questions like "what's best?", the intention compiler framework, and the catastrophic consequences of vibe coding.
All of this points to a simple conclusion: AI is not a shortcut to effort. It's a bargain. A two-sided contract. And both parties need to keep their end.
The Two Sides of the Bargain
Your End
What you owe AI:
- 1. Clarity – Know what you're asking for and articulate it explicitly
- 2. Rigour – Think through the dimensions, don't leave gaps
- 3. Taste – Know what good looks like for your purpose
- 4. Constraints – Define what can't happen, what must be true
- 5. Iteration – Be willing to refine specification, not just accept output
Skip these, and AI reflects your vagueness back.
AI's End
What AI owes you:
- 1. Speed – Generate in seconds what would take hours
- 2. Breadth – Explore many options simultaneously
- 3. Pattern matching – Recognise structures across domains
- 4. Generation – Produce raw material at scale
- 5. Scale – Apply consistent quality across volume
AI delivers—but only when you keep your end.
Why Your End Is Non-Negotiable
AI can't guess your intent accurately. It can't apply your judgement. It can't know your context unless you provide it. Your clarity is the input to the system. Garbage clarity produces garbage output—we've seen this pattern throughout every chapter.
The effort required isn't necessarily more than doing it yourself. But it's different effort. Thinking effort, not doing effort. Upfront effort, not execution effort. This is the shift from 80% execution / 20% thinking to 10% execution / 90% thinking we explored in Chapter 2.
When the Bargain Works
When both sides keep their end of the bargain, AI becomes a genuine force multiplier. You think clearly once; AI scales the output. Iteration converges quickly to good results. The 90% of work that can be AI-generated actually gets AI-generated correctly.
The difference: proper specification. Not vibes. Not hoping AI would "figure it out." Clear, upfront specification of what success looks like.
When the Bargain Fails
When you don't keep your end, AI reflects your vagueness back. You blame the tool. You iterate endlessly without convergence. You conclude "AI doesn't work for me."
This creates a self-fulfilling prophecy:
- 1. Vague input → Disappointing output
- 2. Disappointing output → Reduced investment in AI
- 3. Reduced investment → Never learn to use it well
- 4. Never learn → Permanent disappointment
Breaking the cycle requires recognising the pattern (this book helps), taking responsibility for input quality, investing in specification upfront, and evaluating outputs against your specification—not against magic.
The Pattern Holds Across All Domains
Whether you're working with code, video, prose, or any other AI application, the pattern is universal:
In Code
The real work is designing interfaces, invariants, data flow, and constraints. AI can generate syntax, but it can't choose the right abstraction for your organisation or domain unless you encode that into the prompt and process.
In Video
The real work is blocking, framing, pacing, emotional beats, and camera language. AI can generate footage, but if you don't specify, it invents generic versions that feel "wrong" because they're not your scene.
In Prose
The real work is deciding audience, purpose, stakes, and angle. AI can generate text, but without that, you get content instead of communication—words that technically say something but don't achieve anything.
"Whatever the domain, the pattern is the same. Human specification + AI execution = leverage. Human vagueness + AI execution = mediocrity."
The Pre-Prompt Checklist
Before every significant AI interaction, answer these six questions:
1. What am I trying to achieve?
The goal, not just the task. Why does this matter?
2. Who is this for?
The audience. What do they know? What do they need?
3. What format should it take?
Structure, length, style. How will it be consumed?
4. What constraints apply?
What can't happen? What must be true? What rules govern this?
5. What makes it good vs. bad?
Success criteria. How will you judge the output?
6. What context does AI need?
Background information, previous decisions, domain-specific knowledge.
If you can't answer these questions, spend time on them before prompting. The specification is the work.
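One way to make the checklist stick is to treat it the way a compiler treats missing arguments: refuse to proceed. A minimal sketch, with question keys and example answers of my own invention:

```python
CHECKLIST = ["goal", "audience", "format", "constraints", "success_criteria", "context"]

def build_prompt(task: str, **answers: str) -> str:
    missing = [q for q in CHECKLIST if not answers.get(q)]
    if missing:
        # Borrow the traditional compiler's discipline: reject vague input.
        raise ValueError(f"Specify before prompting: {', '.join(missing)}")
    details = "\n".join(f"{q}: {answers[q]}" for q in CHECKLIST)
    return f"{task}\n\n{details}"

prompt = build_prompt(
    "Draft the quarterly update email",
    goal="reassure the board after the delayed launch",
    audience="non-technical board members",
    format="under 300 words, three short sections",
    constraints="no revenue figures until the audit closes",
    success_criteria="a board member could repeat the key message accurately",
    context="launch slipped six weeks because of a vendor dependency",
)
print(prompt)
```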
The 20–40% Upfront Investment
Research shows that specification adds 20–40% more time upfront per feature. That sounds like a tax. But it saves days or weeks of iteration and rework. The ROI case is compelling: upfront hours versus downstream days.
Upfront specification effort takes hours. Manual implementation takes days or weeks. Specification reuse for similar features cuts future effort. You spend less time debugging because requirements are clear. Fewer production incidents happen because validation criteria are explicit.
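For a rough sense of the arithmetic (the hours below are assumptions for illustration, not figures from the research):

```python
build_hours = 10                      # assumed baseline effort for one feature
spec_hours = 0.3 * build_hours        # 30% upfront overhead, within the 20-40% range
rework_avoided = 16                   # assumed: two days of debugging and rework avoided
print(rework_avoided - spec_hours)    # 13.0 hours net saving on a single feature
```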
What Changes When You Accept This
For Your Expectations
- • Stop expecting AI to reduce total work
- • Start expecting AI to change the type of work
- • Embrace thinking work as the new core skill
For Your Process
- • Invest time before prompting, not just after
- • Evaluate specification quality, not just output
- • Iterate on inputs when outputs miss
For Your Relationship with AI
- • Stop being a consumer hoping for magic
- • Start being a director providing vision
- • Own the quality of your direction
For Your Results
- • Specification in → quality out
- • The leverage becomes real
- • AI delivers on the promise—because you deliver on yours
The Uncomfortable Truth, Restated
Throughout this book, we've built a comprehensive understanding:
- Chapter 1: AI disappoints because expectations were wrong, not because AI is broken
- Chapter 2: AI doesn't remove work—it moves work from execution to thinking
- Chapter 3: You're the director, AI is the crew; vague direction produces stock footage
- Chapter 4: Vague inputs produce vague outputs (the laziness mirroring effect)
- Chapter 5: "Best" has no meaning without specification (multidimensional ambiguity)
- Chapter 6: AI is an intention compiler that can only compile what you specify
- Chapter 7: Skip specification entirely and you get vibe coding disasters
- Chapter 8: Keep up your end of the bargain and AI keeps its end
"AI is a force multiplier on structured thought, and a brutally honest mirror for vague thought."
The Empowering Reframe
This shift from helplessness to agency is what makes this uncomfortable truth empowering:
Before (Helplessness)
- • "AI is overhyped"
- • "AI doesn't understand me"
- • "AI gives generic outputs"
- • "AI doesn't work for me"
- • "Nothing I can do about it"
After (Agency)
- • "My specification wasn't clear enough"
- • "I can fix my input and get better output"
- • "AI reflects my clarity—I control that"
- • "The leverage is real when I do my part"
- • "I can change this, starting now"
The difference is actionable. The improvement starts immediately. You're not waiting for better AI models. You're not hoping someone builds the "right" tool. You're taking control of the variable you actually control: the quality of your specification.
Your Next Prompt
The Test
Take your next significant AI request. Before you send it, apply the Pre-Prompt Checklist. Answer all six questions. Notice how much clearer the request becomes. Notice how much better the output is.
The Habit
Do this for one week. Every significant prompt: pause and specify. Track the difference in output quality. The evidence will convince you.
The Transformation
This becomes automatic. Specification becomes natural. AI becomes genuinely useful. The bargain is kept. The leverage is real.
The Bargain, Summarised
Your End
Clarity • Rigour • Taste • Constraints • Iteration
AI's End
Speed • Breadth • Pattern Matching • Generation • Scale
Both must be kept for leverage to occur.
The uncomfortable truth: You still have to do work when using AI. It's thinking work now. And AI will amplify every bit of clarity you bring.
The empowering truth: You control your input. Better input produces better output. The leverage is real. Starting now.
"AI will happily generate oceans of possibility. You pay your side in clarity, rigour, taste, constraints, and iteration."
"Used that way, it's not a shortcut to effort—it's a shortcut to leverage."
Keep up your end of the bargain.
AI will keep its end.
References & Sources
This ebook draws on peer-reviewed research, industry analysis, and practitioner frameworks from 2024-2025. All external sources are cited inline throughout the text. The frameworks and interpretive analysis represent the author's synthesis developed through enterprise AI transformation consulting.
Primary Research & Studies
METR: Measuring the Impact of Early-2025 AI on Experienced Developer Productivity
Key finding: Experienced developers take 19% longer with AI tools, contrary to expectations of 24% speedup.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
Harvard Study: Structured Prompts and Productivity
Marketing professionals using structured prompts completed 12% more assignments and reduced task time by 25%.
https://www.godofprompt.ai/blog/prompt-structures-for-chatgpt-basics
ArXiv: LLMs as Method Actors
Research demonstrating 27% → 86% performance improvement when treating LLMs as actors with prompts as scripts.
https://arxiv.org/html/2411.05778v2
Microsoft/Carnegie Mellon: AI Tools and Critical Thinking
Study of 319 professionals documenting cognitive shifts when using AI tools.
https://ppc.land/study-reveals-complex-shift-in-workplace-critical-thinking-due-to-ai-tools/
Stack Overflow Developer Survey 2024-2025
Developer trust in AI tool accuracy dropped from 43% (2024) to 33% (2025).
https://leaddev.com/technical-direction/trust-in-ai-coding-tools-is-plummeting
Consulting Firm Research
KPMG: Rethinking Strategic Workforce Planning with AI Agents
Work shifting from execution to orchestration; 52% of leaders rank job redesign as top priority.
https://kpmg.com/us/en/articles/2025/strategic-workforce-planning-with-ai-agents.html
BCG: AI Is Moving Faster Than Your Workforce Strategy
Work is being redistributed, not eliminated, as teams integrate AI into execution.
https://www.bcg.com/publications/2025/ai-is-outpacing-your-workforce-strategy-are-you-ready
McKinsey: Skill Shift - Automation and the Future of the Workforce
Basic cognitive skills declining by 15%; social, emotional, and advanced cognitive skills growing.
https://www.mckinsey.com/industries/public-and-social-sector/our-insights/skill-shift-automation-and-the-future-of-the-workforce
Industry Analysis & Commentary
Upwork: AI and Worker Productivity Study
77% of workers report AI has either increased workload or decreased productivity.
https://www.shepbryan.com/blog/cognitive-load-ai
Fast Company: AI's Hidden Human Work Layers
Every AI-assisted workflow hides verification, correction, and interpretive work.
https://www.fastcompany.com/91445355/ai-jobs-new-layers-human-work
Wharton: The AI Efficiency Trap
Emergence of "cognitive supervision" as a new human skill for guiding AI.
https://knowledge.wharton.upenn.edu/article/the-ai-efficiency-trap-when-productivity-tools-create-perpetual-pressure/
InformationWeek: How to Cope with AI Disappointment
Market correction from "wow" to "how" as results fail to match expectations.
https://www.informationweek.com/machine-learning-ai/how-to-cope-with-ai-disappointment
Vibe Coding & Specification Failures
Wikipedia: Vibe Coding
Definition and security analysis: 3-4x more code but 10x more security issues.
https://en.wikipedia.org/wiki/Vibe_coding
DEV Community: The Vibe Check Failed
16 out of 18 CTOs reported production disasters from AI-generated code.
https://dev.to/naveens16/the-vibe-check-failed-why-ai-assisted-vibe-coding-crashes-against-enterprise-reality-2014
Quartz: AI Vibe Coding Has Gone Wrong
Code that "appears to work perfectly until it catastrophically fails."
https://qz.com/ai-vibe-coding-software-development
Simon Willison: Not All AI-Assisted Programming is Vibe Coding
The golden rule: "I won't commit code I couldn't explain to someone else."
https://simonwillison.net/2025/Mar/19/vibe-coding/
Red Hat Developer: Spec-Driven Development
Spec coding as the alternative to vibe coding for reliable AI-assisted development.
https://developers.redhat.com/articles/2025/10/22/how-spec-driven-development-improves-ai-coding-quality
SoftwareSeni: Spec-Driven Development in 2025
90% of code can be AI-generated with proper specifications; Google and Airbnb success stories.
https://www.softwareseni.com/spec-driven-development-in-2025-the-complete-guide-to-using-ai-to-write-production-code/
Prompt Engineering & AI Theory
Prompt Engineering: Generative AI - The New Compiler
"Generative AI is the most ambitious compiler yet because it translates from the language of thought."
https://promptengineering.org/generative-ai-the-new-compiler/
Lakera: The Ultimate Guide to Prompt Engineering in 2025
Clear structure and context matter more than clever wording.
https://www.lakera.ai/blog/prompt-engineering-guide
Nielsen Norman Group: Why Vague Prompts Fail
AI struggles with ambiguity and cannot deliver thoughtful results within broad context.
https://www.nngroup.com/articles/vague-prototyping/
Anthropic: Multi-Agent Research System
Orchestrator-worker pattern; prompts as "frameworks for collaboration."
https://www.anthropic.com/engineering/multi-agent-research-system
StartupHub.ai: Context Engineering AI
Context engineering as design work that scaffolds intelligent behaviour.
https://www.startuphub.ai/ai-news/ai-research/2025/context-engineering-ai-the-new-design-frontier/
Foundational Concepts
Wikipedia: Garbage In, Garbage Out
The classic GIGO principle, originating from 1957.
https://en.wikipedia.org/wiki/Garbage_in,_garbage_out
MDPI: Promises and Pitfalls of Large Language Models
Only highest-quality prompts induce consistent high-quality feedback.
https://www.mdpi.com/2673-2688/6/2/35
Wikipedia: Intelligence Amplification
The concept of using technology to augment human intelligence, proposed in the 1950s-60s.
https://en.wikipedia.org/wiki/Intelligence_amplification
Author Frameworks & Analysis
The following frameworks and interpretive analysis were developed by the author through enterprise AI transformation consulting. They are presented as the author's voice throughout the ebook (not cited inline) but listed here for transparency.
The Effort Redistribution Model
Framework: 80/20 (execution/thinking) shifts to 10/90 with AI adoption. Total effort may be similar or higher, but the distribution changes dramatically.
Director/Crew Mental Model
You're the director providing vision; AI is the crew executing. Vague direction produces stock footage.
The Laziness Mirroring Effect
"If I'm lazy, it'll be lazy." Vague input → huge output space → AI picks generic and safe.
AI as Intention Compiler
AI translates intention into outputs like a compiler translates code—it can only compile what you specify.
The Bargain
You provide clarity, rigour, taste, constraints, iteration. AI provides speed, breadth, pattern matching, generation, scale.
The Pre-Prompt Checklist
Six questions to answer before prompting: achievement, audience, format, constraints, success criteria, context.
Note on Research Methodology
This ebook synthesises research from 2024-2025 on AI productivity, human-AI collaboration, and the emerging discipline of prompt engineering. Sources include peer-reviewed academic papers, industry research from major consulting firms, and practitioner analysis from technology publications.
External research was selected for its direct relevance to the core thesis: that AI shifts work from execution to specification rather than eliminating it. The author's frameworks represent synthesis and interpretation of these findings through the lens of enterprise AI transformation consulting.
Research compilation date: December 2025
Note: Some links may require subscription access. URLs were verified at time of publication.