The Simplicity Inversion
Why Your “Easy” AI Project Is Actually the Hardest, and Where Regulated Organizations Should Really Start
TL;DR
- “Starting simple” with AI (automating a process, adding a chatbot) is actually the guaranteed path to failure in regulated industries
- Customer-facing + regulated + real-time is the boss fight, not the tutorial level
- The actual easy win: IT-focused, batch-oriented work where AI produces testable artifacts through governance you already have
Every regulated company I talk to has the same story. They picked a “simple” AI project, usually a customer service chatbot or workflow automation, and watched it die in compliance review. Or worse, deployed it and watched customers revolt.
Meanwhile, their IT teams are quietly using AI to write code, generate tests, and build internal tools. Nobody talks about these projects because they seem too “technical” to be strategic. But here’s the uncomfortable truth: the “complex” IT projects are succeeding at rates the “simple” customer projects can only dream of.
This is the Simplicity Inversion: what looks simple to executives is actually the hardest thing to deploy, while what looks complex is often the easiest path to real AI value.
Why “Start Simple” Is Backwards
The conventional wisdom says: pick something simple, prove value, then scale. This works for most technology adoptions. It fails catastrophically for AI.
Here’s why. When executives say “start simple,” they usually mean:
- Something visible (so the board can see progress)
- Something customer-facing (so it has obvious impact)
- Something that automates an existing process (so the use case is clear)
This logic leads directly to customer chatbots, intake form automation, and service desk AI, all of which have catastrophic failure rates in regulated environments.
The problem isn’t that these projects are technically impossible. The problem is that they combine the three hardest factors for AI deployment into a single project:
The Three-Axis Complexity Map
| Axis | “Simple” Customer Project | “Complex” IT Project |
|---|---|---|
| Blast Radius | High: customers affected | Low: internal team only |
| Regulatory Load | High: explainability required | Low: code review suffices |
| Time Pressure | Real-time: seconds to respond | Batch: minutes to hours OK |
Customer-facing, regulated, real-time. That’s not the tutorial level. That’s the boss fight.
The Customer Chatbot Catastrophe
Let’s talk about the poster child for “simple” AI: the customer service chatbot. It seems perfect: everyone uses chat, the technology exists, competitors are doing it.
Except the research is brutal. 78% of customers escalate to a human anyway. 63% get no resolution. And here’s the killer: 80% say chatbots increase their frustration.[2]
But it gets worse for regulated industries. When a chatbot fails, customers don’t blame “this specific chatbot having a bad day.” They blame AI as a category. Research published in a Nature Portfolio journal found that AI service failures create a trust death spiral: one bad experience poisons future interactions because customers assume the problem is systemic and unfixable.[3]
Unlike a human agent’s failure, which gets written off as “that person was having a rough day,” an AI failure feels permanent. And in regulated industries where trust is your primary asset, that’s catastrophic.
Meanwhile, in IT…
Here’s what nobody puts in the board deck: AI in software development is working spectacularly well.
For contrast, recall the customer-side scorecard: 72% of consumers call chatbots a “waste of time,” 80% say they increase frustration, and failures feed the trust death spiral.[2] Now look at the developer side.
GitHub Copilot now writes 46% of all code for its users. OpenAI’s engineers are completing 70% more pull requests per week using their own tools.[7] A regional bank studied by McKinsey saw 40% productivity gains, with over 80% of developers reporting improved coding experience.[6]
These aren’t cherry-picked successes. They’re consistent patterns across industries, including highly regulated ones.
Why Developer Tools Win
The difference isn’t the technology. It’s the deployment context.
Developer AI tools succeed because:
- Governance already exists. Code review, testing, version control: these are mature practices that handle AI-generated code the same way they handle human-generated code. No new compliance framework needed.
- Batch context with latency tolerance. Nobody expects code to be written in 2 seconds. AI can take 30 seconds to think, run verification loops, and produce quality output.
- Human verification is built in. Every line of AI-generated code goes through review. The developer checks it. Tests validate it. CI/CD gates it. Multiple layers of verification.
- Low blast radius. If AI-generated code has a bug, it gets caught before reaching customers. The failure is internal, fixable, and doesn’t damage trust.
- Measurable outcomes. Cycle time. Deployment frequency. Bug rates. The metrics exist and everyone agrees on them.
This is what I call governance arbitrage: routing AI value through governance mechanisms that already exist, rather than inventing new ones.
“If it can’t be versioned, tested, and rolled back, it’s not an AI use-case; it’s a live experiment.”
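A minimal sketch of what that routing can look like, assuming a repository that already runs a pytest suite and a human review step; the function and its arguments are illustrative, not taken from any particular tool:

```python
import subprocess

def gate_ai_patch(branch: str, reviewer_approved: bool) -> bool:
    """Run AI-generated code through the gates human code already passes."""
    # Gate 1: the existing test suite, invoked exactly as CI would invoke it.
    tests = subprocess.run(["pytest", "--quiet"], capture_output=True, text=True)
    if tests.returncode != 0:
        print(f"{branch}: tests failed, patch rejected.\n{tests.stdout}")
        return False
    # Gate 2: the existing review step. AI output never approves itself.
    if not reviewer_approved:
        print(f"{branch}: tests passed, still awaiting human review.")
        return False
    return True  # Versioned, tested, reviewed: safe to merge and to roll back.
```

Nothing in that sketch is new compliance machinery; both gates already exist in any mature engineering organization, which is the entire point of the arbitrage.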
The Real Simplicity Inversion
Here’s the pattern that keeps repeating:
| What Executives See | What’s Actually Happening |
|---|---|
| “Simple” customer chatbot | Real-time + regulated + customer-facing + novel governance required = Maximum difficulty |
| “Complex” developer tools | Batch + internal + testable artifacts + existing governance = Minimum difficulty |
The inversion happens because executives evaluate projects based on conceptual simplicity (“everyone understands chat”) rather than deployment complexity (the combination of blast radius, regulatory load, and time pressure).
A customer chatbot looks like a Level 2 project (simple Q&A). But it requires Level 5-6 governance maturity (full telemetry, error budgets, incident playbooks) because it’s making autonomous decisions with customers in real time.[8]
Developer tools look like Level 5 projects (autonomous code generation). But they require only Level 2-3 governance (human review, standard testing) because the artifacts are testable and the feedback loop is tight.
The Perimeter Strategy
Think of your organization like a castle. The core is where customers interact with regulated processes. The perimeter is where your internal teams do their work.
The Perimeter Strategy says: don’t storm the castle. Secure the perimeter first.
In practice:
- Start internal. IT, operations, support infrastructure. Places where blast radius is contained.
- Stay batch. Work that can tolerate minutes or hours, not seconds. Let AI think.
- Produce artifacts. Code, reports, analyses: things that can be reviewed before they reach anyone. Not live decisions.
- Use existing governance. Route through code review, testing, approval workflows. Don’t invent new compliance mechanisms.
This isn’t about avoiding customer value. It’s about building the factory before you ship the products.
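As one sketch of the “produce artifacts” rule: instead of answering anyone live, the AI writes its output to a review queue and a human promotes it through the approval workflow you already have. The paths and field names here are illustrative:

```python
import json
import time
from pathlib import Path

REVIEW_QUEUE = Path("review_queue")  # illustrative: any shared, versioned location

def queue_for_review(content: str, produced_by: str, intended_use: str) -> Path:
    """Park an AI-produced artifact where a human can inspect it first."""
    REVIEW_QUEUE.mkdir(exist_ok=True)
    record = {
        "produced_by": produced_by,    # which model or tool generated this
        "intended_use": intended_use,  # where it goes if a reviewer approves
        "created_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "content": content,
    }
    path = REVIEW_QUEUE / f"draft-{int(time.time())}.json"
    path.write_text(json.dumps(record, indent=2))
    return path  # nothing ships until a reviewer promotes this file
```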
Where to Start (The Tutorial Level)
- Developer productivity tools (code generation, testing, documentation)
- Internal support ticket triage and routing
- Log analysis and incident summarization (see the sketch after this list)
- Data quality rules and schema validation
- Runbook generation and maintenance
- Internal documentation and knowledge base
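To make the log-analysis item concrete, here is a minimal batch shape for incident summarization. `call_llm` is a stub standing in for whatever model client you already use; the prompt and paths are illustrative:

```python
from pathlib import Path

def call_llm(prompt: str) -> str:
    """Stub: swap in your provider's client here."""
    return "(model summary would appear here)"

def summarize_incident(log_file: str, out_file: str, max_lines: int = 500) -> None:
    """Batch summarization: minutes of latency are fine, so the model can think."""
    lines = Path(log_file).read_text().splitlines()[-max_lines:]
    prompt = ("Summarize this incident log: timeline, probable cause, "
              "suggested follow-ups.\n\n" + "\n".join(lines))
    # The output is an artifact, not an action: a human reads it
    # before it informs any decision.
    Path(out_file).write_text(call_llm(prompt))
```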
Where NOT to Start (The Boss Fight)
- Customer-facing chatbots
- Automated customer communications
- Real-time regulatory decisions
- Anything touching customer data in production
- Any use case requiring novel compliance approval
The Compound Effect
Here’s what organizations miss: starting at the perimeter isn’t a detour. It’s the fastest path to AI capability.
Each successful internal project teaches you:
- How to evaluate AI outputs
- How to set appropriate error budgets (sketched after this list)
- How to build verification loops
- How to handle failures gracefully
- How to measure real ROI
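One of those lessons, error budgets, can be made mechanical early. A rough sketch, assuming you log a human reviewer’s accept/reject verdict for every AI output:

```python
def within_error_budget(verdicts: list[bool], budget: float = 0.05) -> bool:
    """True if the rejection rate stays inside the agreed error budget.

    verdicts: True where a human reviewer accepted the AI output.
    budget: maximum tolerable rejection rate (5% here, purely illustrative).
    """
    if not verdicts:
        return False  # no evidence yet: don't expand scope on zero data
    rejection_rate = verdicts.count(False) / len(verdicts)
    return rejection_rate <= budget

# 2 rejections across 100 reviews sits inside a 5% budget.
assert within_error_budget([True] * 98 + [False] * 2)
```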
After 5-10 internal wins, you’ve built a factory for safe automation. You have patterns, test suites, governance templates, and trained people. Each new project is cheaper and faster.
Then, and only then, you’ve earned the right to attempt customer-facing AI. Not because the technology is ready (it always was), but because your organization is ready.
What This Looks Like in Financial Services
The pattern is already playing out. JPMorgan Chase built proprietary AI for internal contract intelligence (the COIN platform) while using vendor solutions for customer-facing functions. Wells Fargo built risk-modeling engines internally while buying customer interaction AI where deployment speed mattered more than customization.[9]
McKinsey documented a regional bank that started with developer productivity. Productivity rose 40% for targeted use cases. Over 80% of developers reported improved experience. The bank learned how AI works in their environment, then expanded to other areas from a position of strength, not hope.[6]
The organizations failing are the ones who saw ChatGPT demos and said “let’s put this in front of customers.” The organizations winning are the ones who said “let’s learn how this works where we can afford to make mistakes.”
The Path Forward
If you’re a CTO or CIO at a regulated company, here’s the uncomfortable truth: your “simple” AI project is probably the hardest thing you could attempt.
The questions to ask (a rough scoring sketch follows the list):
- Does this project require real-time response? If yes, difficulty multiplies.
- Does it touch customers or regulated data? If yes, difficulty multiplies again.
- Does it require governance we don’t have? If yes, you’re not deploying AI; you’re building a compliance program.
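Those three questions map straight onto the axes from the complexity map above, so you can turn them into a back-of-the-envelope score. The multipliers below are illustrative, not calibrated:

```python
def deployment_difficulty(real_time: bool,
                          customer_or_regulated: bool,
                          needs_new_governance: bool) -> int:
    """Each 'yes' multiplies difficulty rather than adding to it."""
    score = 1
    if real_time:
        score *= 3  # seconds to respond, no time for verification loops
    if customer_or_regulated:
        score *= 3  # high blast radius plus explainability requirements
    if needs_new_governance:
        score *= 3  # you are building a compliance program first
    return score

print(deployment_difficulty(True, True, True))     # customer chatbot: 27
print(deployment_difficulty(False, False, False))  # developer tooling: 1
```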
The better path:
- Find the places where your IT teams are already experimenting with AI. That’s signal.
- Formalize those experiments. Measure. Learn.
- Build governance muscle on low-stakes projects before high-stakes ones.
- Let visible customer AI be the graduation, not the starting point.
The tutorial level is disguised as “complex technical work.” The boss fight is disguised as “simple customer automation.”
Choose accordingly.
References
1. MIT Project NANDA, “The GenAI Divide: State of AI in Business 2025”: “95% of enterprise generative AI projects fail to deliver meaningful business impact or revenue acceleration.” fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/
2. Forbes/UJET, “Chatbot Frustration Survey”: “72% consider chatbots ‘waste of time’, 78% escalate to human, 63% no resolution, 80% said using chatbots increased their frustration level.” forbes.com/sites/chriswestfall/2022/12/07/chatbots-and-automations-increase-customer-service-frustrations-for-consumers-at-the-holidays/
3. Humanities and Social Sciences Communications (Nature Portfolio), “Consumer Trust in AI Chatbots: Service Failure Attribution”: “When customers experience chatbot failures, they don’t blame ‘this specific instance’; they blame AI capabilities as a category… This creates a trust death spiral.” nature.com/articles/s41599-024-03879-5
4. GitHub/arXiv, “The Impact of AI on Developer Productivity: Evidence from GitHub Copilot”: “developers complete tasks 55-82% faster.” arxiv.org/abs/2302.06590
5. Index.dev, “Developer Productivity Statistics with AI Tools 2026”: “90% of developers feel more productive with AI tools.” index.dev/blog/developer-productivity-statistics-with-ai-tools
6. McKinsey, “Extracting Value from AI in Banking: Rewiring the Enterprise”: “A regional bank used gen AI to boost the productivity and efficiency of its software developers… Productivity rose about 40 percent… more than 80 percent of developers said gen AI improved their coding experience.” mckinsey.com/industries/financial-services/our-insights/extracting-value-from-ai-in-banking-rewiring-the-enterprise
7. LinkedIn/OpenAI internal usage: “OpenAI engineers are completing 70% more pull requests per week using their Codex tool.” linkedin.com/posts/justinhaywardjohnson_openai-unveils-o3-and-o4-mini-activity-7318687442868342784-1l3m
8. Enterprise AI Spectrum Framework: “Autonomy Levels 5-6 (Agentic Loops) require full telemetry, error budgets, and incident playbooks; typically 18+ months AI maturity.” leverageai.com.au/the-enterprise-ai-spectrum/
9. LinkedIn analysis, “AI Project Success Rates”: “JPMorgan Chase succeed by building proprietary solutions for core competitive advantages (COIN contract intelligence platform) while purchasing vendor solutions for commoditized functions.” linkedin.com/pulse/ai-project-success-rates-reconciling-75-vs-95-failure-tom-mathews-xma0c