
Why AI Capability Is Easy and AI Returns Are Hard

Capability is becoming commodity. Returns come from redesigning the work underneath it — the organizational change that is harder, slower, and impossible to demo.

Argument in three

  1. Adoption has outrun impact: most companies now use generative AI somewhere, while far fewer can point to bottom-line movement. That gap is structural, not a sign anyone is doing it wrong.

  2. The fingers-and-toes problem: automating part of a role saves part of a person, and partial people never show up on an income statement. A thousand task-level wins can still sum to zero.

  3. Returns appear when the work is redesigned end to end: process retired, roles re-scoped, decision rights moved, and freed capacity redeployed or genuinely removed.

Raj Bhatia · June 1, 2026

In 1999, on a GE Capital operations floor in Gurgaon, I took 80% of the cycle time out of an FP&A billing process. I was a few years out of school. We didn't put a clever tool on top of the old process. We took the process apart and rebuilt it so that most of the steps no longer existed. The 80% was real for a specific reason: the work was gone, not faster. There was nothing left to drift back to.

I have spent the 25 years since learning that this is the only version of a productivity gain that ever reaches a bottom line, and that most organizations, then and now, are buying the other version.

The Paradox That Defines 2026

Here is the pattern that defines 2026. Most companies now report using generative AI in at least one function, while far fewer can attribute meaningful EBIT movement to it yet. Capability has outrun impact. The comforting interpretation is maturity — early days, give it time. I read it differently, because I have watched a version of this before.

A decade ago the acronym was RPA and the promise was identical: deploy the bots, watch the savings appear. I had a front-row view of how that promise got sold — the demos genuinely impressive, the savings projected to the dollar. The pilots worked. And the firm-level financial impact, for most institutions, never showed up — for reasons that had nothing to do with the software.

The Fingers-and-Toes Problem

McKinsey has a name for the core mechanism, and I think it is the most useful phrase in this whole debate: the fingers-and-toes problem. You automate part of a role. You save part of a person. Partial people do not appear on an income statement. A task gets 40% faster, the role still exists, the team is still staffed at the same level, and the freed capacity dissolves into the workday — a little more slack, a few meetings that end early. Multiply that across a thousand use cases and you get an institution that is measurably more productive at the task level and financially identical at the firm level. Every individual win is real. The sum is zero.
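The arithmetic behind that claim can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than McKinsey's model or data from any real program: the role names, the headcounts, the percentages, and the simplifying rule that a whole position is released only when the remaining work in a role is consolidated onto fewer people.

```python
from dataclasses import dataclass


@dataclass
class Role:
    name: str            # hypothetical role label
    headcount: int       # people currently staffed in the role
    time_saved_pct: int  # percent of each person's time freed by automation


def freed_fte(roles):
    """Task-level view: total capacity freed, in full-time equivalents."""
    return sum(r.headcount * r.time_saved_pct for r in roles) / 100


def positions_released(roles, redesigned):
    """Firm-level view: whole positions that can be redeployed or removed.

    Assumption: without redesign, every person keeps a role that is only
    partly automated, so no whole position is released. With redesign,
    the remaining work in each role is consolidated onto fewer people,
    and only whole positions count.
    """
    if not redesigned:
        return 0
    return sum(r.headcount * r.time_saved_pct // 100 for r in roles)


# Hypothetical figures, chosen only to illustrate the mechanism.
roles = [
    Role("reconciliation analyst", headcount=10, time_saved_pct=40),
    Role("billing specialist", headcount=3, time_saved_pct=25),
    Role("reporting analyst", headcount=8, time_saved_pct=25),
]

print(freed_fte(roles))                             # 6.75
print(positions_released(roles, redesigned=False))  # 0
print(positions_released(roles, redesigned=True))   # 6
```

Under these assumptions the dashboard reports 6.75 FTE of capacity freed in both cases. Only the redesigned case turns that into positions that can be redeployed or removed, and even then the fractional remainder stays invisible to the income statement.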

Stack the rest of the usual pattern on top and the paradox stops being a paradox. Use cases lifted from a vendor deck without regard for how this institution actually works. Broad, shallow copilots that help a little with everything and transform nothing. Point solutions that plateau the moment they meet a real exception. A hundred micro-initiatives, none reusable, each owned by a different team, none tied to a number anyone on the executive floor cares about. That is not a technology failure. It is the predictable output of a deployment model that was never designed to change how work happens — only to lay capability on top of work that stays exactly the same.

Where Value Actually Shows Up

Value shows up at one moment: when the work itself is redesigned end to end around the new capability. The process is redrawn, not decorated. Roles are re-scoped. Decision rights move — somebody new owns the call, and the person who used to make it is doing something else or is no longer needed. And the freed capacity is pointed somewhere it matters: redeployed to revenue, or genuinely removed. Until that organizational work is done, the capability is a feature. After it is done, it is a result.

That is what made 80% real in Gurgaon a quarter century ago. It is what separates, in every program I have run or watched since, the work that reached the income statement from the work that reached a dashboard and stopped. The institutions seeing returns from AI right now did not find better technology. They are running the same models, often the same vendors. What set them apart was doing the organizational work — the part that is harder, slower, more political, and impossible to demo. That work means telling people their jobs are changing, not that their jobs are getting easier.

So the question for any executive wondering why the savings haven't materialized is not "is the model good enough." It almost always is. The question is: when this works, what stops happening? Which process is retired, which report is killed, which decision moves, which role changes? If the program has a confident answer — names, dates, an org chart that looks different next year — it has a chance of reaching the P&L. If the answer is some version of "people will just be more productive," you have bought the fingers-and-toes version, and you will get exactly what it has always delivered: a more capable institution that is, financially, precisely where it started.

The next five years of competitive advantage in financial services will not be decided by who has the best AI. Capability is becoming commodity fast. It will be decided by who is willing to do the work that turns capability into results — taking the old way apart so the new way has somewhere to live. That work has not changed in 25 years. The acronym on the cover keeps changing. The thing that moves the number is the same as it has always been.

Adoption and EBIT-attribution figures: McKinsey, “The State of AI” survey series, 2025–2026.

I advise financial institutions on the problems these essays describe — diagnosing and redesigning how organizations actually run. If this is the conversation you're having internally, it's worth 30 minutes.


About the Author

Raj Bhatia writes on AI and the operating models that decide whether it works — drawn from 25 years building and refining functions inside GE Capital, Moody's, Deloitte, and Code and Theory. Founder of SigmaArc.