Forecast · May 2026 · 11 min read

Beyond the pilot: where generative AI actually pays.

Three years and several billion dollars into the generative AI cycle, the gap between firms capturing value and firms reporting activity has widened, not narrowed. We examine why — and what separates the two cohorts.

Suzanne Cahill

Partner, AI, Transformation and Operating Models

Generative AI is now five years old in commercial terms and three years into mainstream enterprise adoption. The investment numbers are staggering. The value-capture numbers are not. Across the institutions we work with, only a minority can evidence a positive net contribution to earnings from their generative AI programmes. The remainder are running what we politely call a strategic option and what their CFOs increasingly call a cost line.

The encouraging news is that the value-capturers are not random. They share identifiable patterns of how they choose where to deploy, how they organise around it, and crucially how they decide what not to do. This paper sets out the seven patterns we observe most consistently among them.

Headline Finding

Firms in the top quartile of generative AI value capture spend less on AI per employee than the median — but spend it on three to five use cases, not thirty. Concentration of effort, not scale of spend, is the dominant predictor of measured return.

1. The value-capture gap is structural, not temporal

A common board-level reassurance in 2025 was that the value would come eventually, that we were simply in the trough of the productivity J-curve. Our data does not support this. Comparing the same 142 firms over consecutive twelve-month windows, the firms already generating measurable returns increased them, but the firms not generating returns did not move into the first group. They simply continued spending.

This matters because it implies that the gap is not about time, talent, or technology maturity. It is about choices made early in the programme that, once made, are extremely difficult to reverse.

2. The seven patterns of value capture

2.1 They start from the P&L, not from the model

Value-capturing firms can articulate, before they begin, the specific line item the deployment will move and by how much. Underperformers typically begin with a capability ("we should use this model") and search for a use case afterwards. The first approach is engineering against a target; the second is solutionism.

2.2 They prioritise revenue-generating use cases over cost-out

Counter to conventional wisdom, the highest-return use cases we have measured are not in operations. They are at the revenue-generating edge: investment research personalisation, relationship manager leverage, structured-product origination. The reason is mathematical — a 5% improvement on a $100m revenue line is twice a 50% improvement on a $5m cost line, and revenue improvements compound.
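The arithmetic behind that comparison can be checked in a few lines. The sketch below is illustrative only: the function name `absolute_gain` is a hypothetical helper, the figures are the ones quoted above, and it compares gross dollar amounts without modelling margins or compounding.

```python
def absolute_gain(base: float, improvement: float) -> float:
    """Absolute annual gain from a fractional improvement on a line item."""
    return base * improvement


# Figures quoted in the text: 5% on a $100m revenue line, 50% on a $5m cost line.
revenue_gain = absolute_gain(100_000_000, 0.05)
cost_gain = absolute_gain(5_000_000, 0.50)

# How many times larger the revenue uplift is than the cost saving.
ratio = revenue_gain / cost_gain

print(f"Revenue uplift: ${revenue_gain:,.0f}")
print(f"Cost saving:    ${cost_gain:,.0f}")
print(f"Ratio:          {ratio:.1f}x")
```

The revenue uplift comes to $5m against a $2.5m cost saving, which is the two-to-one ratio stated above, even though the percentage improvement on the cost line is ten times larger.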

2.3 They invest in proprietary data, not proprietary models

Almost none of the value-capturers we studied trained their own foundation models. All of them invested heavily in the proprietary data and feedback loops that made off-the-shelf models behave distinctively in their context. The defensible asset is the data substrate, not the model.

2.4 They redesign the workflow around the model

We covered this at length in our companion paper on agentic systems, but the pattern repeats here: the firms generating return have redesigned the surrounding process. The firms not generating return have inserted the model into an unchanged process and measured the marginal saving.

2.5 They run small, accountable squads

The modal value-capturing deployment was built by a squad of fewer than ten people, with a single accountable owner who held both the technology and business outcome. Large central AI functions correlate negatively with measured value capture in our sample — they generate capability, not outcomes.

2.6 They kill aggressively

Top-quartile firms terminate use cases early and often, while laggards let weak initiatives drift for years. The higher kill rate does not mean the top firms picked worse use cases; it means they were willing to confront a poor signal early. Sunk-cost discipline is, in our experience, the single most distinguishing organisational behaviour.

2.7 They treat governance as an accelerator

Counterintuitively, the firms with the most rigorous AI governance ship faster. The reason is simple: clear rules eliminate the deliberative paralysis that occupies most institutions in the deployment-readiness phase. Governance is the runway, not the brake.

3. Where the next $1 trillion will be made

Looking across our deal pipeline and the strategic plans we have reviewed in the past six months, three domains stand out as under-exploited relative to their economic potential:

Hyper-personalised wealth advice — the technology to deliver bespoke advice at mass-affluent scale now exists; the distribution model to monetise it largely does not.
Capital markets workflow compression — origination, syndication and post-trade processes still operate on cycle times designed for the fax machine. Compression here is worth tens of basis points across the value chain.
Regulatory and disclosure automation — a quiet, unfashionable category that will deliver disproportionate margin relief in jurisdictions where reporting cost is now the binding constraint on product proliferation.

4. Closing view

Generative AI is neither overhyped nor underhyped. It is unevenly distributed. The value is real, large, and being captured today by a small minority of institutions that have made a handful of unfashionable choices: fewer use cases, smaller teams, proprietary data over proprietary models, and a discipline of killing what is not working.

The investment case for the next phase is, in our view, not "spend more on AI". It is "spend more like the firms already winning with AI".

Findings draw on a longitudinal panel of 142 financial institutions tracked across 2024–2025, supplemented by 31 in-depth engagement case studies. All financials are normalised to USD at 2025 year-end exchange rates.