What CRO Conversations on AI Look Like in 2026
Synthesis of 50+ Chief Risk Officer conversations across BFSI, healthcare and public sector over the past nine months. What they're actually asking about (vendor concentration, model lifecycle, audit substrate), what they've stopped asking about (jailbreaks at the chatbot layer), and the four risk-framing shifts that have happened in CRO offices since Q4 2025. Forward-looking year-in-review angle.
I had my first conversation with a Chief Risk Officer about generative AI in late 2022, sitting across a conference table from the CRO of a regional Indian bank who had read about ChatGPT in The Economic Times that morning. The question was "what is this and should I be worried?" Fifty conversations later, in May 2026, the questions are very different. CROs in 2026 are not asking whether to adopt AI or whether the technology is real; they're asking specific, narrowly scoped questions about vendor concentration risk, model lifecycle artefacts, audit substrate, and the structural mismatch between cloud LLM vendor business models and regulator expectations. This piece synthesises 51 conversations, spanning BFSI (29), healthcare (13), public sector (7), and pharma (2), into the four risk-framing shifts that have happened in CRO offices since Q4 2025. None of the customers are named; the shifts are widespread enough to be generalisable.
Why the framing matters
Risk framing in regulated industries is the substrate that determines what gets approved, what gets rejected, and what gets sent back to the business for further work. A CRO's framing of AI risk in 2024 was largely "this is a category of operational risk we don't yet have a framework for"; the framing in mid-2026 is largely "this is an enterprise-wide risk category that intersects with operational, model, IT, cybersecurity, and conduct risk and requires a co-ordinated cross-functional approach." The shift in framing has real consequences for what AI work gets greenlit, what governance structures get put in place, and what vendor relationships get sustained or terminated. CRO framing matters for engineering teams because the engineering choices that succeed in front of a 2024-style CRO are different from the engineering choices that succeed in front of a 2026-style CRO. Specifically: the 2024-style CRO would often accept "we have human review" as a sufficient control; the 2026-style CRO wants to see the protocol, the competence requirements, the override-authority documentation, and the post-decision audit trail. Engineering teams that haven't updated their pitch are losing decisions they would have won twelve months ago.
Shift 1 — from "jailbreaks at the chatbot layer" to "vendor concentration risk"
The dominant CRO question in mid-2024 was some variant of "can users get the AI to say embarrassing or dangerous things?" In mid-2026 that question has substantially receded; the substrate (input validation, output filtering, prompt-injection defences) is mature enough that CROs are willing to treat it as a tractable engineering problem. The dominant question now is vendor concentration risk: "if our cloud LLM vendor changes pricing, deprecates a model version, or fails commercially, what is our exposure and how quickly can we recover?" The shift is driven by two things. First, the public deprecation of OpenAI's GPT-3.5 models on relatively short notice in 2024 and the model-version churn that has continued since. Second, the regulator-driven expectation that operational resilience under the EU's Digital Operational Resilience Act (DORA) and its equivalents includes AI vendor concentration. The implication for engineering teams: "we use the best frontier API" has become an answer that increases CRO anxiety rather than reducing it. The answer that reduces CRO anxiety is "we have a multi-model architecture with at least one self-hosted open-weights model as the recovery path." Customers running fully on a single cloud LLM vendor in mid-2026 are increasingly facing CRO-driven mandates to build the recovery path before the AI ambition can be expanded.
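The recovery-path idea can be sketched in a few lines. This is a minimal illustration, not a production router: the `Backend` type, the `BackendUnavailable` exception, the backend names, and the two stub functions are all hypothetical stand-ins for real client calls. The one design point it tries to show is that the router refuses to run at all unless a self-hosted backend is configured.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


class BackendUnavailable(Exception):
    """Raised by a backend that cannot serve the request (hypothetical)."""


@dataclass
class Backend:
    name: str                       # label used in logs
    self_hosted: bool               # True if weights run on customer infrastructure
    generate: Callable[[str], str]  # stand-in for the real client call


def route(prompt: str, backends: List[Backend]) -> Tuple[str, str]:
    """Try each backend in priority order; return (backend_name, completion)."""
    # Refuse to operate without a recovery path the customer controls.
    if not any(b.self_hosted for b in backends):
        raise ValueError("no self-hosted recovery path configured")
    failures = []
    for backend in backends:
        try:
            return backend.name, backend.generate(prompt)
        except BackendUnavailable as exc:
            failures.append(f"{backend.name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(failures))


def cloud_down(prompt: str) -> str:
    # Simulates a vendor-side failure such as a deprecated model version.
    raise BackendUnavailable("model version deprecated")


def local_model(prompt: str) -> str:
    # Simulates a self-hosted open-weights model answering the request.
    return f"[local] {prompt}"


backends = [
    Backend("cloud-frontier", self_hosted=False, generate=cloud_down),
    Backend("on-prem-open-weights", self_hosted=True, generate=local_model),
]
served_by, completion = route("Summarise clause 4.2", backends)
print(served_by, completion)
```

In this sketch the cloud backend fails and the request is served by the self-hosted one. A real deployment would also need to check that the fallback model is fit for the same scope, which the sketch does not attempt.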
Shift 2 — from "model accuracy" to "model lifecycle artefacts"
The dominant CRO question on model-quality in 2024 was "how accurate is the model?" — a question whose answer was usually a single benchmark number. In 2026 the question is "what evidence do we have about the model's training data, its evaluation methodology, its deployment lineage, its post-deployment monitoring, and its incident history?" The CRO is asking for the model lifecycle artefacts, not the model accuracy number. The shift is driven by the EU AI Act's Articles 9–15 evidence requirements and by the equivalent provisions in domestic frameworks (RBI Master Direction on IT Governance, SAMA Cyber Resilience Framework updates, NHS DSPT v2024). The implication for engineering teams: a model deployment that doesn't carry its lifecycle artefacts is a model deployment that the CRO can no longer approve. The lifecycle artefacts include the training-data lineage (or for off-the-shelf models, the vendor-provided lineage documentation), the evaluation-harness results across multiple criteria, the deployment manifest (what version is in production on what date), the post-deployment monitoring outputs (drift detection, behavioural eval scores over time), and the incident-and-near-miss log. Engineering teams that have built this substrate into the deployment pipeline (Langfuse + RAGAS + a structured evaluation harness, persisted to a content-addressed audit store) are finding CRO conversations meaningfully shorter and more constructive than teams that haven't.
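As a rough illustration of what "persisted to a content-addressed audit store" can mean in practice, the sketch below stores each artefact under the SHA-256 hash of its canonical JSON form and links a deployment manifest to its evidence by hash. The `AuditStore` class, the field names, and every value are assumptions made for the example; it is an in-memory stand-in and does not represent the Langfuse or RAGAS APIs.

```python
import hashlib
import json
from typing import Dict


def canonical_bytes(artefact: dict) -> bytes:
    """Serialise deterministically so equal artefacts hash identically."""
    return json.dumps(artefact, sort_keys=True, separators=(",", ":")).encode("utf-8")


class AuditStore:
    """In-memory stand-in for a content-addressed store (hypothetical)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, artefact: dict) -> str:
        """Store the artefact and return its address (the SHA-256 hex digest)."""
        data = canonical_bytes(artefact)
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> dict:
        """Fetch an artefact, verifying that it still matches its address."""
        data = self._blobs[digest]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored artefact does not match its address")
        return json.loads(data)


store = AuditStore()
# All values below are illustrative placeholders, not real results.
eval_ref = store.put({"kind": "evaluation", "harness": "example-harness", "passed": True})
incident_ref = store.put({"kind": "incident_log", "entries": []})
manifest_ref = store.put({
    "kind": "deployment_manifest",
    "model_version": "example-model-1.0",
    "deployed_on": "2026-05-01",
    "evaluation": eval_ref,        # the manifest points at its evidence by hash
    "incident_log": incident_ref,
})
manifest = store.get(manifest_ref)
print(manifest_ref, manifest["evaluation"])
```

Because the address is derived from the content, any later change to an evaluation record produces a different hash, so a manifest that references the old hash cannot silently point at altered evidence.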
Shift 3 — from "AI as a category of model risk" to "AI as an enterprise risk requiring cross-functional co-ordination"
The 2024-style CRO often treated AI as a sub-category of model risk — handled by the model risk management function under the existing model-risk framework (SR 11-7 or equivalents). The 2026-style CRO treats AI as an enterprise-wide risk requiring cross-functional co-ordination across model risk, operational risk, IT risk, cybersecurity, and conduct risk. The shift is driven by the realisation that AI systems intersect with too many risk categories to be cleanly handled by any one function. A prompt-injection attack is simultaneously a cybersecurity risk, a model risk (the model behaved outside its specification), an operational risk (the operational impact of the attack), and potentially a conduct risk (if the attack produced an output that affected customer treatment). The implication for engineering teams: pitching AI work to model-risk only is insufficient. Engineering teams need to bring the conduct-risk lens, the cybersecurity lens, and the operational-resilience lens to the same conversation. The customers we've seen execute this best have established an AI risk function that sits across (not within) the existing risk functions, with explicit co-ordination authority and a named lead.
Shift 4 — from "build vs buy" to "sovereign vs vendor-mediated"
The dominant CRO question on procurement in 2024 was the classic build-vs-buy framing: "should we develop this AI capability in-house or buy a vendor product?" In 2026 the framing has shifted to sovereign-vs-vendor-mediated. The question is no longer who builds the system; it is whether the customer can demonstrate exclusive control over the model lifecycle artefacts and the operational substrate. Sovereign means: model weights on customer infrastructure, inference on customer hardware, audit logs in customer SIEM, the entire stack operable air-gapped if required. Vendor-mediated means: a cloud LLM vendor sits in the trust path, with the customer dependent on the vendor's compliance posture, contractual undertakings, and continued operation. The CRO concern with vendor-mediated deployments is not about vendor competence. It is about the structural concentration risk and the inability of the vendor to meet the model-lifecycle-artefact expectations described in shift 2. The implication for engineering teams: the framing "we built our own" matters less than the framing "this is sovereign by design." A sovereign deployment that uses an off-the-shelf vendor product is acceptable; a vendor-mediated deployment, however well-built, is increasingly facing CRO scepticism.
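The four-part definition of sovereign lends itself to an explicit check. The sketch below is only an illustration of that idea: the `Deployment` descriptor and its field names are hypothetical, chosen to mirror the four criteria in the paragraph above, and a real assessment would rest on evidence rather than self-declared booleans.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class Deployment:
    """Hypothetical descriptor; fields mirror the four criteria in the text."""
    weights_on_customer_infra: bool
    inference_on_customer_hardware: bool
    audit_logs_in_customer_siem: bool
    operable_air_gapped: bool  # the text asks for this only "if required"


def sovereignty_gaps(deployment: Deployment) -> List[str]:
    """Return the names of the criteria the deployment does not meet."""
    return [name for name, met in asdict(deployment).items() if not met]


# An off-the-shelf vendor product run entirely on customer infrastructure.
on_prem_vendor_product = Deployment(True, True, True, True)
# A cloud API integration that only ships its logs to the customer SIEM.
cloud_api = Deployment(False, False, True, False)

print(sovereignty_gaps(on_prem_vendor_product))
print(sovereignty_gaps(cloud_api))
```

The first deployment reports no gaps even though a vendor built the product, which matches the point that who built the system matters less than who controls it.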
What CROs have stopped asking about
Three categories of question have substantially receded from CRO conversations. First: "can the AI say embarrassing things?" This question has not disappeared but is now treated as a tractable engineering problem rather than an existential risk. Customers expect the substrate (input validation, output filtering, prompt-injection defence) to be there; absence is now a defect, not a research project. Second: "is the AI good enough?" Most CROs in 2026 have seen enough enterprise AI to know that the AI is good enough for narrowly scoped tasks and inappropriate for many of the tasks that 2024-era proposals were targeting. The question "is the AI good enough?" has been replaced by "is the AI fit for this specific scope?", which is a much more answerable question. Third: "will we be sued?" The litigation risk around AI is now well-bounded enough that CROs have moved on. The new question is more nuanced: "what is our exposure under specific provisions of specific regulations?", typically EU AI Act Articles 9–15 plus the domestic equivalents.
Healthcare CRO conversations — distinct shape
Healthcare CRO conversations have a distinct shape relative to BFSI. The risk taxonomy is different (clinical risk, patient-safety risk, and medical-device-regulation risk dominate; conduct risk is less prominent). The regulatory substrate is different (HIPAA + state law in the US; NHS DSPT + MHRA + EU AI Act in the UK; DPDP + ABDM (formerly NDHM) + state-specific health regulation in India). The deployment posture is different (sovereign-by-default is more common because the cloud-LLM-with-BAA path has become slower than the on-prem alternative in most US covered entities). But the four shifts described above are present in healthcare conversations too, with slightly different framing. The vendor concentration concern in healthcare is most acute around the ambient-clinical-documentation category (where a small number of vendors dominate); the model lifecycle artefact concern is most acute around clinical decision support (where the MHRA and FDA are increasingly assertive); the cross-functional co-ordination concern intersects with clinical-governance structures that don't have a perfect BFSI analogue.
Public sector CRO conversations
Public sector CRO conversations (where the role may be titled Chief Risk Officer, Director of Risk, or an analogue, depending on jurisdiction) follow a different rhythm. The accountability framework is answerability to elected officials plus auditor-general scrutiny plus FOI obligations, rather than oversight by a financial-services-style supervisor. The framing of risk is different: financial loss is less prominent; reputational risk, judicial review risk, and democratic-accountability risk are more prominent. The four shifts described above translate to the public sector but with a different intensity profile. Vendor concentration is a procurement-policy concern (most public-sector procurement frameworks have explicit anti-concentration requirements). Model lifecycle artefacts intersect with FOI obligations (every artefact must be FOI-disclosable, which raises the bar on what gets captured). Cross-functional co-ordination involves not just the risk function but the policy, legal, and operational-leadership functions. Sovereign-vs-vendor-mediated is increasingly being mandated by procurement policy rather than left as a risk-function judgement call.
What this means for the next 12 months
Three forward-looking implications. First: CROs will continue to require sovereign-by-design as the default architecture for high-risk AI deployments. Engineering teams that have not built sovereign-deployment capability are increasingly losing decisions. Second: CROs will require model-lifecycle-artefact substrate (Langfuse-or-equivalent observability, structured evaluation, content-addressed audit store) as a baseline for any AI deployment with EU exposure. Engineering teams that have not built this substrate are increasingly losing decisions. Third: CROs will require cross-functional co-ordination on AI risk, with a named AI risk lead and explicit cross-cutting authority. Engineering teams that pitch into model-risk only are increasingly losing decisions. The teams that have built the sovereign substrate, the model-lifecycle-artefact substrate, and the cross-functional risk-engagement playbook are winning decisions at materially higher rates than teams that haven't. The CRO-conversation pattern is the leading indicator of the procurement-decision pattern, and the procurement-decision pattern is the leading indicator of which AI architectures dominate the regulated-industry market in the next 24 months.
Saurabh Goenka
Saurabh has spent the last five years shipping sovereign AI for regulated enterprises. He's personally led engagements with tier-1 banks across the Gulf, East Africa and South Asia, with healthcare systems in the UK and India, and with central-government agencies on three continents. He speaks regularly at industry forums on the engineering reality of EU AI Act compliance and sovereign LLM deployment.
- NASSCOM Tech Excellence 2026, Healthcare AI category winner
- ET NOW 40 Under 40 (2026)
- Outlook Dynamic Leaders (2025)
- ICAI 40 Under 40 (2025) · Chartered Accountant
- Forbes Business Council member (2021–present)
- 50+ enterprise AI deployments shipped