Citi Wealth is one of the world’s largest wealth management operations, serving clients across approximately 60 countries. In the United States, its Citigold brand serves around 400,000 to 500,000 mass-affluent clients, while its retail banking arm has approximately three million customers. At the upper end, its private bank counts among its clients roughly 25 to 30 per cent of the world’s billionaires. Across that range – from people just beginning to invest to the world’s wealthiest individuals – Citi Wealth manages approximately one trillion dollars in assets.
At Google Cloud Next in Las Vegas last week, Citi Wealth unveiled Citi Sky – an AI-powered avatar built on Google Cloud and Google DeepMind technology, designed to engage clients directly through real-time voice and video conversation. Available to Citigold clients in a phased US rollout beginning this summer, it is one of the more advanced deployments of customer-facing agentic AI I have seen – not just from a major financial institution, but from any organization. I spoke with Joe Bonanno, who leads Citi Wealth’s intelligence team, and Rohit Bhat, Google Cloud’s Managing Director of Financial Services, to understand what is driving the investment – and what it is actually trying to achieve.
It’s not about efficiency
Citi Wealth’s clients hold approximately five trillion dollars in assets and the firm manages roughly one trillion of that. Bonanno is refreshingly honest about what this technology aims to do. He says:
Everybody’s doing efficiency. This is about playing offence – generating revenue in a way that also delivers better client outcomes.
Some advisors in the US business carry 500 to 1,000 clients. Tracking every account movement, bond maturity, stock upgrade, and asset allocation drift across that book – and calling every client who needs to hear about it – is just not physically possible. What Citi Wealth has built in Sky is, in Bonanno’s view, a digital team member. Sky is always on, able to handle routine queries at any hour, and is seemingly capable of surfacing the right prompt to the right client at the right moment.
The target client is as much about access as affluence. Bonanno gave the example of a single mother who cannot reach her advisor in the evening, when Sky will still be available. The ambition, he said, is not productivity – it is wallet share, client engagement, and retention. This is notable at a time when much of what we are hearing about enterprise AI deployments centers on cost reduction and internal efficiency. Customer-facing agentic AI operating at this scale is still rare, and Citi Wealth is making the case that the external-facing opportunity is where the real value lies.
The platform underneath
Getting to this point required rebuilding Citi Wealth’s technology foundation. The organization came out of the financial crisis having sold Smith Barney to Morgan Stanley, which left it without the infrastructure footprint of its peers for several years. By the time Bonanno arrived, the architecture had grown into what he describes as resembling seven different companies – separate data lakes, warehouses, front ends, and product processors.
Around 18 months ago, Citi Wealth began building a unified platform called One Wealth, consolidating that fragmented estate into a single architecture. On top of it sit the systems of record; on top of those sit the client and advisor channels. The partnership with Google – spanning Gemini, Vertex, and now DeepMind – has become the destination for the AI layer. Citi had been working with OpenAI for several years, but has since pivoted decisively towards Google. Bonanno says:
What we’ve built is a platform called One Wealth. It has everything from soup to nuts. On top of that sit our systems of record, and on top of those sit our channels – the client mobile and online experience, the advisor workstation. Now it’s all built around conversation. Advisors can ask: which of my clients are exposed to Nvidia? Which is overweight cash? Which clients haven’t I called in the last 30 days and also own Nvidia? That’s extraction and summarization – essentially RAG-style retrieval.
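To make the advisor questions in that quote concrete, here is a minimal sketch of the last one: clients who own Nvidia and have not been called in 30 days. Everything in it is an assumption for illustration; the Client record, its fields and the sample book are hypothetical and do not reflect One Wealth's actual data model. A conversational front end would presumably translate the advisor's question into a filter of this kind.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Client:
    """Hypothetical client record; not Citi's schema."""
    name: str
    holdings: dict[str, float]  # ticker -> market value in USD
    last_contact: date


def owns(client: Client, ticker: str) -> bool:
    """True if the client holds a positive position in the ticker."""
    return client.holdings.get(ticker, 0.0) > 0


def not_called_since(client: Client, today: date, days: int) -> bool:
    """True if the last contact was more than `days` days ago."""
    return (today - client.last_contact) > timedelta(days=days)


def clients_to_call(book: list[Client], ticker: str, today: date, days: int = 30) -> list[str]:
    """Names of clients who hold `ticker` and have not been contacted recently."""
    return [
        c.name
        for c in book
        if owns(c, ticker) and not_called_since(c, today, days)
    ]


# Invented sample book for the example.
BOOK = [
    Client("Alvarez", {"NVDA": 50_000.0, "CASH": 10_000.0}, date(2025, 1, 2)),
    Client("Baker", {"NVDA": 20_000.0}, date(2025, 2, 20)),
    Client("Chen", {"CASH": 90_000.0}, date(2024, 12, 1)),
]
TODAY = date(2025, 3, 1)

print(clients_to_call(BOOK, "NVDA", TODAY))
```

With this sample book, only Alvarez both holds NVDA and was last contacted more than 30 days before the reference date.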
Underneath those queries, agents autonomously mine for opportunities, risks, and signals. When a retail bank customer updates their direct deposit to reflect a new employer, that is a signal – a potential cross-sell opportunity, a conversation worth having. The system is being built to surface those moments to the right advisor, in real time, across multiple languages. He adds:
What we’re now building underneath are agents that autonomously mine opportunities, risks and threats. Think about three million retail bank clients. When someone changes their job from Apple to Google and updates their direct deposit, that’s a signal. They probably have a 401(k) at Apple I can cross-sell against, or suggest an IRA. They may start making large cash deposits, paying BMW, buying luxury goods – so many signals.
We’re building agents to represent every client or archetype persona – to figure out who to call today, what to talk to them about, and surface that to the advisor in real time. In different reading levels, in Spanish, in Mandarin.
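The direct-deposit example can be sketched as a simple rule. This is only a toy version under stated assumptions: the DepositUpdate record, the signal name and the talking points are hypothetical, and the agents Bonanno describes would presumably weigh many more signals than a single employer change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DepositUpdate:
    """Hypothetical event emitted when a customer edits direct-deposit details."""
    client_id: str
    old_employer: str
    new_employer: str


def detect_job_change(update: DepositUpdate) -> Optional[dict]:
    """Return a signal for the advisor if the employer changed, else None."""
    old = update.old_employer.strip().lower()
    new = update.new_employer.strip().lower()
    if old == new:
        return None  # same employer, nothing to surface
    return {
        "client_id": update.client_id,
        "signal": "job_change",
        "talking_points": [
            f"Possible 401(k) left at {update.old_employer}",
            "Discuss IRA options",
        ],
    }


signal = detect_job_change(DepositUpdate("c-001", "Apple", "Google"))
print(signal)
```

Normalizing case and whitespace before comparing avoids raising a false signal when a customer merely retypes the same employer name.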
One internal deployment that illustrates the cultural shift already under way is Stylus – a Gemini-powered tool now on the desktop of every Citi employee. Advisors can build their own presets: type a client’s name and automatically receive a briefing drawn from news, LinkedIn, and recent market events. Out of 220,000 Citi employees, around 180,000 are active users. It has been woven into onboarding and training, with a champion network across every business domain. Bonanno says that cultural direction from the top matters: he cites Andy Sieg, Head of Wealth, as more engaged in AI than some of his technology leadership and constantly pushing for new use cases.
Building Sky
Citi Sky is built on Google DeepMind’s real-time avatar technology and Gemini’s Live API, enabling voice and video conversation with low latency. It will launch in English and Spanish, with additional languages planned. At launch, its capabilities include prompting clients around certificate of deposit maturities, surfacing market insights from Citi Wealth’s Chief Investment Office, and handling routine queries that would otherwise consume advisor time.
In terms of how guardrails are enforced in a real-time voice context, Bhat explained that conventional voice AI systems convert speech to text, pass that to a model, and convert the response back – with loss of fidelity at each step. Google’s approach for Sky is voice-to-voice processing, with the aim of maintaining context throughout, without conversion. The consequence is that guardrails can be enforced inline, within the conversation itself, rather than applied as a post-processing layer. Bhat says:
That was a really powerful discovery. It’s why Google DeepMind was so heavily engaged on this.
Bonanno describes the compliance architecture in similar terms. When a client speaks with Sky, the response passes through Citi’s own checking layer, then on to Google – which monitors for toxicity, bias, and policy breaches – and then back into Citi’s infrastructure, which verifies the response against that client’s specific risk tolerance and investment objectives. If a client’s risk profile makes a particular conversation inappropriate, the system flags it in real time. Suitability – one of the core principles of wealth management – is built into the interaction layer, not bolted on afterwards.
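The layered flow Bonanno describes can be pictured as a chain of checks that a draft response must clear in order. The sketch below is an assumption-laden illustration, not Citi's or Google's implementation: the check functions, the banned phrase list and the 1-to-5 risk scale are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class ClientProfile:
    risk_tolerance: int  # hypothetical scale: 1 (conservative) to 5 (aggressive)


@dataclass
class Draft:
    text: str
    product_risk: int  # risk level of the topic raised, on the same hypothetical scale


def firm_policy_check(draft: Draft, profile: ClientProfile) -> bool:
    """Stand-in for the firm's own checking layer."""
    banned_phrases = ("guaranteed return",)
    return not any(p in draft.text.lower() for p in banned_phrases)


def provider_safety_check(draft: Draft, profile: ClientProfile) -> bool:
    """Stand-in for provider-side screening (toxicity, bias, policy)."""
    return "idiot" not in draft.text.lower()


def suitability_check(draft: Draft, profile: ClientProfile) -> bool:
    """Block topics riskier than the client's stated tolerance."""
    return draft.product_risk <= profile.risk_tolerance


# Order mirrors the article: firm layer, provider layer, then suitability.
CHECKS = [
    ("firm_policy", firm_policy_check),
    ("provider_safety", provider_safety_check),
    ("suitability", suitability_check),
]


def review(draft: Draft, profile: ClientProfile) -> tuple:
    """Run checks in order; return (approved, name of first failed check or None)."""
    for name, check in CHECKS:
        if not check(draft, profile):
            return False, name
    return True, None


print(review(Draft("Your CD matures next week.", 1), ClientProfile(2)))
print(review(Draft("Consider leveraged options.", 5), ClientProfile(2)))
```

Returning the name of the first failed check is one simple way to support the real-time flagging the article mentions, since the caller learns which layer objected.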
All client data is masked and double-encrypted. PII protections are enforced throughout, with both preventative and detective controls monitored by agents. Regulatory and compliance teams were included from the outset of the build.
At this stage, Sky provides guidance and surfaces insights – it does not yet issue buy or sell recommendations. That capability is the stated target towards the end of this year or early next.
What the advisors said
Many agentic AI proposals from vendors include the concept of ‘human in the loop’, but the reality on the ground is often different. I was therefore keen to hear how the Citi teams whose work Sky will take on have reacted. Advisors were involved in co-creating Sky from the beginning, and their response was less anxious than might be expected. Bonanno says:
Some of them have hundreds or even a thousand clients. They looked at Sky as their digital assistant – their advisor twin. Some of them realized: I can only do so much. If something’s happening in the market and I’m at my son’s baseball game, I have a team member who can handle 90 per cent of the routine queries – cost basis questions, document requests, all of that instantly answered.
For them it was: I have a digital team member who handles the basics and routes to me what I need to know. This could happen at ten at night – Sky answers three questions, two it can’t handle, but it logs those in the system. When I come in the next morning: ‘here’s who I took care of, here’s who you need to follow up with’.
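The overnight handoff in that quote amounts to splitting interactions into those Sky resolved and those logged for the advisor. A minimal sketch, assuming a hypothetical set of topics Sky can handle and an invented interaction log:

```python
# Hypothetical set of routine topics the assistant may answer on its own.
CAN_HANDLE = {"cost_basis", "document_request", "cd_maturity"}


def morning_summary(interactions, can_handle=CAN_HANDLE):
    """Split (client, topic) pairs into handled items and advisor follow-ups."""
    handled, follow_up = [], []
    for client, topic in interactions:
        if topic in can_handle:
            handled.append((client, topic))
        else:
            follow_up.append((client, topic))  # logged for the advisor
    return {"handled": handled, "follow_up": follow_up}


# Invented overnight log for the example.
overnight = [
    ("Alvarez", "cost_basis"),
    ("Baker", "estate_planning"),
    ("Chen", "document_request"),
]
print(morning_summary(overnight))
```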
The example that resonated most, according to Bonanno, was the market shock scenario. When the S&P moves significantly, advisors would previously send a static message to their book. Sky could run a personalized portfolio review for every client, send each a message calibrated to their specific position, and hold a full conversation with any client who responds – walking through what the movement means for them, whether they are well-positioned, what they might consider. Bonanno says that to the advisors, that capability was “mind-blowingly powerful.”
Citi Wealth is not, according to Bonanno, building Sky to reduce headcount or cut costs. The roughly $4 trillion sitting outside the firm is the number that drives the investment. The case being made here is that agentic AI, deployed responsibly and at scale, is a revenue instrument first – and that wealth management, with its combination of data richness, regulatory rigor, and client relationship economics, may prove one of the more natural environments in which to demonstrate it. He says:
[Sky is] much less about productivity, much more about revenue and engagement.
