ASAPAi Soon As Possible · AI & tech, delivered fastest

Google ships the Gemini Interactions API to GA: the server holds state and runs your agents

2026-06-24 · 3 min read

The old pattern of resending the entire conversation state on every AI call is changing. Google has released the Interactions API, a unified interface for Gemini models and agents, to general availability. With it, the server holds conversation state, work can run asynchronously in the background, agents execute in remote Linux sandboxes, and the Flex tier cuts cost by 50%. ASAP summarizes the announcement from the primary source.

The server holds state and runs in the background

The Interactions API is Google's unified interface for Gemini models and agents. Conversation state lives on the server, a request runs asynchronously when background=True is set, built-in tools combine with custom functions, and multimodal generation is supported. It launched as a public beta in December 2025 and has now reached general availability.
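To make the difference concrete, here is a minimal sketch that compares the two calling patterns. It does not use the real SDK: ToyStatefulServer, previous_id, and both payload helpers are hypothetical names invented for illustration, and the "server" is just an in-memory dictionary.

```python
import json
import uuid


class ToyStatefulServer:
    """In-memory stand-in for a server that keeps conversation state (hypothetical)."""

    def __init__(self):
        self._history = {}  # interaction id -> list of stored turns

    def create(self, text, previous_id=None):
        # Continue from the stored turns of the previous interaction, if any.
        turns = list(self._history.get(previous_id, []))
        turns.append({"input": text, "output": f"echo: {text}"})
        new_id = uuid.uuid4().hex
        self._history[new_id] = turns
        return new_id, turns[-1]["output"]

    def turn_count(self, interaction_id):
        return len(self._history[interaction_id])


def stateless_payload(history, text):
    """Old pattern: the client resends every earlier turn with each call."""
    return json.dumps({"contents": history + [text]})


def stateful_payload(previous_id, text):
    """New pattern: the client sends only the new input plus a reference id."""
    return json.dumps({"previous_id": previous_id, "input": text})


server = ToyStatefulServer()
questions = ["What is GA?", "When did the beta start?", "Which SDKs exist?"]

history, prev_id = [], None
stateless_sizes, stateful_sizes = [], []
for q in questions:
    # Measure what each pattern would put on the wire for this turn.
    stateless_sizes.append(len(stateless_payload(history, q)))
    stateful_sizes.append(len(stateful_payload(prev_id, q)))
    prev_id, answer = server.create(q, previous_id=prev_id)
    history += [q, answer]

print("stateless bytes per call:", stateless_sizes)
print("stateful bytes per call: ", stateful_sizes)
print("turns held by server:", server.turn_count(prev_id))
```

In this toy, the stateless payload grows with every turn while the stateful one stays roughly constant, because the history lives on the server side.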

Managed Agents work in remote Linux sandboxes

Managed Agents provision remote Linux sandboxes where an agent can reason, execute code, browse the web, and manage files. Deep Research comes in two versions, one tuned for speed and one for depth, with collaborative planning and native charts and infographics. Built-in tools such as Google Search and Google Maps combine with custom functions, and tool results can return images alongside text.

Nano Banana 2 and Lyria 3 generate images and music

Media generation includes the Nano Banana 2 image model, the Lyria 3 music model, and expressive multi-speaker speech. The schema also changes, under the banner "From Roles to Steps," so that each action becomes a typed step. The API is available through Python and JavaScript SDKs and integrates with LiteLLM, Eigent, and Agno.
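One way to picture "typed steps" is as a small set of record types that a client dispatches on by type. The sketch below is only an illustration of that idea: the step classes and their fields are hypothetical and are not taken from Google's actual schema.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical step types; the real schema may name and shape these differently.
@dataclass
class ThoughtStep:
    text: str


@dataclass
class ToolCallStep:
    tool: str
    arguments: dict


@dataclass
class ToolResultStep:
    tool: str
    result: str


@dataclass
class OutputStep:
    text: str


Step = Union[ThoughtStep, ToolCallStep, ToolResultStep, OutputStep]


def summarize(steps):
    """Render each step by dispatching on its type rather than on a role string."""
    lines = []
    for step in steps:
        if isinstance(step, ThoughtStep):
            lines.append(f"thought: {step.text}")
        elif isinstance(step, ToolCallStep):
            lines.append(f"call: {step.tool} {step.arguments}")
        elif isinstance(step, ToolResultStep):
            lines.append(f"result: {step.tool} -> {step.result}")
        elif isinstance(step, OutputStep):
            lines.append(f"output: {step.text}")
    return lines


steps = [
    ThoughtStep("Need current weather."),
    ToolCallStep("get_weather", {"city": "Seoul"}),
    ToolResultStep("get_weather", "22C, clear"),
    OutputStep("It is 22C and clear in Seoul."),
]

for line in summarize(steps):
    print(line)
```

The point of the design is that each kind of action carries its own fields, so a client can handle a tool call differently from a final answer without parsing free-form role messages.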

Flex cuts cost 50%, paid tier retains 55 days, legacy stays supported

The Flex tier reduces cost by 50%, and the paid tier retains data for 55 days. The Interactions API becomes the default across Google AI Studio and all official documentation. The legacy generateContent API remains fully supported, Google says.
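The two numbers above translate into simple arithmetic. The sketch below uses only the 50% and 55-day figures from the announcement; the monthly bill is a made-up example, and counting retention from the creation date is an assumption, since the summary does not say when the clock starts.

```python
from datetime import date, timedelta

FLEX_DISCOUNT = 0.50   # from the announcement: Flex reduces cost by 50%
RETENTION_DAYS = 55    # from the announcement: paid-tier retention period


def flex_cost(standard_cost):
    """Cost of the same workload after the Flex discount."""
    return standard_cost * (1 - FLEX_DISCOUNT)


def retention_end(created):
    """Last retention date, assuming the clock starts at creation (an assumption)."""
    return created + timedelta(days=RETENTION_DAYS)


standard_monthly = 1200.0  # hypothetical monthly bill on the standard tier
print("Flex cost:", flex_cost(standard_monthly))          # 600.0
print("Retained until:", retention_end(date(2026, 6, 24)))  # 2026-08-18
```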

What it means: APIs shift from one-off calls to a stateful runtime

The Interactions API is a sign that AI calls are shifting from one-off requests that resend state each time to a runtime where the server holds the conversation and the work. Because agents run code in sandboxes and work in the background for long stretches, the app only has to collect the result. That said, server-side state and 55-day retention bring data governance into the picture.
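The "submit, then collect the result" flow can be sketched with a local thread pool standing in for the remote runtime. Everything here is hypothetical: ToyBackgroundRuntime, its status strings, and the wait_for helper are illustration only and do not reflect the real API's method names or status values.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class ToyBackgroundRuntime:
    """Local stand-in for a server that runs work in the background (hypothetical)."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._jobs = {}
        self._next_id = 0

    def submit(self, task, *args):
        # Return an id immediately; the work continues in a worker thread.
        self._next_id += 1
        job_id = f"job-{self._next_id}"
        self._jobs[job_id] = self._pool.submit(task, *args)
        return job_id

    def status(self, job_id):
        return "completed" if self._jobs[job_id].done() else "in_progress"

    def result(self, job_id):
        return self._jobs[job_id].result()

    def shutdown(self):
        self._pool.shutdown()


def slow_research(topic):
    time.sleep(0.2)  # pretend this is a long-running agent task
    return f"report on {topic}"


def wait_for(runtime, job_id, interval=0.05, timeout=5.0):
    """Poll until the job completes, then collect its result."""
    deadline = time.monotonic() + timeout
    while runtime.status(job_id) != "completed":
        if time.monotonic() > deadline:
            raise TimeoutError(job_id)
        time.sleep(interval)
    return runtime.result(job_id)


runtime = ToyBackgroundRuntime()
job_id = runtime.submit(slow_research, "stateful APIs")
report = wait_for(runtime, job_id)
print(job_id, "->", report)
runtime.shutdown()
```

The app holds only a job id while the work runs; in a real deployment the polling interval and timeout would depend on how long the agent task is expected to take.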

Wrap-up

Google released the Interactions API, a unified interface for Gemini, to general availability. Server-held state and background execution, Managed Agents in Linux sandboxes, Nano Banana 2 and Lyria 3 media generation, and a 50% Flex discount with 55-day retention are the core. AI APIs are moving from one-off calls to a stateful runtime.

Source: ASAP summary of Google's Gemini Interactions API general availability announcement (public beta December 2025, now GA; server-held state and background=True async execution, Managed Agents in remote Linux sandboxes, built-in Google Search and Maps combined with custom functions, Deep Research in speed and depth versions, Nano Banana 2, Lyria 3 and multi-speaker TTS, Flex tier 50% cost reduction, paid-tier 55-day retention, Python and JavaScript SDKs, legacy generateContent fully supported).
