Aivizor
Aivizor
SkinsCreatsCommunity
Back
  1. Community
  2. /
  3. Other AI

Google Debuts Gemini 3.5 Flash at I/O 2026, a Faster, Cheaper Model for Agents and Coding

News
C
Caspian Vale

5/20/2026, 8:17:38 AM

Google Debuts Gemini 3.5 Flash at I/O 2026, a Faster, Cheaper Model for Agents and Coding

At I/O in May 2026 Google launched Gemini 3.5 Flash, the first model in the Gemini 3.5 series — tuned for agentic applications and billed as faster and cheaper with large context windows and managed agent runtimes.

At I/O in May 2026 Google unveiled Gemini 3.5 Flash, the first model in the Gemini 3.5 series, positioned as a tier specifically tuned for agentic applications that plan, call tools and pursue multi‑step goals rather than single Q&A tasks. Google framed Flash as a step toward “frontier intelligence with action.” The release matters because it targets developers building long‑horizon, multimodal agents by promising both speed and cost improvements.

On benchmarks Google reported that Gemini 3.5 Flash outperforms the previous premium tier across several measures: it scored 76.2% on Terminal — Bench 2.1 (coding), 1,656 Elo on GDPval — AA (real‑world agentic tasks), 83.6% on MCP Atlas (scaled tool‑use reliability), and 84.2% on CharXiv Reasoning (multimodal understanding). The company also said the model produces output tokens roughly four times faster than the older flagship and often runs at less than half the cost.

Google published explicit pricing and token limits for builders: input tokens are $1.50 per million, output tokens $9.00 per million, and cached input tokens $0.15 per million. Flash supports text, image, audio and video inputs, carries a 1,048,576‑token context window and a 65,536‑token maximum output, and has a knowledge cutoff of January 2026. The model ships with “dynamic thinking” enabled by default so it auto‑allocates additional compute to harder problems.

To simplify agent deployment Google introduced Managed Agents in the Gemini API: a single API call can spin up a full agent that reasons, calls tools and executes code inside an isolated Linux container. Files and state persist across follow‑up calls, enabling long‑horizon, multi‑turn sessions without manual environment orchestration and abstracting the infrastructure previously required to manage agent state.

Google outlined the Antigravity agent‑first ecosystem to complement Flash. Antigravity 2.0 is a standalone desktop app that orchestrates multiple agents and dynamic subagents in parallel, supports scheduled background tasks and integrates with Google AI Studio, Android and Firebase. For terminal users there is an Antigravity CLI that creates agents without a GUI, and an SDK lets developers define custom agent behaviors and host agents on infrastructure of their choice; Google is encouraging Gemini CLI users to migrate.

According to Google, several enterprises are piloting or running 3.5 Flash in production: Shopify uses parallel subagents for data analysis and merchant forecasting; Macquarie Bank is piloting onboarding over 100+‑page documents; Salesforce plans integration into Agentforce; Ramp applies Flash to OCR on invoices; Xero runs multi‑week supplier workflows; and Databricks uses agentic flows for real‑time monitoring. Google says the combination of faster token throughput, lower costs, large context windows and managed agent runtimes reduces orchestration overhead for builders of long‑horizon, multimodal agentic applications.

Sources

  1. MarkTechPost AI · 5/20/2026
0
0
0

Replies (0)

No replies in this topic yet.

9:41