3.1 Pro
Best for complex tasks and bringing creative concepts to life
3.5 Flash
Best for frontier performance across agents and coding
3.1 Flash-Lite
Best for high-volume tasks that need efficiency and intelligence
Agentic coding
Tackle complex development tasks with advanced reasoning at speed.
Advanced multimodal understanding
Transform text, images, video and audio into rich interactive user interfaces.
Long-horizon tasks
Execute sophisticated workflows over extended timeframes.
Multi-step problem-solving
Leverage advanced tools to solve demanding, real-world problems.
| Category | Benchmark | Setup | Gemini 3.5 Flash | Gemini 3 Flash | Gemini 3.1 Pro | Claude Sonnet 4.6 | Claude Opus 4.7 | GPT-5.5 |
|---|---|---|---|---|---|---|---|---|
| Coding | Terminal-bench 2.1: agentic terminal coding | Terminus-2 harness | 76.2% | 58.0% | 70.3% | — | 66.1% | 78.2% |
| Coding | SWE-Bench Pro (Public): diverse agentic coding tasks | Single attempt | 55.1% | 49.6% | 54.2% | — | 64.3% | 58.6% |
| Agentic | MCP Atlas: multi-step workflows using MCP | | 83.6% | 62.0% | 78.2% | 69.5% | 79.1% | 75.3% |
| Agentic | Toolathlon: real-world general tool use | | 56.5% | 49.4% | — | — | — | 55.6% |
| UI Control | OSWorld-Verified: agentic computer use | | 78.4% | 65.1% | 76.2% | 72.5% | 78.0% | 78.7% |
| Expert tasks | Finance Agent v2: financial analysis and decision-making | | 57.9% | 42.6% | 43.0% | 51.0% | 51.5% | 51.8% |
| Expert tasks | GDPval-AA: economically valuable knowledge work | Elo | 1656 | 1204 | 1314 | 1676 | 1753 | 1769 |
| Multimodal | CharXiv Reasoning: information synthesis from complex charts | No tools | 84.2% | 80.3% | 83.3% | 72.4% | 82.1% | 84.1% |
| Multimodal | MMMU-Pro: multimodal understanding and reasoning | No tools | 83.6% | 81.2% | 80.5% | 74.5% | 75.2% | 81.2% |
| Multimodal | Blueprint-Bench 2: agentic spatial reasoning | Normalized score | 33.6% | 0.0% | 26.5% | 6.7% | 24.5% | 36.2% |
| Long context | MRCR v2 (8-needle): long context performance | 128k (average) | 77.3% | 67.2% | 84.9% | 84.9% | 59.3% | 94.8% |
| Long context | MRCR v2 (8-needle): long context performance | 1M (pointwise) | 26.6% | 22.1% | 26.3% | — | — | — |
| Reasoning | Humanity’s Last Exam: academic reasoning (full set, text + MM) | | 40.2% | 33.7% | 44.4% | 33.2% | 46.9% | 41.4% |
| Reasoning | ARC-AGI-2: abstract reasoning puzzles | | 72.1% | 33.6% | 77.1% | 58.3% | 75.8% | 84.6% |
Code faster through iterative loops
See how Gemini 3.5 Flash generates six payment UI options in under 60 seconds.
Develop multiple creative concepts in parallel
See how Gemini 3.5 Flash can create 64 fractal variations at high speed.
Long-horizon agentic execution
See how Gemini 3.5 Flash ingests the AlphaGo paper and builds an intelligent game autonomously.
Accelerate the creation of brand assets
Watch how Gemini 3.5 Flash coordinates multiple workflows to generate and refine a brand for a fundraiser with minimal input.
Generate interactive web animations in a single shot
See how Gemini 3.5 Flash turns a text description into fully interactive HTML components.
Create music through real-time collaboration
See how Gemini 3.5 Flash coordinates multiple agents to create a song using the Strudel music library.
Orchestrate multi-agent workflows
Watch Gemini 3.5 Flash coordinate a team of specialized agents to design and build a virtual city.
Organize large file collections efficiently
See how Gemini 3.5 Flash deploys parallel agents to automatically rename and structure messy datasets.
Build and improve a game with interactive agent loops
Watch Gemini 3.5 Flash deploy agents to continuously refine a game in real time.
See how Shopify is using Gemini 3.5 Flash
Shopify is running subagents in parallel to analyze complex data over a long horizon for more accurate merchant growth forecasts at a global scale.
See how Macquarie Bank is using Gemini 3.5 Flash
Macquarie Bank is piloting how 3.5 Flash can accelerate customer onboarding by reasoning over complex 100+ page documents, retrieving relevant information and making reliable recommendations with low latency.
See how Salesforce is using Gemini 3.5 Flash
Salesforce is integrating 3.5 Flash into Agentforce to reliably automate complicated enterprise tasks by deploying multiple subagents that retain context and execute complex, multi-turn tool calling.
See how Ramp is using Gemini 3.5 Flash
3.5 Flash is helping Ramp enable smarter, more reliable OCR through multimodal understanding of complex invoices combined with reasoning over historical patterns.
See how Xero is using Gemini 3.5 Flash
Xero is deploying agents to autonomously manage complex, multi-week workflows, such as identifying suppliers and gathering information for 1099 tax forms, enabling small businesses to automate tedious admin tasks.
See how Databricks is using Gemini 3.5 Flash
Databricks is using agentic workflows to monitor and retrieve real-time information, reason across massive datasets to diagnose issues, and propose fixes for data scientists.
Google Antigravity
Our AI-first development platform that allows anyone to be a builder
Google AI Studio
Leap from prompt to production
Gemini API
Get started building with cutting-edge AI models
Gemini Enterprise Agent Platform
Build, scale, and govern agents
Gemini 3.1 Deep Think
Best for modern challenges across science, research and engineering
Gemini Omni
Create anything from anything, starting with video
Gemini Image (Nano Banana)
State-of-the-art image generation and editing models, built on Gemini
Gemini Audio
Advanced real-time audio models, built on Gemini
Gemini Robotics
Our most advanced vision-language-action model
Gemini Embedding 2
State-of-the-art multimodal embedding model
Gemini
Supercharge your creativity and productivity
AI Mode
Ask whatever's on your mind to get an AI-powered response
Google AI Studio
The fastest path from prompt to production
Google Antigravity
Our AI-first development platform that allows anyone to be a builder
Gemini API
Get started building with cutting-edge AI models
Gemini Enterprise Agent Platform
Build, scale, and govern agents

