Summary of "Gemini 3.5 Flash is just... fine"
Summary of technological concepts, features, and analysis
Gemini 3.5 Flash release (main topic)
-
Claimed performance vs. cost (marketing): Google positions Gemini 3.5 Flash as having “frontier” performance at ~4× the speed and under half the cost of competitors. The video argues real-world results don’t match these claims.
-
Model specs / modalities:
- Context window: 1 million tokens
- Output limit: 64,000 output tokens
- Inputs supported: text, images, video, audio, and PDFs
- Emphasized as strong on multimodal workloads (consistent with Google’s historical strengths)
-
Coding & benchmark performance (mixed, per the speaker):
- On Google’s own benchmarks, Gemini 3.5 Flash is reported as:
- roughly in line with GPT 5.5 for coding, with small differences on SWE-Bench Pro and Terminal-Bench
- outperforming Opus 4.7 on Terminal-Bench by ~10%
- while Claude Opus beats Gemini on SWE-Bench Pro by ~10%
- Agentic benchmarks: described as winning on the MCP and Toolathlon benchmarks (per the speaker)
-
Third-party benchmark discrepancy (where results “worsen”):
- Using Artificial Analysis (third-party), coding performance is described as:
- ~45 on the “coding index”
- below models like Kimi K2.6
- not beating Gemini 3.1 Pro, despite Google’s internal benchmarks showing the opposite
- only slightly above Gemini 3 Flash
-
Speed is the standout:
- Reported 278 tokens/second, described as significantly faster than Opus 4.7, GPT 5.5, Haiku, and some open models.
- The speaker concludes it offers the best “intelligence vs. speed” balance if speed is the priority.
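To put the reported throughput in perspective, here is a minimal sketch that converts it into wall-clock time for a maximum-length response. It uses only the two figures quoted above (278 tokens/second and the 64,000-token output limit); the function name is illustrative, and the calculation assumes a constant rate while ignoring time-to-first-token and network latency.

```python
# Figures quoted in the summary above.
TOKENS_PER_SECOND = 278      # reported throughput
MAX_OUTPUT_TOKENS = 64_000   # stated output limit


def generation_seconds(output_tokens: int,
                       tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Streaming time for a response at a constant rate.

    Assumption: ignores time-to-first-token and network latency.
    """
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    seconds = generation_seconds(MAX_OUTPUT_TOKENS)
    print(f"{seconds:.0f} s (~{seconds / 60:.1f} min) for a maximum-length response")
```

Under these assumptions a full 64,000-token response streams in a little under four minutes.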
-
Cost analysis (major critique):
- Published prices:
- $1.50 per 1M input tokens
- $9 per 1M output tokens
- Measured cost (via Artificial Analysis), per the speaker:
- $1,552 to run the “intelligence index”
- ~5.5× more expensive than Gemini 3 Flash
- ~75% more expensive than Gemini 3.1 Pro
- more expensive than GPT 5.5 on high reasoning (the speaker says GPT 5.5 beats Flash on coding)
- Conclusion: the video argues it is not “half the cost” in practice and can cost more than models that also code better.
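The published per-token prices translate into a per-request cost as sketched below. The two prices come from the list above; the function name and the example workload (200,000 input tokens, 20,000 output tokens) are hypothetical and chosen only to illustrate the arithmetic. The sketch ignores any caching or batch discounts, which the summary does not mention.

```python
INPUT_PRICE_PER_M = 1.50   # USD per 1M input tokens (published price)
OUTPUT_PRICE_PER_M = 9.00  # USD per 1M output tokens (published price)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the published list prices.

    Assumption: no caching or batch discounts are applied.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload, not a figure from the video.
    print(f"${request_cost(200_000, 20_000):.2f}")
```

The point of the critique is that list prices alone do not determine the bill: the number of tokens a model consumes to finish a task matters just as much.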
-
Token-hungry behavior (why cost rises):
- In agentic evaluation, it averaged ~49 turns per task, suggesting it burns through input tokens.
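A rough sketch of why a high turn count inflates input-token usage: if every turn re-sends the whole conversation so far, billed input grows roughly quadratically with the number of turns. The 49-turn figure and the $1.50 input price come from the summary; the re-send behavior, the starting prompt size, and the tokens added per turn are assumptions made purely for illustration, not measurements from the video.

```python
INPUT_PRICE_PER_M = 1.50  # USD per 1M input tokens (published price)
TURNS_PER_TASK = 49       # average turns per task reported in the summary


def cumulative_input_tokens(turns: int, base_prompt_tokens: int,
                            tokens_added_per_turn: int) -> int:
    """Total input tokens billed across a task.

    Assumption: each turn re-sends the full context, and the context
    grows by a fixed amount per turn (model output plus tool results).
    Prompt caching is ignored.
    """
    total = 0
    context = base_prompt_tokens
    for _ in range(turns):
        total += context                  # this turn is billed for the whole context
        context += tokens_added_per_turn  # context grows before the next turn
    return total


if __name__ == "__main__":
    # Hypothetical sizes: 2,000-token starting prompt, 500 tokens added per turn.
    tokens = cumulative_input_tokens(TURNS_PER_TASK, 2_000, 500)
    cost = tokens / 1_000_000 * INPUT_PRICE_PER_M
    print(f"{tokens:,} input tokens, ${cost:.3f} per task")
```

With these hypothetical sizes, 49 turns bill 686,000 input tokens even though the context never exceeds about 26,000 tokens, which is how a cheap per-token price can still produce an expensive run.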
-
Overall verdict on Gemini 3.5 Flash:
- Labeled as “meh” / mixed bag
- Suggested mainly for agents (speed + agent benchmarks)
- Not recommended as an overall best coding model due to:
- weak coding quality vs alternatives
- potentially unfavorable cost
Second major announcement: “Antigravity 2” + new CLI
-
Antigravity 2 app (standalone agent IDE/app):
- The speaker says it’s become hard to distinguish from other “agent coding” tools (compared visually to Codex/Cursor/etc.).
- UI elements described:
- Conversations (left)
- Projects
- Scheduled tasks
- Ability to open files and view diffs
-
Key point: it is no longer an “Antigravity IDE” but a standalone app.
-
Test prompts demoed:
- Cafe website (simple prompt):
- Produces a functional single index.html
- Speaker likes the UI/design output, suggesting Gemini 3.5 Flash performs well at UI design
- Notes it can have an “AI look” (e.g., card/gradient style)
- Claims it looks better than what Opus 4.7 produced in a one-off test
- Full-stack personal finance dashboard (complex prompt):
- The app “works,” but UI looks AI-generated
- Negative reaction to the name “Aura Wealth”
- Speaker says Opus 4.7 produced a much nicer UI
- Time spent: ~20 minutes for Opus vs. ~5 minutes for Gemini, implying Flash is faster but spends too little time polishing the UI
-
Antigravity CLI:
- Gemini CLI shutdown: speaker says Gemini CLI will be unavailable after June 18
- New CLI characteristics:
- rewritten in Go
- closed source
- functionally similar to the old CLI at launch (no major new capability demonstrated)
- Speaker frames it as “Killed by Google”-style churn and expresses disappointment about the closure of the codebase.
Product/market positioning conclusion
- The speaker suggests Google may be prioritizing everyday consumer integration (Gmail, Search, Workspace, Android, etc.) more than being the top developer/coding leader.
- Expectation:
- Gemini 3.5 Flash is not the clear coding winner
- Gemini 3.5 Pro is mentioned as coming next month as a potential improvement
Main speakers/sources
- Main source/speaker: the video narrator/reviewer (author of the analysis)
- Referenced/compared model providers:
- Google (own benchmarks)
- Artificial Analysis (third-party benchmarks)
- Competitors: OpenAI (GPT 5.5), Anthropic (Claude Opus 4.7)
- Others mentioned: Kimi K2.6, Haiku, and open-source OpenAI models
- Product sources mentioned:
- Gemini CLI, Antigravity 2, Antigravity CLI
- Other coding-agent apps referenced by name: Codex, Cursor
Category
Technology