Summary of "Gemini 3.5 Flash is just... fine"

Summary of technological concepts, features, and analysis

Gemini 3.5 Flash release (main topic)

Claimed performance vs. cost (marketing): Google positions Gemini 3.5 Flash as having “frontier” performance at ~4× the speed and under half the cost of competitors. The video argues real-world results don’t match these claims.
Model specs / modalities:
- Context window: 1 million tokens
- Output limit: 64,000 output tokens
- Inputs supported: text, images, video, audio, and PDFs
- Emphasized as strong on multimodal workloads (consistent with Google’s historical strengths)
Coding & benchmark performance (mixed, per the speaker):
- On Google’s own benchmarks, Gemini 3.5 Flash is reported as:
  - roughly in line with GPT 5.5 for coding, with small differences on SWE-Bench Pro and Terminal-Bench
  - outperforming Opus 4.7 on Terminal-Bench by ~10%
  - while Claude Opus beats Gemini on SWE-Bench Pro by ~10%
- Agentic benchmarks: described as winning on the MCP and Toolathlon benchmarks (per the speaker)
Third-party benchmark discrepancy (where results “worsen”):
- Using Artificial Analysis (third-party), coding performance is described as:
  - ~45 on the “coding index”
  - below models like Kimi K2.6
  - not beating Gemini 3.1 Pro, despite Google’s internal benchmarks showing the opposite
  - only slightly above Gemini 3 Flash
Speed is the standout:
- Reported 278 tokens/second, described as significantly faster than Opus 4.7, GPT 5.5, Haiku, and some open models.
- The speaker concludes it offers the best “intelligence vs. speed” balance if speed is the priority.
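To put the reported throughput in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the 278 tokens/second figure holds constant for the whole response and ignores time to first token, neither of which the video confirms. The function name is illustrative only.

```python
# Rough estimate of generation time at the reported throughput.
# Assumption: throughput is constant and time-to-first-token is ignored.
TOKENS_PER_SECOND = 278      # reported speed from the summary
MAX_OUTPUT_TOKENS = 64_000   # output limit from the model specs


def generation_seconds(output_tokens: int,
                       tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Seconds needed to stream `output_tokens` at a fixed throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    seconds = generation_seconds(MAX_OUTPUT_TOKENS)
    print(f"Full 64,000-token response: {seconds:.0f} s (~{seconds / 60:.1f} min)")
```

Under these assumptions, a maximum-length response would take a little under four minutes to stream.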
Cost analysis (major critique):
- Published prices:
  - $1.50 per 1M input tokens
  - $9 per 1M output tokens
- Measured cost (via Artificial Analysis), per the speaker:
  - $1,552 to run the “intelligence index”
  - ~5.5× more expensive than Gemini 3 Flash
  - ~75% more expensive than Gemini 3.1 Pro
  - more expensive than GPT 5.5 on high reasoning (the speaker says GPT 5.5 beats Flash on coding)
- Conclusion: the video argues it is not “half the cost” and can be worse than cheaper models that also code better.
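The arithmetic behind this critique can be sketched in Python. The list prices and the $1,552 total come from the summary; the comparison-model costs are not stated in it and are only back-calculated here. This sketch reads "~5.5× more expensive" as 5.5 times the comparison cost and "~75% more expensive" as 1.75 times, which is an assumption about the speaker's wording. Function names are illustrative.

```python
# Published list prices from the summary (USD per 1M tokens).
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """List-price cost in USD for a single request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


def implied_baseline(total_usd: float, times_as_expensive: float) -> float:
    """Back-calculate a comparison model's cost from a stated multiple."""
    return total_usd / times_as_expensive


if __name__ == "__main__":
    # A full 1M-token context with a maximum 64,000-token reply.
    print(f"Max-size request: ${request_cost(1_000_000, 64_000):.3f}")

    measured = 1552.0  # reported cost to run the intelligence index
    # Assumption: "5.5x more expensive" means 5.5 times the comparison cost.
    print(f"Implied cost, 5.5x comparison: ${implied_baseline(measured, 5.5):.2f}")
    # Assumption: "75% more expensive" means 1.75 times the comparison cost.
    print(f"Implied cost, Gemini 3.1 Pro: ${implied_baseline(measured, 1.75):.2f}")
```

The point of the comparison is that a low per-token price says little about total spend when the model consumes far more tokens per task.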
Token-hungry behavior (why cost rises):
- In agentic evaluation, it averaged ~49 turns per task. Each turn typically re-sends the accumulated conversation as input, so a high turn count burns through input tokens quickly.
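A minimal Python sketch of why turn count matters for cost. It assumes each turn re-sends the entire accumulated context as input and that no prompt caching applies. The starting context size and per-turn growth are made-up illustrative numbers, not figures from the video. Only the 49-turn average and the $1.50 input price come from the summary.

```python
# Hypothetical model of input-token growth in a multi-turn agent loop.
# Assumption: every turn re-sends the whole history; no prompt caching.
INPUT_PRICE_PER_M = 1.50  # USD per 1M input tokens (published price)


def total_input_tokens(turns: int, base_tokens: int, added_per_turn: int) -> int:
    """Sum the input tokens billed across all turns of one task."""
    total = 0
    context = base_tokens
    for _ in range(turns):
        total += context           # the whole context is billed this turn
        context += added_per_turn  # tool output and replies extend the history
    return total


if __name__ == "__main__":
    # 49 turns is the reported average; 2,000 and 1,000 are illustrative only.
    tokens = total_input_tokens(turns=49, base_tokens=2_000, added_per_turn=1_000)
    cost = tokens * INPUT_PRICE_PER_M / 1_000_000
    print(f"{tokens:,} input tokens, about ${cost:.2f} per task")
```

Because the context grows every turn, billed input grows roughly with the square of the turn count in this model, which is one plausible reading of the gap between list price and measured cost.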
Overall verdict on Gemini 3.5 Flash:
- Labeled as “meh” / mixed bag
- Suggested mainly for agents (speed + agent benchmarks)
- Not recommended as an overall best coding model due to:
  - weak coding quality vs alternatives
  - potentially unfavorable cost

Second major announcement: “Antigravity 2” + new CLI

Antigravity 2 app (standalone agent IDE/app):
- The speaker says it’s become hard to distinguish from other “agent coding” tools (compared visually to Codex/Cursor/etc.).
- UI elements described:
  - Conversations (left)
  - Projects
  - Scheduled tasks
  - Ability to open files and view diffs
- Key point: it’s no longer the “Antigravity IDE” but a standalone app.
- Test prompts demoed:
  - Cafe website (simple prompt):
    - Produces a functional single index.html
    - Speaker likes the UI/design output, suggesting Gemini 3.5 Flash performs well at UI design
    - Notes it can have an “AI look” (e.g., card/gradient style)
    - Claims it looks better than what Opus 4.7 produced in a one-off test
  - Full-stack personal finance dashboard (complex prompt):
    - The app “works,” but UI looks AI-generated
    - Negative reaction to the name “Aura Wealth”
    - Speaker says Opus 4.7 produced a much nicer UI
    - Also mentions time spent: ~20 minutes for Opus vs ~5 minutes for Gemini, implying Flash is faster but doesn’t invest enough time to polish UI
Antigravity CLI:
- Gemini CLI shutdown: speaker says Gemini CLI will be unavailable after June 18
- New CLI characteristics:
  - rewritten in Go
  - closed source
  - functionally similar to the old CLI at launch (no major new capability demonstrated)
- Speaker frames it as “Killed by Google”-style churn and expresses disappointment that the new CLI is closed source.

Product/market positioning conclusion

The speaker suggests Google may be prioritizing everyday consumer integration (Gmail, Search, Workspace, Android, etc.) more than being the top developer/coding leader.
Expectation:
- Gemini 3.5 Flash is not the clear coding winner
- Gemini 3.5 Pro is mentioned as coming next month as a potential improvement

Main speakers/sources

Main source/speaker: the video narrator/reviewer (author of the analysis)
Referenced model providers and benchmark sources:
- Google (own benchmarks)
- Artificial Analysis (third-party benchmarks)
- Competitors: OpenAI (GPT 5.5), Anthropic (Claude Opus 4.7)
- Others mentioned: Kimi K2.6, Haiku, and open-source OpenAI models
Product sources mentioned:
- Gemini CLI, Antigravity 2, Antigravity CLI
- Other coding-agent apps referenced by name: Codex, Cursor