Summary of "'VOLTEI AQUI PORQUE PRECISO FALAR ISSO': A GRANDE VIRADA da INTELIGÊNCIA ARTIFICIAL [com FABIO AKITA]" ("I Came Back Because I Need to Say This": THE GREAT TURNAROUND of ARTIFICIAL INTELLIGENCE [with FABIO AKITA])
Context and meta
- The presenter describes a turning point in practical usefulness of LLMs (late 2025 → early 2026). After more than 500 hours of hands‑on experiments he open‑sourced and documented the work (≈16 GitHub repositories, ~300k LOC) and published many blog posts and step‑by‑step guides.
- Emphasis: moving from theory and futurology to reproducible, open tutorials and runnable projects so others can inspect, reproduce, and contribute.
What changed (technical pivot)
- Tool calling / harnesses: Models are now trained to call external tools/functions instead of returning only text. LLMs can emit structured function calls (for example `create_file`) that a harness executes, and results are fed back into the model loop. This is the practical enabler for automation beyond natural-language output.
- Architecture separation: LLM = "deep thinking" / planning; harness = dumb executor that runs functions and returns results. This separation enables persistent multi-step workflows, parallel tasks, and real execution on local machines or CI systems.
- System prompts / skills: Persistent system prompts (the harness context) and user-created skills (persona/config files) encode recurring behaviour and improve consistency across sessions.
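The tool-calling loop described above can be sketched in a few lines. This is a minimal illustration, not any vendor's API: `fake_model` stands in for a real LLM call, and the `create_file` tool name and message format are assumptions for the example.

```python
# Hypothetical tool registry: the harness's "dumb executor" side.
# The create_file name is illustrative; each vendor uses its own schema.
def create_file(path: str, content: str) -> str:
    with open(path, "w") as f:
        f.write(content)
    return f"created {path}"

TOOLS = {"create_file": create_file}

def fake_model(messages):
    """Stand-in for an LLM API call. Emits one structured tool call,
    then a final text answer once it sees the tool result."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "create_file",
                "args": {"path": "notes.txt", "content": "hello"}}
    return {"text": "Done: notes.txt was created."}

def run_harness(user_prompt: str) -> str:
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        reply = fake_model(messages)
        if "tool" in reply:  # model asked for an action, not text
            result = TOOLS[reply["tool"]](**reply["args"])
            # feed the execution result back into the model loop
            messages.append({"role": "tool", "content": result})
        else:
            return reply["text"]

print(run_harness("Create a notes file"))
# Done: notes.txt was created.
```

The point is the separation: the model only ever emits structured requests; the harness performs the side effects and reports back.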
Practical product comparisons and behavior
- Claude (Anthropic) / "Opus" / cloud-style tools
- More disciplined and predictable.
- Better at decomposing complex plans into parallel tasks and continuing them if interrupted.
- Reliable for trusting long‑running tasks and context continuity.
- Codex / OpenAI
- Historically focused on parameter scaling.
- Less reliable for multi‑task or parallel interactions — may forget subtasks and require reminders.
- Newer versions improve, but behavior varies by model and harness fit.
- Google (Gemini / related / Antigravity)
- Supports tool calling and integrations and uses its own ecosystem/harness language.
- Cross-vendor notes
- Vendors use different tool-call formats and function names (for example `create_file` vs `file_new` vs `NewFile`). Models are trained against a specific harness language; a mismatch between LLM and harness degrades performance.
- Open-source harnesses attempt to map between formats, but fidelity can be lost if the LLM wasn't trained for the target harness.
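The cross-vendor mapping problem can be sketched as a name-translation table. The dialect labels and mappings below are invented for illustration; real harness schemas also differ in argument names and nesting, which is exactly where fidelity is lost.

```python
# Illustrative mapping between hypothetical vendor tool-call dialects.
# Only the function-name layer is shown; real schemas differ more deeply.
NAME_MAP = {
    "anthropic_style": {"create_file": "create_file"},
    "openai_style":    {"create_file": "file_new"},
    "other_style":     {"create_file": "NewFile"},
}

def translate_call(call: dict, target: str) -> dict:
    """Rewrite a canonical tool call into a target harness's dialect.
    A model not trained for the target dialect may still emit calls
    the map cannot cover, which degrades end-to-end performance."""
    mapped = NAME_MAP[target].get(call["name"])
    if mapped is None:
        raise KeyError(f"no mapping for {call['name']} in {target}")
    return {"name": mapped, "args": call["args"]}

call = {"name": "create_file", "args": {"path": "a.txt"}}
print(translate_call(call, "openai_style"))
# {'name': 'file_new', 'args': {'path': 'a.txt'}}
```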
Developer experience, guides, and tool notes
- The presenter published detailed, step-by-step coverage (including a teardown of a leaked Claude Code build), showing real code and how to use tool calling, skills, and session handoff.
- Examples of useful automations demonstrated:
- Organizing a downloads folder.
- A ROM organizer.
- Generating UI/UX files via a saved skill that can be reloaded for later tasks.
- Session continuity:
- Harnesses/apps expose session IDs so you can resume context across sessions or switch harnesses by exporting/importing the context file.
- Vendor lock‑in and economics:
- Subscription models and token limits affect choice and feasibility. Some providers restrict usage patterns or block non‑approved integrations, complicating a unified tool approach; attempting to route a model through an unsupported framework may result in bans.
- Code quality note:
- The Claude Code leak revealed messy, spaghetti code, a sign teams are moving fast to release features.
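The session-continuity idea (resume by session ID, or hand a context file to another harness) can be sketched as follows. The JSON file layout and field names are assumptions for illustration, not any vendor's actual format.

```python
import json
import pathlib
import uuid

# Minimal sketch of session continuity, assuming a harness that dumps
# its conversation context to a JSON file keyed by a session ID.
def save_session(messages: list, directory: str = ".") -> str:
    session_id = uuid.uuid4().hex
    path = pathlib.Path(directory) / f"session-{session_id}.json"
    path.write_text(json.dumps({"id": session_id, "messages": messages}))
    return session_id

def resume_session(session_id: str, directory: str = ".") -> list:
    path = pathlib.Path(directory) / f"session-{session_id}.json"
    data = json.loads(path.read_text())
    # hand this context back to the same harness, or export it to another
    return data["messages"]

sid = save_session([{"role": "user", "content": "organize my downloads"}])
print(resume_session(sid))
```

Because the context lives in a portable file rather than only in vendor state, switching harnesses becomes an export/import step instead of retyping prompts.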
Key technical takeaways
- Tool calling and trained function outputs are the practical enabler for real automation (not just text).
- Harnesses execute functions; LLMs must be trained to speak the harness's "language" for reliable results.
- Different vendors’ function schemas cause integration and performance issues; mixing LLM + harness often worsens output unless mappings are well aligned.
- Skills (saved persona/config) plus system prompts let you persist specialized behaviour across sessions and projects.
- Session IDs and resume support maintain project context without rewriting prompts.
- Open‑sourcing experiments and publishing step‑by‑step guides is valuable for reproducibility and community contribution.
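The "skills plus system prompts" takeaway can be sketched as saved persona files that get concatenated into the system prompt on load. The `skills/` directory layout and Markdown file format below are assumptions, not any vendor's spec.

```python
import pathlib

# Sketch of "skills" as saved persona/config files, assuming a harness
# that concatenates them into the system prompt at session start.
def save_skill(name: str, instructions: str, root: str = "skills") -> None:
    d = pathlib.Path(root)
    d.mkdir(exist_ok=True)
    (d / f"{name}.md").write_text(instructions)

def build_system_prompt(base: str, skill_names: list,
                        root: str = "skills") -> str:
    parts = [base]
    for name in skill_names:
        parts.append((pathlib.Path(root) / f"{name}.md").read_text())
    # the combined prompt persists the specialized behaviour across sessions
    return "\n\n".join(parts)

save_skill("uiux", "Always generate UI mockups as single-file HTML.")
print(build_system_prompt("You are a coding assistant.", ["uiux"]))
```

This mirrors the UI/UX skill example above: save the behaviour once, reload it for later tasks instead of re-explaining it each session.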
Reviews, tutorials, and artifacts referenced
- Extensive benchmarks and step‑by‑step writeups on the presenter’s blog.
- 16 open‑source GitHub repositories (public, ~300k LOC) with runnable code and contribution opportunities.
- Coverage and analysis of the Claude Code leak that explains internals and code quality.
- Comparative, hands‑on testing across:
- Anthropic Claude / Opus / Claude Code
- OpenAI GPT / Codex
- Google Gemini / Antigravity (and related)
- Various CLI and cloud integrations
Main speakers and sources mentioned
- Main speaker: Fabio Akita (presenter of the video).
- Platforms and vendors referenced: Anthropic (Claude / Opus), OpenAI (GPT, Codex), Google (Gemini / Antigravity / Cria), plus mentions of Alibaba, Tencent, and various cloud/harness ecosystems.