Summary of "'VOLTEI AQUI PORQUE PRECISO FALAR ISSO': A GRANDE VIRADA da INTELIGÊNCIA ARTIFICIAL [com FABIO AKITA]" ("I Came Back Because I Need to Say This": THE GREAT TURNAROUND of ARTIFICIAL INTELLIGENCE [with FABIO AKITA])
Context and meta
- The presenter describes a turning point in practical usefulness of LLMs (late 2025 → early 2026). After more than 500 hours of hands‑on experiments he open‑sourced and documented the work (≈16 GitHub repositories, ~300k LOC) and published many blog posts and step‑by‑step guides.
- Emphasis: moving from theory and futurology to reproducible, open tutorials and runnable projects so others can inspect, reproduce, and contribute.
What changed (technical pivot)
- Tool calling / harnesses: Models are now trained to call external tools/functions instead of returning only text. LLMs can emit structured function calls (for example `create_file`) that a harness executes, and results are fed back into the model loop. This is the practical enabler for automation beyond natural-language output.
- Architecture separation: LLM = "deep thinking" / planning; harness = dumb executor that runs functions and returns results. This separation enables persistent multi-step workflows, parallel tasks, and real execution on local machines or CI systems.
- System prompts / skills: Persistent system prompts (the harness context) and user-created skills (persona/config files) encode recurring behaviour and improve consistency across sessions.
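The tool-calling loop described above can be sketched in a few lines. This is a minimal illustration, not any vendor's API: `fake_model` stands in for a real LLM call, and the `create_file` tool name and message format are assumptions for the example.

```python
# Hypothetical tool registry: the harness's "dumb executor" side.
# The create_file name is illustrative; each vendor uses its own schema.
def create_file(path: str, content: str) -> str:
    with open(path, "w") as f:
        f.write(content)
    return f"created {path}"

TOOLS = {"create_file": create_file}

def fake_model(messages):
    """Stand-in for an LLM API call. Emits one structured tool call,
    then a final text answer once it sees the tool result."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "create_file",
                "args": {"path": "notes.txt", "content": "hello"}}
    return {"text": "Done: notes.txt was created."}

def run_harness(user_prompt: str) -> str:
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        reply = fake_model(messages)
        if "tool" in reply:  # model asked for an action, not text
            result = TOOLS[reply["tool"]](**reply["args"])
            # feed the execution result back into the model loop
            messages.append({"role": "tool", "content": result})
        else:
            return reply["text"]

print(run_harness("Create a notes file"))
# Done: notes.txt was created.
```

The point is the separation: the model only ever emits structured requests; the harness performs the side effects and reports back.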
Practical product comparisons and behavior
- Claude (Anthropic) / "Opus" / cloud-style tools
- More disciplined and predictable.
- Better at decomposing complex plans into parallel tasks and continuing them if interrupted.
- Reliable for trusting long‑running tasks and context continuity.
- Codex / OpenAI
- Historically focused on parameter scaling.
- Less reliable for multi‑task or parallel interactions — may forget subtasks and require reminders.
- Newer versions improve, but behavior varies by model and harness fit.
- Google (Gemini / related / Antigravity)
- Supports tool calling and integrations and uses its own ecosystem/harness language.
- Cross-vendor notes
- Vendors use different tool-call formats and function names (for example `create_file` vs `file_new` vs `NewFile`). Models are trained against a specific harness language; a mismatch between LLM and harness degrades performance.
- Open-source harnesses attempt to map between formats, but fidelity can be lost if the LLM wasn't trained for the target harness.
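The cross-vendor mapping problem can be sketched as a name-translation table. The dialect labels and mappings below are invented for illustration; real harness schemas also differ in argument names and nesting, which is exactly where fidelity is lost.

```python
# Illustrative mapping between hypothetical vendor tool-call dialects.
# Only the function-name layer is shown; real schemas differ more deeply.
NAME_MAP = {
    "anthropic_style": {"create_file": "create_file"},
    "openai_style":    {"create_file": "file_new"},
    "other_style":     {"create_file": "NewFile"},
}

def translate_call(call: dict, target: str) -> dict:
    """Rewrite a canonical tool call into a target harness's dialect.
    A model not trained for the target dialect may still emit calls
    the map cannot cover, which degrades end-to-end performance."""
    mapped = NAME_MAP[target].get(call["name"])
    if mapped is None:
        raise KeyError(f"no mapping for {call['name']} in {target}")
    return {"name": mapped, "args": call["args"]}

call = {"name": "create_file", "args": {"path": "a.txt"}}
print(translate_call(call, "openai_style"))
# {'name': 'file_new', 'args': {'path': 'a.txt'}}
```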
Developer experience, guides, and tool notes
- The presenter published detailed, step-by-step coverage (including a teardown of a leaked Claude Code build), showing real code and how to use tool calling, skills, and session handoff.
- Examples of useful automations demonstrated:
- Organizing a downloads folder.
- A ROM organizer.
- Generating UI/UX files via a saved skill that can be reloaded for later tasks.
- Session continuity:
- Harnesses/apps expose session IDs so you can resume context across sessions or switch harnesses by exporting/importing the context file.
- Vendor lock‑in and economics:
- Subscription models and token limits affect choice and feasibility. Some providers restrict usage patterns or block non‑approved integrations, complicating a unified tool approach; attempting to route a model through an unsupported framework may result in bans.
- Code quality note:
- The Claude Code leak revealed messy, spaghetti code, a sign teams are moving fast to release features.
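The session-continuity idea (resume by session ID, or hand a context file to another harness) can be sketched as follows. The JSON file layout and field names are assumptions for illustration, not any vendor's actual format.

```python
import json
import pathlib
import uuid

# Minimal sketch of session continuity, assuming a harness that dumps
# its conversation context to a JSON file keyed by a session ID.
def save_session(messages: list, directory: str = ".") -> str:
    session_id = uuid.uuid4().hex
    path = pathlib.Path(directory) / f"session-{session_id}.json"
    path.write_text(json.dumps({"id": session_id, "messages": messages}))
    return session_id

def resume_session(session_id: str, directory: str = ".") -> list:
    path = pathlib.Path(directory) / f"session-{session_id}.json"
    data = json.loads(path.read_text())
    # hand this context back to the same harness, or export it to another
    return data["messages"]

sid = save_session([{"role": "user", "content": "organize my downloads"}])
print(resume_session(sid))
```

Because the context lives in a portable file rather than only in vendor state, switching harnesses becomes an export/import step instead of retyping prompts.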
Key technical takeaways
- Tool calling and trained function outputs are the practical enabler for real automation (not just text).
- Harnesses execute functions; LLMs must be trained to speak the harness's "language" for reliable results.
- Different vendors’ function schemas cause integration and performance issues; mixing LLM + harness often worsens output unless mappings are well aligned.
- Skills (saved persona/config) plus system prompts let you persist specialized behaviour across sessions and projects.
- Session IDs and resume support maintain project context without rewriting prompts.
- Open‑sourcing experiments and publishing step‑by‑step guides is valuable for reproducibility and community contribution.
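The "skills plus system prompts" takeaway can be sketched as saved persona files that get concatenated into the system prompt on load. The `skills/` directory layout and Markdown file format below are assumptions, not any vendor's spec.

```python
import pathlib

# Sketch of "skills" as saved persona/config files, assuming a harness
# that concatenates them into the system prompt at session start.
def save_skill(name: str, instructions: str, root: str = "skills") -> None:
    d = pathlib.Path(root)
    d.mkdir(exist_ok=True)
    (d / f"{name}.md").write_text(instructions)

def build_system_prompt(base: str, skill_names: list,
                        root: str = "skills") -> str:
    parts = [base]
    for name in skill_names:
        parts.append((pathlib.Path(root) / f"{name}.md").read_text())
    # the combined prompt persists the specialized behaviour across sessions
    return "\n\n".join(parts)

save_skill("uiux", "Always generate UI mockups as single-file HTML.")
print(build_system_prompt("You are a coding assistant.", ["uiux"]))
```

This mirrors the UI/UX skill example above: save the behaviour once, reload it for later tasks instead of re-explaining it each session.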
Reviews, tutorials, and artifacts referenced
- Extensive benchmarks and step‑by‑step writeups on the presenter’s blog.
- 16 open‑source GitHub repositories (public, ~300k LOC) with runnable code and contribution opportunities.
- Coverage and analysis of the Claude Code leak that explains internals and code quality.
- Comparative, hands‑on testing across:
- Anthropic Claude / Opus / Claude Code
- OpenAI GPT / Codex
- Google Gemini / Antigravity (and related)
- Various CLI and cloud integrations
Main speakers and sources mentioned
- Main speaker: Fabio Akita (presenter of the video).
- Platforms and vendors referenced: Anthropic (Claude / Opus), OpenAI (GPT, Codex), Google (Gemini / Antigravity / Cria), plus mentions of Alibaba, Tencent, and various cloud/harness ecosystems.