Summary of "Stop coding AI: Use Runtime Topological Self-Assembly (UC, DeepMind)"
High-level thesis
Two recent papers argue we should stop manually coding agent architectures and instead use LLMs as optimizers that mutate and assemble discrete symbolic graphs (programs, execution graphs) at runtime. Real-world algorithms and system architectures are discrete, non-differentiable, and highly compositional (e.g., ASTs, control flow), so continuous, gradient-based tuning of dense vectors is often the wrong abstraction. LLMs can instead act as smart genetic/mutation operators over those discrete structures to discover new algorithms and agent topologies.
Briefly: treat LLMs not as end-to-end solvers but as mutation/optimizer operators over discrete programmatic structures to discover micro- and macro-architectural innovations.
Paper 1 — DeepMind: “AlphaEvolve” (Discovering multi‑agent learning algorithms with LLMs)
- Core idea: use an LLM-driven evolutionary/search procedure that edits Python source code represented as abstract syntax trees (ASTs). The LLM performs mutations (insert conditionals, change update rules, etc.) rather than acting as a single-shot solver.
- Fitness function: game-theoretic metrics such as exploitability / distance to Nash equilibrium. The search uncovers non-intuitive, non-differentiable algorithmic updates (e.g., control-flow additions producing phase transitions).
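The search loop described above can be sketched as a toy (mu+lambda)-style evolutionary loop over Python source strings. Everything here is illustrative, not from the paper: `llm_mutate` stands in for the LLM mutation operator (it just perturbs one constant instead of rewriting an AST), and `fitness` is a trivial scalar objective standing in for a game-theoretic metric like exploitability.

```python
import random

random.seed(0)

def compile_candidate(src):
    # Candidate programs are plain Python source; exec yields a callable.
    ns = {}
    exec(src, ns)
    return ns["update"]

def fitness(fn):
    # Stand-in for a game-theoretic metric such as exploitability:
    # reward update(2.0) for landing near a target value of 10.0.
    return -abs(fn(2.0) - 10.0)

def llm_mutate(src):
    # A real system would prompt an LLM to rewrite the AST (insert
    # branches, change update rules); this stub perturbs one constant.
    c = float(src.split("* ")[1])
    return f"def update(x):\n    return x * {c + random.uniform(-1, 1):.3f}"

# Initial population: random linear update rules.
pop = [f"def update(x):\n    return x * {random.uniform(0, 2):.3f}" for _ in range(8)]
for _ in range(200):
    scored = sorted(pop, key=lambda s: fitness(compile_candidate(s)), reverse=True)
    # Elitism: keep the top half, fill the rest with mutated parents.
    pop = scored[:4] + [llm_mutate(random.choice(scored[:4])) for _ in range(4)]

print(round(fitness(compile_candidate(scored[0])), 2))
```

The key design point carried over from the paper is that selection pressure comes entirely from the black-box fitness signal; the mutation operator never sees gradients.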
- Example result: a hybrid update that blends optimistic regret matching with a temperature-weighted softmax — a dynamic annealing of a blending parameter (lambda) that shifts from exploration to regret-based equilibrium refinement.
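The shape of such a blended update can be sketched as follows. This is a hedged reconstruction of the described structure only: the exact discovered formula and constants are in the paper, plain regret matching stands in for the optimistic variant, and the linear annealing schedule is an assumption.

```python
import math

def hybrid_policy(regrets, t, horizon=200, temp=1.0):
    # Blend weight lambda anneals from 0 (pure softmax exploration)
    # to 1 (pure regret matching) over the training horizon.
    lam = min(1.0, t / horizon)

    # Regret matching: normalize positive regrets (uniform if none).
    pos = [max(r, 0.0) for r in regrets]
    s = sum(pos)
    rm = [p / s for p in pos] if s > 0 else [1 / len(regrets)] * len(regrets)

    # Temperature-weighted softmax over regrets (max-shifted for stability).
    m = max(regrets)
    exps = [math.exp((r - m) / temp) for r in regrets]
    z = sum(exps)
    soft = [e / z for e in exps]

    return [(1 - lam) * so + lam * r for so, r in zip(soft, rm)]

early = hybrid_policy([1.0, -0.5, 0.2], t=0)    # mostly exploratory softmax
late = hybrid_policy([1.0, -0.5, 0.2], t=200)   # pure regret matching
print([round(p, 3) for p in late])              # [0.833, 0.0, 0.167]
```

Note how the late-stage policy assigns zero mass to the negative-regret action, which a fixed-temperature softmax never does; that is the qualitative shift from exploration to regret-based refinement.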
- Implication: automated discovery of novel multi-agent learning algorithms (micro-architectural optimization of learning rules) that human continuous-gradient intuition might miss.
- Costs and limitations:
  - High compute to search a combinatorial program space.
  - Requires a clear, well-defined fitness/evaluation metric.
  - Engineering and replication complexity.
Paper 2 — Open Sage: self‑programming agent generation engine
Summary: runtime topological self-assembly — the model constructs, executes, and manages a topological execution graph of agents, tools, and memory during task execution instead of relying on a static, human-coded pipeline.
- Topological execution graph:
  - Nodes: agents / subagents / tool sandboxes / memory states.
  - Edges: control and information flow.
  - Graphs are directed and may be acyclic or cyclic, with vertical (sequential decomposition) and horizontal (parallel subagents) structure.
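A minimal sketch of that graph shape, assuming nothing beyond what the bullets describe (the class and field names here are illustrative, not from the preprint):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    kind: str  # "agent" | "tool" | "memory"

@dataclass
class ExecutionGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (src, dst) control/info flow

    def add(self, node):
        self.nodes[node.name] = node

    def link(self, src, dst):
        self.edges.append((src, dst))

    def successors(self, name):
        return [d for s, d in self.edges if s == name]

g = ExecutionGraph()
g.add(Node("planner", "agent"))
g.add(Node("compiler", "tool"))
g.add(Node("memory", "memory"))
g.link("planner", "compiler")   # vertical: sequential decomposition
g.link("planner", "memory")
g.link("memory", "planner")     # cycles allowed: feedback into the planner
print(g.successors("planner"))  # ['compiler', 'memory']
```

Because edges are plain (src, dst) pairs rather than a tree, the same structure covers acyclic pipelines and cyclic feedback loops.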
- Key features:
  - Dynamic creation, execution, and termination of subagents at runtime.
  - Vertical decomposition for sequential subtasks (specialized subagents) and horizontal parallelism for concurrent subtasks.
  - Tool construction and management — agents can author and compile tools (Python/C++), orchestrate tool calls, isolate execution (sandboxes), and manage tool state.
  - Hierarchical graph-based memory — short-term context + long-term knowledge managed by dedicated memory agents (reduces reliance on dense vector DBs).
  - “Attention firewall” via node isolation — encapsulation limits catastrophic context collapse and hallucination by restricting reasoning to relevant subproblems.
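The vertical/horizontal distinction above can be sketched in a few lines. This is an illustration of the control-flow pattern only, with `run_subagent` as a stub for an LLM-backed agent invocation:

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(role, task):
    # Stub: a real subagent would call a model with its own isolated context.
    return f"[{role}] done: {task}"

def vertical(task, roles):
    # Sequential decomposition: each subagent consumes the previous output.
    out = task
    for role in roles:
        out = run_subagent(role, out)
    return out

def horizontal(task, roles):
    # Parallel subagents on the same task; map preserves input order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda r: run_subagent(r, task), roles))

print(vertical("audit repo", ["planner", "analyzer"]))
print(horizontal("scan hosts", ["scanner-a", "scanner-b"]))
```

Vertical chains trade latency for depth of reasoning; horizontal fan-out trades parallel cost for breadth, which matches the ablation finding that removing either topology hurts performance.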
- Empirical results (high level):
  - Benchmarked against current agent development kits (ADKs; e.g., LangChain/AutoGen-style static pipelines) on cyber/hacking-style terminal benchmarks; Open Sage outperforms prior systems.
  - Ablations show strong dependence on vertical/horizontal topology and tooling — removing these capabilities reduces performance significantly.
  - Model configurations: experiments used models like Gemini 3 Pro and Gemini (G5) mini. Mixing a powerful planner with cheaper auxiliary models can approach top performance at lower cost (a cost-performance tradeoff).
  - The preprint includes detailed tool lists and an implementation appendix for replication.
- Positioning: macro-architectural optimization — automates system-level cognitive routing, tool selection/creation, and memory architecture.
Synthesis and implications (author analysis)
- Complementarity:
  - DeepMind (AlphaEvolve) = micro-architecture: automated discovery/optimization of learning update rules via AST mutation.
  - Open Sage = macro-architecture: automated, runtime construction/management of agent topology, tools, and memory.
- Combining opportunities:
  - Use AST-level programmatic optimization (AlphaEvolve-like) to search/optimize topological parameters and orchestration rules inside Open Sage — i.e., evolve the orchestrator itself.
- Practical argument:
  - Human software engineering of agent topologies, memory, and optimization rules is becoming a bottleneck. Constrained, supervised machine-driven search can discover non-intuitive solutions while humans retain supervision for safety and constraints.
- Caveats:
  - High compute and monetary cost.
  - Alignment and safety concerns — humans must supervise exploratory search spaces.
  - Many engineering details and operational risks remain.
Product / feature / tutorial takeaways
For practitioners:
- Look for Open Sage–style functionality: runtime graph assembly, model-driven tool compilation, hierarchical graph memory, isolation/sandboxing, and vertical/horizontal agent topologies.
- Consult the Open Sage preprint/appendix for benchmarks and ablations to prioritize features that most affect performance.
- Cost-performance tip: combine a strong planning LLM with cheaper auxiliary LLMs to reduce cost while keeping performance near top levels.
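The planner/auxiliary split above amounts to a simple routing policy. A minimal sketch, where the model names, step kinds, and unit costs are all placeholders (not real API identifiers):

```python
# Hypothetical per-call cost units for a strong planner vs. a cheap worker.
COSTS = {"strong-planner": 10.0, "cheap-worker": 1.0}

def pick_model(step_kind):
    # Route planning/replanning to the strong model; everything else
    # (routine tool calls, summarization) goes to the cheap model.
    return "strong-planner" if step_kind in {"plan", "replan"} else "cheap-worker"

steps = ["plan", "tool_call", "tool_call", "summarize", "replan", "tool_call"]
models = [pick_model(s) for s in steps]
total = sum(COSTS[m] for m in models)
print(total)  # 24.0, vs. 60.0 if every step used the strong model
```

Even this crude static split cuts cost roughly 2.5x in the toy trace; real systems would route dynamically based on step difficulty.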
For researchers:
- When optimizing multi-agent learning, represent solvers as ASTs and use LLM-driven mutation/search with game-theoretic fitness (exploitability / distance-to-Nash).
- Run experiments that jointly evolve micro (learning-rule) and macro (agent topology/memory) structures.
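For the game-theoretic fitness mentioned above, exploitability in a two-player zero-sum matrix game is directly computable: it is the total gain available to best responses against a strategy profile, and it is zero exactly at a Nash equilibrium. A self-contained example on rock-paper-scissors:

```python
def exploitability(A, x, y):
    # A[i][j]: row player's payoff; x, y: row/column mixed strategies.
    value = sum(x[i] * A[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    # Row player's best-response value against y (maximizes payoff).
    br_row = max(sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(A)))
    # Column player's best-response value against x (minimizes row payoff).
    br_col = min(sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(y)))
    return (br_row - value) + (value - br_col)

A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # rock-paper-scissors payoffs
uniform = [1/3, 1/3, 1/3]
print(exploitability(A, uniform, uniform))           # 0.0: uniform is the Nash
print(exploitability(A, [1.0, 0.0, 0.0], uniform))   # 1.0: pure rock is exploitable
```

Minimizing this scalar is exactly the kind of black-box objective an AST-mutation search can target without gradients.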
For security practitioners:
- Open Sage–style systems can automate complex debugging/analysis tasks by spawning specialized static/dynamic analysis subagents that maintain isolated contexts and produce distilled summaries.
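The isolation pattern in that bullet (the “attention firewall”) can be sketched as a strict interface: each spawned subagent receives only its subproblem, never the parent's accumulated history, and returns a distilled summary. `analyze` is a stub for an LLM-backed analysis agent; the names are illustrative.

```python
def analyze(kind, artifact):
    # Stub for a specialized static/dynamic analysis subagent.
    return f"{kind} summary of {artifact}"

def spawn_isolated(kind, artifact):
    # Only (kind, artifact) crosses the boundary: no shared conversation
    # history, so one subagent's context cannot pollute another's.
    return analyze(kind, artifact)

parent_context = {"history": ["...many prior turns..."]}  # never passed down
reports = [spawn_isolated(k, "target.bin") for k in ("static", "dynamic")]
print(reports)
```

The parent then reasons only over the short `reports`, which is what keeps its own context from collapsing on long tasks.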
Where to dive deeper
- Read both papers (DeepMind AlphaEvolve and Open Sage) in parallel to appreciate micro/macro complementarity; each includes detailed appendices, tool lists, and benchmarks.
- Background concepts to review: CFR (counterfactual regret minimization), PSRO (policy-space response oracles), regret minimization, Nash equilibrium/exploitability, AST/programmatic representations, and existing ADKs such as LangChain and AutoGen.
Main sources and speakers
- DeepMind paper: “Discovering multi‑agent learning algorithms with LLMs” — describes AlphaEvolve (LLM-driven AST mutation and discovered hybrid regret/softmax update). (Narration dated around Feb 20, 2026.)
- Open Sage: “Open Sage — self‑programming agent generation engine” — authors/institutions include UC Santa Barbara, UC Berkeley, University of Colorado Boulder, Columbia, Duke, UCLA (preprint/archival material referenced).
- Other referenced projects/tools/models: LangChain, AutoGen, Gemini 3 Pro, Gemini (G5) mini, PSRO, CFR.
- Presentation: summary and synthesis were delivered by an unnamed YouTube creator / narrator.