Summary of "El PADRE de todos los CURSOS de IA: 20 Agent Harness para programar con disciplina" (The FATHER of all AI COURSES: 20 Agent Harnesses for programming with discipline)
Summary of Technological Concepts / Product Features / Guides
The video argues that AI “agents” fail in practice less because of the underlying model and more because they lack an Agent Harness: an engineering/runtime layer that constrains, structures, and verifies what the agent does. The creator calls this “harness failure”—for example, agents contaminating context, acting unpredictably, touching wrong files, or producing large unreviewable changes.
The solution is a disciplined system that provides direction, verifiable process state, evidence, and safety limits—not just longer prompts.
Core idea: What an “Agent Harness” is
- Definition: an operational structure around an AI agent.
- Claim: a harness doesn’t remove autonomy; it provides direction via process gates, contracts, and verifiable states.
- Three problems harnesses address:
  - Contaminated context (mixing decisions, losing focus, working with noise)
  - Unpredictable execution (exploration vs. implementation; touching unintended files)
  - Zero traceability (no clear rationale, evidence, or decisions)
“Gentle” as a philosophy + implementation
The speaker frames Gentle as runtime discipline, not just prompt engineering:
- Not “be a senior” theatre
- Not only prompt engineering
- Gentle = philosophy/patterns that enforce:
  - context, phases, memory
  - verification/evidence
  - safety limits and protection for the human reviewer
Main components demonstrated
- Gentle PI: turns an agent into a controlled development environment (SDD/orchestration, subagents, TDD, review workflow, skill registry, safety)
- Gentle Engram: a persistent memory harness that stores decisions, context, and sessions and supports gradual retrieval
Course structure / tutorial claims
The video is presented as a “free course” focused on red-team thinking and disciplined agent runtime. It’s organized into parts:
- Concept of Agent Harness (problem → definition → analogy)
- Gentle philosophy and how it fits
- A “map” of the top 20 harnesses governing how agents work day-to-day
Demo / tooling setup (how the system is installed/used)
The speaker uses a terminal/TUI environment called Pi Agent as the demo runtime:
- Shows installing packages via Pi and running update commands like pi update.
- Introduces an SDD flow inspired by OpenSpec, extended with additional steps:
  - proposal, spec, design, tasks, apply, verify, archive
  - plus TDD, targeted skills, and explicit verification
Composability is emphasized:
- Components can be installed separately (e.g., PI vs Engram)
- But they operate as one cohesive system.
Top “20 main harnesses” (key features listed)
The creator walks through harnesses that change agent behavior across the SDD lifecycle. Main ones called out include:
- Orchestration/Context Harness (SDD parent orchestrator)
  - Coordinates work and maintains responsibility; subagents do the actual execution.
  - Emphasizes: coordinate, don’t execute (prevents role mixing).
- Delegation Harness
  - Decides whether changes should be done inline, delegated to subagents, or handled via full SDD.
  - Avoids errors like treating big changes as tiny prompts (or the reverse).
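A minimal sketch of that routing decision, assuming change size and behavioral impact are the deciding signals. The `Route` names, the `ChangeRequest` fields, and the thresholds are hypothetical and are not taken from Gentle itself.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    INLINE = "inline"
    SUBAGENT = "subagent"
    FULL_SDD = "full_sdd"


@dataclass
class ChangeRequest:
    files_touched: int       # estimated number of files the change affects
    changes_behavior: bool   # True if public behavior or contracts change


def choose_route(req: ChangeRequest) -> Route:
    """Pick a route; the thresholds are illustrative assumptions."""
    if req.changes_behavior or req.files_touched > 5:
        return Route.FULL_SDD
    if req.files_touched > 1:
        return Route.SUBAGENT
    return Route.INLINE
```

The point of making the rule explicit is that the same request always gets the same route, instead of depending on how the prompt happened to be phrased.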
- SDD Init Harness (project calibration / bootstrap)
  - Detects the project stack, test setup and test commands, conventions, and artifact storage.
  - Generates spec/config (example includes Node/TypeScript + Python, plus test command detection).
  - Includes strict TDD mode: tests run before changes, with “triangulate” testing.
- Execution Mode Harness
  - Chooses interactive vs automatic progression while enforcing stop conditions.
- Artifact Store Harness (source of truth != chat)
  - Stores recoverable process state (artifacts) so sessions can resume after interruption.
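One way to picture "source of truth is not the chat" is a small on-disk store that a fresh session reads before doing anything else. This is a sketch under assumptions: the `ArtifactStore` class, the `state.json` file name, and the state fields are invented for illustration.

```python
import json
from pathlib import Path


class ArtifactStore:
    """Persist process state on disk so a new session can pick it up."""

    def __init__(self, root: Path) -> None:
        root.mkdir(parents=True, exist_ok=True)
        self.state_file = root / "state.json"  # hypothetical file name

    def save(self, state: dict) -> None:
        # Write to a temporary file first, then swap it in, so an
        # interruption is less likely to leave a half-written state file.
        tmp = self.state_file.with_suffix(".tmp")
        tmp.write_text(json.dumps(state, indent=2))
        tmp.replace(self.state_file)

    def load(self) -> dict:
        # With no saved state, start from the first phase.
        if not self.state_file.exists():
            return {"phase": "init", "artifacts": []}
        return json.loads(self.state_file.read_text())
```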
- Phase/Steps Harness
  - Enforces stage order: init → proposal → spec/design (parallel) → tasks → apply → verify → archive.
  - Prevents skipping.
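The stage order above can be sketched as a prerequisite table plus a gate that refuses to complete a phase early. The phase names follow the summary; the `PhaseGate` class and its methods are hypothetical.

```python
# Each phase maps to the phases that must be complete before it may start.
PREREQUISITES = {
    "init": set(),
    "proposal": {"init"},
    "spec": {"proposal"},
    "design": {"proposal"},   # spec and design may run in parallel
    "tasks": {"spec", "design"},
    "apply": {"tasks"},
    "verify": {"apply"},
    "archive": {"verify"},
}


class PhaseOrderError(Exception):
    """Raised when a phase is completed before its prerequisites."""


class PhaseGate:
    def __init__(self) -> None:
        self.completed = set()

    def missing(self, phase: str) -> list:
        """Return the prerequisite phases that are not complete yet."""
        return sorted(PREREQUISITES[phase] - self.completed)

    def complete(self, phase: str) -> None:
        missing = self.missing(phase)
        if missing:
            raise PhaseOrderError(f"{phase} blocked; missing: {missing}")
        self.completed.add(phase)
```

Because spec and design share the same single prerequisite, either can finish first, but tasks stays blocked until both are done.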
- Artifact Dependency Harness
  - Enforces required inputs per phase.
  - Stops or requests human confirmation when artifacts are missing.
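A rough illustration of a per-phase input check, assuming artifacts are plain files in a change directory. The mapping and the file names (`proposal.md`, `spec.md`, and so on) are assumptions made for this example, not Gentle's actual layout.

```python
from pathlib import Path

# Hypothetical mapping from phase to the artifact files it needs as input.
REQUIRED_ARTIFACTS = {
    "spec": ["proposal.md"],
    "design": ["proposal.md"],
    "tasks": ["spec.md", "design.md"],
    "apply": ["tasks.md"],
    "verify": ["tasks.md", "spec.md"],
}


def missing_artifacts(phase: str, change_dir: Path) -> list:
    """Return the required artifacts that do not exist yet for this phase."""
    required = REQUIRED_ARTIFACTS.get(phase, [])
    return [name for name in required if not (change_dir / name).is_file()]
```

A non-empty result is the signal to stop or to ask the human for confirmation before the phase runs.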
- Result Contract Harness
  - Each phase returns a consistent “envelope”:
    - status, executive summary, artifact references, next recommended step, risk + skill resolution
  - Designed for predictable, auditable decision-making.
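The envelope could be modeled as a small typed record. The field list mirrors the summary; the field names, the allowed status values, and the JSON serialization are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

ALLOWED_STATUS = {"ok", "blocked", "failed"}  # assumed status values


@dataclass
class PhaseResult:
    status: str
    executive_summary: str
    artifacts: list = field(default_factory=list)   # artifact references
    next_recommended: Optional[str] = None          # next recommended step
    risks: list = field(default_factory=list)
    skill_resolution: str = "none"                  # how skills were resolved

    def __post_init__(self) -> None:
        # Reject free-form statuses so every phase reports the same way.
        if self.status not in ALLOWED_STATUS:
            raise ValueError(f"unknown status: {self.status!r}")

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)
```

An orchestrator that only reads this record can decide the next step without parsing free text from the subagent.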
- SDD Artifact Grammar Harness (process grammar)
  - Shareable “contract” sequence inspired by OpenSpec (proposal/spec/design/task, etc.).
  - Includes verification via specialized verification agent(s).
- Engram Memory Harness (persistent memory)
  - Stores decisions, context, preferences, and session summaries.
  - Enables gradual recovery (retrieve relevant memory rather than reloading all history).
  - Demonstrated across sessions and across agents (e.g., asking Claude what was last done loads the engram).
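To make "retrieve relevant memory rather than reloading all history" concrete, here is a toy retrieval sketch. It is not Engram's API: the `Memory` record, the `recall` function, and the word-overlap scoring are stand-ins chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    kind: str   # e.g. "decision", "preference", "session_summary"
    text: str


def recall(memories: list, query: str, limit: int = 2) -> list:
    """Return up to `limit` memories sharing the most words with the query."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(m.text.lower().split())), m) for m in memories]
    relevant = [(score, m) for score, m in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in relevant[:limit]]
```

Only the few matching entries are handed to the agent, which is where the token savings mentioned in the video would come from.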
- Strict TDD Harness
  - If strict TDD is enabled, the apply and verify phases must follow an evidence-based test flow (red, green, triangulate, refactor).
  - Injects the rule so the agent doesn’t need to “remember.”
- Verify Harness
  - “Finished” ≠ “verified.”
  - Verification requests evidence: commands run, output, tests passed, risks, and checks for files touched outside scope.
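A sketch of what an evidence check might look like, assuming the evidence is a dictionary and the allowed scope is a list of glob patterns. The field names (`commands`, `tests_failed`, `files_touched`) are hypothetical.

```python
from fnmatch import fnmatchcase


def out_of_scope(touched: list, allowed: list) -> list:
    """Return touched paths that match none of the allowed glob patterns."""
    return [p for p in touched if not any(fnmatchcase(p, pat) for pat in allowed)]


def verify(evidence: dict, allowed: list) -> tuple:
    """Return (passed, problems); field names are illustrative assumptions."""
    problems = []
    if not evidence.get("commands"):
        problems.append("no commands recorded")
    if evidence.get("tests_failed", 0) > 0:
        problems.append("failing tests")
    stray = out_of_scope(evidence.get("files_touched", []), allowed)
    if stray:
        problems.append(f"files outside scope: {stray}")
    return (not problems, problems)
```

Here a change only counts as verified when the problem list is empty, regardless of whether the agent reports itself as finished.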
- Continuity / Follow-through Harness (task continuity)
  - Prevents overlapping reruns by tracking tasks/blocks as done vs in progress.
  - On rerun, it resumes from the last known state.
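Resuming from the last known state can be as simple as scanning a task list for the first unfinished entry. The task shape and the status strings below are assumptions made for this sketch.

```python
from typing import Optional


def next_task(tasks: list) -> Optional[dict]:
    """Resume an interrupted task first, then fall back to the next pending one."""
    for status in ("in_progress", "pending"):
        for task in tasks:
            if task["status"] == status:
                return task
    return None  # everything is done; nothing to rerun
```

Tasks already marked done are never returned, so a rerun does not repeat finished work.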
- Skill Registry Harness
  - Builds a scan-based index of skills for the project and user.
  - Stores registry artifacts in the repo (git-ignored), enabling reuse without loading everything.
- Skill Digestion Harness (context compiler)
  - Converts large knowledge into compact, actionable rules for subagents.
  - Subagents receive task-specific digest instructions instead of huge documents.
- Skill Resolution Feedback Harness
  - Tracks whether skills were resolved successfully or if fallback behavior occurred.
  - Adds auditability/quality control to skill-loading.
- Strict Agent Isolation Harness
  - Each subagent runs with isolated context (phase agents: explore/spec/design/apply/verify with focused prompts).
  - Analogy: operating room vs WhatsApp group.
- Review Workload Harness (human-friendly review risk assessment)
  - Assesses review risk before applying/uploading changes (e.g., “review debt” for PRs that are too large).
  - Includes a simulation to recommend splitting changes.
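A back-of-the-envelope version of such a forecast, assuming review effort is approximated by changed lines. The `forecast_review` function and the 400-line default budget are hypothetical values picked for this example.

```python
import math


def forecast_review(changed_lines_by_file: dict, budget: int = 400) -> dict:
    """Estimate review load and how many PRs would keep each under budget."""
    total = sum(changed_lines_by_file.values())
    return {
        "total_lines": total,
        "over_budget": total > budget,
        "suggested_prs": max(1, math.ceil(total / budget)),
    }
```

Running this before the apply phase gives the human a chance to ask for a split while the change is still a plan rather than a finished diff.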
- Delivery Strategy Harness
  - Chooses how to split/ship changes:
    - ask on risk, autochain, single PR, or explicit exception
  - Goal: avoid “monster PRs” and keep reviews manageable.
- Chain Strategy Harness
  - Implements splitting as chained PR branches (feature-track vs other strategies).
  - Emphasizes “thinking as a team” rather than diff generation.
Additional harnesses mentioned (not deeply expanded)
The speaker briefly lists additional harnesses beyond the top 20, including:
- Model Routing (choose model per phase/agent)
- Profile Isolation (prevent cross-session contamination)
- Permissions/Security (block dangerous commands, confirmations)
- MCP Injection (control what external tools/capabilities are active per phase)
- Backup + Rollback (safety net and recovery)
- Dependency graph harness
- Command wrapper + output normalization
- Adapters per agent/runtime
- Session summary / compaction recovery harness (persist operational summaries across compactions)
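As an illustration of the Permissions/Security item in the list above, a command guard might classify each shell command before the agent is allowed to run it. The denylist, the confirmation rules, and the `classify` function are assumptions; a check on the first token alone is easy to bypass (pipes, `sudo`, shell wrappers), so this shows the idea rather than a real safeguard.

```python
import shlex

BLOCKED_PROGRAMS = {"mkfs", "dd", "shutdown"}            # hypothetical denylist
RECURSIVE_FLAGS = {"-r", "-R", "-rf", "-fr", "--recursive"}


def classify(command: str) -> str:
    """Return "block", "confirm", or "allow" for a shell command string."""
    tokens = shlex.split(command)
    if not tokens:
        return "allow"
    if tokens[0] in BLOCKED_PROGRAMS:
        return "block"
    # Recursive deletes and forced pushes need explicit human confirmation.
    if tokens[0] == "rm" and any(t in RECURSIVE_FLAGS for t in tokens[1:]):
        return "confirm"
    if tokens[:2] == ["git", "push"] and "--force" in tokens:
        return "confirm"
    return "allow"
```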
Main conclusion / message
- The “agent problem” is framed as systemic: without harnesses it’s speed without direction.
- With harnesses, agents gain context, memory, process phases, contracts, evidence, delivery strategy, and security limits, often with token-saving benefits.
- Gentle PI + Engram are positioned as the practical runtime stack.
- The SDD flow (OpenSpec-inspired) provides:
  - verifiable artifacts
  - TDD-based verification
  - skill-driven operational rules
Main speakers / sources (as identified)
- Alan Buscaglia (the speaker; identified in the video as Google Developer Expert in Angular and Microsoft MVP; creator of the Gentle ecosystem and related repos/tools)
Category
Technology