Summary of "El PADRE de todos los CURSOS de IA: 20 Agent Harness para programar con disciplina"

Summary of Technological Concepts / Product Features / Guides

The video argues that AI “agents” fail in practice less because of the underlying model and more because they lack an Agent Harness: an engineering/runtime layer that constrains, structures, and verifies what the agent does. The creator calls this “harness failure”; examples include agents contaminating context, acting unpredictably, touching the wrong files, or producing large, unreviewable changes.

The proposed solution is a disciplined system that provides direction, verifiable process state, evidence, and safety limits, rather than just longer prompts.


Core idea: What an “Agent Harness” is


“Gentle” as a philosophy + implementation

The speaker frames Gentle as runtime discipline, not just prompt engineering.

Main components demonstrated


Course structure / tutorial claims

The video is presented as a “free course” focused on red-team thinking and disciplined agent runtime. It’s organized into parts:

  1. Concept of Agent Harness (problem → definition → analogy)
  2. Gentle philosophy and how it fits
  3. A “map” of the top 20 harnesses governing how agents work day-to-day

Demo / tooling setup (how the system is installed/used)

The speaker uses a terminal/TUI environment called P Agent as the demo runtime.

Composability is emphasized.


Top “20 main harnesses” (key features listed)

The creator walks through harnesses that change agent behavior across the SDD (spec-driven development) lifecycle. The main ones called out include:

  1. Orchestration/Context Harness (SDD Orchestrator Father/Mother)

    • Coordinates work and maintains responsibility; subagents do the actual execution.
    • Emphasizes: coordinate, don’t execute (prevents role mixing).
  2. Delegation Harness

    • Decides whether changes should be done inline, delegated to subagents, or handled via full SDD.
    • Avoids errors like treating big changes as tiny prompts (or the reverse).
  3. SDD Init Harness (project calibration / bootstrap)

    • Detects project stack, tests, test commands, conventions, artifact storage.
    • Generates spec/config (example includes Node/TypeScript + Python, plus test command detection).
    • Includes strict TDD mode: tests run before changes, with “triangulate” testing.
  4. Execution Mode Harness

    • Chooses interactive vs automatic progression while enforcing stop conditions.
  5. Artifact Store Harness (source of truth != chat)

    • Stores recoverable process state (artifacts) so sessions can resume after interruption.
  6. Phase/Steps Harness

    • Enforces stage order: init → proposal → spec/design (parallel) → tasks → apply → verify → archive.
    • Prevents skipping.
  7. Artifact Dependency Harness

    • Enforces required inputs per phase.
    • Stops or requests human confirmation when artifacts are missing.
  8. Result Contract Harness

    • Each phase returns a consistent “envelope”:
      • status, executive summary, artifact references, next recommended step, risk + skill resolution
    • Designed for predictable, auditable decision-making.
  9. SDD Artifact Grammar Harness (process grammar)

    • Shareable “contract” sequence inspired by OpenSpec (proposal/spec/design/task, etc.).
    • Includes verification via specialized verification agent(s).
  10. Engram Memory Harness (persistent memory)

    • Stores decisions, context, preferences, and session summaries.
    • Enables gradual recovery (retrieve relevant memory rather than reloading all history).
    • Demonstrated across sessions and across agents (e.g., asking Claude what was last done loads the engram).
  11. Strict TDD Harness

    • If strict TDD is enabled, the apply and verify phases must run an evidence-based test flow (red/green/triangulate/refactor).
    • Injects the rule so the agent doesn’t need to “remember.”
  12. Verify Harness

    • “Finished” ≠ “verified.”
    • Verification requests evidence: commands run, output, tests passed, risks, and checks for files touched outside scope.
  13. Continuity / Follow-through Harness (task continuity)

    • Prevents overlapping reruns by tracking tasks/blocks as done vs in progress.
    • On rerun, it resumes from the last known state.
  14. Skill Registry Harness

    • Builds a scan-based index of skills for the project and user.
    • Stores registry artifacts in the repo (git-ignored), enabling reuse without loading everything.
  15. Skill Digestion Harness (context compiler)

    • Converts large knowledge into compact, actionable rules for subagents.
    • Subagents receive task-specific digest instructions instead of huge documents.
  16. Skill Resolution Feedback Harness

    • Tracks whether skills were resolved successfully or whether fallback behavior occurred.
    • Adds auditability/quality control to skill-loading.
  17. Strict Agent Isolation Harness

    • Each subagent runs with isolated context (phase agents: explore/spec/design/apply/verify with focused prompts).
    • Analogy: operating room vs WhatsApp group.
  18. Review Warlock Harness (human-friendly review risk assessment)

    • Assesses review risk before applying/uploading changes (e.g., “review debt” for PRs that are too large).
    • Includes a simulation to recommend splitting changes.
  19. Delivery Strategy Harness

    • Chooses how to split/ship changes:
      • ask on risk, autochain, single PR, or explicit exception
    • Goal: avoid “monster PRs” and keep reviews manageable.
  20. Chain Strategy Harness

    • Implements splitting as chained PR branches (feature-track vs other strategies).
    • Emphasizes “thinking as a team” rather than diff generation.
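
To make harnesses 6 to 8 more concrete, here is a minimal Python sketch of how phase order, artifact dependencies, and a result envelope could fit together. It is not the code shown in the video: the names `PHASES`, `REQUIRES`, `Envelope`, and `run_phase`, the dependency map, and the envelope field names are all assumptions made for illustration, and the artifact store is reduced to a plain dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Phase order as listed in the summary. Spec and design are described as
# parallel, so this sketch lets both depend only on the proposal.
PHASES = ["init", "proposal", "spec", "design", "tasks", "apply", "verify", "archive"]

# Hypothetical dependency map: artifacts that must exist before a phase runs.
REQUIRES: Dict[str, List[str]] = {
    "init": [],
    "proposal": ["init"],
    "spec": ["proposal"],
    "design": ["proposal"],
    "tasks": ["spec", "design"],
    "apply": ["tasks"],
    "verify": ["apply"],
    "archive": ["verify"],
}


@dataclass
class Envelope:
    """Result contract returned by every phase (field names are assumptions)."""
    status: str                                   # "ok" or "blocked"
    summary: str                                  # short executive summary
    artifacts: List[str] = field(default_factory=list)
    next_step: Optional[str] = None               # next recommended phase
    risks: List[str] = field(default_factory=list)
    skill_resolution: str = "not-applicable"


def next_pending(store: Dict[str, str]) -> Optional[str]:
    """Return the first phase, in order, that has no stored artifact yet."""
    return next((p for p in PHASES if p not in store), None)


def run_phase(phase: str, store: Dict[str, str]) -> Envelope:
    """Run a phase only if its required artifacts are already in the store."""
    if phase not in REQUIRES:
        raise ValueError(f"unknown phase: {phase}")
    missing = [dep for dep in REQUIRES[phase] if dep not in store]
    if missing:
        # Dependency check failed: stop instead of skipping ahead.
        return Envelope(
            status="blocked",
            summary=f"{phase} is missing: {', '.join(missing)}",
            next_step=next_pending(store),
            risks=["missing artifacts"],
        )
    # Placeholder for real work; a real harness would delegate to a subagent.
    store[phase] = f"artifact for {phase}"
    return Envelope(
        status="ok",
        summary=f"{phase} completed",
        artifacts=[phase],
        next_step=next_pending(store),
    )
```

With an empty store, `run_phase("apply", {})` returns a blocked envelope whose `next_step` is `"init"`, which is the kind of behavior the Phase/Steps and Artifact Dependency harnesses describe: a skipped stage is refused rather than silently executed. Because the store is the only state, a later session could resume by reloading it, in the spirit of the Artifact Store Harness.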

Additional harnesses mentioned (not deeply expanded)

The speaker briefly lists additional harnesses beyond the top 20.


Main conclusion / message


Main speakers / sources (as identified)

Category: Technology

