Video summary

How to Make Claude Code Better Every Time You Use It (50 Min Tutorial) | Kieran Klaassen

High-level takeaway

  • 50-minute deep demo of “compound engineering”: a workflow and Claude/Claude Code plugin (built by Kieran) that helps Claude-based agents improve each run by capturing learnings and codifying them into the repo.
  • Main idea: treat AI work like an iterative engineering loop — plan → work → assess/review → codify/compound — so subsequent runs are higher-quality and lower-friction.

Compound engineering (philosophy)

Compound engineering = capturing mistakes, decisions, and project norms and storing them in the repo so the agent “learns” and future runs improve.

Four-step loop:

  1. Plan — research and produce a grounded plan against your codebase and public resources.
  2. Work — execute the plan (generate code, tests, infra changes, artifacts).
  3. Assess / Review — run automated reviewers (security, architecture, simplicity) and QA.
  4. Codify (Compound) — write the lessons and decisions back into repo docs/skill files so future plans incorporate them.
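
The four steps above can be sketched as a small loop in which each run's findings are fed into the next run's plan. This is a minimal illustration, not the plugin's implementation: `Repo`, `plan`, `work`, `review`, `codify`, and `run_loop` are hypothetical names, and the agent calls are replaced by string stubs.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Stand-in for a repository; `learnings` plays the role of docs the agent rereads."""
    learnings: list[str] = field(default_factory=list)


def plan(task: str, repo: Repo) -> str:
    # A real plan step would call an agent; here we only show that
    # earlier learnings are included in the next plan.
    notes = "; ".join(repo.learnings) or "none yet"
    return f"Plan for {task!r} (known learnings: {notes})"


def work(plan_text: str) -> str:
    # Stub for the implementation step.
    return f"changes implementing: {plan_text}"


def review(task: str, changes: str) -> list[str]:
    # Stub reviewer: always reports one lesson, purely for illustration.
    return [f"lesson learned while building {task}"]


def codify(findings: list[str], repo: Repo) -> None:
    # Store each new lesson once so later plans can see it.
    for finding in findings:
        if finding not in repo.learnings:
            repo.learnings.append(finding)


def run_loop(task: str, repo: Repo) -> str:
    plan_text = plan(task, repo)
    changes = work(plan_text)
    codify(review(task, changes), repo)
    return plan_text
```

Calling `run_loop` twice on the same `Repo` shows the compounding effect: the second plan text contains the lesson recorded during the first run.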

What Kieran’s plugin/flow does (features & commands)

  • Plugin: “compound engineering” plugin for Claude Code (open-source on GitHub). Install it in your repo to use the flows.
  • Primary slash-command flows demonstrated:
    • workflows plan <description> — generates a research-backed, code-grounded plan.
    • workflows work <plan> — runs implementation steps (creates branches, edits code, writes tests).
    • workflows review — runs automated reviews from multiple perspectives.
    • triage — conversational walkthrough of review findings and lets you pick actions.
    • resolve to-do parallel — automatically resolve multiple to-dos and open a PR.
    • playwright test / test and play — writes and runs Playwright end-to-end/browser tests.
    • LFG (one-command end-to-end) — runs the entire loop (plan, work, test, review, produce marketing assets, create PR) and can run unattended for extended periods.
  • UI/UX:
    • CLI-first; sessions can be pushed to the cloud/web app and resumed on other clients (example: prefixing a message with & pushes the session to the cloud).
    • Automatically creates artifacts (tests, screenshots, screen recordings, marketing videos) and attaches them to PRs.
    • Writes persistent files into the repo (e.g., docs/, architecture decision records, CLAUDE.md) so those items become part of future prompts.
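
Writing learnings back into the repo can be as simple as appending a bullet to a notes file that the agent rereads, such as CLAUDE.md (the project memory file Claude Code loads). The sketch below is an assumption about how one might do this by hand; `record_learning` is a hypothetical helper and not part of the plugin.

```python
from pathlib import Path


def record_learning(repo_root: Path, learning: str, filename: str = "CLAUDE.md") -> bool:
    """Append `learning` as a bullet to the notes file.

    Returns False (and writes nothing) if the same bullet is already present.
    """
    notes = repo_root / filename
    existing = notes.read_text(encoding="utf-8") if notes.exists() else ""
    bullet = f"- {learning.strip()}"
    if bullet in existing.splitlines():
        return False
    # Make sure the new bullet starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    notes.write_text(existing + separator + bullet + "\n", encoding="utf-8")
    return True
```

Skipping duplicates keeps the file short, which matters because everything in it is loaded into later prompts.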

Technical capabilities demonstrated

  • Grounded planning:
    • The plan mode analyzes your actual codebase (frameworks, versions), searches external best-practice resources, and synthesizes a plan.
    • Can run sub-agents (parallel agents) for specialized research tasks (framework researcher, best-practices researcher).
  • Playwright-based browser automation and testing:
    • Opus 4.5 + Claude Code generates Playwright scripts, controls a browser, interacts with UI, reads console logs, and can log into external services (e.g., Gmail) to perform real integration tests.
    • Can screen-record flows and auto-upload videos to attach to PRs.
  • Automated testing + immediate fix loop:
    • If tests fail, the flow can modify code and re-run tests iteratively.
  • Review agents:
    • Automatic reviewers produce prioritized to-dos (P1/P2/P3) and place items in a to-do directory in the repo.
    • triage walks the developer through fixes and can apply them automatically.
  • Skills:
    • On-demand contextual documents/scripts/tools (skills) the agent can pull in (examples: agent-native architecture skill, image-generation skill).
    • Skills act as just-in-time context/integrations. Best practice: avoid hand-writing huge skill contexts unless needed; use skill creation helpers.
  • Sub-agents:
    • Run tasks in isolation or in parallel to avoid flooding the main context; useful for research and parallelizable work.
  • Permissions:
    • “Dangerously skip permissions” mode (the --dangerously-skip-permissions flag, aliased to cc in Kieran’s setup) disables interactive approvals. It is useful for long unattended runs but requires careful sandboxing.
    • Alternatively, you can configure repo-level permissions so Claude remembers them.
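
The automated test and fix loop described above is easier to trust when it is bounded, so an unattended run cannot retry forever. A minimal sketch, assuming the test runner and the fixer are supplied as callables; `fix_until_green`, `run_tests`, and `attempt_fix` are hypothetical names, not plugin APIs.

```python
from typing import Callable


def fix_until_green(
    run_tests: Callable[[], list[str]],
    attempt_fix: Callable[[list[str]], None],
    max_attempts: int = 3,
) -> tuple[bool, int]:
    """Run tests, apply a fix on failure, and retry.

    Stops after `max_attempts` fixes. Returns (passed, fixes_applied).
    """
    fixes = 0
    while True:
        failures = run_tests()  # list of failing test names; empty means green
        if not failures:
            return True, fixes
        if fixes >= max_attempts:
            return False, fixes  # give up and hand back to a human
        attempt_fix(failures)
        fixes += 1
```

Returning the number of fixes applied gives a reviewer a quick signal: a run that needed several attempts probably deserves a closer look than one that passed first time.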

Practical tips, trade-offs, and cultural points

  • Tokens / cost:
    • Large planning and research sessions are token-heavy (tens of thousands of tokens per session), so heavy use calls for a higher-tier subscription.
    • Kieran argues it’s cost-effective relative to developer time for real product work.
  • Start with planning:
    • Invest time in the Plan stage to reduce rework; codify decisions early to avoid costly mid-implementation changes.
  • Trust & verification:
    • Use test and review stages to maintain safety and correctness. Decide where automatic changes are acceptable vs. where human approval is required (CI/PR gates).
  • Use the repo as state:
    • Documentation files (e.g., CLAUDE.md, docs/, architecture decisions) become persistent prompt context for future runs.
  • Use the plugin as a scaffold:
    • Adopt and tweak Kieran’s plugin (commands, agents, skills) or build your own flows. Kieran estimated ~1 year to build the demo system.

Guides / step-by-step demo flow shown

  1. Install plugin and dependencies (demo shown in Warp terminal).
  2. workflows plan — provide a voice/text description of the feature; inspect the generated plan.
  3. Optionally “deepen plan” to pull in more context/resources or modify the plan (e.g., consolidate multiple setting tools into one).
  4. workflows work — run implementation, answer clarifying questions, let it code and write tests.
  5. Run playwright test — execute browser automation tests, capture console logs, record a video.
  6. workflows review / triage — run multi-perspective reviews and accept/reject fixes.
  7. resolve to-dos → PR created automatically; optionally auto-generate marketing video and changelog.
  8. Share the session between the CLI and the web app (prefix a message with & to push it) and continue on phone/web.
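
Steps 6 and 7 depend on turning review findings into an ordered list of to-dos. The sketch below shows one way to queue findings by the P1/P2/P3 priorities mentioned in the demo and derive a file name for each. The `Finding` fields and the file-name scheme are assumptions made for illustration; the plugin's actual to-do format may differ.

```python
from dataclasses import dataclass

# Lower number sorts first, so P1 items reach triage before P2 and P3.
PRIORITY_ORDER = {"P1": 0, "P2": 1, "P3": 2}


@dataclass(frozen=True)
class Finding:
    reviewer: str  # e.g. "security", "architecture", "simplicity"
    priority: str  # "P1", "P2", or "P3"
    summary: str


def triage_queue(findings: list[Finding]) -> list[Finding]:
    """Order findings by priority; ties keep their original order."""
    return sorted(findings, key=lambda f: PRIORITY_ORDER[f.priority])


def todo_filename(index: int, finding: Finding) -> str:
    """Hypothetical naming scheme for a to-do file in the repo."""
    slug = "-".join(finding.summary.lower().split())
    return f"{index:03d}-{finding.priority.lower()}-{slug}.md"
```

Because `sorted` is stable, findings with the same priority stay in the order the reviewers produced them.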

Tools & integrations mentioned

  • Claude / Claude Code (Opus 4.5)
  • Playwright (browser automation & recording)
  • Warp terminal (CLI environment)
  • Typora (Markdown viewer; optional)
  • GitHub / Linear (issue/ticket integrations)
  • Gmail (demo login for integration testing)
  • Granola (sponsor; meeting notes app)
  • Plugin repository (compound engineering plugin on GitHub)

Limitations & warnings

  • High token usage for deep planning/research; needs an appropriate subscription tier.
  • Danger of unsafe operations if running with permissions-skipped mode on an unsafe machine — sandboxing recommended.
  • Skills and sub-agents may require careful prompt/metadata formatting to trigger reliably in some interfaces.
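
One way to act on the sandboxing warning is to refuse to build a permissions-skipping command unless the environment explicitly declares itself a sandbox. This is a sketch under assumptions: `claude` and `--dangerously-skip-permissions` are the Claude Code CLI name and flag, while the `AGENT_SANDBOXED` variable and `build_agent_command` are hypothetical and would need to be set up by your own container or VM image.

```python
import os
from typing import Mapping, Optional

SANDBOX_FLAG = "AGENT_SANDBOXED"  # hypothetical variable your sandbox image would set


def build_agent_command(
    unattended: bool, env: Optional[Mapping[str, str]] = None
) -> list[str]:
    """Return the CLI command, adding the skip-permissions flag only inside a sandbox."""
    env = os.environ if env is None else env
    command = ["claude"]
    if unattended:
        if env.get(SANDBOX_FLAG) != "1":
            raise RuntimeError("refusing to skip permissions outside a sandbox")
        command.append("--dangerously-skip-permissions")
    return command
```

The check is only as strong as the environment variable, so it guards against accidents on a developer laptop rather than providing real isolation.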

Resources & how to get started

  • Install the compound engineering plugin in your repo and run the example slash commands.
  • Typical flow: workflows plan → inspect & iterate → workflows work → playwright test → workflows review → triage → resolve to create PRs.
  • Create and maintain CLAUDE.md / docs/ and architecture decision records to codify project-specific learnings for future runs.
  • Explore or modify the plugin’s skills, agents, and commands to match your stack (iOS, Go CLI, etc.).

Main speakers / sources

  • Kieran Klaassen: CTO at Cora, creator of the compound engineering plugin; demoed the full flow.
  • Interviewer / host (YouTube channel host, referred to as Peter): guided the demo and asked questions.
