Video summary

[Seminar edit] Vibe-coding techniques from an 8th-year Silicon Valley engineer

Main ideas & lessons

  • AI is already mainstream for builders (“top 1%”), so the real issue is no longer “use AI or fall behind,” but the widening productivity gap among AI users.
  • Same tools → wildly different outcomes because top performers:
    • run more AI tasks in parallel (delegate work),
    • produce stable, reliable results with AI agents,
    • and redesign their workflow/culture/codebase for AI.
  • Evaluation and hiring criteria are shifting:
    • Less focus on “how much code you personally write,”
    • More focus on “how stable and good your results are when running multiple AI agents.”
  • The gap is driven by operational maturity: moving from merely using AI to commanding AI, and eventually to AI-native work.

Career/workflow framework: 4-level “AI nativeness” model

Think of it as a self-assessment ladder (no “right numbers,” but a progression).

  1. Level 1: “AI-aware”

    • You know about AI, but rarely use it.
    • Example: you follow news about AI and automation, and have perhaps tried a tool such as GPT once.
  2. Level 2: “AI user / AI stage”

    • You use AI as a tool.
    • Typical behavior:
      • Ask AI for code,
      • Copy-paste outputs,
      • Or code line-by-line while conversing.
    • You remain the primary owner of the code (AI is assistant/support).
  3. Level 3: “AI Maximum / delegation-based”

    • You stop thinking: “how do I structure this task?”
    • You think: “how do I split and delegate work to AI?”
    • Typical behavior:
      • Hand off whole work units (e.g., “process this ticket,” “analyze this bug and upload the response”).
      • Later, you inspect results and decide next actions.
    • Key difference vs Level 2:
      • Workflow is redesigned around AI.
      • It can’t work without AI (AI-centered workflow).
  4. Level 4: “AI-native”

    • AI becomes embedded in your thinking and decision-making (like “DNA”).
    • Not just automation via Cron/hooks.
    • Core change:
      • You can’t easily remember how you worked without AI.
    • To reach this, you often must:
      • redesign your codebase/environment to be AI-friendly,
      • do iterative trial-and-error (which takes time—hence the widening gap).

Where the gap widens (core mechanism)

  • The main leap isn’t “sign up for GPT.”
  • It’s the shift:
    • AI user → delegation (Level 2 → 3) requires changing how you work.
    • Delegation → AI-native (Level 3 → 4) requires changing the system:
      • AI-ready codebase + context + guardrails + cost control.

Example: serial human work vs parallel AI delegation

  • Scenario: 6 tasks (feature request, bug fix, performance, and meetings).

Old approach (Level 2 / human-driven serial)

  • Work halts when you switch tasks/meetings.
  • You might finish only 1–2 tasks per day.

Delegation approach (Level 3 / parallel AI agents)

  • Launch multiple “Code/Agent” workflows:
    • Agent 1: feature planning/analysis
    • Agent 2: bug fixes / pricing changes
    • Agent 3: profiling + performance report
  • While you attend a meeting, AI keeps executing continuously.
  • Result:
    • Two agents near completion before lunch,
    • Roughly “one person running 5 tasks in parallel” → up to 5× output, potentially “5–10 people worth” over time.
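The serial-versus-parallel contrast above can be sketched in a few lines. This is only an illustration under stated assumptions: `run_agent` is a hypothetical stand-in that sleeps instead of calling a real agent, and the task names are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_agent(task: str, seconds: float = 0.1) -> str:
    """Hypothetical stand-in for a long-running agent job (sleeps, calls no model)."""
    time.sleep(seconds)
    return f"{task}: done"


def run_serial(tasks: list[str]) -> list[str]:
    """Human-driven style: one task at a time, each blocking the next."""
    return [run_agent(task) for task in tasks]


def run_parallel(tasks: list[str]) -> list[str]:
    """Delegation style: every task gets its own worker and runs concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        # pool.map preserves the order of the input tasks.
        return list(pool.map(run_agent, tasks))
```

Calling `run_serial` and `run_parallel` on the same task list returns the same results, but the parallel version finishes in roughly the time of the slowest single task. The human's job shifts to inspecting the returned results.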

Practical techniques to overcome frequent bottlenecks

The speaker plans to focus on three areas where people get stuck when aiming for Level 4 AI-native.


1) AI-ready codebase (make your repo navigable to AI)

Key claim: the same AI model/tool can produce different quality outputs depending on how the codebase is presented to it.

  • Why small projects feel fine: AI can often fit full context.
  • Why company code fails: production repos have thousands of files, conventions, implicit rules, and limited context windows—AI guesses unless given a map.

Definition (core metaphor)

  • AI-ready codebase = a codebase with a “map drawn for AI.”
    • Mark where AI should look,
    • Provide starting points,
    • Mark what parts should not be touched,
    • Provide onboarding-like guidance for AI (analogous to human onboarding docs).

Failure cases (what goes wrong without a map)

  • Different names for the same meaning
    • Example: one module uses Price, another uses Amount.
    • Humans infer equivalence; AI may not.
    • Result can look correct in tests but fail in production (wrong unit/value → outages).
  • Implicit cross-repository dependencies
    • Example: microservices (Payment/Order/User/Notification/Analytics) share API/legacy processing.
    • A comment indicates a deprecation path, but CI breaks because dependent repos weren’t covered.
    • AI can’t “search across” missing context → misses the dependency → outage risk.

How the speaker implemented “the map” (CLAUDE.md files)

  • Create small module-specific CLAUDE.md files rather than one huge wiki.
  • For each module, answer five questions (at least):
    1. What does the module do?
    2. How do I use it / how do I modify it?
    3. What must NOT be done? (non-obvious rules and anti-patterns)
    4. Dependencies / navigation guidance (where related modules are)
    5. Implicit knowledge that would otherwise be assumed by senior engineers
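A minimal sketch of how the five questions could be enforced, assuming each module map is a Markdown file with one heading per question. The heading names in `REQUIRED_SECTIONS` and the sample content are assumptions for illustration, not the speaker's actual template.

```python
# Hypothetical heading names, one per question in the list above.
REQUIRED_SECTIONS = (
    "Purpose",
    "Usage",
    "Do not",
    "Dependencies",
    "Implicit knowledge",
)

# Illustrative module map; the module and field names are made up.
EXAMPLE_MAP = """\
# payments
## Purpose
Charges customers and records settlements.
## Usage
Start at service.py; add new providers under providers/.
## Do not
Do not mix Price and Amount fields; they use different units.
## Dependencies
Order and Notification services consume the events emitted here.
"""


def missing_sections(doc_text: str) -> list[str]:
    """Return the required sections that have no matching heading in the map."""
    headings = {
        line.lstrip("# ").strip().lower()
        for line in doc_text.splitlines()
        if line.startswith("#")
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]
```

Here `missing_sections(EXAMPLE_MAP)` reports that the "Implicit knowledge" section is absent. A check like this could run in CI so that incomplete maps are caught early.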

“Compass vs dictionary” principle

  • Don’t dump everything into context.
  • Provide direction (where to go), not full explanations.

Anti-pattern / non-obvious-rule guardrails

  • Add explicit “do not do X” lines where AI tends to veer off.
  • Quality improves dramatically even with small amounts of such guidance.

Root file / context management rules

  • Use a short root CLAUDE.md (roughly 100–200 lines max).
  • Root acts as an index/map; detailed info lives in referenced documents.

Keep CLAUDE.md fresh (auto-update loop)

  • “Decaying context is worse than having none.”
  • Don’t rely on manual maintenance.
  • Implement automation via:
    • hooks/commands to refresh AI docs at session end,
    • scheduled updates (daily/weekly),
    • forced compaction/maintenance workflows.
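One way to detect decaying maps automatically is to compare modification times. The sketch below assumes one `CLAUDE.md` per module directory and treats a map as stale when any source file beneath it is newer. The function name and the default suffix filter are illustrative choices, not something the talk specifies.

```python
from pathlib import Path


def stale_maps(
    repo_root: Path,
    doc_name: str = "CLAUDE.md",
    suffixes: tuple[str, ...] = (".py",),
) -> list[Path]:
    """Return map files that are older than the newest source file beneath them."""
    stale = []
    for doc in repo_root.rglob(doc_name):
        doc_mtime = doc.stat().st_mtime
        # Newest modification time among source files in the map's directory tree.
        newest_source = max(
            (
                path.stat().st_mtime
                for path in doc.parent.rglob("*")
                if path.is_file() and path.suffix in suffixes
            ),
            default=0.0,
        )
        if newest_source > doc_mtime:
            stale.append(doc)
    return sorted(stale)
```

A scheduled job or session-end hook could call `stale_maps` and pass the result to whatever workflow regenerates the maps.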

Quantified effect (as claimed)

  • Example: created 59 files of 25–35 lines each (reported as roughly 1,000 lines/tokens in total).
  • Result:
    • AI previously could “see” only ~5% of the codebase due to context limits,
    • after maps, AI could scan effectively across the repo (~4,100 files).
  • Tool-call waste reduced by ~40%.

2) Context & cost optimization (make AI-native affordable and stable)

Goal: cost control is a prerequisite for progressing to higher levels because AI usage must become extensive.

Cost drivers

  • Cost grows with:
    • tokens sent (context size)
    • tokens output (verbosity)
  • Cleaner context also improves quality.

Token efficiency boils down to 3 tactics

  1. Persistent context (don’t “re-teach” info every session)
  2. Procedure/prompting to reduce guessing
  3. Conversation hygiene (control session growth and format)

Common cost-wasting pattern: context swelling across many turns

  • Tokens from earlier turns keep accumulating.
  • If you keep everything in one long session (20+ turns), every new turn resends the whole history, so total cost grows much faster than the number of turns.
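The swelling effect is easy to quantify with a toy model. It assumes every turn resends the full history as input and ignores caching. The token counts are illustrative numbers, not measurements from the talk.

```python
def cumulative_input_tokens(turns: int, base_context: int, tokens_per_turn: int) -> int:
    """Total input tokens for a session in which each turn resends the whole history."""
    total = 0
    context = base_context
    for _ in range(turns):
        total += context            # the full current context is sent on this turn
        context += tokens_per_turn  # this turn's exchange is appended for the next one
    return total


# One 20-turn session versus four 5-turn sessions covering the same 20 turns.
one_long_session = cumulative_input_tokens(20, base_context=5_000, tokens_per_turn=2_000)
four_short_sessions = 4 * cumulative_input_tokens(5, base_context=5_000, tokens_per_turn=2_000)
```

Under these assumptions the long session sends 480,000 input tokens and the four short sessions send 180,000 in total. The history term grows with the square of the turn count, which is why long sessions get expensive.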

Countermeasures (three concrete practices)

  • Compact context proactively
    • When context usage reaches ~30–40% of the context window, trigger compaction
    • Or automate: notify/freeze and compact periodically
  • One task per session
    • Split tasks into separate sessions/sub-agents
  • Explicitly constrain output format
    • For code: “explain only when asked”
    • Keep responses concise to reduce output tokens
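The proactive-compaction rule can be expressed as a simple threshold check. This is a sketch only: the 0.35 default sits inside the 30–40% range mentioned above, and `summary_tokens` (the context size left after compaction) is an assumed parameter.

```python
def should_compact(used_tokens: int, window_tokens: int, threshold: float = 0.35) -> bool:
    """True once context usage reaches the given fraction of the window."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens >= threshold


def compaction_points(
    turn_tokens: list[int],
    window_tokens: int,
    threshold: float = 0.35,
    summary_tokens: int = 1_000,
) -> list[int]:
    """Return the 1-based turn numbers at which compaction would be triggered."""
    used = 0
    points = []
    for turn, tokens in enumerate(turn_tokens, start=1):
        used += tokens
        if should_compact(used, window_tokens, threshold):
            points.append(turn)
            used = summary_tokens  # assume compaction leaves a short summary behind
    return points
```

For ten turns of 10,000 tokens in a 100,000-token window, this triggers compaction after turns 4 and 8. Usage therefore stays below the threshold for most of the session.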

“Tone/output” control

  • Tool-using workflows increase output/context.
  • Prefer specifying what you need (location/operation) instead of vague requests.

Sub-agent isolation for “main context contamination”

  • Use sub-agents to work in separate contexts.
  • Send only final outputs back to the main agent.

Cache utilization strategy

  • Cache reduces cost significantly, but can be invalidated.
  • Two major ways to lose cache:
    1. Modifying CLAUDE.md mid-session (changes the system prompt → cache reset)
    2. Letting the cache expire (cached entries are dropped after a period of inactivity)
  • Recommendation:
    • don’t change CLAUDE.md mid-session; apply updates after the session ends.
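Why a mid-session edit is so costly can be shown with a toy model of prefix caching. It assumes a cache can only be reused for the leading part of a request that is identical to the previous one. Real providers differ in detail, and the segment strings below are placeholders.

```python
def shared_prefix_len(previous: list[str], current: list[str]) -> int:
    """Count how many leading segments two requests have in common."""
    count = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        count += 1
    return count


# Each request is modelled as [system prompt, then the messages so far].
turn_1 = ["SYSTEM: map v1", "user: fix bug", "assistant: patch"]
turn_2_unchanged = turn_1 + ["user: add test"]
turn_2_map_edited = ["SYSTEM: map v2"] + turn_1[1:] + ["user: add test"]
```

With the map untouched, all three earlier segments are reusable (`shared_prefix_len(turn_1, turn_2_unchanged)` is 3). Once the system prompt changes, the shared prefix drops to 0 and everything after it is processed again at full price.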

Monitoring + optimization workflow

  • Detect inefficiencies by analyzing Claude Code session logs.
  • Build dashboards/visibility for the team.
  • Fix in two modes:
    • Active optimization: manually correct docs and compact now
    • Passive optimization: enforce with hooks so systems do it automatically

Claimed implementation artifacts

  • A “skill” to score and categorize inefficiency patterns.
  • A dashboard, with reported waste estimates:
    • accumulated waste: $327
    • the single most expensive pattern: $171
    • the most frequent patterns were repeatedly changing CLAUDE.md mid-session and excessive sub-agent calls.
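A pattern scorer of this kind might look like the sketch below. The event format (dicts with `session` and `type` keys), the event type names, and the thresholds are all hypothetical. This is not the actual Claude Code log schema, which would need its own parser.

```python
from collections import Counter


def detect_patterns(
    events: list[dict],
    max_turns: int = 20,
    max_subagents: int = 5,
) -> Counter:
    """Count sessions that exhibit each inefficiency pattern (hypothetical event schema)."""
    per_session: dict[str, Counter] = {}
    for event in events:
        per_session.setdefault(event["session"], Counter())[event["type"]] += 1

    found: Counter = Counter()
    for counts in per_session.values():
        if counts["turn"] > max_turns:
            found["long_session"] += 1
        if counts["map_edit"] > 0:
            found["map_edited_mid_session"] += 1
        if counts["subagent_call"] > max_subagents:
            found["excessive_subagents"] += 1
    return found
```

The resulting counts could feed a team dashboard. Attaching a cost estimate to each pattern would require token and price data that this sketch does not model.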

3) Safe guardrails via “hooks” (prevent AI-generated code from causing outages)

Motivation: AI agents can produce work that looks right, right up until they execute a dangerous action. Human review alone becomes a bottleneck.

Lesson from Amazon incidents (as cited)

  • Amazon faced major outage/order issues.
  • Cause traced to deploying AI-written code without review.
  • Proposed response: require junior/mid engineers to get senior approval for AI-generated production deploys.
  • Speaker argues this is inefficient long-term (humans become bottlenecks).
  • Proposed real solution:
    • System-level automatic blocking using hooks.

Definition (single-line)

  • A hook is a script automatically executed just before/after the AI performs a specific action.
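As a rough illustration of a "before" hook, the sketch below assumes the hook is handed a JSON-like event describing the pending action and signals a block with a non-zero exit code. The field names (`tool_input`, `command`), the exit code, and the blocked substrings are illustrative assumptions. Check your agent tool's hook documentation for the real contract.

```python
# Illustrative deny-list; a real one would come from team policy.
BLOCKED_SUBSTRINGS = ("rm -rf /", "git push --force", "DROP TABLE")

ALLOW = 0
BLOCK = 2  # assumed convention: a non-zero exit code stops the action


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, reason) for a pending shell command described by `event`."""
    tool_input = event.get("tool_input") or {}
    command = str(tool_input.get("command", ""))
    for needle in BLOCKED_SUBSTRINGS:
        if needle in command:
            return BLOCK, f"blocked: command contains {needle!r}"
    return ALLOW, ""
```

A thin wrapper script would parse the event from standard input, call `decide`, print the reason to standard error, and exit with the returned code.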

Four practical hook types mentioned

  1. Build/test hook
    • Run lint + tests + build before allowing progress.
  2. PR review hook with a separate “second eye” agent
    • Don’t have the same agent review its own code (bias).
    • Use a different sub-agent to review PRs.
  3. TDD hook
    • Forbid editing/changes unless tests are written first.
  4. Service-failure pattern safety hook
    • Turn past incident post-mortems/patterns into a script.
    • Run automatically when PRs are created to block known dangerous failure patterns.

Implementation method described

  • Write post-mortems → categorize → turn into scripts → embed into hooks.
  • The system replaces “relying on people” with repeatable safety mechanisms.
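The post-mortem-to-script pipeline could end in something like the following sketch, which scans the added lines of a unified diff for patterns tied to past incidents. The incident IDs, regular expressions, and messages are invented for illustration. Real rules would come from your own post-mortems.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentRule:
    incident_id: str
    pattern: re.Pattern
    message: str


# Hypothetical rules, each derived from a (made-up) past incident.
RULES = (
    IncidentRule("PM-001", re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE),
                 "destructive schema change outside a migration"),
    IncidentRule("PM-002", re.compile(r"timeout\s*=\s*None"),
                 "network call without a timeout"),
)


def scan_diff(diff_text: str, rules: tuple[IncidentRule, ...] = RULES) -> list[str]:
    """Return one finding per rule match on lines added by a unified diff."""
    findings = []
    for line in diff_text.splitlines():
        # Only inspect added lines; skip the '+++ b/file' header lines.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for rule in rules:
            if rule.pattern.search(line):
                findings.append(f"{rule.incident_id}: {rule.message}")
    return findings
```

Running `scan_diff` when a PR is created and failing the check on any finding turns each post-mortem into a permanent, automatic guard.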

Course / promotion info (what the speaker plans to teach)

  • They collaborated with Fast Campus to create a longer course (about 2 hours total, described as more comprehensive).
  • Topics to cover:
    • How vibe-coding methodology differs from basic prompting
    • Vibe Coding / Agentic Engineering / “Hals Engineering” (as named)
    • Writing AI-ready codebases and CLAUDE.md maps
    • Context strategies + token efficiency strategies
    • Hook design + TDD
    • Safe deployment + AI-powered code review
  • A practical build component:
    • Build a Fintech SaaS MVP
    • Add Agent PR review to remove review bottlenecks
    • Add production considerations:
      • performance profiling,
      • DB optimization,
      • error handling,
      • logging,
      • analytics (Google Analytics after launch),
      • and an agent-assisted CI/deployment pipeline “while you sleep,” with safe integration.

Speakers / sources featured (as mentioned)

  • Ha Jae-sang (speaker; operator of “real developer channel”)
  • Meta (speaker’s employer; example repository/pipeline context)
  • Claude Code / Anthropic Claude (tool referenced)
  • Codex / OpenAI Codex (tool referenced)
  • Gemini (tool referenced)
  • GPT / ChatGPT (tool referenced)
  • Amazon (referenced for outage/policy + “Kiro” internal AI coding tool)
  • Kiro (Amazon’s in-house AI coding tool, as referenced)
  • Fast Campus (collaboration partner for the course)
  • “Entropy” (as transcribed, likely Anthropic; company referenced re: coding-related revenue)
  • Engineering blog / Meta Engineering blog (cited as the source of an AI-agent failure example)
