Video summary
[Seminar Edit] Vibe Coding Techniques from an 8th-Year Silicon Valley Engineer
Main summary
Key takeaways
Main ideas & lessons
- AI is already mainstream for builders (“top 1%”), so the real issue is no longer “use AI or fall behind,” but the widening productivity gap among AI users.
- Same tools → wildly different outcomes because top performers:
- run more AI tasks in parallel (delegate work),
- produce stable, reliable results with AI agents,
- and redesign their workflow/culture/codebase for AI.
- Evaluation and hiring criteria are shifting:
- Less focus on “how much code you personally write,”
- More focus on “how stable and good your results are when running multiple AI agents.”
- The gap is driven by operational maturity: moving from merely using AI to commanding AI, and eventually to AI-native work.
Career/workflow framework: 4-level “AI nativeness” model
Think of it as a self-assessment ladder (no “right numbers,” but a progression).
Level 1: “AI-aware”
- You know about AI, but rarely use it.
- Example: you see AI as a source of automation or information and may have tried it once (e.g., used GPT once).
Level 2: “AI user / AI stage”
- You use AI as a tool.
- Typical behavior:
- Ask AI for code,
- Copy-paste outputs,
- Or code line-by-line while conversing.
- You remain the primary owner of the code (AI is assistant/support).
Level 3: “AI Maximum / delegation-based”
- You stop thinking: “how do I structure this task?”
- You think: “how do I split and delegate work to AI?”
- Typical behavior:
- Hand off whole work units (e.g., “process this ticket,” “analyze this bug and upload the response”).
- Later, you inspect results and decide next actions.
- Key difference vs Level 2:
- Workflow is redesigned around AI.
- It can’t work without AI (AI-centered workflow).
Level 4: “AI-native”
- AI becomes embedded in your thinking and decision-making (like “DNA”).
- Not just automation via cron jobs or hooks.
- Core change:
- You can’t easily remember how you worked without AI.
- To reach this, you often must:
- redesign your codebase/environment to be AI-friendly,
- do iterative trial-and-error (which takes time—hence the widening gap).
Where the gap widens (core mechanism)
- The main leap isn’t “sign up for GPT.”
- It’s the shift:
- AI user → delegation (Level 2 → 3) requires changing how you work.
- Delegation → AI-native (Level 3 → 4) requires changing the system:
- AI-ready codebase + context + guardrails + cost control.
Example: serial human work vs parallel AI delegation
- Scenario: 6 tasks (feature request, bug fix, performance, and meetings).
Old approach (Level 2 / human-driven serial)
- Work halts when you switch tasks/meetings.
- You might finish only 1–2 tasks per day.
Delegation approach (Level 3 / parallel AI agents)
- Launch multiple “Code/Agent” workflows:
- Agent 1: feature planning/analysis
- Agent 2: bug fixes / pricing changes
- Agent 3: profiling + performance report
- While you attend a meeting, AI keeps executing continuously.
- Result:
- Two agents near completion before lunch,
- Roughly “one person running 5 tasks in parallel” → up to 5× output, potentially “5–10 people worth” over time.
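The serial-versus-parallel contrast above can be sketched with a small simulation. This is only an illustration under stated assumptions: `run_agent` is a hypothetical stand-in that sleeps instead of calling a real agent, and the task names and durations are made up.

```python
import asyncio
import time


async def run_agent(name: str, seconds: float) -> str:
    """Hypothetical stand-in for a delegated agent task (sleeps instead of working)."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def run_serial(tasks: dict[str, float]) -> list[str]:
    """Human-driven style: each task starts only after the previous one finishes."""
    return [await run_agent(name, secs) for name, secs in tasks.items()]


async def run_parallel(tasks: dict[str, float]) -> list[str]:
    """Delegation style: launch every agent at once and collect the results later."""
    results = await asyncio.gather(*(run_agent(n, s) for n, s in tasks.items()))
    return list(results)


def timed(coro) -> tuple[list[str], float]:
    """Run a coroutine to completion and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro)
    return results, time.perf_counter() - start
```

With three equal tasks, the parallel run takes roughly as long as one task while the serial run takes the sum of all three. Real agents still need their results inspected afterwards, so the speedup is an upper bound.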
Practical techniques to overcome frequent bottlenecks
The speaker plans to focus on three areas where people get stuck when aiming for Level 4 AI-native.
1) AI-ready codebase (make your repo navigable to AI)
Key claim: the same AI model/tool can produce different quality outputs depending on how the codebase is presented to it.
- Why small projects feel fine: AI can often fit full context.
- Why company code fails: production repos have thousands of files, conventions, implicit rules, and limited context windows—AI guesses unless given a map.
Definition (core metaphor)
- AI-ready codebase = a codebase with a “map drawn for AI.”
- Mark where AI should look,
- Provide starting points,
- Mark what parts should not be touched,
- Provide onboarding-like guidance for AI (analogous to human onboarding docs).
Failure cases (what goes wrong without a map)
- Different names for the same meaning
- Example: one module uses Price, another uses Amount.
- Humans infer equivalence; AI may not.
- Result can look correct in tests but fail in production (wrong unit/value → outages).
- Implicit cross-repository dependencies
- Example: microservices (Payment/Order/User/Notification/Analytics) share API/legacy processing.
- A comment indicates a deprecation path, but CI breaks because dependent repos weren’t covered.
- AI can’t “search across” missing context → misses the dependency → outage risk.
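To make the first failure case concrete, here is a minimal sketch of how two names for "the same" value can hide a unit mismatch. The units are an assumption made for illustration (the summary does not say what the two modules actually stored): `Price` holds decimal dollars and `Amount` holds integer cents.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Price:
    """Hypothetical: one module stores money as decimal dollars."""
    dollars: Decimal


@dataclass(frozen=True)
class Amount:
    """Hypothetical: another module stores money as integer cents."""
    cents: int


def price_to_amount(price: Price) -> Amount:
    """Explicit conversion so the unit change is visible at the module boundary."""
    return Amount(cents=int(price.dollars * 100))


def charge(amount: Amount) -> int:
    """Accept only Amount; a Price passed by mistake fails loudly instead of silently."""
    if not isinstance(amount, Amount):
        raise TypeError(f"expected Amount (cents), got {type(amount).__name__}")
    return amount.cents
```

Distinct types plus one documented conversion function give an agent (and a reviewer) something concrete to follow, instead of relying on the reader to infer that the two names mean the same thing.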
How the speaker implemented “the map” (CLAUDE.md files)
- Create small module-specific CLAUDE.md files rather than one huge wiki.
- For each module, answer five questions (at least):
- What does the module do?
- How do I use it / how do I modify it?
- What must NOT be done? (non-obvious rules and anti-patterns)
- Dependencies / navigation guidance (where related modules are)
- Implicit knowledge that would otherwise be assumed by senior engineers
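The five questions can be enforced mechanically. The checker below is a minimal sketch, not the speaker's tooling: the section titles in `REQUIRED_SECTIONS` and the 35-line budget are assumptions chosen to mirror the questions and the file sizes mentioned in this summary.

```python
# Hypothetical section titles, one per question in the list above.
REQUIRED_SECTIONS = [
    "Purpose",             # What does the module do?
    "Usage",               # How do I use or modify it?
    "Do not",              # What must NOT be done?
    "Dependencies",        # Where are the related modules?
    "Implicit knowledge",  # What would a senior engineer assume?
]
MAX_LINES = 35  # assumed budget per module file


def check_module_doc(text: str) -> list[str]:
    """Return a list of problems found in one module-level map file."""
    lines = text.splitlines()
    # Collect heading titles from lines that start with '#'.
    headings = {
        line.lstrip("# ").strip().lower() for line in lines if line.startswith("#")
    }
    problems = [
        f"missing section: {section}"
        for section in REQUIRED_SECTIONS
        if section.lower() not in headings
    ]
    if len(lines) > MAX_LINES:
        problems.append(f"too long: {len(lines)} lines (max {MAX_LINES})")
    return problems
```

Running a check like this in CI keeps every module file short and complete, which fits the "compass, not dictionary" idea better than a free-form wiki page.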
“Compass vs dictionary” principle
- Don’t dump everything into context.
- Provide direction (where to go), not full explanations.
Anti-pattern (non-obvious rule) guardrails
- Add explicit “do not do X” lines where AI tends to veer off.
- Quality improves dramatically even with small amounts of such guidance.
Root file / context management rules
- Use a short root CLAUDE.md (roughly 100–200 lines max).
- Root acts as an index/map; detailed info lives in referenced documents.
Keep CLAUDE.md fresh (auto-update loop)
- “Decaying context is worse than having none.”
- Don’t rely on manual maintenance.
- Implement automation via:
- hooks/commands to refresh AI docs at session end,
- scheduled updates (daily/weekly),
- forced compaction/maintenance workflows.
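One way to drive such an update loop is to detect stale maps automatically. The sketch below assumes the map files are named `CLAUDE.md` and uses a deliberately simple rule that is not from the talk: a map is stale when any other file in the same directory was modified more recently.

```python
from pathlib import Path


def stale_docs(repo_root: Path, doc_name: str = "CLAUDE.md") -> list[Path]:
    """Return map files that are older than a sibling file in the same directory."""
    stale = []
    for doc in repo_root.rglob(doc_name):
        # Only direct siblings are compared; nested directories carry their own maps.
        siblings = [
            p for p in doc.parent.iterdir() if p.is_file() and p.name != doc_name
        ]
        if siblings and max(p.stat().st_mtime for p in siblings) > doc.stat().st_mtime:
            stale.append(doc)
    return sorted(stale)
```

A scheduled job or session-end hook could call `stale_docs` and hand the resulting list to an agent with the instruction to refresh only those files.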
Quantified effect (as claimed)
- Example: created 59 files of 25–35 lines each (≈1,000 lines/tokens total, as stated).
- Result:
- AI previously could “see” only ~5% of the codebase due to context limits,
- after maps, AI could scan effectively across the repo (~4,100 files).
- Tool-call waste reduced by ~40%.
2) Context & cost optimization (make AI-native affordable and stable)
Goal: cost control is a prerequisite for progressing to higher levels because AI usage must become extensive.
Cost drivers
- Cost grows with:
- tokens sent (context size)
- tokens output (verbosity)
- Cleaner context also improves quality.
Token efficiency boils down to 3 tactics
- Persistent context (don’t “re-teach” info every session)
- Procedure/prompting to reduce guessing
- Conversation hygiene (control session growth and format)
Common cost-wasting pattern: context swelling across many turns
- Input tokens keep accumulating turn after turn, because each turn resends the earlier conversation.
- If you keep everything in one long session (20+ turns), the total tokens sent grow much faster than the number of turns.
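The swelling effect is easy to quantify with a back-of-the-envelope model. The function below is a simplified sketch: it assumes every turn resends the full history as input and ignores caching and compaction, and the token figures used in the usage note are illustrative, not from the talk.

```python
def cumulative_input_tokens(turns: int, base_context: int, tokens_per_turn: int) -> int:
    """Total input tokens sent over a session when each turn resends the full history."""
    total = 0
    history = base_context
    for _ in range(turns):
        total += history            # the whole history goes in as input this turn
        history += tokens_per_turn  # this turn's exchange is appended for the next one
    return total
```

With a 5,000-token base context and 2,000 tokens added per turn, one 20-turn session sends 480,000 input tokens, while four separate 5-turn sessions send 180,000 in total. That gap is the arithmetic behind keeping sessions short and focused.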
Countermeasures (three concrete practices)
- Compact context proactively
- When context usage reaches ~30–40% of the window, trigger compaction
- Or automate: notify/freeze and compact periodically
- One task per session
- Split tasks into separate sessions/sub-agents
- Explicitly constrain output format
- For code: “explain only when asked”
- Keep responses concise to reduce output tokens
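The compaction trigger from the first countermeasure reduces to a one-line check. This is a minimal sketch: the default threshold of 0.35 is simply the midpoint of the 30–40% range above, and how a given tool reports its token usage is left as an assumption.

```python
def should_compact(used_tokens: int, window_tokens: int, threshold: float = 0.35) -> bool:
    """Return True once context usage reaches the given fraction of the window."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens >= threshold
```

A wrapper script or hook could call this after each turn and either notify the user or start compaction on its own.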
“Tone/output” control
- Tool-using workflows increase output/context.
- Prefer specifying what you need (location/operation) instead of vague requests.
Sub-agent isolation for “main context contamination”
- Use sub-agents to work in separate contexts.
- Send only final outputs back to the main agent.
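The isolation idea can be illustrated with a toy model of two contexts. Everything here is hypothetical: `Context` is just a list of messages, and the "summary" is a fixed string standing in for what a model would write.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Toy model of an agent's context: an ordered list of messages."""
    messages: list[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        self.messages.append(message)


def run_subagent(task: str, tool_outputs: list[str]) -> str:
    """Work in a fresh context; intermediate tool output never leaves this function."""
    scratch = Context()
    scratch.add(f"task: {task}")
    for output in tool_outputs:
        scratch.add(output)  # verbose logs, file dumps, search results, ...
    # Stand-in for a model-written summary of the scratch context.
    return f"{task}: reviewed {len(tool_outputs)} tool outputs"


def delegate(main: Context, task: str, tool_outputs: list[str]) -> None:
    """Only the sub-agent's final answer is appended to the main context."""
    main.add(run_subagent(task, tool_outputs))
```

However large the tool outputs are, the main context grows by a single short message per delegated task.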
Cache utilization strategy
- Cache reduces cost significantly, but can be invalidated.
- Two major ways to lose cache:
- Modifying CLAUDE.md mid-session (changes system prompt → cache reset)
- Cache expiring once its time limit passes
- Recommendation:
- don’t change CLAUDE.md mid-session; apply updates after session ends.
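Why a mid-session edit resets the cache can be shown with a simplified model in which the cache is keyed on the exact prompt prefix. This is a sketch under that assumption only: it does not reproduce any provider's real caching implementation, and the `Session` class is hypothetical.

```python
import hashlib


def cache_key(system_prompt: str, project_doc: str) -> str:
    """Simplified model: the cache is keyed on the exact prefix sent before the chat."""
    prefix = system_prompt + "\n" + project_doc
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()


class Session:
    """Hypothetical session that defers map-file updates until it ends."""

    def __init__(self, system_prompt: str, project_doc: str) -> None:
        self.system_prompt = system_prompt
        self.active_doc = project_doc
        self.pending_doc: str | None = None

    def propose_doc_update(self, new_doc: str) -> None:
        """Queue the update instead of changing the prefix mid-session."""
        self.pending_doc = new_doc

    def key(self) -> str:
        return cache_key(self.system_prompt, self.active_doc)

    def end(self) -> str:
        """Return the document to persist for the next session."""
        return self.pending_doc if self.pending_doc is not None else self.active_doc
```

Because the queued update is applied only in `end`, the key stays stable for the whole session and the next session starts from the refreshed document.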
Monitoring + optimization workflow
- Detect inefficiencies by analyzing Claude Code session logs.
- Build dashboards/visibility for the team.
- Fix in two modes:
- Active optimization: manually correct docs and compact now
- Passive optimization: enforce with hooks so systems do it automatically
Claimed implementation artifacts
- A “skill” to score and categorize inefficiency patterns.
- A “dashboard” (“Dashbird”) and reported waste estimates:
- accumulated waste $327
- costliest single pattern: $171
- repeatedly changing CLAUDE.md mid-session + excessive sub-agent calls were frequent.
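A scoring step for the two frequent patterns could look like the sketch below. The event records are a hypothetical format invented for illustration (real session logs will differ), and the limit of five sub-agent calls is an arbitrary assumption.

```python
from collections import Counter


def score_session(events: list[dict], max_subagent_calls: int = 5) -> Counter:
    """Count occurrences of two inefficiency patterns in one session's events.

    Each event is a hypothetical dict such as
    {"type": "file_edit", "path": "payments/CLAUDE.md"} or {"type": "subagent_call"}.
    """
    findings: Counter = Counter()
    subagent_calls = 0
    for event in events:
        kind = event.get("type")
        if kind == "file_edit" and str(event.get("path", "")).endswith("CLAUDE.md"):
            findings["claude_md_edited_mid_session"] += 1
        elif kind == "subagent_call":
            subagent_calls += 1
    if subagent_calls > max_subagent_calls:
        # Record how far the session went over the assumed limit.
        findings["excessive_subagent_calls"] = subagent_calls - max_subagent_calls
    return findings
```

Summing these counters across sessions gives the kind of per-pattern totals a team dashboard could display.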
3) Safe guardrails via “hooks” (prevent AI-generated code from causing outages)
Motivation: AI agents can produce plausible-looking work right up until they execute a dangerous action, and human review alone becomes a bottleneck.
Lesson from Amazon incidents (as cited)
- Amazon faced major outage/order issues.
- Cause traced to deploying AI-written code without review.
- Proposed response: require junior/mid engineers to get senior approval for AI-generated production deploys.
- Speaker argues this is inefficient long-term (humans become bottlenecks).
- Proposed real solution:
- System-level automatic blocking using hooks.
Definition (single-line)
- A hook is a script automatically executed just before/after the AI performs a specific action.
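A "before" hook of this kind can be sketched as a small script. The conventions here are assumptions rather than a documented contract: the hook runner is assumed to pass a JSON event on stdin with a `tool_input.command` field and to treat a non-zero exit code as "block". The blocked substrings are illustrative only; check your tool's hook documentation for the real payload and exit codes.

```python
import json
import sys

# Illustrative deny-list; a real one would come from team policy.
BLOCKED_SUBSTRINGS = ("rm -rf", "git push --force", "DROP TABLE")


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message); non-zero means 'block' in this sketch."""
    tool_input = event.get("tool_input") or {}
    command = str(tool_input.get("command", ""))
    for bad in BLOCKED_SUBSTRINGS:
        if bad in command:
            return 2, f"blocked: command contains {bad!r}"
    return 0, "ok"


def main() -> int:
    """Read one event from stdin and report the decision on stderr."""
    event = json.load(sys.stdin)
    code, message = decide(event)
    if code:
        print(message, file=sys.stderr)
    return code

# A real hook script would end with: sys.exit(main())
```

Keeping the decision in a pure function (`decide`) makes the policy easy to unit-test separately from the stdin and exit-code plumbing.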
Four practical hook types mentioned
- Build/test hook
- Run lint + tests + build before allowing progress.
- PR review hook with a separate “second eye” agent
- Don’t have the same agent review its own code (bias).
- Use a different sub-agent to review PRs.
- TDD hook
- Forbid editing/changes unless tests are written first.
- Service-failure pattern safety hook
- Turn past incident post-mortems/patterns into a script.
- Run automatically when PRs are created to block known dangerous failure patterns.
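As one example from the list above, the TDD hook can be approximated by inspecting a change set. This sketch makes two assumptions: test files follow the `test_*.py` or `*_test.py` naming convention, and "tests written first" is approximated by "the change set also touches a test file", which is weaker than true ordering.

```python
def is_test_file(path: str) -> bool:
    """Assumed convention: test_*.py or *_test.py."""
    name = path.rsplit("/", 1)[-1]
    return name.startswith("test_") or name.endswith("_test.py")


def tdd_gate(changed_files: list[str]) -> tuple[bool, str]:
    """Allow Python source edits only when the change set also touches a test file."""
    source = [f for f in changed_files if f.endswith(".py") and not is_test_file(f)]
    tests = [f for f in changed_files if is_test_file(f)]
    if source and not tests:
        return False, f"blocked: {len(source)} source file(s) changed without tests"
    return True, "ok"
```

A stricter version could compare timestamps or commit order, but even this coarse gate stops an agent from shipping untested source edits.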
Implementation method described
- Write post-mortems → categorize → turn into scripts → embed into hooks.
- The system replaces “relying on people” with repeatable safety mechanisms.
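The post-mortem-to-script pipeline can be sketched as a diff scanner. The incident IDs, descriptions, and regular expressions below are hypothetical examples, not patterns from the talk; in practice each one would be distilled from a real post-mortem.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentPattern:
    incident_id: str  # hypothetical post-mortem identifier
    description: str
    regex: re.Pattern


PATTERNS = [
    IncidentPattern(
        "PM-001",
        "HTTP call without a timeout",
        re.compile(r"requests\.(get|post)\((?![^)]*timeout=)"),
    ),
    IncidentPattern(
        "PM-002",
        "exception swallowed silently",
        re.compile(r"except\s*:\s*pass"),
    ),
]


def scan_diff(diff: str) -> list[str]:
    """Check only added lines of a unified diff against known incident patterns."""
    findings = []
    for line in diff.splitlines():
        # Skip removed/context lines and the '+++' file header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for pattern in PATTERNS:
            if pattern.regex.search(added):
                findings.append(
                    f"{pattern.incident_id}: {pattern.description}: {added.strip()}"
                )
    return findings
```

A PR-creation hook could fail the check whenever `scan_diff` returns a non-empty list, so each past incident becomes a permanent automated guard.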
Course / promotion info (what the speaker plans to teach)
- They collaborated with Fast Campus to create a longer course (about 2 hours total, described as more comprehensive).
- Topics to cover:
- How vibe coding methodology differs from basic prompting
- Vibe Coding / Agentic Engineering / “Hals Engineering” (as named)
- Writing AI-ready codebases and CLAUDE.md maps
- Context strategies + token efficiency strategies
- Hook design + TDD
- Safe deployment + AI-powered code review
- A practical build component:
- Build a Fintech SaaS MVP
- Add Agent PR review to remove review bottlenecks
- Add production considerations:
- performance profiling,
- DB optimization,
- error handling,
- logging,
- analytics (Google Analytics after launch),
- and an agent-assisted CI/deployment pipeline “while you sleep,” with safe integration.
Speakers / sources featured (as mentioned)
- Ha Jae-sang (speaker; operator of “real developer channel”)
- Meta (speaker’s employer; example repository/pipeline context)
- Claude Code / Anthropic Claude (tool referenced)
- Codex / OpenAI Codex (tool referenced)
- Gemini (tool referenced)
- GPT / ChatGPT (tool referenced)
- Amazon (referenced for outage/policy + “Kiro” internal AI coding tool)
- Kiro (Amazon’s in-house AI coding tool, as referenced)
- Fast Campus (collaboration partner for the course)
- Anthropic (company referenced re: coding-related revenue)
- Meta Engineering blog (cited as the source of an AI-agent failure example)