Summary of "Stop Prompting Claude. Use Karpathy's Method Instead."
Main idea / premise
The video claims that most people prompt Claude (and similar LLMs) “wrong,” and argues for a faster, more reliable approach attributed to Andrej Karpathy. The method is broken into three layers—spec, verifier, and environment—plus a concluding “one thing to focus on” in the age of AI.
Layer 1: Spec (bridge human goals to what the model can execute)
Key limitation highlighted
State-of-the-art models struggle with context-driven decisions because they lack a reliable signal for the real-world “situation.”
- Example: told that the “car wash is 50m away,” the model answers “walk,” because the human context (the car, and what the person actually needs) isn’t captured well by the model.
Core concept
A spec is a structured format for delivering your understanding to the model—so the model operates within the correct framing.
Criticism of the common approach
Karpathy is presented as disliking high-level “plan mode” as too superficial. Instead, you should design a detailed spec through collaboration with the agent.
How to build the spec (as described)
- Uncover your goal
  - Distinguish between a task (“create end-of-month report”) and the underlying decision/conclusion the task supports.
  - Technique: have Claude “interview” you to identify the real goal.
- Work agile, not waterfall
  - Waterfall: give the agent everything at once and see the final result later.
  - Agile speccing: break scope into smaller pieces, show checkpoints, review, adjust, repeat.
  - Technique: bias the spec toward smaller, compartmentalized segments.
- Be precise and use your brain
  - Precision reduces assumptions; assumptions increase drift.
  - Add instructions like: “verify key decisions explicitly” so the model can’t silently skip important choices.
Output
A final “modern engineering” prompt/process to produce a tightly scoped, goal-aligned spec.
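As a rough illustration of the spec ideas above (goal before task, small segments with checkpoints, explicit decisions to verify), here is a minimal Python sketch. The class names, field names, and the sample goal are all hypothetical and are not taken from the video; it only shows one way such a spec could be held as data and rendered into prompt text.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One small, reviewable slice of the overall scope."""
    name: str
    deliverable: str
    checkpoint: str  # what the human reviews before the next segment starts


@dataclass
class Spec:
    """Hypothetical spec container; field names are illustrative only."""
    goal: str  # the decision the work supports, not just the task
    segments: list[Segment] = field(default_factory=list)
    decisions_to_verify: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the spec as plain text that can be pasted into a prompt."""
        lines = [f"GOAL: {self.goal}", "", "SEGMENTS (stop for review after each):"]
        for i, seg in enumerate(self.segments, start=1):
            lines.append(f"{i}. {seg.name}: {seg.deliverable}")
            lines.append(f"   Checkpoint: {seg.checkpoint}")
        lines.append("")
        lines.append("VERIFY THESE DECISIONS EXPLICITLY BEFORE PROCEEDING:")
        lines.extend(f"- {d}" for d in self.decisions_to_verify)
        return "\n".join(lines)


# Illustrative values only.
spec = Spec(
    goal="Decide whether to renew the vendor contract",
    segments=[
        Segment("Data pull", "monthly cost table", "confirm the date range"),
        Segment("Report draft", "end-of-month report", "review section outline"),
    ],
    decisions_to_verify=["Which cost categories count as vendor spend"],
)
print(spec.render())
```

Keeping the segments as separate records makes it easy to hand the agent one segment at a time, which is the "agile" habit the video argues for.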
Layer 2: Verifier (make evaluation measurable and enforce verification)
Problem addressed
- It’s frustrating to review LLM output.
- Unlike humans, LLMs can’t naturally handle “non-measurable” qualities.
- They may confidently fail when missing the right context.
Karpathy framing referenced: “animals versus ghosts”
- Animals = human-like motivators/emotion-driven behavior.
- Ghosts (LLMs) = statistical simulation; they don’t “respond better” to shouting/pleading—so verification is the main lever.
Three verification tactics
- Set evaluation criteria up front
  - Vague: “make the report look good.”
  - Precise: “report must have three sections, each ends with a recommendation.”
  - Add this into the verification prompt.
- Use a second AI model as a critic
  - “Second librarian” idea: a different model checks/grades the first model’s output using different knowledge/assumptions.
  - Mentions: for Claude Code workflows, use the Codex plugin to run consistency checks or validate steps via another system (e.g., “ensure both systems agree”).
- Pull external signal where possible
  - Technical: verify deployment by connecting Claude to the deployment system and confirming success.
  - Non-technical: load historical reports to enforce the required format/spec during verification.
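The "precise" criterion above (three sections, each ending with a recommendation) is concrete enough to check mechanically. The sketch below is one possible checker, not something shown in the video. It assumes sections start with a line beginning "## " and that a recommendation is a final line starting with "Recommendation:"; both conventions are made up for the example.

```python
import re


def check_report(text: str, required_sections: int = 3) -> list[str]:
    """Return the failed criteria; an empty list means the report passes."""
    failures = []
    # Assumption: each section starts with a heading line like "## Title".
    parts = re.split(r"^## .*$", text, flags=re.MULTILINE)
    sections = [p.strip() for p in parts[1:]]  # parts[0] precedes the first heading
    if len(sections) != required_sections:
        failures.append(
            f"expected {required_sections} sections, found {len(sections)}"
        )
    for i, body in enumerate(sections, start=1):
        last_line = body.splitlines()[-1] if body else ""
        # Assumption: a recommendation is marked by a "Recommendation:" prefix.
        if not last_line.startswith("Recommendation:"):
            failures.append(f"section {i} does not end with a recommendation")
    return failures


good = """## Revenue
Up 4%.
Recommendation: keep pricing.
## Costs
Flat.
Recommendation: renegotiate hosting.
## Risks
One churn risk.
Recommendation: call the customer.
"""
bad = "## Revenue\nUp 4%.\n"

print(check_report(good))  # no failures expected
print(check_report(bad))   # two failures: section count and missing recommendation
```

The returned failure messages can be pasted back into the verification prompt, so the model sees exactly which criterion it missed.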
Claim
Claude Code creator Boris Cherny is quoted as saying that, with a feedback loop, Claude Code can produce 2–3x higher-quality results (as stated in the video).
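A feedback loop of this kind can be sketched as generate, verify, and retry with the failures fed back in. The code below is a generic illustration under stated assumptions: `generate` stands in for any model call and `verify` for any checker or critic model, and neither is a real Claude Code API. The stub functions exist only to make the example runnable.

```python
from typing import Callable


def run_with_feedback(
    generate: Callable[[str], str],
    verify: Callable[[str], list[str]],
    task: str,
    max_rounds: int = 3,
) -> tuple[str, list[str]]:
    """Generate, verify, and retry with the failures appended to the prompt."""
    prompt = task
    output, failures = "", ["not run"]
    for _ in range(max_rounds):
        output = generate(prompt)
        failures = verify(output)
        if not failures:
            break  # verification passed, stop early
        # Feed the concrete failures back so the next attempt can address them.
        prompt = task + "\n\nFix these verification failures:\n" + "\n".join(
            f"- {f}" for f in failures
        )
    return output, failures


def fake_generate(prompt: str) -> str:
    # Stand-in for a model call: it only "succeeds" once it sees feedback.
    return "report with summary" if "Fix these" in prompt else "report"


def needs_summary(output: str) -> list[str]:
    # Stand-in verifier with a single measurable criterion.
    return [] if "summary" in output else ["missing summary"]


result, remaining = run_with_feedback(fake_generate, needs_summary, "Write the report")
print(result, remaining)
```

Capping the loop with `max_rounds` keeps a stubborn failure from retrying forever; whatever failures remain are returned for a human to review.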
Layer 3: Environment (tooling + persistent workspace that improves over time)
Analogy
- Spec = blueprint pinned to the wall
- Verifier = quality check station
- Environment = the workshop/tooling where the system operates
Key complaint
Most people “build the workshop from scratch” each time; merely keeping chat history isn’t the same.
How to build the environment (practical components)
- Create and maintain a CLAUDE.md
  - Claude reads/injects it automatically on each prompt.
  - Include a verification plan so verification is not optional.
  - The video describes sections like:
    - repo/workspace description
    - routing and “custom skills”
    - knowledge architecture (where to look for info)
    - key rules that must always be followed
- Build an “LLM knowledge base” (Karpathy concept)
  - Create a local folder/retrieval structure so Claude can ingest materials and quickly find the right references.
  - Emphasizes: “your data is your moat.”
- Build reusable custom skills
  - If you do something repeatedly, make a custom skill/handbook for it.
  - Skills compound with usage (e.g., “run water through the hose”).
- Add true guardrails (rule enforcement at tool level)
  - Prompt-only rules like “don’t make up information” aren’t guaranteed.
  - For critical safety/accuracy, enforce restrictions using tool hooks (e.g., block edits to a protected folder like /important, don't edit).
  - Guardrail categories:
    - Always do (autopilot-safe)
    - Ask first (double-check)
    - Never do (cannot be crossed)
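To make the tool-level guardrail idea concrete, here is a minimal Python sketch of a hook script's decision logic. It assumes the hook runner passes a JSON payload on stdin with `tool_name` and `tool_input.file_path` fields and treats exit code 2 as "block this tool call"; that matches my understanding of Claude Code's pre-tool-use hooks, but it should be checked against the current hooks documentation. The directory names and the three-way policy mapping are hypothetical.

```python
import json
import sys
from pathlib import PurePosixPath

# Hypothetical policy; directory names are placeholders, not from the video.
NEVER_EDIT = ("important",)  # "never do": hard block
ASK_FIRST = ("config",)      # "ask first": require human confirmation


def classify(file_path: str) -> str:
    """Map a file path to one of the three guardrail categories."""
    parts = PurePosixPath(file_path).parts
    if any(p in NEVER_EDIT for p in parts):
        return "never"
    if any(p in ASK_FIRST for p in parts):
        return "ask"
    return "always"


def decide(payload: dict) -> int:
    """Return an exit code: 0 allows the tool call, 2 blocks it (assumed convention)."""
    if payload.get("tool_name") not in {"Edit", "Write"}:
        return 0  # only file-modifying tools are guarded in this sketch
    path = payload.get("tool_input", {}).get("file_path", "")
    category = classify(path)
    if category == "never":
        print(f"Blocked: {path} is in a protected folder.", file=sys.stderr)
        return 2
    if category == "ask":
        print(f"Blocked: ask the user before editing {path}.", file=sys.stderr)
        return 2
    return 0


def main() -> None:
    """Entry point when wired up as a hook: read the payload, exit with the decision."""
    sys.exit(decide(json.load(sys.stdin)))


# Local demonstration with hand-written payloads (main() is not called here).
demo = {"tool_name": "Edit", "tool_input": {"file_path": "/repo/important/notes.md"}}
print(decide(demo))
print(decide({"tool_name": "Edit", "tool_input": {"file_path": "src/app.py"}}))
```

Because the check runs outside the model, a prompt cannot talk its way past it, which is the difference the video draws between prompt-only rules and true guardrails.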
Result
An “end-to-end Karpathy method” combining spec → verifier → environment.
“One thing to focus on” in the age of AI
- Quote (as presented): “You can outsource your thinking, but you can’t outsource your understanding.”
- The video interprets this as: the method’s layers are centered on your understanding of the bigger picture—goals, what matters, and how to direct AI work.
Main speakers / sources (as indicated)
- Andrej Karpathy (former Head of AI at Tesla; referenced via AISN 2026 talk + interviews)
- The narrator / video creator (summarizes/adapts Karpathy’s method and demonstrates prompts/process)
- Boris Cherny (creator of Claude Code, quoted about feedback loops improving quality)
- Mentions: Claude, Claude Code, Codex plugin (tools referenced, not individual speakers)
Category
Technology