Summary of "Why Claude Keeps Hitting Usage Limits (& how to fix it)"
Concise summary
Claude now enforces 5-hour “current session” usage limits more strictly during peak hours. As a result, chats can stop mid-task when the session allowance is exhausted, even if your weekly quota remains.
Problem and recent change
- Claude enforces a 5-hour “current session” usage window more strictly during peak hours to manage demand.
- Peak windows: ~5:00–11:00 AM PT (1:00–7:00 PM GMT).
- Weekly limits are unchanged, but you’ll consume the 5-hour allotment faster during peak times — causing chats to stop mid-task when the session allowance ends.
Key technical concepts
- Session usage vs length (context) limits
- Session: a 5-hour usage window that starts when you send your first message; tracked in Settings → Usage.
- Length/context window: the amount of text Claude can hold/process in a single chat (measured in tokens). Most plans: ~200k tokens per chat; some plans up to 1M.
- Tokens
- Internal unit (roughly 0.5–0.75 words per token).
- Every message, response, file, and tool call consumes tokens.
- Model differences
- Opus 4.6 (higher-end) burns tokens about 5× faster than Sonnet/Sonic 4.6 (more efficient).
- Use Opus only for heavy tasks when you have spare session budget.
- Tools/connectors/plugins/MCPs/skills
- These can consume context tokens even when idle.
- Some connectors (e.g., Apify, Chrome integrations, Drive/Notion/Gmail) are major token sinks.
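The word-to-token ratio above can be turned into a quick back-of-envelope estimator. This is only a sketch of the heuristic the summary states (~0.5–0.75 words per token); the `words_per_token` default is an assumption, and real tokenizers will give different counts.

```python
def estimate_tokens(text: str, words_per_token: float = 0.75) -> int:
    # Rough heuristic: at ~0.5-0.75 words per token, dividing the word
    # count by that ratio gives a ballpark token count. Real tokenizers
    # differ, so treat this as an estimate, not a measurement.
    words = len(text.split())
    return round(words / words_per_token)
```

Using the lower end of the range (0.5 words per token) gives a more conservative, larger estimate for budgeting purposes.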
- Useful commands/features
- /context command: reveals a per-chat token breakdown (system prompt, tools, skills, files, memory) so you can see what’s burning tokens.
- Extended Thinking: a feature that increases internal computation and token use. Toggle it off when not needed.
- Extra usage/credit: you can add small credit (pay-as-you-go) to finish tasks when session limits are hit.
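To see why the /context breakdown matters, here is a toy tracker that mimics the idea of attributing token spend to categories against a fixed window. The 200k default matches the per-chat limit the summary cites for most plans; the category names and class itself are illustrative, not part of Claude's actual tooling.

```python
class ContextBudget:
    """Toy per-chat token ledger, loosely mirroring the kind of
    breakdown /context reports. Hypothetical helper for illustration."""

    def __init__(self, limit: int = 200_000):
        self.limit = limit
        self.spent: dict[str, int] = {}

    def add(self, category: str, tokens: int) -> None:
        # e.g. category = "system_prompt", "tools", "files", "memory"
        self.spent[category] = self.spent.get(category, 0) + tokens

    def remaining(self) -> int:
        return self.limit - sum(self.spent.values())
```

The point of tracking by category is the same as the /context command's: once you see that, say, idle connectors account for a large fixed cost per chat, disabling them is an obvious win.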
Practical fixes — 14 tips
- Monitor usage: check Settings → Usage to see session and weekly usage and reset timers.
- Use /context: inspect what’s consuming tokens in the chat (system prompt, tools, MCPs, files).
- Choose models wisely: default to Sonnet/Sonic for efficiency; use Opus for intensive jobs when you have spare session budget.
- Turn off Extended Thinking when unnecessary.
- Disable unused tools/connectors/MCPs/plugins (especially duplicate connectors like multiple Chrome integrations).
- Start a new chat per task: avoid long mixed-topic threads that keep accumulating context.
- Be specific in initial prompts: tell Claude exactly what to do to avoid extra back-and-forth and unnecessary processing.
- Batch requests into single prompts: group multiple edits/asks together to reduce repeated overhead.
- Preprocess heavy documents: convert PDFs to markdown/plain text before ingestion (markdown is LLM-friendly and lighter).
- Use other services for extraction: e.g., Perplexity or other tools to convert/extract content before loading into Claude.
- Use Claude Projects / RAG: projects can retrieve only relevant snippets instead of loading all files into the chat.
- Trim custom instructions & keep them short (recommend <500 words): every chat load reads instructions and costs tokens.
- Remove pointless/idle files from projects/co-work folders to avoid accidental extra context processing.
- Build and use Claude Skills for repeatable workflows: skill descriptions load minimal context and encode repeatable recipes so Claude doesn’t re-learn steps.
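The batching tip above can be made concrete with a small helper that folds several asks into one numbered prompt, so the fixed per-request overhead (system prompt, tool definitions, custom instructions) is processed once instead of once per request. The function is a hypothetical sketch, not part of any Claude API.

```python
def batch_prompt(tasks: list[str]) -> str:
    # Combine several small asks into one numbered prompt. Each task is
    # whatever you would otherwise have sent as its own message.
    lines = ["Complete all of the following in one reply:"]
    lines += [f"{i}. {t}" for i, t in enumerate(tasks, start=1)]
    return "\n".join(lines)
```

For example, `batch_prompt(["Fix the typo in the intro", "Tighten the conclusion"])` yields a single two-item prompt, paying the context overhead once for both edits.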
Operational tricks
- Session reset trick: send a throwaway prompt earlier to shift the start of the 5-hour window so you can better align heavy work with non-peak periods (effectively get two usable sittings).
- Schedule heavy jobs outside peak hours so tokens/session go further.
- Keep Projects clean, use short instructions, and leverage memory where helpful to avoid re-explaining.
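The session reset trick has simple arithmetic behind it: a throwaway message sent at least five hours before your heavy-work block starts a window that has fully expired by the time the real work begins, so the first serious prompt opens a fresh one. A minimal sketch, assuming the 5-hour window the summary describes:

```python
from datetime import datetime, timedelta

SESSION_WINDOW = timedelta(hours=5)  # per the summary's session length

def latest_throwaway_time(heavy_work_start: datetime) -> datetime:
    # Sending a throwaway message at (or before) this moment means the
    # session it opens has expired by heavy_work_start, so the heavy
    # work begins with a full 5-hour allowance.
    return heavy_work_start - SESSION_WINDOW
```

So for heavy work planned at 2:00 PM, the throwaway prompt should go out by 9:00 AM, which is how the trick yields "two usable sittings" in one day.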
Takeaways
- The apparent “sudden” drop in capacity is due to stricter peak-hour session throttling combined with token consumption from models, tools, and context.
- Quick wins: switch to Sonic (Sonnet), toggle off Extended Thinking, remove unused connectors, use /context to find token hogs, batch prompts, and preprocess files.
- Longer-term: move heavy ingestion into preprocessed markdown or Projects/RAG, and build Skills for repeatable tasks.
Relevant product features & tools mentioned
- Claude/Anthropic: session usage tracking, context windows, models (Opus vs Sonnet/Sonic), Extended Thinking toggle, Skills, Projects, Co-work, memory, /context command.
- Connectors/plugins: Drive, Gmail, Notion, Apify, Chrome integrations, MCPs.
- External tools: Perplexity (for PDF → markdown extraction).
Main speaker / sources
- Primary speaker: an unnamed video presenter — a Claude/AI power-user explaining the changes and giving a 14-point guide.
- Sources referenced: Claude update posts (Reddit/posts about session limits), Anthropic/Claude product settings and features, and external tools like Perplexity and Apify.
Category
Technology