Claude Code can be billed two ways: against a subscription plan (Pro, Max, Team) or per-token against an Anthropic API key. The break-even between them is not obvious and depends entirely on how you actually use the tool. This article walks through the published pricing, realistic usage patterns, and where the lines fall — so you can pick the billing model that matches your usage rather than guessing.
The numbers below are pulled from publicly available pricing as of early 2026. Anthropic adjusts pricing periodically; verify against the live pricing page before making procurement decisions on figures alone.
## The two billing models
Subscription plans — flat monthly fee, with usage limits that reset on a rolling window (typically a 5-hour interactive window plus a weekly cap). You don't see per-call costs; you see "you have N% of your budget remaining". Pro and Max plans both include Claude Code; Team plans include it on the Premium tier only.
API billing — per-token charges to your Anthropic API account. Set `ANTHROPIC_API_KEY` in your environment and Claude Code routes there automatically. No usage caps; you pay for what you use, with prompt caching discounts applying automatically.
## Subscription plan tiers
| Plan | Price (USD) | Includes Claude Code | Suited to |
|---|---|---|---|
| Pro | $20 / month | Yes | Light to moderate interactive use |
| Max (5×) | $100 / month | Yes | Heavy interactive use, long sessions |
| Max (20×) | $200 / month | Yes | Very heavy use, multiple all-day sessions |
| Team Standard | $20 / seat / month | No (chat only) | Teams that only need chat |
| Team Premium | $125 / seat / month | Yes | Teams using Claude Code together |
Anthropic does not publicly disclose exact token allowances per tier. Anecdotally, from heavy users: Pro feels constraining for full-day Claude Code use; Max 5× is comfortable for one developer coding all day; Max 20× rarely hits the cap even with extended use.
## API token pricing
| Model | Input ($/M tokens) | Output ($/M tokens) | Cached input ($/M tokens) |
|---|---|---|---|
| Haiku 4.5 | $1.00 | $5.00 | $0.10 |
| Sonnet 4.6 | $3.00 | $15.00 | $0.30 |
| Opus 4.7 | $15.00 | $75.00 | $1.50 |
Two things to note. Output is roughly 5× the input cost — the optimisation lever is keeping responses concise. And cached input is roughly 10% of standard input, which compounds significantly across long sessions where the same CLAUDE.md, rules, and skill descriptions are read repeatedly. Claude Code uses prompt caching automatically; you don't configure it.
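To see how the caching discount compounds, here is a back-of-envelope sketch using the table's Sonnet 4.6 rates. The 90% cache-hit figure is an assumption for illustration (a long session rereading a stable CLAUDE.md), not a measured value:

```python
# Cost of one Sonnet 4.6 session, using the rates from the table above.
# The 90% cache-hit rate is an assumed figure, not a measurement.
INPUT, OUTPUT, CACHED = 3.00, 15.00, 0.30  # $/M tokens

def session_cost(input_tok, output_tok, cache_hit=0.0):
    """Dollar cost for a session where a fraction of input is served from cache."""
    fresh = input_tok * (1 - cache_hit)
    cached = input_tok * cache_hit
    return (fresh * INPUT + cached * CACHED + output_tok * OUTPUT) / 1_000_000

# 100K input / 10K output: ~$0.45 uncached vs ~$0.21 with 90% of input cached
uncached = session_cost(100_000, 10_000)
mostly_cached = session_cost(100_000, 10_000, cache_hit=0.9)
print(f"uncached: ${uncached:.2f}, 90% cached: ${mostly_cached:.2f}")
```

The gap widens with session length: the fixed context (CLAUDE.md, rules, skills) is reread on every call, so the cached fraction of input grows as the session goes on.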
## Realistic token consumption
Approximate token usage for common Claude Code patterns. These are estimates from observed runs, not promises:
- 1-hour interactive coding session, moderate file reading, two or three edits, Sonnet 4.6 — typically 50K-150K input tokens (most cached after first read), 5K-15K output tokens. API cost: roughly $0.10-$0.30 with caching
- CI run, reads 5 files, writes 1 PR comment, Sonnet 4.6, `--max-turns 10` — 20K-40K input tokens, 1K-3K output tokens. API cost: roughly $0.05-$0.10
- Long agent session with research, planning, multiple file edits, Opus 4.7 — 200K-500K input tokens, 30K-50K output tokens. API cost: roughly $5-$10 per session
- Large refactor, agent reads 30+ files, edits 10+ files, Sonnet 4.6 — 500K-1M input tokens, 30K-50K output tokens. API cost: roughly $1-$3
The numbers swing wildly with model choice. Opus 4.7 is roughly 5× the cost of Sonnet 4.6 per token. For most coding work, Sonnet is the right default — Opus shines when the task requires deeper reasoning across longer context, not for routine edits.
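To put a number on the model-choice lever, here is a sketch that prices the same workload on both models, using the table's rates. The 50% cache-hit rate is an illustrative assumption:

```python
# Same workload priced on Sonnet 4.6 vs Opus 4.7, rates from the table above.
# The 50% cache-hit rate is an illustrative assumption.
RATES = {  # model: (input, output, cached input), $/M tokens
    "sonnet-4.6": (3.00, 15.00, 0.30),
    "opus-4.7": (15.00, 75.00, 1.50),
}

def estimate(model, input_tok, output_tok, cache_hit=0.5):
    inp, out, cached = RATES[model]
    cost = (input_tok * (1 - cache_hit) * inp
            + input_tok * cache_hit * cached
            + output_tok * out)
    return cost / 1_000_000

# A long agent session from the list above: 350K input, 40K output
sonnet = estimate("sonnet-4.6", 350_000, 40_000)
opus = estimate("opus-4.7", 350_000, 40_000)
print(f"Sonnet: ${sonnet:.2f}, Opus: ${opus:.2f}, ratio: {opus / sonnet:.1f}x")
```

Since every Opus rate in the table is exactly 5× the Sonnet rate, the ratio holds regardless of the input/output mix; what changes is the absolute dollar gap per session.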
## Break-even analysis
Rough break-even between API billing and each plan tier, using Sonnet 4.6 with caching as the baseline:
- Pro ($20/month) — break-even around $20 of API spend per month, which is roughly 60-100 moderate interactive sessions, or about 3-5 sessions per working day. Below that volume, API billing is cheaper. Above it, Pro is cheaper
- Max 5× ($100/month) — break-even around $100 of API spend, roughly 15-25 long agent sessions per month, or one substantial session per working day
- Max 20× ($200/month) — break-even around $200, which is genuinely heavy use — multiple long agent sessions plus daily interactive work
The plans win at higher volumes because they're priced for predictability, not for cost-per-token efficiency. You're paying for the cap that protects you from a runaway week, not for the best tokens-per-dollar rate.
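To locate your own break-even, divide the flat fee by your per-session API cost. The $0.20 figure below is the midpoint of the 1-hour Sonnet estimate from earlier; substitute your own observed number:

```python
# Sessions per month at which each plan beats per-token API billing.
# COST_PER_SESSION is an assumed figure (midpoint of the article's
# 1-hour Sonnet 4.6 estimate); swap in your own observed cost.
COST_PER_SESSION = 0.20  # dollars
PLANS = {"Pro": 20, "Max 5x": 100, "Max 20x": 200}  # $/month

for plan, fee in PLANS.items():
    sessions = fee / COST_PER_SESSION
    print(f"{plan}: plan is cheaper above ~{sessions:.0f} sessions/month")
```

At $0.20 per session, Pro breaks even around 100 sessions a month, consistent with the 3-5 sessions per working day cited above; heavier per-session costs (Opus, long agent runs) pull the break-even down sharply.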
## When API billing wins
- Light or sporadic use. A few sessions a week, mostly short — API billing keeps the cost proportional to actual usage rather than charging $20 for a quiet month
- CI and automation. CI runs are predictable per-token; API billing keeps them off your interactive plan budget and gives you clear cost-per-build numbers
- Heavy automation that exceeds plan caps. If you're running scheduled Claude jobs hourly, you'll exceed any plan's weekly cap — API billing is the only option
- Per-project cost tracking. Different API keys for different clients lets you bill back accurately; plans are personal
- Bursty heavy use. One week you do nothing, next week you do a 40-hour refactor — API billing tracks the bursts; plans don't
## When plans win
- Steady daily use. Same 4-6 hours of Claude Code most days — Max 5× or 20× becomes much cheaper than the equivalent API spend
- Predictable monthly cost. Procurement teams and freelancers often prefer a flat fee for budgeting, even if it's slightly higher than usage-based
- Heavy Opus use. Plans don't differentiate Opus from Sonnet from Haiku — you can use Opus exclusively without paying its 5× API premium
- Want extended thinking budget without watching the meter. Plans include extended thinking; API billing charges for thinking tokens at output rates
## The hybrid pattern
For mixed-use teams, the cleanest pattern is plan-based for interactive work and API-based for automation:
- Each developer has a personal Pro or Max plan (interactive coding)
- CI and scheduled jobs use a separate `ANTHROPIC_API_KEY` (visible cost tracking)
- Set the env var only in CI environments — interactive sessions use the plan automatically
This keeps interactive cost predictable at the personal level and automation cost trackable at the org level, without forcing one model to do both badly.
## Levers that move the bill
- Model choice — Sonnet 4.6 is the default for a reason; Opus only when you need it
- Output length — output is 5× input; verbose responses cost more than concise ones
- Caching — automatic in Claude Code, but only if your CLAUDE.md and rules are stable across calls. Frequent CLAUDE.md edits invalidate the cache
- Max turns — runaway agent loops are the single biggest cost risk. Set `--max-turns` on every unattended run
- Tool restrictions — fewer allowed tools means smaller tool definitions in context, which means fewer input tokens per call
- Bare mode — `--bare` skips loading hooks, skills, MCP servers, and CLAUDE.md, which can cut input tokens by 50%+ for simple jobs that don't need workspace context
The biggest single lever is matching the billing model to your actual usage shape. Most cost surprises come from users who picked the wrong model for their pattern: heavy users on Pro getting throttled, light users on Max paying for capacity they never use.
