Claude Code can be billed two ways: against a subscription plan (Pro, Max, Team) or per-token against an Anthropic API key. The break-even between them is not obvious and depends entirely on how you actually use the tool. This article walks through the published pricing, realistic usage patterns, and where the lines fall — so you can pick the billing model that matches your usage rather than guessing.
The numbers below are pulled from publicly available pricing as of early 2026. Anthropic adjusts pricing periodically; verify against the live pricing page before making procurement decisions on figures alone.
## The two billing models
Subscription plans — flat monthly fee, with usage limits that reset on a rolling window (typically a 5-hour interactive window plus a weekly cap). You don't see per-call costs; you see "you have N% of your budget remaining". Pro and Max plans both include Claude Code; Team plans include it on the Premium tier only.
API billing — per-token charges to your Anthropic API account. Set `ANTHROPIC_API_KEY` in your environment and Claude Code routes there automatically. No usage caps; you pay for what you use, with prompt caching discounts applying automatically.
## Subscription plan tiers
| Plan | Price (USD) | Includes Claude Code | Suited to |
|---|---|---|---|
| Pro | $20 / month | Yes | Light to moderate interactive use |
| Max (5×) | $100 / month | Yes | Heavy interactive use, long sessions |
| Max (20×) | $200 / month | Yes | Very heavy use, multiple all-day sessions |
| Team Standard | $20 / seat / month | No (chat only) | Teams that only need chat |
| Team Premium | $125 / seat / month | Yes | Teams using Claude Code together |
Anthropic does not publicly disclose exact token allowances per tier. Anecdotally, from heavy users: Pro feels constraining for full-day Claude Code use; Max 5× is comfortable for one developer coding all day; Max 20× rarely hits the cap even with extended use.
## API token pricing
| Model | Input ($/M tokens) | Output ($/M tokens) | Cached input ($/M tokens) |
|---|---|---|---|
| Haiku 4.5 | $1.00 | $5.00 | $0.10 |
| Sonnet 4.6 | $3.00 | $15.00 | $0.30 |
| Opus 4.7 | $15.00 | $75.00 | $1.50 |
Two things to note. Output is roughly 5× the input cost — the optimisation lever is keeping responses concise. And cached input is roughly 10% of standard input, which compounds significantly across long sessions where the same CLAUDE.md, rules, and skill descriptions are read repeatedly. Claude Code uses prompt caching automatically; you don't configure it.
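To see how the caching discount compounds, here is a back-of-envelope sketch using the table's Sonnet 4.6 rates. The 90% cache-hit figure is an assumption for illustration (a long session rereading a stable CLAUDE.md), not a measured value:

```python
# Cost of one Sonnet 4.6 session, using the rates from the table above.
# The 90% cache-hit rate is an assumed figure, not a measurement.
INPUT, OUTPUT, CACHED = 3.00, 15.00, 0.30  # $/M tokens

def session_cost(input_tok, output_tok, cache_hit=0.0):
    """Dollar cost for a session where a fraction of input is served from cache."""
    fresh = input_tok * (1 - cache_hit)
    cached = input_tok * cache_hit
    return (fresh * INPUT + cached * CACHED + output_tok * OUTPUT) / 1_000_000

# 100K input / 10K output: ~$0.45 uncached vs ~$0.21 with 90% of input cached
uncached = session_cost(100_000, 10_000)
mostly_cached = session_cost(100_000, 10_000, cache_hit=0.9)
print(f"uncached: ${uncached:.2f}, 90% cached: ${mostly_cached:.2f}")
```

The gap widens with session length: the fixed context (CLAUDE.md, rules, skills) is reread on every call, so the cached fraction of input grows as the session goes on.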
## Realistic token consumption
Approximate token usage for common Claude Code patterns. These are estimates from observed runs, not promises:
- 1-hour interactive coding session, moderate file reading, two or three edits, Sonnet 4.6 — typically 50K-150K input tokens (most cached after first read), 5K-15K output tokens. API cost: roughly $0.10-$0.30 with caching
- CI run, reads 5 files, writes 1 PR comment, Sonnet 4.6, `--max-turns 10` — 20K-40K input tokens, 1K-3K output tokens. API cost: roughly $0.05-$0.10
- Long agent session with research, planning, multiple file edits, Opus 4.7 — 200K-500K input tokens, 30K-50K output tokens. API cost: roughly $5-$10 per session
- Large refactor, agent reads 30+ files, edits 10+ files, Sonnet 4.6 — 500K-1M input tokens, 30K-50K output tokens. API cost: roughly $1-$3
The numbers swing wildly with model choice. Opus 4.7 is roughly 5× the cost of Sonnet 4.6 per token. For most coding work, Sonnet is the right default — Opus shines when the task requires deeper reasoning across longer context, not for routine edits.
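To put a number on the model-choice lever, here is a sketch that prices the same workload on both models, using the table's rates. The 50% cache-hit rate is an illustrative assumption:

```python
# Same workload priced on Sonnet 4.6 vs Opus 4.7, rates from the table above.
# The 50% cache-hit rate is an illustrative assumption.
RATES = {  # model: (input, output, cached input), $/M tokens
    "sonnet-4.6": (3.00, 15.00, 0.30),
    "opus-4.7": (15.00, 75.00, 1.50),
}

def estimate(model, input_tok, output_tok, cache_hit=0.5):
    inp, out, cached = RATES[model]
    cost = (input_tok * (1 - cache_hit) * inp
            + input_tok * cache_hit * cached
            + output_tok * out)
    return cost / 1_000_000

# A long agent session from the list above: 350K input, 40K output
sonnet = estimate("sonnet-4.6", 350_000, 40_000)
opus = estimate("opus-4.7", 350_000, 40_000)
print(f"Sonnet: ${sonnet:.2f}, Opus: ${opus:.2f}, ratio: {opus / sonnet:.1f}x")
```

Since every Opus rate in the table is exactly 5× the Sonnet rate, the ratio holds regardless of the input/output mix; what changes is the absolute dollar gap per session.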
## Break-even analysis
Rough break-even between API billing and each plan tier, using Sonnet 4.6 with caching as the baseline:
- Pro ($20/month) — break-even around $20 of API spend per month, which is roughly 60-100 moderate interactive sessions, or about 3-5 sessions per working day. Below that volume, API billing is cheaper. Above it, Pro is cheaper
- Max 5× ($100/month) — break-even around $100 of API spend, roughly 15-25 long agent sessions per month, or one substantial session per working day
- Max 20× ($200/month) — break-even around $200, which is genuinely heavy use — multiple long agent sessions plus daily interactive work
The plans win at higher volumes because they're priced for predictability, not for cost-per-token efficiency. You're paying for the cap that protects you from a runaway week, not for the best tokens-per-dollar rate.
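To locate your own break-even, divide the flat fee by your per-session API cost. The $0.20 figure below is the midpoint of the 1-hour Sonnet estimate from earlier; substitute your own observed number:

```python
# Sessions per month at which each plan beats per-token API billing.
# COST_PER_SESSION is an assumed figure (midpoint of the article's
# 1-hour Sonnet 4.6 estimate); swap in your own observed cost.
COST_PER_SESSION = 0.20  # dollars
PLANS = {"Pro": 20, "Max 5x": 100, "Max 20x": 200}  # $/month

for plan, fee in PLANS.items():
    sessions = fee / COST_PER_SESSION
    print(f"{plan}: plan is cheaper above ~{sessions:.0f} sessions/month")
```

At $0.20 per session, Pro breaks even around 100 sessions a month, consistent with the 3-5 sessions per working day cited above; heavier per-session costs (Opus, long agent runs) pull the break-even down sharply.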
## When API billing wins
- Light or sporadic use. A few sessions a week, mostly short — API billing keeps the cost proportional to actual usage rather than charging $20 for a quiet month
- CI and automation. CI runs are predictable per-token; API billing keeps them off your interactive plan budget and gives you clear cost-per-build numbers
- Heavy automation that exceeds plan caps. If you're running scheduled Claude jobs hourly, you'll exceed any plan's weekly cap — API billing is the only option
- Per-project cost tracking. Different API keys for different clients lets you bill back accurately; plans are personal
- Bursty heavy use. One week you do nothing, next week you do a 40-hour refactor — API billing tracks the bursts; plans don't
## When plans win
- Steady daily use. Same 4-6 hours of Claude Code most days — Max 5× or 20× becomes much cheaper than the equivalent API spend
- Predictable monthly cost. Procurement teams and freelancers often prefer a flat fee for budgeting, even if it's slightly higher than usage-based
- Heavy Opus use. Plans don't differentiate Opus from Sonnet from Haiku — you can use Opus exclusively without paying its 5× API premium
- Want extended thinking budget without watching the meter. Plans include extended thinking; API billing charges for thinking tokens at output rates
## The hybrid pattern
For mixed-use teams, the cleanest pattern is plan-based for interactive work and API-based for automation:
- Each developer has a personal Pro or Max plan (interactive coding)
- CI and scheduled jobs use a separate `ANTHROPIC_API_KEY` (visible cost tracking)
- Set the env var only in CI environments — interactive sessions use the plan automatically
This keeps interactive cost predictable at the personal level and automation cost trackable at the org level, without forcing one model to do both badly.
## Levers that move the bill
- Model choice — Sonnet 4.6 is the default for a reason; Opus only when you need it
- Output length — output is 5× input; verbose responses cost more than concise ones
- Caching — automatic in Claude Code, but only if your CLAUDE.md and rules are stable across calls. Frequent CLAUDE.md edits invalidate the cache
- Max turns — runaway agent loops are the single biggest cost risk. Set `--max-turns` on every unattended run
- Tool restrictions — fewer allowed tools means smaller tool definitions in context, which means fewer input tokens per call
- Bare mode — `--bare` skips loading hooks, skills, MCP servers, and CLAUDE.md, which can cut input tokens by 50%+ for simple jobs that don't need workspace context
The biggest single lever is matching the billing model to your actual usage shape. Most cost surprises come from users who picked the wrong model for their pattern: heavy users on Pro getting throttled, light users on Max paying for capacity they never use.
