By default, Claude Code consumes tokens against your subscription plan — Pro, Max, Team, or whatever you're signed in with. For most interactive work, that's the right setup: predictable cost, no per-token accounting, and the same usage limits whether you're writing code or chatting. There are situations where you want different behaviour: scripted use, CI pipelines, tools that need their own billing, or workloads heavy enough that per-token API pricing is cheaper than a plan upgrade. This article covers when and how to switch Claude Code to use the Anthropic API directly, what changes when you do, and how it relates to the Claude Agent SDK.
Plan-based vs API-based Claude Code
Plan-based Claude Code authenticates with your Anthropic account via /login. Token usage counts against the plan's limits — interactive use, with rate limits that reset on a rolling window. You don't see per-call token costs; you see "you have N% of your weekly usage remaining" instead.
API-based Claude Code uses an Anthropic API key (the same kind you'd use to call the Messages API directly from a Python or Node program). Token usage is billed per-call to your API account at the standard published rates, with no plan limits — you can run long jobs without hitting weekly caps, but you also pay for every input and output token explicitly.
The switch is controlled by a single environment variable: ANTHROPIC_API_KEY. If it's set when Claude Code starts, the CLI uses the API key for billing. If it's not set, Claude Code uses your logged-in plan. There's no separate command — set the env var, billing routes to the API; unset it and re-login, billing returns to the plan.
When to switch
Three patterns where API-based Claude Code is the right choice:
- Scripted or scheduled runs. A nightly job that runs
claude -p "do the thing"shouldn't burn your interactive plan budget. An API key keeps the cost visible and separable. - CI pipelines. Headless Claude Code in CI needs an API key — there's no interactive login flow available, and you don't want CI runs eating your personal plan.
- Heavy automation. If your usage pattern is dominated by automated runs that would exhaust a plan's weekly limits, per-token API billing can work out cheaper, and is certainly more predictable.
For ordinary interactive coding, plan billing is almost always cheaper and simpler. The flat-fee plans are priced for interactive use; switching to API billing for casual sessions usually costs more.
Setting up an API key
- Sign in to
console.anthropic.comand create an API key. The console is separate from your plan account — even if your plan is on Max, you'll need to set up the API console and add billing - Export the key in the shell where you run Claude Code:
export ANTHROPIC_API_KEY=sk-... - Add the export to
~/.zshrcor~/.bashrcif you want it on by default - For per-project use only, set the variable in
.envrc(withdirenv), or in a shell alias that sets it before launching Claude Code
To verify you're on API billing, run claude /status — the output shows the auth source. The plan and API are mutually exclusive at any given moment; whichever the env var dictates wins.
Headless mode — running Claude Code from scripts
The -p flag (or --print) runs Claude Code non-interactively. The CLI takes the prompt, runs the agent loop until done, prints the result, and exits. No TTY, no chat — useful for scripts and CI.
A minimal example:
claude -p "What's in package.json?" --output-format text
Claude reads the file, returns a summary, exits. The output is plain text by default; for scripted parsing, use --output-format json to get a structured result with the response and metadata.
Useful flags for scripted use:
--allowedTools "Read,Bash"— restrict what tools Claude can use, important for unattended runs--output-format stream-json— newline-delimited JSON for streaming consumption--max-turns N— cap the agent loop at N rounds to prevent runaway runs--bare— skip loading hooks, skills, MCP servers, and CLAUDE.md, for deterministic runs
Exit code 0 means the run succeeded; non-zero means an error. Combine with jq for parsing structured output:
claude -p "Summarise the changes since last release" \
--output-format json \
--allowedTools "Bash,Read" \
| jq -r '.result'
The Claude Agent SDK
The Claude Agent SDK is a separate library — Python and TypeScript packages — that exposes Claude Code's agent loop programmatically. Where Claude Code is the CLI, the Agent SDK is the same machinery as a library you can call from your own code. It handles the agent loop (model call, tool execution, iterate), supports custom tools, supports approval callbacks, and bills against the API key in the standard way.
The Agent SDK is what you use when:
- You want to embed Claude's agent loop inside a larger application — a web service, a Slack bot, a long-running daemon
- You need fine-grained control over the loop — custom tool authorisation, structured outputs, custom retry logic
- The CLI's interactive or headless modes don't fit your shape — you need a programmatic API, not a process to spawn
It's distinct from the older Anthropic SDK, which is the lower-level Messages API client. The Anthropic SDK gives you raw model calls — you'd implement the agent loop, tool execution, and orchestration yourself. The Agent SDK gives you the loop pre-built. Reach for the Anthropic SDK only if the Agent SDK doesn't expose what you need; for almost every agent use case, Agent SDK is the right level of abstraction.
Anthropic API features and Claude Code
Several Anthropic API features have varying availability inside Claude Code:
- Prompt caching — automatic in Claude Code; the system prompt, CLAUDE.md, and rules are cache-eligible by default. No configuration needed
- Extended thinking — supported on models that have it (Opus, Sonnet); enabled implicitly when the model decides it helps
- Files API — for uploading documents to be referenced across calls. Claude Code reads local files directly via the Read tool, so this rarely comes up; if you need it, use the Anthropic SDK
- Batch API — async, discounted bulk processing. Not exposed through Claude Code; if you need it, write a separate script using the Anthropic SDK
- Vision (image inputs) — supported in Claude Code via the Read tool on image files
- Tool use — Claude Code is built on tool use; every interaction is the agent loop using tools
The pattern: features that fit into an interactive coding agent (caching, vision, tools, thinking) are accessible from Claude Code; bulk and async features (Batch, Files for cross-call reuse) need the SDK.
Cost — what to expect
API-based usage is per-token at the published rates (input and output priced separately, with input cheaper than output). A long agent session that reads several files, plans, edits, and verifies could be tens of thousands of tokens. A short scripted run — read one file, return a summary — is a few thousand. Prompt caching cuts repeated-context costs significantly; cached input tokens are an order of magnitude cheaper than uncached ones.
The honest comparison: a heavy interactive day on plan billing might consume the equivalent of $20-50 of API tokens; the plan is priced flat at well under that. So plan billing is the cheap option for ordinary use. API billing wins when usage is heavy enough to exceed plan limits, automated enough to need its own budget, or unpredictable enough that the flat fee no longer aligns with actual consumption.
Switching back
To return to plan billing, unset the environment variable:
unset ANTHROPIC_API_KEY
Remove the export from ~/.zshrc or ~/.bashrc if it was there, then run claude /login to confirm you're back on plan auth. claude /status verifies which mode is active.
The two modes don't share usage — switching back doesn't refund the API tokens you already spent, and switching to API doesn't pause your plan's clock. Each accumulates usage independently.
