Cost Modelling Across Deployment Routes

Cost comparison across routes is harder than comparing list prices because published rates tell only part of the story. Hidden components reshape the comparison. Private networking, key management, storage, egress, and warehouse sizing are the cost variables that shift rankings once you move beyond headline token prices. This chapter equips a CTO to build a comparison model. It does not provide the answer to which route is cheapest because that depends on your specific usage profile, scale, and contractual position. Instead, the chapter exposes the variables that matter, the published pricing that is available, the hidden components that are not published, and the dialogue partners who hold the information that public documentation does not contain.

Chapters one through five have established the provider-level facts. This chapter reads across all four routes and gives you a framework for cost comparison. Its job is to give you the structure of the decision, the cost levers that vary by route, and the hedges that apply when published pricing is incomplete.

In this chapter

Published token rates across routes
Hidden cost components that shift the comparison
EDP and enterprise discount dynamics
Building a cost-comparison spreadsheet
What to ask your account manager
When cost is not the deciding factor
What this chapter does not cover
Primary sources

Published token rates across routes

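Token pricing as of 2026-04-22 is obtainable for three of the four routes, though only Bedrock publishes list rates on a public pricing page. Vertex AI and Foundry surface rates through their consoles or sales teams. The fourth route, Snowflake Cortex, requires a conversation with your account manager.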

AWS Bedrock's list price for Claude Sonnet 4.6 in on-demand mode is $6.00 per million input tokens and $30.00 per million output tokens as of 2026-04-22. Batch API pricing is 50% of the on-demand rate: $3.00 per million input tokens and $15.00 per million output tokens. Bedrock's published pricing covers only a subset of AWS regions (us-east-1, us-east-2, us-west-2, eu-west-1, eu-central-1, eu-north-1, ap-northeast-1). Pricing in other regions, if available, is not published. Verify current rates on the AWS Bedrock pricing page.
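The on-demand and batch rates quoted above translate into a monthly token bill with simple arithmetic. The sketch below is illustrative only: the function name `token_cost` and the example volumes are assumptions, and the rates are the 2026-04-22 list prices from this section, which should be re-verified before use.

```python
# List rates quoted in this section (USD per million tokens, as of 2026-04-22).
# Re-verify against the provider's pricing page before relying on them.
ON_DEMAND = {"input": 6.00, "output": 30.00}
BATCH_FACTOR = 0.50  # batch is billed at 50% of the on-demand rate


def token_cost(input_tokens, output_tokens, batch=False):
    """Return the monthly token cost in USD at list price."""
    cost = (input_tokens / 1e6) * ON_DEMAND["input"]
    cost += (output_tokens / 1e6) * ON_DEMAND["output"]
    return cost * (BATCH_FACTOR if batch else 1.0)


# Example volumes are hypothetical: 10M input and 5M output tokens per month.
print(token_cost(10_000_000, 5_000_000))              # 210.0
print(token_cost(10_000_000, 5_000_000, batch=True))  # 105.0
```

Output tokens dominate the bill at these rates, so the input-to-output ratio of your workload matters as much as total volume.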

Google Vertex AI's Claude pricing is available on the Claude model card within Model Garden. Both pay-as-you-go and provisioned throughput billing options are available. The documentation does not publish specific token rates on a central pricing page. You access pricing by opening the Claude model card in the Vertex AI console or by requesting pricing details from Google Cloud sales. The provisioned throughput option lets you commit to a volume and lock in a rate, which can be material for cost control at scale. Verify against the Google Vertex AI Claude documentation. Google announced the Gemini Enterprise Agent Platform as the evolution of Vertex AI in April 2026, so confirm any resulting Claude-on-Vertex pricing or SKU changes with your account manager.

Azure AI Foundry provides Claude through serverless pay-per-token billing, classified as Global Standard deployment. The documentation does not publish specific per-model token rates. Enterprise or MCA-E subscriptions are required for Claude access. Token rates are shown in the Azure portal during model deployment, or you can request them directly from Microsoft sales. Verify against the Azure AI Foundry Claude page.

Snowflake Cortex pricing is token-based, with both input and output tokens billable, but per-model credit rates do not appear in publicly accessible documentation. Snowflake references a Service Consumption Table that details credit costs per model, and that table is not part of the public documentation. Cortex Code ($20 USD monthly subscription, plus $40 in free credits for a 30-day trial) is licensed separately from Cortex AI Functions in SQL. Snowflake also recommends a warehouse no larger than MEDIUM when calling Cortex AI Functions, because larger warehouses do not improve performance but do consume more credits. Ask your Snowflake account manager for the credit rate matrix and warehouse sizing recommendations for your expected query volume.

Published rates are the floor, not the ceiling

Token rates as of 2026-04-22 come from primary sources with verification links. Snowflake pricing is not published. Verify current rates before procurement.

Hidden cost components that shift the comparison

Once you move past token pricing, five cost categories reshape which route appears lowest-cost for your specific deployment: private networking, key management, egress, storage, and (on Snowflake only) warehouse sizing.

Private networking is effectively mandatory for compliance-sensitive workloads, and its cost does not scale linearly with usage. Bedrock supports AWS PrivateLink for data-plane isolation without internet egress, and the configuration carries charges separate from token pricing. Vertex AI supports VPC Service Controls for organisation-level data residency guarantees. Foundry and custom gateways carry similar isolation requirements, with varying implementation complexity and pricing implications. For a HIPAA-regulated enterprise or a Financial Conduct Authority-regulated firm in London, treat private networking as a requirement. Depending on data volume and regional spread, it can raise the effective per-token cost to two to three times the list rate.

Customer-managed keys (CMK) or customer-managed encryption keys (CMEK) are required in many compliance frameworks. AWS KMS charges for key management and API calls, Vertex AI CMEK pricing varies by region and rotation frequency, and Foundry offers customer-managed keys at an additional charge. These costs are not included in token pricing but appear in infrastructure bills. For regulated enterprises, a crypto-agility requirement often forces this cost regardless of which route you choose.

Data egress across regions and across cloud providers is a variable that moves rapidly during procurement. Bedrock charges for data transfer in and out of regions. Snowflake Cortex does not incur data egress charges for cross-region inference within Snowflake's perimeter, but if your compliance posture requires data to stay within a single cloud provider, regional distribution adds complexity and cost. Egress from one cloud provider to another for compliance checks or audit trails can be material. Estimate egress volume before final procurement.

Storage for logs, audit trails, and conversation history varies by route. Bedrock logs to CloudTrail at AWS pricing. Vertex AI logs to Cloud Logging. Foundry logs to Azure Monitor. Snowflake stores logs and warehouse outputs at Snowflake's credit pricing. These are not token costs but they accumulate quickly when auditing requirements force 90-day or 12-month log retention. A single heavily-used deployment can generate hundreds of gigabytes of audit logs annually.
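The effect of the retention window on stored volume can be estimated with a few lines. This is a rough sketch under stated assumptions: the daily log volume and the per-GB-month storage rate are hypothetical placeholders, not figures from any provider, and the model ignores ingestion and query charges.

```python
def retained_gb(daily_gb, retention_days):
    """Steady-state volume held once the retention window has filled."""
    return daily_gb * retention_days


def monthly_storage_cost(daily_gb, retention_days, usd_per_gb_month):
    """Monthly storage cost in USD for the retained volume."""
    return retained_gb(daily_gb, retention_days) * usd_per_gb_month


DAILY_GB = 1.0           # hypothetical: about 365 GB of audit logs per year
USD_PER_GB_MONTH = 0.10  # hypothetical placeholder rate, not a quoted price

for days in (30, 90, 365):
    volume = retained_gb(DAILY_GB, days)
    cost = monthly_storage_cost(DAILY_GB, days, USD_PER_GB_MONTH)
    print(f"{days}-day retention: {volume:.0f} GB held, ${cost:.2f}/month")
```

Moving from 30-day to 12-month retention multiplies the retained volume roughly twelvefold at the same daily log rate, which is why the retention requirement belongs in the model as an explicit input.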

Warehouse sizing on Snowflake is a cost variable distinct from token costs. The documentation recommends MEDIUM as the maximum, but you might choose SMALL or X-SMALL to reduce cost, accepting lower throughput. A CTO comparing cost across Bedrock and Snowflake should model warehouse credit consumption alongside Cortex token costs. Warehouse sizing is not a cost lever on the other three routes, which have no warehouse layer to size.
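Warehouse credit consumption can be modelled separately from token charges. The sketch below assumes Snowflake's standard warehouse ratios of 1, 2, and 4 credits per hour for X-SMALL, SMALL, and MEDIUM; the price per credit and the active hours are hypothetical placeholders, since the credit price depends on edition, region, and contract.

```python
# Credits per hour for standard warehouses (assumed; confirm for your account).
CREDITS_PER_HOUR = {"X-SMALL": 1, "SMALL": 2, "MEDIUM": 4}


def warehouse_cost(size, active_hours, usd_per_credit):
    """Monthly warehouse cost in USD for the hours the warehouse is running."""
    return CREDITS_PER_HOUR[size] * active_hours * usd_per_credit


ACTIVE_HOURS = 20      # hypothetical monthly running time
USD_PER_CREDIT = 3.00  # hypothetical placeholder, not a quoted rate

for size in CREDITS_PER_HOUR:
    cost = warehouse_cost(size, ACTIVE_HOURS, USD_PER_CREDIT)
    print(f"{size}: ${cost:.2f}/month")
```

The model holds active hours constant across sizes, which is a simplification: a smaller warehouse may run longer for the same workload, so measure actual running time before assuming the smallest size is cheapest.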

Hidden costs reshape the comparison

Token rates are the headline. Private networking, key management, egress, storage, and warehouse sizing are the plot. Cost rankings shift on all four routes once these components are included.

EDP and enterprise discount dynamics

Enterprise Discount Programme rates, volume commitments, and contract terms are not published by any provider. This is by design. Enterprises negotiate individually based on scale, multi-year commitment, and bundled services. A CTO comparing routes must obtain EDP quotes from account managers before finalising the comparison. These conversations typically happen in parallel with technical evaluation, not after.

Volume thresholds for EDP are not published. A provider might offer one discount structure for annual volumes under 100 billion tokens, another for 100 billion to 500 billion, and a third for over 500 billion. Your starting point for negotiation depends on your forecast. Multi-year commitments typically unlock larger discounts than annual, sometimes 30-50% below list pricing for mature contracts. However, these discounts are confidential and non-binding until you sign.
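A tiered discount structure of the kind described above can be sketched as a simple lookup. Everything here is hypothetical: the thresholds mirror the illustrative example in the text, and the discount percentages are placeholders invented for the sketch, not rates quoted by any provider.

```python
# Hypothetical tiers: (annual token floor, discount off list price).
# Thresholds follow the illustrative example in the text; the discount
# values are placeholders and must be replaced with your negotiated quotes.
TIERS = [
    (0, 0.00),
    (100_000_000_000, 0.15),
    (500_000_000_000, 0.30),
]


def edp_discount(annual_tokens, tiers=TIERS):
    """Return the discount for the highest tier whose floor is reached."""
    discount = 0.0
    for floor, rate in tiers:
        if annual_tokens >= floor:
            discount = rate
    return discount


def discounted_cost(list_cost, annual_tokens, tiers=TIERS):
    """Apply the tier discount to a list-price cost in USD."""
    return list_cost * (1 - edp_discount(annual_tokens, tiers))


print(edp_discount(50_000_000_000))   # below the first threshold
print(edp_discount(250_000_000_000))  # middle tier
print(round(discounted_cost(1000.0, 600_000_000_000), 2))
```

Keeping the tiers as a parameter lets you swap in each provider's quoted structure and rerun the comparison without changing the rest of the model.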

Bundled services are common in EDP negotiations. If you are already using Bedrock for other workloads, extending the contract to include Claude inference might unlock a tier discount on all Bedrock usage. If you are a Google Cloud customer with committed compute spend, Vertex AI Claude pricing might be incorporated into your overall spend commitment. These bundling dynamics are not transparent but they often make a large difference in final contract pricing.

Building a cost-comparison spreadsheet

To make a route decision, build a model with variable inputs: monthly token count, model mix, regional distribution, private networking, key management, egress volume, and storage retention. Use simplified numbers to show the shape of the model, then fill in your actual figures.

Start with a baseline usage profile. Assume 10 million input tokens monthly and 5 million output tokens monthly, all Sonnet, no Opus, no regional distribution, no private networking, provider-managed keys, no cross-region egress, and logs stored for 30 days. This is not a real deployment profile. It exists to show the shape of the model.

Bedrock on-demand for this baseline: 10M input tokens at $6/M cost $60, and 5M output tokens at $30/M cost $150, for a total token cost of $210 monthly. Add CloudTrail at typical rates (~$100/month for this query volume). Assume no private networking for this baseline. Total for Bedrock with public endpoints is approximately $310 monthly.

Vertex AI pay-as-you-go pricing typically lands within $5-10 per month of Bedrock's on-demand total for the same token volume. Assume a baseline total of $305-315 monthly, logs included. Provisioned throughput would offer a discount if your monthly volume is stable, but setup requires volume forecasting and a monthly commitment fee. For the baseline, provisioned throughput likely costs more than pay-as-you-go.

Foundry Global Standard deployment typically costs within a similar range but with additional egress charges if data crosses Azure regions. Assume $320-340 monthly for the baseline with minimal egress.

Snowflake Cortex token rates are not published, so this estimate combines two assumptions: roughly $200 monthly in warehouse credits for a MEDIUM warehouse running the queries, and roughly $150 in Cortex token charges based on a typical credit-to-token conversion. That gives a rough starting point of $350 monthly before log storage. Snowflake's cross-region inference does not charge egress, so this baseline assumes no egress cost. The actual figure requires your account manager's quote.

Now introduce variables. If you add private networking (Bedrock PrivateLink, Vertex VPC-SC), add $200-500 monthly depending on data volume and configuration. If you use customer-managed keys, add $50-200 monthly depending on rotation frequency and region distribution. If you distribute queries across three regions to meet data residency requirements, add regional transfer costs on Bedrock and Foundry, or warehouse costs on Snowflake.

The spreadsheet shape looks like this.

Provider	Baseline Token	Logs	Private Networking	CMK/CMEK	Egress	Warehouse	Monthly Total
Bedrock	$210	$100	$0	$0	$0	n/a	$310
Vertex	$210	$100	$0	$0	$0	n/a	$310
Foundry	$230	$100	$0	$0	$0	n/a	$330
Snowflake	$150	$50	$0	$0	$0	$200	$400

Now add the constraints that matter for your deployment. If your compliance posture requires private networking on Bedrock, add $300 monthly. If you need customer-managed keys on all routes, add $100-150 monthly. If you distribute across three regions, add regional costs per provider. The final comparison shows which route is lowest-cost under your constraints, not which is lowest-cost in the abstract.
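The same spreadsheet shape can be expressed in a few lines of code, which makes it easier to rerun with different constraints. This is a minimal sketch, not a pricing tool: the class and field names are illustrative assumptions, the figures are the simplified Bedrock and Snowflake baseline numbers from this section, and Snowflake's private networking line stays at zero only because the text gives no figure for it, not because it is free.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RouteCost:
    """One row of the comparison; all amounts are USD per month."""
    name: str
    token: float
    logs: float
    private_networking: float = 0.0
    cmk: float = 0.0
    egress: float = 0.0
    warehouse: float = 0.0

    @property
    def total(self) -> float:
        return (self.token + self.logs + self.private_networking
                + self.cmk + self.egress + self.warehouse)


def rank(routes):
    """Order routes from lowest to highest monthly total."""
    return sorted(routes, key=lambda r: r.total)


# Simplified baseline figures from this section (illustrative, not quotes).
baseline = [
    RouteCost("Bedrock", token=210, logs=100),
    RouteCost("Snowflake", token=150, logs=50, warehouse=200),
]

# Constraints from the text: $300 private networking on Bedrock and the
# low end of the $100-150 customer-managed key range on both routes.
constrained = [
    replace(baseline[0], private_networking=300, cmk=100),
    replace(baseline[1], cmk=100),
]

for label, routes in (("Baseline", baseline), ("Constrained", constrained)):
    print(label)
    for r in rank(routes):
        print(f"  {r.name}: ${r.total:.0f}/month")
```

With these placeholder inputs the ranking flips once the constraints are applied, which is the point of the exercise: the order depends on the add-ons, so replace every figure with your own quotes before drawing a conclusion.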

What to ask your account manager

EDP pricing structures, volume thresholds, multi-year discounts, and bundled services are variables held by account managers. Ask them directly rather than inferring from public pricing.

For Bedrock, ask for current pricing on Claude Opus 4.7, Sonnet 4.6, and Haiku 4.5 in your target regions, because the public pricing page does not cover every model in every region. Ask what Bedrock's commitment options are and whether they apply to Claude specifically. Ask about PrivateLink costs and whether they scale with data volume or are a fixed monthly fee. Ask whether your existing AWS spend can be bundled with Claude EDP negotiation.

For Vertex AI, ask whether provisioned throughput discounts are available for your forecasted monthly volume. Ask what the setup and minimum commitment fees are. Ask whether Google Cloud committed-use discounts for compute can be applied to Vertex AI Claude. Ask about VPC Service Controls pricing and whether these controls are included in your contract or separately metered. Ask how the Gemini Enterprise Agent Platform announcement in April 2026 affects Claude SKU availability and pricing going forward.

For Foundry, ask about EDP pricing and volume thresholds. Ask whether your Microsoft enterprise agreement can be extended to include Foundry Claude pricing. Ask about Global Standard deployment costs and whether regional deployments offer different pricing. Ask whether Foundry's current absence of Anthropic's "no data to Anthropic" guarantee changes your risk model enough to affect price sensitivity.

For Snowflake, ask for the credit rate matrix for each Claude model. Ask about warehouse sizing recommendations for your expected query volume and how warehouse credit consumption scales with concurrent queries. Ask whether Cortex Code ($20 monthly subscription) is included in volume discounts or priced separately. Ask about cross-region inference costs and whether they appear as separate line items or are included in the token rate.

When cost is not the deciding factor

A CTO optimising for lowest cost makes sense only if cost is the primary constraint. For most regulated enterprises, compliance posture, operational fit, and data residency dominate the decision. Foundry's current absence of Anthropic's "no data to Anthropic" guarantee might eliminate it regardless of cost. Snowflake makes sense for warehouse-native workloads and creates unnecessary complexity for stateless API inference. Vertex's availability across 35+ regions solves certain data residency requirements that Bedrock's limited region list does not. Bedrock's cost advantage evaporates if your compliance requirements force private networking and key management controls that another provider bundles or handles differently.

Cost optimisation is pointless if the route does not meet the core constraint. A CTO evaluating routes should apply filters in order: compliance posture first, operational fit second, cost third. Only compare costs between routes that pass both the compliance and operational filters.
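The filter order described above can be made explicit in code. The sketch below is illustrative: the `Route` fields and the three example routes are hypothetical and do not represent an assessment of any real provider.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    passes_compliance: bool
    passes_operational_fit: bool
    monthly_cost: float  # USD, from your own cost model


def shortlist(routes):
    """Apply filters in order: compliance, then operational fit, then cost."""
    compliant = [r for r in routes if r.passes_compliance]
    fit = [r for r in compliant if r.passes_operational_fit]
    return sorted(fit, key=lambda r: r.monthly_cost)


# Hypothetical candidates; the cheapest one fails the compliance filter.
candidates = [
    Route("Route A", passes_compliance=False, passes_operational_fit=True, monthly_cost=300),
    Route("Route B", passes_compliance=True, passes_operational_fit=True, monthly_cost=450),
    Route("Route C", passes_compliance=True, passes_operational_fit=False, monthly_cost=350),
    Route("Route D", passes_compliance=True, passes_operational_fit=True, monthly_cost=400),
]

for r in shortlist(candidates):
    print(f"{r.name}: ${r.monthly_cost:.0f}/month")
```

Cost is used only to order the routes that survive both filters, so a cheaper route that fails compliance never enters the comparison.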

Chapters eight and nine cover controls and governance cost implications. Private networking and customer-managed key requirements might add cost that varies significantly by route. Governance workflows and MDM coordination are cheaper on some routes than others depending on your existing platform. These decisions reshape the cost comparison you build in this chapter.

What this chapter does not cover

Three things sit deliberately outside the scope of this chapter. The per-provider operational detail, including regional availability, private networking, customer-managed encryption keys, and the specific contractual clauses that vary between providers, is treated route-by-route in chapters two through five. The full controls matrix across telemetry, egress, sandbox isolation, customer-managed keys, and identity is treated in chapter eight. The governance questions of policy enforcement, change control, credential rotation cadence, and MDM audit evidence are treated in chapter nine. The mental model in this chapter is enough to judge cost-per-route before procurement. The later chapters are where the procurement decision gets made.

Primary sources

AWS. Bedrock Pricing. Retrieved 22 April 2026.
Google Cloud. Vertex AI Claude Documentation. Retrieved 22 April 2026.
Microsoft. Azure AI Foundry Claude Page. Retrieved 22 April 2026.
Snowflake. Snowflake Cortex Documentation. Retrieved 22 April 2026.
Google Cloud. Introducing Gemini Enterprise Agent Platform. Retrieved 22 April 2026.

Nothing in this article is legal advice. It names regulatory frameworks and describes how each deployment route affects compliance posture. Compliance interpretation for your specific regulatory context, jurisdiction, and client contracts must be reviewed with qualified legal counsel. Verify current Anthropic documentation at https://claude.com/docs/cowork/3p/overview before making a procurement decision.