An enterprise running data analysis, BI, and reporting inside a data platform faces a straightforward question when Claude is needed. Should inference run outside the platform, in a separate cloud service, routing data in and results back out? Or should Claude run inside the platform itself, in the same perimeter where the data already lives? The latter is the data-platform-native route. When it works for your architecture, it eliminates a line of data movement and a separate inference audit, which appeals to regulated organisations with tight data residency requirements.
Snowflake Cortex is the reference implementation for this pattern. Claude models run inside Snowflake's infrastructure via SQL functions. Data stays in your warehouse. Inference completes inside the same perimeter. No separate API calls, no external data pipeline. This chapter explains how that works, what guarantees apply to data-platform routes, where they fit in your architecture, and what the competitive set looks like.
What data-platform-native inference means
Data-platform-native inference is a deployment pattern where the large language model runs as a managed service inside the data platform's own infrastructure. Prompts flow from the platform's SQL or Python layer directly to the model without leaving the platform. Responses flow back to the same place. Everything stays within one perimeter.
Compare this to the gateway routes covered in Chapter 1. In a gateway pattern, a user or application calls Claude through a cloud provider (Bedrock, Vertex, Foundry) which routes to Anthropic's infrastructure. The cloud provider acts as the ingress point. In data-platform-native, the data platform itself hosts the model. It is your own infrastructure that the model runs in. The distinction matters because it shapes what data moves, what audit trails apply, and which compliance postures are available to you.
The pattern appeals most to organisations where Claude augments analytics workflows. A BI analyst writes a SQL query that includes a Claude function to summarise text fields, generate insights, or classify rows. The query executes entirely inside the warehouse. No external API is called. From a compliance perspective, the data path is simpler. From an operational perspective, warehouse-scale inference is a single performance and cost knob to turn.
Snowflake Cortex as the reference implementation
Snowflake Cortex AI is the production example. The Cortex LLM functions guide describes the service as providing SQL-based access to multiple Claude model versions inside Snowflake's managed infrastructure, alongside models from OpenAI, Meta, Mistral, and DeepSeek. More specifically, the AI_COMPLETE SQL reference documents that Claude Opus 4.7, Claude Sonnet 4.6, and Claude Haiku 4.5 are available via the AI_COMPLETE SQL function, which the documentation recommends for the latest functionality.
From a technical angle, you invoke Claude through SQL. A query like SELECT SNOWFLAKE.CORTEX.AI_COMPLETE('claude-sonnet-4-6', 'Summarise this text: ' || text_column) sends the prompt to the model and returns the response as a result row. You can join this against other warehouse data, filter results, and aggregate responses across millions of rows. Billing is consumption-based. Snowflake credits are consumed proportional to input and output tokens.
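The SQL pattern above can also be generated programmatically when the same query shape is reused across tables. The sketch below builds that query as a string in Python; SNOWFLAKE.CORTEX.AI_COMPLETE and the model identifier come from the Snowflake documentation cited here, but the helper function, the table and column names, and the naive string interpolation are illustrative assumptions (production code should use bound parameters via the Snowflake connector).

```python
# Sketch: composing the AI_COMPLETE call shown above as SQL text.
# SNOWFLAKE.CORTEX.AI_COMPLETE and the model id are from Snowflake's docs;
# the helper, table, and column names are illustrative assumptions.

def summarise_query(table: str, text_column: str,
                    model: str = "claude-sonnet-4-6") -> str:
    """Build a per-row summarisation query over one text column."""
    return (
        "SELECT {col}, "
        "SNOWFLAKE.CORTEX.AI_COMPLETE('{model}', "
        "'Summarise this text: ' || {col}) AS summary "
        "FROM {table}"
    ).format(col=text_column, model=model, table=table)

print(summarise_query("support_tickets", "body"))
```

Because the result is an ordinary result set, the same pattern extends to joins, filters, and aggregations over the generated column.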
The Cortex LLM functions guide states that all models are fully hosted in Snowflake, ensuring performance, scalability, and governance while keeping data secure and in place. That phrasing—data stays in place—is the headline. Prompts do not leave Snowflake. Responses do not traverse the internet. The data path is contained.
Regional availability is a practical constraint. Not all Claude models are available natively in all regions. When the model you need is not in your default region, Snowflake offers cross-region inference. The mechanism keeps data within Snowflake but routes the inference request to a different region if necessary. Importantly, the cross-region inference documentation confirms that user inputs, service-generated prompts, and outputs are not stored or cached during cross-region inference. If inference crosses regions within the same cloud provider (both AWS, for example), traffic remains encrypted within that provider's network. If it crosses providers (AWS to GCP), it uses TLS encryption on the public internet. Credits consume in your requesting region, not the processing region, so there are no surprise data egress charges.
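The routing rule above reduces to a simple decision on the provider pair. A minimal sketch of that rule (the function name and return labels are mine, not Snowflake's):

```python
# Sketch of the cross-region transit rule described above: same cloud
# provider -> encrypted traffic on that provider's internal network;
# different providers -> TLS over the public internet. Per Snowflake's
# docs, nothing is stored or cached in the transit region either way.

def transit_path(requesting_provider: str, processing_provider: str) -> str:
    if requesting_provider == processing_provider:
        return "encrypted within provider network"
    return "TLS over public internet"

print(transit_path("aws", "aws"))  # intra-cloud case
print(transit_path("aws", "gcp"))  # cross-cloud case
```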
On warehouse sizing, the Cortex LLM functions guide is explicit. It recommends a warehouse size no larger than MEDIUM when calling Snowflake Cortex AI Functions, noting that larger warehouses do not improve performance but do increase costs. A MEDIUM warehouse is sufficient for inference; scaling up does not reduce model latency, so right-sizing is a straightforward cost saving.
Size for efficiency, not scale
Snowflake recommends a MEDIUM warehouse maximum. Larger warehouses increase cost without improving inference latency.
On pricing, Snowflake publishes that the Cortex Code IDE (the visual development environment for AI-assisted coding) costs $20 USD monthly after a trial, but per-token credit costs for Claude models are not published in public documentation. This is a gap. Snowflake structures this as enterprise pricing, which means the exact per-model rates are negotiated with your account manager. For procurement purposes, this means asking your Snowflake sales contact for the specific rates before committing to a pilot.
Stay with the series
This is chapter five of a hub that breaks down every Claude deployment route with primary-source references for pricing, residency, and contractual terms. New chapters as they publish, sent to your inbox. Subscribe to the newsletter.
The data-platform guarantee gap
Here is where the critical caveat sits. Data-platform routes that use gateway access, including Snowflake Cortex Code connecting to Anthropic, fall into the "gateway" category under Anthropic's Cowork 3P model. And the Cowork 3P overview is explicit. "The data-residency, compliance, and 'no conversation data sent to Anthropic' statements throughout these pages apply only when inferenceProvider is vertex or bedrock. They do not apply when using Azure Foundry or a gateway. Equivalent guarantees for Azure Foundry are coming, and we will update these pages when they are available."
That phrasing is precise and has two separate implications. First, it confirms that Bedrock and Vertex carry Anthropic's written guarantee that no conversation data reaches Anthropic's systems. Second, it explicitly states that gateway routes do not yet carry that guarantee. Data-platform-native routes using gateway access, including Snowflake Cortex, do not currently have that Anthropic-level guarantee.
This does not make data-platform routes unsuitable. It means that if your compliance posture depends on a written assertion that no conversation data reaches Anthropic, Bedrock and Vertex are the only two routes that meet that requirement today. Data-platform routes may still be right for your architecture if your compliance requirements live elsewhere—data residency inside one perimeter, for example—or if you have negotiated specific data handling terms with your platform vendor (Snowflake, BigQuery, etc.) that satisfy your audit and legal requirements.
For a CTO mapping routes to compliance, this boundary is the first filter. Read it again. Bedrock and Vertex. No others, as of 2026-04-22. Verify the current status at Anthropic's Cowork 3P overview before procurement conversations.
The data-platform guarantee gap
Data-platform routes using gateway access do not yet carry Anthropic's "no data sent to Anthropic" guarantee. Only Bedrock and Vertex carry it.
Competitive landscape and positioning
Snowflake Cortex is one option in a small category. The competitive set includes Google BigQuery, Microsoft Fabric, and Databricks, though each has a different positioning and feature set.
Google BigQuery offers native LLM inference through the AI.GENERATE_TEXT() SQL function. Claude is available as a partner model through Vertex AI, meaning the SQL function delegates inference to Vertex AI rather than embedding the model inside BigQuery itself. The data path keeps everything in Google Cloud, which satisfies data residency requirements, but the inference happens in a separate service rather than fully native to BigQuery. For CTOs already on GCP with existing BigQuery investment, this is the Snowflake-equivalent choice.
Microsoft Fabric provides native AI inference but exclusively through Azure OpenAI Service. The Fabric AI services overview lists GPT-5, GPT-4.1, and text analytics, but no Claude option. For organisations locked into the Microsoft ecosystem, Fabric is the data-platform-native pattern, but with OpenAI models, not Claude.
Databricks mentions support for Anthropic integration through its AI Gateway and Model Serving framework, though documentation on whether Claude is natively available as a first-class offering inside Databricks remains limited. If you are evaluating Databricks and Claude is a requirement, verify directly with Databricks whether Claude is available natively or requires external routing. Do not assume from the presence of an "Anthropic integration" that production support exists.
These alternatives exist, but the category is small. Only two to three platforms offer genuinely data-platform-native inference comparable to Snowflake Cortex. For most other data platforms—Redshift, SageMaker, Palantir—Claude integration requires external routing to services like Bedrock, not native inference inside the platform.
Operational requirements and trade-offs
Three operational constraints shape whether this route fits your deployment.
First, warehouse sizing. The MEDIUM warehouse recommendation applies broadly. Do not assume that buying larger warehouses improves inference performance. It does not. Cortex inference latency is determined by the model and the network path to the inference endpoint, not by your warehouse compute size. Larger warehouses consume more credits per unit time without improving the inference itself. Size conservatively and let the warehouse suspend when idle.
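The economics can be made concrete. The per-hour credit rates below follow Snowflake's standard schedule for warehouse sizes (doubling with each size step); verify the figures against your own contract. The key point is that Cortex runtime does not depend on size, so a larger warehouse only multiplies cost.

```python
# Sketch: why oversizing a warehouse for Cortex wastes credits.
# Per-hour rates follow Snowflake's standard doubling schedule;
# confirm against your contract before relying on these figures.

CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4,
                    "LARGE": 8, "XLARGE": 16}

def job_cost(size: str, runtime_hours: float) -> float:
    """Credits consumed by a job whose runtime does not vary with size."""
    return CREDITS_PER_HOUR[size] * runtime_hours

# Same 2-hour inference job: MEDIUM -> 8 credits, LARGE -> 16 credits,
# with identical latency either way.
print(job_cost("MEDIUM", 2), job_cost("LARGE", 2))
```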
Second, cross-region inference mechanics. If your data must stay in one region and the model you need is not natively available there, you must decide whether cross-region inference is acceptable for your compliance posture. Intra-cloud cross-region (both AWS, for example) happens with internal encryption. Cross-cloud (AWS to GCP) happens over TLS on the public internet. Data is not cached in the transit region, which is the key guarantee. But the data does physically transit outside your home region. Some compliance frameworks (GDPR, certain HIPAA implementations) have specific rules about personal data crossing borders. Verify that cross-region inference aligns with your legal requirements before deploying it at scale.
Third, pricing opacity. Snowflake publishes the Cortex Code subscription fee but not the per-token credit costs for Claude models in Cortex. This is a known gap in the public documentation and a standard feature of enterprise Snowflake pricing. You cannot model cost without contacting your account manager. For a proof of concept or pilot, request a sample query, estimate token consumption manually, and ask Snowflake for a rate card or ballpark estimate. Do not assume Cortex pricing is comparable to Bedrock or Vertex token pricing without verification.
Edition requirements are similarly unclear. The public documentation does not specify which Snowflake editions (Standard, Enterprise, Business Critical) support Cortex. Access control is documented (you need the USE AI FUNCTIONS privilege and the CORTEX_USER or AI_FUNCTIONS_USER database role), but whether all editions grant those roles is undefined. Ask your account manager which editions you need before purchasing.
When this route fits
Data-platform-native inference is right for your deployment when all of these are true.
You already run a data platform (Snowflake, BigQuery, Fabric) at scale. You have existing warehouse investment, BI tooling, and SQL expertise. Adding Claude to SQL queries is a natural extension of that investment rather than introducing a new system.
Compliance requires data to stay in one perimeter. You have auditors who understand Snowflake or BigQuery better than they understand external cloud services. Your legal team is comfortable with your platform vendor's data handling terms but wants to avoid routing data to third parties.
The use case is analytics or BI augmentation, not generalised chat or agent loops. Claude summarises text fields, classifies rows, generates insights from structured data. These are high-value tasks inside a warehouse. Generalised conversational AI may be better served by a dedicated chat interface routed through Vertex or Bedrock.
This route does not fit if you require Anthropic's written guarantee that no conversation data reaches Anthropic, or if your organisation uses multiple data platforms and needs a unified Claude access pattern. Bedrock and Vertex handle both of those constraints better.
Primary sources
- Snowflake. Cortex AI Functions. Retrieved 22 April 2026.
- Snowflake. AI_COMPLETE SQL Function Reference. Retrieved 22 April 2026.
- Snowflake. Cross-Region Inference Guide. Retrieved 22 April 2026.
- Google Cloud. BigQuery Generative AI Overview. Retrieved 22 April 2026.
- Microsoft. Fabric AI Services Overview. Retrieved 22 April 2026.
- Anthropic. Cowork on 3P — Overview. Retrieved 22 April 2026.
Nothing in this article is legal advice. It names regulatory frameworks and describes how each deployment route affects compliance posture. Compliance interpretation for your specific regulatory context, jurisdiction, and client contracts must be reviewed with qualified legal counsel. Verify current Anthropic documentation before making a procurement decision.
