
Governance, Usage Policy, and Cost Control

Policy enforcement through MDM, token-based rate limiting, audit trails, credential rotation, and the change-control model for running Claude at enterprise scale.

Larry Maguire

GenAI Skills Academy

Once Cowork 3P is deployed across your organisation, governance shifts fundamentally from what you experience with a standard cloud SaaS application. There is no web console where an administrator logs in and adjusts who can use Claude, what they can do with it, or where their data goes. Instead, policy lives in MDM configuration, enforcement happens at the device level, and your audit trail is distributed across three separate systems. This chapter explains how to operate governance effectively within those constraints.

The foundational difference shapes everything that follows. Cowork 3P has no runtime configuration reload. Policy changes require a device restart. This means every governance action—from revoking a credential to tightening a rate limit to changing who can access a tool—becomes a planned maintenance event rather than a hot deployment. The consequence is that your governance model must be designed around this architecture instead of trying to retrofit standard enterprise patterns onto it. This chapter shows you how.

Why MDM-delivered configuration is governance

MDM is the policy enforcement mechanism in Cowork 3P. When you deploy a Claude Desktop installer to a device, the app reads configuration from two places at startup. First it checks for managed configuration delivered by your MDM platform, and then it checks for local configuration on the device. When a managed preference and a local preference disagree, the managed value wins. This precedence is absolute and applies to every setting Cowork 3P supports.

Anthropic's configuration reference states it directly.

When managed sources exist, they take precedence over local values.

This architecture has a material consequence. Policy enforcement happens once, at launch. Configuration is immutable at runtime, which means the same change-management process your platform team already uses for any other MDM-delivered application is the process that governs Claude. Jamf, Intune, Workspace ONE, or Group Policy becomes your governance instrument. No separate Claude control plane. No proprietary admin interface. The controls sit inside the tools your organisation has already invested in.
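The precedence rule is simple enough to sketch. The following Python illustrates the managed-over-local merge described above; the key names are illustrative, not Anthropic's actual configuration schema.

```python
# Sketch of the managed-over-local precedence rule.
# Key names are illustrative, not the app's real schema.

def effective_config(managed: dict, local: dict) -> dict:
    """Merge local and managed preferences; managed values always win."""
    merged = dict(local)    # start from device-local settings
    merged.update(managed)  # MDM-delivered keys override on any conflict
    return merged

local = {"telemetryEnabled": True, "rateLimitTokens": 500_000}
managed = {"telemetryEnabled": False}

# telemetryEnabled comes from MDM; rateLimitTokens stays local
print(effective_config(managed, local))
```

The merge runs once at launch; there is no later reconciliation, which is exactly why a changed managed value only takes effect after a restart.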

Planned, not reactive

Configuration changes require an app restart. This makes policy updates deliberate, auditable, and scheduled rather than reactive patches deployed at runtime.

The operational implication is that credential rotation, policy updates, and rate limit changes all require a planned restart window. You cannot swap in a new credential at runtime and have it take effect on a running session. You deploy the new configuration and the user restarts the application. The benefit of this constraint is that it makes every change deliberate. There is no possibility of a configuration change landing mid-session or a credential rotation happening invisibly. The drawback is that you cannot respond reactively. A compromised API key requires a configuration update and a planned restart cycle, not an instant kill switch.

To verify that configuration has propagated and been accepted, the Cowork 3P application provides a diagnostic tool. Anthropic documents this in the installation reference.

Go to Help → Troubleshooting → Copy Managed Configuration Report to verify successful configuration deployment. This summary shows detected keys, their source (managed vs. local), and credential validation status with redacted secrets.

This report is the record of what configuration was actually deployed to a device. It shows which keys were read from MDM versus from local configuration, and which credentials validated successfully. When auditors ask "what policy was in effect on this device on 15 March," the configuration report is your answer.

Policy locks at the device level

Tool-level access control in Cowork 3P is enforced through policy locks applied to managed MCP servers. Anthropic's extensions documentation states the mechanism.

Administrators deploy remote MCP servers to all devices using the managedMcpServers configuration key. These servers support per-tool policy locks (allow, ask, blocked).

Managed MCP servers are the admin-controlled layer. They are deployed via MDM configuration, appear automatically in every user's connector list, and cannot be removed by end-users. They support three policy states per tool. allow means the tool runs without prompting. ask means the tool runs only if the user confirms. blocked means the tool cannot be invoked at all.
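The three policy states reduce to a small decision function. This Python sketch shows the evaluation logic; the choice to treat unlisted tools as ask is an assumption for illustration, not documented behaviour.

```python
from enum import Enum

class Lock(Enum):
    """The three per-tool policy states for managed MCP servers."""
    ALLOW = "allow"
    ASK = "ask"
    BLOCKED = "blocked"

def can_invoke(tool: str, locks: dict, user_confirmed: bool = False) -> bool:
    """Decide whether a tool invocation may proceed under its policy lock.
    Defaulting unlisted tools to ASK is an assumption, not documented."""
    lock = locks.get(tool, Lock.ASK)
    if lock is Lock.BLOCKED:
        return False            # cannot be invoked at all
    if lock is Lock.ASK:
        return user_confirmed   # runs only after user confirmation
    return True                 # ALLOW: runs without prompting
```

Note that blocked wins unconditionally: even an explicit user confirmation cannot override it, which is what makes the lock an enforcement mechanism rather than a default.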

This granularity applies to managed MCP servers only. If your organisation distributes plugins through the organisation plugin directory on the device, those plugins can contain MCP servers defined in a bundled .mcp.json file. Those MCP servers do not support per-tool policy locks. The extensions reference documents this limitation directly.

Important limitation: MCP servers in .mcp.json don't support per-tool policy locks; use managedMcpServers for that control.

The consequence is a two-tier extension architecture. Managed MCP servers give you tool-level control. Organisation plugins give you distribution convenience but no per-tool enforcement. If your compliance model requires per-tool granularity, managed MCP servers are the only mechanism that delivers it. Organisation plugins can contain other utilities and capabilities but will not satisfy a requirement to lock down specific tools.

Beyond MCP servers, four toggles control extension access more broadly. isLocalDevMcpEnabled blocks the creation of local MCP servers. isDesktopExtensionEnabled blocks installation of .mcpb extensions from the Anthropic directory. isDesktopExtensionSignatureRequired enforces digital signature verification. isDesktopExtensionDirectoryEnabled hides the extension directory from the UI. These are binary toggles: when set to false, they make the feature unavailable rather than merely hiding it.

Token-based rate limiting for cost control

Cost governance in Cowork 3P is enforced through token-based rate limiting at the device level. Anthropic's configuration reference describes this.

Token-based rate limiting enforces per-window caps (input plus output combined) with configurable tumbling window duration (1 to 720 hours).

This is a per-device, per-window mechanism. You set a maximum number of tokens that can be consumed in a given time window. That window can be as short as 1 hour or as long as 720 hours (30 days). When the cap is reached, the application stops accepting new requests until the window rolls over. The cap applies to the combination of input and output tokens across all models on that device.
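The mechanics can be sketched as a tumbling-window counter. This Python illustrates the documented behaviour (combined input-plus-output cost, 1-to-720-hour windows); the field names and reset semantics are assumptions, not the application's internals.

```python
import time

class TumblingWindowCap:
    """Sketch of a per-device token cap over a tumbling window.
    Internals are assumed for illustration, not taken from the app."""

    def __init__(self, max_tokens: int, window_hours: int, clock=time.time):
        if not 1 <= window_hours <= 720:   # documented range: 1 hour to 30 days
            raise ValueError("window must be 1-720 hours")
        self.max_tokens = max_tokens
        self.window_seconds = window_hours * 3600
        self.clock = clock
        self.window_start = clock()
        self.used = 0

    def try_consume(self, input_tokens: int, output_tokens: int) -> bool:
        if self.clock() - self.window_start >= self.window_seconds:
            self.window_start = self.clock()  # window rolled over: counter resets
            self.used = 0
        cost = input_tokens + output_tokens   # cap counts input + output combined
        if self.used + cost > self.max_tokens:
            return False                      # rejected until the window rolls over
        self.used += cost
        return True
```

A tumbling window resets entirely at each boundary, unlike a sliding window; a device that exhausts its cap early in the window stays blocked until the rollover, however much earlier usage has "aged".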

The critical limitation to understand is scope. Device-level token caps are a cost-control mechanism but not a comprehensive budget tool. They cap usage on individual devices. They do not provide an organisation-wide token budget or a total spend limit. Provider-side rate limits apply independently. AWS Bedrock enforces its own per-account quota. Google Vertex AI enforces its own quota. These limits operate in parallel with the device-level cap. An organisation will be charged by the provider for every token consumed regardless of whether the device-level cap has been exhausted. The device cap prevents the device from consuming more tokens. It does not prevent the bill from arriving.

Organisations that need organisation-wide cost governance must aggregate usage data from three sources. Device-level rate limiting logs what was consumed on each device. Provider-side logs (CloudTrail for Bedrock, Cloud Audit Logs for Vertex) capture what was actually invoked. OpenTelemetry export can send raw usage to your own collector for custom analysis. None of these sources automatically roll up to a single dashboard. Organisations must design the aggregation themselves.
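One practical aggregation step is reconciling device-reported usage against provider-side billing records, since only the provider logs reflect what you will actually be charged for. The sketch below assumes simple per-device token totals; real log formats need their own parsers.

```python
def reconcile(device_usage: dict, provider_usage: dict,
              tolerance: float = 0.05) -> list:
    """Flag devices whose device-logged token counts diverge from the
    provider-side count by more than `tolerance` (5% by default).
    Dict shapes are hypothetical; real exports need their own parsers."""
    flagged = []
    for device, provider_tokens in provider_usage.items():
        local = device_usage.get(device, 0)
        if provider_tokens == 0:
            continue  # nothing billed, nothing to reconcile
        if abs(local - provider_tokens) / provider_tokens > tolerance:
            flagged.append(device)
    return sorted(flagged)
```

A device that appears in provider logs but not in device logs is the interesting failure case: it suggests usage that bypassed, or predated, the device-level cap.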

This is chapter nine of a hub that breaks down every Claude deployment route with primary-source references for pricing, residency, and contractual terms.

Credential rotation as a planned maintenance event

Credentials in Cowork 3P are stored in MDM configuration. They are read once at launch. Anthropic's configuration reference states this clearly.

Configuration is read once at launch, so fully quit and reopen the app after any change.

Every credential type is managed the same way. API keys, bearer tokens, service-account JSON files, and OAuth refresh tokens all come from configuration that is immutable at runtime. When a credential needs to change, you update the MDM configuration, the device reads the new value at next launch, and the application uses it. Until the user restarts the application, the old credential remains in use.

This architecture means credential rotation is not a reactive incident-response operation. It is a scheduled maintenance event. When you rotate a credential, you prepare a new configuration, deploy it through MDM, and schedule user restart windows. The benefit is predictability and auditability. The cost is planning overhead. A credential that is compromised cannot be instantly revoked. It remains valid until devices are restarted.

Organisations with strict incident-response requirements sometimes use a credential helper, which is an executable on the device that retrieves short-lived credentials dynamically at runtime. This pattern allows credential rotation without app restart, because the executable can fetch a new token from an identity provider every time the app runs. Credential helpers are supported for all four provider types and can integrate with single sign-on, public-key infrastructure, or identity-aware access proxy patterns. The trade-off is implementation complexity. Credential helpers require organisational infrastructure to provide just-in-time token generation.

Credential rotation is maintenance, not incident response

Credentials are read once at launch. Rotation is a configuration change that requires restart. Plan for it. Do not execute it reactively.

Building an audit trail across three layers

Audit data in Cowork 3P does not come from a single source. It comes from three separate systems, and organisations must wire them together to build a complete picture.

The first source is MDM delivery logs. Your MDM platform (Jamf, Intune, Group Policy) records when configuration profiles were delivered to which devices, whether they were accepted or rejected, and when they were last refreshed. These logs show what policy was in effect on a device at any given time. They do not show what the user actually did with Claude.

The second source is provider-side logs. When inference runs on Bedrock, AWS CloudTrail logs every model invocation in your account. When inference runs on Vertex, Google Cloud Audit Logs captures the equivalent data. These logs show what was actually invoked, by which account, at what time, with how many tokens consumed. They do not show what MDM configuration was in effect on the device that made the request.

The third source is OpenTelemetry export. Organisations can configure Cowork 3P to stream session activity (prompts, tool invocations, token counts, model calls) to a custom OpenTelemetry collector endpoint. This export happens independently of Anthropic's own telemetry, which can be disabled entirely. Anthropic documents the resulting network posture this way.

Disabling all four telemetry categories ensures 'no outbound connections to Anthropic-operated hosts at runtime' except the VM bundle download and your inference provider.

OpenTelemetry export is the only way to capture the full session activity in a regulated environment where conversation history must stay on the device. It sends structured logs to your collector. You parse and store them. The data lives in your infrastructure.

Organisations that need a complete audit trail must collect all three sources and correlate them. MDM logs show policy. Provider logs show inference. OpenTelemetry shows session detail. A regulator asking "what Claude activity happened on device X while this policy was in effect" requires joining data across all three layers.
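The join itself is mundane once the data is collected. This sketch answers the regulator's question above; every record shape in it is hypothetical, since real MDM delivery logs and CloudTrail exports each have their own formats.

```python
from datetime import datetime

def activity_under_policy(policy_windows, invocations, device_id, policy_id):
    """Return provider-side invocations on `device_id` that fall inside the
    MDM-logged windows during which `policy_id` was in effect.
    All record shapes here are hypothetical."""
    windows = [(w["start"], w["end"]) for w in policy_windows
               if w["device_id"] == device_id and w["policy_id"] == policy_id]
    return [inv for inv in invocations
            if inv["device_id"] == device_id
            and any(start <= inv["time"] <= end for start, end in windows)]
```

The join key is the device identifier plus a timestamp range, which is why clock consistency across the MDM platform, the provider logs, and the OpenTelemetry collector matters more than any individual log format.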

User identity and local versus centralised access control

User identity in Cowork 3P is device-based and local. There is no centralised user directory, no session-level authentication, no per-user authentication token managed by Anthropic. Identity is handled by the inference provider. When a user runs Claude on Bedrock, they authenticate using their AWS credential. When they run Claude on Vertex, they authenticate using their Google credential. The device itself has no central user registry.

This architectural choice has compliance implications. You cannot revoke a single user's access to Claude without affecting all users on the device. Access control is at the device level, not the per-user level. To block a specific user, you must either block their access at the provider level (revoke their AWS identity or Google service account) or prevent them from accessing the device altogether. Per-user rate limiting is not possible. Quotas are applied per-device, and all users on that device share the same quota.

Organisations that need per-user audit trails can build them via provider identity logs. If each user has a distinct identity at the provider level, then provider logs can be correlated back to individual users. The work of doing that correlation falls to your organisation, not to Anthropic.

Designing an organisational governance model

Effective governance in Cowork 3P requires three parallel systems. Centralised policy at the MDM layer. Device-level enforcement of that policy. Provider-side audit aggregation.

Start with centralised policy. Decide what tools are allowed, what usage limits apply to different device groups, what credentials are required, and what telemetry is acceptable. Encode those decisions into configuration profiles. Deploy them through your MDM platform to device groups. Document what was deployed and when.

Next, ensure device-level enforcement works. Test the configuration on a pilot device. Verify that the diagnostic report shows the policy was read correctly. Check that managed MCP servers cannot be removed. Check that rate limits are enforced. Check that disabled tools are actually disabled. Incident response and remediation depend on understanding exactly what constraints are in effect on a device.

Finally, aggregate provider-side audit data. If using Bedrock, configure a CloudTrail bucket in your security account. If using Vertex, enable Cloud Audit Logs to a cloud logging sink. If using OpenTelemetry export, deploy a collector and define retention. Build queries that can answer "how many tokens did organisation unit X consume last month" and "which devices invoked tool Y while policy Z was in effect." The aggregation work is separate from the deployment work, but both are necessary.

Organisations differ on where to place controls. A public sector organisation might run in the Locked Down security profile with no telemetry, all traffic going through a VPC endpoint, and stringent credential rotation windows. A regulated financial services organisation might run in Restricted mode with essential telemetry only, credential helpers for just-in-time rotation, and comprehensive provider-side audit logging. A rapidly iterating technology organisation might run in Standard mode and focus on cost governance through token rate limits rather than tool restrictions. The architecture can accommodate all these models because governance is decentralised at the enforcement layer and centralised at the policy and audit layers.

Governance in Cowork 3P is different from the cloud-console model most CTOs are familiar with. The difference is not a limitation. It is a consequence of the architecture decision to keep data on devices rather than in a central backend. Work with that architecture rather than against it.

Primary sources

Nothing in this article is legal advice. It names regulatory frameworks and describes how each deployment route affects compliance posture. Compliance interpretation for your specific regulatory context, jurisdiction, and client contracts must be reviewed with qualified legal counsel. Verify current Anthropic documentation at https://claude.com/docs/cowork/3p/overview before making a procurement decision.
