Local vs Cloud Processing Options
The deployment decision
AI coding tools offer different processing models that affect where your data actually lives. The previous section covered what data these tools transmit. This one examines where that data goes and what options exist for keeping it inside your walls.
Direct API access is simple but sends everything to the provider. Cloud provider integrations (Bedrock, Vertex AI, Azure OpenAI) route through infrastructure you already control. The tradeoff is complexity versus control. Neither option is universally better.
Direct API access
The simplest setup routes requests directly to the AI provider.
Claude Code connects to Anthropic's API by default.
Authentication uses an API key in the `ANTHROPIC_API_KEY` environment variable.
All inference traffic goes to Anthropic's servers under their terms of service.
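A minimal direct-API setup looks like this (the key value is a placeholder; substitute a key from your Anthropic console):

```bash
# Direct API: Claude Code reads the key from the environment.
export ANTHROPIC_API_KEY="sk-ant-..."   # placeholder; use your own key
claude                                  # starts Claude Code against Anthropic's API
```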
Codex connects to OpenAI's API.
Authentication happens through browser-based OAuth (`codex login`) or an API key via stdin.
Enterprise data handling requires ChatGPT Enterprise, Education, or Healthcare plans.
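As a sketch, both authentication paths from the command line (the `--with-api-key` flag is our assumption; verify against `codex login --help` for your installed version):

```bash
# Option 1: browser-based OAuth
codex login

# Option 2: pipe an API key via stdin (flag name is an assumption;
# check your Codex CLI version)
printf '%s' "$OPENAI_API_KEY" | codex login --with-api-key
```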
GitHub Copilot connects to GitHub's infrastructure, which routes to various model providers (OpenAI, Anthropic, Google) depending on the feature and model. Enterprise and Business plans include data protection commitments.
Direct API access means new models and capabilities appear immediately. No waiting for cloud providers to integrate them. The downside: limited control over data residency and network routing.
Cloud provider integrations
For organizations already running on major cloud platforms, routing AI traffic through your existing provider adds control.
AWS Bedrock for Claude Code
Amazon Bedrock runs Claude models inside AWS infrastructure. Claude Code can use Bedrock instead of calling Anthropic directly.
To enable Bedrock routing:
```bash
export CLAUDE_CODE_USE_BEDROCK=1
export AWS_REGION=us-east-1
```

The `AWS_REGION` variable is required; Claude Code does not read region settings from `~/.aws/config`.
Authentication options:
- AWS CLI credentials (`aws configure`)
- Environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`)
- SSO profile via `AWS_PROFILE` (see the sketch after this list)
- Bedrock API keys via `AWS_BEARER_TOKEN_BEDROCK`
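For example, the SSO-profile route might look like this (the profile name is illustrative):

```bash
# Illustrative SSO setup: sign in once, then point Claude Code at Bedrock.
export AWS_PROFILE=my-sso-profile       # hypothetical profile name
aws sso login --profile my-sso-profile  # opens a browser for SSO sign-in
export CLAUDE_CODE_USE_BEDROCK=1
export AWS_REGION=us-east-1
claude
```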
Required IAM permissions:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:ListInferenceProfiles"
      ],
      "Resource": [
        "arn:aws:bedrock:*:*:inference-profile/*",
        "arn:aws:bedrock:*:*:foundation-model/*"
      ]
    }
  ]
}
```

Model selection can be customized:
```bash
export ANTHROPIC_MODEL='us.anthropic.claude-sonnet-4-5-20250929-v1:0'
export ANTHROPIC_SMALL_FAST_MODEL='us.anthropic.claude-haiku-4-5-20251001-v1:0'
```

Why this matters for compliance:
AWS Bedrock supports VPC Interface Endpoints via PrivateLink. With this configuration, inference traffic never touches the public internet. Your instances do not need public IP addresses. AWS does not store or log prompts, and data is not used for training.
Cross-region inference profiles can route requests dynamically across US, Europe, and APAC regions while staying within regional compliance boundaries.
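For example, switching the model ID prefix from `us.` to `eu.` selects a Europe-scoped inference profile (availability of a given model in that profile is an assumption; verify in your account):

```bash
# Europe-scoped cross-region inference profile (availability is an assumption)
export ANTHROPIC_MODEL='eu.anthropic.claude-sonnet-4-5-20250929-v1:0'
```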
Note: Bedrock and Vertex AI users have non-essential telemetry disabled by default.
Google Vertex AI for Claude Code
Vertex AI runs Claude models inside GCP infrastructure.
To enable Vertex AI routing:
```bash
export CLAUDE_CODE_USE_VERTEX=1
export CLOUD_ML_REGION=global
export ANTHROPIC_VERTEX_PROJECT_ID=YOUR-PROJECT-ID
```

Setting `CLOUD_ML_REGION` to `global` uses Google's global endpoint with dynamic routing.
For guaranteed data locality, use a specific regional endpoint instead; regional endpoints cost 10% more than the global endpoint.
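For example, to pin traffic to a single region (the region shown is illustrative; pick one where the model is available):

```bash
# Regional endpoint instead of global: guarantees data locality, costs ~10% more.
export CLOUD_ML_REGION=europe-west1
```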
Authentication:
```bash
gcloud auth login
gcloud config set project YOUR-PROJECT-ID
gcloud auth application-default login
gcloud services enable aiplatform.googleapis.com
```

The required IAM role is `roles/aiplatform.user`, which includes `aiplatform.endpoints.predict`.
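Granting that role to a developer might look like this (the member address is a placeholder):

```bash
# Grant the Vertex AI user role (includes aiplatform.endpoints.predict).
gcloud projects add-iam-policy-binding YOUR-PROJECT-ID \
  --member="user:dev@example.com" \
  --role="roles/aiplatform.user"
```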
Why this matters for compliance:
VPC Service Controls can block all public internet access to Vertex AI APIs. Private Service Connect gives you private endpoints within your VPC. Regional endpoints guarantee that data at rest stays within specified geographic boundaries.
Per-model region overrides let you route specific models to specific regions:
```bash
export VERTEX_REGION_CLAUDE_3_5_HAIKU=us-east5
export VERTEX_REGION_CLAUDE_4_0_OPUS=europe-west1
```

Azure OpenAI for Codex
Codex CLI can use Azure OpenAI instead of calling OpenAI directly. This routes inference through your Azure infrastructure using your own deployments.
Configuration in `~/.codex/config.toml`:

```toml
model = "gpt-5-codex"
model_provider = "azure"

[model_providers.azure]
name = "Azure OpenAI"
base_url = "https://YOUR_RESOURCE.openai.azure.com/openai/v1"
env_key = "AZURE_OPENAI_API_KEY"
wire_api = "responses"
```

Set the API key:
```bash
export AZURE_OPENAI_API_KEY="<your-api-key>"
```

Configuration notes:
- The `model` value must match your Azure deployment name
- Include `/v1` in the `base_url` path
- The `env_key` references an environment variable; never hardcode keys in the config file
- Entra ID authentication is not currently supported
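Before pointing Codex at the deployment, a quick smoke test with curl can confirm the key and `base_url` (a sketch assuming the v1 Responses endpoint shape; resource and deployment names are placeholders):

```bash
# Smoke test: POST to the Responses endpoint under the same base_url the
# config uses. "model" must be your Azure deployment name.
curl -s "https://YOUR_RESOURCE.openai.azure.com/openai/v1/responses" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5-codex", "input": "ping"}'
```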
Azure OpenAI resources can be deployed in specific Azure regions. Private endpoints, virtual networks, and managed identity integration give you the network controls you expect from Azure.
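An illustrative az CLI sketch for a private endpoint (all names and the resource group are placeholders; assumes the Azure OpenAI resource already exists):

```bash
# Look up the Azure OpenAI resource ID, then attach a private endpoint to it
# so traffic to the deployment stays inside the VNet.
RESOURCE_ID=$(az cognitiveservices account show \
  --name YOUR_RESOURCE --resource-group my-rg --query id -o tsv)

az network private-endpoint create \
  --name openai-pe \
  --resource-group my-rg \
  --vnet-name my-vnet \
  --subnet my-subnet \
  --private-connection-resource-id "$RESOURCE_ID" \
  --group-id account \
  --connection-name openai-pe-conn
```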
GitHub Copilot deployment options
GitHub Copilot Enterprise requires GitHub Enterprise Cloud. Copilot is not available for self-hosted GitHub Enterprise Server.
Data residency: GitHub Enterprise Cloud with data residency is available in:
- European Union
- Australia
- United States
- Japan
With data residency enabled, your enterprise runs on a dedicated subdomain of ghe.com (for example, octocorp.ghe.com).
Data at rest stays in the designated location.
Network controls:
GitHub-hosted runners can deploy into Azure Virtual Networks. This lets Copilot coding agent access private resources without exposing them to the internet.
For organizations wanting more control, GitHub Copilot's Bring Your Own Key (BYOK) option (currently in public preview) routes requests through your own cloud AI deployments:
- Azure OpenAI (via Microsoft Foundry)
- AWS Bedrock
- GCP Vertex AI
- Direct Anthropic or OpenAI API
BYOK usage bills directly from your provider and does not count against GitHub Copilot quotas.
When to use which option
| Requirement | Recommended approach |
|---|---|
| Maximum data residency control | AWS Bedrock with VPC endpoints or Vertex AI with VPC-SC |
| Fastest new model access | Direct API (Anthropic or OpenAI) |
| AWS-heavy enterprise | AWS Bedrock |
| GCP-heavy enterprise | Google Vertex AI |
| Azure-heavy enterprise | Azure OpenAI for Codex; Copilot BYOK for broader coverage |
| Regional data requirements for GitHub | GitHub Enterprise Cloud with data residency |
| Multiple model providers | Copilot BYOK or direct API per tool |
Pricing:
Anthropic maintains pricing parity across channels. Claude costs the same whether accessed through the direct API, Bedrock, or Vertex AI. However, regional endpoints on both Bedrock and Vertex AI carry a 10% premium over global endpoints. Cloud provider billing may add data transfer and networking charges on top.
Feature timing:
New Claude models typically appear first on the direct API. Bedrock and Vertex AI follow within days to weeks. Organizations that need immediate access to new capabilities may maintain direct API access alongside cloud provider integrations.
VPC deployment patterns
For strict data residency requirements, VPC configurations prevent inference traffic from touching the public internet.
AWS Bedrock VPC endpoints:
| API category | Service name |
|---|---|
| Control plane | com.amazonaws.<region>.bedrock |
| Runtime | com.amazonaws.<region>.bedrock-runtime |
| Agents build-time | com.amazonaws.<region>.bedrock-agent |
| Agents runtime | com.amazonaws.<region>.bedrock-agent-runtime |
Traffic stays within the AWS network. No public IP required. Security groups control access. NAT gateway data transfer costs disappear.
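Creating the runtime endpoint might look like this (VPC, subnet, and security group IDs are placeholders):

```bash
# PrivateLink interface endpoint for the Bedrock runtime API: inference
# requests resolve to private IPs inside the VPC.
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0abc1234 \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.bedrock-runtime \
  --subnet-ids subnet-0def5678 \
  --security-group-ids sg-0123abcd \
  --private-dns-enabled
```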
Google Vertex AI private access:
VPC Service Controls create a perimeter that blocks all public internet access to Vertex AI APIs. Private Service Connect establishes private endpoints within your VPC. Regional endpoints guarantee data at rest stays within specified geographic boundaries.
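A minimal perimeter sketch (project number, policy ID, and perimeter name are placeholders; assumes an Access Context Manager policy already exists):

```bash
# Service perimeter that restricts Vertex AI so it is reachable only from
# inside the perimeter, not from the public internet.
gcloud access-context-manager perimeters create vertex_perimeter \
  --title="Vertex AI perimeter" \
  --resources=projects/123456789012 \
  --restricted-services=aiplatform.googleapis.com \
  --policy=0123456789
```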
GitHub Actions with Azure VNET:
For Copilot coding agent, GitHub-hosted runners can deploy into Azure VNETs. Runner NICs deploy directly into your virtual network. Network Security Group rules apply automatically. ExpressRoute and VPN tunnels to on-premises resources work as expected.
Configuration summary
Claude Code environment variables for cloud providers:
```bash
# AWS Bedrock
export CLAUDE_CODE_USE_BEDROCK=1
export AWS_REGION=us-east-1

# Google Vertex AI
export CLAUDE_CODE_USE_VERTEX=1
export CLOUD_ML_REGION=global
export ANTHROPIC_VERTEX_PROJECT_ID=YOUR-PROJECT-ID

# Disable non-essential telemetry (automatic for Bedrock/Vertex)
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
```

Codex configuration for Azure OpenAI:
```toml
# ~/.codex/config.toml
model = "your-deployment-name"
model_provider = "azure"

[model_providers.azure]
base_url = "https://YOUR_RESOURCE.openai.azure.com/openai/v1"
env_key = "AZURE_OPENAI_API_KEY"
wire_api = "responses"
```

The next section covers data retention policies: how long different services store your inference data and what zero-retention options exist.