Documentation Index

Fetch the complete documentation index at: https://portkey-docs-feat-bedrock-mantle.mintlify.app/llms.txt

Use this file to discover all available pages before exploring further.

Amazon Bedrock Mantle provides OpenAI-compatible API endpoints for model inference on AWS. Access models from Anthropic, Mistral, NVIDIA, Qwen, DeepSeek, Google, and more through familiar OpenAI SDK patterns.

AWS Bedrock Mantle Documentation

Quick Start

from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="@your-bedrock-mantle-provider"
)

response = portkey.chat.completions.create(
    model="mistral.ministral-3-3b-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=50
)

print(response.choices[0].message.content)

Add Provider in Model Catalog

  1. Go to Model Catalog in Portkey
  2. Search for Bedrock Mantle and select it
  3. Enter your AWS credentials

AWS Access Key

Use AWS Access Key ID, AWS Secret Access Key, and AWS Region. See the Credential Guide.

AWS Assumed Role

Use AWS Role ARN, an optional External ID, and AWS Region. See the Setup Guide.

Supported Endpoints

Bedrock Mantle supports four API endpoints. Each model works on specific endpoints based on its provider.

Chat Completions — /v1/chat/completions

Works with non-Anthropic models (Mistral, NVIDIA, Qwen, Google, DeepSeek, MiniMax, Moonshot, Z AI, Writer, OpenAI).
response = portkey.chat.completions.create(
    model="mistral.mistral-large-3-675b-instruct",
    messages=[{"role": "user", "content": "Explain quantum computing in one sentence."}],
    max_tokens=100
)

Messages — /v1/messages

Works with Anthropic models. Uses the Anthropic Messages API format.
import requests

response = requests.post(
    "https://api.portkey.ai/v1/messages",
    headers={
        "Content-Type": "application/json",
        "x-portkey-api-key": "PORTKEY_API_KEY",
        "x-portkey-provider": "bedrock-mantle",
        "x-portkey-virtual-key": "your-virtual-key"
    },
    json={
        "model": "anthropic.claude-opus-4-7",
        "max_tokens": 100,
        "messages": [{"role": "user", "content": "Hello, Claude!"}]
    }
)
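The Messages endpoint returns content as a list of typed blocks rather than a single string. A minimal helper to pull the text out of a response body (a sketch assuming the standard Anthropic Messages response shape; the sample dict below is illustrative, not a real API response):

```python
def extract_text(message: dict) -> str:
    """Concatenate the text blocks from an Anthropic Messages response body."""
    return "".join(
        block["text"]
        for block in message.get("content", [])
        if block.get("type") == "text"
    )

# Illustrative body in the Anthropic Messages response format
sample = {
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
}
print(extract_text(sample))
```

With a live call, this would be applied as `extract_text(response.json())`.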

Extended Thinking

Bedrock Mantle uses thinking.type: "adaptive" with output_config.effort instead of the standard Anthropic thinking.type: "enabled" with budget_tokens.
curl -X POST "https://api.portkey.ai/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key" \
  -d '{
    "model": "anthropic.claude-opus-4-7",
    "max_tokens": 16000,
    "thinking": {"type": "adaptive"},
    "output_config": {"effort": "high"},
    "messages": [{"role": "user", "content": "What is 27 * 453? Think step by step."}]
  }'
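The difference from the standard Anthropic format is easiest to see in the request body itself. A small builder for the Mantle-style payload (a sketch; the helper name and defaults are mine, only the field names come from the docs above):

```python
def mantle_thinking_body(model: str, prompt: str, effort: str = "high",
                         max_tokens: int = 16000) -> dict:
    """Build a /v1/messages body using Mantle's adaptive thinking fields."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        # Mantle: thinking.type "adaptive" plus output_config.effort,
        # instead of thinking.type "enabled" plus budget_tokens.
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }

body = mantle_thinking_body("anthropic.claude-opus-4-7",
                            "What is 27 * 453? Think step by step.")
```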

Responses — /v1/responses

Works with select models (e.g., openai.gpt-oss-120b, openai.gpt-oss-20b). Supports create, get, and delete operations.
curl -X POST "https://api.portkey.ai/v1/responses" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key" \
  -d '{
    "model": "openai.gpt-oss-120b",
    "input": "Hello! How can you help me today?"
  }'

Count Tokens — /v1/messages/count_tokens

Count input tokens for Anthropic models without making an inference call.
curl -X POST "https://api.portkey.ai/v1/messages/count_tokens" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key" \
  -d '{
    "model": "anthropic.claude-opus-4-7",
    "system": "You are a scientist",
    "messages": [{"role": "user", "content": "Hello, Claude"}]
  }'
Response:
{"input_tokens": 25}
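One practical use of count_tokens is checking that a prompt plus the requested completion fits a context window before paying for inference. A hedged sketch (the 200k window and the helper name are illustrative assumptions, not from these docs):

```python
def fits_context(input_tokens: int, max_tokens: int,
                 context_window: int = 200_000) -> bool:
    """True if the prompt plus the requested completion fit the window."""
    return input_tokens + max_tokens <= context_window

# Using the sample response above ({"input_tokens": 25}):
print(fits_context(25, 16000))
```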

Endpoint-Model Compatibility

Not all models support all endpoints. Use this table as a quick reference:
| Model Family | Chat Completions | Messages | Responses | Count Tokens |
| --- | --- | --- | --- | --- |
| anthropic.* (Claude) | — | Yes | — | Yes |
| openai.gpt-oss-120b, openai.gpt-oss-20b | Yes | — | Yes | — |
| openai.gpt-oss-safeguard-* | Yes | — | — | — |
| mistral.* | Yes | — | — | — |
| nvidia.* | Yes | — | — | — |
| qwen.* | Yes | — | — | — |
| google.gemma-* | Yes | — | — | — |
| deepseek.* | Yes | — | — | — |
| minimax.* | Yes | — | — | — |
| moonshotai.* | Yes | — | — | — |
| zai.* | Yes | — | — | — |
| writer.* | Yes | — | — | — |
Use the /v1/models endpoint on Mantle directly to discover the full list of models available in your AWS account and region.
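The table above can be encoded as a small routing helper so client code picks the right endpoint from the model id (a sketch that mirrors the table; treating unlisted prefixes as chat-completions-only is an assumption on my part):

```python
def endpoints_for(model_id: str) -> list[str]:
    """Return the API paths a Mantle model id supports, per the table above."""
    if model_id.startswith("anthropic."):
        return ["/v1/messages", "/v1/messages/count_tokens"]
    if model_id in ("openai.gpt-oss-120b", "openai.gpt-oss-20b"):
        return ["/v1/chat/completions", "/v1/responses"]
    # All other families in the table are chat-completions only.
    return ["/v1/chat/completions"]

print(endpoints_for("anthropic.claude-opus-4-7"))
```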

Streaming

Enable streaming by setting stream=True.
response = portkey.chat.completions.create(
    model="qwen.qwen3-32b",
    messages=[{"role": "user", "content": "Tell me a story"}],
    max_tokens=200,
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Limitations

  • Not all models support all endpoints. Each model on Mantle only works on specific endpoints based on its provider family. For example, Anthropic models only work on /v1/messages, not /v1/chat/completions. See the compatibility table above.
  • Model availability is account and region specific. The models available to you depend on your AWS account's access permissions and the region you're using. Some models (e.g., research previews) require explicit allowlisting by AWS.
  • Extended thinking uses a different format. Mantle requires thinking.type: "adaptive" with output_config.effort instead of the standard Anthropic thinking.type: "enabled" with budget_tokens.
  • /v1/responses input items listing is not supported. The GET /v1/responses/:id/input_items endpoint is not available on Mantle.
  • Prompt caching minimum threshold. Anthropic prompt caching on Mantle requires the cached content to meet a minimum token threshold (typically 2048+ tokens). Smaller prompts won't trigger caching.
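The caching threshold in the last point can be guarded against in code: only attach a cache_control marker when the content is plausibly above the minimum. A rough sketch (the 4-characters-per-token estimate is a common heuristic, not a Mantle guarantee):

```python
MIN_CACHE_TOKENS = 2048  # minimum caching threshold noted above

def system_block(text: str) -> dict:
    """Build a system content block, marking it cacheable only if large enough."""
    block = {"type": "text", "text": text}
    if len(text) // 4 >= MIN_CACHE_TOKENS:  # ~4 chars/token heuristic
        block["cache_control"] = {"type": "ephemeral"}
    return block

small = system_block("You are a scientist")
large = system_block("reference document " * 1000)
```

For an exact count rather than a heuristic, the /v1/messages/count_tokens endpoint above can be called first.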

Supported Models

Model availability depends on your AWS account and region. Discover available models with:
curl -X GET "https://bedrock-mantle.us-east-1.api.aws/v1/models" \
  -H "Authorization: Bearer YOUR_API_KEY"
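The models response follows the OpenAI list format, so the ids can be pulled out with a few lines (a sketch assuming the standard OpenAI-style shape with a "data" array of {"id": ...} objects; the sample payload is illustrative):

```python
def model_ids(models_response: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [m["id"] for m in models_response.get("data", [])]

sample = {"object": "list", "data": [
    {"id": "anthropic.claude-opus-4-7", "object": "model"},
    {"id": "qwen.qwen3-32b", "object": "model"},
]}
print(model_ids(sample))
```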

Bedrock Mantle Model List


Supported Regions

| Region | Endpoint |
| --- | --- |
| US East (N. Virginia) | bedrock-mantle.us-east-1.api.aws |
| US East (Ohio) | bedrock-mantle.us-east-2.api.aws |
| US West (Oregon) | bedrock-mantle.us-west-2.api.aws |
| Asia Pacific (Mumbai) | bedrock-mantle.ap-south-1.api.aws |
| Asia Pacific (Tokyo) | bedrock-mantle.ap-northeast-1.api.aws |
| Asia Pacific (Jakarta) | bedrock-mantle.ap-southeast-3.api.aws |
| Europe (Frankfurt) | bedrock-mantle.eu-central-1.api.aws |
| Europe (Ireland) | bedrock-mantle.eu-west-1.api.aws |
| Europe (London) | bedrock-mantle.eu-west-2.api.aws |
| Europe (Milan) | bedrock-mantle.eu-south-1.api.aws |
| Europe (Stockholm) | bedrock-mantle.eu-north-1.api.aws |
| South America (São Paulo) | bedrock-mantle.sa-east-1.api.aws |
Set the region when creating your provider in Model Catalog.
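Since every endpoint in the table follows one pattern, the base URL can be derived from the region code alone (a sketch; valid only for the regions listed above, and the helper names are mine):

```python
MANTLE_REGIONS = {
    "us-east-1", "us-east-2", "us-west-2",
    "ap-south-1", "ap-northeast-1", "ap-southeast-3",
    "eu-central-1", "eu-west-1", "eu-west-2",
    "eu-south-1", "eu-north-1", "sa-east-1",
}

def mantle_endpoint(region: str) -> str:
    """Return the Bedrock Mantle base URL for a supported AWS region."""
    if region not in MANTLE_REGIONS:
        raise ValueError(f"Bedrock Mantle is not available in {region}")
    return f"https://bedrock-mantle.{region}.api.aws"

print(mantle_endpoint("eu-west-1"))
```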

Next Steps

Explore Portkey features that work with Bedrock Mantle:

SDK Reference

Python and Node.js SDK documentation.

Gateway Configs

Add fallbacks, retries, load balancing, and caching.

Observability

Track logs, traces, costs, and latency.

Prompt Management

Version and manage prompts across models.
Last modified on April 23, 2026