API Compatibility

OpenAI-Compatible API.
Zero Code Changes.

Switch from OpenAI or Anthropic in minutes. Same SDKs, same code, up to 10x faster inference on EU sovereign infrastructure.

Switch in 5 Minutes

Change one line of code. Keep everything else.

Sign up & get API key

Create your free account at cloud.infercom.ai. No credit card required.

Change base URL

Point your existing OpenAI or Anthropic SDK at api.infercom.ai.

Choose your model

Use MiniMax M2.7 Ultraspeed (flagship), gpt-oss-120b (fastest), or any model from our catalog.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.infercom.ai/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="MiniMax-M2.7",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True
)

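for chunk in response:
    # Some streamed chunks carry no text, so skip empty deltas instead of printing "None".
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")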

Choose Your SDK

Full compatibility with OpenAI and Anthropic client libraries

OpenAI SDK

OpenAI SDK Compatible

Drop-in replacement for OpenAI. Works with your existing code, tools, and frameworks.

Same /v1/chat/completions endpoint
Works with LangChain, LlamaIndex, CrewAI
Python, JavaScript, TypeScript, REST
Streaming, function calling, JSON mode
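
For the REST option, a request to the same endpoint can be assembled with only the Python standard library. This is a minimal sketch, not the provider's reference client: the base URL and model name are taken from the SDK examples on this page, `build_chat_request` is a hypothetical helper, and the `response_format` field assumes JSON mode is requested the same way as with OpenAI.

```python
import json
import urllib.request

BASE_URL = "https://api.infercom.ai/v1"  # base URL used in the SDK examples on this page


def build_chat_request(api_key, messages, model="MiniMax-M2.7", **options):
    """Build (but do not send) a POST request for /v1/chat/completions.

    Hypothetical helper for illustration. Extra keyword arguments such as
    stream=True or response_format={...} are copied into the JSON body.
    """
    body = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # OpenAI-style bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "your-api-key",
    [{"role": "user", "content": "Reply with a JSON object containing a greeting."}],
    # Assumption: JSON mode follows OpenAI's request convention.
    response_format={"type": "json_object"},
)

# Sending the request needs a valid key and network access:
# with urllib.request.urlopen(request) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before spending any credit.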

View OpenAI compatibility docs

Anthropic SDK

Anthropic Messages API

Use the Anthropic SDK directly. Same Messages API format, EU hosted.

Standard /v1/messages endpoint
Works with existing Anthropic code
Tool use on MiniMax and gpt-oss-120b
Same authentication patterns

Vision, extended thinking, and prompt caching are not currently supported.
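
A text-only call in the Messages format might look like the sketch below, written against the standard library so the request shape is visible. Several details are assumptions rather than documented behaviour: that the endpoint lives at https://api.infercom.ai/v1/messages, that the `x-api-key` and `anthropic-version` headers are accepted as they are by Anthropic, and that the model name matches the one used in the OpenAI examples. `build_messages_request` and `extract_text` are hypothetical helpers.

```python
import json
import urllib.request

# Assumption: the Messages endpoint is served from the same host as the OpenAI-style API.
MESSAGES_URL = "https://api.infercom.ai/v1/messages"


def build_messages_request(api_key, prompt, model="MiniMax-M2.7", max_tokens=256):
    """Build (but do not send) a text-only request in Anthropic Messages format."""
    body = {
        "model": model,
        "max_tokens": max_tokens,  # required by the Messages format
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=MESSAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,               # Anthropic-style key header
            "anthropic-version": "2023-06-01",  # assumption: accepted unchanged
            "content-type": "application/json",
        },
        method="POST",
    )


def extract_text(reply):
    """Join the text blocks of a Messages-format response body."""
    return "".join(
        block["text"] for block in reply.get("content", []) if block.get("type") == "text"
    )


request = build_messages_request("your-api-key", "Hello")

# Sending the request needs a valid key and network access:
# with urllib.request.urlopen(request) as resp:
#     print(extract_text(json.load(resp)))
```

The request deliberately contains plain text only, since vision, extended thinking, and prompt caching are listed as unsupported.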

View Anthropic compatibility docs

More Examples

Works with your favorite languages and frameworks

JavaScript / TypeScript

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.infercom.ai/v1',
  apiKey: 'your-api-key',
});

const response = await client.chat.completions.create({
  model: 'MiniMax-M2.7',
  messages: [{ role: 'user', content: 'Hello' }],
});

console.log(response.choices[0].message.content);

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.infercom.ai/v1",
    api_key="your-api-key",
    model="MiniMax-M2.7"
)

response = llm.invoke("Explain quantum computing")
print(response.content)

View full quickstart guide

Faster Than the Alternatives

SambaNova's dataflow architecture delivers up to 10x faster inference than GPU-based providers.

400+ tokens/sec on MiniMax M2.7
700+ tokens/sec on gpt-oss-120b
Up to 10x faster than GPU inference
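
To put the throughput figures in terms of waiting time, the sketch below converts tokens per second into seconds per response. It treats the quoted rates as sustained generation speed and ignores network latency and time to first token, which is a simplification; `generation_seconds` is a hypothetical helper.

```python
def generation_seconds(output_tokens, tokens_per_second):
    """Approximate time to generate a response at a steady token rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Rates are the lower bounds quoted on this page ("400+" and "700+").
for label, rate in (("MiniMax M2.7", 400), ("gpt-oss-120b", 700)):
    print(f"{label}: {generation_seconds(1000, rate):.2f} s for 1,000 tokens")
```

At these rates a 1,000-token answer takes roughly 2.5 seconds on MiniMax M2.7 and about 1.4 seconds on gpt-oss-120b.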

See performance benchmarks

Supported Endpoints

/v1/chat/completions

Chat completions with streaming support

/v1/messages

Anthropic Messages API format

/v1/models

List available models and metadata

/v1/embeddings

Text embeddings (selected models)
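
Because model availability differs per endpoint, it can help to query /v1/models before hard-coding a model name. The sketch below assumes the response follows the OpenAI list shape (an object with a `data` array of entries that each carry an `id`); the sample payload is illustrative, not real API output, and both helpers are hypothetical.

```python
import json
import urllib.request

MODELS_URL = "https://api.infercom.ai/v1/models"


def build_models_request(api_key):
    """Build (but do not send) a GET request for the model list."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload):
    """Return the sorted model IDs from an OpenAI-style list response."""
    return sorted(item["id"] for item in payload.get("data", []))


# Illustrative payload only; the real response may contain more fields and models.
sample = {
    "object": "list",
    "data": [
        {"id": "gpt-oss-120b", "object": "model"},
        {"id": "MiniMax-M2.7", "object": "model"},
    ],
}
print(model_ids(sample))

# Against the live API (needs a valid key and network access):
# with urllib.request.urlopen(build_models_request("your-api-key")) as resp:
#     print(model_ids(json.load(resp)))
```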

Full API reference

EU Sovereignty Included

OpenAI-compatible API, European jurisdiction.

All inference in Germany (Equinix MU4)
No US CLOUD Act exposure
GDPR compliant, ISO 27001 certified
Zero data retention

Learn about EU sovereignty

ISO 27001 Certified
GDPR Compliant
AI Act Ready
Tier III+ Datacenter

Documentation

Everything you need to integrate

Quickstart Guide

Get your first API call working in minutes

API Reference

Complete endpoint documentation with examples

OpenAPI Spec

OpenAPI 3.1 schema for code generation

Available Models

MiniMax M2.7, gpt-oss-120b, Gemma 3, and more

Ready to Switch?

€5 free credit. No credit card required. Start in 2 minutes.
