How Do You Want Your Inference?

Start with our inference service and scale to dedicated capacity or on-premises as your needs grow.

€5 Free Credit to StartEUR Native PricingNo Hidden Fees or Minimums

All options include a fully managed, OpenAI-compatible API. We handle model deployments, infrastructure, and updates — you just call the endpoint.

How Pricing Works

AI inference is priced per token. A token is roughly 4 characters in English. You pay for what you use — no minimums, no commitments.

What is a token?

Tokens are the basic units LLMs process. In English, 1 token ≈ 4 characters or ¾ of a word. 1,000 words ≈ 1,300 tokens. Other languages may use more tokens per character.

Input vs Output

You pay separately for input (your prompt) and output (the model's response). Output tokens typically cost more because they require more computation to generate.

Choosing a model

Larger models (like DeepSeek V3.1) are more capable but cost more. Smaller models (like gpt-oss-120b) are fast and cheap for simpler tasks. EU-hosted models guarantee data sovereignty.

Access Tiers

FreeSelf-Service

€5 credit

No credit card required. Standard rate limits.

View rate limits →
DeveloperSelf-Service

Pay-as-you-go

Production-ready rate limits. Per-token pricing in EUR.

View rate limits →
EnterpriseCustom pricing

SLA, priority support, custom rate limits, and add-ons.

Contact sales →
🇪🇺

EU Sovereign Models

Full GDPR compliance, no US CLOUD Act exposure

ModelInput/1MOutput/1MContext
gpt-oss-120b0.220.59128K
MiniMax-M2.50.301.20164K
DeepSeek-V3.13.004.50128K

Prices in EUR, excl. VAT. EU-hosted models include full data sovereignty.

🌐

Global Model Catalog

Additional models via global infrastructure

ModelInput/1MOutput/1MRegion
Llama 3.1 8B0.100.20US
Qwen3 32B0.400.80JP
Qwen3 235B0.400.80JP
Llama 3.3 70B0.601.20US
DeepSeek R1 Distill 70B0.701.40JP
Llama 4 Maverick0.631.80JP
DeepSeek V33.004.50US
DeepSeek V3.1 Terminus3.004.50US
DeepSeek V3.23.004.50US
DeepSeek R15.007.00US

Prices in EUR, excl. VAT. Requests processed on global infrastructure outside the EU.

Pricing Calculator

250 tokens

125 tokens

Cost per request

0.000129

With €5 credit: 38,834 requests

EU Sovereign — data stays in EU

Cost breakdown

Input: 0.000055Output: 0.000074Rate: €0.22/€0.59 per 1M tokens

Estimate only. Token counts are approximate (~4 characters per token for English). Actual tokens vary by language, content type, and model tokenizer. For precise costs, check your usage in the cloud portal after making API calls.

Compare Features Across Tiers

All tiers include EU sovereignty by default. Scale up as your needs grow.

FeatureInference ServiceDedicatedOn-Premises
EU-Hosted by DefaultYour Location
Pricing ModelPay-per-tokenReserved capacityCustom licensing
Rate LimitsPer planHardware onlyUnlimited
Model CatalogStandard modelsStandard + CustomAny model
Custom Model Hosting
SupportDocs & Community to PriorityPriorityDedicated 24/7
SLA GuaranteeBest effort to CustomCustomCustom
Air-Gapped Deployment
Data Residency ControlEU defaultEU guaranteedYour choice
Best ForPrototyping to productionHigh-volume productionFull physical ownership

The Infercom Advantage

Transparent, fair, and built for European organizations.

EU Sovereignty by Default

EU-hosted models process all data in our Munich datacenter — full GDPR compliance, no US CLOUD Act exposure. Choose Global Catalog models only when you explicitly need them.

Transparent Token Pricing

Clear per-token pricing in EUR with no hidden fees. What you see is what you pay. No surprise charges for data transfer or API calls.

No Performance Throttling

Every request runs at full inference speed regardless of your plan — we never reduce token throughput or deprioritize pay-as-you-go users. Rate limits cap request frequency, not performance.

Clear Upgrade Path

Start with pay-as-you-go, scale to dedicated capacity, deploy on-premises. Move between tiers as your needs evolve.

Frequently Asked Questions

Ready to Build the Future of AI in Europe?

Join forward-thinking organizations deploying sovereign AI with world-class performance