How Do You Want Your Inference?
Start with our inference service and scale to dedicated capacity or on-premises as your needs grow.
All options include a fully managed, OpenAI-compatible API. We handle model deployments, infrastructure, and updates — you just call the endpoint.
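As a sketch of what "just call the endpoint" looks like, here is a minimal request against an OpenAI-compatible chat endpoint. The base URL, API key, and model name below are placeholders, not actual account values:

```python
import json
import urllib.request

API_BASE = "https://api.example-provider.eu/v1"  # placeholder base URL (assumption)
API_KEY = "YOUR_API_KEY"                         # placeholder credential

def build_chat_request(prompt: str, model: str = "gpt-oss-120b") -> dict:
    # Standard OpenAI-compatible chat payload: a list of role/content messages.
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat(prompt: str, model: str = "gpt-oss-120b") -> str:
    # POST the payload to /chat/completions and return the model's reply text.
    req = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(build_chat_request(prompt, model)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the API is OpenAI-compatible, any OpenAI SDK pointed at the same base URL should work equivalently; only the payload shape shown above is assumed by the API contract.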
How Pricing Works
AI inference is priced per token. A token is roughly 4 characters in English. You pay for what you use — no minimums, no commitments.
What is a token?
Tokens are the basic units LLMs process. In English, 1 token ≈ 4 characters or ¾ of a word, so 1,000 words ≈ 1,300 tokens. Other languages often need more tokens for the same amount of text.
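The rules of thumb above translate into a quick pre-flight estimator. This is a rough heuristic only; the model's actual tokenizer decides the real count:

```python
def estimate_tokens_from_chars(text: str) -> int:
    # ~4 characters per token for English text.
    return max(1, round(len(text) / 4))

def estimate_tokens_from_words(word_count: int) -> int:
    # ~1.3 tokens per word (1,000 words ≈ 1,300 tokens).
    return round(word_count * 1.3)

print(estimate_tokens_from_chars("x" * 400))  # → 100
print(estimate_tokens_from_words(1000))       # → 1300
```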
Input vs Output
You pay separately for input (your prompt) and output (the model's response). Output tokens typically cost more because they require more computation to generate.
Choosing a model
Larger models (like DeepSeek V3.1) are more capable but cost more. Smaller models (like gpt-oss-120b) are fast and cheap for simpler tasks. EU-hosted models guarantee data sovereignty.
Access Tiers
Pay-as-you-go
Production-ready rate limits. Per-token pricing in EUR.
View rate limits →
EU Sovereign Models
Full GDPR compliance, no US CLOUD Act exposure
| Model | Input/1M | Output/1M | Context |
|---|---|---|---|
| gpt-oss-120b | €0.22 | €0.59 | 128K |
| MiniMax-M2.5 | €0.30 | €1.20 | 164K |
| DeepSeek-V3.1 | €3.00 | €4.50 | 128K |
Prices in EUR, excl. VAT. EU-hosted models include full data sovereignty.
Global Model Catalog
Additional models via global infrastructure
| Model | Input/1M | Output/1M | Region |
|---|---|---|---|
| Llama 3.1 8B | €0.10 | €0.20 | US |
| Qwen3 32B | €0.40 | €0.80 | JP |
| Qwen3 235B | €0.40 | €0.80 | JP |
| Llama 3.3 70B | €0.60 | €1.20 | US |
| DeepSeek R1 Distill 70B | €0.70 | €1.40 | JP |
| Llama 4 Maverick | €0.63 | €1.80 | JP |
| DeepSeek V3 | €3.00 | €4.50 | US |
| DeepSeek V3.1 Terminus | €3.00 | €4.50 | US |
| DeepSeek V3.2 | €3.00 | €4.50 | US |
| DeepSeek R1 | €5.00 | €7.00 | US |
Prices in EUR, excl. VAT. Requests processed on global infrastructure outside the EU.
Pricing Calculator
Worked example: a request with ≈250 input tokens and ≈125 output tokens on gpt-oss-120b (€0.22 input / €0.59 output per 1M tokens) costs ≈€0.000129. A €5 credit covers about 38,834 such requests.
Estimate only. Token counts are approximate (~4 characters per token for English). Actual tokens vary by language, content type, and model tokenizer. For precise costs, check your usage in the cloud portal after making API calls.
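The calculator's arithmetic is simple enough to reproduce yourself. This sketch uses the gpt-oss-120b prices from the table above (€0.22 in / €0.59 out per 1M tokens) and a 250-input / 125-output token request:

```python
def request_cost_eur(input_tokens: int, output_tokens: int,
                     input_per_m: float, output_per_m: float) -> float:
    # Input and output tokens are billed separately, each per million tokens.
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000

cost = request_cost_eur(250, 125, 0.22, 0.59)
print(cost)           # ≈ 0.00012875, i.e. about €0.000129 per request
print(int(5 / cost))  # → 38834 requests on a €5 credit
```

Swap in any input/output prices from the tables above to compare models; remember these are estimates, since real token counts depend on the tokenizer.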
Compare Features Across Tiers
All tiers include EU sovereignty by default. Scale up as your needs grow.
| Feature | Inference Service | Dedicated | On-Premises |
|---|---|---|---|
| EU-Hosted by Default | ✓ | ✓ | Your location |
| Pricing Model | Pay-per-token | Reserved capacity | Custom licensing |
| Rate Limits | Per plan | Hardware only | Unlimited |
| Model Catalog | Standard models | Standard + Custom | Any model |
| Custom Model Hosting | ✗ | ✓ | ✓ |
| Support | Docs & Community to Priority | Priority | Dedicated 24/7 |
| SLA Guarantee | Best effort to Custom | Custom | Custom |
| Air-Gapped Deployment | ✗ | ✗ | ✓ |
| Data Residency Control | EU default | EU guaranteed | Your choice |
| Best For | Prototyping to production | High-volume production | Full physical ownership |
Standard models: our curated selection of production-ready models. Custom: bring your own fine-tuned models based on supported architectures.
Data residency: EU default means EU-hosted models keep data in the EU while Global Catalog models are processed outside the EU; EU guaranteed means all processing contractually stays in the EU; your choice means you control where the hardware is located.
The Infercom Advantage
Transparent, fair, and built for European organizations.
EU Sovereignty by Default
EU-hosted models process all data in our Munich datacenter — full GDPR compliance, no US CLOUD Act exposure. Choose Global Catalog models only when you explicitly need them.
Transparent Token Pricing
Clear per-token pricing in EUR with no hidden fees. What you see is what you pay. No surprise charges for data transfer or API calls.
No Performance Throttling
Every request runs at full inference speed regardless of your plan — we never reduce token throughput or deprioritize pay-as-you-go users. Rate limits cap request frequency, not performance.
Clear Upgrade Path
Start with pay-as-you-go, scale to dedicated capacity, deploy on-premises. Move between tiers as your needs evolve.