From developer-friendly pay-as-you-go to dedicated enterprise capacity, all with EU sovereignty by default.
Start with our inference service and scale to dedicated capacity or on-premises as your needs grow.
All options include a fully managed, OpenAI-compatible API. We handle model deployments, infrastructure, and updates — you just call the endpoint.
Pay per reserved rack
Your own reserved SambaNova racks, fully managed by us. Hosted in our EU datacenters with the same operational simplicity as the inference service, but with guaranteed capacity and priority support.
Own the hardware and software stack
Deploy the complete inference stack in your own datacenter. Air-cooled, no liquid cooling required. Start with a single rack and scale as needed, with dedicated 24/7 support.
All tiers include EU sovereignty by default. Scale up as your needs grow.
| Feature | Inference Service | Dedicated | On-Premises |
|---|---|---|---|
| EU-Hosted by Default | Your Location | ||
| Pricing Model | Pay-per-token | Reserved capacity | Custom licensing |
| Rate Limits | Per plan | Hardware only | Unlimited |
| Model Catalog | Standard models | Standard + Custom | Any model |
| Custom Model Hosting | |||
| Support | Docs & Community to Priority | Priority | Dedicated 24/7 |
| SLA Guarantee | Best effort to Custom | Custom | Custom |
| Air-Gapped Deployment | |||
| Data Residency Control | EU default | EU guaranteed | Your choice |
| Best For | Prototyping to production | High-volume production | Full physical ownership |
Transparent, fair, and built for European organizations.
EU data sovereignty isn't an add-on or premium feature. It's included by default in every tier, at no extra cost.
Clear per-token pricing in EUR with no hidden fees. What you see is what you pay. No surprise charges for data transfer or API calls.
Every request runs at full inference speed regardless of your plan — we never reduce token throughput or deprioritize pay-as-you-go users. Rate limits cap request frequency, not performance.
Start with pay-as-you-go, scale to dedicated capacity, deploy on-premises. Move between tiers as your needs evolve.