Powered by SambaNova

Revolutionary AI Infrastructure

Infercom is powered by SambaNova's dataflow architecture — purpose-built for AI inference, delivering unprecedented performance and efficiency.

Up to 10x

Faster Inference

Up to 5x

Energy Efficient

24TB

Memory per Rack

Dataflow vs. GPU Architecture

Why purpose-built dataflow beats general-purpose GPUs for AI inference

SambaNova

Dataflow Architecture

Purpose-built for AI

Purpose-built for AI workloads: entire computation graphs are mapped onto custom processing pipelines, minimizing data movement between operations.

  • Entire model resident in memory
  • Data flows through operations without intermediate writes
  • Operator fusion: hundreds of operations in single kernel
  • Software-defined hardware optimizes for each workload
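The contrast between the two execution styles above can be illustrated with a toy sketch. This is not SambaNova's compiler or runtime, just a generic Python illustration of why fusing operations avoids intermediate memory traffic:

```python
# Illustrative sketch only: contrasts kernel-by-kernel execution
# (each op writes a full intermediate buffer, as on a GPU) with a
# fused pipeline (each element flows through all ops in one pass,
# as in a dataflow architecture). Toy ops stand in for real kernels.

def scale(xs, k):          # "kernel" 1: writes an intermediate buffer
    return [x * k for x in xs]

def add_bias(xs, b):       # "kernel" 2: another memory round trip
    return [x + b for x in xs]

def relu(xs):              # "kernel" 3: yet another round trip
    return [max(x, 0.0) for x in xs]

def kernel_by_kernel(xs):
    # Three separate launches, two intermediate buffers in memory.
    return relu(add_bias(scale(xs, 2.0), -1.0))

def fused(xs):
    # One pass: each element moves through all three ops
    # without any intermediate writes.
    return [max(x * 2.0 - 1.0, 0.0) for x in xs]

data = [0.1, 0.5, -0.3]
assert kernel_by_kernel(data) == fused(data)
```

Both paths produce identical results; the difference is how many times data crosses the memory boundary, which is the bottleneck the bullet points above describe.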

Traditional GPU

General-purpose design

A general-purpose design that executes kernel by kernel, creating bottlenecks for AI inference workloads.

  • Kernel-by-kernel execution creates overhead
  • Excessive data movement between processor and memory
  • Memory bandwidth bottleneck limits performance
  • Underutilization of compute resources

Deep dive: How dataflow solves the AI inference crisis

SN40L Reconfigurable Dataflow Unit

Built on TSMC's 5nm process with 102 billion transistors, delivering 10.2 PetaFLOPS of compute capacity at BF16 precision.

SambaNova SN40L Server Blade — Dual RDU Chips

102B

Transistors

1,040

Cores

10.2

PetaFLOPS

TSMC 5nm

Process

Three-Tier Memory Architecture

520MB

SRAM

Ultra-fast on-chip cache

64GB

HBM

High-bandwidth memory

1.5TB

DDR

Massive off-package storage

System Configuration

16

Chips/Rack

24TB

Total Memory

10kW

Power

Air

Cooled
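The rack-level memory figure follows directly from the per-chip numbers in the three-tier table above. A back-of-the-envelope check (the per-chip values are taken from this page, not an independent source):

```python
# Back-of-the-envelope check of the rack figures quoted above,
# using the per-chip numbers from the three-tier memory section.
CHIPS_PER_RACK = 16

sram_mb_per_chip = 520   # on-chip SRAM (cache tier)
hbm_gb_per_chip = 64     # high-bandwidth memory tier
ddr_tb_per_chip = 1.5    # off-package DDR tier

# The headline "24TB Total Memory" figure is the DDR tier:
ddr_tb_per_rack = CHIPS_PER_RACK * ddr_tb_per_chip
print(ddr_tb_per_rack)   # 24.0
```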

World-Record Performance

Independent benchmarks by Artificial Analysis. Performance measured in tokens per second per user for real-world inference workloads.

MiniMax M2.5

NEW — EU Hosted
404 tokens/sec

High-performance multimodal model now hosted on Infercom's EU infrastructure. Independently measured at 400+ tokens/sec by Artificial Analysis.

EU Sovereign

DeepSeek-R1 671B

10x vs GPU
250 tokens/sec

The world's largest reasoning model at unprecedented speed. Up to 10x faster than GPU-based providers.

671B params

DeepSeek-V3.1

EU Hosted
273 tokens/sec

128K context, full function calling, and JSON mode — one of Europe's most capable sovereign LLMs.

EU Sovereign

gpt-oss-120b

Fastest EU Model
772 tokens/sec

OpenAI's open-source 120B parameter model. Exceptional throughput for high-volume sovereign workloads.

EU Sovereign

Sustainable AI Infrastructure

Up to 5x better energy efficiency than GPU-based inference

Lower Power Consumption

Average 10kW per rack versus multiple GPU racks consuming 40–50kW+ for equivalent workloads. Reduced chip count translates to dramatic power savings.

Smaller Footprint

Dramatically reduced physical space, simplified cooling, and lower total infrastructure costs.

Air-Cooled Design

No liquid cooling infrastructure required. Standard air cooling simplifies deployment, reduces maintenance complexity, and lowers operational overhead.

"Not all tokens are created equal. The real value lies not in measuring tokens generated, but in the quality of intelligence delivered per unit of energy consumed."

SambaNova — "Intelligence per Joule"

Advanced Model Capabilities

Massive Model Support

Run models up to 671B parameters (DeepSeek-R1) on a single rack. Support for Composition of Experts (CoE) systems up to 5 trillion parameters with 100+ expert models simultaneously.

671B params · 5T CoE · 100+ models

Long Context Windows

Handle context windows of 256,000+ tokens on a single-node deployment. Massive memory capacity enables document analysis, code generation, and reasoning tasks without truncation.

256K+ tokens · Single node · No truncation

Millisecond Model Switching

Multiple models resident in memory simultaneously with millisecond switching latency — orders of magnitude faster than GPU systems. Perfect for agentic AI and multi-model workflows.

ms switching · Multi-model · Agentic AI
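Why resident models switch in milliseconds can be sketched generically: when every model's weights are already in memory, "switching" is a lookup rather than a reload. This is a hypothetical illustration (the `ModelPool` class and model handles are invented for this sketch, not SambaNova's API):

```python
# Hypothetical sketch: models preloaded into memory make switching
# a dictionary lookup instead of a weights reload from storage.
import time

class ModelPool:
    def __init__(self, names):
        # Load once at startup; all models stay resident.
        # (Dummy dicts stand in for real model weights here.)
        self.models = {name: {"name": name} for name in names}

    def switch(self, name):
        # Resident switch: no weights move, just a reference change.
        return self.models[name]

pool = ModelPool(["deepseek-r1", "gpt-oss-120b", "minimax-m2.5"])
t0 = time.perf_counter()
model = pool.switch("gpt-oss-120b")
elapsed_ms = (time.perf_counter() - t0) * 1000
assert model["name"] == "gpt-oss-120b"
```

On GPU systems the equivalent step typically means paging tens or hundreds of gigabytes of weights into device memory, which is where the orders-of-magnitude latency gap comes from.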

European Infrastructure

Hosted in Equinix Munich 4 — Tier III+ certified, carrier-neutral datacenter

SambaNova rack infrastructure — air-cooled, 10kW per rack

Munich-Based Hosting

All data and processing remains within German borders under EU jurisdiction.

No US Jurisdiction

Protection from CLOUD Act, PATRIOT Act, and foreign intelligence access.

AI Act Ready

Infrastructure prepared for EU AI Act requirements and compliance.

Tier III+ Certified

99.982% uptime guarantee with redundant power and cooling systems.

Ready to build the future of AI in Europe?

Join forward-thinking organizations deploying sovereign AI with world-class performance