Proximal Cloud · Hardware Platform

Unity

A 50× roadmap to population-scale inference.

Memory-centric. Heterogeneous. Open.

Platform improvement
Lower cost
More performance
Made in India
For population-scale inference
Built for population scale

One platform.
A 50× trajectory.

Unity is engineered to bend the cost-and-performance curve of enterprise AI — delivering more tokens, more parameters and more reach per rack, every year through 2030.

Platform improvement
50×
Compounding gains across silicon, memory and fabric by 2030.
Total cost
3×
Lower cost per token than today's single-vendor rack-scale stacks.
Performance
15×
More raw performance for the workloads enterprises actually run.
The enterprise question

Real decisions span all of your data.

Fast, private, on-prem. The answers that move a business don't live in one database — they live across every modality at once.

How do the latest tariffs or steel prices affect my quarterly results, my ability to deliver products, my revenue and my cost?
— paraphrasing Larry Ellison on the enterprise AI workload
Enterprise truth lives across four data modalities

Relational SQL

Transactions, financials, ERP.

Vector DB

Semantic search, embeddings.

Graph DB

Relationships, supply chain.

LLM Inference

Reasoning, synthesis.

Answering these questions requires running all four — together, securely, at scale. Today's infrastructure can't.
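A minimal sketch of what "running all four together" could look like in code: the three data stores are queried concurrently and the LLM synthesizes an answer from their combined evidence. Every function name here (`query_sql`, `query_vector`, `query_graph`, `synthesize`, `answer`) is a hypothetical stub for illustration, not a Unity or Compute.AI API.

```python
import asyncio


# Hypothetical stubs: each stands in for a real relational, vector or graph
# engine. They return placeholder strings rather than real query results.
async def query_sql(question: str) -> str:
    return f"sql rows for: {question}"


async def query_vector(question: str) -> str:
    return f"similar documents for: {question}"


async def query_graph(question: str) -> str:
    return f"related entities for: {question}"


async def synthesize(question: str, evidence: list[str]) -> str:
    # Stand-in for the LLM inference step that reasons over the evidence.
    return f"answer to {question!r} from {len(evidence)} sources"


async def answer(question: str) -> str:
    # Fan out to the three data modalities concurrently, then hand the
    # combined evidence to the fourth modality (LLM inference).
    evidence = await asyncio.gather(
        query_sql(question),
        query_vector(question),
        query_graph(question),
    )
    return await synthesize(question, list(evidence))


if __name__ == "__main__":
    print(asyncio.run(answer("How do steel prices affect quarterly results?")))
```

The point of the sketch is the shape of the workload: three retrievals that can overlap, followed by one synthesis step that depends on all of them.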
Today's leading approach

Rack-scale, single-vendor compute.

The current state of the art orchestrates three specialized rack types across separate fabrics — powerful, but heavy, costly and far from sovereign.

GPU Rack · PREFILL / ATTENTION
Vera Rubin NVL72
NVLink spine
72 high-bandwidth GPUs co-located for KV-cache-heavy work.
Inference Rack · DECODE / FFN
Rubin CPX
Direct chip-to-chip spine
Token-by-token decode at scale, fed by interim activations.
CPU Rack · ORCHESTRATION
Vera CPU
Spectrum-X Ethernet
Arm-based CPUs running the control plane and data plumbing.
Orchestrated by Nvidia Dynamo · KV-aware routing (ATTN ⇄ FFN)

Powerful — but multi-rack, GPU-heavy, single-vendor, and far from sovereign or affordable for the rest of the world.

The directional shift

From racks of silos
to memory-centric nodes.

CPU, GPU and xPU stop living in separate buildings — and start living next to memory, each specialized for the part of the workload it's best at.

Yesterday

Rack-scale silos

GPU RACK
XPU RACK
CPU RACK
3 racks, 3 fabrics, 1 vendor. Compute travels far to reach memory.
Tomorrow

Memory-centric integrated node

HBM · 128–512 GB
CPU · x86 · Agentic Compute
GPU · Tensor · LLM · Prefill
xPU · Custom · Decode
One-hop memory fabric · DDR · SRAM · HBF
One node. All compute. Memory at the center.
KV-aware routing happens inside the node — not across three buildings of fabric.
The Unity™ solution

Two views.
One node.

The same memory-centric node, seen two ways. As an enterprise workload — four data modalities running side-by-side. And as a runtime — every LLM execution function sitting one hop from HBM.

View 01 · Workload
Four modalities, one node.
Unity™ Node · Compute.AI
HBM · 128 – 512 GB · MEMORY AT THE CENTER
LLM Inference · GPU
Relational SQL · CPU
Compute.AI · xPU · x86 · 64–128 cores · orchestrator
Vector DB · CPU / GPU
Graph DB · CPU
4× 400G / 800G ETHERNET · UEC (↕ 800G per uplink)
View 02 · Execution
Every function, one hop.
Unity™ Node · One-hop runtime
HBM · 128 – 512 GB · MEMORY AT THE CENTER
▴ One hop ▴
KV-cache · Attention
LLM · Tensor
Agents · x86
Prefill · GPU
Decode · xPU
SQL · CPU
4× 400G / 800G ETHERNET · UEC (↕ 800G per uplink)

Memory at the center

Every workload, every function reaches HBM in one hop — no rack-crossing penalties, no fabric tax.

Function-aware silicon

CPU runs agents and SQL. GPU runs LLM and prefill. xPU runs custom decode. Each function on the right engine.

Open, heterogeneous compute

x86 + GPU + xPU coexist on a shared memory plane. No single-vendor lock-in.

Standard 2U, standard Ethernet

4× 400G→800G uplinks over the Ultra Ethernet Consortium standard. Drops into any datacenter.
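The function-aware split described above can be sketched as a simple lookup: each workload function is assigned to the engine the text names for it. The mapping follows the copy (CPU for agents and SQL, GPU for LLM and prefill, xPU for decode); the names `ENGINE_FOR_FUNCTION`, `route` and `plan` are illustrative assumptions, not a published Unity interface.

```python
from collections import defaultdict

# Function-to-engine mapping as stated in the text. The lowercase keys are
# illustrative labels chosen for this sketch.
ENGINE_FOR_FUNCTION = {
    "agents": "CPU",
    "sql": "CPU",
    "llm": "GPU",
    "prefill": "GPU",
    "decode": "xPU",
}


def route(function: str) -> str:
    """Return the engine assigned to a function, or raise if it is unknown."""
    try:
        return ENGINE_FOR_FUNCTION[function.lower()]
    except KeyError:
        raise ValueError(f"no engine assigned for function {function!r}") from None


def plan(functions: list[str]) -> dict[str, list[str]]:
    """Group a request's functions by engine, preserving their order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for fn in functions:
        grouped[route(fn)].append(fn)
    return dict(grouped)


if __name__ == "__main__":
    print(plan(["sql", "prefill", "decode", "agents"]))
    # prints {'CPU': ['sql', 'agents'], 'GPU': ['prefill'], 'xPU': ['decode']}
```

A real scheduler would also weigh load and memory residency; this sketch only captures the static assignment the page describes.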

Meet the appliance

Compute.AI.
The Unity™ node, in a box.

A standard rack appliance with everything the platform needs — CPU, GPU, xPU, HBM and HBF — built around a one-hop memory plane.

View 01 · Front
Signature red status line. Hot-swap cooling intakes. Tool-free rails.
View 02 · 3/4 perspective
Standard 2U / 3U form factor. Drops into any 19" rack.
View 03 · Top
Full-width airflow. Optimized for high-density rack deployment.
Form factor · 2U / 3U · Standard 19" rack
Power · 1 kW · per node, nominal
Cooling · Air · Front-to-back
Network · 4× 800G · UEC Ethernet
Platform roadmap — 50×

Three generations.
Costs cut by 3×. Performance up by 15×.

Result: a platform 50× better than today — with 10× more capacity. Memory falls in cost, the bandwidth fabric grows, and every layer of the node steps up together.

3× lower cost × 15× more performance ≈ 50× platform improvement @ 10× more capacity
GEN 1 · 2026
128 GB HBM · 24 x86 · 220 Tensor · 400 GbE
MEMORY
HBM · 128 GB · 1 TB/s
DDR5 · 512 GB · 64 GB/s/ch
Memory cost · $400K

GEN 2 · 2028
256 GB HBM · 64 x86 · 512 Tensor · 800 GbE
MEMORY
HBM · 256 GB · 2 TB/s
DDR6 · 512 GB · 96 GB/s/ch
HBF-1 · 1 TB · 128 GB/s
Memory cost · $300K

GEN 3 · 2030
512 GB HBM · 96 x86 · 1024 Tensor · 1600 GbE
MEMORY
HBM · 512 GB · 4 TB/s
DDR6 · 1 TB · 128 GB/s/ch
HBF-2 · 4 TB · 256 GB/s/card
Memory cost · $200K
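The roadmap arithmetic can be checked with a short sketch. It multiplies the two headline factors (3× cost, 15× performance), computes Gen 1 to Gen 3 ratios from the roadmap figures, and derives the yearly compounding rate a 50× gain would imply. Treating 2026 to 2030 as a four-year span is an assumption made here for illustration; the dictionary keys and function names are likewise not from the source.

```python
# Roadmap figures for Gen 1 (2026) and Gen 3 (2030) as listed on the page.
GEN1 = {"hbm_gb": 128, "hbm_tb_per_s": 1, "x86": 24, "tensor": 220,
        "gbe": 400, "memory_cost_usd": 400_000}
GEN3 = {"hbm_gb": 512, "hbm_tb_per_s": 4, "x86": 96, "tensor": 1024,
        "gbe": 1600, "memory_cost_usd": 200_000}


def headline_multiple(cost_factor: float, perf_factor: float) -> float:
    """Combined improvement when cost falls and performance rises."""
    return cost_factor * perf_factor


def ratios(old: dict, new: dict) -> dict:
    """Per-metric ratio of the newer generation to the older one."""
    return {key: new[key] / old[key] for key in old}


def annualized(multiple: float, years: int) -> float:
    """Constant yearly factor that compounds to `multiple` over `years`."""
    return multiple ** (1 / years)


if __name__ == "__main__":
    print(headline_multiple(3, 15))   # 45, which the page rounds to 50x
    print(ratios(GEN1, GEN3))         # HBM capacity and bandwidth both 4.0
    # ASSUMPTION: the 50x gain is spread evenly over 2026-2030 (4 years).
    print(round(annualized(50, 4), 2))
```

Note that the product of the two headline factors is 45, so the 50× figure is best read as a rounded target rather than an exact product.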

UNITY™ HARDWARE PLATFORM  ·  50× ROADMAP

Explore in 3D

Pull the appliance apart.

Front bezel: Red status line · vents
x86 CPU: Agentic Compute · 24–128 cores
GPU: LLM · Tensor · Prefill
xPU: Custom · Decode
HBM memory: 128 – 512 GB · at the center
HBF cards: 4 / 8 / 16 configurable
800G UEC ports: 4× uplinks · 3.2 Tbit / node
Inside the chassis

One chassis.
Three engines. N cards.

Pop the lid and the same one-hop architecture is laid out in silicon: an x86 CPU for agentic compute, a GPU for tensor work, an xPU for decode, and a stack of HBF cards sitting next to high-bandwidth memory.

01
x86 CPU
Agentic Compute

24 – 128 cores. Runs the orchestration, agents and control plane.

02
GPU
LLM · Tensor · Prefill

High-bandwidth tensor engine for prefill, attention and LLM inference.

03
xPU + HBF cards
Custom · Decode

Configurable stack of accelerator + High-Bandwidth Flash cards, sitting directly on the memory plane.

4 HBF cards · Entry: for compact inference workloads.
8 HBF cards · Standard: most enterprise deployments.
16 HBF cards · Max: large-model, long-context inference.
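To make the three card configurations concrete, the sketch below estimates the flash capacity each one adds. It assumes the HBF capacities quoted in the roadmap (1 TB for HBF-1, 4 TB for HBF-2) are per-card figures; the page does not state this explicitly for HBF-1, so treat the results as illustrative. The names `HBF_TB_PER_CARD`, `CARD_COUNTS` and `hbf_capacity_tb` are hypothetical.

```python
# ASSUMPTION: both roadmap capacities are per card. The page only marks
# the HBF-2 bandwidth as "per card", so this is a reading, not a spec.
HBF_TB_PER_CARD = {"HBF-1": 1, "HBF-2": 4}

# Card counts for the three configurations named on the page.
CARD_COUNTS = {"Entry": 4, "Standard": 8, "Max": 16}


def hbf_capacity_tb(tier: str, generation: str) -> int:
    """Total HBF capacity in TB for a configuration and card generation."""
    return CARD_COUNTS[tier] * HBF_TB_PER_CARD[generation]


if __name__ == "__main__":
    for tier in CARD_COUNTS:
        for generation in HBF_TB_PER_CARD:
            print(tier, generation, hbf_capacity_tb(tier, generation), "TB")
```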
The platform, exploded

Every piece, in one frame.

A blowup of every component in the Compute.AI appliance — the silicon, the memory, the fabric and the network — stacked the way they actually sit.

Front bezel
Compute.AI · status line
Layer 01

Bezel & chassis frame

2U / 3U standard form, hot-swap intakes, tool-free rails.

x86 CPU board
Agentic compute · 24 – 128 cores
Layer 02

CPU · Agents & orchestration

The agentic compute plane — orchestrators, control logic, SQL and graph workloads.

GPU module
LLM · Tensor · Prefill
Layer 03

GPU · LLM & prefill

Tensor cores for prefill, attention, vector and LLM inference workloads.

xPU module
Custom · Decode
Layer 04

xPU · Decode engine

Custom silicon for token-by-token decode — the cheapest, most power-efficient path to high throughput.

HBM memory
128 – 512 GB · memory at the center
Layer 05

HBM · memory plane

One hop from every engine. Every workload, every function, lives next to HBM.

HBF card stack
4 / 8 / 16 cards · High-Bandwidth Flash
Layer 06

HBF · tiered memory

Configurable High-Bandwidth Flash cards extend the memory plane for long-context, large-model inference.

Network & backplane
4× 800G · UEC Ethernet
Layer 07

Backplane & uplinks

4× 400G→800G Ultra Ethernet uplinks. Standard, open, datacenter-ready.

Compute.AI · the Unity™ hardware platform, in seven layers
Scale · open interconnect

One rack.
Trillion-parameter scale.

An open 800G Ultra Ethernet (UEC) fabric stitches Compute.AI appliances into a single, coherent inference engine — multi-trillion-parameter models, sovereign and ready to deploy, in a single rack.

Compute.AI rack with Arista 800G UEC top-of-rack switch
800G UEC switch
3.2 Tbit / node
48 Compute.AI nodes

Open 800G UEC fabric.
One rack. Many models.

No proprietary spine. No vendor lock-in. Standard 800G Ultra Ethernet wires every appliance to every other — and to the top-of-rack switch — at line rate.

Ultra Ethernet Consortium · open standard
800G per port

Up to 3.2 Tbit/sec per node across 4 uplinks.

Open standard

Multi-vendor switches, optics and NICs — no single-vendor spine.

RDMA + lossless

Modern transport for KV-cache exchange and tensor parallelism.

Rack-scale models

3.3T parameters served from a single rack — no spine fabric needed.

48 nodes / rack
3.3T parameters
2.5M tokens / sec
48 kW total power
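The rack-level figures follow from the per-node numbers, and a short sketch makes the arithmetic explicit. The constants come from the page (48 nodes, 4 uplinks at 800G, 1 kW per node, 2.5M tokens/sec, 3.3T parameters). The bytes-per-parameter estimate is an added sanity check that assumes 1 GB = 10^9 bytes and that model weights live entirely in HBM; neither assumption is stated in the source.

```python
NODES_PER_RACK = 48
UPLINKS_PER_NODE = 4
UPLINK_GBPS = 800
NODE_POWER_KW = 1
RACK_TOKENS_PER_SEC = 2_500_000
RACK_PARAMETERS = 3.3e12


def node_bandwidth_tbps() -> float:
    """Network bandwidth per node in Tbit/s across all uplinks."""
    return UPLINKS_PER_NODE * UPLINK_GBPS / 1000


def rack_power_kw() -> int:
    """Nominal rack power: nodes times per-node power."""
    return NODES_PER_RACK * NODE_POWER_KW


def tokens_per_sec_per_node() -> float:
    """Average per-node throughput implied by the rack figure."""
    return RACK_TOKENS_PER_SEC / NODES_PER_RACK


def hbm_bytes_per_parameter(hbm_gb_per_node: int) -> float:
    """Aggregate rack HBM divided by the quoted parameter count.

    ASSUMPTION: 1 GB = 1e9 bytes and weights are held only in HBM.
    """
    return NODES_PER_RACK * hbm_gb_per_node * 1e9 / RACK_PARAMETERS


if __name__ == "__main__":
    print(node_bandwidth_tbps())             # 3.2
    print(rack_power_kw())                   # 48
    print(round(tokens_per_sec_per_node()))  # 52083
    print(round(hbm_bytes_per_parameter(128), 2))
    print(round(hbm_bytes_per_parameter(512), 2))
```

Under those assumptions, a rack of 128 GB nodes offers a little under 2 bytes of HBM per parameter, so the 3.3T figure would imply reduced-precision weights or spill into the DDR and HBF tiers.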
Unity™ · Hardware Platform

Four principles. One node.

Memory-centric

HBM at the core. One-hop memory fabric across the node.

Heterogeneous

x86 + GPU + xPU under one roof, each on the right workload.

Open

Standard 2U, standard 800G UEC Ethernet, no lock-in.

Sovereign

Made in India.

2.5M tokens / sec per rack
3.3T parameters per rack
50× platform improvement by 2030

The dot in dot.ai —
built in India, for population scale.

Memory-centric. Heterogeneous. Open. Sovereign. The hardware platform for the next era of enterprise inference.

Tech specs

Unity™ Node at a glance.

Form factor
Standard 2U / 3U rack node — drops into any datacenter
Memory
128 – 512 GB HBM · DDR · SRAM · HBF, memory at the center
CPU
x86 · 24–128 cores · agentic compute, orchestration & control plane
GPU
Tensor · LLM, prefill and attention workloads
xPU
Custom silicon · token-by-token decode engine
HBF cards
4 / 8 / 16 High-Bandwidth Flash cards — configurable per workload
Memory fabric
One-hop · every engine sits next to HBM
Networking
4× 400G / 800G UEC Ethernet · up to 3.2 Tbit / node
Workloads
LLM, KV-cache, agents, prefill, decode — all one hop from memory
Per rack
48 nodes · 48 kW · 3.3T parameters · 2.5M tokens/sec
Origin
Made in India · sovereign by design