Infrastructure

The GPU Network
Behind the Speed

Distributed GPU pools, intelligent scheduling, fault-tolerant execution. Three cluster types for every scale of ML work.

Available Pools
Public Pool
Shared · multi-tenant
Managed Cluster
Dedicated · reserved
Private Cluster
On-prem · air-gapped
GPU Pool Types

Three execution modes.
One API.

Public Pool, Managed Cluster, and Private Cluster have fundamentally different architectures. Same API surface. Different guarantees.

Public Pool

Shared GPU network. Multi-tenant isolation.

Architecture
  • Jobs routed to any available GPU node in the network
  • Multi-tenant with strict workload isolation
  • Ephemeral compute — no persistent state between jobs
  • Node capacity reported to the kernel in real time
  • FIFO queue with configurable priority lanes
Data Flow

Your workspace is mounted read-only at job start. Outputs are written back. The node is wiped clean after job completion.

Best For

Experiments, prototyping, one-off training runs
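A minimal sketch of what a Public Pool submission could look like from client code. The `resontech` package, `Client` class, and every parameter name here are hypothetical, invented for illustration; only the flow itself (read-only workspace mount, outputs written back, node wiped) comes from the description above.

```python
# Hypothetical client sketch: the `resontech` package and all names in it
# are assumptions for illustration, not a documented API.
from resontech import Client

client = Client(api_key="...")  # credentials elided

job = client.submit(
    command="python train.py",
    pool="public",            # shared, multi-tenant pool
    gpus=1,
    workspace="./project",    # mounted read-only at job start
    outputs="./results",      # written back after the run
)
job.wait()                    # the node is wiped clean once the job completes
```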

| Feature | Public Pool | Managed Cluster | Private Cluster |
| --- | --- | --- | --- |
| Data stays on your infra | — | — | ✓ |
| Dedicated capacity | — | ✓ | ✓ |
| No queue contention | — | ✓ | ✓ |
| Custom GPU config | — | ✓ | ✓ |
| Air-gapped mode | — | — | ✓ |
Network Architecture

Decentralized.
Not just cloud-hosted.

GPU capacity is aggregated across a distributed network — better availability, broader hardware diversity, no single-vendor lock-in.

📡

Nodes Register

GPU suppliers install the ResonTech worker. It reports GPU specs, VRAM, network speed, and availability to the kernel.

🧠

Smart Matching

When a job is submitted, the kernel matches it to optimal nodes — by GPU type, locality, bandwidth, and current load.
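The kernel's actual heuristic isn't published; the toy scoring function below just makes the listed criteria concrete (GPU type, locality, bandwidth, current load), with weights chosen arbitrarily for illustration. The same bandwidth term is what topology-aware placement, described later, leans on.

```python
from dataclasses import dataclass

@dataclass
class Node:
    gpu_type: str
    region: str
    bandwidth_gbps: float
    load: float  # 0.0 idle .. 1.0 saturated

def score(node: Node, want_gpu: str, want_region: str) -> float:
    """Toy heuristic: reward GPU match, locality, and bandwidth;
    penalize current load. Weights are illustrative, not ResonTech's."""
    s = 4.0 * (node.gpu_type == want_gpu)       # GPU affinity
    s += 2.0 * (node.region == want_region)     # locality
    s += min(node.bandwidth_gbps / 100.0, 1.0)  # interconnect speed
    s -= node.load                              # current load
    return s

def match(nodes: list[Node], want_gpu: str, want_region: str, k: int) -> list[Node]:
    # Best fit first: slower, busier nodes are picked last.
    ranked = sorted(nodes, key=lambda n: score(n, want_gpu, want_region), reverse=True)
    return ranked[:k]
```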

🔀

Distributed Execution

Large jobs are split across multiple nodes. Data is sharded. Workers communicate via high-bandwidth interconnects.

🛡️

Self-Healing

Heartbeat monitoring detects node failure in seconds. Jobs reschedule automatically to healthy nodes, resuming from checkpoint.
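A bare-bones version of heartbeat failure detection, to show the mechanism. The interval and missed-beat threshold are made-up values; the copy above only says detection happens within seconds.

```python
import time

HEARTBEAT_INTERVAL = 5.0  # seconds between worker heartbeats (illustrative)
MISSED_LIMIT = 3          # declare a node dead after 3 missed beats

last_seen: dict[str, float] = {}  # node_id -> timestamp of last heartbeat

def on_heartbeat(node_id: str) -> None:
    last_seen[node_id] = time.monotonic()

def dead_nodes() -> list[str]:
    """Nodes whose heartbeat is overdue. Their jobs get rescheduled
    onto healthy nodes and resume from the last checkpoint."""
    deadline = HEARTBEAT_INTERVAL * MISSED_LIMIT
    now = time.monotonic()
    return [n for n, t in last_seen.items() if now - t > deadline]
```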

[Diagram: Your Client → ResonTech Kernel → GPU Node A / GPU Node B / GPU Node C]

Simplified job routing flow. Jobs are dispatched to best-fit nodes across the network.

Performance

Why we're faster.
The technical reasons.

throughput gain

Multi-node automatic distribution

Submit with --gpus 8 and we split your job across nodes automatically. No manual NCCL setup, no rank configuration.
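For a sense of what that saves, here is the kind of rank and NCCL bootstrapping that multi-node training normally requires, shown with PyTorch as one common stack. This is the boilerplate the claim above says you can skip, not ResonTech code.

```python
# Manual multi-node setup in PyTorch: every process must learn its rank,
# the world size, and the master address before training can start.
import os
import torch.distributed as dist

dist.init_process_group(
    backend="nccl",                            # GPU collective backend
    init_method=f"tcp://{os.environ['MASTER_ADDR']}:{os.environ['MASTER_PORT']}",
    rank=int(os.environ["RANK"]),              # this process's global rank
    world_size=int(os.environ["WORLD_SIZE"]),  # total processes across nodes
)
```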

< 15s
node selection time

Topology-aware scheduling

Multi-node jobs are placed on nodes with high-bandwidth interconnects — NVLink, InfiniBand. Slower nodes picked last.

data throughput

Data sharding at mount time

Your dataset is automatically sharded across worker nodes. No centralized bottleneck, no full-dataset transfer to any single node.
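A minimal sketch of one way shard assignment can work: contiguous index ranges per worker. Whether the platform shards by index range, file, or hash isn't specified; this is just the simplest instance of the idea.

```python
def shard_indices(num_samples: int, num_workers: int, rank: int) -> range:
    """Contiguous slice of dataset indices owned by one worker."""
    per_worker = num_samples // num_workers
    start = rank * per_worker
    # The last worker absorbs the remainder.
    end = num_samples if rank == num_workers - 1 else start + per_worker
    return range(start, end)

# 1,000 samples over 4 workers: worker 2 reads indices 500..749 only,
# so no single node ever has to pull the full dataset.
assert list(shard_indices(1000, 4, 2)) == list(range(500, 750))
```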

100%
recovery success

Checkpoint-aware recovery

Node failure triggers automatic rescheduling. The job resumes from its last checkpoint, not epoch 0; only the work since that checkpoint is repeated.
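On the training-script side, checkpoint-aware recovery only asks for the standard resume pattern, sketched here with PyTorch (the model, path, and epoch count are placeholders). The platform restarts the process on a healthy node; the script finds the checkpoint and continues.

```python
import os
import torch
import torch.nn as nn

CKPT = "outputs/checkpoint.pt"  # placeholder path inside the job workspace
model = nn.Linear(10, 2)        # stand-in model
opt = torch.optim.SGD(model.parameters(), lr=0.01)

start_epoch = 0
if os.path.exists(CKPT):        # a rescheduled job finds the prior checkpoint
    state = torch.load(CKPT)
    model.load_state_dict(state["model"])
    opt.load_state_dict(state["optimizer"])
    start_epoch = state["epoch"] + 1  # resume here, not at epoch 0

os.makedirs("outputs", exist_ok=True)
for epoch in range(start_epoch, 100):
    ...                         # one epoch of training
    torch.save({"model": model.state_dict(),
                "optimizer": opt.state_dict(),
                "epoch": epoch}, CKPT)
```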

Kernel Intelligence
GPU affinity routing

Jobs requesting specific GPU types are routed to matching nodes first

Topology-aware placement

Multi-node jobs prefer nodes with NVLink or InfiniBand interconnects

Preemption + recovery

Lower-priority jobs yield to higher-priority ones and resume from checkpoint once capacity frees up (see the sketch after this list)

Elastic scaling

Inference endpoints scale replicas up/down based on request throughput

Spot reclamation handling

Reclaimed spot nodes trigger automatic rescheduling, not failure

Cost-aware scheduling

Kernel can prefer cheaper nodes when latency is not the constraint
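The preemption and cost-aware items above combine naturally into a single placement decision. The policy below is a made-up illustration of that combination, not the kernel's actual algorithm.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Job:
    id: str
    priority: int            # higher wins
    latency_sensitive: bool

@dataclass
class Node:
    id: str
    hourly_cost: float
    running: Optional[Job] = None

def requeue_from_checkpoint(job: Job) -> None:
    # Stand-in: in the real system the job re-enters the queue
    # and later resumes from its last checkpoint.
    print(f"requeued {job.id}")

def place(job: Job, nodes: list[Node]) -> Node:
    """Illustrative policy: free nodes first (cheapest when latency
    doesn't matter), else preempt the lowest-priority running job."""
    free = [n for n in nodes if n.running is None]
    if free:
        if job.latency_sensitive:
            return free[0]
        return min(free, key=lambda n: n.hourly_cost)  # cost-aware
    victim = min(nodes, key=lambda n: n.running.priority)
    if victim.running.priority < job.priority:
        requeue_from_checkpoint(victim.running)        # preemption + recovery
        victim.running = None
        return victim
    raise RuntimeError("no capacity at or below this priority")
```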

Become a Supplier

Have idle GPUs?
Put them to work.

Join the ResonTech supplier network. Install the worker, and your nodes register into the distributed GPU pool and start running real ML workloads.

01

Register Your Hardware

Sign up as a GPU supplier. Provide hardware specs, location, and availability windows.

02

Install the Worker

One-line install of the ResonTech node worker. It registers your GPU into the network and handles job routing.

03

Configure Availability

Define GPU availability windows and resource allocation. The kernel registers your node capacity and factors it into job routing. A configuration sketch follows these steps.

04

Monitor Utilization

Jobs route to your hardware automatically. Full dashboard visibility into node utilization, job history, and resource metrics.
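As a sketch of what step 03 could look like in code: the package, class, and field names below are hypothetical, invented to make the idea of declarative availability windows concrete. They are not the documented supplier interface.

```python
# Hypothetical supplier-side configuration: all names are assumptions.
from resontech import SupplierNode

node = SupplierNode.current()
node.configure(
    availability=[
        {"days": "mon-fri", "hours": "22:00-08:00"},  # off-peak weekdays
        {"days": "sat-sun", "hours": "00:00-24:00"},  # all weekend
    ],
    gpus_offered=2,  # share 2 of this machine's GPUs with the pool
)
```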

Hardware Requirements

Minimum specs.
And what we recommend.

These apply whether you run jobs on the network or supply GPUs to it.

| Component | Minimum | Recommended |
| --- | --- | --- |
| GPU | GTX 1080 Ti | RTX 3090 / A100 |
| VRAM | 8 GB | 24 GB+ |
| RAM | 32 GB | 128 GB+ |
| Storage | 100 GB NVMe | 200 GB NVMe |
| Network | 100 Mbps | 10 Gbps (for multi-node) |
| OS | Ubuntu 20.04 | Ubuntu 22.04 |

Note: Requirements vary by job type. Inference serving requires less RAM than large training runs. Multi-node training requires NVLink or InfiniBand for best performance. Contact us for specific hardware validation.

Ready to run on the network?
Start in 60 seconds.

Start with the public pool — deploy instantly, no configuration required. Book a demo for managed or private cluster access.