Infrastructure

Multi-cluster fabric.
HPC, cloud, on-prem.

Three architectural layers. Three deployment models. The same fabric whether you run a single cluster on the public pool or aggregate dozens across your own perimeter.

Deployment models
Public Pool
Shared · per-minute billing
Managed Cluster
Reserved · multi-site
Private Fabric
On-prem · air-gapped
Topology

One control plane. Any clusters underneath.

One job. Multiple clusters. Native distributed training inside each, coordinated across all of them. Inference deployed and served on the same fabric.

ResonTech Control Plane
Job dispatch · Cross-cluster coordination · Inference mesh · Artifact store · Telemetry
HPC Cluster
On-prem / partner institution
  • Slurm scheduler
  • 8 × H100 · 640 GB
  • InfiniBand interconnect
  • ◆ training · ◆ inference
Cloud Pool
Hyperscaler reservation
  • Kubernetes
  • 8 × H100 · 640 GB
  • Cloud interconnect
  • ◆ training · ◆ inference
On-Prem Rack
Your datacenter
  • Bare metal agent
  • 8 × A100 80 GB · 640 GB
  • NVLink
  • ◆ training · ◆ inference
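As a rough illustration of the topology above, the three example clusters can be modelled as plain records and their aggregate capacity summed. The `Cluster` class, its field names, and `fabric_totals` are assumptions made for this sketch, not part of the ResonTech SDK. The 80 GB per-GPU figure for the H100 nodes is inferred from the 640 GB totals listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record for one cluster under the control plane."""
    name: str
    scheduler: str
    gpu_model: str
    gpu_count: int
    gpu_mem_gb: int  # memory per GPU

    @property
    def total_mem_gb(self) -> int:
        return self.gpu_count * self.gpu_mem_gb


# The three example clusters from the topology diagram.
FABRIC = [
    Cluster("HPC Cluster", "Slurm", "H100", 8, 80),
    Cluster("Cloud Pool", "Kubernetes", "H100", 8, 80),
    Cluster("On-Prem Rack", "Bare metal agent", "A100", 8, 80),
]


def fabric_totals(clusters):
    """Return (total GPU count, total GPU memory in GB) across clusters."""
    gpus = sum(c.gpu_count for c in clusters)
    mem = sum(c.total_mem_gb for c in clusters)
    return gpus, mem
```

Each example cluster contributes 8 GPUs and 640 GB, so a job spanning all three would see 24 GPUs in this model.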
Deployment models

Public pool. Managed cluster. Private fabric.

Start on shared infrastructure, graduate to dedicated capacity, or run the entire fabric on hardware you own. Same training APIs, same inference APIs, same SDK at every tier.

Public Pool

Shared, multi-tenant GPU pool. Per-minute billing.

SHARED
Architecture
  • Pool aggregated from cloud and supplier nodes
  • Multi-tenant with strict workload isolation
  • Ephemeral compute — no persistent state between jobs
  • Fair-share scheduling with priority lanes
  • Single-cluster training and inference
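The page does not describe how fair-share scheduling with priority lanes is implemented. One common approach, shown here purely as a sketch, is to serve the highest-priority lane first and, within a lane, pick the tenant that has consumed the least so far. The `Job` fields, the lane numbering, and the tie-breaking rule are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Job:
    tenant: str
    lane: int          # assumption: lower number = higher-priority lane
    gpu_minutes: int   # estimated cost, charged to the tenant when dispatched


def pick_next(pending, usage):
    """Pick the job in the best lane whose tenant has used the least so far.

    Ties are broken by queue order, because min() returns the first minimum.
    """
    if not pending:
        return None
    return min(pending, key=lambda j: (j.lane, usage.get(j.tenant, 0)))


def drain(queue):
    """Return the dispatch order for a queue, updating usage as jobs start."""
    usage = {}
    order = []
    pending = list(queue)
    while pending:
        job = pick_next(pending, usage)
        pending.remove(job)
        usage[job.tenant] = usage.get(job.tenant, 0) + job.gpu_minutes
        order.append(job)
    return order
```

With two queued jobs from tenant `a` and one from tenant `b` in the same lane, the second `a` job waits until `b` has had a turn.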
Data flow

Your workspace is mounted read-only at job start. Outputs are written back to your bucket. The node is wiped after job completion.
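The lifecycle described above can be mimicked locally to make the guarantees concrete. This is a minimal sketch using ordinary directories: a private copy of the workspace stands in for the read-only mount, a local folder stands in for the bucket, and deleting the scratch directory stands in for the node wipe. The function name `run_ephemeral` and the `job(inputs, outputs)` callback signature are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_ephemeral(workspace: Path, bucket: Path, job) -> None:
    """Run job(inputs, outputs) in a scratch dir that is always removed."""
    scratch = Path(tempfile.mkdtemp(prefix="job-"))
    try:
        # The job sees a copy, so the caller's workspace is never modified.
        inputs = scratch / "workspace"
        shutil.copytree(workspace, inputs)

        outputs = scratch / "outputs"
        outputs.mkdir()
        job(inputs, outputs)

        # Write results back to the caller's storage before cleanup.
        bucket.mkdir(parents=True, exist_ok=True)
        shutil.copytree(outputs, bucket, dirs_exist_ok=True)
    finally:
        # Stand-in for the node wipe: nothing persists between jobs.
        shutil.rmtree(scratch, ignore_errors=True)
```

The `finally` block is the important part: cleanup happens even when the job raises, which is what "no persistent state between jobs" implies.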

Best for

Experiments, prototyping, single-cluster training and inference when you don't own GPUs yet.

Capability
Public Pool
Managed Cluster
Private Fabric
Single-cluster training
Multi-cluster training
Multi-region inference
Data sovereign / federated
Data stays on your infra
Dedicated capacity
Air-gapped mode
Supplier program

Have a cluster sitting idle? Plug it into the fabric.

HPC operators, datacenter owners, and teams with reserved cloud capacity can register clusters into the ResonTech supplier network. Jobs and inference replicas route to your nodes when there's a match; you earn from utilization without negotiating contracts.

This is not a consumer-GPU marketplace. We partner specifically with operators of professional clusters (8+ GPUs, fast intra-cluster network) — the kind of capacity that runs real distributed training and serves real production inference.

01 · Register the cluster

Sign up as an operator. Provide cluster specs, location, scheduler (Slurm / K8s / bare metal), and availability windows.
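To give a feel for the information an operator supplies, here is a hypothetical registration record with basic validation. The class, field names, and scheduler identifiers are illustrative assumptions; the actual sign-up form or API may look different. The 8-GPU floor comes from the supplier program description above.

```python
from dataclasses import dataclass, field

# Scheduler options named on this page; the string identifiers are assumptions.
SCHEDULERS = {"slurm", "k8s", "bare-metal"}
MIN_GPUS = 8  # "8+ GPUs" per the supplier program description


@dataclass
class ClusterRegistration:
    """Hypothetical shape of the details an operator provides."""
    name: str
    location: str
    scheduler: str
    gpu_model: str
    gpu_count: int
    # Availability windows as (start, end) strings, e.g. ("22:00", "06:00").
    availability: list = field(default_factory=list)

    def problems(self):
        """Return a list of human-readable issues; empty means acceptable."""
        issues = []
        if self.scheduler not in SCHEDULERS:
            issues.append(f"unsupported scheduler: {self.scheduler}")
        if self.gpu_count < MIN_GPUS:
            issues.append(f"need at least {MIN_GPUS} GPUs, got {self.gpu_count}")
        if not self.availability:
            issues.append("no availability windows given")
        return issues
```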

02 · Install the agent

One-line install of the ResonTech node agent. It registers your GPUs and joins them to the routing layer.

03 · Configure availability

Define availability windows and how much of the cluster to offer. The fabric factors your capacity into job routing alongside cloud nodes and other suppliers' nodes.
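A small sketch of how an availability window might be evaluated, assuming windows are daily (start, end) times and may wrap past midnight. The function names and the half-open interval convention are assumptions, not documented behaviour of the fabric.

```python
from datetime import time


def in_window(now: time, start: time, end: time) -> bool:
    """True if now falls in [start, end); end is exclusive.

    A window whose start is later than its end wraps past midnight.
    A window with start == end is treated as empty.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def is_available(now: time, windows) -> bool:
    """True if any configured window covers the given time of day."""
    return any(in_window(now, start, end) for start, end in windows)
```

For example, an overnight window from 22:00 to 06:00 covers 23:30 and 05:59 but not 06:00 or midday.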

04 · Monitor utilization

Real jobs land on your hardware automatically. Full dashboard visibility into utilization, job history, and resource metrics.

Hardware requirements

Minimum specs. And what we recommend.

These specs apply both to running jobs on the network and to supplying GPUs to it. Multi-node training prefers NVLink or InfiniBand within each cluster; cross-cluster coordination tolerates standard networking.

Component   Minimum        Recommended
GPU         GTX 1080 Ti    RTX 3090 / A100 / H100
VRAM        8 GB           24 GB+
RAM         32 GB          128 GB+
Storage     100 GB NVMe    200 GB NVMe
Network     100 Mbps       10 Gbps (for multi-node)
OS          Ubuntu 20.04   Ubuntu 22.04
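The numeric rows of the table can be turned into a quick self-check. This is a sketch only: the dictionary keys and the `tier` function are hypothetical, GPU model and OS are left out because they are not simple thresholds, and the 10 Gbps recommendation is applied unconditionally even though the table ties it to multi-node use.

```python
# Thresholds copied from the hardware requirements table.
MINIMUM = {"vram_gb": 8, "ram_gb": 32, "nvme_gb": 100, "network_mbps": 100}
RECOMMENDED = {"vram_gb": 24, "ram_gb": 128, "nvme_gb": 200, "network_mbps": 10_000}


def meets(node: dict, spec: dict) -> bool:
    """True if every value in spec is met or exceeded by the node."""
    return all(node.get(key, 0) >= value for key, value in spec.items())


def tier(node: dict) -> str:
    """Classify a node as 'recommended', 'minimum', or 'below minimum'."""
    if meets(node, RECOMMENDED):
        return "recommended"
    if meets(node, MINIMUM):
        return "minimum"
    return "below minimum"
```

A missing key counts as zero, so an incomplete description of a node is classified as below minimum.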

Three layers, three deployments, one fabric.

Start on the public pool today. Talk to engineering when you need multi-cluster or a Private Fabric inside your perimeter.
