Compose your cluster.
Train your model.
Serve it.
Distributed training and inference, across any GPU pool. From one GPU to multiple clusters. No infrastructure to manage.
From notebook to multi-worker training in one call.
Compose your cluster
Pick GPUs from the shared pool, your reservation, or your own hardware. Mix RTX 3090 to H100 — a tuned cluster of mid-tier cards often matches a smaller cluster of flagships at the same hourly spend. Heterogeneous composition lets you actually choose.
Bucket as workspace
Per-account S3 bucket on signup. Browse folders, edit Python and configs in the browser, pin files side-by-side. boto3-native end-to-end.
Recovery without restart
Worker drops mid-run? The fabric evicts the bad node, reschedules onto a healthy one, and resumes from your last checkpoint. Not epoch zero.
Any framework, any model
PyTorch, TensorFlow, JAX, HuggingFace — bring your stack. YOLO11, Phi-3.5-mini LoRA, Llama-3, DenseNet on HAM10000, Whisper, SDXL LoRA, BGE — ready-to-run examples you can lift.

Train across GPUs around the world.
Submit one job. The platform coordinates training across every worker you picked — shared pool, dedicated cluster, on-prem rack, or a partner site half a continent away — and writes outputs directly to your S3 bucket. Write standard PyTorch, TensorFlow, JAX, or HuggingFace — no rewrites.
- PyTorch, TensorFlow, JAX, HuggingFace, MONAI, Ultralytics, Diffusers — no custom operators.
- Choose an aggregation algorithm: FedAvg, FedOpt, Scaffold, FedProx, DiLoCo.
- Workers self-shard the dataset via presigned URLs from your bucket.
- Auto-resume from the last checkpoint on node failure or preemption.
- Stream logs, metrics, and artifacts back to your S3 prefix in real time.
- Submit from a notebook or CI — the SDK is one HTTP call.

Serve from a mesh, not a single endpoint.
Push a checkpoint or HuggingFace repo. Get an OpenAI-compatible endpoint with scale-to-zero, health-aware load balancing, and routing across replicas in the regions you choose. Run vanilla HF models or bring a custom script — same deploy path.
- Drop-in replacement for OpenAI clients — point base_url at your endpoint.
- Scale to zero when idle; sub-10s warm boot when traffic returns.
- Multi-region mesh with automatic failover across replicas.
- Token streaming, request batching, and per-endpoint rate limits.
- Bring your own script for custom routing, A/B tests, canary rollouts.
- One short-lived API key per deployment — rotate from the dashboard.
Three deployment tiers.
Same SDK across every tier. Start free on the Public Pool, graduate to a reserved cluster when you need predictability, drop into your own perimeter when compliance demands it.
Shared GPU capacity contributed by suppliers. Start with one GPU on the same SDK an enterprise uses.
Dedicated GPU capacity on our infrastructure. Reserved for your team. Provisioned, monitored, and recovered by us.
Run the platform inside your perimeter. Connect your on-prem clusters and cloud accounts under one control plane.
Common questions.
They are infrastructure providers — they rent you GPUs from their own datacenters and regions. ResonTech is the layer above: a decentralized datacenter you compose yourself. Plug in capacity from any of those providers, plus your on-prem racks, your reservations, or partner clusters, and treat them as one pool. Submit a single job; the platform places workers across the mix, coordinates training, and serves the resulting model — with minimal cross-cluster overhead. We aren't competing with their infra; we sit on top of it.
Not a rewrite, but a thin adaptation — typically 50–100 lines. You define a model class, a dataloader, and the training step the way you normally would in PyTorch / TensorFlow / JAX, then expose them through a small Executor + Persistor scaffold so the platform can shard data, dispatch workers, and aggregate model deltas across clusters. We're shipping a Claude plugin that ports an existing training script to that scaffold automatically — paste your repo, get back a submit-ready job in minutes.
Public Pool bills per-minute on allocated GPU time, the same model as RunPod or Lambda — you pay for the minutes a worker is reserved to your job, with no monthly minimums and no pre-purchased credits. Managed Cluster is a flat monthly reservation for dedicated capacity. Private Fabric is per-cluster licensing on your own hardware. Inference endpoints bill per active replica-minute and can scale replicas down between traffic. Egress, storage, and NAT are itemized line-by-line — no hidden tail in the invoice.
The fabric detects the failure, evicts the bad node from the worker pool, and resumes from the last checkpoint automatically. You receive a notification but no manual intervention is required. For long multi-cluster runs this can save dozens of GPU-hours you would otherwise rerun from scratch.
You can host the bucket yourself (any S3-compatible store). Workers fetch only their assigned shard through short-lived presigned URLs; the control plane never proxies raw bytes. Managed Cluster lets you pin storage per region. Private Fabric runs the entire control plane inside your perimeter, with air-gapped mode for sensitive environments. Data-sovereign mode keeps data anchored across organizations for federated training and federated inference.
Install the Python SDK with one pip install, point it at your model or training script, and submit. First job typically runs within minutes on the public pool. No infrastructure provisioning, no cloud-account setup, no support ticket to request quota.
H100 SXM5, A100 80GB, and A40 nodes depending on availability and priority tier. Managed Cluster reserves specific GPU types — H100 NVLink, A100 PCIe, L40S — for your team. Private Fabric runs on whatever you bring (B100/B200, H200, A100, L40S, RTX-class, mixed pools all supported).
Across clusters the platform runs federated-style coordination — each cluster trains a local round (multiple SGD steps), then exchanges only model deltas with the central aggregator over gRPC + TLS. That replaces per-step all-reduce, which would die on WAN latency. The aggregator combines updates using a chosen algorithm — FedAvg by default, with FedOpt, FedProx, Scaffold, and DiLoCo selectable per job. Deltas are compressed and the sync interval auto-adjusts to the observed link bandwidth, so it tolerates anything from 10 Gbps cloud interconnect down to public internet. Final model quality typically lands within a few percent of centralized training; the wall-clock cost is the round-trip latency between sites.
Compose your first cluster.
Free on the Public Pool.
One GPU or a hundred. pip install resontech. No credit card to start.
