H100, A100, L40S. Copy-paste configs. Launch in ~90 seconds.
All configs include NVMe storage, 400G networking, and full root access.
~2000 tok/s (Llama-3-70B, BF16)
~1200 tok/s (Llama-3-70B, BF16)
~25 it/s (SDXL training, BF16)
~600 tok/s (Llama-3-8B, BF16)
Working snippets for PyTorch, vLLM, SDXL, and more
# PyTorch training on H100 (train.py is your own script, mounted from the current directory)
docker run --rm --gpus all --ipc=host -v "$(pwd)":/workspace \
  nvcr.io/nvidia/pytorch:24.01-py3 \
  python train.py --fp16 --batch-size 32
# What you paste here runs the same in production.
"What you paste here runs the same in production."
Measured throughput and cost per million tokens
| Model | Precision | GPU | Throughput | $ per 1M tokens |
|---|---|---|---|---|
| Llama-3-8B | BF16 | H100 | 2800 tok/s | $1.25 |
| Llama-3-70B | BF16 | H100 x2 | 2000 tok/s | $3.49 |
| Mixtral-8x7B | BF16 | A100 | 1500 tok/s | $1.26 |
| SDXL (train) | FP16 | L40S | 25 it/s | $0.89 |
| Llama-3-8B | INT4 | A6000 | 1200 tok/s | $0.49 |
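The cost column is just the hourly rate divided by tokens generated per hour. A quick sanity check with placeholder numbers (neither figure is a quoted GPU Core price):

# $ per 1M tokens = hourly_rate / (tokens_per_second * 3600) * 1e6
# rate and tps below are hypothetical placeholders, not quoted prices
awk 'BEGIN { rate = 4.00; tps = 1000; printf "$%.2f per 1M tokens\n", rate / (tps * 3600) * 1e6 }'

This prints $1.11 per 1M tokens for a $4.00/hour instance sustaining 1000 tok/s.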
No hidden fees. Cancel anytime.
Pay as you go
1-3 month commitment
12 month commitment
Strategic locations with 400G networking and renewable energy
Reykjavik
Idaho
More regions coming soon. Need a specific location? Let us know.
Built for engineers who ship production workloads
Multi-Instance GPU support with proper K8s primitives. No weird workarounds. See the MIG example below.
High-bandwidth networking between nodes. Not marketing fluff.
NVMe + GPUDirect Storage ready. Move data at GPU speed.
Docker, K8s, or bare metal. Your choice. Sub-60s cold starts.
What you see is what you pay. No surprise bandwidth charges.
Real engineers who understand your workload. Not chatbots.
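As a sketch of what "proper K8s primitives" means in practice: with the NVIDIA GPU Operator in the mixed MIG strategy, a slice is requested like any other extended resource. The pod name, image tag, and MIG profile below are illustrative assumptions, not a GPU Core template.

# Request one MIG slice and list it from inside the pod
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: mig-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.3.2-base-ubuntu22.04
    command: ["nvidia-smi", "-L"]     # should print a single MIG device
    resources:
      limits:
        nvidia.com/mig-1g.10gb: 1     # one 1g.10gb slice of an H100 80GB
EOF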
GPU Core vs typical cloud GPU providers
| Feature | GPU Core | Others |
|---|---|---|
| Price clarity | ✓ | ✗ |
| 400G networking | ✓ | ✗ |
| MIG with K8s UX | ✓ | ✗ |
| No minimum term | ✓ | ✗ |
| Real engineer support | ✓ | ✗ |
We help you move from CoreWeave, RunPod, or other providers with zero downtime.
Yes. H100 HGX configurations include NVLink and NVSwitch for high-speed GPU-to-GPU communication. Perfect for large model training and multi-GPU inference.
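A quick way to confirm this from inside an instance: nvidia-smi prints the GPU interconnect matrix, where NV-prefixed entries indicate NVLink paths and SYS/PIX indicate PCIe or system paths (labels vary slightly by driver version).

# Print the GPU-to-GPU topology matrix; look for NV# between GPU pairs
nvidia-smi topo -m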