Advancing the frontier of AI research

Capacity,
solved.

Luma Research Lab builds the cloud stack that turns fragmented, idle, or hard-to-use AI compute into one reliable training and inference substrate — across Trainium, GPUs, spot fleets, and older accelerators. Zero porting. No vendor lock-in.

Explore Research
Hardware we support
NVIDIA GPU · AWS Trainium · AMD ROCm · Google TPU · AWS Inferentia

Three pillars that define our work

We believe AI should be open, reproducible, and deployable anywhere — not locked to a single cloud or proprietary stack.

Any Silicon, Zero Porting

Our sealed compile layer runs any Hugging Face or custom model on AWS Trainium, NVIDIA GPUs, AMD ROCm, and Google TPU, with no code changes from the customer. The compiler runs in a Nitro Enclave; only an opaque training artifact leaves the enclave, which protects Luma's compiler IP.

Heterogeneous Fleet as One Cluster

Mix H100s, A100s, L4s, Trainium nodes, and TPUs into a single logical training job. Luma's orchestration layer abstracts the hardware differences — your training loop sees one cluster, regardless of what's underneath.

Spot & Idle Compute, On-Demand Reliability

Automated checkpointing and failover let reclaimed spot nodes swap out while the job continues — turning volatile spot fleets and idle prior-generation accelerators into reliable, cost-effective training capacity.

What we're building

A cloud stack that solves GPU scarcity — by making every accelerator, spot instance, and idle chip usable for AI training and inference.

Sealed Compile

Any Model on AWS Trainium

Any Hugging Face or customer model runs on AWS Trainium with zero porting work. Luma's compiler runs inside a Nitro Enclave, and customers receive an opaque training artifact (NEFF, the Neuron Executable File Format) that they can execute but cannot inspect, which keeps compiler IP protected.

Explore infrastructure →
Heterogeneous Fleet

Mixed Hardware, One Training Job

Luma binds a mixture of accelerators — H100s, A100s, L4s, Trainium nodes, TPUs — across the internet into one logical training cluster. No customer-side complexity. One job spec, whatever hardware is available.
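
One way to picture "one job spec, whatever hardware is available" is a scheduler that splits a global batch across unequal workers. The sketch below is illustrative only and is not Luma's orchestration layer: the `Worker` type, the `split_batch` helper, and the relative throughput weights are all hypothetical, and a proportional split with largest-remainder rounding is just one plausible policy.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Worker:
    """A hypothetical accelerator in the fleet."""
    name: str
    relative_throughput: float  # assumed weight, not a measured benchmark


def split_batch(global_batch: int, workers: List[Worker]) -> Dict[str, int]:
    """Split a global batch across workers in proportion to throughput.

    Uses largest-remainder rounding so the per-worker shares always sum
    to exactly `global_batch`.
    """
    total = sum(w.relative_throughput for w in workers)
    if total <= 0:
        raise ValueError("fleet must have positive total throughput")

    # Ideal (fractional) share for each worker.
    exact = {w.name: global_batch * w.relative_throughput / total for w in workers}
    # Round every share down first.
    shares = {name: int(value) for name, value in exact.items()}
    # Hand leftover samples to the workers with the largest fractional parts.
    leftover = global_batch - sum(shares.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - shares[n], reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


if __name__ == "__main__":
    # Hypothetical fleet; the weights are placeholders for illustration.
    fleet = [
        Worker("h100", 4.0),
        Worker("a100", 2.0),
        Worker("l4", 1.0),
        Worker("trn1", 1.0),
    ]
    print(split_batch(64, fleet))
```

With this policy the training loop still sees a single global batch size, while faster accelerators receive proportionally more samples per step.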

Explore infrastructure →
Spot & Idle

Spot Fleets with On-Demand Reliability

Automated checkpointing and failover migrate jobs off reclaimed spot nodes with no lost work. Prior-generation accelerators that would otherwise sit idle become usable again — dramatically expanding available AI compute supply at lower cost.
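
The checkpoint-and-failover pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Luma's implementation: `SpotReclaimed`, `train`, and `run_with_failover` are hypothetical names, the "training step" is a stand-in that just accumulates a number, and the reclaim is simulated rather than read from a cloud provider's interruption notice.

```python
import json
import os


class SpotReclaimed(Exception):
    """Simulated signal that the spot node was reclaimed (hypothetical)."""


def save_checkpoint(path, state):
    # Write to a temp file and rename atomically, so a reclaim that lands
    # mid-write cannot leave a corrupt checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def load_checkpoint(path):
    # Resume from the last checkpoint if one exists, else start fresh.
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def train(path, total_steps, checkpoint_every, reclaim_at=None):
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if reclaim_at is not None and state["step"] == reclaim_at:
            raise SpotReclaimed(f"node reclaimed at step {state['step']}")
        state["total"] += state["step"]  # stand-in for one training step
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state


def run_with_failover(path, total_steps, checkpoint_every, reclaim_at):
    try:
        return train(path, total_steps, checkpoint_every, reclaim_at)
    except SpotReclaimed:
        # A replacement node picks the job up from the last checkpoint.
        return train(path, total_steps, checkpoint_every)
```

In this sketch the replacement run recomputes any steps taken after the last checkpoint, so the final state matches an uninterrupted run; the checkpoint interval sets how much recomputation a reclaim can cost.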

Explore infrastructure →
Fine-Tuning

Domain-Specific Models on Any Silicon

Once the infrastructure layer is in place, domain-specific fine-tuning runs on whatever compute is cheapest or most available. We deliver fine-tuned checkpoints tailored to specific industries — without the customer managing the hardware complexity.

View model coverage →

Any model.
Any accelerator.

Train, fine-tune, and serve models across every accelerator — without rewriting your code. Our compatibility layer handles the hardware differences so your research doesn't have to.

See infrastructure

Coverage table
Platform | Notes | Status
AWS Trainium (trn1) | All model families supported | ✓ Full
NVIDIA H100 / A100 | Native support, reference baseline | ✓ Full
AMD Instinct MI300X | Native ROCm, no adapter needed | ✓ Full
AWS Inferentia (inf2) | Inference & serving layer | ✓ Full

Research at scale

<8% max overhead vs. native silicon
4 accelerator platforms supported
10 model families running on Trainium
~57% compute cost reduction vs. H100 on-demand

Every model family.
Every accelerator.

Our infrastructure layer has been validated across ten model families, from classic encoders to modern vision-language-action systems and mixture-of-experts models.

Model family | Reference model | Task
Decoder LLM | distilgpt2 | Text-to-text
Encoder | distilbert-base-uncased | Text encoder
Encoder-Decoder | t5-small | Text-to-text
Vision Transformer | ViT-Base/16 | Image classification
CNN | ResNet-18 | Image classification
Diffusion UNet | DDPM CIFAR-10 | Image generation
Speech-to-Text | Whisper Tiny | Speech-to-text
VLA | SmolVLA Base | Vision-language-action
VLM | SmolVLM-256M | Image-text-to-text
MoE | Switch-Base-8 | Text-to-text (MoE)
View full coverage table →

Early access & updates

Join the waitlist for early access to our models, infrastructure tools, and research previews. No spam — just meaningful updates.

We respect your privacy. Unsubscribe anytime.
