Advancing the frontier of AI research

Capacity, solved.

Luma Research Lab builds the cloud stack that turns fragmented, idle, or hard-to-use AI compute into one reliable training and inference substrate — across Trainium, GPUs, spot fleets, and older accelerators. Zero porting. No vendor lock-in.

Explore Research

Hardware we support

NVIDIA GPU · AWS Trainium · AMD ROCm · Google TPU · AWS Inferentia

Our Mission

Three pillars that define our work

We believe AI should be open, reproducible, and deployable anywhere — not locked to a single cloud or proprietary stack.

Any Silicon, Zero Porting

Our sealed compile layer runs any Hugging Face or custom model on AWS Trainium, NVIDIA GPUs, AMD ROCm, and Google TPU, with no code changes from the customer. The compiler runs in a Nitro Enclave; only an opaque training artifact leaves the enclave, which protects Luma's compiler IP.

Heterogeneous Fleet as One Cluster

Mix H100s, A100s, L4s, Trainium nodes, and TPUs into a single logical training job. Luma's orchestration layer abstracts the hardware differences — your training loop sees one cluster, regardless of what's underneath.

Spot & Idle Compute, On-Demand Reliability

Automated checkpointing and failover let reclaimed spot nodes swap out while the job continues — turning volatile spot fleets and idle prior-generation accelerators into reliable, cost-effective training capacity.

How It Works

What we're building

A cloud stack that solves GPU scarcity — by making every accelerator, spot instance, and idle chip usable for AI training and inference.

Sealed Compile

Any Model on AWS Trainium

Any Hugging Face or customer model runs on AWS Trainium with zero porting work. Luma's compiler runs inside a Nitro Enclave — customers receive an opaque training artifact (NEFF) they can execute but cannot inspect, keeping compiler IP fully protected.

Explore infrastructure →

Heterogeneous Fleet

Mixed Hardware, One Training Job

Luma binds a mixture of accelerators — H100s, A100s, L4s, Trainium nodes, TPUs — across the internet into one logical training cluster. No customer-side complexity. One job spec, whatever hardware is available.
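To make "one job spec, whatever hardware is available" concrete, here is a minimal sketch of one common way to balance a mixed fleet: splitting a global batch in proportion to each worker's throughput. It is an illustration, not Luma's orchestration logic. The `Worker` type, the `split_batch` function, and the throughput ratios are all hypothetical placeholders, not measured figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    name: str
    # Hypothetical ratio of samples/sec relative to the slowest worker.
    relative_throughput: float


def split_batch(global_batch, workers):
    """Split a global batch across workers in proportion to throughput."""
    total = sum(w.relative_throughput for w in workers)
    shares = [global_batch * w.relative_throughput / total for w in workers]
    sizes = [int(s) for s in shares]
    # Give leftover samples to the workers with the largest fractional share,
    # so the per-worker sizes always add up to the global batch.
    remainder = global_batch - sum(sizes)
    order = sorted(range(len(workers)),
                   key=lambda i: shares[i] - sizes[i], reverse=True)
    for i in order[:remainder]:
        sizes[i] += 1
    return {w.name: n for w, n in zip(workers, sizes)}


# Placeholder fleet: names and ratios are made up for the example.
fleet = [
    Worker("h100-0", 4.0),
    Worker("a100-0", 2.0),
    Worker("l4-0", 1.0),
    Worker("trn1-0", 1.0),
]
plan = split_batch(64, fleet)
```

With these placeholder ratios, faster workers receive proportionally larger slices, so every worker finishes a step in roughly the same time and the slowest device does not stall the job.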

Explore infrastructure →

Spot & Idle

Spot Fleets with On-Demand Reliability

Automated checkpointing and failover migrate jobs off reclaimed spot nodes with no lost work. Prior-generation accelerators that would otherwise sit idle become usable again — dramatically expanding available AI compute supply at lower cost.
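The checkpoint-and-resume pattern described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not Luma's implementation: the `SpotReclaimed` exception simulates a reclaimed node, the JSON file stands in for real model and optimizer state, and all function names are hypothetical.

```python
import json
import os
import tempfile


class SpotReclaimed(Exception):
    """Raised in this sketch to simulate a spot node being reclaimed."""


def save_checkpoint(path, state):
    # Write to a temp file and rename it, so a reclaim in the middle of a
    # write never leaves a half-written checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def load_checkpoint(path):
    if not os.path.exists(path):
        return {"step": 0, "loss_sum": 0.0}
    with open(path) as f:
        return json.load(f)


def train(path, total_steps, checkpoint_every, reclaim_at=None):
    """Run (or resume) a toy training loop, checkpointing periodically."""
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if reclaim_at is not None and state["step"] == reclaim_at:
            raise SpotReclaimed(f"node reclaimed at step {state['step']}")
        state["loss_sum"] += 1.0 / (state["step"] + 1)  # stand-in for a real step
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state


def run_with_failover(path, total_steps, checkpoint_every, reclaim_at):
    try:
        return train(path, total_steps, checkpoint_every, reclaim_at)
    except SpotReclaimed:
        # A replacement node picks the job up from the last checkpoint.
        return train(path, total_steps, checkpoint_every, reclaim_at=None)


def demo():
    with tempfile.TemporaryDirectory() as d:
        resumed = run_with_failover(os.path.join(d, "a.json"), 10, 5, reclaim_at=7)
        clean = train(os.path.join(d, "b.json"), 10, 5)
    return resumed, clean
```

In this sketch the interrupted run ends in the same state as an uninterrupted one, but the steps taken after the last checkpoint are recomputed on resume. How often to checkpoint is therefore a trade-off between checkpoint overhead and the amount of work replayed after a reclaim.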

Explore infrastructure →

Fine-Tuning

Domain-Specific Models on Any Silicon

Once the infrastructure layer is in place, domain-specific fine-tuning runs on whatever compute is cheapest or most available. We deliver fine-tuned checkpoints tailored to specific industries — without the customer managing the hardware complexity.
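As an illustration of "whatever compute is cheapest or most available", the sketch below picks the lowest-priced pool that has enough free accelerators and breaks price ties in favor of the pool with more free capacity. The `Pool` type, the `pick_pool` function, and every price and capacity figure are hypothetical placeholders; the actual placement policy is not described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pool:
    name: str
    free_accelerators: int
    price_per_hour: float  # placeholder value, not a real price


def pick_pool(pools: List[Pool], needed: int) -> Optional[Pool]:
    """Return the cheapest pool with enough free accelerators, or None."""
    candidates = [p for p in pools if p.free_accelerators >= needed]
    if not candidates:
        return None
    # Cheapest first; on a price tie, prefer the pool with more headroom.
    return min(candidates, key=lambda p: (p.price_per_hour, -p.free_accelerators))


# Made-up pools for the example.
pools = [
    Pool("gpu-on-demand", free_accelerators=16, price_per_hour=4.0),
    Pool("gpu-spot", free_accelerators=4, price_per_hour=1.5),
    Pool("trainium-spot", free_accelerators=8, price_per_hour=1.5),
]
choice = pick_pool(pools, needed=8)
```

Here the job needs eight accelerators, so the smaller spot pool is skipped even though it is equally cheap, and the request falls through to the cheapest pool that can actually hold the job.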

View model coverage →

Infrastructure

Any model. Any accelerator.

Train, fine-tune, and serve models across every accelerator — without rewriting your code. Our compatibility layer handles the hardware differences so your research doesn't have to.

See infrastructure

Coverage table

Accelerator | Notes | Support
AWS Trainium (trn1) | All model families supported | ✓ Full
NVIDIA H100 / A100 | Native support, reference baseline | ✓ Full
AMD Instinct MI300X | Native ROCm, no adapter needed | ✓ Full
AWS Inferentia (inf2) | Inference & serving layer | ✓ Full

Model Support

Every model family. Every accelerator.

Our infrastructure layer has been validated across all major model architectures — from classic encoders to modern vision-language-action systems and mixture-of-experts.

Architecture | Validated model | Task
Decoder LLM | distilgpt2 | Text-to-text
Encoder | distilbert-base-uncased | Text encoder
Encoder-Decoder | t5-small | Text-to-text
Vision Transformer | ViT-Base/16 | Image classification
CNN | ResNet-18 | Image classification
Diffusion UNet | DDPM CIFAR-10 | Image generation
Speech-to-Text | Whisper Tiny | Speech-to-text
VLA | SmolVLA Base | Vision-language-action
VLM | SmolVLM-256M | Image-text-to-text
MoE | Switch-Base-8 | Text-to-text (MoE)

View full coverage table →

Stay in the loop

Early access & updates

Join the waitlist for early access to our models, infrastructure tools, and research previews. No spam — just meaningful updates.

We respect your privacy. Unsubscribe anytime.
