Research — Luma Research Lab

Areas

Our focus areas

We focus on areas where the potential impact is high and the field is under-served: efficient training, hardware portability, multimodal systems, and alignment.

LLM

Efficient Language Models

Architecture research focused on reducing compute per token without sacrificing downstream performance. We study sparse attention, mixture-of-experts routing, and training curriculum design.

Multimodal

Vision-Language-Action Systems

Unified models that process images, text, and sensor data — deployable on edge hardware. We benchmark VLA models across standard robotics and embodied AI tasks.

Training

Hardware-Agnostic Training

Compiler-level abstractions that port models across accelerators with minimal throughput loss. Covers operator fusion, graph compilation, and mixed-precision strategies.

Safety

Interpretability & Alignment

Feature visualization, probing classifiers, and mechanistic interpretability applied to frontier models. We also develop adversarial evaluation suites for alignment benchmarking.

Publications

Papers & preprints

We publish all our work openly. Preprints appear on arXiv ahead of formal peer review.

Publications coming soon

Our first papers are in preparation. Join the waitlist to be notified when we publish.

Model Coverage

The same model, every accelerator

The table below shows which model families train out of the box on stock Trainium and which need our compatibility layer.

#	Family	Model	Type	Stock Trainium	With Luma
1	Decoder LLM	distilgpt2	Text-to-text	✕ NaN loss	✓ trains
2	Encoder	distilbert-base-uncased	Text encoder	✓ trains	✓ trains
3	Encoder-decoder	t5-small	Text-to-text	✓ trains	✓ trains
4	ViT	vit-base-patch16-224	Image classification	✓ trains	✓ trains
5	CNN	resnet-18	Image classification	✓ trains	✓ trains
6	Diffusion UNet	ddpm-cifar10-32	Image generation	✕ no converge	✓ trains
7	STT	whisper-tiny	Speech-to-text	✓ trains	✓ trains
8	VLA	smolvla_base	Vision-language-action	✕ compile error	✓ trains
9	VLM	SmolVLM-256M-Instruct	Image-text-to-text	✕ crash	✓ trains
10	MoE	switch-base-8	Text-to-text (MoE)	✕ hangs	✓ trains

Stock Trainium (the unmodified Neuron SDK) natively trains the encoder, encoder-decoder, image-classification, and speech models in the table. The causal LLM, diffusion UNet, VLA, VLM, and MoE models fail on stock Trainium and require our compatibility layer, with which all ten models train end-to-end.
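As a rough illustration of how status labels like those in the table could be assigned, the sketch below classifies a run from its recorded loss values. The function name `classify_run`, the 10% relative-drop threshold, and the label strings are all hypothetical choices made for this example. They are not the criteria used to produce the table.

```python
import math
from typing import Sequence


def classify_run(losses: Sequence[float], min_relative_drop: float = 0.1) -> str:
    """Label a training run from its per-step loss values.

    Hypothetical criteria for illustration only; the threshold and
    labels are assumptions, not the methodology behind the table.
    """
    if not losses:
        return "no_data"
    # Any NaN or infinite loss marks the run as numerically broken.
    if any(math.isnan(x) or math.isinf(x) for x in losses):
        return "nan_loss"
    # Compare the mean of the first and last ~10% of steps.
    window = max(1, len(losses) // 10)
    start = sum(losses[:window]) / window
    end = sum(losses[-window:]) / window
    # Require a minimum relative improvement to count as training.
    if start <= 0 or (start - end) / start < min_relative_drop:
        return "no_convergence"
    return "trains"


print(classify_run([2.0, 1.5, 1.0, 0.8]))
print(classify_run([2.0, float("nan")]))
print(classify_run([2.0, 2.0, 1.99]))
```

Compile errors, crashes, and hangs would have to be detected outside a check like this, for example from the process exit status or a timeout.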

Use case	Hardware	Advantage
LLM inference	MI300X	192 GB HBM3 — fits 70B+ models in a single GPU
Long-context models	MI300X	Memory capacity enables 128k+ context windows
Big-batch inference	MI300X	Lower $/token than H100 at equivalent throughput
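The memory claims in the table can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes 16-bit weights and a 16-bit KV cache, decimal gigabytes, and a batch size of one. The model shape (80 layers, 8 key-value heads, head dimension 128) is a hypothetical 70B-class configuration with grouped-query attention, chosen for illustration, and is not a model named on this page.

```python
GB = 1e9  # decimal gigabytes; an assumption about how the 192 GB figure is counted


def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed to hold the model weights alone."""
    return num_params * bytes_per_param / GB


def kv_cache_gb(num_layers: int, num_kv_heads: int, head_dim: int,
                seq_len: int, bytes_per_value: int = 2) -> float:
    """KV-cache size for one sequence; the factor of 2 covers keys and values."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value / GB


HBM_GB = 192  # MI300X capacity from the table above

weights_gb = weight_memory_gb(70e9)  # 70B parameters at 2 bytes each
# Hypothetical 70B-class shape, 128k-token context, single sequence.
cache_gb = kv_cache_gb(num_layers=80, num_kv_heads=8, head_dim=128,
                       seq_len=128 * 1024)
total_gb = weights_gb + cache_gb

print(f"weights:  {weights_gb:.1f} GB")
print(f"KV cache: {cache_gb:.1f} GB")
print(f"total:    {total_gb:.1f} GB of {HBM_GB} GB")
```

This estimate ignores activations, runtime buffers, and framework overhead, so the usable headroom in practice is smaller than the difference it reports.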

Benchmarks

Open evaluation suites

We maintain and contribute to open evaluation suites covering language understanding, reasoning, multimodal perception, and alignment.

Language

[Benchmark Name]

Evaluation suite for [describe what it measures]. Includes [N] tasks across [domains].

⏳ Coming soon

Multimodal

[Benchmark Name]

Evaluation suite for [describe what it measures]. Includes [N] tasks across [domains].

⏳ Coming soon

Alignment

[Benchmark Name]

Evaluation suite for [describe what it measures]. Includes [N] tasks across [domains].

⏳ Coming soon

What we study.
What we publish.