AllenAI's OlmoEarth v1.1 cuts satellite-model compute up to 3× while preserving prior performance

5/19/2026, 6:57:29 PM

AllenAI released OlmoEarth v1.1 on May 19, 2026, an updated family of transformer-based remote sensing models that reduces end-to-end compute costs by as much as threefold while preserving OlmoEarth v1’s performance on the authors’ chosen benchmarks and partner tasks. That cut in compute can materially lower lifecycle costs for teams running satellite-model workflows. The original OlmoEarth v1, released in November 2025, has already been adopted by partners for use cases such as mangrove-change tracking, forest-loss driver classification, and country-scale crop-type mapping; v1.1 aims to offer similar accuracy at much lower compute expense.

Rather than focusing primarily on reducing model size, the team concentrated on shortening token sequence lengths, a dominant cost driver for transformer architectures. Because transformer compute scales roughly quadratically with token sequence length, even modest token reductions cut multiply-accumulate operations (MACs) and make pretraining and inference materially cheaper.
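To make the scaling argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a generic transformer encoder layer (four d×d projections, standard self-attention, and an MLP with a 4× hidden width); the function names and the example sizes (3,072 versus 1,024 tokens, width 768) are illustrative assumptions, not OlmoEarth's actual configuration, and the result is per-layer arithmetic rather than the end-to-end figure AllenAI reports.

```python
def attention_macs(n_tokens: int, d_model: int) -> int:
    """MACs for the attention scores (QK^T) and the weighted sum (AV).

    Both matmuls cost n^2 * d, so this term is quadratic in token count.
    """
    return 2 * n_tokens * n_tokens * d_model


def linear_macs(n_tokens: int, d_model: int, mlp_ratio: int = 4) -> int:
    """MACs for the Q/K/V/output projections plus a two-layer MLP.

    These terms are linear in token count.
    """
    projections = 4 * n_tokens * d_model * d_model
    mlp = 2 * n_tokens * d_model * (mlp_ratio * d_model)
    return projections + mlp


def transformer_layer_macs(n_tokens: int, d_model: int) -> int:
    """Approximate MACs for one generic encoder layer (assumed layout)."""
    return attention_macs(n_tokens, d_model) + linear_macs(n_tokens, d_model)


# Hypothetical sizes: a 3x shorter sequence at the same model width.
full = transformer_layer_macs(3072, 768)
reduced = transformer_layer_macs(1024, 768)
print(f"attention term shrinks {attention_macs(3072, 768) / attention_macs(1024, 768):.1f}x")
print(f"linear terms shrink    {linear_macs(3072, 768) / linear_macs(1024, 768):.1f}x")
print(f"whole layer shrinks    {full / reduced:.2f}x")
```

Under these assumptions the linear terms shrink in proportion to the token count while the attention term shrinks with its square, so the per-layer saving lands somewhere between the two depending on sequence length and model width.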

OlmoEarth’s token design for Sentinel-2 inputs underpins those savings. A Sentinel-2 tensor is represented as [H, W, T, D=12] (height, width, temporal steps, 12 channels). The pipeline splits inputs into spatial patches of size p×p and creates one token per timestep per resolution. With three resolutions (10 m, 20 m, 60 m), a two-timestep example yields six tokens per patch; in general the model sees H/p × W/p × T × 3 tokens.
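The token arithmetic above can be sketched in a few lines of Python. The function name, its parameters, and the 64×64 example tile with 8×8 patches are illustrative assumptions, not part of the OlmoEarth codebase; the sketch also assumes the tile divides evenly into patches.

```python
def sentinel2_token_count(
    height: int,
    width: int,
    timesteps: int,
    patch_size: int,
    resolution_groups: int = 3,
) -> int:
    """Count tokens for an [H, W, T, D] input split into p x p patches.

    One token is created per patch, per timestep, per resolution group
    (10 m, 20 m, 60 m by default), i.e. H/p * W/p * T * groups.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be divisible by patch_size")
    patches = (height // patch_size) * (width // patch_size)
    return patches * timesteps * resolution_groups


# A single patch with two timesteps yields six tokens, as in the text.
print(sentinel2_token_count(8, 8, timesteps=2, patch_size=8))

# Hypothetical 64 x 64 tile, 8 x 8 patches, two timesteps.
print(sentinel2_token_count(64, 64, timesteps=2, patch_size=8))
print(sentinel2_token_count(64, 64, timesteps=2, patch_size=8, resolution_groups=1))
```

Setting resolution_groups to 1 models a design with a single token per patch per timestep, which gives exactly one third of the default count.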

Many remote-sensing models use a per-resolution token to help learn cross-band relationships: Galileo and SatMAE both assign unique tokens per resolution, and SatMAE reports significantly better results when doing so. By contrast, the CROMA family collapses all bands into a single token, reducing token counts by roughly threefold and cutting compute across pretraining, fine-tuning, and inference.

However, the OlmoEarth team found that naively merging resolution-specific tokens into one harms accuracy: simply collapsing tokens produced about a 10 percentage-point drop on the m-eurosat kNN benchmark. The authors say they achieved the compute benefits of fewer tokens without degrading performance by modifying the pretraining regimen, though the published snippet does not detail the exact pretraining changes.

For builders and deployers, v1.1’s efficiency-versus-accuracy trade-offs have practical consequences: because compute dominates lifecycle costs (data export, preprocessing, inference, and post-processing), a model family that offers up to 3× lower compute enables broader partner support and cheaper self-hosted runs. AllenAI has published the models, a technical report, and code to help teams choose the model size and token design that best fit their budget and accuracy requirements.

Sources

Hugging Face Blog · 5/19/2026