IBM Granite releases Apache‑2.0 Multilingual Embeddings with 32K context and leading sub‑100M retrieval

5/14/2026, 7:11:27 PM

IBM’s Granite team published Granite Embedding Multilingual R2 on May 14, 2026, releasing two Apache‑2.0 multilingual embedding models aimed at retrieval and code search workloads. The pair comprises granite-embedding-311m-multilingual-r2 (311M parameters, 768‑dim embeddings, Matryoshka dimension support) and granite-embedding-97m-multilingual-r2 (97M parameters, 384‑dim embeddings). On MTEB Multilingual Retrieval, the compact 97M model scores about 60, leading the sub‑100M class, while the 311M model scores 65.2 and ranks second among open models below 500M parameters. The release targets builders who need small, high‑quality embedders for multilingual retrieval.
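Matryoshka dimension support generally means an embedding can be shortened by keeping only its leading components and re-normalising the result. The following is a minimal sketch of that truncation step in plain Python, using a toy 8-dim vector in place of a real 768-dim output. The helper name truncate_embedding is hypothetical, and the article does not say which reduced sizes the 311M model is tuned for.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and L2-normalise them.

    Hypothetical helper for illustration; not part of the Granite release.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalised; return it unchanged.
        return head
    return [x / norm for x in head]


# Toy vector standing in for a full-size model output (assumed values).
full = [0.5, -0.5, 0.5, 0.5, 0.1, -0.1, 0.05, 0.0]
small = truncate_embedding(full, 4)
```

With sentence-transformers, the same effect is usually obtained by passing truncate_dim when constructing SentenceTransformer. Shorter vectors cut index size and search cost at some loss of retrieval quality, which the article does not quantify.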

The R2 models are built on ModernBERT and extend context handling to 32,768 tokens, a 64× increase over the R1 predecessors, and add cross‑lingual code retrieval capability. Training data mixes an IBM‑curated dataset, public data sources, and internally generated or synthetic examples. To support deployment on commodity hardware, Granite provides ONNX and OpenVINO weights for CPU‑optimized inference. The models work out of the box with sentence-transformers and transformers and are intended as drop‑in replacements for existing embedders in LangChain, LlamaIndex, Haystack, and Milvus pipelines: a one‑line model name change, with no API modifications, extra dependencies, or other code updates, gives access to coverage for 200+ languages.
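The "one-line model name change" can be pictured with a small retrieval sketch in which the model identifier is the only model-specific line. Everything named here is an assumption for illustration: the repo ID below (including the ibm-granite/ organisation prefix) is inferred from the model name and is not given in the article, and the helpers load_encoder, cosine and search are hypothetical.

```python
import math

# Assumed Hugging Face repo ID; the article gives only the bare model name.
MODEL_ID = "ibm-granite/granite-embedding-97m-multilingual-r2"


def load_encoder(model_id=MODEL_ID):
    """Return an encode function backed by sentence-transformers.

    Imported lazily so the rest of this sketch runs without the library.
    """
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_id).encode


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search(encode, query, docs):
    """Rank `docs` by similarity to `query`.

    `encode` is any callable mapping a list of strings to a list of vectors.
    """
    vecs = [[float(x) for x in v] for v in encode([query] + list(docs))]
    q, doc_vecs = vecs[0], vecs[1:]
    order = sorted(range(len(docs)), key=lambda i: cosine(q, doc_vecs[i]), reverse=True)
    return [docs[i] for i in order]
```

Calling search(load_encoder(), query, docs) would download the model on first use. Swapping in another embedder means changing only MODEL_ID, which is the sense in which the release describes the models as drop-in.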

Language and code breadth are central to the release. The encoder was pretrained on text from over 200 languages, with 52 languages receiving explicit retrieval‑pair and cross‑lingual tuning, including Arabic, Chinese, French, German, Hindi, Japanese, Korean, Russian, Spanish, Turkish and Vietnamese. Code retrieval training spans nine languages: Python, Go, Java, JavaScript, PHP, Ruby, SQL, C and C++. This combination of compact, high‑quality models, extended context windows, multi‑language tuning and CPU weights is aimed at multilingual RAG, cross‑lingual search and international code search use cases.

Granite R2 emphasizes deployment‑ready packaging and broad language coverage for builders: prepackaged CPU weights, compatibility with common transformer and retrieval toolchains, and explicit tuning across dozens of languages reduce integration friction for teams rolling out multilingual retrieval or code search at scale.

Sources

Hugging Face Blog · 5/14/2026
