Cohere enters the speech recognition market with the release of its open model Transcribe

4/26/2026, 6:09:39 PM

On March 26, 2026, the Canadian artificial intelligence lab Cohere announced Cohere Transcribe, its first in-house automatic speech recognition model. The company had previously focused on large language models for text processing and enterprise search, so the release marks an important step toward multimodal capabilities. With audio processing added, developers building on Cohere can offer more complete solutions to business users who need reliable transcription tools.

Technically, the new system, which carries the identifier cohere-transcribe-03-2026, is a specialized model that converts audio into text. The network takes raw audio waveform data as input and produces structured text as output. Multilingual support is built in, with fourteen languages available at launch: English, German, French, Italian, Spanish, Portuguese, Greek, Dutch, Polish, Vietnamese, Chinese, Arabic, Japanese, and Korean.
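As a minimal sketch, the fourteen launch languages can be kept in a lookup table so that a request is checked before any audio is sent. The two-letter keys are standard ISO 639-1 codes; that the API expects exactly these codes is an assumption, and `require_supported` is a hypothetical helper, not part of any Cohere library.

```python
# Launch languages listed in the article, keyed by ISO 639-1 code.
# ASSUMPTION: the API accepts these two-letter codes; check the official docs.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "de": "German",
    "fr": "French",
    "it": "Italian",
    "es": "Spanish",
    "pt": "Portuguese",
    "el": "Greek",
    "nl": "Dutch",
    "pl": "Polish",
    "vi": "Vietnamese",
    "zh": "Chinese",
    "ar": "Arabic",
    "ja": "Japanese",
    "ko": "Korean",
}


def require_supported(code: str) -> str:
    """Return the normalized language code, or raise if it is not in the launch list.

    Hypothetical helper for illustration only.
    """
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        known = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"Unsupported language {code!r}; expected one of: {known}")
    return normalized
```

Rejecting an unsupported code locally avoids spending a rate-limited request on a call that cannot succeed.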

The company's licensing approach drew particular attention from the technical community. Rather than keeping the technology behind proprietary restrictions, Cohere released the model under the permissive Apache 2.0 open-source license. This gives engineers and system architects worldwide the legal freedom to modify, integrate, and scale the model within their own corporate environments without taking on additional licensing risk. An openly licensed model also changes the economics of building complex products, lowering the barrier to entry for startups and large integrators alike.

Engineers access Cohere Transcribe through a dedicated Audio Transcriptions API. Integration follows the familiar pattern of the current Python client library and its ClientV2 object. Users upload an audio file in a standard format and specify the target language and model ID, and the API returns the transcribed text. According to the official documentation, the functionality can be tested for free, which suits tasks that need little initial setup. This trial mode is subject to request limits, which are described in a separate section of the platform's documentation.
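The request flow described above might look roughly like the sketch below. The article confirms only the ClientV2 object, the model ID, and the language parameter. The method path `client.audio.transcriptions.create(...)`, its keyword names, and the `.text` attribute on the response are assumptions made for illustration and should be checked against the official API reference. The `transcribe` helper is hypothetical.

```python
from pathlib import Path

# Model ID as named in the article.
MODEL_ID = "cohere-transcribe-03-2026"


def transcribe(client, audio_path, language="en", model=MODEL_ID):
    """Upload one audio file and return the transcribed text.

    ASSUMPTION: the method path `audio.transcriptions.create`, its keyword
    arguments (`model`, `language`, `file`), and the `.text` field on the
    response are guesses, not confirmed by the article.
    """
    # Open in binary mode so the raw audio bytes are sent unchanged.
    with Path(audio_path).open("rb") as audio_file:
        response = client.audio.transcriptions.create(
            model=model,
            language=language,
            file=audio_file,
        )
    return response.text
```

With the Cohere Python package installed, a client would be created with `cohere.ClientV2(api_key=...)` and passed in as `transcribe(co, "meeting.wav", language="en")`, where `meeting.wav` is a placeholder file name. Passing the client as an argument keeps the helper easy to exercise with a stub while the real method names are being verified.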

To scale to production workloads and lift the request limits, Cohere offers the Model Vault service. This deployment option runs inference in an isolated private cloud with minimal latency and relieves customers of administering server infrastructure themselves. The official announcement does not disclose throughput figures or detailed latency specifications, but the developers emphasize that the architecture is oriented toward high loads.

The release of Cohere Transcribe comes amid rapidly intensifying competition in enterprise AI and voice processing. In recent years, powerful open-source solutions such as OpenAI's Whisper have become the de facto standard for such tasks. Nevertheless, the Canadian lab deliberately targets the needs of large businesses, pairing an open license with a managed cloud service.

Sources

Cohere Changelog
