Mistral launches 128B Mistral Medium 3.5 with 256k context and cloud Remote Agents for Vibe and Le Chat

5/5/2026, 11:29:29 AM

Mistral released Mistral Medium 3.5 in public preview with open weights under a modified MIT license, and added cloud-hosted remote agents for Vibe and a persistent Work Mode in Le Chat to support asynchronous, multi-step local-to-cloud workflows.

Mistral has published Mistral Medium 3.5, a 128‑billion‑parameter dense model available in public preview with open weights under a modified MIT license. The model is built for instruction following, reasoning and coding within a single system, supports a context window up to 256k tokens, includes a vision encoder for variable image inputs, and exposes a configurable “reasoning effort” per request to trade off brief replies against longer multi‑step executions.

Alongside the model release, Mistral introduced cloud‑hosted remote agents for its Vibe tooling and added capabilities to Le Chat. Developers can begin sessions from a command‑line interface or inside Le Chat, run tasks asynchronously in cloud runtimes, and move sessions from local execution to the cloud while preserving state and history to continue work without losing context.

Sessions launched in the cloud run in isolated environments where agents can modify code, install dependencies, interact with external systems and produce concrete outputs such as pull requests and user notifications. The runtime model enables agents to perform end‑to‑end tasks that extend beyond single‑turn code generation, including multi‑step orchestration and structured outputs tied to external services.

Mistral Medium 3.5 is now the default model for these agent flows and replaces earlier models in the Vibe CLI. The company positions the model for long‑running workflows that require tool usage and orchestration rather than only one‑off code generation. Teams that prefer on‑premises execution can self‑host the model on a small number of GPUs, or rely on Mistral’s cloud runtimes for remote execution.

Le Chat’s new Work Mode allows an agent to execute multi‑step workflows across connected tools: accessing external data sources, running analyses, drafting messages, creating issues and generating reports. The system surfaces the agent’s actions and intermediate steps, requires user approval for sensitive operations and persists sessions so agents can iterate until a task is complete. Multiple agents can operate in parallel to support concurrent workflows.
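The approval behavior described above can be illustrated with a small sketch. It is not Mistral's implementation or API: the `Step`, `Session` and `run_steps` names are hypothetical, and the logic only shows the general pattern of logging every action, pausing on sensitive ones for user approval, and keeping the log on a session object that outlives a single call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    """One action in a workflow (hypothetical structure)."""
    name: str
    sensitive: bool = False  # sensitive steps need explicit user approval


@dataclass
class Session:
    """Keeps a visible log of what the agent did (hypothetical structure)."""
    log: list[str] = field(default_factory=list)


def run_steps(steps: list[Step],
              approve: Callable[[Step], bool],
              session: Session) -> Session:
    """Run steps in order, asking `approve` before any sensitive step."""
    for step in steps:
        if step.sensitive and not approve(step):
            # The user declined: record it and move on to the next step.
            session.log.append(f"skipped: {step.name}")
            continue
        session.log.append(f"done: {step.name}")
    return session


steps = [
    Step("read issue tracker"),
    Step("send message to customer", sensitive=True),
    Step("draft report"),
]

# An approver that declines everything, standing in for a user prompt.
session = run_steps(steps, lambda step: False, Session())
for entry in session.log:
    print(entry)
```

Passing the approver as a callback keeps the policy (who approves, and how) separate from the step loop, so the same loop could be driven by an interactive prompt or an automated rule.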

Early community reaction on X has been largely positive about the local‑to‑cloud handoff and the ability to run a single dense model on fewer GPUs; developer Jarek Sobiecki reported improvements during his testing. Some users raised cost concerns: one reported pricing of $1.50 for input and $7.50 for output, compared with lower figures observed for other models. These cost signals and runtime behavior will influence adoption for continuous integration and other production workflows.
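To see how the reported prices might translate into a per-run cost, here is a rough sketch. The article gives no unit for the $1.50 / $7.50 figures; the code assumes they are US dollars per million tokens, which is a common convention but is not confirmed by the source. The token counts for the example job are made up for illustration.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Pricing:
    """Prices in USD per million tokens (the unit is an assumption)."""
    input_usd_per_million: float
    output_usd_per_million: float


def estimate_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request or job."""
    total = (input_tokens * pricing.input_usd_per_million
             + output_tokens * pricing.output_usd_per_million)
    return total / TOKENS_PER_MILLION


# Figures as reported by one user on X; not verified against a price list.
REPORTED = Pricing(input_usd_per_million=1.50, output_usd_per_million=7.50)

# Hypothetical agent run: 200k input tokens read, 20k output tokens written.
cost = estimate_cost(REPORTED, input_tokens=200_000, output_tokens=20_000)
print(f"Estimated cost per run: ${cost:.2f}")
```

Under these assumptions, output tokens cost five times as much as input tokens, so long‑running agent sessions that generate a lot of text would be dominated by the output side of the bill.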

Sources

InfoQ AI/ML · 5/5/2026
