DeepSeek introduces Generation V4: open models with a million-token context and extended support for AI agents

News

4/25/2026, 2:20:43 AM

DeepSeek introduces Generation V4: open models with a million-token context and extended support for AI agents

DeepSeek has officially released a preview version of its new generation of artificial intelligence – the DeepSeek – V4 family of models, available with open weights. As part of this large-scale release, two main versions are presented to users and developers: the flagship V4-Pro system and the more compact V4-Flash. A key feature of the announcement was the transition to an era of cost-effective processing of huge data arrays, as the standard context window of one million tokens is now applied by default in all official company services. The developers state that the initial model weights and a detailed technical report have already been published on the HuggingFace platform.

The flagship DeepSeek – V4-Pro model is built on an architecture with a total of 1.6 trillion parameters, of which only 49 billion are utilized during active generation. According to the technical report, the performance of this version is comparable to that of the best closed commercial systems on the global market. In the open segment, the model sets new standards: it surpasses all available solutions in mathematics, natural sciences, and programming, and also demonstrates advanced results in autonomous code generation benchmarks. Regarding the volume of knowledge about the surrounding world, the creators note that V4-Pro is surpassed only by the closed Gemini-3.1 – Pro model, confidently outperforming all other open counterparts.

For tasks requiring high speed and maximum resource savings, the company introduced the DeepSeek – V4-Flash version. The architecture of this model includes 284 billion total parameters with 13 billion active. Despite the significantly smaller network size, the developers emphasize that V4-Flash's capabilities for complex logical inference are as close as possible to the level of the flagship Pro version. In basic tasks related to AI agent operation, the compact model demonstrates parity with the older version. The main advantage of the Flash version is its high response speed and extremely favorable cost of use via the programmatic interface.

Achieving a million-token context without critical load on the server infrastructure became possible due to the implementation of deep structural innovations. DeepSeek engineers applied a character-level compression mechanism in combination with their proprietary DSA sparse attention technology. These innovations allowed for a radical reduction in computational power and RAM costs. As a result, the fourth-generation algorithms are capable of processing ultra-long contexts with record efficiency, setting new standards for open language models.

Special attention in the new release is given to optimization for autonomous AI agents and specialized development tools. Company representatives reported that the DeepSeek – V4 family already has seamless integration with advanced solutions such as Claude Code, OpenClaw, and OpenCode. Moreover, the new models are actively used within DeepSeek itself to power its own autonomous programming systems, which confirms their readiness for implementation in real production processes.

The integration of the new products into existing projects is designed with maximum convenience for specialists in mind. The updated API is already available for use: simply retain the previous base address and change the model name to deepseek – v4-pro or deepseek – v4-flash. The system fully supports OpenAI ChatCompletions and Anthropic API formats. Both new versions are capable of operating in two modes – with reasoning and without it. Regular users can test the algorithms in the company's web chat version through Expert and Instant modes.

With the transition to the new architecture, the company announced the planned decommissioning of previous generations of models. The deepseek – chat and deepseek – reasoner versions currently automatically redirect requests to V4-Flash, and will be permanently deactivated after July 24, 2026. Amid heightened attention to the release, the management urged the public to trust information solely from official DeepSeek accounts. In conclusion, the developers reaffirmed their long-term strategy aimed at creating general artificial intelligence.

Sources

DeepSeek Updates · 4/24/2026

Replies (0)

No replies in this topic yet.

Back