DeepSeek introduces new AI model V4, narrowing the gap with cutting-edge neural networks


4/25/2026, 6:01:03 AM

Chinese research lab DeepSeek has officially unveiled preview versions of two variants of its latest large language model, DeepSeek V4. The release is a long-awaited update to last year's V3.2 and to the accompanying reasoning model R1, which drew significant attention across the artificial intelligence industry. The headline feature is the scale of the flagship DeepSeek V4 Pro: it has 1.6 trillion total parameters, of which 49 billion are active. That makes it the largest open-weight model available. For comparison, it is more than twice the size of DeepSeek's own V3.2, which had 671 billion parameters, and it is also larger than competitors such as Kimi K2.

Both new models, including the more compact DeepSeek V4 Flash, are based on the Mixture-of-Experts (MoE) architecture. An MoE model activates only a subset of its parameters for each token it processes, which significantly reduces inference costs. V4 Flash has 284 billion total parameters, of which 13 billion are active. A key technical feature of both models is a context window of 1 million tokens, large enough for users to include extensive codebases or lengthy text documents in a single query.
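To make the "total versus active" distinction concrete, here is a minimal sketch of generic top-k expert routing, followed by the active-parameter share implied by the figures reported above. The router is an illustrative assumption, not DeepSeek's actual gating mechanism, and the names `top_k_route` and `active_fraction` are hypothetical.

```python
import math


def top_k_route(gate_logits, k):
    """Pick the k highest-scoring experts and return (index, weight) pairs.

    Generic top-k gating for illustration only; DeepSeek's real router
    is not described in the article.
    """
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    # Softmax over the chosen experts only, so their weights sum to 1.
    peak = max(gate_logits[i] for i in chosen)
    exps = [math.exp(gate_logits[i] - peak) for i in chosen]
    norm = sum(exps)
    return [(i, e / norm) for i, e in zip(chosen, exps)]


def active_fraction(total_billions, active_billions):
    """Share of a model's parameters that are active for a given token."""
    return active_billions / total_billions


# (total, active) parameter counts in billions, as reported in the article.
REPORTED = {
    "DeepSeek V4 Pro": (1600, 49),
    "DeepSeek V4 Flash": (284, 13),
}

# Hypothetical gate scores for 8 experts; only 2 of them run for this token.
print(top_k_route([0.1, 2.0, -1.0, 1.5, 0.3, -0.2, 0.9, 0.0], k=2))

for name, (total, active) in REPORTED.items():
    print(f"{name}: {active_fraction(total, active):.1%} of parameters active")
```

On the reported figures, roughly 3% of V4 Pro's parameters and under 5% of V4 Flash's are active for any one token, which is why inference cost tracks the active count rather than the total.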

DeepSeek says that architectural improvements make the new models more efficient and more capable than the previous generation. According to the company's internal tests, they have virtually closed the gap with leading open and closed models on logical reasoning benchmarks. In competitive programming, both V4 versions are estimated to perform comparably to GPT-5.4. The company also mentions a V4-Pro-Max variant, which it claims surpasses open-source counterparts on logic tests and even outperforms OpenAI's GPT-5.2 and Google's Gemini 3.0 Pro on certain specific tasks.

Despite the claimed successes in programming and logic, the new models still show limitations on tests of factual knowledge, where they trail frontier models slightly, in particular OpenAI's GPT-5.4 and Google's latest Gemini 3.1 Pro. DeepSeek openly acknowledges the gap, noting in its materials that its current development trajectory is approximately three to six months behind the most advanced models. The source does not detail which knowledge domains show the largest gap, but it emphasizes the general trend of trailing the market leaders on this metric.

DeepSeek V4's key competitive advantage in the global market is its aggressive pricing, which makes these models significantly cheaper to use than any advanced counterpart available today. The compact V4 Flash costs $0.14 per million input tokens and $0.28 per million output tokens. That is less than GPT-5.4 Nano, Gemini 3.1 Flash, GPT-5.4 Mini, and Claude Haiku 4.5. The more powerful V4 Pro is priced at $0.145 per million input tokens and $3.48 per million output tokens. These rates allow it to compete directly with premium offerings, including Gemini 3.1 Pro, GPT-5.5, Claude Opus 4.7, and GPT-5.4.
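As a rough illustration of what these per-token rates mean in practice, the following sketch estimates the cost of a single request. The prices are the ones quoted above; the function name `request_cost` and the example token counts are hypothetical, and real billing may include factors the article does not mention.

```python
# USD per million tokens as quoted in the article: (input, output).
PRICES_PER_MILLION = {
    "V4 Flash": (0.14, 0.28),
    "V4 Pro": (0.145, 3.48),
}


def request_cost(model, input_tokens, output_tokens):
    """Estimate the cost in USD of one request at the quoted rates."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000


# Hypothetical workload: a prompt that fills the 1-million-token context
# window, with a 2,000-token response.
for model in PRICES_PER_MILLION:
    cost = request_cost(model, input_tokens=1_000_000, output_tokens=2_000)
    print(f"{model}: ${cost:.4f}")
```

Because input and output tokens are priced separately, a workload's cost depends on its mix: long prompts with short answers are dominated by the input rate, while generation-heavy tasks are dominated by the output rate.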

The release comes amid a serious escalation of tensions in the international technology sector and accusations against the Chinese lab from Western companies. DeepSeek V4 launched just one day after the United States officially accused China of industrial-scale theft of intellectual property from American AI labs using thousands of proxy accounts. DeepSeek itself has also previously faced direct accusations from Anthropic and OpenAI. These competitors claim that the lab engages in distillation, which in effect means training its own models on the outputs of their advanced models without authorization.

Sources

TechCrunch DeepSeek · 4/24/2026
