DeepSeek V4: Release Date, Specs, and the Huawei Chip Bombshell

DeepSeek V4 launches April 2026 with 1T parameters on Huawei chips. Full specs, pricing ($0.30/MTok), benchmarks vs Claude and GPT-5, and how to access it.

Updated April 5, 2026 — includes Reuters Huawei chip confirmation (April 4), V4-Lite benchmark results, and community pricing analysis.

Every other article about DeepSeek V4 says it’s launching “mid-February” or “sometime in March.” Those articles are wrong. It’s April. The model has been delayed twice. And the real story isn’t the delay — it’s what changed while everyone was waiting.

Reuters confirmed on April 4 that DeepSeek V4 will run on Huawei’s Ascend 950PR chips. Not NVIDIA. Not AMD. Huawei. That makes it the first frontier AI model built to run on Chinese semiconductor infrastructure — and it has implications that go far beyond a new chatbot.

Here’s everything we know right now, updated daily as the launch approaches.


What Is DeepSeek?

If you haven’t followed the DeepSeek story: it’s a Chinese AI lab that shocked the industry in January 2025 when DeepSeek R1 matched GPT-4 on key benchmarks at a fraction of the training cost. They followed with V3, which became one of the most-used open models in the world. V4 is their next frontier model — and potentially the most significant AI release of 2026.

DeepSeek’s models are open-source (typically MIT or Apache 2.0 licensed), meaning anyone can download, run, and modify them for free. That’s a sharp contrast to OpenAI and Anthropic, whose frontier models are closed and API-only.


DeepSeek V4 Specs

| Specification | Details |
|---|---|
| Parameters | ~1 trillion (MoE architecture) |
| Active parameters per token | ~37 billion |
| Context window | 1 million tokens |
| Modality | Text + image + video (native multimodal generation) |
| Architecture | Mixture of Experts (MoE) |
| Training cost | ~$5.2 million |
| License | Expected open-source (MIT or Apache 2.0) |
| Hardware | Huawei Ascend 950PR + Cambricon chips |
| Variants | V4 (full), V4-Lite (lighter, already in testing) |

The Mixture of Experts design is key. While the model has 1 trillion total parameters, only ~37 billion activate per response. That means it runs more like a 37B model in practice — fast and relatively lightweight — while having access to 1T parameters worth of knowledge. It’s the same trick Google used with Gemma 4’s 26B MoE model, but at 40x the scale.
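A rough sketch of how top-k MoE routing works, in toy dimensions (the random router and expert weights here are placeholders for illustration, not DeepSeek's actual design):

```python
import numpy as np

def moe_layer(x, expert_weights, router_weights, top_k=2):
    """Route x to the top_k highest-scoring experts; only those compute."""
    scores = router_weights @ x                       # one score per expert
    chosen = np.argsort(scores)[-top_k:]              # indices of winning experts
    gates = np.exp(scores[chosen])
    gates /= gates.sum()                              # softmax over chosen experts only
    return sum(g * (expert_weights[i] @ x) for g, i in zip(gates, chosen))

rng = np.random.default_rng(0)
d, n_experts, top_k = 8, 16, 2
experts = rng.standard_normal((n_experts, d, d))      # 16 toy "experts"
router = rng.standard_normal((n_experts, d))
y = moe_layer(rng.standard_normal(d), experts, router, top_k)
# Only 2 of 16 experts ran (12.5%); V4 reportedly activates ~37B of 1T (~3.7%).
print(y.shape)
```

The router picks a different subset of experts per token, which is why total parameter count (knowledge capacity) and active parameter count (compute per token) can diverge so sharply.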


Release Date: When Is It Actually Launching?

Let’s be honest about the timeline. DeepSeek V4 has been delayed twice:

| Expected Date | What Happened |
|---|---|
| Mid-February 2026 | Delayed — no explanation given |
| March 2026 | Delayed again — V4-Lite appeared March 9 instead |
| Late April 2026 | Current expectation based on Reuters reporting + insider leaks |

The strongest signal that V4 is imminent: V4-Lite has been live-tested on API nodes since early April, with developers reporting a 30% inference speed increase and dramatically improved context recall (94% at 128K tokens, up from 45%). A stealth version also appeared on OpenRouter under the codename “Hunter Alpha” before being identified.

Insider leaks suggest a rescheduled late April launch. One possible timing factor: the upcoming Trump-Xi meeting, where demonstrating Chinese AI parity could strengthen negotiating position on chip export controls.

Our best estimate: last two weeks of April 2026. We’ll update this page the moment it drops.


The Huawei Chip Story (Why This Is Bigger Than a Model Launch)

This is the part most AI blogs are missing.

DeepSeek deliberately denied NVIDIA early access to V4 while giving that window exclusively to Chinese chipmakers. Alibaba, ByteDance, and Tencent have placed bulk orders for hundreds of thousands of Huawei’s Ascend 950PR chips — and prices have jumped 20% in weeks.

This matters because:

  1. It’s the first frontier AI model that doesn’t need NVIDIA. Every other leading AI model (GPT-5, Claude, Gemini) runs on NVIDIA GPUs. DeepSeek V4 proves you can train and run a competitive model on domestic Chinese silicon.

  2. It challenges the US chip export strategy. US sanctions on advanced chip exports to China assumed Chinese companies couldn’t build frontier models without NVIDIA hardware. V4 on Huawei chips undermines that assumption.

  3. It could shift the economics of AI. If Huawei chips can run frontier models at lower cost than NVIDIA, the entire pricing structure of AI APIs could face downward pressure.

DeepSeek has been rewriting core code components to bypass NVIDIA’s CUDA ecosystem in favor of Huawei’s CANN architecture. They’re also developing two V4 variants specifically optimized for consumer-grade Chinese GPUs — meaning everyday developers in China could run it locally.


Pricing: How Cheap Will It Be?

DeepSeek has historically offered the cheapest API pricing in the industry. Early pricing leaks for V4:

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Monthly Cost (moderate use) |
|---|---|---|---|
| DeepSeek V4 | ~$0.28 | ~$0.50-1.10 | $5-20 |
| Claude Opus 4.6 | $5.00 | $25.00 | $100-360 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $15-30 |
| GPT-5.4 | $2.50 | $15.00 | $15-25 |
| Gemini 3.1 Pro | $2.00 | $12.00 | $8-20 |
| DeepSeek V3.2 | $0.28 | $0.42 | $1-5 |

If these numbers hold, V4 delivers near-Claude-Opus quality at roughly 1/18th the price for input and 1/25th to 1/50th for output. For production workloads, that’s the difference between $2,000/month and well under $100/month.

The catch: these are leaked/estimated prices, not confirmed. But given DeepSeek’s track record with V3 pricing ($0.28 input), it’s credible.
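To see what the leaked numbers imply at production scale, here’s a quick back-of-envelope calculation (the 50M-input/10M-output monthly workload is an illustrative assumption, not a DeepSeek figure):

```python
def monthly_cost(input_mtok, output_mtok, input_price, output_price):
    """API cost in USD for a month; token volumes in millions of tokens."""
    return input_mtok * input_price + output_mtok * output_price

# Hypothetical production workload: 50M input + 10M output tokens per month.
v4   = monthly_cost(50, 10, 0.28, 1.10)   # leaked V4 pricing, upper-bound output
opus = monthly_cost(50, 10, 5.00, 25.00)  # Claude Opus 4.6 list pricing
print(f"V4: ${v4:.2f}  Opus: ${opus:.2f}  ratio: {opus / v4:.0f}x")
```

At this mix of input and output tokens the effective gap lands around 20x; output-heavy workloads push it higher, input-heavy ones lower.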


Benchmarks: How Good Is It?

Caveat up front: DeepSeek’s V4 benchmarks are self-reported and not yet independently verified. Treat them with healthy skepticism until third-party evaluations confirm them.

That said, here’s what early community testing of V4-Lite and internal benchmarks suggest:

| Benchmark | DeepSeek V4 (claimed) | Claude Opus 4.6 (verified) | GPT-5.4 |
|---|---|---|---|
| SWE-bench Verified | ~81% | 80.8% | ~80% |
| Math reasoning | 115% of Opus | Baseline | ~100% |
| Knowledge | 97% of Opus | Baseline | ~95% |
| Science reasoning | 95% of Opus | Baseline | ~90% |
| Coding | ~90% of Opus | Baseline (best-in-class) | ~95% |

The pattern: V4 appears to match or beat Claude on math and knowledge tasks while falling slightly short on coding. For developers, that coding gap matters. For general AI use, V4 at 1/50th the price is compelling even at 90% quality.

V4-Lite’s context handling has improved dramatically in testing — 94% recall at 128K tokens (up from 45% in earlier versions). If the full V4 model maintains that improvement at 1M tokens, it’ll have one of the largest effective context windows of any model.


How to Access DeepSeek V4 When It Launches

Based on DeepSeek’s pattern with V3, expect multiple access paths:

1. DeepSeek API (fastest): Sign up at platform.deepseek.com. DeepSeek typically launches on its own API first. Expect V4 to appear within hours of the announcement.

2. OpenRouter (day of, or day after): OpenRouter already hosts V3 and likely had V4 stealth-testing as “Hunter Alpha.” Expect near-immediate availability.

3. HuggingFace (within days): DeepSeek publishes open weights on HuggingFace. V3 is already at deepseek-ai/DeepSeek-V3; V4 will follow. Expect a large download (~400-700GB for the full model).

4. Ollama (within a week): Community quantizations appear on Ollama quickly after a HuggingFace release. For local use: `ollama pull deepseek-v4` (once available). You’ll need serious hardware — the full model is 1T parameters. Quantized versions for consumer GPUs will follow.

5. Third-party APIs: Fireworks AI, Together AI, and Groq typically add popular open models within days.
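For reference, DeepSeek’s existing API speaks the OpenAI chat-completions format, so a V4 request will likely look like the sketch below. The endpoint shown is the current V3-era one, and the V4 model id is unknown until launch — `deepseek-chat` is today’s id:

```python
import json

# Current DeepSeek endpoint; whether V4 keeps this path is unconfirmed.
API_URL = "https://api.deepseek.com/chat/completions"

def chat_request(model, prompt, api_key):
    """Build the OpenAI-style headers + JSON body that DeepSeek's API accepts."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,  # "deepseek-chat" today; the V4 id is a guess until launch
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body

headers, body = chat_request("deepseek-chat", "Hello, V4!", "YOUR_API_KEY")
print(json.loads(body)["model"])
```

Because the format matches OpenAI’s, existing OpenAI-client code typically only needs a new base URL and model id to switch over.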


DeepSeek V4 vs the Competition: Quick Decision Guide

| Your Priority | Best Choice | Why |
|---|---|---|
| Cheapest possible | DeepSeek V4 | 10-50x cheaper than Western models |
| Best coding quality | Claude Opus 4.6 | Still the verified benchmark leader for code |
| Longest context (1M) | DeepSeek V4 | 1M native context, proven in V4-Lite testing |
| Run locally (free) | Gemma 4 31B | Open-source, runs on 24GB+ GPU |
| Best all-rounder | Claude Sonnet 4.6 | Quality + reasonable cost |
| Privacy (no cloud) | DeepSeek V4 local (once available) + Gemma 4 | Open weights, run on your hardware |

What We Don’t Know Yet

Being honest about the gaps:

  • Exact launch date. “Late April” is our best estimate. Could be next week, could be month-end.
  • Real-world benchmark performance. Self-reported numbers don’t always match independent testing.
  • Censorship/alignment differences. Previous DeepSeek models had restrictions on politically sensitive topics per Chinese regulations. V4 will likely have the same.
  • Local running requirements. The full 1T model needs massive hardware. Quantized consumer versions are expected but specs aren’t confirmed.
  • Whether open weights include the full model. DeepSeek could release a smaller variant open-source while keeping the full V4 API-only.
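The local-hardware question above can at least be bounded with simple arithmetic, using the leaked parameter counts (weights-only storage; KV cache and activations add more on top):

```python
def weight_gb(params, bits):
    """Storage for model weights alone, in gigabytes."""
    return params * bits / 8 / 1e9

total, active = 1e12, 37e9   # leaked V4 figures: 1T total, ~37B active parameters
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: full weights ~{weight_gb(total, bits):,.0f} GB, "
          f"active experts ~{weight_gb(active, bits):.1f} GB")
```

Note that MoE reduces compute per token, not storage: all 1T parameters must be resident somewhere, which is why even a 4-bit quantization of the full model lands around 500GB — consistent with the ~400-700GB download estimates.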

The Bottom Line

DeepSeek V4 could be the most consequential AI model launch of 2026 — not because it’s the smartest (Claude Opus still leads on coding), but because it could be 10-50x cheaper than any other frontier model, the first to run on non-NVIDIA chips, and fully open-source.

If you’re evaluating AI models for your work, don’t make a final decision until V4 launches. The pricing could reshape the entire market.

We’ll update this page the moment V4 goes live. Bookmark it.

