DeepSeek V4: Release Date, Specs, and the Huawei Chip Bombshell

Updated April 21, 2026 — status: still unreleased, delayed a third time. The “Hunter Alpha” OpenRouter model was confirmed to be Xiaomi, not DeepSeek. Huawei chip confirmation from April 4 still stands.

Every other article about DeepSeek V4 says it’s launching “mid-February” or “sometime in March.” Those articles are wrong. It’s late April. The model has been delayed three times now. “Late April” is still the community expectation as of April 21 — pre-training is done, V4-Lite has been live-tested on API nodes, but no official V4 launch, no model card, and no Hugging Face drop.

What’s actually confirmed is the Huawei story. Reuters reported on April 4 that DeepSeek V4 will run on Huawei’s Ascend 950PR chips. Not NVIDIA. Not AMD. Huawei. That makes it the first frontier AI model built to run on Chinese semiconductor infrastructure — and the implications go far beyond a new chatbot.

Here’s everything we know right now, with speculation clearly marked.

What Is DeepSeek?

If you haven’t followed the DeepSeek story: it’s a Chinese AI lab that shocked the industry in January 2025 when DeepSeek R1 matched GPT-4 on key benchmarks at a fraction of the training cost. They followed with V3, which became one of the most-used open models in the world. V4 is their next frontier model — and potentially the most significant AI release of 2026.

DeepSeek’s models are open-source (typically MIT or Apache 2.0 licensed), meaning anyone can download, run, and modify them for free. That’s a sharp contrast to OpenAI and Anthropic, whose frontier models are closed and API-only.

DeepSeek V4 Specs

Specification	Details
Parameters	~1 trillion (MoE architecture)
Active parameters per token	~37 billion
Context window	1 million tokens
Modality	Text + image + video (native multimodal generation)
Architecture	Mixture of Experts (MoE)
Training cost	~$5.2 million
License	Expected open-source (MIT or Apache 2.0)
Hardware	Huawei Ascend 950PR + Cambricon chips
Variants	V4 (full), V4-Lite (lighter, already in testing)

The Mixture of Experts design is key. While the model has 1 trillion total parameters, only ~37 billion activate per response. That means it runs more like a 37B model in practice — fast and relatively lightweight — while having access to 1T parameters worth of knowledge. It’s the same trick Google used with Gemma 4’s 26B MoE model, but at 40x the scale.

Release Date: When Is It Actually Launching?

Let’s be honest about the timeline. DeepSeek V4 has now been delayed three times:

Expected Date	What Happened
Mid-February 2026	Delayed — no explanation given
March 2026	Delayed again — V4-Lite appeared March 9 instead
Early-to-mid April 2026	Pretraining done, but no official launch. Still no model card.
Late April 2026	Current community expectation. No confirmed date from DeepSeek.

The strongest signal that V4 is imminent: V4-Lite has been live-tested on API nodes since early April, with developers reporting a 30% inference speed increase and dramatically improved context recall (94% at 128K tokens, up from 45%).

Correction: An earlier version of this page said a stealth V4 also appeared on OpenRouter as “Hunter Alpha.” That attribution was wrong. The Hunter Alpha model was later fingerprinted and confirmed to be Xiaomi’s MiMo-V2-Pro, not DeepSeek V4. The SillyTavern community issued a PSA on the same point. As of April 21, no verifiable V4 model exists on OpenRouter or Hugging Face.

Insider leaks still point to a late-April launch. One possible timing factor: the upcoming Trump-Xi meeting, where demonstrating Chinese AI parity could strengthen Beijing’s position on chip export controls.

Our best estimate: last two weeks of April 2026, but a slip into early May is plausible given the pattern. We’ll update this page the moment it drops.

The Huawei Chip Story (Why This Is Bigger Than a Model Launch)

This is the part most AI blogs are missing.

DeepSeek deliberately denied NVIDIA early access to V4 while giving that window exclusively to Chinese chipmakers. Alibaba, ByteDance, and Tencent have placed bulk orders for hundreds of thousands of Huawei’s Ascend 950PR chips — and prices have jumped 20% in weeks.

This matters because:

It’s the first frontier AI model that doesn’t need NVIDIA. Every other leading AI model (GPT-5, Claude, Gemini) runs on NVIDIA GPUs. DeepSeek V4 proves you can train and run a competitive model on domestic Chinese silicon.
It challenges the US chip export strategy. US sanctions on advanced chip exports to China assumed Chinese companies couldn’t build frontier models without NVIDIA hardware. V4 on Huawei chips undermines that assumption.
It could shift the economics of AI. If Huawei chips can run frontier models at lower cost than NVIDIA, the entire pricing structure of AI APIs could face downward pressure.

DeepSeek has been rewriting core code components to bypass NVIDIA’s CUDA ecosystem in favor of Huawei’s CANN architecture. They’re also developing two V4 variants specifically optimized for consumer-grade Chinese GPUs — meaning everyday developers in China could run it locally.

Pricing: How Cheap Will It Be?

DeepSeek has historically offered the cheapest API pricing in the industry. Early pricing leaks for V4:

Model	Input (per 1M tokens)	Output (per 1M tokens)	Monthly Cost (moderate use)
DeepSeek V4 (projected)	~$0.14–$0.50	~$0.28–$0.80	$5-20
Claude Opus 4.7	$5.00	$25.00	$100-360
Claude Sonnet 4.6	$3.00	$15.00	$15-30
GPT-5.4	$2.50	$15.00	$15-25
Gemini 3.1 Pro	$2.00	$12.00	$8-20
DeepSeek V3.2	$0.28	$0.42	$1-5

If these numbers hold, V4 delivers near-Claude-Opus quality at roughly 1/50th the price for input and 1/25th for output. For production workloads, that’s the difference between $2,000/month and $40/month.

The catch: these are leaked/estimated prices, not confirmed. But given DeepSeek’s track record with V3 pricing ($0.28 input), it’s credible.

Benchmarks: How Good Is It?

Caveat up front: DeepSeek’s V4 benchmarks are self-reported and not yet independently verified. Treat them with healthy skepticism until third-party evaluations confirm.

That said, here’s what early community testing of V4-Lite and internal benchmarks suggest:

Benchmark	DeepSeek V4 (claimed)	Claude Opus 4.7 (verified, Apr 16)	GPT-5.4
SWE-bench Verified	~81%	87.6%	~80%
Math reasoning	115% of Opus 4.6 baseline	Strong	~100%
Knowledge	97% of Opus 4.6 baseline	Strong	~95%
Science reasoning (GPQA Diamond)	95% of Opus	94.2%	~90%
Coding	~90% of Opus 4.6	Baseline (best-in-class after 4.7 launch)	~95%

The pattern: V4 appears to match or beat Claude on math and knowledge tasks while falling slightly short on coding. That gap widened on April 16 when Claude Opus 4.7 pushed SWE-bench Verified from 80.8% to 87.6% — if V4’s ~81% coding claim holds, it now trails the current frontier by ~6 points instead of running even. For general AI use, V4 at a fraction of Opus pricing is still compelling even at 90% relative quality.

V4-Lite’s context handling has improved dramatically in testing — 94% recall at 128K tokens (up from 45% in earlier versions). If the full V4 model maintains that improvement at 1M tokens, it’ll have one of the largest effective context windows of any model.

How to Access DeepSeek V4 When It Launches

Based on DeepSeek’s pattern with V3, expect multiple access paths:

1. DeepSeek API (fastest) Sign up at platform.deepseek.com. DeepSeek typically launches on their own API first. Expect V4 to appear within hours of announcement.

2. OpenRouter (day of or day after) OpenRouter already hosts V3 and historically adds new DeepSeek models within 24 hours. Don’t assume any current “mystery model” is V4 in disguise — the Hunter Alpha speculation turned out to be Xiaomi, not DeepSeek.

3. HuggingFace (within days) DeepSeek publishes open weights on HuggingFace. V3 is already at deepseek-ai/DeepSeek-V3. V4 will follow. Large download (~400-700GB for full model).

4. Ollama (within a week) Community quantizations appear on Ollama quickly after HuggingFace release. For local use: ollama pull deepseek-v4 (once available). You’ll need serious hardware — the full model is 1T parameters. Quantized versions for consumer GPUs will follow.

5. Third-party APIs Fireworks AI, Together AI, and Groq typically add popular open models within days.

DeepSeek V4 vs the Competition: Quick Decision Guide

Your Priority	Best Choice	Why
Cheapest possible	DeepSeek V4	10-50x cheaper than Western models
Best coding quality	Claude Opus 4.7	SWE-bench Verified at 87.6% — the verified leader after the April 16 launch
Longest context (1M)	DeepSeek V4	1M native context, proven in V4-Lite testing
Run locally (free)	Gemma 4 31B	Open-source, runs on 24GB+ GPU
Best all-rounder	Claude Sonnet 4.6	Quality + reasonable cost
Privacy (no cloud)	DeepSeek V4 local (once available) + Gemma 4	Open weights, run on your hardware

What We Don’t Know Yet

Being honest about the gaps:

Exact launch date. “Late April” is our best estimate. Could be next week, could be month-end.
Real-world benchmark performance. Self-reported numbers don’t always match independent testing.
Censorship/alignment differences. Previous DeepSeek models had restrictions on politically sensitive topics per Chinese regulations. V4 will likely have the same.
Local running requirements. The full 1T model needs massive hardware. Quantized consumer versions are expected but specs aren’t confirmed.
Whether open weights include the full model. DeepSeek could release a smaller variant open-source while keeping the full V4 API-only.

The Bottom Line

DeepSeek V4 could be the most consequential AI model launch of 2026 — not because it’s the smartest (Claude Opus 4.7 now leads on coding after its April 16 release, at 87.6% SWE-bench Verified), but because it could be the cheapest frontier model by a factor of 10-50x, the first running on non-NVIDIA chips, and fully open-source.

If you’re evaluating AI models for your work, don’t make a final decision until V4 launches. The pricing could reshape the entire market.

We’ll update this page the moment V4 goes live. Bookmark it.

Want to actually use DeepSeek V4 in your dev workflow? Our Claude Code with DeepSeek V4 course walks through the routing setup, the eval pattern that lets you swap V4 for Sonnet on the right tasks, and the cost-vs-quality math for mixed-model coding. Free to start, Pro for the full path. If you want broader Claude Code fundamentals first, Claude Code Mastery is the prereq.

Sources:

DeepSeek V4: Release Date, Specs, and the Huawei Chip Bombshell

Table of Contents

What Is DeepSeek?

DeepSeek V4 Specs

Release Date: When Is It Actually Launching?

The Huawei Chip Story (Why This Is Bigger Than a Model Launch)

Pricing: How Cheap Will It Be?

Benchmarks: How Good Is It?

How to Access DeepSeek V4 When It Launches

DeepSeek V4 vs the Competition: Quick Decision Guide

What We Don’t Know Yet

The Bottom Line

Build Real AI Skills

AI Fundamentals

Prompt Engineering

OpenClaw Mastery

AI for Small Business