Meta Muse Spark: The Closed-Source Model That Made Meta Stock Jump 6%

Meta abandoned open-source Llama for Muse Spark, a proprietary model that scores 52 on the Artificial Analysis Intelligence Index. Here's what it does, where it fails, and why it matters.

Meta spent years as the open-source underdog of AI. Llama was the model you could run on your own hardware, the one that powered a million indie projects, the one that made Meta the People’s AI Company.

Then on April 8, Meta Superintelligence Labs released Muse Spark — and it’s closed source. No weights. No download. No running it yourself.

Meta’s stock jumped 6% that day. The open-source community had… feelings.


What Is Muse Spark?

Muse Spark is the first model from Meta’s new Superintelligence Labs division. It’s a multimodal AI that can process text and images and reason through complex problems, much as Claude, ChatGPT, and Gemini do.

But unlike everything Meta has released before, you can’t download it. It runs on Meta’s servers, inside Meta’s apps: WhatsApp, Instagram, Facebook, Messenger, and Ray-Ban Meta glasses. You can also use it at meta.ai in a browser.

The model was built over nine months by a team led by Alexandr Wang, the founder of Scale AI, in which Meta bought a 49% stake for $14.3 billion. Wang’s mandate from Mark Zuckerberg was explicit: catch up with OpenAI, Anthropic, and Google, and do it fast.

Muse Spark is the first result of that push. Internally codenamed “Avocado,” it represents a ground-up rebuild of Meta’s AI approach — not an iteration on Llama, but something entirely new.


How It Compares to Claude, GPT, and Gemini

This is where the numbers get interesting. Artificial Analysis, a respected independent benchmark firm, scores Muse Spark at 52 on their Intelligence Index v4.0. Here’s where that puts it:

Model               Intelligence Index   Price
Gemini 3.1 Pro      57                   $20/mo (Google One AI Premium)
GPT-5.4             57                   $20/mo (ChatGPT Plus)
Claude Opus 4.6     53                   $20/mo (Claude Pro)
Muse Spark          52                   Free
Claude Sonnet 4.6   49                   Free (limited) / $20/mo
Grok 4.2            47                   $8/mo (X Premium+)

A score of 52 puts Muse Spark in the top 5 globally. And it’s completely free for consumers. That matters.
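To make the ranking concrete, here is a minimal sketch that recomputes Muse Spark’s position from the six scores quoted in the table. The dictionary and the `rank_of` helper are illustrative names, not anything published by Artificial Analysis, and the ranking covers only the models listed here.

```python
# Intelligence Index v4.0 scores as quoted in the table above.
scores = {
    "Gemini 3.1 Pro": 57,
    "GPT-5.4": 57,
    "Claude Opus 4.6": 53,
    "Muse Spark": 52,
    "Claude Sonnet 4.6": 49,
    "Grok 4.2": 47,
}


def rank_of(model: str, table: dict[str, int]) -> int:
    """Competition rank: 1 plus the number of models with a strictly higher score."""
    return 1 + sum(1 for s in table.values() if s > table[model])


muse_rank = rank_of("Muse Spark", scores)
gap_to_top = max(scores.values()) - scores["Muse Spark"]

print(f"Muse Spark rank among listed models: {muse_rank}")
print(f"Points behind the top score: {gap_to_top}")
```

Among the models listed, that places Muse Spark fourth, five points behind the two leaders and one point behind Claude Opus 4.6.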

Where Muse Spark Wins

Health and medical knowledge. Muse Spark leads every competitor on HealthBench Hard with a score of 42.8, beating GPT-5.4 (40.1), Gemini 3.1 Pro (20.6), and Grok 4.2 (20.3). If you ask Meta AI about symptoms, medications, or health conditions on WhatsApp, you’re getting answers from the model that currently scores highest on this benchmark.

Chart and figure understanding. On CharXiv Reasoning — which tests how well a model understands charts, graphs, and data visualizations — Muse Spark scores 86.4, ahead of GPT-5.4 (82.8) and Gemini (80.2). Useful for anyone who photographs a chart and asks “what does this mean?”

Multimodal perception. On MMMU-Pro (multimodal understanding), Muse Spark scores 80.5%, behind only Gemini 3.1 Pro (82.4%). It’s strong at understanding photos, screenshots, and visual information — a natural fit for Instagram and Ray-Ban glasses.

Token efficiency. Muse Spark used just 58 million output tokens to complete the full Intelligence Index evaluation. Claude Opus 4.6 used 157 million. GPT-5.4 used 120 million. Fewer tokens for the same work means faster responses and lower operational costs for Meta.
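As a rough illustration of what those token counts imply, the sketch below expresses each model’s output-token usage as a multiple of Muse Spark’s. It uses only the three figures quoted above; the variable names are illustrative, and the ratios say nothing about actual pricing, which the article does not give.

```python
# Output tokens (in millions) used to complete the Intelligence Index run,
# as quoted in this article.
output_tokens_millions = {
    "Muse Spark": 58,
    "GPT-5.4": 120,
    "Claude Opus 4.6": 157,
}

baseline = output_tokens_millions["Muse Spark"]

# Ratio of each model's token usage to Muse Spark's.
ratios = {
    model: tokens / baseline
    for model, tokens in output_tokens_millions.items()
}

for model, ratio in ratios.items():
    print(f"{model}: {ratio:.2f}x Muse Spark's output tokens")
```

By these figures, GPT-5.4 used roughly twice as many output tokens as Muse Spark, and Claude Opus 4.6 about 2.7 times as many.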

Where Muse Spark Falls Short

Coding is a weak point. On Terminal-Bench, which measures how well a model completes tasks in a command-line environment, Muse Spark scores 59.0. GPT-5.4 scores 75.1, and Claude Opus 4.6 leads the coding benchmarks overall. If you’re a developer, Muse Spark is not your tool.

Abstract reasoning lags behind. On ARC-AGI-2, which tests novel pattern recognition that models can’t memorize, Muse Spark scores 42.5. GPT-5.4 scores 76.1, and Gemini 3.1 Pro scores 76.5. Both are roughly 1.8 times Muse Spark’s score. Muse Spark handles knowledge-based tasks well but struggles with problems that require genuine logical reasoning on unfamiliar patterns.
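A quick check of the ARC-AGI-2 gap, using only the scores quoted above (the variable names are illustrative):

```python
# ARC-AGI-2 scores as quoted in this article.
arc_agi_2 = {"Muse Spark": 42.5, "GPT-5.4": 76.1, "Gemini 3.1 Pro": 76.5}

muse = arc_agi_2["Muse Spark"]

# How many times Muse Spark's score each competitor achieves.
ratios = {
    model: score / muse
    for model, score in arc_agi_2.items()
    if model != "Muse Spark"
}

for model, ratio in ratios.items():
    gap = arc_agi_2[model] - muse
    print(f"{model}: {ratio:.2f}x Muse Spark, {gap:.1f} points ahead")
```

Both competitors land at about 1.8 times Muse Spark’s score, a gap of roughly 34 points.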

No developer API (yet). There’s no public API. Select partners have private preview access, but you can’t build apps on Muse Spark right now. Meta says it plans to expand access but has given no timeline.


Why Meta Went Closed Source

This is the part that’s generating the most debate.

For years, Meta’s AI strategy was built on open source. Llama was free. Llama 2 was free. Llama 3 was free. Researchers, startups, and hobbyists built entire ecosystems around downloadable Meta models. The r/LocalLLaMA subreddit — 800K+ members who run AI models on their own hardware — existed largely because of Meta.

Then Llama 4 underperformed. The models launched in April 2025 to mixed reviews, with benchmark discrepancies and quality concerns. By mid-2025, Meta had reorganized its AI division, brought in Alexandr Wang, and created Meta Superintelligence Labs with a fundamentally different philosophy.

Wang’s background is telling. He built Scale AI — a company that made its money providing high-quality training data to closed-source AI companies like OpenAI and Anthropic. His approach is data quality over openness. Controlled releases over community building. Competitive performance over ecosystem growth.

Meta says they “hope to open-source future versions” of the Muse series. But the first model — the one that sets the tone — is proprietary. And the open-source community noticed.


How to Try It

Muse Spark is available right now, for free:

  • meta.ai — browser-based chat interface, works on desktop and mobile
  • WhatsApp — type a question to Meta AI in any chat
  • Instagram — access through the Meta AI integration
  • Facebook / Messenger — same Meta AI interface
  • Ray-Ban Meta glasses — voice queries processed by Muse Spark

No signup is needed for the web interface at meta.ai. For the messaging apps, you need an account for each app, which you probably already have.

The “Contemplating” mode is Muse Spark’s reasoning feature — similar to ChatGPT’s “thinking” or Claude’s “extended thinking.” It takes longer but produces more thoughtful, step-by-step responses for complex questions. It’s particularly strong on science, math, and health queries.


What It Can’t Do

Being clear about limitations:

  • No meaningful coding assistance. A Terminal-Bench score of 59.0 against 75.1 for GPT-5.4 means this isn’t a developer tool.
  • No file upload/analysis. You can’t upload documents, spreadsheets, or datasets. It processes text and images only.
  • No API for builders. You can’t integrate Muse Spark into your own products yet.
  • No persistent memory across sessions (as of launch). Each conversation starts fresh.
  • Limited tool use. It can’t browse the web for you, run code, or interact with third-party services the way Claude or ChatGPT can with their respective agent platforms.

What This Means for You

If you use Meta’s apps daily: You’re getting a significant AI upgrade — for free. The Meta AI assistant in WhatsApp, Instagram, and Messenger is about to get noticeably better, especially for health questions, photo understanding, and chart interpretation. If you’ve tried Meta AI before and found it lacking compared to ChatGPT or Claude, try it again. Muse Spark closes much of that gap.

If you’re an open-source AI enthusiast: This is a real setback. Meta’s commitment to open-source AI was always partly strategic — it let them recruit talent and build community while competing with better-funded closed labs. Muse Spark’s closure signals that competitive pressure has overtaken community building as Meta’s priority. The hope that future versions will be open-sourced is exactly that — hope, not commitment.

If you’re choosing between AI platforms: Muse Spark changes the equation because it’s free and good enough for many daily tasks. But it’s not as capable as Claude, GPT-5.4, or Gemini for complex work. Think of it as the best free option for general questions and image understanding, with the premium platforms still ahead for coding, reasoning, and professional workflows.

If you’re a developer: Wait for the API. Muse Spark isn’t useful to you until you can build with it. When the API opens, the token efficiency (58 million output tokens vs 157 million for Claude Opus 4.6 on the same evaluation) could make it attractive for high-volume applications where cost matters more than peak performance.


The Bottom Line

Muse Spark is Meta’s re-entry ticket to the frontier AI race. It’s not the best model (it ranks 4th, behind Gemini 3.1 Pro, GPT-5.4, and Claude Opus 4.6), but it’s free, it’s fast, and it’s embedded in apps that 3.2 billion people already use.

The bigger story is strategic. Meta built its AI reputation on openness and gave it up the moment performance required it. Whether “we hope to open-source future versions” becomes reality or remains a PR line will tell you everything about where Meta’s AI division is really headed.

For now, the model is real, the benchmarks are honest, and the stock market liked it. Try it at meta.ai and form your own opinion.

