Meet the Models: What's Under the Hood
Learn the full model lineup for ChatGPT and Claude — from free-tier defaults to premium flagships — and what each one actually does differently.
“ChatGPT” and “Claude” are brand names. The real differences live in the specific models running underneath.
When someone says “ChatGPT gave me a terrible answer,” they might have been using GPT-4o on the free tier. Switch to GPT-5.2 on a paid plan and you’d get a completely different result. Same goes for Claude — Haiku and Opus are practically different products.
Let’s break down what’s actually available as of March 2026.
The ChatGPT Model Lineup
OpenAI has a surprisingly large family of models now. Here’s what matters:
| Model | Best For | Access |
|---|---|---|
| GPT-5.2 | Flagship reasoning, math, science | Plus ($20/mo) |
| GPT-5.2 Pro | Maximum capability, deep research | Pro ($200/mo) |
| GPT-5.3 Codex | Agentic coding, terminal tasks | Plus ($20/mo) |
| GPT-4o | Everyday tasks, free-tier default | Free |
| o3-pro | Extended reasoning chains | Pro ($200/mo) |
| o4-mini | Fast reasoning, cost-efficient | Plus ($20/mo) |
The free tier runs GPT-4o — still solid, but noticeably behind the paid models. GPT-5.2 is where the real power lives. It scores 100% on AIME 2025 competition math problems and 94.3% on graduate-level science questions (GPQA Diamond).
GPT-5.3 Codex dropped on February 5, 2026, designed specifically for developers. It leads Terminal-Bench 2.0 at 77.3% and handles DevOps tasks, CI/CD pipelines, and code reviews better than any other model.
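The practical upshot of this lineup: in the API, the model is just a string you pass per request, and the brand name never appears. Here's a toy sketch of that idea (the payload shape mirrors the common chat-completion style, but treat the exact field names as assumptions, not any vendor's official schema):

```python
# Toy sketch: the "model" field is what selects capability per request.
# The payload shape mirrors the common chat-completion style; exact
# field names vary by SDK, so treat this as illustrative only.

def build_request(model: str, prompt: str) -> dict:
    """Assemble a chat-style request payload for a given model name."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Same prompt, two very different models from the lineup above:
free_tier = build_request("gpt-4o", "Summarize this contract.")
flagship = build_request("gpt-5.2", "Summarize this contract.")

# Only the model string differs, and that is the whole difference.
assert free_tier["messages"] == flagship["messages"]
assert free_tier["model"] != flagship["model"]
```

Switching from a free-tier default to a flagship is literally a one-string change; everything else about the request stays the same.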
The Claude Model Lineup
Anthropic keeps things simpler — three tiers with clear roles:
| Model | Best For | Access |
|---|---|---|
| Claude Opus 4.6 | Flagship reasoning, 1M context, agent teams | Pro ($20/mo) |
| Claude Sonnet 4.6 | Default model, balanced speed/quality | Free + Pro |
| Claude Haiku 3.5 | Fast, budget API tasks | API only |
Claude Opus 4.6 leads Chatbot Arena with an Elo rating of 1506 — the highest of any AI model. It also supports a 1M token context window, which means you can feed it an entire book or codebase in one go.
Sonnet 4.6 became the new default for both free and paid users on February 17, 2026. It’s the model most people interact with.
Both major releases (Opus 4.6 and GPT-5.3 Codex) dropped on the same day — February 5, 2026. The AI arms race is very real.
✅ Quick Check: If you’re on a free tier right now, which models are you actually using? (ChatGPT: GPT-4o. Claude: Sonnet 4.6.)
The Context Window Gap
This one matters more than most people realize.
Context window = how much text the AI can “see” at once. Think of it like working memory. A bigger window means the AI can read more of your documents, code, or conversation history before it starts forgetting earlier parts.
| | ChatGPT | Claude |
|---|---|---|
| Max context | 128K tokens | 200K (standard), 1M (Opus 4.6) |
| In words | ~96,000 words | ~150,000 to 750,000 words |
| In pages | ~50 pages effectively | ~100+ pages coherently |
| Accuracy at max | Not well documented | Less than 5% degradation |
| Max single output | ~4-6 pages | 19+ pages coherent writing |
That last row is wild. If you ask ChatGPT to write something long, it typically hits a wall around 4-6 pages. Claude can produce 19+ pages of coherent, structured writing in a single output. For anyone working with long documents — contracts, research papers, codebases — this is a real difference.
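You can turn those token counts into a rough capacity check yourself. A common rule of thumb, about 4 characters per English token, is an approximation, not either vendor's actual tokenizer, but it's close enough for "will this fit?" questions:

```python
# Rough context-window fit check. The ~4 chars/token heuristic is a
# common approximation for English prose, not an official tokenizer.

CHARS_PER_TOKEN = 4  # rule-of-thumb for English text

CONTEXT_WINDOWS = {
    "gpt-5.2": 128_000,           # ChatGPT max, per the table above
    "claude-sonnet-4.6": 200_000,
    "claude-opus-4.6": 1_000_000,
}

def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, model: str) -> bool:
    """True if the text's estimated tokens fit the model's window."""
    return estimate_tokens(text) <= CONTEXT_WINDOWS[model]

# A ~300-page book at roughly 2,000 characters per page:
book = "x" * (300 * 2_000)  # 600,000 chars ≈ 150,000 tokens

print(fits_in_context(book, "gpt-5.2"))          # False: over 128K
print(fits_in_context(book, "claude-opus-4.6"))  # True: well under 1M
```

For precise counts you'd use the provider's own tokenizer, but the heuristic is enough to tell a 50-page contract from a 500-page codebase dump.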
Architecture: How They Think Differently
Both models use transformer architectures, but their training philosophies diverge:
ChatGPT (RLHF — Reinforcement Learning from Human Feedback): Human reviewers rank outputs, and the model learns to produce responses that humans rate highly. This makes ChatGPT good at being helpful and conversational. The downside? In April 2025, an update over-optimized for user approval, making the model excessively agreeable — it even endorsed bad business ideas instead of pushing back. OpenAI rolled it back within days.
Claude (Constitutional AI): The model self-critiques against a set of written principles (a “constitution”). This makes Claude more consistent about honesty and less prone to telling you what you want to hear. The trade-off is that Claude sometimes refuses borderline requests that ChatGPT handles fine.
Neither approach is strictly better. But they produce different personalities. ChatGPT tends to be more flexible and eager to help. Claude tends to be more careful and honest, even when that means pushing back on your premise.
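The two philosophies can be caricatured in a few lines of code. This is a deliberately toy sketch: the scoring function and the one-line "constitution" are invented for illustration, and real training is vastly more involved than selecting between two canned strings:

```python
# Toy contrast of the two alignment philosophies. Everything here is
# a deliberately simplified illustration, not real training code.

candidates = [
    "Great idea, go for it!",            # agreeable but uncritical
    "This plan has a flaw: no budget.",  # honest pushback
]

def human_rating(response: str) -> int:
    """RLHF-style signal: raters often score agreeable answers higher."""
    return 2 if "Great idea" in response else 1

def violates_constitution(response: str) -> bool:
    """Constitutional-AI-style check against a written principle."""
    # Invented one-line constitution: "do not endorse plans uncritically"
    return "go for it" in response.lower()

# RLHF picks whatever humans rated highest:
rlhf_choice = max(candidates, key=human_rating)

# Constitutional AI critiques against the principle first, then picks:
constitutional_choice = next(
    c for c in candidates if not violates_constitution(c)
)

print(rlhf_choice)            # the flattering answer wins the ratings
print(constitutional_choice)  # the principled answer survives the critique
```

The April 2025 sycophancy incident is exactly what the first loop can drift toward when rater approval becomes the only signal: flattery scores well, so flattery wins.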
✅ Quick Check: What’s the key difference between RLHF and Constitutional AI? (RLHF optimizes for what humans rate as “good.” Constitutional AI optimizes against a written set of principles.)
Speed: Who Responds Faster?
For everyday use, both feel instant. But if you’re building something with the API or care about time-to-first-token:
- ChatGPT: ~45ms average response time, 0.56s time-to-first-token
- Claude: ~50ms average response time, 1.23s time-to-first-token
ChatGPT is roughly 2x faster to start generating output. For API applications where latency matters, that gap adds up. For normal chat conversations, you won’t notice.
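Time-to-first-token is easy to measure yourself. Here's a minimal sketch, simulated with a generator so it runs standalone; the same timing wrapper would work around any SDK's streaming iterator:

```python
# Measure time-to-first-token (TTFT) for any streaming response.
# Simulated with a slow generator here; the same wrapper fits around
# a real SDK's streaming iterator.

import time
from typing import Iterable, Iterator, Tuple

def time_to_first_token(stream: Iterable[str]) -> Tuple[float, str]:
    """Return (seconds until the first chunk, the full response text)."""
    start = time.perf_counter()
    ttft = None
    chunks = []
    for chunk in stream:
        if ttft is None:
            ttft = time.perf_counter() - start  # first chunk arrived
        chunks.append(chunk)
    return ttft, "".join(chunks)

def fake_stream(delay: float) -> Iterator[str]:
    """Stand-in for a model's streaming output."""
    time.sleep(delay)  # simulated wait before the first token
    for word in ["Hello", " ", "world"]:
        yield word

ttft, text = time_to_first_token(fake_stream(delay=0.05))
print(f"TTFT: {ttft:.3f}s, response: {text!r}")
```

Benchmarking against your own prompts and region matters more than published averages, since network latency and prompt length both move the number.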
Key Takeaways
- The model you’re using matters more than the brand — GPT-4o and GPT-5.2 are practically different products
- Claude Opus 4.6 leads Chatbot Arena (#1 rated), while GPT-5.2 leads math/science benchmarks
- Claude’s context window (200K-1M tokens) is significantly larger than ChatGPT’s (128K)
- Their training philosophies produce different personalities: ChatGPT is more flexible, Claude is more honest
- Speed difference is minimal for chat, but ChatGPT has a 2x edge in API time-to-first-token
Up Next
Models are interesting, but money is real. In Lesson 3, we’ll break down exactly what you get at each price point — free, $20/month, and $200/month — so you can figure out which plan actually gives you the most value.