Gemini API Pricing: Free Tier Limits + April 2026 Changes

Gemini API pricing explained — free tier limits, paid tier costs per million tokens, April 2026 billing changes, and how it compares to OpenAI and Claude.

Google’s Gemini API just got more expensive to use for free — and cheaper to use at scale.

On April 1, 2026, Google enforced mandatory spending caps across all billing tiers, restricted Pro models behind a paywall for free users, and introduced prepaid billing for new accounts. If you’ve been running on the free tier, your rate limits probably got cut. And if you haven’t set up billing yet, you might be locked out of models you were using last month.

But here’s the thing most coverage misses: even with these changes, Gemini is still the cheapest major AI API for most use cases. Flash-Lite costs $0.10 per million input tokens. That’s 30x cheaper than Claude Sonnet and 25x cheaper than OpenAI’s GPT-5.4.

The pricing just got more complicated. Let’s sort it out.

What Is the Gemini API?

If you’re not a developer, here’s the short version: an API (Application Programming Interface) is a way for software to talk to AI. When you use ChatGPT through its website, that’s the consumer product. When a company builds AI into their own app — an email assistant, a document analyzer, a chatbot on their website — they use the API.

The Gemini API lets developers build with Google’s AI models. It’s how apps like custom chatbots, content tools, and data analysis pipelines access Gemini’s brains.

For developers reading this: you already know what an API is. Skip to the pricing table.

What Changed on April 1, 2026

Three big shifts:

1. Mandatory spending caps by tier. Google now enforces maximum monthly spend at the billing account level. You can’t exceed your tier’s cap even if you want to.

| Tier | Monthly Cap | How to Reach |
|------|-------------|--------------|
| Tier 1 | $250/mo | Enable billing (default) |
| Tier 2 | $2,000/mo | $100 spend + 3 days (was $250 + 30 days) |
| Tier 3 | $20,000-$100,000+/mo | Contact Google |

Hit your cap? Your API pauses until next month — or until you upgrade tiers. No surprise bills. That’s actually a good change for developers who’ve been burned by runaway API costs.
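The cap behavior is easy to model. A minimal sketch, using the tier caps from the table above (Tier 3 is negotiated, so it's omitted):

```python
# Sketch of the spend-cap behavior described above: requests pause once the
# tier's monthly cap is reached. Caps are taken from the article's tier table.
TIER_CAPS = {1: 250.0, 2: 2000.0}  # USD per month; Tier 3 is negotiated

def remaining_budget(tier: int, month_to_date_spend: float) -> float:
    """USD left before the API pauses for the month (never negative)."""
    return max(0.0, TIER_CAPS[tier] - month_to_date_spend)

print(remaining_budget(1, 180.0))  # $70 of headroom left on Tier 1
print(remaining_budget(1, 300.0))  # over the cap: 0.0, API paused
```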

2. Free tier restricted to Flash models only. Before April, free-tier users could access Gemini Pro. Now, Pro requires either a paid API key or a Google AI Pro/Ultra subscription. The free tier still works — but only with Flash and Flash-Lite models.

3. Prepaid billing for new users. Starting March 23, new AI Studio accounts may be required to use prepaid billing — buy credits first, use them as you go. Existing accounts aren’t affected (yet).

The Free Tier: What You Still Get

The free tier isn’t dead. But it’s smaller than it used to be. Google cut free quotas by 50-80% back in December 2025, and the April changes removed Pro model access entirely.

Here’s what free-tier developers get right now:

| Model | Requests/Min | Requests/Day | Tokens/Min |
|-------|--------------|--------------|------------|
| Gemini 2.5 Flash-Lite | 15 | 1,000 | 250,000 |
| Gemini 2.5 Flash | 10 | 250 | 250,000 |
| Gemini 2.5 Pro (now requires a paid key) | 5 | 100 | 250,000 |

All three models get full access to the 1-million-token context window — that hasn’t changed.

Is the free tier enough? For prototyping, personal projects, and low-traffic apps, yes. Flash-Lite at 1,000 requests/day handles a surprisingly useful workload. But for anything production-grade or user-facing with more than a handful of daily users, you’ll need to enable billing.
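A quick back-of-envelope check can tell you which side of that line you're on. This sketch uses the free-tier limits from the table above; the workload numbers in the calls are hypothetical:

```python
# Does a given workload fit inside the free-tier caps? Limits below are
# from the article's free-tier table (requests/min, requests/day, tokens/min).
FREE_LIMITS = {
    "gemini-2.5-flash-lite": {"rpm": 15, "rpd": 1000, "tpm": 250_000},
    "gemini-2.5-flash":      {"rpm": 10, "rpd": 250,  "tpm": 250_000},
}

def fits_free_tier(model: str, requests_per_day: int, peak_rpm: int) -> bool:
    """True if the workload stays inside the free-tier rate limits."""
    limits = FREE_LIMITS[model]
    return requests_per_day <= limits["rpd"] and peak_rpm <= limits["rpm"]

print(fits_free_tier("gemini-2.5-flash-lite", 800, 10))  # fits
print(fits_free_tier("gemini-2.5-flash", 800, 10))       # exceeds 250/day
```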

Here’s the full pricing table for Gemini API as of April 2026, per million tokens:

| Model | Input | Output | Input (>200K context) | Output (>200K context) |
|-------|-------|--------|-----------------------|------------------------|
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | — | — |
| Gemini 2.5 Flash | $0.30 | $2.50 | — | — |
| Gemini 3 Flash (preview) | $0.50 | $3.00 | — | — |
| Gemini 2.5 Pro | $1.25 | $10.00 | $2.50 | $15.00 |
| Gemini 3.1 Pro | $2.00 | $12.00 | $4.00 | $24.00 |

Cost-saving features:

  • Batch API: 50% off all prices (for non-real-time jobs)
  • Context caching: Up to 90% savings on repeated context
  • Audio input: Priced higher than text — $1.00/MTok vs $0.30 for Flash
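To see what these prices mean for a concrete workload, here's a rough estimator built on the table above. It applies the 50% batch discount but ignores context caching and the >200K long-context rates; the example workload at the bottom is hypothetical:

```python
# Rough cost estimator using the per-million-token prices in the table above.
# Prices are (input, output) in USD per 1M tokens; batch jobs get 50% off.
PRICES = {
    "gemini-2.5-flash-lite": (0.10, 0.40),
    "gemini-2.5-flash":      (0.30, 2.50),
    "gemini-2.5-pro":        (1.25, 10.00),
    "gemini-3.1-pro":        (2.00, 12.00),
}

def estimate_cost(model, input_tokens, output_tokens, batch=False):
    """Estimated USD for one call (ignores caching and long-context tiers)."""
    in_price, out_price = PRICES[model]
    cost = (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price
    return cost * 0.5 if batch else cost

# Example: 10,000 Flash-Lite calls at ~500 input / 200 output tokens each
total = 10_000 * estimate_cost("gemini-2.5-flash-lite", 500, 200)
print(f"${total:.2f}")  # prints $1.30
```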

How It Compares to OpenAI and Claude

This is the table most developers actually want. How does Gemini stack up per million tokens?

| Model | Input | Output | Best For |
|-------|-------|--------|----------|
| Gemini Flash-Lite | $0.10 | $0.40 | Cheapest option anywhere |
| Gemini 2.5 Flash | $0.30 | $2.50 | Speed + low cost |
| Gemini 3.1 Pro | $2.00 | $12.00 | Google’s flagship |
| OpenAI GPT-5.4 | $2.50 | — | OpenAI’s latest |
| Claude Sonnet 4.6 | $3.00 | $15.00 | Strong reasoning |
| Claude Opus 4.6 | $5.00 | $25.00 | Best reasoning, premium price |

On paper, Gemini wins on price across every tier. Flash-Lite at $0.10 per million input tokens is the cheapest production-ready API from any major provider.

But there’s a catch. Recent research found that listed prices can be misleading. When you account for how many tokens each model actually uses to complete a task, the real cost sometimes flips. In 21.8% of model comparisons, the “cheaper” model actually costs more in practice. The study found that Gemini 3 Flash — listed 78% cheaper than GPT-5.2 — had an actual cost 22% higher when measured by task completion. And Claude Opus 4.6, listed at 2x the price of Gemini 3.1 Pro, actually cost 35% less because it completed tasks with fewer tokens.

The lesson: don’t just compare per-token prices. Test your specific workload with each model and measure total cost per task.
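That measurement is simple to set up: price each model's *completed* task by the tokens it actually consumed, not by its sticker price. A minimal sketch with illustrative (not measured) token counts:

```python
# Listed price vs. realized price: what matters is tokens consumed per
# completed task. The token counts below are illustrative, not benchmarks.
def cost_per_task(in_price, out_price, in_tokens, out_tokens):
    """USD to complete one task, given per-1M-token prices and actual usage."""
    return (in_tokens * in_price + out_tokens * out_price) / 1e6

# A "cheap" model that burns more tokens per task can cost more overall:
cheap_model  = cost_per_task(0.30, 2.50,  in_tokens=40_000, out_tokens=12_000)
pricey_model = cost_per_task(3.00, 15.00, in_tokens=3_000,  out_tokens=1_500)
print(cheap_model > pricey_model)  # prints True: the cheap model lost
```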

Which Model Should You Use?

Here’s a quick decision framework:

Gemini 2.5 Flash-Lite — For high-volume, cost-sensitive tasks where you need speed and low cost. Think classification, simple extraction, boilerplate generation. At $0.10/MTok input, you can run thousands of requests for pennies.

Gemini 2.5 Flash — The workhorse. Good quality, fast, and cheap enough for most production apps. If you’re building a chatbot, content tool, or summarizer, start here.

Gemini 3.1 Pro — For complex reasoning, long-context analysis, and tasks where quality matters more than cost. Use it when Flash isn’t accurate enough.

When to use OpenAI or Claude instead: If your task is primarily code generation (Claude excels here), complex multi-step reasoning (Claude Opus), or you need specific tool integrations (OpenAI’s ecosystem). Gemini wins on price and context window size. The others win on specific quality dimensions.
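In code, this framework reduces to a small routing function. The task labels here are hypothetical placeholders; you'd tune them against your own workload:

```python
# Minimal routing sketch following the decision framework above.
# Task labels are hypothetical; replace with your own categories.
def pick_model(task: str) -> str:
    simple = {"classification", "extraction", "boilerplate"}
    heavy  = {"complex-reasoning", "long-context-analysis"}
    if task in simple:
        return "gemini-2.5-flash-lite"   # high volume, lowest cost
    if task in heavy:
        return "gemini-3.1-pro"          # quality over cost
    return "gemini-2.5-flash"            # default workhorse

print(pick_model("classification"))  # prints gemini-2.5-flash-lite
```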

How to Set Up Billing

If you’re still on the free tier and want to unlock higher rate limits and Pro models:

  1. Go to Google AI Studio
  2. Click your profile → Billing
  3. Choose Prepaid or Pay-as-you-go (new accounts may only see Prepaid)
  4. Add a payment method
  5. You’ll start in Tier 1 ($250/mo cap)
  6. After $100 spend and 3 days, you’ll auto-upgrade to Tier 2 ($2,000/mo cap)

Check your current tier anytime at: aistudio.google.com/app/apikeys

What’s Coming Next

June 1, 2026: Gemini 2.0 Flash and 2.0 Flash-Lite get deprecated. If you’re still using either, migrate to 2.5 Flash or 3 Flash before then.

Google is also expected to keep tightening the free tier while expanding paid features. The trend is clear: the free lunch is shrinking, but the paid menu keeps getting cheaper.

The Bottom Line

Gemini API is still the most affordable major AI API in 2026 — especially at the Flash-Lite and Flash tiers. But the April 1 changes mean you can’t coast on free forever. If you’re building anything beyond a prototype, enable billing. The spend caps mean you won’t get surprised by a runaway bill, and the rate limits on paid tiers are dramatically higher than free.

And before you lock into any single provider based on listed token prices, test your actual workload. The cheapest API on paper isn’t always the cheapest in practice.

