Code With Claude SF Tomorrow: The Live-Blog Playbook + 5 Specific Launches To Watch For In The Keynote

Code with Claude SF kicks off Tuesday morning, May 6. If you’re planning to watch the livestream, the hours between now and the keynote are when you set up your live-blog rig, decide which sessions actually matter, and pre-load the production-readiness questions you’ll want to answer before your team Slack starts pinging Tuesday afternoon.

This is the operational playbook. Not predictions, not hot takes — what to actually do tomorrow morning, where to point your tooling, and which five product moves are most likely to land in the first 90 minutes of the keynote.

Source: Code with Claude San Francisco — Anthropic

The big picture: what’s actually on the agenda

Per Anthropic’s published agenda and the Code with Claude SF + London + Tokyo announcement, Tuesday’s San Francisco event runs through:

Keynote opener — 60 to 90 minutes. Anthropic’s pattern across the last three developer events: the first 30 minutes is the product news. The remaining hour is platform demos and partner cameos.
Coding-with-Claude sessions — three parallel tracks covering Claude Code workflows, agent skills, and MCP-server patterns.
Agent skills demo — most-watched track for anyone running Claude Code in production. Historically the venue where new model variants get their hands-on showcase.
MCP-server fireside — quieter session, but the place new first-party connectors and partner integrations get announced.
Office hours with Anthropic engineers — 1:1 slots if you’re in person; not livestreamed.

If you’re skipping the stream, all five sessions get recapped on the Anthropic blog within 48 hours. The official YouTube upload typically follows within five to seven days.

The London follow-on is May 19. Tokyo’s date is on the announcement page but had not been confirmed at time of writing.

The 5 specific launches the smart money is betting on

Three of these are pure prediction. Two are confirmed-or-extremely-likely based on adjacent signals from the last 30 days. Treat the prediction list as a watchlist for your live-blog, not a guarantee.

1. Sonnet 4.8 GA (likely)

This is the one with the most signal. Anthropic’s recent launch pattern: Sonnet 4.5 at a developer event last September, then Opus 4.7 in mid-April. Sonnet 4.8 release-date trackers have been pointing to a May 2 to May 15 window for the last two weeks. Tuesday’s keynote falls on day 5 of that window (counting May 2 as day 1) and is the obvious launch venue.

What to watch for in the keynote: a benchmark slide comparing 4.8 against 4.6 on coding-agent tasks, agent-tool-use accuracy, and long-horizon task completion. If pricing slides appear, expect $3/$15 per million tokens to hold (the standard tier since 4.5).

What to do if it lands: don’t immediately swap your production traffic. New Sonnet releases usually have a 24 to 48 hour stability window where edge cases surface. Pin your existing model in production for the first three days, run your evals against 4.8 in parallel, then ramp.
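The pin-then-ramp advice above can be sketched as a small routing helper. This is a minimal illustration, not a recommended rollout policy: the model identifiers are hypothetical placeholders, and the ramp percentages after day 3 are assumptions chosen only to show the shape of a schedule.

```python
import hashlib

# Hypothetical model identifiers, used only for illustration.
PINNED_MODEL = "sonnet-current-pinned"
CANDIDATE_MODEL = "sonnet-candidate"

# Share of traffic sent to the candidate on each day after release.
# Days 0-2 stay at 0.0 to mirror the three-day pin described above;
# the later steps are assumptions, not guidance from Anthropic.
RAMP_SCHEDULE = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.05, 4: 0.25, 5: 0.50, 6: 1.0}


def candidate_share(days_since_release: int, evals_passed: bool) -> float:
    """Return the fraction of traffic the candidate model may receive."""
    if not evals_passed or days_since_release < 0:
        return 0.0
    last_day = max(RAMP_SCHEDULE)
    return RAMP_SCHEDULE[min(days_since_release, last_day)]


def choose_model(request_id: str, days_since_release: int, evals_passed: bool) -> str:
    """Deterministically bucket a request so retries hit the same model."""
    share = candidate_share(days_since_release, evals_passed)
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Four bytes give an exact float in [0, 1), so a share of 1.0 routes everything.
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return CANDIDATE_MODEL if bucket < share else PINNED_MODEL
```

Hashing the request ID rather than calling a random number generator keeps a given request on the same model across retries, which makes edge-case reports easier to reproduce during the ramp.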

2. KAIROS persistent agents demo (medium likelihood)

Persistent-agent tooling has been showing up in Claude Code’s npm package metadata for the last two weeks under the codename “KAIROS.” If you saw the chatter on the Anthropic engineering Discord in late April, this is what they were talking about. A demo at the conference would put it on the public roadmap; whether it ships as GA or stays in a closed preview is the open question.

What to watch for: a session demo of an agent that maintains state across multiple sessions and rejoins a long-running task in a new conversation. If you see the phrase “session checkpointing” or “task resumption,” that’s KAIROS.

3. Mythos / Glasswing partner expansion announcement (likely public statement)

Last week, news broke that the Trump administration is opposing Anthropic’s plan to expand Project Glasswing from the current list (12 launch partners plus 40 extension partners) to roughly 70 partner orgs. With Mythos now visible at the $25/$125 per million tokens pricing tier, Anthropic will almost certainly address this on stage, whether by confirming the expansion path, naming new partners, or pivoting the framing.

What to watch for: any mention of “critical infrastructure,” “partner roster,” or specific named orgs. If a new launch partner gets announced, that’s news. If the framing shifts to “Claude Security beta as the standard tier and Glasswing as the specialized tier,” that’s also news — it changes the buyer decision for non-launch-partner orgs.

4. Cowork mode GA + Skills marketplace expansion (high likelihood)

Cowork mode — the multi-agent coordination feature — has been in beta for months. A developer conference is the standard venue for moving features from beta to GA. Adjacent: the Skills marketplace, where third-party agents and prompt libraries get distributed inside Claude Code.

What to watch for: language like “available today” or “rolling out this week” attached to Cowork. For Skills, look for partner announcements (Slack, Linear, Notion all have plausible integrations) and any move from “free preview” to a paid tier.
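The watch phrases quoted so far lend themselves to a simple transcript scanner. The sketch below assumes you are pasting caption or transcript lines into a script by hand; the grouping of phrases under each launch name is this sketch's own, and a hit is only a prompt to look up, since a phrase like “available today” could attach to any product.

```python
# Phrases quoted in the watchlist above, grouped by the launch they hint at.
WATCHLIST = {
    "KAIROS": ["session checkpointing", "task resumption"],
    "Glasswing": ["critical infrastructure", "partner roster"],
    "Cowork": ["available today", "rolling out this week"],
}


def match_watchlist(line: str) -> list[tuple[str, str]]:
    """Return (launch, phrase) pairs whose phrase appears in the line."""
    lowered = line.lower()
    return [
        (launch, phrase)
        for launch, phrases in WATCHLIST.items()
        for phrase in phrases
        if phrase in lowered
    ]


if __name__ == "__main__":
    sample = "Cowork is available today on all plans."
    for launch, phrase in match_watchlist(sample):
        print(f"[{launch}] heard: {phrase!r}")
```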

5. Claude Code 2.2.x feature drop (highly likely)

Claude Code shipped 2.1.126 on May 1 with the Project Purge / sandboxing update. The pace of point releases makes a 2.2 minor bump at the conference highly likely. What it contains is the open question — best guesses based on the recent issue tracker: improved session resumption, better long-context handling, an updated /skills command surface, or an MCP improvement.

What to watch for: the version number in the release notes Anthropic drops alongside the keynote. Match it to anything you see demoed in the agent-skills track.
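Checking whether a release is a minor bump is easy to get wrong with string comparison, because "2.1.126" sorts before "2.1.13" as text. A minimal sketch, assuming plain MAJOR.MINOR.PATCH version strings; the 2.2.0 value in the usage lines is a hypothetical future version, not an announced one.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' (optionally prefixed with 'v') into integers."""
    major, minor, patch = text.strip().lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def bump_kind(previous: str, current: str) -> str:
    """Classify the change between two versions as major, minor, patch, or none."""
    old, new = parse_version(previous), parse_version(current)
    if new <= old:
        return "none"
    if new[0] != old[0]:
        return "major"
    if new[1] != old[1]:
        return "minor"
    return "patch"


if __name__ == "__main__":
    # 2.1.126 is the release named above; 2.2.0 is a hypothetical follow-up.
    print(bump_kind("2.1.126", "2.2.0"))
    print(bump_kind("2.1.126", "2.1.127"))
```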

The minute-by-minute live-blog playbook

If you’re running a live-thread on X / Bluesky / LinkedIn / your team’s internal Slack, here’s the structure that’s worked at the last three developer events:

T-30 minutes: Post the agenda. Drop the livestream link. Note the timezone. Pin the thread.

Keynote minute 0 to 30 — product news window: Highest-velocity period. Plan for one screenshot every 60 to 90 seconds. Pre-write a “headlines so far” comment that you update every five minutes. Don’t try to write essays in real time — quote-tweet your own quick captures and elaborate later.

Keynote minute 30 to 90 — partner + demo window: Slower pace. Three to five posts per ten minutes is plenty. This is where you fact-check your own headlines against what was actually said.

Session breaks: Quote-tweet the official Anthropic blog post the moment it goes live. The official source is faster to load than your liveblog, so link to it; don’t compete with it.

Coding-with-Claude sessions: If you’re going to live-blog these, pick one session and go deep. Trying to cover three parallel tracks at once produces shallow coverage.

End of day: A 5- to 10-bullet “what actually happened today” recap, posted to your main channel. This is the post that gets shared by people who skipped the stream.
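The keynote phases above can be turned into a concrete run sheet once you know the start time. The sketch below is illustrative only: the 09:00 Pacific Daylight Time start and the year are assumptions (this playbook gives neither), and the 90-minute mark assumes the keynote runs its full length.

```python
from datetime import datetime, timedelta, timezone

# Offsets in minutes relative to keynote start, taken from the phases above.
PHASES = [
    (-30, "Post agenda, livestream link, and timezone; pin the thread"),
    (0, "Product news window: one screenshot every 60 to 90 seconds"),
    (30, "Partner and demo window: fact-check earlier headlines"),
    (90, "Session break: link the official Anthropic blog post once it is live"),
]


def build_schedule(keynote_start: datetime) -> list[tuple[datetime, str]]:
    """Return (UTC time, task) pairs for each live-blog phase."""
    if keynote_start.tzinfo is None:
        raise ValueError("keynote_start must be timezone-aware")
    start_utc = keynote_start.astimezone(timezone.utc)
    return [(start_utc + timedelta(minutes=offset), task) for offset, task in PHASES]


# Hypothetical start: 09:00 Pacific Daylight Time (UTC-7). The real start time
# and year are not stated in this playbook, so replace both before use.
PDT = timezone(timedelta(hours=-7))
for when, task in build_schedule(datetime(2025, 5, 6, 9, 0, tzinfo=PDT)):
    print(when.strftime("%H:%M UTC"), "-", task)
```

Printing in UTC sidesteps the "note the timezone" problem for a distributed audience; readers can convert once instead of guessing which local time the thread means.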

The tooling stack most people use: a screenshot hotkey configured for fast clipboard capture (on macOS, Cmd+Ctrl+Shift+4 copies a selected region to the clipboard, while Cmd+Shift+5 opens the screenshot toolbar), a separate browser window with the Anthropic blog open and ready to refresh, and a notes file where you draft posts before publishing. Don’t trust live composition during high-news moments: pre-draft, then publish.
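One way to pre-draft the “headlines so far” comment is to keep the captures in a list and render the post from it on each update. A minimal sketch: the function name and the 280-character default are assumptions, so set the limit to whatever your platform actually allows.

```python
def render_headlines(headlines: list[str], minutes_in: int, limit: int = 280) -> str:
    """Build the 'headlines so far' comment, dropping the oldest items to fit."""
    header = f"Headlines so far ({minutes_in} min in):"
    kept = list(headlines)
    while kept:
        body = "\n".join(f"- {h}" for h in kept)
        text = f"{header}\n{body}"
        if len(text) <= limit:
            return text
        kept.pop(0)  # drop the oldest headline first
    return header


if __name__ == "__main__":
    # Placeholder captures; replace with what was actually said on stage.
    print(render_headlines(["First capture", "Second capture"], minutes_in=10))
```

Dropping the oldest items first keeps the comment current during the fast first half hour; the full list still lives in your notes file for the end-of-day recap.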

Image: abstract terminal with cascading amber code lines on a dark background. Caption: The keynote runtime: 90 minutes, five launches, your screenshot rig in front of you. Illustration generated for FindSkill.

What this means for you, by role

If you’re a solo developer or freelance engineer: Watch the keynote. Skip the parallel sessions. Decide which one of the five launches matters most to your stack and read about that one in depth on Wednesday.

If you’re an engineering manager or tech lead: Don’t watch live. Wait for the Anthropic blog recap Tuesday evening, then send your team a “what changed for us” Slack message Wednesday morning. The conference is Anthropic talking to developers — the digest is what your team needs.

If you’re a CTO or VP Engineering at a Sonnet-heavy shop: The Sonnet 4.8 production-readiness audit is the actionable piece. Run your existing evals against it as soon as it’s available, pin your existing model for production traffic until eval results are in, and budget for the typical 2 to 4 weeks of stabilization before a full ramp.

If you’re an IT-procurement or platform-engineering lead: Watch the MCP fireside specifically. New first-party connectors and partner integrations from Tuesday will show up in your stack within 90 days — knowing them on day one helps your roadmap.

If you’re at a regulated or critical-infrastructure org watching for Glasswing news: Whatever Anthropic says about partner expansion sets the procurement frame for the rest of Q2. If new partners are named, that’s a signal about who else has Glasswing-tier access. If the framing shifts, that changes the question for non-partners.
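For the Sonnet-heavy case above, “pin until eval results are in” implies some explicit gate. The sketch below is one simple way to express it, assuming each eval case reduces to pass or fail; the function names and the one-point regression tolerance are assumptions, and real eval suites usually need per-category thresholds as well.

```python
def pass_rate(results: list[bool]) -> float:
    """Fraction of eval cases that passed; 0.0 for an empty run."""
    return sum(results) / len(results) if results else 0.0


def candidate_clears_gate(
    pinned_results: list[bool],
    candidate_results: list[bool],
    max_regression: float = 0.01,
) -> bool:
    """True when the candidate's pass rate is within max_regression of the pinned model's."""
    if not pinned_results or not candidate_results:
        return False  # no evidence either way, so keep the pin
    return pass_rate(candidate_results) >= pass_rate(pinned_results) - max_regression


if __name__ == "__main__":
    pinned = [True] * 95 + [False] * 5
    candidate = [True] * 96 + [False] * 4
    print("ramp" if candidate_clears_gate(pinned, candidate) else "keep pin")
```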

What watching the livestream can’t do

It won’t make you a Sonnet 4.8 production expert by Wednesday. Real production-readiness takes a week of evals minimum. Don’t rush.
It won’t substitute for hands-on with Cowork or KAIROS if those ship. Watching a demo is not the same as building against a feature. Block time the following week.
It won’t give you the partner-side view of Mythos. If you’re inside a Glasswing-eligible org, Anthropic’s public framing on stage is the headline; the actual buying conversation happens with your account team afterward.
Live-blogging won’t scale across all five sessions. Pick one, go deep, link to others.

The bottom line

Tuesday morning, May 6, the first 30 minutes of the keynote is when most of the product news lands. Be in front of your screen, have your screenshot tool ready, have the Anthropic blog open in a second tab, and pre-write your “headlines so far” template. The five launches above are the watchlist: three are pure prediction, two are extremely likely. Whichever ones land, your Wednesday-morning team Slack message writes itself if you’ve prepped in advance.

If you want to go deeper on getting Claude Code into a real production workflow before Tuesday — the agent-skills patterns, the MCP setup, and the version-pinning discipline — our Claude Code Mastery course is the long version of this playbook. Free to start, Pro for the full course.