Claude Opus 4.8 lands: better at code, more honest and with parallel sub-agents

Thu, 28 May 2026 00:00:00 +0200

Anthropic unveiled Claude Opus 4.8 this afternoon, its most capable production model. It arrives less than two months after Opus 4.7 and, most strikingly, without a price bump: it costs the same as the previous version.

The improvements in numbers

The benchmarks Anthropic published compare directly against Opus 4.7:

Agentic coding: 64.3% → 69.2%. Long-running tasks where the model chains tool calls, reads files, runs tests and self-corrects.
Multidisciplinary reasoning with tools: 54.7% → 57.9%. Problems that require jumping between domains and drawing on context from external tools.
Knowledge work: 1753 → 1890. Anthropic’s internal metric for analysis, writing and synthesis tasks.
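To put the three jumps on a common footing, here is a minimal sketch that computes the absolute and relative gain for each benchmark. The scores are copied from the list above; the `gain` helper and the dictionary layout are only illustrative, not anything Anthropic published.

```python
# Scores copied from the list above: (Opus 4.7, Opus 4.8).
scores = {
    "Agentic coding (%)": (64.3, 69.2),
    "Multidisciplinary reasoning with tools (%)": (54.7, 57.9),
    "Knowledge work (score)": (1753, 1890),
}


def gain(old, new):
    """Return (absolute gain, relative gain in percent), each rounded to one decimal."""
    absolute = round(new - old, 1)
    relative = round((new - old) / old * 100, 1)
    return absolute, relative


for name, (old, new) in scores.items():
    absolute, relative = gain(old, new)
    print(f"{name}: +{absolute} absolute, +{relative}% relative")
```

For the two percentage benchmarks the absolute gain is in percentage points (4.9 and 3.2), which is not the same thing as the relative improvement over the Opus 4.7 score.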

According to the data Anthropic shared, Opus 4.8 beats GPT-5.5 and Gemini 3.1 Pro on several of these benchmarks. On the internal Super-Agent benchmark, it’s the only model that completes every case end to end.

An AI-generated podcast: how we built it and what we've learned

Thu, 21 May 2026 00:00:00 +0200

A few days ago I published the first episodes of El podcast de Sergio and El informativo on Apple Podcasts. Neither was recorded by a human.

The entire process (finding the topic, writing the script, synthesising the voice, assembling the audio and publishing it) is handled by an AI agent. Here’s how it works and what we’ve learned along the way.
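The five stages listed above can be pictured as a simple sequential pipeline. What follows is only a sketch under loud assumptions: every function name, the `Episode` class and the placeholder values are hypothetical, and each stage is a stub standing in for whatever model, voice engine or publishing service the real agent calls.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    """State handed from one stage to the next (hypothetical structure)."""
    topic: str = ""
    script: str = ""
    voice_track: str = ""
    audio_file: str = ""
    published: bool = False
    log: list = field(default_factory=list)


# Hypothetical stubs: each one only fills in a placeholder value.
def find_topic(ep):
    ep.topic = "placeholder topic"


def write_script(ep):
    ep.script = f"Script about {ep.topic}"


def synthesise_voice(ep):
    ep.voice_track = f"voice({ep.script})"


def assemble_audio(ep):
    ep.audio_file = f"episode-from-{len(ep.voice_track)}-chars.mp3"


def publish(ep):
    ep.published = True


# Stage order follows the description in the text.
STAGES = [find_topic, write_script, synthesise_voice, assemble_audio, publish]


def run_pipeline():
    """Run every stage in order, recording each stage name as it completes."""
    ep = Episode()
    for stage in STAGES:
        stage(ep)
        ep.log.append(stage.__name__)
    return ep


episode = run_pipeline()
print(episode.log)
```

Keeping the stages in a plain list makes it easy to insert an extra step, such as a human review of the script, without touching the other stages.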

Why

Not because it’s the easiest way to make a podcast. I did it because I wanted to explore how far synthetic voice quality can go in Spanish, and because building an automated production pipeline seemed like an interesting technical problem. You choose the topic and review the script; the agent handles the rest.

Claude on Sergio Comerón Blog
