AI on Sergio Comerón Blog

Claude Opus 4.8 lands: better at code, more honest and with sub-agents in parallel

Thu, 28 May 2026 00:00:00 +0200

Anthropic unveiled Claude Opus 4.8 this afternoon, its most capable production model. It arrives less than two months after Opus 4.7 and, most strikingly, without a price bump: it costs the same as the previous version.

The improvements in numbers

The benchmarks Anthropic published compare directly against Opus 4.7:

Agentic coding: 64.3% → 69.2%. Long-running tasks where the model chains tool calls, reads files, runs tests and self-corrects.
Multidisciplinary reasoning with tools: 54.7% → 57.9%. Problems that require jumping between domains and drawing on context from external tools.
Knowledge work: 1753 → 1890. Anthropic’s internal metric for analysis, writing and synthesis tasks.
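
The three figures above are not on the same scale: the first two are percentages, while the knowledge-work number is a score on Anthropic's internal metric. As a rough way to compare them, the sketch below computes the absolute and relative gain for each. The names `scores` and `gains` are illustrative only and the numbers are simply those quoted above.

```python
# Scores quoted above as (Opus 4.7, Opus 4.8).
# The first two are percentages; the third is a score on an internal metric.
scores = {
    "Agentic coding": (64.3, 69.2),
    "Multidisciplinary reasoning with tools": (54.7, 57.9),
    "Knowledge work": (1753, 1890),
}


def gains(old, new):
    """Return (absolute gain, relative gain in percent), rounded to one decimal."""
    return round(new - old, 1), round((new - old) / old * 100, 1)


for name, (old, new) in scores.items():
    absolute, relative = gains(old, new)
    print(f"{name}: {absolute:+} absolute, {relative:+}% relative")
```

In relative terms the three gains land in a similar range, roughly 6% to 8% over Opus 4.7, even though the absolute numbers look very different.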

According to the data Anthropic shared, Opus 4.8 beats GPT-5.5 and Gemini 3.1 Pro on several of these benchmarks. On the internal Super-Agent benchmark, it’s the only model that completes every case end to end.

How I code in 2026: my stack in the age of agents

Wed, 27 May 2026 00:00:00 +0200

I’ve been turning over an uncomfortable thought for weeks: I open an editor to write code less and less. I direct it, I review it, I approve it, but I type less every day. And yet I’m publishing more than ever: two podcasts, a personal website with nine tools, server monitoring, my own Jitsi instance, an Apple fan site, more Moodle plugins than ever, this blog. Something has shifted in how I program, and I think it’s worth writing down.

Gemini 3.5 Flash: speed over intelligence for AI agents

Fri, 22 May 2026 00:00:00 +0200

Google presented Gemini 3.5 Flash at I/O on May 19th. Fast, cheap, optimised for agents. Good news for developers. But the most interesting thing is not what it does for Google — it’s who else is going to use it.

Apple has a multi-year agreement with Google to integrate Gemini models into Apple Intelligence. WWDC 2026 is on June 8th. All signs point to the new Siri — the one they’ve been promising for years and never quite delivering — running, at least in part, on this very model.

An AI-generated podcast: how we built it and what we've learned

Thu, 21 May 2026 00:00:00 +0200

A few days ago I published the first episodes of El podcast de Sergio and El informativo on Apple Podcasts. Neither was recorded by a human.

The entire process (finding the topic, writing the script, synthesising the voice, assembling the audio and publishing it) is handled by an AI agent. Here’s how it works and what we’ve learned along the way.

Why

Not because it’s the most convenient way to make a podcast. I did it because I wanted to explore how far synthetic voice quality can go in Spanish, and because building an automated production pipeline seemed like an interesting technical problem. You choose the topic and review the script; the agent handles the rest.