Vision Learns to Think, Codex Goes Everywhere, and Open Weights Claim the Coding Crown

Hi there,

Today's shift is less "one big model" and more "models getting specialised." OpenAI just turned image generation into a reasoning task with GPT-Image 2 and expanded Codex into almost every surface a developer touches. Meanwhile, the open-weight camp quietly took the SWE-Bench Pro crown with Z.ai's GLM-5.1, Meta's Llama 5 landed with a 5-million-token context and explicit "System-2" reasoning, and Oracle signed a deal for up to 2.8 GW of Bloom Energy fuel cells to power the whole thing.

The pattern underneath: the frontier isn't a single leaderboard anymore. It's fragmenting along capability lines, and open weights now have a real seat at the table.

🔥 Featured Post

Vision Learns to Think, Codex Goes Everywhere, and Open Weights Claim the Coding Crown

GPT-Image 2 ships with native reasoning, 2K output, and multi-image consistency — OpenAI calls it a "visual thought partner" that verifies its own outputs and produces up to eight coherent images from a single prompt.
OpenAI also announced "Codex for (almost) everything" on April 21, pushing the Codex agent into more surfaces across the developer stack.
GLM-5.1 from Z.ai (formerly Zhipu), a 744B MoE with 40B active parameters, 200K context, and an MIT license, scored 58.4 on SWE-Bench Pro, topping GPT-5.4 (57.7) and Claude Opus 4.6 (57.3). That makes it the first open-weight model to lead this coding-agent benchmark.
Meta's Llama 5, released earlier in April, explicitly markets "System-2" deliberate reasoning and a 5-million-token context window, while a new proprietary Meta model, Muse Spark, launched alongside it — a notable crack in Meta's open-source-only posture.
Oracle signed a deal for up to 2.8 GW of Bloom Energy fuel cells for AI data centers (1.2 GW already contracted). The deal lands as US utility capex plans reach $1.4T through 2030, driven almost entirely by AI power demand.

Read the full post →

📚 In Case You Missed It

Inside smart-sdlc: The Skill-First Agentic Framework That Turns Copilot and Claude Into a Full SDLC Team — smart-sdlc is a markdown-only agentic SDLC framework that runs inside GitHub Copilot, Claude, or any AI assistant — six personas (Aria, Rex, Nova, Sage, Lead, Scout), six phases, zero runtime. Here's why the 'skill-first, no platform' bet is interesting.

AI's Trust Test: Surgical Robots, Broken Benchmarks, and the EU's 100-Day Countdown — NVIDIA's healthcare physical AI stack (Open-H, Cosmos-H, GR00T-H, Rheo) ships into real operating rooms, Berkeley researchers prove the top 8 agent benchmarks can be hacked, and the EU AI Act deadline is now 103 days away. Trust is the new frontier.

The Silicon Decoupling: Meta's 1GW MTIA, OpenAI's $20B Cerebras Deal, and AI's Quiet Escape From Nvidia — Meta's 1-gigawatt Broadcom MTIA deal, OpenAI's $20B Cerebras contract, and Perplexity's Personal Computer on Mac — three stories, one pattern: AI compute is decoupling from Nvidia and from the cloud.

More posts dropping every day. Stay curious.

— Bhanu @ superml.dev