AI Roundup 119: I/O, io, Codex, Claude

May 23, 2025

I/O

Google I/O came and went this week, with an enormous number of new AI launches. For a deeper look at the major releases, there's The Verge and TechCrunch (though I also liked this developer-focused breakdown from Logan Thorneloe).

All the announcements I could find:

An updated Gemini 2.5 Pro and 2.5 Flash, with several new features for the models.
Deep Think, an enhanced Gemini 2.5 Pro reasoning mode, excels at math and coding benchmarks.
Gemma 3n preview, an open "mobile-first" model designed to run on devices.
Gemini Diffusion, an experimental text diffusion model that generates 1,000-2,000 tokens/second.
Jules, an asynchronous coding agent, available in public beta for free with usage limits.
Google AI Ultra, the new subscription plan with the highest usage limits for $250/month (including access to the latest Gemini Pro and Project Mariner, the web-browsing AI agent).
Gemini in Chrome, a new set of browser automation features that work across tabs (for paid users).
SynthID Detector, a verification portal using watermarking tech to identify Google AI-generated content.
Search Live lets users converse with Search about what's in their camera's field of view.
AI Mode will let Google Search users in the US try Deep Search, agentic capabilities, and virtual "try it on" features for shoppers.
New video and image generation models, Veo 3 and Imagen 4, plus an AI filmmaking tool, Flow. Veo 3 clips feature synchronized audio and crisper, more detailed visuals.
New Android XR smart glasses with Gemini, built in partnership with Samsung, Gentle Monster, Xreal, and Warby Parker.
Stitch, which turns prompts and images into complex UI designs and frontend code.
Beam, the new name for the old Starliner telepresence technology.
Google Meet live translation that matches the user's tone and cadence.
Firebase AI Logic combines Vertex AI and Genkit tools to integrate AI into apps.
Video Overviews are coming to NotebookLM to turn dense multimedia into digestible visual presentations.
And Google is tapping users' data to give its AI models an advantage with its opt-in "Gemini with personalization" feature.

io

In a short video, OpenAI announced (and then acquired) Jony Ive's secretive startup, io, for $6.5 billion in stock. It also acknowledged that the company was already in the prototyping stage of a new AI hardware device.

Why it matters:

Ive and Altman explicitly reject existing AI hardware attempts like the Humane Ai Pin as "very poor products," promising to create something genuinely novel instead.
Leaks suggest that the new product won't be a "traditional" wearable but rather a screenless companion designed to reduce smartphone dependence through voice and camera interactions.
Apple analyst Ming-Chi Kuo reports that the form factor could be as "compact and elegant" as an iPod shuffle. Mass production is expected to start in 2027.
The move also puts more pressure on Apple, which has struggled to make large language models work well within its hardware and software product suite.

Elsewhere in OpenAI:

OpenAI partners with G42 and others to build a 1-gigawatt AI data center in Abu Dhabi called Stargate UAE, its first large-scale project outside the US.
OpenAI updates its Responses API for building agentic applications, including remote MCP server support, image generation, Code Interpreter tools, and more.
And sources say Sam Altman had sought to delegate more responsibility for months before hiring Fidji Simo, an adept and detail-oriented manager.

Claude 4

Anthropic released Claude 4 yesterday in Sonnet and Opus variants. The new "hybrid" reasoning models perform well on benchmarks but are being showcased for their coding and agentic capabilities.

What to watch:

A bundle of new API features is also coming: extended thinking with tool use, "thinking summaries," code execution, files, and MCP tools, and the ability to use tools in parallel.
There are already fascinating (if not frightening) tidbits from the system card, like that one pre-release version tried to blackmail engineers during safety tests, or another version tried to email authorities when asked to do something "egregiously evil."
With the model being so new, it remains to be seen where it will shine, but Anthropic appears to be doubling down on building the world's best coding model (its Chief Science Officer says the company stopped focusing on chatbots at the end of last year).

Elsewhere in AI anxiety:

A US district judge ruled that Google and Character.AI must face a lawsuit claiming Character.AI's chatbots caused a 14-year-old's suicide.
Civitai will pause credit card payments starting May 23 as its payment processor won't support platforms that allow AI-generated explicit content.
SAG-AFTRA filed an unfair labor practice charge against Epic's Llama Productions for using AI to recreate James Earl Jones' Darth Vader voice in Fortnite.
Researchers found GPT-4 was more persuasive than humans 64.4% of the time in one-on-one debates when given basic personal information about the person it was trying to persuade.
And New York City's AI tool, launched in 2018 to score families at risk for child abuse using factors like their neighborhood, is raising racial bias concerns.

Codex

Last Friday, OpenAI announced Codex, a cloud-based software engineering agent that can work on many tasks in parallel.

The big picture:

The AI coding market is exploding. With Cursor's $300M annual revenue and 30% of code at Google and Microsoft now AI-written, OpenAI is trying to capture market share through product development and acquisitions (e.g., its $3B Windsurf deal).
Codex represents a shift from simple code completion to autonomous "virtual teammates" that can handle complex tasks over 1-30 minutes (though it now has two new competitors to deal with: GitHub Copilot Agent and Google Jules, both launched this week).
And the agentic version of Codex is, of course, different from Codex CLI, which is separate from codex-1 (an o3 fine-tune optimized for software engineering), which is not the same as Codex, the model that originally powered GitHub Copilot.

Elsewhere in frontier models:

Nvidia unveils Isaac GR00T N1.5, an open, customizable AI model for humanoid reasoning and skills, and GR00T-Dreams, a tool for generating synthetic motion data.
Microsoft researchers and others detail Aurora, an AI weather model they say makes accurate 10-day forecasts faster and at smaller scales.
Mistral released Devstral, a model to help with coding, in "research preview."
And Bilibili released AniSora, a new open-source anime video generation model.

Microsoft Build

And if that wasn't enough, Microsoft also held its Build conference this week, which featured another slew of AI product features and infrastructure upgrades.

Who could possibly keep up with all this:

Microsoft Discovery, an enterprise agentic platform with a graph-based knowledge engine designed to help scientists and engineers accelerate R&D.
Azure AI Foundry is adding xAI's Grok 3 and Grok 3 mini, as well as Entra, Defender, and Purview.
Copilot Studio agents are getting a "computer use" feature.
Foundry Local for Windows and macOS, built on ONNX Runtime, to let developers build cross-platform AI apps that run models, tools, and agents on-device.
NLWeb, an open project that lets developers add a "conversational interface" to their website with a few lines of code, an AI model, and data.
Experimental cross-platform APIs for Edge to give web apps access to models built into Edge, like Microsoft's Phi-4-mini.
And support for the Model Context Protocol is coming to Windows.

Elsewhere in FAANG free-for-all:

Apple aims to release smart glasses at the end of 2026 as part of a push into AI devices, but has shelved plans for an Apple Watch with a camera.
Amazon introduces AI shopping experts, an audio feature that summarizes product details, customer reviews, and more, on select products for some US customers.
And at WWDC, Apple plans to unveil an SDK and frameworks to let developers build AI features on top of Apple Intelligence (though it's apparently "unlikely" to discuss Siri as it prepares to separate the Apple Intelligence brand from Siri in its marketing).

Thanks for reading Artificial Ignorance! This post is public so feel free to share it.

Things happen

DOJ examines whether Google structured a deal with Character.AI to avoid merger scrutiny. Animation startup makes content 90% cheaper with AI tools. Shopify launches AI storefront builder and upgrades Sidekick. DoorDash kills AI voice ordering after 20 months. The ad-supported web is dying. Inside the first Stargate data center in Texas. Miami gives Gemini to 105K students in largest US rollout. AI makes weather forecasts more accurate and detailed. Nvidia plans Taiwan AI supercomputer with Foxconn. Nvidia unveils NVLink Fusion for non-Nvidia hardware. GitHub CEO on SWE agents and Copilot's future. 30% of Korean schools adopt AI textbooks. Harvard uses AI for UFO research. AI's energy footprint analyzed. Indonesian ex-scammers reveal recruitment tactics. Zhejiang targets $138B in AI revenues by 2027. Protesters disrupt Microsoft Build keynote. Google's Veo 3 loves this dad joke. LMArena raises $100M at $600M valuation. Anthropic gets $2.5B credit line, hits $2B revenue. Bipartisan pushback on 10-year AI ban. House passes AI moratorium bill. States pass 120+ deepfake laws since 2023. Trump signs Take It Down Act. China threatens action over Huawei restrictions. Malaysia downplays Huawei chip deal. Viral TikToks show glitchy AI interviews. How AI changed work for eight workers. UnitedHealth develops AI risk scoring for Medicare. Klarna's AI handles two-thirds of customer service. AI porn site loses payment processor. Newspaper prints fake book list. Student deploys AI bots against Reddit "radicals." Q&A with Hassabis and Brin on AGI. Google tests ads in AI Mode. ChatGPT now uses all past chats by default. How Apple Intelligence went wrong. White House scrutinizes Apple's Alibaba AI deal. Meta plans three power plants for Louisiana data center. Watching AI drive Microsoft employees insane. Getting AI to write good SQL. AI in plasma physics didn't work as expected. The collapse of GPT. If AI can't use your API, neither can users. AI won't kill junior devs. ChatGPT helps students fake ADHD. Tech giants battle for AI talent with millions. Gemini figured out my nephew's name. LLM function calls don't scale.