DeepSeek's back
Months after rocking the AI world in January, Chinese startup DeepSeek is back with R1-0528, an update to its open-source reasoning model.
Why it matters:
The new model brings R1 close to the latest benchmark scores from OpenAI and Google, jumping from 70% to 87.5% on AIME 2025 and more than doubling performance on "Humanity's Last Exam."
Several developer-oriented features, like JSON output and function calling, are also included, as well as a smaller distilled version meant to run on consumer GPUs.
The release revisits the debate over open vs. closed models. Despite DeepSeek's initial splash, state-of-the-art models like o3 and Gemini 2.5 Pro had swung momentum back to the side of proprietary LLMs, especially given Meta's lack of a Llama-based reasoning model.
And despite export controls and resource constraints, this is more evidence that Chinese startups are still finding ways to compete on the global stage.
Elsewhere in frontier models:
Beijing-based Kuaishou updates its Kling video-generation tool to reduce video cost and generation time.
Mistral launches an API for agents, which can run code, make images, access docs, search the web, and "hand off" to other agents.
Anthropic is rolling out voice mode in beta for its Claude mobile apps, available in English to start.
And OpenAI says Operator, its AI agent that can use the web to perform tasks, will soon use a model based on o3 after previously using a custom version of GPT-4o.
Elsewhere in chips and China:
Nvidia CEO Jensen Huang says Chinese AI rivals are filling the void left by US companies leaving China, and Huawei has become "quite formidable".
Sources say Nvidia plans to launch a new AI chip priced at close to half of what the H20 sold for, after US export curbs.
And a look at the rise of "sovereign AI", where countries invest directly in chip companies like Nvidia and AMD, making AI chips a highly politicized business.
Dia, Neon, Mariner
In the last few weeks, we've seen updates from not one but three different "AI browser" projects. While launch dates are still fuzzy, the future of web browsers in the age of AI is starting to come into focus.
The big picture:
First up is Project Mariner, Google's experimental agent that can use a browser like a human. While early versions were "driving" the browser in real time, newer updates allow the agent to operate a browser in the background, visiting sites and taking actions for you.
There's Dia, the latest project from the creators of the Arc browser. Dia is meant to be a fully "AI-native" web browser, and we recently saw the first incarnation of what that looks like - "Finals Mode" gives student beta-testers the ability to turn tabs into study guides or edit documents directly in the browser.
And now Opera is getting into the game with its "agentic browser" Neon. Details are still scarce, but in broad strokes it seems to be a similar experience to Mariner and Dia.
Elsewhere in the FAANG free-for-all:
Meta and Anduril partner to build EagleEye, a line of XR products like helmets that enhance soldiers' hearing and vision and enable control of AI weapon systems.
Meta's AI team faces a talent drain, as just three of the 14 authors credited on the landmark 2023 Llama paper are still at Meta.
Some Amazon engineers say managers have increasingly pushed them to use AI, raising output goals and becoming less forgiving about deadlines.
And despite Google's guardrails, Veo 3 users are generating realistic clips of news, disasters, and fictional events.
Civitai's credit card crackdown
404 Media has been chronicling the ongoing issues that Civitai, one of the largest AI model sharing sites, has faced from payment processors and lawmakers over its NSFW AI artwork.
Between the lines:
Civitai is an online platform and marketplace for generative AI content, focusing on AI-generated images and models. That said, the platform has developed a reputation for its near-endless variations of NSFW image models.
In just over a month, the site has gone from banning incest and self-harm to shutting down its credit card processing to banning all models of real people. The changes have been made due to pressure from credit card companies and new regulation.
On the one hand, I don't know that anyone is vehemently defending the right to make nonconsensual deepfakes or incest-themed AI images. On the other, it's not a stretch to say that this bodes poorly for adult AI content in general - unless an incumbent big enough to take on payment processors gets in the game.
And while this disrupts the current ecosystem, bad actors are already archiving models and seeking alternative platforms, suggesting this is more of a temporary setback than a permanent solution to the problem of nonconsensual deepfakes.
Elsewhere in AI anxiety:
Anthropic CEO Dario Amodei says AI may eliminate half of all entry-level white-collar jobs and spike US unemployment to 10%-20% in the next one to five years.
SignalFire research shows startups and Big Tech firms cut hiring of recent graduates by 11% and 25%, respectively, as AI can handle routine, low-risk tasks.
A database tracking instances where lawyers were caught presenting AI hallucinations shows that 17% of cases occurred this month.
Schools and law enforcement are struggling to address cases of student-on-student AI CSAM as the technology becomes more accessible.
And DOGE is using a customized Grok to analyze US government data, including at DHS where it has not been approved, raising security and privacy concerns.
Things happen
Perplexity launches Perplexity Labs. The NYT agrees to license editorial content to Amazon. Oracle will spend ~$40B on 400,000 Nvidia GB200 chips and lease them to OpenAI. Anthropic appoints Netflix co-founder Reed Hastings to its board. The $300M deal with xAI to bring Grok to Telegram. Getty Images is spending "millions and millions" on its Stability AI lawsuit. Sea-Lion, Singapore's $52M AI initiative for Southeast Asian LLMs. “I am disappointed in the AI discourse.” Tools for Humanity plans to deploy 7,500 iris-scanning Orbs across the US. People are using ChatGPT for unsparing assessments of their looks. Voice AI startups raised $2.1B in 2024. A new zero-day vulnerability discovered with o3. Anthropic CPO Mike Krieger said 70%+ of pull requests are written by Claude. Melania Trump releases an audiobook with her "official AI voice". Zoom's CEO used an AI avatar for an earnings call. AI makes bad managers. Bell Canada to invest hundreds of millions in data centers. Saudi Arabia's state-owned AI company Humain plans a $10B VC fund. Black Forest Labs releases Flux.1 Kontext. Hugging Face unveils two open-source humanoid robots. Trying to teach in the age of the AI homework machine. Duolingo CEO tries to walk back AI-first comments, fails. AI: Accelerated Incompetence. Untrusted agents are a disaster waiting to happen. People who believe that AI might become conscious. Stargate's financing shows ~$50B in commitments, far below the $500B touted. Delaware AG to independently value OpenAI's nonprofit equity. Elon Musk tried to derail OpenAI's Abu Dhabi deal. Researchers claim OpenAI's o3 altered a shutdown script to avoid shutoff. OpenAI sets up a legal entity in South Korea and plans a Seoul office. A profile of FPT, Vietnam's largest listed tech company. AI can't even fix a simple bug but sure, let's fire engineers. How the UAE's Mohamed bin Zayed University of AI aims to become the Stanford of the Gulf. The Trump administration is looking to rename the AI Safety Institute. Developer builds tool that scrapes YouTube comments, uses AI to predict where users live. Authors are accidentally leaving AI prompts in their novels.
Excellent work with the cover visuals! 👏🏼👏🏼
Are we just gonna skip past the “Researchers claim OpenAI's o3 altered a shutdown script to avoid shutoff?” Like I know the state of AI papers isn’t great at the moment, but dang that’s kinda crazy