This week has been an absolute firehose of new launches - truth be told, I don't have enough space to cover them in as much depth as I would like. But given the volume, there's enough to focus almost entirely on new model launches and features.
Model Mayhem: Google
On Wednesday, Google unveiled a slew of new models, projects, and agents. It's flagship model, Gemini 2.0, has some positive reviews already, while its other products are beefing up its lineup against OpenAI.
What's the latest:
The first release in the Gemini 2.0 family is Gemini 2.0 Flash, a lower-latency model that can generate text, images, and audio, and use third-party apps and services (including Google Search).
The company updated Project Astra, which can now store, summarize, and answer questions based on a 10-minute video.
Project Mariner is a prototype AI agent that can control Chrome, move the cursor, and perform various web tasks - similar to Anthropic's Computer Use functionality.
On the agent side is Jules, an experimental AI code agent that can autonomously fix software bugs and prepare code changes, as well as agents that that can understand rules in video games to help players via real-time conversations.
Google also introduced Deep Research, an AI tool that uses Gemini to scour the web and write detailed reports.
Gemini is also making it into Android XR, a mixed reality OS for headsets and smart glasses, set to debut in 2025.
The new model's training was powered by Trillium, Google's sixth-gen AI chip with 4x the training performance of its predecessor.
And while not part of the Gemini rollout, Google says its AI weather model has mastered 15-day forecasts.
Model Mayhem: OpenAI
Continuing with its "12 Days of Shipmas," OpenAI also rolled out several new and noteworthy features this week.
This week's releases:
First up was Sora, OpenAI's long-awaited text-to-video model. The new model is available to paid ChatGPT users, and some early reviews have already come in.
For what it’s worth, the product already appears to have some guardrails when generating videos of real people.
Canvas, ChatGPT's digital editing space, is not available to all users with added features and support.
With Apple's latest OS updates, ChatGPT is now available in Apple Intelligence.
And the company released “Advanced Voice Mode with vision” for paid ChatGPT users, allowing screen sharing and video responses - plus a "Santa Mode" Easter egg for the holidays.
Model Mayhem: Meta
While not as splashy as Google or OpenAI, Meta also managed to sneak in a few new model releases this week.
The more the merrier:
The company announced Llama 3.3 70B, a text-only model claimed to deliver the performance of its largest Llama model at a lower cost.
Meta Video Seal is a new AI tool that applies imperceptible watermarks to AI-generated videos and hidden messages to uncover their origins.
And Meta unveiled Meta Motivo, an AI model for controlling human-like digital agent movements, aiming to offer lifelike NPCs in the metaverse.
Model Mayhem: Odds and ends
Elsewhere in model mayhem:
Microsoft launched Phi-4, a 14B-parameter language model that reportedly outperforms larger models in mathematical reasoning (the company is also reportedly renegotiating its deal with OpenAI).
Researchers released Trellis, a new 3D mesh generative model.
Character.AI announces parental controls and a new language model for users under 18, following lawsuits claiming its chatbots contributed to self-harm.
An evaluation of six frontier AI models for in-context scheming found that only OpenAI's o1 model was capable of scheming in all tests.
AI startups are recruiting experts to train models on specialized tasks for sensitive sectors like finance, defense, and healthcare.
Chatbot Arena, which now ranks over 170 AI models, aims to become a Wikipedia for AI, while new competitors like Countless.dev are also cropping up.
And Wired looks at the OnlyFans models turning to AI impersonators to keep up with their direct messages.
Things happen
A deep dive on AI scaling laws. Nvidia is beefing up its China presence. Harvard releases massive AI training dataset of public domain books. A look at Anthropic's AI risk evaluation team. Google asked the FTC to break up Microsoft-OpenAI deal. Smart glasses with ChatGPT hit the market. YouTube rolls out auto-dubbing feature. Apple cancels Mac chip to focus on AI server chip. Stability AI CEO claims triple-digit growth. Q&A with Microsoft's AI chief on various topics. "Stop hiring humans" billboard sparks outrage. The AI we deserve. ElevenLabs launches AI-hosted podcast tool. Study finds AI improves breast cancer detection. Palantir and Anduril launch AI consortium for national security. ByteDance takes early lead in China's AI race. AI teams discover more new materials than standard methods. How the UK became an AI regulation leader. Will AI eat the browser? US-China fallout fuels AI boomtowns in Southeast Asia. AI robots for children are dying. China opens anti-monopoly probe into Nvidia. Meta's AI coding tool won't be released externally. AI guesses your accent. AI program reduces stillbirths in Malawi clinic. Chip industry uncertainties under potential Trump presidency. New benchmark aims to assess AI safety. Microsoft unveils zero-water data centers. The first commercially streaming, AI-generated movies. Reddit tests AI-powered Q&A feature. Apple execs discuss company's AI efforts. Grok chatbot available to all X users. AI-enhanced payment fraud adds billions to US losses. OpenAI's o1 represents shift to reasoning models. AI slop invades Oregon journalism. Japanese scientists written out of AI history. Using AI 100 times per hour. Russian influence campaign likely used AI voiceovers. AI skeptics need to wake up. US approves AI chip exports to UAE for Microsoft facility. Trump's AI czar lacks industry ties.