AI Roundup 097: Model Mayhem

December 13, 2024.

Dec 13, 2024

∙ Paid

This week has been an absolute firehose of new launches - truth be told, I don't have enough space to cover them in as much depth as I would like. But given the volume, there's enough to focus almost entirely on new model launches and features.

Model Mayhem: Google

On Wednesday, Google unveiled a slew of new models, projects, and agents. It's flagship model, Gemini 2.0, has some positive reviews already, while its other products are beefing up its lineup against OpenAI.

What's the latest:

The first release in the Gemini 2.0 family is Gemini 2.0 Flash, a lower-latency model that can generate text, images, and audio, and use third-party apps and services (including Google Search).
The company updated Project Astra, which can now store, summarize, and answer questions based on a 10-minute video.
Project Mariner is a prototype AI agent that can control Chrome, move the cursor, and perform various web tasks - similar to Anthropic's Computer Use functionality.
On the agent side is Jules, an experimental AI code agent that can autonomously fix software bugs and prepare code changes, as well as agents that that can understand rules in video games to help players via real-time conversations.
Google also introduced Deep Research, an AI tool that uses Gemini to scour the web and write detailed reports.
Gemini is also making it into Android XR, a mixed reality OS for headsets and smart glasses, set to debut in 2025.
The new model's training was powered by Trillium, Google's sixth-gen AI chip with 4x the training performance of its predecessor.
And while not part of the Gemini rollout, Google says its AI weather model has mastered 15-day forecasts.

Model Mayhem: OpenAI

Continuing with its "12 Days of Shipmas," OpenAI also rolled out several new and noteworthy features this week.

This week's releases:

First up was Sora, OpenAI's long-awaited text-to-video model. The new model is available to paid ChatGPT users, and some early reviews have already come in.
For what it’s worth, the product already appears to have some guardrails when generating videos of real people.
Canvas, ChatGPT's digital editing space, is not available to all users with added features and support.
With Apple's latest OS updates, ChatGPT is now available in Apple Intelligence.
And the company released “Advanced Voice Mode with vision” for paid ChatGPT users, allowing screen sharing and video responses - plus a "Santa Mode" Easter egg for the holidays.

Model Mayhem: Meta

While not as splashy as Google or OpenAI, Meta also managed to sneak in a few new model releases this week.

The more the merrier:

The company announced Llama 3.3 70B, a text-only model claimed to deliver the performance of its largest Llama model at a lower cost.
Meta Video Seal is a new AI tool that applies imperceptible watermarks to AI-generated videos and hidden messages to uncover their origins.
And Meta unveiled Meta Motivo, an AI model for controlling human-like digital agent movements, aiming to offer lifelike NPCs in the metaverse.

Model Mayhem: Odds and ends

Elsewhere in model mayhem:

Microsoft launched Phi-4, a 14B-parameter language model that reportedly outperforms larger models in mathematical reasoning (the company is also reportedly renegotiating its deal with OpenAI).
Researchers released Trellis, a new 3D mesh generative model.
Character.AI announces parental controls and a new language model for users under 18, following lawsuits claiming its chatbots contributed to self-harm.
An evaluation of six frontier AI models for in-context scheming found that only OpenAI's o1 model was capable of scheming in all tests.
AI startups are recruiting experts to train models on specialized tasks for sensitive sectors like finance, defense, and healthcare.
Chatbot Arena, which now ranks over 170 AI models, aims to become a Wikipedia for AI, while new competitors like Countless.dev are also cropping up.
And Wired looks at the OnlyFans models turning to AI impersonators to keep up with their direct messages.