Workshops
Next Friday is my power-user prompting workshop with Claude and the Anthropic workbench! Learn how to improve your prompts, run A/B tests, and leverage Claude’s unique LLM strengths for maximum effectiveness.
Strawberry / o1
OpenAI announced o1 (codenamed Strawberry), its latest model with improved "reasoning capabilities."
Between the lines:
There are actually two models, o1-preview and o1-mini. The latter is 80% cheaper but appears to hold its own when it comes to math and coding, while the main model qualifies for the USA Math Olympiad and "exceeds PhD-level accuracy" with STEM problems.
The models were trained to "think out loud" and then refine their reasoning process. You can see a summary of the model's reasoning in real time, but OpenAI has decided not to show the full chain of thought to users.
The reviews are already coming in - initial impressions are positive but not earth-shattering. Ultimately, these models seem to represent a promising new area of research but come with their own quirks and flaws.
Perhaps the most impressive thing is that they're available right now - paying ChatGPT subscribers and "tier 5 accounts" can access the model (with strict rate limits).
Elsewhere in frontier models:
Mistral released its first multimodal model, Pixtral 12B, available on GitHub and Hugging Face, with API access coming soon.
Replit launched Replit Agent, an AI that can build entire apps from scratch based on prompts, available in beta to subscribers.
And in case you missed it, the rise and fall of Reflection 70B, the groundbreaking open-source model that wasn't.
Meta data
During an Australian government inquiry, Meta’s global privacy director acknowledged that the company was training its AI models on all public photos and posts from 2007 onwards.
Why it matters:
Defaults like these are driving the AI trust crisis - people refuse to believe tech companies when they say they're not training on user data.
AI has also upended the Internet's default social contract - the incentives to publicly create content are quickly changing.
EU users can opt out because of local regulations, but Australian (and most other) users have no choice unless they make their content private.
Elsewhere in the FAANG free-for-all:
Apple introduced Visual Intelligence, similar to Google Lens, as part of iOS 18's Apple Intelligence features.
Facebook and Instagram made AI labels less prominent on AI-edited (vs AI-generated) content.
And Google added Audio Overview, which turns documents into AI-hosted discussions about the source material.
Elsewhere in AI anxiety:
Microsoft, Anthropic, OpenAI, and others voluntarily committed to the White House to fight AI-generated CSAM.
A new study revealed an "alarming" level of trust in AI for making life-and-death decisions.
Senators are calling on the FTC to investigate AI summaries as a potential antitrust violation.
And recent deepfakes have seemingly pushed Taylor Swift to publicly endorse Kamala Harris.
AI spending spree
As we've gotten deeper into the AI arms race, the sums needed to build foundation models have gotten ever larger. Now, even startups that raised hundreds of millions are bowing out of the competition.
Why it matters:
As crazy as current numbers are - Alphabet's capex hit $52B in the first six months of the year - they're only expected to keep growing to meet AI demand.
Big Tech is also becoming more intertwined with AI startups, either directly investing in them or buying out the founders and researchers.
As strange as it sounds, Meta's open-source efforts may be our best hope against an AI oligopoly.
Elsewhere in AI spending:
VC investments in AI startups have surpassed $64B this year (30% of all VC dollars), amid questions about whether they will pay off.
Gartner's chief of research says we're in the brute force phase of AI: once it ends, demand for GPUs will too.
And leaders from OpenAI, Anthropic, Nvidia, Microsoft, Google, and power companies met at the White House to discuss US AI energy infrastructure.
Things happen
Nevada partners with Google to use AI for unemployment benefit decisions. South Korea hosts summit on responsible military use of AI with 90+ countries attending. The AI industry is obsessed with Chatbot Arena. An AI bot named James has taken my old job. Tesla and xAI reportedly discussed a deal to power Full Self-Driving with xAI's models. Former Everyday Robots CEO reflects on giving AI a robot body. Sergey Brin is working at Google "pretty much every day" on AI projects. Audible plans to clone select narrators' voices for AI-generated audiobooks. Confessions of a chatbot helper. Would you trust AI to scan your genitals for STIs? Nvidia GPU rentals are cheaper in China despite US export controls. Conversations with GPT-4 Turbo reduced conspiracy beliefs by 20%. Ireland's DPC investigates Google's GDPR compliance for AI model training. Waymo cars in Phoenix and SF had 48% fewer crashes per mile than human drivers. Roblox teases generative AI tool for creating 3D scenes with text prompts. SAG-AFTRA weighs in on California's AI safety bill. Stalker allegedly created AI chatbot to harass woman on NSFW platform. NY Fed survey reveals AI adoption hasn't led to significant job cuts. Mike Krieger discusses his new role at Anthropic and reflects on co-founding Instagram. Honor introduces AI defocus technology to combat myopia on its latest devices. Bill Gates discusses AI, misinformation, and climate change in upcoming Netflix docuseries. Why AI is so bad at generating images of Kamala Harris. Infinity: realistic, speaking AI characters. How Apple's Private Cloud Compute works.