It's been the second year in a row of seemingly non-stop AI news, though perhaps with fewer mainstream headlines than in 2023. The innovations have been more subtle but no less significant - AI has become more integrated into our daily lives, moving from novelty to necessity.
A year ago, GPT-4 stood alone at the top of the benchmarks (Claude 3.5, Llama 3, and Gemini 1.5 didn't exist). (Almost) nobody had heard of o1, or NotebookLM, or Sora. And most people still thought about LLMs as chatbots - not as agents or “AI employees.”
But the landscape has shifted dramatically. Google countered the narrative that they were falling behind, and OpenAI transformed ChatGPT into a multimodal platform. There were also plenty of controversies along the way, from copyright infringement to deepfakes to OpenAI's contested move toward a for-profit structure.
So as we head into the final stretch of 2024, let's look at some of the year's biggest stories.
Honorable mention: xAI
Even amid the constant stream of AI headlines, xAI managed to burst onto the scene with a bang. The company raised $6 billion in funding from heavyweight investors like BlackRock and Sequoia Capital, doubling its valuation to $40 billion in just six months.
At the heart of xAI's ambitions is Colossus, a supercomputer cluster in Memphis that would make even the most hardened tech enthusiast's jaw drop. Built in partnership with NVIDIA in just 122 days, Colossus houses 100,000 of NVIDIA's most powerful chips. And they're not stopping there - the plan is to expand to a whopping one million GPUs.
10. World models
When it was first previewed, Sora left everyone speechless with the quality of its video generation. What's interesting about Sora, though, isn't that it can make slick videos - it seems to have a genuine understanding of how objects move and interact. This is known as a "world model": i.e., a model capable of "understanding" how the world works, physics and all.
But Sora isn't alone in pushing the boundaries of world modeling. Google's Genie 2 represents a parallel breakthrough, creating interactive 3D environments that you can explore. What's remarkable about Genie 2 is its object permanence - if you hide an object behind another one, the model remembers it's there. Walk around to the other side, and that object will remain exactly where you left it. This might seem obvious to us humans, but it's a huge leap forward for AI.
9. Deepfakes
The Taylor Swift deepfake incident in January was a wake-up call for millions not yet acquainted with deepfakes. When sexually explicit AI-generated images of the pop star went viral on X (formerly Twitter), reaching tens of millions of views, the platform's content moderation system completely broke down.
The White House got involved, actors' unions sounded the alarm, and tech companies scrambled to patch their systems. Microsoft had to admit that people were exploiting their AI image tools and rushed to add new safeguards. And even though Congress has yet to pass meaningful deepfake legislation, California passed a suite of new laws targeting deepfakes, from election misinformation to sexually explicit content.
8. Data deals
The numbers tell the story: $250 million to NewsCorp, $60 million to Reddit, and dozens more deals in between. From Le Monde to The Atlantic to Vox Media, publishers rushed to sign licensing deals with OpenAI and other AI startups.
There was also a push from UGC (user-generated content) platforms - Reddit's deal to let AI companies train on user content, for example, sparked immediate controversy, with the FTC getting involved. Other platforms followed suit - Stack Overflow partnered with Google, and Automattic (parent company of Tumblr and WordPress.com) started working with AI companies.
This gold rush is transforming the social contract of the internet. Publishers are setting boundaries about how their content can be used, and some, like HarperCollins, are sharing the wealth with authors. But as more platforms rush to monetize their users' content, we're facing tough questions about privacy, fair compensation, and who truly owns the data that trains our AI models.
7. Talent "acquisitions"
In 2024, Big Tech found a creative new way to gobble up AI talent. Instead of buying entire companies, they struck unusual deals that looked more like talent raids. Google paid billions to effectively kneecap Character.AI, taking its founders and a chunk of its workforce. Microsoft and Amazon pulled similar moves with Inflection and Adept, snagging their best people while leaving the companies as hollow shells.
These weren't typical arrangements - in fact, they seemed carefully crafted to fly under the regulatory radar. The tech giants called them "licensing deals," but the FTC launched investigations into both Amazon and Microsoft's deals. After all, if it walks like an acquisition and talks like an acquisition, maybe it is an acquisition.
This talent consolidation is happening on top of Big Tech's massive investments into AI startups. Amazon alone has put $6.25 billion into Anthropic this year - that's out of $8 billion total, and still less than the nearly $14 billion Microsoft has put into OpenAI. Going by the trend, the future of AI is increasingly in the hands of a very small group of very large companies.
6. The age of agents
AI agents have arrived, and they're not just fancy chatbots anymore. Companies are starting to use them for everything from customer service to data analysis, though everyone seems to have a different idea of what exactly an "agent" is. Of course, some of these are just regular automation tools with an "agent" label slapped on top - but others are genuinely pushing the boundaries of what autonomous AI can do.
Salesforce and other tech giants are building agent platforms, while a wave of YC companies are launching "AI employees" for various business functions. We're still in the early days, and the hype around agents is still ahead of reality. But the direction is clear: AI agents are graduating from science fiction into the mainstream business world.
5. AI Acts
The EU made history this year by passing the world's first comprehensive AI regulation. The AI Act takes a tiered approach, treating different AI systems differently based on their potential risks. If you're building AI systems that might touch EU users, you need to pay attention - even if your company isn't based in Europe. The rules have teeth, and the fines for breaking them can be massive.
Meanwhile, in California, Governor Gavin Newsom vetoed SB 1047, a bill that would have required strict oversight of large AI models. Newsom's argument was that the bill was too broad and might actually make us less safe by focusing on the wrong things.
But Newsom wasn't completely against regulation - far from it. The same month he vetoed SB 1047, he signed 17 other AI-related bills into law. These new rules tackle everything from AI watermarking to preventing AI-generated misinformation. It's a sign that even if we're still figuring out exactly how to regulate AI, governments aren't sitting on their hands anymore.
4. Llama 3
Llama 2 was one of the most impactful AI stories last year, and Llama 3 continues that trend. While proprietary models from OpenAI and Google often hog the spotlight, Meta continuously works towards state-of-the-art AI that isn't locked behind closed doors. The crown jewel was Llama 3.3 70B, released in December, which matched the performance of models nearly 6 times its size.
But the real value wasn't just about raw performance - it was about accessibility. These models can run on consumer hardware, meaning you can now run a GPT-4-class model on your laptop. And when others added vision capabilities with Llama 3V, they showed that even advanced features like image understanding didn't need to be locked away behind expensive API calls.
3. Multimodality
2024 was the year AI learned to see, hear, and speak. While early AI models were limited to text, the latest versions can handle pretty much anything you throw at them - images, audio, video, and more. And these capabilities are quickly becoming table stakes: it's hard to imagine a ChatGPT competitor that doesn't offer multimodal capabilities.
This shift to multimodal AI isn't just a cool tech demo - it's fundamentally changing how we interact with computers. For example, OpenAI's Advanced Voice Mode turned ChatGPT into something that could hold a conversation, complete with different personalities and voices. These capabilities are making things like AI companions or fully automated call centers a potential reality.
Likewise, Anthropic's Computer Use mode adapts Claude to use the same UIs humans do - by taking screenshots of your browser and sending back keyboard and mouse events. Slowly but surely, LLMs are growing from just a single sense to experiencing the (digital) world in many of the same ways we do.
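The screenshot-in, action-out loop described above can be sketched roughly as follows. This is a hedged illustration of the general pattern, not Anthropic's actual API: `capture_screenshot`, `ask_model`, and `perform_action` are all hypothetical stand-ins (here mocked against a toy UI state so the loop is runnable).

```python
# A minimal sketch of a screenshot-in, action-out agent loop.
# All helpers are hypothetical stand-ins, not Anthropic's real API.

def capture_screenshot(state):
    """Pretend to screenshot the UI; here we just describe the mock state."""
    return f"screen shows a form, submitted={state['submitted']}"

def ask_model(screenshot, goal):
    """Stand-in for an LLM call: decide the next UI action from the screenshot."""
    if "submitted=False" in screenshot:
        return {"type": "click", "target": "submit_button"}
    return {"type": "done"}

def perform_action(state, action):
    """Apply the chosen keyboard/mouse event to the (mock) UI."""
    if action["type"] == "click" and action["target"] == "submit_button":
        state["submitted"] = True

def run_agent(goal, max_steps=5):
    """Loop: observe the screen, pick an action, apply it, repeat."""
    state = {"submitted": False}
    for _ in range(max_steps):
        action = ask_model(capture_screenshot(state), goal)
        if action["type"] == "done":
            break
        perform_action(state, action)
    return state

print(run_agent("submit the form"))  # → {'submitted': True}
```

The key design point is that the model never touches the application directly - it only ever sees pixels and emits events, which is what lets it drive the same UIs humans do.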
2. Reasoning models
After months of rumors about a project called Q* (or "Strawberry"), OpenAI released o1 - a "reasoning" model designed to think through problems step-by-step. Building on chain-of-thought prompting, this approach doesn't lock the model into an answer upfront, instead letting it explore different lines of thinking before committing.
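OpenAI hasn't published o1's training details, but a well-known public technique in the same spirit is self-consistency over chain-of-thought: sample several independent reasoning chains and majority-vote on the final answer. A toy sketch (the "sampled chains" below are hand-written mocks standing in for repeated LLM calls at nonzero temperature):

```python
from collections import Counter

# Toy stand-ins for several independently sampled chain-of-thought
# completions to "What is 17 + 25?" - in practice these would come
# from repeated LLM calls, not a hard-coded list.
SAMPLED_CHAINS = [
    ("17 + 20 = 37, 37 + 5 = 42", 42),
    ("10 + 20 = 30, 7 + 5 = 12, 30 + 12 = 42", 42),
    ("17 + 25 = 43 (arithmetic slip)", 43),
    ("20 + 25 = 45, 45 - 3 = 42", 42),
]

def self_consistent_answer(chains):
    """Majority-vote over the final answers of several reasoning chains.
    Exploring multiple lines of thinking lets one bad chain get outvoted."""
    return Counter(answer for _, answer in chains).most_common(1)[0][0]

print(self_consistent_answer(SAMPLED_CHAINS))  # → 42
```

The broader point is the same one o1 embodies: spending more compute at inference time - here, sampling more chains - can buy accuracy that a single forward pass wouldn't.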
But o3, announced in December, marked a major breakthrough. It achieved unprecedented results across the board: solving complex programming tasks, competing at high-level mathematics, and even surpassing the average human baseline on the ARC-AGI benchmark - a test that had seen minimal AI progress for years. This wasn't just an incremental improvement; it represented a fundamental shift in what AI could accomplish.
This shift toward "reasoning" models represents a different direction for AI development. Instead of focusing solely on larger training runs, OpenAI found that allowing models more computation time at inference could produce better results across technical domains. And it has shown improved performance on an incredibly compressed timeline - there were only 3 months between o1 and o3, versus 3 years between GPT-3 and GPT-4.
1. AI infrastructure
Tech giants have evolved from spending billions on AI companies to spending hundreds of billions on AI infrastructure. They’re building massive data centers, buying up every GPU they can find, and preparing for an AI-powered future. Microsoft, Amazon, Meta and Google combined are on pace to pour over $100 billion into data center expansions and related infrastructure costs this year alone. Much of this spend is targeted for the US, but there are also multi-billion data center investments happening in France, Malaysia, Finland, Singapore, and the UK.
We're seeing entirely new data center designs specifically for AI workloads, with some facilities dedicating most of their power to running NVIDIA chips. Meta and xAI are building GPU clusters that would make a supercomputer blush, with tens of thousands of chips working in parallel. And Microsoft and Amazon have even committed billions towards nuclear power projects to try to help fill energy demand.
This massive build-out shows how much faith (and money) the tech industry is putting into AI. It's an unfathomably large bet that AI will reshape not just the Internet but also the physical infrastructure that powers it.
Over to you
There was so, so much that happened this year, and I left a lot of different threads on the cutting room floor. What’s on your top ten list? What stories do you think went underreported this year? Let me know in the comments!
Great list!
I'd throw Genesis into your roundup of world models (https://genesis-embodied-ai.github.io/). While Sora appears to mimic real-world physics without truly understanding it (much like LLMs mimic language), Genesis seems to be a genuine physics engine that can simulate real-world scenarios from mere text prompts.
Also Odyssey's Explorer is another candidate, albeit more limited in scope: https://odyssey.systems/introducing-explorer
But your post drove home for me just how much happened this year alone. Some of these feel like they've been around for well over a year, and yet they haven't.
Glambase is something I'd have to throw in too: AI influencers.