First off, I think your grade should've been higher for last year's predictions. On this year's, I definitely agree on regulation getting dialed back (and everyone struggling to figure out governance), slop content, and the AI movie/music credit.
I very much hope you're right on multiplayer mode and better personalization. ChatGPT's memory is one of my top favorite features across all of the various GenAI tools.
You’re too kind! I asked o1 to sum up my letter grades and give me an average 😅
You know, I've heard a lot of "this election wasn't decided by AI", but SO many factors play into that sort of decision making that I'm skeptical AI didn't have a strong influence this time around... we may be too dumb to see how just yet, though. I'm not talking about a supermind or some kind of conspiratorial nonsense, FWIW, but really just the way phenomena can emerge from far simpler subsets. Perhaps we'll have some real idea of the actual influence at some point in the future, or perhaps not.
For sure. History is (usually) too complex to definitively say: "this one event was/wasn't decided by X." It could be that AI had far more impact on elections than we currently know. I'm taking my cue from stories like this one: https://theconversation.com/the-apocalypse-that-wasnt-ai-was-everywhere-in-2024s-elections-but-deepfakes-and-misinformation-were-only-part-of-the-picture-244225, but I'm definitely open to changing my mind on it.
It reminds me of when mainstream media outlets would write about "up and coming" punk bands. The reality is that they had no idea what was actually going on beneath the surface, you know? I'm not 100% sure this is that thing, but it really seems like a very neat and tidy rug sweep, if that makes sense.
I think/hope that existing products will have more meaningful AI features introduced. Not just the standard "Have AI re-write this sentence" feature, but features where the AI has access to customer data. For example:
* Ecommerce platforms will be able to analyze performance metrics and make recommendations about promotions, marketing campaigns, merchandise buying, and other things.
* Banking services will use AI to summarize activity and create custom reports and recommendations for customers, like a personal banker for the wealthy might.
Kind of analogous to how Cursor has the "Chat with codebase" feature where the AI has more context than just the current file you're looking at. I think that concept is going to be brought to more software and will be very powerful.
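To make that concrete, here's a rough sketch of the kind of thing I mean: the product packs the customer's own data into the model's context before asking for recommendations. Everything here (the metric names, the numbers, the prompt wording, the helper function) is made up for illustration, not any real platform's API:

```python
import json

def build_promo_prompt(store_metrics: dict) -> str:
    """Pack a store's own performance data into the model's context,
    the same way a codebase-aware editor packs in the relevant files."""
    return (
        "You are an assistant embedded in an ecommerce dashboard.\n"
        "Here are this store's metrics for the last 30 days:\n"
        f"{json.dumps(store_metrics, indent=2)}\n\n"
        "Suggest three promotions or campaigns, and explain which "
        "metric each one is meant to improve."
    )

# Hypothetical numbers, just to show the shape of the context.
metrics = {
    "sessions": 48_200,
    "conversion_rate": 0.014,
    "average_order_value": 62.50,
    "cart_abandonment_rate": 0.71,
    "top_category": "outerwear",
}

# In a real product this prompt would go to whatever model the
# platform uses; here we just print it.
print(build_promo_prompt(metrics))
```

The interesting part isn't the model call itself; it's that the product already holds this data and can hand it to the model automatically, instead of making the user copy and paste it.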
Cursor is such a great example of how you can build a compelling user experience even using someone else's models (though they do have a custom model to merge generated code snippets into your files).
I definitely hope we see richer interaction design with AI products in 2025. For what it's worth, I do think Google has been putting out some really interesting ideas with NotebookLM and Project Mariner.
Man, that's uncanny!
I'm just working on my post for tomorrow, which is about AI predictions (among other things), and at least four of mine overlap heavily with yours. These ones:
1) Reasoning models get their moment (My WIP bullet point text in the draft is "At least three other reasoning models that are on par with or better than OpenAI’s o3".)
2) AI gets a major movie/music credit (My WIP draft text: "A major movie features scenes made with text-to-video models.")
3) No GPT-5. On this one, I'm close to saying that we may never even have a model called GPT-5. Looking at the trend toward inference-time reasoning models, the old GPT-class models might not get as much focus, and the series may get a reboot.
4) The age of agents - agents becoming more prominent in the news cycle again is a pretty clear trend, so I'm hunting for a more nuanced/precise prediction along the lines of "An all-in-one agent builds a [product/tool/whatever] single-handedly," but I want to find a way to keep it realistic, since I also don't expect the issues with AI agents to be fully solved in 2025.
As for your other predictions, I think most of them are solid and some are near-guarantees.
Let's see how the year pans out!
And hey, a C+ isn't bad at all, especially since an otherwise solid B is dragged down by a single F.
Your comments and predictions all make sense to me. Well done!
I have a question for you--actually three questions--all about "thinking" AIs.
1. What, if anything, do the "thinking" models like o1, o3, Gemini Experimental, etc. do other than search the space of CoT chains for one (or more) that work?
2. Search and backtracking aren't built into LLMs as we understand them--or at least not as I understand them. So what code (or other software) makes that happen? That is, what is it that makes these models search? If it's just code, why is it such a big deal? We've known how to write search algorithms forever. If it's not explicit code, what is it? How can you "train" search into an LLM? (I've put a rough sketch of the kind of explicit search loop I mean right after these questions.)
3. In what sense do AI systems "understand" CoT chains? The underlying systems cannot do logic. So, in what ways do they verify that the CoT chains make any logical sense? That also doesn't seem like something a standard LLM knows how to do. So, how is doing it integrated into "thinking" models? As you know, systems like AlphaProof use LLMs to generate possible proof strategies and then use formal verifiers to turn them into proofs if possible. But that's not what "thinking" systems do -- at least as far as I know.
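To make question 2 concrete, here's roughly the kind of explicit, old-fashioned search loop I have in mind: sample several candidate chains and keep the best-scoring one. The `sample_cot` and `score` stubs below are made-up stand-ins, not how any of these labs actually do it:

```python
import random

# Hypothetical stand-ins for the real components.
def sample_cot(question: str) -> str:
    """Pretend LLM call: sample one chain-of-thought plus an answer."""
    return f"reasoning for {question!r} (variant {random.randint(0, 999)})"

def score(cot: str) -> float:
    """Pretend verifier or reward model: rate how promising a chain looks."""
    return random.random()

def best_of_n(question: str, n: int = 8) -> str:
    """Plain, explicit search wrapped around a generator:
    sample n candidate chains and keep the highest-scoring one."""
    candidates = [sample_cot(question) for _ in range(n)]
    return max(candidates, key=score)

print(best_of_n("What is 17 * 24?"))
```

My (possibly wrong) understanding is that part of the claim about o1-style models is that something like this selection behavior gets baked into the model through training rather than living in wrapper code like the above, and that's exactly the part I'd love to understand better.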
Thanks.