Y Combinator's AI boom is still going strong (W24)
Combing through all 158 YC AI startups (65% of the batch).
With YC's latest Demo Day (W24), the number of AI companies continues to grow. Six months ago, there were around 139 companies working with AI or ML - that number has climbed to 158, a clear majority at 65% of the 243 companies in the batch.
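For anyone who wants to check the math, here is a minimal sketch using only the three figures quoted above. The variable names are my own, and since the total size of the previous batch isn't given here, the sketch compares raw company counts rather than shares.

```python
# Figures quoted in the paragraph above.
ai_companies_w24 = 158
total_companies_w24 = 243
ai_companies_prev = 139  # "around 139" six months earlier, so treat as approximate

# Share of the W24 batch working with AI or ML.
share = ai_companies_w24 / total_companies_w24

# Relative growth in the number of AI companies between the two batches.
growth = (ai_companies_w24 - ai_companies_prev) / ai_companies_prev

print(f"AI share of W24 batch: {share:.1%}")
print(f"Growth in AI company count: {growth:.1%}")
```

The share rounds to 65%, and the count of AI companies is up a little under 14% on the approximate figure from six months ago.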
Let's dive into what's new, what's stayed the same, and what we can learn about the state of AI startups.
The biggest domains stayed big
Perhaps unsurprisingly, the most popular categories were largely unchanged from the last batch. Last time, the top 4 domains were AI Ops, Developer Tools, Healthcare + Biotech, and Finance + Payments. This time, the top 5 were:
Developer Tools: Apps, plugins, and SDKs making it easier to write code. Tools for testing automation, website optimization, codebase search, improved Jupyter notebooks, and AI-powered DevOps were all present. There was also a strong contingent of code-generation tools, from coding Copilots to no-code app builders.
AI Ops: Tooling and platforms to help companies deploy working AI models. That includes hosting, testing, data management, security, RAG infrastructure, hallucination mitigation, and more. We'll discuss how the AI Ops sector has continued to mature below.
Healthcare + Biotech: While I've once again lumped these two categories together, there's a pretty big split in the types of AI businesses being built. Healthcare companies are building automation tools for the entire healthcare lifecycle: patient booking, reception, diagnosis, treatment, and follow-up. Biotech companies, by contrast, are creating foundation models to enable faster R&D.
Sales + Marketing: Early generative AI companies were focused on the sales and marketing benefits of GPT-3: writing reasonable-sounding copy instantly. Now, we're seeing more niche use cases for revenue-generating AI, including AI-powered CRMs for investors, customer conversation analysis, and AI personal network analysis.
Finance: Likewise, on the finance side, companies covered compliance, due diligence, deliverable automation, and more. Perhaps one of my favorite descriptions was "a universal API for tax documents."
The long tail is getting longer
Even though the top categories were quite similar, one new aspect was a wider distribution of industries. This batch had roughly 35 categories of companies, up from 28 in the last one (examples of new categories include HR, Recruiting, and Aerospace). That makes sense to me. I've been saying for a while now that "AI isn't a silver bullet" and that you need domain expertise to capture users and solve new problems.
But it's also clear that with AI eating the world, we're creating new problems. It was interesting to see companies in the batch focused on AI Safety - one company is working on fraud and deepfake detection, while another is building foundation models that are easy to align. I suspect we will continue seeing more companies dealing with the second-order effects of our new AI capabilities.
We're also seeing more diverse ways of applying AI. In the last batch, a dominant theme was "copilots." And while those are still present here (as well as "agents"), there are also more companies building "AI-native" products and platforms - software that uses AI in ways beyond a shoehorned sidebar conversation with an AI assistant.
AI infrastructure continues to mature
As we continue to build with AI, we need a lot of tooling and infrastructure. Six months ago, I made the following observation:
"The size of the AI Ops space also shows how much work is needed to really productionalize LLMs and other models. There are still many open questions about reliability, privacy, observability, usability, and safety when it comes to using LLMs in the wild."
That's still true today, with AI Ops being one of the most popular categories. While there are some companies with premises we've seen before, many are also helping SaaS-ify techniques at the frontier of AI product development. For example, up until last year, "RAG" (retrieval-augmented generation) was a mostly unknown term outside of AI research circles, but now, multiple companies are building "RAG as a Service."
Best practices for LLM deployment include curating training data, regularly running evals, and testing vector chunk sizes - but there aren't many industry-standard tools for doing so in production. Developers are working on weird and wonderful ways to mitigate hallucinations. Partly, that's a consequence of the technology - when the state of the art changes significantly every 3-6 months, it's hard to find solid footing - but it's also because there just hasn't been much time for things to solidify. Amazon S3 and EC2 are 18 years old at this point. GPT-3 has been around for less than four years.
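To make "testing vector chunk sizes" a bit more concrete, here is a toy sketch of what such an eval loop might look like. Everything in it is an assumption for illustration: the function names, the sample document, and the eval questions are made up, and plain keyword overlap stands in for a real embedding model and vector store. The point is only the shape of the experiment: split the same document at several chunk sizes, retrieve the best chunk for each question, and count how often the expected answer comes back.

```python
def chunk_words(text, chunk_size):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def overlap_score(query, chunk):
    """Fraction of query words found in the chunk (a stand-in for vector similarity)."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    chunk_words_set = set(chunk.lower().split())
    return len(query_words & chunk_words_set) / len(query_words)


def retrieve(query, chunks):
    """Return the single highest-scoring chunk for the query."""
    return max(chunks, key=lambda chunk: overlap_score(query, chunk))


def hit_rate(document, eval_set, chunk_size):
    """Share of eval questions whose expected phrase appears in the retrieved chunk."""
    chunks = chunk_words(document, chunk_size)
    hits = sum(
        expected.lower() in retrieve(question, chunks).lower()
        for question, expected in eval_set
    )
    return hits / len(eval_set)


# Hypothetical sample data, invented for this sketch.
document = (
    "Invoices are due within thirty days of receipt. "
    "Refunds are processed by the finance team every Friday. "
    "Support tickets are answered within one business day."
)
eval_set = [
    ("when are invoices due", "thirty days"),
    ("who processes refunds", "finance team"),
]

for size in (4, 8, 16):
    print(f"chunk_size={size}: hit rate {hit_rate(document, eval_set, size):.0%}")
```

A real version would swap in embeddings and a much larger eval set, but even this toy loop shows why the choice needs testing: chunks that are too small can split an answer across a boundary, while chunks that are too large retrieve the answer along with a lot of unrelated text.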
There's still room for bespoke models
Companies like OpenAI and Anthropic often dominate the conversation regarding the latest and greatest models, and it's easy to think that new companies are mostly building "GPT wrappers." However, the startups in this batch show that there are still opportunities to train models from scratch.
Some examples of companies training models for new use cases:
Diffuse Bio: A model that designs new proteins for vaccines and other medications.
Infinity AI: A model that generates short "movie" clips from a given script.
Piramidal: A model to understand brain activity trained on EEG data.
SevnAI: A model for graphic design aimed at creating editable vector graphics.
Sonauto: A model for creating hit songs given lyrics and a short prompt.
Yoneda Labs: A model for optimizing chemical reactions.
To be clear - this is a very incomplete list! Plenty more companies cover video, speech, code, music, and other types of AI-generated content.
Where we're headed
As always, YC's latest batch offers a glimpse into many futures where AI (and other technologies) succeed in transforming our lives. On Demo Day, startup founders pitch their vision for a world where their product is successful, capturing billions of dollars in value and disrupting an incumbent (or sometimes nascent) industry. With AI in particular, we're seeing a growing diversity in the industries that founders are targeting.
And within each space, there are countless ways to work with AI. Part of what makes it so difficult to capture the value from LLMs is that everyone’s usage (and thus value) will be different. We’re all simultaneously experts and below-average at different things, which means we’re going to find AI helpful in different ways.
It’s a big part of why there’s still an explosion of new AI companies - we’re trying to personalize the UX for various types of users. And I expect to see the trend continue - we've barely scratched the surface of leveraging AI.