Two years ago, I knew almost nothing about generative AI. Like millions (now billions) of others, I was stunned when ChatGPT first launched - but I did have both an engineering background and the stubbornness to go figure out what was happening under the hood.
I've seen quite a few technology waves come and go in Silicon Valley. I remember the hype around social/mobile/local (SoLoMo, we called it), VR, drones, 3D printing, blockchains (more than once), the sharing economy, the creator economy, and even pre-LLM chatbots. Most followed the same pattern: explosive hype, a trough of disillusionment, and eventually either disappearance or quiet integration into our daily lives.
But generative AI felt different. The first time I typed a complex prompt into ChatGPT and watched it produce coherent, thoughtful text, I came away thinking this wasn't just another overhyped technology trend (though many would still disagree with me). I couldn't have articulated precisely why at the time, but I felt sure this was something I needed to learn and explore[1].
Fast-forward to today: I'm an AI engineer at Pulley, applying this technology across an organization in ways I couldn't have imagined when I first typed "Hello" to ChatGPT.
Of course, I'm a self-taught AI engineer: there's no established curriculum or career track for building with foundation models (there's barely even a set of "best practices"). There was no "AI Engineering 101" course I could take. The term "AI Engineer" (as I'm using it today) didn't even exist when I first started working with LLMs.
I had to chart my own course in a landscape that moves insanely quickly, drawing heavily on my experience as both a founder and a software engineer[2]. And that's really what I want to get at: becoming an AI engineer is both more accessible and more valuable than you might realize.
The path I've taken - learning through building, sharing in public, and constantly adapting to new developments - isn't just available to anyone with technical aptitude; it might actually be one of the only reliable routes into this rapidly evolving discipline. Because the field is still being defined, the barriers to entry are surprisingly low, and the potential impact is enormous.
What is an AI Engineer, anyway?
My current role aligns closely with Swyx's definition of an AI engineer: someone capable of building applications on top of large language models who understands the technology's nuances, sharp corners, and best practices.
This means being familiar with:
Foundation models
Prompt engineering approaches
RAG (retrieval-augmented generation) pipelines
Evaluation suites and techniques
LLM monitoring and observability
Fine-tuning and inference platforms
Agentic frameworks and tool usage
Not all AI engineers will be experts at all of these things - you can build an entire specialization around just RAG pipelines, for example - but being aware of them at a high level is important. And as LLMs evolve, we'll undoubtedly see more pieces added to this stack[3].
But more than specific tools, it's about understanding this brand-new, peculiar technology. It’s about knowing when to use AI and when not to. Understanding how to architect systems that account for AI's inherent limitations. And being able to effectively debug and optimize AI applications.
Crucially, my role doesn't require writing TensorFlow or PyTorch code. I'm not training models from scratch or designing novel neural networks. The technology is so new and fast-moving that most engineers are still figuring out what AI-enabled products to build and how to build them.
One of my favorite analogies for generative AI is the smartphone. The skills needed to create the first iPhone (hardware design, operating system development) are very different from those needed to develop compelling iOS apps. Similarly, the skills needed to train ML models differ from those required to build practical applications on top of them.
Having met and worked with dozens of AI engineers at this point, I've identified a few key capabilities:
A deep curiosity and openness to new ideas/tools/processes
The ability to clearly share context and clearly define expected outputs/goals
Strong technical foundations and/or domain expertise to know how to correct the AI
The knowledge of when to delegate to AI and when to delegate to humans
Effective communication around AI's capabilities and impacts on a given project or process
Ironically, the things that make someone great at working with AI heavily overlap with those that make someone great at working with other humans. Having both front-line knowledge of what's now possible with AI, plus the ability to communicate that knowledge effectively, is a very valuable combination.
At present, it's not always obvious what problems benefit from AI, leading to a general backlash as leaders attempt to apply AI indiscriminately. "Let's add ChatGPT to our app" isn't a paint-by-numbers exercise; it requires a deep understanding of user needs, business contexts, and technological capabilities[4].
Still, how does one "learn to build with AI" without an obvious path forward? For this, the hacker community has always had a great way to route around formal certifications: building in public.
Building (and learning) in public
In my case, I started with simple projects. No complex ML algorithms, just prototypes with OpenAI's APIs. Early experiments included prototyping a voice-based smart assistant, building a "chat with your documents" app, and creating fake nature documentaries.
What's striking in retrospect is how quickly these projects evolved. Within months, I moved from basic "Hello World" applications to building systems that could (see the sketch after this list):
Extract structured data from unstructured text
Generate and evaluate multiple responses to find optimal outputs
Implement basic retrieval-augmented generation (though I didn't know that term yet)
Chain together multiple AI calls to solve more complex problems
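To make a couple of those concrete, here's roughly what "extract structured data, then chain another call" looks like. This is just a minimal sketch in Python using the OpenAI SDK - the model name, prompts, and invoice example are illustrative placeholders, not code from any of my actual projects:

```python
# Sketch: extract structured data from unstructured text, then chain a second call.
# Assumes the openai Python SDK v1+ and OPENAI_API_KEY set in the environment;
# the model name and prompts are placeholders.
import json
from openai import OpenAI

client = OpenAI()

def extract_invoice_fields(raw_text: str) -> dict:
    """First call: pull structured fields out of free text as JSON."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},  # JSON mode
        messages=[
            {"role": "system", "content": "Extract vendor, date, and total from the text. Respond in JSON."},
            {"role": "user", "content": raw_text},
        ],
    )
    return json.loads(response.choices[0].message.content)

def summarize_for_finance(fields: dict) -> str:
    """Second call: chain the structured output into a follow-up prompt."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "user", "content": f"Write a one-line expense summary for: {json.dumps(fields)}"},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    fields = extract_invoice_fields("Acme Corp billed us $1,200 on March 3rd for cloud hosting.")
    print(summarize_for_finance(fields))
```

The plumbing is simple; the real work is deciding what to extract and handling the cases where the model gets it wrong.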
Each project taught me something new about working with LLMs. It's a theme I've often repeated here - to understand what AI is capable of, there is no substitute for working with it yourself. As I've said before:
These technologies have deeply unpredictable capabilities. No one, not even their makers, knows the best way to use them (at least at first), or what their failure modes are. GPT-4 could write incredible prose, yet failed to count the words in a paragraph. Two years later, o1 can solve advanced math equations, but can't read an analog clock. And with every major update, the capabilities change, meaning we're all back at square one, trying to discern which AI claims are accurate, and which are snake oil.
More importantly, each project added to a public portfolio I could refer to later (including in interviews). Building in public forced me to clarify my thinking and distill complex concepts into clear explanations - skills that have proven invaluable in my role today.
Getting started
Two years ago, the only people with "applied LLM engineering" experience were those lucky enough to be building with GPT-3 at companies like GitHub or Notion, or those already positioned as machine learning engineers.
Today, countless online courses teach you to build apps with Cursor and Claude. If you find them helpful, by all means, use them. But you'll still want a portfolio - with so many people trying to break into AI, having demonstrable experience goes a long way (I should know - this is something we actively look for when hiring AI engineers).
If you want to start on this path, here's what I suggest. First, get access to foundation models. OpenAI is the default choice, but Anthropic is another easy-to-use option (Gemini is getting better but definitely has some developer experience challenges). Build a basic chat app.
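For a sense of scale, a bare-bones terminal chat app is only a dozen or so lines. Here's a sketch using the OpenAI Python SDK (it assumes an OPENAI_API_KEY in your environment; the model name is a placeholder):

```python
# A bare-bones terminal chat app: keep the running conversation in a list
# and send the whole thing to the model on every turn.
from openai import OpenAI

client = OpenAI()
messages = [{"role": "system", "content": "You are a helpful assistant."}]

while True:
    user_input = input("You: ")
    if user_input.strip().lower() in {"quit", "exit"}:
        break
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    print(f"AI: {reply}")
```

The habit to notice: the model is stateless, so you keep the conversation history yourself and resend it each turn.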
Next, try your first "real" project. Don't overthink it; this isn't a Y Combinator interview. Solve a small, genuine problem you have with AI. Build it in a weekend. Use Cursor or GitHub Copilot to help you code. Don't worry about frameworks or libraries - use the raw APIs and read about features like Structured Outputs, image uploads, and prompt caching. Figure out some prompt engineering best practices for your use case. Create an eval or two. (If you need some inspiration, check out the projects page.)
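Evals sound intimidating, but the first one can be tiny. Here's a sketch of the idea - a handful of hand-written cases and a naive substring check (the cases and the check are placeholders; real evals tend to use exact-match, regex, or LLM-based grading, but the loop is the same):

```python
# A tiny eval harness: run fixed test cases through your prompt and score the outputs.
from openai import OpenAI

client = OpenAI()

CASES = [
    {"input": "Refund status for order #123?", "expect": "refund"},
    {"input": "How do I reset my password?", "expect": "password"},
]

def run_prompt(user_input: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a concise support assistant."},
            {"role": "user", "content": user_input},
        ],
    )
    return response.choices[0].message.content

passed = 0
for case in CASES:
    output = run_prompt(case["input"])
    ok = case["expect"].lower() in output.lower()
    passed += ok
    print(f"{'PASS' if ok else 'FAIL'}: {case['input']}")

print(f"{passed}/{len(CASES)} cases passed")
```

Run something like this every time you tweak a prompt and you'll catch regressions you'd otherwise never notice.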
And as you work on these projects, document your process publicly. This could be through tweets, blog posts, or simply pushing your code to GitHub with detailed READMEs.
The beauty of this approach is that it compounds quickly. Within a few weeks of consistent experimentation, you'll develop an intuition for working with LLMs that is superior to the average person’s.
Staying ahead of the curve
That said, it's one thing to get ahead now and another to stay ahead without clear design patterns or best practices. As easy as it is to start, building more sophisticated applications can still come with headaches. For example:
Models can be frustratingly inconsistent, and prompt engineering often feels more like art than science
There's significant ambiguity about what to build - AI isn't a panacea for every problem
The field moves incredibly quickly - what works (or doesn't work) might be entirely different six months from now
Truly, I can't count the number of times I've been simultaneously impressed and disappointed by what these models can do.
If I'm being optimistic, these challenges also create big opportunities. The engineers willing to embrace this uncertainty and learn to bridge technical implementation with business value are best positioned to thrive in our AI-enabled future[5].
I don't know that AI will automate all programmers out of a job, but I am confident that the vast majority of the profession will evolve. And while you might not need to know how to build AI-enabled software for your job, there's a decent chance you'll at least need to learn how to use AI-enabled software.
Not every company will train a foundation model, but nearly every company will have to consider working with LLMs. Just as companies no longer question whether they need to develop for mobile devices, nearly every company is asking itself how it might need to change for the GPT era.
Plus, even if your company is writing the most mundane, non-AI-related software in the world, you, as an individual, could likely benefit from learning how AI coding tools are transforming this little industry of ours. I've experienced first-hand how rapidly good engineers with AI fluency can work. And when a team of AI engineers regularly shares workflows and best practices - it's truly remarkable.
We're so early
Two years into my AI journey, I still feel like I've barely scratched the surface. Every week brings new models, new tools, and new possibilities. It's simultaneously exhilarating and exhausting.
And yet - we are still so early. To return once more to the smartphone analogy: the current capabilities of AI chatbots are akin to the iPhone 3G - impactful to early adopters, but a long way from being a mature, ubiquitous technology[6]. In the iPhone's case, there was a lot of low-hanging fruit, but truly mobile-native businesses like Instagram and Uber took years to materialize.
So if you're considering becoming an AI engineer - there's still time. There's clearly demand - it's already the fastest-growing job title according to LinkedIn - and, as we've discussed, a lack of structured education.
Don't worry about mastering any particular tool; focus instead on the process of continuous learning and adaptation. The willingness to experiment, to build, to fail, and to try again. The skill of evaluating which problems are worth solving with AI (or are even possible to solve with AI) and which aren't. The ability to maintain a sense of wonder while building muscle memory for this strange new technology.
The frontier is wide open, and for once, nobody has a significant head start. The field is moving too quickly for anyone to be an expert for too long - we're all perpetual beginners in some way, continuously rediscovering what's possible.
In two years, you might find yourself looking back, amazed at how far you've come. And if my experience is any proof, the journey itself is worth it, regardless of where this technology ultimately takes us. The world needs more thoughtful humans guiding how AI develops and integrates into our lives - why not be one of them?
[1] The problem was, I was working as the CTO of an eCommerce company at the time. And while we had opportunities to experiment with generative AI, there wasn't an obvious way to make our core business "AI-native." So I made a decision: I needed to learn this new paradigm, despite having no formal guidance on where to start.
[2] For what it's worth, I do want to acknowledge my priors here: I think the journey is going to be a bit harder without any sort of software engineering background. But I also think the journey to becoming a software engineer is getting much easier - I sorely wish I'd had Cursor to help me debug during my CS classes.
[3] "Vibe coding," for example, is an emerging "skill" that didn't even have a name a year ago.
[4] My background as a CTO and startup founder proved unexpectedly valuable here. Identifying business bottlenecks and considering different technologies now helps me evaluate where AI can create genuine value versus being just a flashy addition.
[5] If I'm being pessimistic, the engineers with these skills will be the only ones who thrive in our AI-enabled future.
[6] This metaphor is doubly relevant with Apple's recent announcement that they're delaying their flagship Apple Intelligence features until next year.