This is the second in a series of posts on building an AI agent. Last time, we built a very rudimentary agent using ChatGPT as the core LLM. While we created some useful architecture, it ultimately couldn't do much beyond high-level planning and endless task generation.
#AgentGoals
At the end of our last tutorial, it was clear that our first agent was fairly limited. There were quite a few obvious improvements to make - and I decided to tackle three in particular:
1. Narrower use case. Having an agent that can "do anything" just means it's pretty terrible at everything. Instead, we should focus on a narrower use case, like "writing a research report" or "setting up a new codebase."
2. Internet access. Everything gets more interesting with internet access! Whether that's to do research or to give ChatGPT the ability to call other software, we can do much more once we upgrade our agent with Wi-Fi.
3. Better code quality. We can refactor the OpenAI API calls and add plenty of safeguards and error handling (see the sketch after this list).
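On the code-quality front, here's a rough sketch of what wrapping the raw OpenAI calls in a single, error-handled helper might look like. The `chat` helper, model choice, and retry policy are illustrative assumptions (using the v1 `openai` Python client), not the final implementation:

```python
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chat(prompt: str, model: str = "gpt-3.5-turbo", retries: int = 3) -> str:
    """Send one prompt to the chat API, retrying on transient failures."""
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except Exception:  # broad catch for the sketch; narrow this in real code
            if attempt == retries - 1:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff
```

Every other call the agent makes can then go through this one function, so rate limits and flaky responses only need to be handled in one place.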
Taking all three together, I chose a much narrower goal for our agent's V2: internet research. We're going to reuse the architecture of our existing agent but with a different outcome:
1. Given a research topic or question, generate a list of sub-topics or questions.
2. For each sub-topic:
   - Search the internet for a list of relevant links.
   - Read and summarize the contents of each link.
   - Create an overall sub-topic summary.
3. Write a final answer based on all of the individual sub-topics (see the sketch after this list).
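To make that concrete, here's a minimal sketch of how the loop could be wired together, reusing the `chat` helper from above. The `search_web` and `fetch_page` helpers are hypothetical placeholders for whatever search API and HTML fetcher we end up choosing:

```python
def research(question: str) -> str:
    """End-to-end research pipeline: plan, search, summarize, synthesize."""
    # 1. Ask the LLM to break the question into sub-topics, one per line.
    plan = chat(f"List the key sub-questions needed to answer: {question}")
    sub_topics = [line.strip("-* ").strip() for line in plan.splitlines() if line.strip()]

    summaries = []
    for topic in sub_topics:
        # 2. Search the internet for relevant links (search_web is a placeholder).
        links = search_web(topic, max_results=3)

        # 3. Read and summarize the contents of each link (fetch_page is a placeholder).
        page_summaries = [
            chat(f"Summarize this page as it relates to '{topic}':\n{fetch_page(url)}")
            for url in links
        ]

        # 4. Roll the page summaries up into one sub-topic summary.
        summaries.append(
            chat(f"Combine these notes on '{topic}' into a short summary:\n" + "\n".join(page_summaries))
        )

    # 5. Write the final answer from all of the sub-topic summaries.
    return chat(f"Answer '{question}' using these research notes:\n" + "\n\n".join(summaries))
```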
In particular, we can answer questions that rely on factual data or on information more recent than ChatGPT's 2021 training cutoff. A simple example would be, "Who won the 2023 Super Bowl?" Out of the box, ChatGPT responds with:
> As an AI developed by OpenAI, I don't have real-time data or future predictions. As of my last update in October 2021, I can't provide the winner of the 2023 Super Bowl. Please check the most recent sources for this information.
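You can reproduce that baseline with a bare API call and no agent scaffolding at all (again assuming the v1 `openai` Python client; older versions use `openai.ChatCompletion.create`):

```python
from openai import OpenAI

client = OpenAI()

# Ask the model directly, with no internet access or agent loop.
baseline = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Who won the 2023 Super Bowl?"}],
)
print(baseline.choices[0].message.content)  # typically a knowledge-cutoff disclaimer
```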
I want to do better than this. Let's get started!