Try out the new model capabilities through Claude.ai & Claude Code on the Pro or Max plan: http://clau.de/tinahuang
#claudepartner @anthropic-ai
Sign up for my FREE Careers in AI talk on June 25th, 1pm EST in collab with IBM: https://ibm.biz/TinaHuang/
Want to get ahead in your career using AI? Join the waitlist for my AI Agent Bootcamp: https://www.lonelyoctopus.com/ai-agent-bootcamp
🤝 Business Inquiries: https://tally.so/r/mRDV99
🖱️Links mentioned in video
========================
Google IO Keynote: https://www.youtube.com/watch?v=LxvErFkBXPk
Code With Claude Keynote: https://www.youtube.com/watch?v=EvtPBaaykdo
Langchain Interrupt Keynote: https://www.youtube.com/watch?v=DrygcOI-kG8
Andrew Ng Interrupt Talk: https://www.youtube.com/watch?v=4pYzYmSdSH4
YC Vertical AI Agents: https://www.youtube.com/watch?v=ASABxNenD_U
YC Advanced Prompting: https://www.youtube.com/watch?v=DL82mGde6wo
🔗Affiliates
========================
My SQL for data science interviews course (10 full interviews):
https://365datascience.com/learn-sql-for-data-science-interviews/
365 Data Science:
https://365datascience.pxf.io/WD0za3 (link for 57% discount for their complete data science training)
Check out StrataScratch for data science interview prep:
https://stratascratch.com/?via=tina
🎥 My filming setup
========================
📷 camera: https://amzn.to/3LHbi7N
🎤 mic: https://amzn.to/3LqoFJb
🔭 tripod: https://amzn.to/3DkjGHe
💡 lights: https://amzn.to/3LmOhqk
⏰Timestamps
========================
00:00 — Intro
01:07 — Opportunities for AI Agents
05:33 — Tips on Starting a Vertical AI Agent Company
06:25 — Quiz 1
06:35 — Skills for Building AI Agents
13:03 — Quiz 2
13:10 — Career Advice & the “Agent Engineer”
15:35 — Quiz 3
15:45 — AI Agent Insights & Pro Tips
18:06 — Quiz 4
📲Socials
========================
instagram: https://www.instagram.com/hellotinah/
linkedin: https://www.linkedin.com/in/tinaw-h/
discord: https://discord.gg/5mMAtprshX
🎥Other videos you might be interested in
========================
How I consistently study with a full time job:
https://www.youtube.com/watch?v=INymz5VwLmk
How I would learn to code (if I could start over):
https://www.youtube.com/watch?v=MHPGeQD8TvI&t=84s
🐈⬛🐈⬛About me
========================
Hi, my name is Tina and I'm an ex-Meta data scientist turned internet person!
📧Contact
========================
youtube: youtube comments are by far the best way to get a response from me!
linkedin: https://www.linkedin.com/in/tinaw-h/
email for business inquiries only: hellotinah@gmail.com
========================
Some links are affiliate links and I may receive a small portion of sales price at no cost to you. I really appreciate your support in helping improve this channel! :)
It is conference season and I went to over 72 hours of talks and workshops about AI agents. So in this video, I want to share with you guys the CliffsNotes version of everything that I learned about AI agents. But as per usual, it is not enough for you just to listen to me talk about stuff. So I will have little quizzes distributed throughout this video to help you retain all of this information. Now, without further ado, let's go. A portion of this video is sponsored by Anthropic. The way that I'm structuring this video is by grouping roughly by topic as opposed to per conference. And the reason I'm doing this is because there's a lot of overlap. The first topic is what are the opportunities for AI agents? Like how to identify which agents are worth building if you want to start your own company or build within your own existing company and workflows. Second topic is the skills to learn for building AI agents. Next up is career advice, including if you want to become what the LangChain CEO calls an agent engineer. He's completely convinced that this is just the beginning of AI agents and this AI engineer role is going to become really in demand. And finally ending with some nuggets of insights. This is where I'm going to discuss some of the practical tips, some business insights, as well as what things to focus on for the future. All right, let's start
with what are the opportunities for AI agents. So the major takeaway that I got from going to the Google I/O conference is that Google is completely focused on integrating AI into all of its different products and giving them agentic abilities. They literally listed like 15 plus products and just talked about how there's AI in every single one of them. But it's not just Google. Basically, every big company out there is figuring out how to incorporate AI into their products and then how to build new products with agentic capabilities as well. 2025 really marks the beginning of agentic AI products. Okay, so you might be thinking now, what if I am not a very big company? Where are the opportunities for me if I want to build my own AI agent startup or integrate AI agents into my existing business? Well, luckily for you, there is an excellent roundtable discussion from YC that exactly answers this question. Their advice is to build within the category of what they call vertical AI agents, which is defined as specialized AI agents that are designed for very specific industries and functions. To explain this, they make a comparison to the SaaS boom, where a couple decades ago there was a lot of success for people who were building software as a service companies. There were three waves of SaaS companies. The first wave is what they called obvious consumer applications. These are basically the online SaaS versions of desktop software. Think your emails, your calendars, your documents like the Microsoft Office suite that was brought online by Google. In this category, initially there were a lot of startups that were popping up. But ultimately your incumbents, your actual big companies, won because they just brought their own products online instead. The second wave is what they call the not obvious consumer applications. These were new consumer behaviors that were enabled by mobile and cloud technology, things that were not obvious and really risky.
Some examples include Uber, Airbnb, Instacart, DoorDash, and Coinbase. Startups won in this category because they were not obvious and also really risky. So, your bigger companies wouldn't bother trying to take that risk. The handful of successful companies in this category became worth tens and hundreds of billions of dollars. And finally, there was the third wave, the third category, which they called the B2B vertical SaaS. This is by far the largest category, creating over 300 billion-dollar companies. These companies build specialized software for specific industries and functions. Think Veeva for pharma or Gusto for payroll, worth $47 billion and $9.5 billion respectively. Now the reason why there was so much value created by startups in this category is because each vertical had to be brought online, but your incumbents, your big companies, were not going to do it because it was just not really worth it for them. Each niche, each vertical required such specialized, specific knowledge that the return on investment of just like $50 billion is not worth it for your Google and your Microsoft. And that's why if you were a startup at that time and you focused on building a vertical SaaS company, the chances of you succeeding would have been relatively high. Okay, let's now bring this back to the AI agents boom that we're seeing today. We're already seeing the results of wave one. There were a lot of startups that were doing really obvious agentic things like taking meeting notes, sending emails, and doing calendar stuff. But pretty much by now, most of these startups have been squeezed out already because companies like Google are simply integrating AI agents into their own software. And startups just can't compete with that. Wave two, your not so obvious consumer applications of AI agents, is starting to emerge right now. And I'm sure we're going to see some of these very soon.
But wave three, your B2B vertical companies, aka vertical AI agent companies, this is where the fertile ground is. Now is such a good opportunity to be looking at specific vertical niches and seeing how you can create AI agents in those niches. And actually in the YC roundtable discussion, these investors were predicting that these vertical AI agent companies can be 10 times bigger than their SaaS counterparts because vertical AI agents can not only replace the software portion of it, they can actually replace the people who are operating the software as well. An example they gave is an HR-specific AI agent. With a tool like this, you can easily imagine a CEO being able to personally call up their 1,500 employees and have personalized, meaningful conversations with all of them to make sure that they're happy with their roles and being able to help them with whatever it is. This literally takes away the need for entire human teams, potentially the entire HR department. The market for vertical AI agents isn't just a company's software budget, it's a company's software budget plus the payroll of the entire team that they're replacing. So yeah, TL;DR, if you want to build your own startup, your own business, or expand your current business's capabilities, you should be building vertical AI agents. I'm going to put on screen now some examples of YC companies that are in the vertical AI agent space for inspo. Okay
so before I move on to the next section, I also want to give a couple little tips for how to get started with your vertical AI agent company. The first thing that you should do is choose a vertical niche that you have domain knowledge in. Be that law, accounting, marketing, teaching, whatever. The more deeply you understand the space, the more likely you are to be able to build a really good AI agent. Step number two is to identify the repetitive, administrative, and boring tasks. And ask yourself, how do I use an AI agent to automate these? I had the really amazing opportunity to talk to some of the product teams, even have some one-on-ones. And the advice they gave to me is to think about the advantages that AI agents have. For example, they're available 24/7. They're cheap and you can do a lot of personalization. So, how do you use these advantages to enhance your AI agent? For example, the founder of Sweetspot had a friend whose job was to manually refresh a government website all day. So, they decided to create an AI agent to do this instead. It's cheaper, it's faster, and it can do it
24/7. All right, I'm going to put on screen now a little quiz. Please answer these questions and put it in the comments to make sure that you retain all the information we just talked about. Assuming now that you're hyped up
to build your vertical AI agent, let's now move on to the next section about the crucial skills to learn for building agents. I want to start off with a disclaimer. There are many skills that you need to build AI agents. And I actually have a video which I'll link over here that goes into a lot more depth about what these skills are, but there were two skills that came up over and over again across a lot of different workshops and talks. And those two were prompt engineering and writing evals. Just to make sure we're on the same page, prompt engineering is defined as the process of designing and refining the input instructions or prompts used to interact with generative AI models. Prompt engineering in general, in my opinion, is the highest return on investment skill that you can possibly learn just like in general. But specifically for building agents, it's even more important. And that's because of two reasons. The first one is that everything like all the instructions that you're giving an agent has to be in a single prompt. You can't iterate on it like you can with a chatbot. So you have to be very clear, very detailed, and very precise. The second reason why prompt engineering is so important for AI agents is because you need to be able to balance between clear instructions and flexibility. So the reason why you want to be using an AI agent in the first place is because you want it to be able to autonomously do things, right? This inherently means that you can't exactly know all the things that it's going to do. So you have to have like this fine balance between giving it enough instruction so it's able to like perform a variety of behaviors but not too much instruction that you're inhibiting its autonomous abilities and you're just ending up with like a normal workflow. It's a very fine line. 
General guidance from Anthropic's prompting for AI agents workshop is to start with a short prompt and then immediately start experimenting and testing different use cases. Like your initial prompt can be as simple as "search the web to answer the user's question." Then you need to start thinking about all the weird things that people could be entering into your AI agent. Test them out and see what your AI agent does and take those results to refine the prompt to get the behavior that you ultimately want. So, I personally think that this is very good high-level mindset type of advice, but for me, I prefer more like frameworks and things that are a little bit more concrete. So, I figured if some of you guys are more like me, I would share my own six component framework that I usually use when I'm prompting an AI agent. And that is defining a role, a task, input, output, constraints, and capabilities and reminders, all within one prompt. I'm going to put on screen now an example of a research assistant agent whose job is to summarize the latest news and trends in AI. This is something that I actually personally use to create these videos. Here's another example. In another YC roundtable, Parahelp, which is one of the YC funded companies, very generously also showed their actual production prompt, which is of course even longer and even more complex, including things like step-by-step reasoning, as well as using markdown and what are called XML tags in order to structure the prompt more clearly. I'm not going to go into too much more detail about these advanced prompting techniques, but I really recommend that you actually check out the YC video over here. It has a lot of nuggets of insights for you to improve your prompt engineering for agents specifically. So, as I said earlier, I was at the Code with Claude conference in San Francisco a couple weeks back where Anthropic dropped their newest models, Claude Opus 4 and Claude Sonnet 4.
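Going back to the six component framework for a second: here is a minimal sketch of what keeping all six components in one prompt could look like in Python. This is not the actual prompt shown on screen. The `AgentPrompt` class, its field names, the `render` method, and all of the example wording are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPrompt:
    """Hypothetical container for the six components of the framework."""

    role: str
    task: str
    input_spec: str
    output_spec: str
    constraints: list[str] = field(default_factory=list)
    capabilities_and_reminders: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Return all six components as one prompt, one labeled section each."""

        def bullets(items: list[str]) -> str:
            return "\n".join(f"- {item}" for item in items)

        sections = [
            ("Role", self.role),
            ("Task", self.task),
            ("Input", self.input_spec),
            ("Output", self.output_spec),
            ("Constraints", bullets(self.constraints)),
            ("Capabilities and reminders", bullets(self.capabilities_and_reminders)),
        ]
        return "\n\n".join(f"# {title}\n{body}" for title, body in sections)


# Illustrative values only; a real agent prompt would be longer and more detailed.
prompt = AgentPrompt(
    role="You are a research assistant who tracks news and trends in AI.",
    task="Summarize the most important AI developments from the past week.",
    input_spec="A topic or question from the user, given as plain text.",
    output_spec="Five bullet points, each with a one-sentence summary and a source link.",
    constraints=[
        "Only use sources published in the last seven days.",
        "Say so when you cannot find a source instead of guessing.",
    ],
    capabilities_and_reminders=[
        "You can search the web.",
        "Double-check dates before including a story.",
    ],
)

print(prompt.render())
```

Keeping the components as separate fields makes it easier to change one part, say the constraints, after testing without touching the rest of the prompt.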
Watching the live demos, especially how they used it with Claude Code, their command line tool, was pretty crazy. They showed extended reasoning, tool use, and real-time code edits. Opus 4 is the world's best coding model with sustained performance on complex long-running tasks, leading industry benchmarks like SWE-bench at 72.5% and Terminal-bench at 43.2%. And Sonnet 4 is a major upgrade from Sonnet 3.7, offering a great balance of speed, intelligence, and cost. So, I've been testing out these models myself on real world coding and research tasks. And I really got to say, the sustained performance on long-term complex projects is really, really impressive. If you haven't already, I really recommend that you try out the Claude 4 models today on Claude.ai, with Claude Code, or with your favorite IDE/vibe coding tool at this link over here, also linked in description. Thank you so much Anthropic for sponsoring this portion of the video. Now back to the video. Let's move on now to the second skill that everybody was talking about in the conferences, which is writing evals. Writing evaluations. So what are evals? Evals, otherwise known as evaluations, for AI agents are a structured way to measure their performance across various tasks or scenarios. Prompt engineering and writing evals go hand in hand. Like you can write a prompt and you think it's good, but you don't actually know if it's good unless you evaluate the results of it. Right? If you're just personally using AI and just interacting with a chatbot or whatever, you can usually just eyeball the results of it to evaluate whether you need to change your prompt or not, but it's a completely different story if you're building an AI agent that's going to be used by lots of different people with a lot of different scenarios. You need to be very thorough in testing the behavior of your AI agent to make sure that it's doing what it's supposed to be doing.
Especially when it comes to edge cases, things that you may not have predicted off the top of your head. Some common categories of evals include task completion: did the agent actually complete the task? Planning and reasoning quality: were the steps that the agent took logical and efficient? Tool use accuracy: was it using the tools that it has correctly? Robustness: what happens when it comes across an error, whether that be an input error or a processing error? And latency and efficiency: how fast and efficient is your agent running? You need to create a lot of different test scenarios to evaluate your AI agent. But luckily for us, many companies have eval tools to help us with this. Some tools include OpenAI's eval tools and eval platform. What is really interesting is that a lot of companies now consider their evals to be like their crown jewels, their actual IP. That's why Parahelp, for example, was okay sharing its prompt, but I'm sure if you ask them for their evals, there is no freaking way they would give you their evals. And the reason for this is because evals are the direct representation of the actual successes and failures of a user's workflow. You need to have a really deep understanding of the domain and what people are using your AI agent for to be able to create good evals. And only with that are you able to improve your AI agent over time. There's so much more that I can talk about evals as well and just like all the other skills for building AI agents, but there is not enough time for this video. So I really recommend that you check out the video that I made over here which goes into a lot more depth. And also, I do want to let you know that we'll be restarting our AI agents boot camp for the next cohort in the next few weeks. It's a very hands-on boot camp, and you're going to be creating a minimum of four different AI agent systems and deploying them in production, too.
Last time we did it, we sold out within 40 hours and just through the waitlist. Like, we never even publicly announced it. So, this time around, if you are interested in joining, please do join the waitlist because the waitlist gets priority.
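To make the eval categories from this section a bit more concrete, here is a rough sketch of a tiny eval harness in Python. It only touches three of the categories (task completion, robustness, and latency), and it is not how any particular eval tool works. The `stub_agent` function stands in for a real agent call, and `EvalCase`, `run_evals`, and the test cases are all hypothetical names made up for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One test scenario: an input plus a check applied to the agent's reply."""

    name: str
    user_input: str
    passes: Callable[[str], bool]


def stub_agent(user_input: str) -> str:
    """Hypothetical stand-in for a real agent call (no model is invoked here)."""
    if not user_input.strip():
        return "Please enter a question."
    return f"Summary: {user_input.strip()}"


def run_evals(agent: Callable[[str], str], cases: list[EvalCase]) -> list[dict]:
    """Run every case, recording pass/fail and wall-clock latency in seconds."""
    results = []
    for case in cases:
        start = time.perf_counter()
        try:
            passed = case.passes(agent(case.user_input))
        except Exception:
            # An agent that crashes on an input counts as a failed case.
            passed = False
        latency = time.perf_counter() - start
        results.append({"name": case.name, "passed": passed, "latency_s": latency})
    return results


CASES = [
    # Task completion: did the agent produce a summary at all?
    EvalCase("task completion", "latest AI agent news", lambda r: r.startswith("Summary:")),
    # Robustness: an edge case where the user enters nothing useful.
    EvalCase("robustness: empty input", "   ", lambda r: "Please enter" in r),
]

results = run_evals(stub_agent, CASES)
for result in results:
    status = "PASS" if result["passed"] else "FAIL"
    print(f"{status} {result['name']} ({result['latency_s']:.4f}s)")
print(f"{sum(r['passed'] for r in results)}/{len(results)} cases passed")
```

In practice the list of cases is where the domain knowledge goes: every weird input or failure you see from real users becomes another `EvalCase`, and you rerun the whole list each time you change the prompt.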
All right, I'm going to put on screen now a little quiz that will test your knowledge of this section. Please put it in the comments. Moving on now to career
advice. In the LangChain Interrupt conference, their CEO Harrison Chase sort of hard launched a new role called agent engineer. And that's because he's convinced that we've just begun to experience the power of AI agents. Like we've experienced a minuscule amount of what AI agents can bring. He argues that there's still a really big gap between being able to build an impressive demo of an AI agent and being able to get it to work reliably in a production environment, which is what you need to be able to integrate into products at scale. That's why an agent engineer needs to have four primary skills to be able to create robust agents. The first one, no-brainer, is prompting, the ability to interact with your LLM effectively. The second is traditional engineering, being able to build reliable systems and data pipelines. These are traditionally skills of software engineers and data engineers. Third is product. This represents the need for domain knowledge to be able to understand users' workflows to automate and put into an AI agent. And the fourth is machine learning, a skill that is traditionally found in data scientists and data science researchers. And the reason for this is because to build good AI agents, you need to have good evals and an understanding of statistics and non-determinism. I just want to say that this is like a full circle moment for me, like super surreal, because I started off working in software engineering and then I was a data scientist at Meta, and at that time data science was like this cool fancy new role that, you know, combined together so many different fields, and now we have the agent engineer that's combining the skill sets of data scientists and software engineering and product and like yeah, it's just like a new role. I guess the world is evolving.
Anyways, I don't know if this specific role is going to be the title, but I definitely see Harrison Chase's point because these are certainly skills that are needed to build good AI agents and there certainly is increasing demand for AI agents as well. So, for those of you interested in the field, start digging into these domains and these skill sets. Hey friend, I just wanted to let you know that I'll be giving a free talk on how to prepare for careers in the age of AI in collaboration with IBM. It will be on June 25th at 1:00 p.m. Eastern time. If you complete any of the spotlight activities, you'll also earn an official IBM SkillsBuild digital credential to put on your LinkedIn profile. IBM SkillsBuild will also be giving out 300,000 free Granite tokens, too. Visit ibm.biz/TinaHuang to sign up today because spots are limited. And I will see you on June 25th. Now, back to the video. Here is a
the future. So, first up, I want to share a very practical step-by-step framework for decomposing a business workflow to automate with your AI agent. Step number one is to observe. Look at what people are doing in the business process. Step number two is to decompose. Break down their workflow into smaller manageable tasks. Step number three is to map the flow. Identify how these tasks are related to each other and try to draw out all the different branches and interactions. Step number four is to prototype. Build an initial simple version of the workflow that you just mapped out in an agentic way. If you don't know what that means, I go into a lot more detail in this video over here. And step number five is to evaluate and iterate. Write your evals and use your evals to identify where it is that your agent is not behaving the way it should be and then iteratively improve through your prompt, changing out different models, giving it different tools, etc. And then finally looking into the future. I really like this talk from Andrew Ng in the LangChain Interrupt conference. He was basically asked what are the three things that people should focus on? And the first one is prompt engineering and evals. No surprises there. The second one was voice agents. He thinks that it's a very underrated thing right now because voice agents can dramatically make the experience of your AI agent much better. So look into those. Yeah, AI voice agents are cool. And the third is coding. So Andrew Ng is very adamant about everybody learning how to code. He said that even his receptionist knows how to code and he thinks that, you know, because coding is even easier now with the ability to do vibe coding, even more people should learn how to code. So, I just want to give you his opinion right now. I'm going to share a little bit of my opinion. Take it however you like.
But in my opinion, having worked with so many people through the agents boot camp and consulting and things like that, I think that it is true that to build AI agents, you do need to have coded implementations. However, I don't think that everybody needs to learn how to code because I also think that there is a large number of people who have very good business knowledge and really great domain knowledge, and they can understand the fundamentals of AI agents and be able to use no-code tools to build out demo versions and prototype versions of things. It is perfectly reasonable for them to hire out engineers who can actually implement the coded versions into their businesses. So, that is my personal opinion. All right, that is all that I have for you
guys today. Here is the final little assessment to test your knowledge of this section. Please leave it in the comments. And thank you so much for watching until the end of this video. I hope it was helpful. And I will see you guys in the next video or live stream.