Today, I want to share a new episode with James Evans.
James is the Head of AI for Amplitude, the leading product analytics platform. We had a great chat about why nobody has built a good AI analyst yet and why he’s betting on AI agents that can monitor your product 24/7. James also shared some real talk about whether AI will displace PM and data scientist jobs.
Timestamps:
(00:00) Why nobody has built a good AI data analyst yet
(03:32) The biggest problem with product analytics today
(06:13) Live demo: AI agents that monitor your website 24/7
(21:48) The hardest parts of building AI analytics products
(25:21) How to evaluate AI agents that run experiments
(32:53) AI product pricing strategies that actually work
(35:42) Should data scientists worry about their jobs?
(38:04) Non-obvious advice for building AI products
Get the takeaways:
Where to find James:
LinkedIn: https://www.linkedin.com/in/james-evans-7086b3126/
Website: https://amplitude.com/ai
📌 Subscribe to this channel – more interviews coming soon!
James is the Head of AI for Amplitude. "Why do you think nobody has figured out how to build an AI-powered data scientist or analyst yet?" "There are all these trends: AI is going to be our crystal ball and it's going to magically pull out the trends. I think of it much more as unlimited time. Certainly there are some things these agents find that humans may not have stumbled upon, but most of it is just doing the same things you can do in Amplitude as a human if you had 100 hours. It's just that most people don't." "Everyone gets all hot and bothered about this: should agents be able to ship stuff without a human in the loop? My personal take is you can create a ton of value just by being in this middle state where the agent can work in the background and produce insights and ideas, but then a human has to click approve." "Should we be worried about our jobs as data scientists or PMs? Maybe I shouldn't say this, right?"

Okay, welcome everyone. My guest today is James Evans. James is the Head of AI for Amplitude, and I'm really excited to talk to him about how AI will transform analytics. James will also demo how anyone can have a team of AI agents for analytics, and we'll talk about how PMs and data scientists can get ready for this future. So welcome, James.

Thanks for having me.

All right, let me start with a spicy question. We've seen AI completely transform coding, with Cursor and all these other tools. But why do you think nobody has figured out how to build an AI-powered data scientist or analyst yet?

Good question. 80% of my answer is: I just don't know. I think there's a chicken-and-egg problem.
With a lot of larger SaaS companies, the DNA is such that if you haven't launched something AI-related already, expectations are really high for your thing to be great, which actually makes it harder to launch anything, because these products are inherently bad at the beginning and you just have to believe in the direction and iterate on it with your customers. So at a high level I think that applies to a lot of SaaS. For analytics, maybe the difference between analytics and some of the tools you mentioned is that it's a much more multimodal job. Text generation in various forms is not really the hard part. For example, just like in traditional product analytics, data quality is the bottleneck for good insights. That's also true for an AI entity doing analytics, and there are hard data science problems that people have been working on for a long time that underlie that constraint. Then on the action side, when it comes to generating an experiment to run on your website or shipping a customized survey, I think you actually have to have the human workflow down, the existing software, to do a good job. You can't just rely on an AI agent to generate code, hope it works, and then try to give it feedback; it's like talking through a straw. You have to have dials to change the branding and all that. So it's actually been a pretty big effort to solve both the data constraint and the workflow constraint to get an analyst to work effectively. But obviously everyone thinks the problem they're working on is the hardest, so that's my caveat.

Got it. Let's put AI aside for a second. What do
you think is the biggest problem with product analytics today? Just the analytics space in general.

The biggest problem is that it takes time. There aren't that many people whose jobs are to live in tools like Amplitude and squeeze out insights. That's one of the really interesting things I've seen talking to customers. I've only been at Amplitude for seven or eight months, and some customers are insanely good at using Amplitude. A large part of our job working with customers is: how do we take what we see those customers doing and get all of our customers to use Amplitude the same way? To know when to use it in their flow, to make sure their taxonomy is great, to know what kinds of actions tend to drive impact. The really cool thing about building this agents product is that we're shipping software and users of the software, instead of relying on the people at our customers to learn our software and learn the best practices. They only have 40-ish hours a week, and while we'd love them to spend all 40 in Amplitude, the reality is most of them don't. So in some ways this launch is just giving them unlimited time. I think a lot of people model AI in data analytics specifically as superintelligence, unlimited intelligence: there are all these trends, AI is going to be our crystal ball and magically pull them out. I think of it much more as unlimited time. Certainly there are some things these agents find that humans may not have stumbled upon, but most of it is just doing the same things you can do in Amplitude as a human if you had 100 hours. It's just that most people don't.

Yeah.
I work on an analytics product too, and the power users can filter, break down by age, break down by all this stuff, do a bunch of group-bys, and then spend an hour writing a bunch of SQL queries to try to figure out what's going on. But most people either don't know how to do this or don't have time to do it.

Totally, and you kind of just skip over that problem. There's that great blog post by Sarah Tavel, "Sell the work, not the software." I kind of think of it as sell the software and the work, because the reality is that if our agents product didn't connect to our existing software, to let people understand how the agent is reaching a conclusion or tweak the experiment, it'd be pretty painful to use. But I really like the construct of selling the work.

Yeah. Maybe you can show us
Live demo: AI agents that monitor your website 24/7
this agents product.

Yeah, let's do it. So Amplitude agents are basically, like I said before, AI users of Amplitude: AI specialists using Amplitude to achieve a specific goal. One of the key design decisions we made relates to a concept in our space that we very much believe in at Amplitude, called the self-improving product. The idea is that a lot of the use case of Amplitude is making tweaks to your website, app, or product to make it better for customers. Today that loop is: you use Amplitude to come up with an insight, like "there's friction on this page." Then you maybe watch some session replays to confirm it. Then you create an experiment, launch it, monitor it, learn from it, and go again. Basically, you have a decision between a monolithic agent that optimizes across all your pages and all your KPIs at once, or goal-specific agents. We've chosen the goal-specific path because we think it maps better to how humans work in companies today, where each person and each team owns something specific, and then they can create agents to help them with that specific space. So, for example, these are templates that let you instantiate agents focused on specific goals: this one is a classic e-commerce use case, cart abandonment; this is web conversion, conversion for a web page; this is feature adoption for a self-serve onboarding flow. You can imagine the first one is a growth e-commerce team, this one is a marketing team, this one is a product team. But at the end of the day, these things are AI entities that have access to a set of tools, and they try to use those tools to achieve their goals.

Got it. Do you want to show us one?

Yeah, let's make one from a template. You don't actually have to do much to create an agent.
They kind of sit on top of what's already in your Amplitude account. So all I need to do to create this web conversion agent is tell it what to optimize. I can just select "CTA Clicked," which is just someone clicking a CTA on our website. Then there's the question of autonomy. Everyone gets all hot and bothered about this: should agents be able to ship stuff without a human in the loop? My personal take is you can create a ton of value just by being in this middle state where the agent works in the background and produces insights and ideas, but a human has to click approve. Imagine if the workflow shifts from doing all this stuff in Amplitude to generate an experiment, to just getting pinged: "Do you approve this experiment?" Maybe a bit of feedback, then yes. I think that's 95% of the way there.

Yeah, totally agree.

So now let's create the agent. Non-deterministic demos are always fun.

This is for a particular website you have live already?

For the Amplitude website. Okay, cool. So it's giving us the plan it's going to follow: look at the different pages in Amplitude, pick one to focus on for the initial experiment, go off and find an insight, maybe confirm it with session replays, then pick a way to capitalize on the insight and hopefully launch it. Then it goes into Amplitude, finds the relevant events and data, and tees up the pages it thinks matter most. Let's do the pricing page; that's obviously a very important page.

So behind the scenes it's running some queries to get the data?

Yeah, we basically have a bunch of tools to grab data and then analyze it, plus a prompt like, "Hey, find the most trafficked pages and get the conversion rates."
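The setup James walks through (pick a goal template, pick a metric to optimize, choose an autonomy level) could be modeled roughly like the sketch below. All names and fields here are illustrative assumptions, not Amplitude's actual API:

```python
from dataclasses import dataclass, field
from enum import Enum

class Autonomy(Enum):
    SUGGEST_ONLY = "suggest_only"      # surface insights, take no actions
    HUMAN_APPROVAL = "human_approval"  # draft experiments, a human clicks approve
    FULL_AUTONOMY = "full_autonomy"    # ship changes without review

@dataclass
class AgentConfig:
    template: str                      # goal template, e.g. "web_conversion"
    target_metric: str                 # event to optimize, e.g. "CTA Clicked"
    autonomy: Autonomy
    monitored_pages: list[str] = field(default_factory=list)

# The "middle state" James advocates: the agent works in the
# background, but a human approves every experiment before launch.
agent = AgentConfig(
    template="web_conversion",
    target_metric="CTA Clicked",
    autonomy=Autonomy.HUMAN_APPROVAL,
    monitored_pages=["/pricing"],
)
```

The enum makes the autonomy spectrum explicit: the interesting product decision is the middle value, not the endpoints.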
The other nice thing about having templates is that you can brief the agent. It's kind of like if you hired someone just to optimize web pages: you'd say, "Here's what conversion typically means. Here are different definitions of conversion. Here are classic friction points." So for example here, what the agent has concluded is that there's a bunch of dead clicks on a particular part of our page. The reason the agent knows about dead clicks is that this is the web page optimization template. Dead clicks are a classic problem for a lot of web pages, so one of the things we tell it to look for is dead clicks. It's not that we rely on the foundation model to know this stuff. We put it into the templates.

Yeah. I think having a team of agents makes it much simpler to prompt and evaluate.

That's the whole reason we do it. Imagine you jammed in a list of a thousand things you can do in Amplitude. Even though context windows are getting huge, there's a great piece I read recently about how people think larger context windows kill RAG, but actually you can't just jam in a thousand times the context and expect the prompt to work as well. That's certainly what we've found.

Okay, so let's see. Summary of the friction analysis: dead clicks, a high volume of dead clicks on the plan headers. Let's see what this means. This is our pricing page, and if I click here, nothing happens. This is real. We haven't fixed this intentionally yet, because it's a good example of what the agent can find. But I promise you, I didn't plant this problem.

Yeah, it looks like a blue link.

Exactly, it's our brand color. Okay, so the problem is real. Now let's see what else. Okay, session replays.
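The dead-click heuristic the template encodes could be sketched as something like this: elements that collect many clicks but almost never produce a visible response. This is a toy reconstruction under assumed data shapes, not Amplitude's implementation:

```python
from collections import Counter

def find_dead_click_zones(click_events, responded,
                          min_clicks=50, max_response_rate=0.05):
    """Flag elements with many clicks but almost no UI response.

    click_events: iterable of (session_id, element_id) click records
    responded:    set of (session_id, element_id) clicks that triggered
                  navigation or another visible response
    """
    clicks = Counter(el for _, el in click_events)
    ok = Counter(el for sid, el in click_events if (sid, el) in responded)
    return sorted(
        el for el, n in clicks.items()
        if n >= min_clicks and ok[el] / n <= max_response_rate
    )
```

With thresholds like these, the pricing-page plan headers in the demo (lots of clicks, nothing happens) would be exactly what gets flagged.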
One of the things we constantly hear is, "We love having session replays, but it's a pain to watch them." I think we're doing a good job making that less painful for humans, but it's actually way easier for agents to squeeze insights out of session replays. This goes back to what we were talking about earlier: how do you allow the human to understand what the agent is doing? The way this works is the agent goes off, finds an insight in the session replay data, and then produces a highlight reel of clips that relate to the insight it found, so you can check its work. So this is a bunch of people clicking on the pricing page dead zone and then not converting.

This is actually pretty cool. This isn't just summarizing text; it's multimodal, right? How did you get the AI to analyze this?

A lot of hard work, honestly. I don't know how much we want to share on this, frankly. But it's a combination of multimodal models and more classic prompting. Okay, so it creates this highlight reel. If I wanted to, I could jump into the whole session to get more context. And then it comes up with strategies. This is the loop it runs all the time: insight, then strategies that act on that insight. In this case these are all web page experiments, but think of this as a generic action space. There are other things you can do in Amplitude: you can create a guide, like a popup; you can create a survey. We're exploring integrations now to enable other customer-facing actions, like sending an email.
And maybe in the future, we don't do this today, but maybe it's more of a collection of actions rather than just one. So, let's see: clickable plan headers. Basically it wants to run an A/B test where it makes the headers clickable. If I click in, I could give feedback like "I don't like this idea," but this is the right idea.

It's nice that it explains everything right below it.

This was actually a piece of feedback from customers. In the very alpha version, we'd just say, "Here's the right answer." My original thinking was that we were giving you a lot of information to assess whether you agree with the agent, but people weren't making the link between the insight and the suggestion. So we created this to call out that the suggestions come from two places: the insight, but also the template, the best practices.

Got it. You've got to make sure people trust this thing.

Exactly. It's not enough for the idea to be reasonable; you want it to be sourceable. One thing I'm thinking through is how to take this a step further and connect the best-practice part with content. We obviously have a bunch of content at Amplitude about how to optimize your PLG onboarding flow and things like that. And everyone asks about benchmarks: how do I get my agent to know when my pricing page conversion is good? It's a very sticky problem, I'm sure you think about this, because it would suck to give an agent a non-industry-specific benchmark. Then it thinks your conversion is totally fine when it's actually terrible, so it doesn't do any work on that page.

Yeah, makes sense.
So right now we set up this experiment flow manually, but is it always on the lookout for when something goes down, or do you have to trigger it yourself?

It's always on the lookout. These agents are persistent. This was the first run of the agent, but now it exists and it's monitoring this page, and I could tell it to monitor other pages. It's a bit hard to demo, but it would surface this end state, and the Slack message you'd get would be: "Hey, I think I found dead clicks happening on a specific part of the pricing page. I think this is a conversion opportunity, so I created this experiment. Can I run it?"

Oh, that's great.

Yeah, there's a whole art to deciding how it's triggered to look for opportunities, and we do that with a combination of the best practices I mentioned. Think of it as regularly looking for standard issues, and also being triggered by new data or anomalies. The classic anomaly would be conversion dropping, so it goes to investigate why.

Okay, so it monitors on its own; you set it up once. Why would I not want to just turn all the others on? Maybe a dumb question.

Look, I ran this example because I've run it before and I know there's a problem on our pricing page, and I haven't fixed it because it makes for a great example. So just to close the loop: I can now hover. The agent wrote the code to do this. A nice little subtle animation. And if I click, yeah, now when I click the card it takes me to the form.
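The anomaly trigger James describes ("conversion is dropping, so it went to investigate") is often implemented as a simple statistical check on the recent metric history. A minimal sketch, assuming daily conversion rates and a z-score threshold; the real trigger logic is not disclosed in the conversation:

```python
from statistics import mean, stdev

def conversion_drop_alert(daily_rates, window=14, z_threshold=3.0):
    """Return True if the latest conversion rate is an anomalous drop
    relative to the trailing `window` days (simple z-score heuristic)."""
    if len(daily_rates) < window + 1:
        return False  # not enough history to judge
    history = daily_rates[-window - 1:-1]
    today = daily_rates[-1]
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today < mu  # flat history: any drop is notable
    return (mu - today) / sigma > z_threshold
```

A trigger like this only starts the investigation; the agent still has to go find *why* the metric dropped and propose an experiment.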
It's not creating a whole website from scratch, but the bet is that a lot of websites have a lot of problems like this, and this one is actually a pretty high-leverage change because it's a pretty important page.

Makes sense. And going back to my question: why would I not want to set this up for every page?

Yeah, this is a great question. I obviously chose a page where I know we have a problem and I know it works. Most of the time, the expectation should be that the first time you create an agent and run through this flow, it's probably giving you a pretty obvious insight, or something you're already aware of, and you have to give it some feedback. This product is, maybe I shouldn't say it this way, but it's kind of an intentionally high-friction product. It's pretty heavyweight. This is not, by the way, the only AI we're building into Amplitude. If you want to ask a question of a chart, you shouldn't create an agent to do that. If you know exactly the experiment you want to run, you shouldn't create an agent to do that. You should create an agent to outsource a problem, and the bet is that it gets better with feedback. So to your question of why you wouldn't turn all of these on: if you turned on a bunch, it'd probably be similar to creating a ton of anomaly detectors, where you just get spammed and you'd stop paying attention to them.

Got it, that makes sense. This is awesome, though. I think simplifying it to very specific use cases is actually a really wise decision.

Yeah, it makes it easier to solve for feedback as well, because if we did the monolith it'd be really hard. Imagine everyone at a company is giving feedback to the agent:
it needs to know what feedback to pay attention to in which scenarios, and choosing domain specificity just makes that problem way easier.

So what was the hardest part about building this product? It doesn't always do what you want it to do, right?

Oh yeah. Let me just finish this first: now I can say, "Let's start off by rolling this out in Norway only." It's funny how often this happens. I don't know if you do this, but a lot of teams will pick a region where they test stuff first. Let's see... yeah, there you go. And now I can hit this very enticing button and run the experiment.

Nice.

So yeah, the hardest part.
The hardest parts of building AI analytics products
I mean, it's still ongoing, so we don't know yet if it's going to be successful. But I think the hardest part is that customers have very big expectations for what they want AI to do for them. There are a ton of data quality use cases, instrumentation use cases, and data science use cases. I think Eric Vishria has talked about this a lot, but building AI products, and I'd be curious if you agree from your experience, feels more like an exercise of "wouldn't it be cool if," and seeing whether today's models are actually good at something, than like a classic customer discovery loop. In my experience, a lot of customers don't have much intuition about what AI is actually good at, and they'll describe what they want as the crystal ball. I don't really think that's an easy product to build today, or one that makes sense to build. So it's more like building an MVP and testing it, versus a classic B2B enterprise SaaS customer discovery loop, which I think is hard for a lot of bigger companies. Maybe going back to your question of why we don't see more of this: I think it's hard for bigger companies to build product that way.

Yeah. What's the classic B2B loop? Just getting some enterprise people to give feedback, and they have a set role?

Yeah. Like, "Oh, it turns out people want this toggle. Okay, great, we built this toggle." I'm intentionally caricaturing, but nobody said, "Hey, I want you to build agents, and I want them to do it for these use cases." It feels more like a startup, where you go, "Oh man, this is how product analytics should work.
Let's see if people use it," versus spending a lot of time testing and building something that customers are basically telling you to build.

Yeah. I think people probably also have higher accuracy and quality standards for data products, right? Data is supposed to be very deterministic, but this...

Yeah, which is kind of why we've decided to focus on the insight-to-experiment problem. The bar isn't that all of our experiments are successful; that would be a stupid bar. It should have some successful experiments, that rate should hopefully increase over time, and we should hopefully still learn from the failed experiments. The thing we did hear from customers was, like I said before, "We just don't have enough time to formulate all the experiments, monitor them, set up the do-no-harm conditions, and all that." And that's frankly really easy for AI to do.

Yeah, it seems like it saves a ton of time. Without what you just showed me, I'd have to go look at the charts and hopefully spot some anomalies, then go watch session replays, connect it all in my head, set up the experiment, and try to optimize it.

Right, and that's not hard for people. Sometimes the insight part is hard, depending on data quality and what you're looking at, but in our experience, like I said before, it's more a question of time than genius.

Okay. Can you
talk a little bit about how you set up AI evals for this product, or for one of these agents? Was that hard?

We do a bit of eval. The challenge is what I said before: it's not really clear what a good experiment is. I mean, there are some universal truths. One of the things that happened in the early days of this product is that it was producing very subtle changes, border-radius tweaks and stuff like that. The kind of thing you look at and go, "Okay, I could run a thousand of those and it wouldn't do anything." It's partially a temperature problem, but part of the eval is: was there a substantive enough change? So there are definitely some truths, but the hard part is that it's not always clear that an experiment that fails is not valuable. So now that we have the product out there, we're prioritizing the success metric of: are people clicking yes? If you fail ten experiments but our customers still keep clicking "Yes, I approve that experiment," for us that's a win. If you click that button a hundred times and none of the experiments work, okay, at some point it's not good enough. But for now we're really focusing on whether people think the experiments are reasonable enough to ship.

So the experiment is the action that you're evaluating. Do you also check whether the insight is correct?

That's the classic problem with product analytics and insights: it's hard to put a value on an insight.
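The two eval signals James names, filtering out cosmetically trivial experiments and tracking the human approval rate, could be sketched like this. The list of "trivial" CSS properties and the function names are assumptions for illustration, not Amplitude's actual eval code:

```python
# Assumed set of cosmetic-only properties that signal a trivial experiment
TRIVIAL_CSS_PROPS = {"border-radius", "letter-spacing", "box-shadow"}

def is_substantive(changed_props):
    """Heuristic gate: reject proposed experiments whose only changes
    are cosmetic CSS tweaks (the 'border-radius problem')."""
    return bool(set(changed_props) - TRIVIAL_CSS_PROPS)

def approval_rate(decisions):
    """Fraction of proposed experiments a human clicked 'approve' on.

    decisions: list of booleans, True = approved.
    This is the product-level success metric, separate from whether
    the approved experiments ultimately win.
    """
    return sum(decisions) / len(decisions) if decisions else 0.0
```

The point of splitting the two checks is that a failed experiment can still count as a win for the product, as long as humans keep judging the proposals worth running.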
So one of the things we're working on is proactive clarification, which is a hugely important topic for this space, and I think especially for analyst agents. One thing an analyst agent can do when it produces an insight is follow up with you and ask, "Did this create value for you?" It remembers the insights it provides, so it can ask not just in the immediate aftermath of producing the insight but later on, to help you recall it. We're pretty early in the product, so we haven't had a lot of those loops yet, but we're trying to solve the tying-insights-to-outcomes problem by having the agents be proactively, even a bit annoyingly, clarifying.

Got it. And that feedback goes back into the product?

Yeah. Honestly, there are all these fancy techniques like LLM-as-a-judge, but a lot of it is just rallying your company to test the product. We do use those things; it's just that we're in the early stages and we're trying to prioritize getting something people use. Maybe going back to your original question of why we don't see more activity in this space: I think it's because people hold themselves to the bar of "it should be nailing every experiment." Our bar is: are people using this? Even if the value is "it gives me an idea for an experiment that I'm going to go create myself," it gives you a prototype to run with. I don't love the phrase, but people are calling it "vibe experimenting," and that's fine. I'm totally fine with that as a starting point.

Right. Yeah.
I love that you're focusing on experiments, because experiments are, at the end of the day, how you actually measure causal impact.

Yeah. And it's low stakes, because it's just an experiment. It's not n-of-1 personalization yet, which we could talk about, but I think there's a lot of low-hanging fruit in robots using software to run experiments and campaigns instead of jumping straight to n-of-1 personalization.

That's also a very good lesson internally. You don't want to oversell these AI products, because they take time to get right. You don't want to promise, "It'll go personalize the web page for every user," and then have to make that happen.

Right. I think coding tools did a pretty good job of this. At the beginning it was, "Oh, it's kind of nice, it gives me a nudge in the right direction," and now it's, "Oh, actually, Devin is just making PRs."

Yeah. And on the point about focusing on insight-to-experiment, insight-to-action, versus getting the model to come up with the insight in the first place: ChatGPT is pretty good if I upload my tax return or something; it's pretty good at finding the top expenses. But at Amplitude scale, for companies using Amplitude with all the data they have, I don't think you can just put the data in and have it come up with insights. I guarantee it does not.

And that's why that's not the product we shipped. We are thinking about MCP, and maybe GPT-5 will be good enough. I think that's part of the problem with building products with AI: the timeline of building enterprise products is longer than the model release cycle.
So if you plan a product on January 1 and you're going to release it in September, the models might not be good enough in January, but by the summer they actually are good enough to support the use case. We've seen dramatically improved performance over the past few months. I want to say it was in April; I forget which model dropped in April, but it was a really big improvement. And then you get to the classic lesson: part of the magic here is having intuition for where the models are getting better in your specific domain, and building while expecting the models to improve.

Yeah, or just wait a month and try again.

Yeah. You've got to be willing to throw things out. Maybe GPT-5 will be so good at "here's raw data, find insights" that we'll just build a product that does that, and that's fine.

And do you have a kind of blanket privilege to update to whatever the latest model is? Because in some companies that's hard to do too; there's cost and so on.

I think a failure mode is trying to model out today's models and cost structure at big scale. So we look at cost, obviously, but we're not constraining what we build with it. I'd rather build a product that's really useful and then say, "Oh man, this is pretty expensive," and whittle it down, versus, "We've got a really good COGS profile on this pretty useless product."

Yeah, that's the key. If it were up to me, I'd use a state-of-the-art model and then maybe rate limit or something.

Yeah. I think people get way too lost in the sauce on pricing for AI products.
A lot of the PLG influencer crowd have started talking about AI pricing as the new intelligentsia topic. I totally agree: just do rate limiting, just do fair use, and that eliminates a lot of these nightmare cost scenarios. This was true at my startup too, Command AI. We had an embeddable AI chat product, and the way we liked to price it was by MTUs, not by chats. With MTUs we took a risk, because if a user asks a bajillion complicated questions, we could lose money on that user and potentially the account. But our bet was: first of all, we're going to rate limit, so you can't abuse it; and second, we didn't want people thinking, "Is it worth rolling out the chatbot to these users, given it costs us X?" I just wanted them to use it maximally. I don't think we lost money on a single account. And it got used. What's an MTU, real quick? Monthly tracked user. So if you have a million users, we charge you for a million users, not for 50,000 chats. Okay, got it. Unlimited chats. Got it. That makes sense. I mean, the reality is it's hard for an AI product to find product-market fit. So if you have a "people are using it too much" problem, then that's... Yeah, exactly, totally a post-success problem. Worst case, you don't find a way to make it viable and you shut it down, and at least you've learned something and got the market's attention. But the reality is the cost curves come down. I'm curious how you guys think about this at Roblox. Do you have a dedicated budget, a ceiling on how much you can spend, a COGS profile? How do you think about it? No, I mean, there's budgeting, but for me, I always want to use the latest model, and again, it's TBD whether people even want the product or not. So let's find product-market fit first and we'll figure it out, right? Yeah.
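The fair-use approach James describes (unlimited chats per tracked user, with rate limiting as the cost backstop) can be sketched as a per-user request cap. Everything here is illustrative: the class, the limits, and the window logic are my own sketch, not Command AI's or Amplitude's implementation.

```python
import time


class FairUseLimiter:
    """Fixed-window rate limiter: cap AI requests per user per window.

    A cap like this bounds the worst-case inference cost of any single
    user, which is what makes "unlimited chats, priced per tracked user"
    viable. The default numbers are hypothetical.
    """

    def __init__(self, max_requests=200, window_seconds=86400):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._windows = {}  # user_id -> (window_start, request_count)

    def allow(self, user_id, now=None):
        """Return True and record the request if the user is under the cap."""
        now = time.time() if now is None else now
        start, count = self._windows.get(user_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired: reset the counter
        if count >= self.max_requests:
            return False  # over the fair-use cap for this window
        self._windows[user_id] = (start, count + 1)
        return True
```

The point of the bet is visible in the structure: cost exposure is `max_requests × cost-per-request` per user per window, a known ceiling, regardless of how chatty any one user is.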
I think it's cool that a lot of companies are actually taking this perspective on cost. I heard someone say a lot of CIOs are trying to accelerate AI adoption because they feel like they were too slow on cloud, which I think is very accurate. And also, you know, OpenAI just randomly decreased the price of o3 by 80% the other day. You wake up and it's like, "Oh, turns out our COGS are a quarter of what they were." Yeah. But then you upgrade to the latest model and you do more calls, and it's back up. It's kind of like how websites are now unusable on a 3G connection, because they assume a 5G connection: they fire off a ton of requests and aren't optimized for 3G. Yeah. Dude, so let's wrap up with some of these questions, right?
So, should we be worried about our jobs as data scientists or PMs anytime soon? The way we're thinking about it is: there are so many experiments that just aren't being run today that it's not about taking the experiments you are running and having them done by an agent instead. I'm sure some companies will think about it that way; if you're really strapped and you want to keep running experiments, yeah, an agent will be cheaper. But the human feedback is super important. Most companies aren't in a place where they're willing to let these things run autonomously for anything but toy use cases. So we're definitely taking the Iron Man suit framing: this makes our customers way better at their jobs, rather than replacing them. Yeah. And you know, I work with a lot of data scientists, and a lot of their job is, when I ask them to make a dashboard for my product, or start adding logging to a spec, that kind of stuff, and they hate doing it, man. Exactly. Very boring. Or Retool is popular for building internal tools; people don't like doing that either. Yeah. So if AI automates that stuff, that's great, because then the data scientists can focus on thinking. And our agents today, maybe our next-generation agents will do stuff like this, but our agents today aren't deciding what new products you should ship or what new markets you should enter. Sure, those are maybe in scope in the future, but it's a lot more about informing the board deck, and then the board makes the decisions. Maybe the board is just copying and pasting the PDF into ChatGPT and doing what it says. I love that meme, the "just do what the AI says" meme. Yeah, a lot of times I kind of bounce around with it and then I do what it says at the end, right?
So yeah, it's like that really popular tweet where you use AI to write an email, and then the recipient uses AI to decode the email back into a one-sentence summary. Yeah, I don't write internal docs longer than half a page now, because someone will just use AI to summarize it anyway. So what's the point? Yeah, totally. Um, so any closing words of advice for
folks building AI products, something that isn't the typical advice, not just "try some AI tools"? So what's the typical advice? Build assuming the models will get better and cheaper. Maybe mildly non-obvious, I would say: focus on what the models are good at. We have this tremendous ability to prototype anything. You paste it into ChatGPT: does it do a reasonable job? If so, with a bit of evals and RAG, you can probably come up with a good product. The thing I'd come back to, maybe as an expanded version of that, is that building good products in AI is much more about taking a "wouldn't it be cool if" mindset, putting a coherent product in front of someone, and asking, "Does this work for you?" Good example: Lovable and Bolt for UI prototyping. I wasn't hearing, "Oh man, UI prototyping for websites is the number one thing we think AI is going to be good at." Sure, the models were somewhat reasonable at it, but these products figured out they were good at it and made it exceptional, and now everyone loves those products, now Figma Make, etc. So I don't think people should be afraid to build toward a coherent product and then put it out to the market, versus iteratively testing. If you put something that's kind of crap in front of a customer and say, "Well, I know it's bad today, but don't you like the idea of being able to prototype stuff?", everyone's just going to say, "Yeah, if it's good." So don't be afraid to build until it's good before you put it out into the world; that's the advice I would give. Yeah, that's good advice. I mean, you can't be a perfectionist with this stuff, because it's never going to be perfect. Yeah.
And a lot of my job building this product is just, every day, asking, "Okay, how do I fix the prompt?" Yeah. The other piece of advice I'd give: there's this debate around how much of AI products should be chat-based. The meme last year was that everything is going to be a text box. With generative UI, the meme has sort of shifted away from the text box to "no, rich controls and workflows are needed." You should always have a text box, because it's better than any session replay: you know exactly what the person is trying to do with the product. It's so helpful to just go and see, "Oh, the user is fighting with the agent to get it to do that," and it's clear, because they're telling it that. Whereas if you look at a session replay of a software product, sure, we've got lots of clever ways of figuring out whether they're confused or angry or whatever, but you can't reach through the screen and ask, "Hey, what are you trying to do?" A text box is exactly that. Okay, so you think the text box and chat are here to stay, at least as one of the mediums? I think so, here specifically for the purpose of helping find product-market fit, because it gives you perfect insight into what the person is trying to do. And by the way, people are pretty good at prompting now; ChatGPT has kind of onboarded everyone into how to prompt, so you can rely on people roughly knowing what to do. Yeah. Just do the simple thing first, man; always give people a chance to say what they want. Yeah. All right, James. Well, as a fellow AI builder, a lot of this stuff is actually pretty frustrating, right? Just trying to iterate on the prompt and ship stuff.
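The "text box as telemetry" point amounts to treating every prompt as a first-class analytics event, so you can read user intent directly instead of inferring it from clicks. A minimal sketch with a hypothetical event schema; this is not Amplitude's actual SDK call:

```python
import json
import time


def log_prompt(user_id, prompt, sink):
    """Record a chat prompt as a product-analytics event.

    Unlike a session replay, the prompt text states the user's intent
    verbatim, e.g. revealing when they are fighting the agent.
    Event name and property fields are illustrative only.
    """
    event = {
        "event_type": "ai_prompt_submitted",  # hypothetical event name
        "user_id": user_id,
        "time": time.time(),
        "event_properties": {
            "prompt": prompt,
            "prompt_length": len(prompt),
        },
    }
    sink.append(json.dumps(event))  # stand-in for an analytics pipeline
    return event
```

In practice the "sink" would be whatever event pipeline the product already uses; the point is only that the raw prompt travels with the event so product teams can read intent directly.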
Despite what I may say on LinkedIn, it's not rocket science. Cool. So, where can people find you and the Amplitude AI agents? Please visit us at amplitude.com/ai. Okay, cool. And do you post thought leadership on LinkedIn, or where do people find you? Sometimes, though not as much as I probably should. Okay. All right, man. It's been an awesome conversation. Thanks so much, man. Good jamming. Yeah.