Exclusive: Inside the Best AI Model for Coding and Writing | Scott White (Anthropic)
48:07

Peter Yang, 09.03.2025, 7,465 views, 169 likes, updated 18.02.2026
Video description
My guest today is Scott White. Scott is the Head of Product for Claude at Anthropic, and I’m excited to be the first to interview him about the hybrid reasoning AI model Claude 3.7 Sonnet. Scott also gave me a behind-the-scenes look at how Anthropic builds AI products, the most essential skills in the AI era, and how Claude will evolve from assistant to agent.

This episode is brought to you by Vanta. Join 9,000+ companies like Atlassian and Quora, and use Vanta to manage risk and prove security in real time. Get $1000 off at https://vanta.com/peter

Scott and I talked about:
(00:00) How Claude will evolve from assistant to agent
(01:58) Inside Claude 3.7: The first hybrid reasoning AI
(04:51) How Claude became best in class at coding
(08:50) Top 3 ways that Scott uses Claude to build Claude
(13:49) PMs can now design and collapse the talent stack
(17:00) Will AI models become commoditized, and Claude's personality
(22:34) How to work with AI researchers
(27:20) Step by step how Scott builds AI product
(32:37) Why every PM must master writing AI evals
(35:53) If Mike Krieger has brought Instagram's principles to Anthropic
(38:41) How to get hired at Anthropic
(43:01) Flipping the script on Claude's future

Get the takeaways: https://creatoreconomy.so/p/inside-the-best-ai-model-for-coding-claude-scott-white

Where to find Scott:
LinkedIn: https://www.linkedin.com/in/scottiewhite/
Website: https://claude.ai/

📌 Subscribe to this channel – more interviews coming soon!

Contents (12 segments)

  1. 0:00 How Claude will evolve from assistant to agent (402 words)
  2. 1:58 Inside Claude 3.7: The first hybrid reasoning AI (505 words)
  3. 4:51 How Claude became best in class at coding (761 words)
  4. 8:50 Top 3 ways that Scott uses Claude to build Claude (1023 words)
  5. 13:49 PMs can now design and collapse the talent stack (656 words)
  6. 17:00 Will AI models become commoditized, and Claude's personality (1006 words)
  7. 22:34 How to work with AI researchers (918 words)
  8. 27:20 Step by step how Scott builds AI product (961 words)
  9. 32:37 Why every PM must master writing AI evals (643 words)
  10. 35:53 If Mike Krieger has brought Instagram's principles to Anthropic (539 words)
  11. 38:41 How to get hired at Anthropic (822 words)
  12. 43:01 Flipping the script on Claude's future (1026 words)
0:00

How Claude will evolve from assistant to agent

I think we're still just getting started. This last year, Claude sort of felt like a very capable assistant: something that you have to guide to a specific outcome. You have to know exactly how to prompt it and give it very specific information to get what you want. I think in the next year we're going to flip the script a little bit, and it's going to feel a lot more like Claude is what I like to think of as a capable collaborator. To me, that means it's moving up almost a career ladder on a few dimensions. Can it actually take things off my plate? Can I feel like I can delegate meaningful tasks and meaningful work to this capable collaborator, as opposed to telling it exactly what it needs to do? That's what I think things should start to feel like this year. And it's not just one feature that will do that. I think it's the confluence of many capabilities, both model and product, all moving in that direction. It means that on the dimensions of its knowledge, its ability to communicate, and its ability to do stuff for you, it's making really meaningful strides this year, so that it doesn't feel like an assistant anymore. It feels like something that's helping you solve your biggest problems and most pressing, timely needs, and meaningfully taking things off your plate. That's how I sort of think.

All right, welcome everyone. My guest today is Scott, the Head of Product for Claude, my favorite AI product, and I'm excited to be the first to interview Scott about Claude 3.7 Sonnet, Anthropic's brand new hybrid reasoning model, and also talk about how Scott and team build products inside Anthropic. So welcome, Scott.

Thank you, Peter. I'm excited to be here, and thank you for your advocacy for Claude. It's been awesome to see some of the things that you built, like the Star Wars video game. It's very cool to see the kinds of things that people are building with the new model, so I appreciate you having me here.

Yeah, it's magic, man. Especially for a PM who doesn't really build stuff himself, it's very magical.

Yeah, absolutely.

But let's talk
1:58

Inside Claude 3.7: The first hybrid reasoning AI

about the new model. Maybe you can tell us a bit about it and what you're most excited about.

Yeah, absolutely. So 3.7, which launched this week, is our first, we call it a hybrid reasoning model. It can produce instant responses really quickly, but it can also do step-by-step thinking for longer-horizon or more complicated tasks. It's a little bit of a different philosophy from the other reasoning models on the market, kind of similar to how humans have thinking fast and slow, like two separate brains, for questions that should either be answered immediately based on knowledge and intuition, or others that require much deeper thought and pausing to really think about something. So we have these toggles, both within the user experience on claude.ai and through our API, where you can either make it answer really quickly or toggle on what we call extended thinking, so you're directing the model to go a lot deeper on particular problems. That allows people to really dial in how long and how hard they want this thing to think. It's also really useful for API customers who want to dial in the budget-versus-thinking frontier, instead of just having one very dynamic and unpredictable output in terms of how long or hard it'll think. So that's one of the big categories of differences that we're really excited about.

The other thing is that we've really tried to make it valuable for what I think of as practical and useful customer use cases. So it's optimized somewhat less for competition math and computer science problems or puzzles, and somewhat more towards real-world tasks that we think are more similar to what large language models are used for, especially in the workplace. I think that is a closer reflection of our needs as well, and we think we're kind of representative of our customers. A great example of that is production software engineering and coding. 3.7 Sonnet is really good at that, and that's something that's awesome for our customers. It's also awesome for us, because we do a lot of software development, both obviously to train these models, but we also build products like our API and claude.ai. So there's this virtuous cycle: having a model that's really good at coding allows us to build better products faster. It's also been really good at things like visualizations, which has actually been surprising to us and me, and that's been particularly helpful in my role. So those are a few of the things that I get really excited about.

Yeah. You know, I joke that Claude
4:51

How Claude became best in class at coding

doesn't have all the bells and whistles, like maybe voice or some of that stuff. But from a practical perspective, 80% of my use cases are either writing or getting into coding a little bit, right? I think that's a really important point, man. But how do you actually train the model to become good at these normal use cases?

Yeah, it's interesting. It's not so dissimilar from traditional product development. I think it starts with having a very strong vision of what you want to be great at: what problems do we want this thing to solve, who are our target customers, what are the kinds of use cases that are indicative of the problems that we want to solve? That's both across our API customers, being really close to what they're trying to solve in the market, and internally: us building something like claude.ai and Claude for Work allows us to have really tight feedback loops with what our customers are trying to do and what we are trying to do. That gives us this kind of shared intuition about: okay, what's our vision, where do we want this to go, what are the key capabilities that we think we need, and what are the categories of problems that we need to excel at? Then that helps us prioritize specific things like reinforcement learning environments and very specific kinds of data and use cases that we want to be really good at in terms of training the model. And we are not so dissimilar from our customers, so it's actually really important for us to dogfood our own early generations of these models and figure out what are the things that are blocking us from being successful. One of my old bosses was Andy Rachleff. He's a professor, and he was the CEO of Wealthfront, and he used to say that the only thing that matters is if the dogs are eating the dog food. And, well, we're sort of the dog, and it's critical that we want to eat our own dog food. So that's been an incredibly helpful thing for us, to ground our model in the problems that we practically want to solve, and also working very closely with our customers on that. Having that feedback loop between go-to-market, product, and engineering stakeholders, with deep research involvement through the entire process, helps us pipeline these learnings and match them with the vision of where we're trying to go. The mechanics of how you train a model end up being the key difference, but fundamentally the overall process ends up feeling like product development, just with the new nuance that we control this model that also integrates into our product. It's just a new way of working.

This episode is brought to you by Vanta. Whether you're a startup founder navigating your first audit or a security professional scaling your GRC program, security has never been more critical or more complex. That's where Vanta comes in. Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001. Vanta can help you start or scale your security program by connecting you with auditors to conduct your audit and set up your program fast. Plus, with automation and AI throughout the platform, Vanta gives you time back so you can focus on building your company. Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and prove security in real time. Get $1,000 off at vanta.com/peter. That's V-A-N-T-A dot com slash peter for $1,000 off. Now back to the episode.

Yeah, I really believe that a rapid feedback loop is key: with your customers, dogfooding the product every day yourself. I feel like a lot of PMs don't do this, man. They don't use their own product, they don't talk to customers. I guess with Claude it's, like, [unclear] the job; that's the most important thing. And if you're not using Claude or
8:50

Top 3 ways that Scott uses Claude to build Claude

one of these other products, you're basically not being very effective as a PM at this point. You really should use this stuff to become way more effective, right? I mean, I use it for everything.

Yeah. I think that there's using your own product, and then there's using new technology that can help you be better at the core role that you're doing. But a fundamental thing I think every product manager should be really great at and prioritize is being the expert on their own product. For us, the product is both what we're building in terms of the user interfaces and the APIs, but the product is also the model. So we of course need to be on the front lines of using our own model and asking what's blocking us from being successful. I think that's really critical. That's the beauty of working at a place where you are the customer of your own product: you can have those feedback loops so rapidly.

So, using Claude for your own job, what are some of the most common use cases for the normal Claude versus the reasoning Claude? How do you think about that?

Yeah, that's a good question. The normal Claude, without extended thinking, I use a lot for document creation and document summarization: a bunch of the things that I do around creating PRDs, creating evals based on the PRDs, and implementing those evals into the PRD. These are the kinds of things that Claude's really good at without extended thinking, though there are some times where I will use extended thinking even with that. There are instances where I have a project that I've set up with all of the product requirements templates, documents, and the eval structure that we use, and sometimes I'll dump in a ton of different documents about a particularly hairy, large initiative, and I'll use extended thinking to try to connect the dots a bit between all of these things. I have noticed that helps. But a lot of the daily tasks of summarizing meeting notes, identifying action items, and creating documents and PRDs, these are things that I can do without extended thinking.

Then I think an area where we have noticed a lot of value from extended thinking is coding use cases. We built a product that we launched this week as well, in a research preview, called Claude Code, and that's something that's become integral to the way that we're working internally. A lot of people have noticed extended thinking being valuable in that use case. So that's a good example of where us being our own customer has helped us ship something that we think is also exciting to the market, but it has become so indispensable to us, and it is changing the way that we work. It's cool, because even yesterday I was at my desk and I noticed a bug where we had really bad contrast on a toggle in the UI, and there was a designer sitting three seats away from me. I just walked over to her desk and I was like, "This is an issue. Can we quickly choose a color that we actually want the toggle to have, so we fix this contrast issue?" And we put out a fix in under a minute, in a way that just wouldn't have happened even six months ago. So these are some of the practical ways that things are changing internally.

The last one that I'll mention is visualizations, which have actually been surprising. There are some instances where you're like, "We really want to make it good at this thing," and you point all of your efforts towards doing that. And then there are other times where you just end up getting surprised after the thing has baked, and visualizations are kind of an example of that. It's kind of intuitive, because Claude is very good at coding, so it can code good visualizations, but it's kind of surprising to us. So now there are a lot of ways, from a product perspective, that I think we'll capitalize on that and try to ship stuff that leverages that more. But also internally, in my use case, I end up building a lot of visualizations using extended thinking to try to make it clear what the user flow that I'm trying to work towards is in a product spec. That's been a new way, something I wouldn't have done two years ago. I wouldn't have had the time to do it. But now I do it, and I even see engineers doing it, because they're so passionate and there's so much bottoms-up energy here. They now have this new tool in their toolkit to communicate their vision to me and to designers, and that helps us close feedback loops between us internally on what we want to build.

So when you say visualizations, you mean like flow charts, user journey charts, or actual mocks and stuff?

Actual mocks, like both: Mermaid diagrams to show what a user flow should look like, but also full React components to say, "This is what I actually think the page should look like," and clickable prototypes of going through those experiences. I've been very surprised at how good Sonnet 3.7 is with extended thinking at that.

I do that
13:49

PMs can now design and collapse the talent stack

a lot too. I'm not sure how my designer counterpart feels about this. Usually what happens is there's a Figma design, and then I go get some AI tool and say, "Hey, can you make it better? Can you prototype this?" And I show it to her, and she's like, "This is a great job, Peter." I'm not sure. How does your designer feel about PMs showing the mocks or prototypes?

Yeah, I mean, we're all super collaborative here. Like I was saying earlier, there's this interesting intersection: as people have more of these tools, more of their workflows overlap with each other and give them new ways to collaborate. We have way more designers putting out pull requests for things that they just want to improve from a quality-of-life standpoint or a craft standpoint in the product experience. We have more engineers and product people creating visualizations. So there are just these new touch points. We have more salespeople basically writing product specs, because they are hearing feedback directly from the field and they have access to all of our templates. So there are these touch points that exist that traditionally wouldn't have, and I think we're learning to embrace them as opposed to being defensive. These are just new ways that we can collaborate, and I think it's actually quite fun. I get inspired when I see engineers create visualizations. It even happened this morning: before coming into this discussion, an engineer put out a really compelling mock of something, and I think we might ultimately go and make some product decisions based off of that later today. So these are just new ways of us working together, and they get me super excited.

I totally agree. I think Scott Belsky at Adobe said the talent stack is collapsing. It used to be that product teams had all these roles, and as a PM you could only write internal documents or something. But now everyone can do a little bit of everything. I mean, that makes the job so much more fun, you know?

Yeah, absolutely. And there's a component of: I'm not going to be the production engineer who's actually getting everything down the stack and creating the full flow, but I can create a small prototype of something just to get the ball rolling. I think of static friction versus rolling friction, right? Static friction is so much harder to break, and once you break through that, there's so much less friction in going farther down. Which is why, often, when you're creating a new company or something, even just creating the website, even creating the landing page, helps you get started. I think these new tools just help us get started and build momentum on things faster between teams. So while the talent stack is collapsing, I think it's not that we are all doing all of each other's jobs and everyone's doing the same thing. It's more that we have more ways to break through the static friction, and that makes me super excited.

Yeah, I totally agree. Keep the ball rolling downhill, basically. Makes sense. So let's talk about the model a bit more. There's been a lot of talk about how all this model stuff is becoming commodities, and then a week later someone ships a great model, right? But I actually think
17:00

Will AI models become commoditized, and Claude's personality

what's really unique about Claude is its personality. It just feels nicer and more pleasurable to talk to than some other models. How do you think about this commoditization thing, or even the model personality?

Yeah, good question. I think there are two components of that, and they do overlap in some ways, so I'll go through both. The way that I think about the commoditization question: in the last couple of years there's been a lot of experimentation. Businesses are trying to figure out what's the best way to deploy AI: what parts of our stack are we buying, what are we building, what is our overall strategy? I think they've oriented themselves now, and they're savvier about how to deploy models at different tiers of the product experience and internal stack. Overall, what I've seen is businesses taking a multi-model approach versus choosing just one. For companies who build internal tools, one thing that we've seen is that they basically build this tool with their own model selector, which you can access through things like AWS and Vertex, and that enables their employees to switch between models as they need, choosing whatever is the best model for that particular use case within the internal tool. But ultimately there are also just use cases that different models are really amazing at. Software engineering and coding is one that we continue to be really excited about, and a lot of companies will adopt Claude for Work, or now Claude Code, or a different solution like Cursor, and end up using our model for software development, while they might be using other models for other use cases internally. Then there are also situations where there's a cost component: we have a model family, right? We have Claude Haiku, and we have Sonnet, and we have Opus. There are different use cases that they have internally across different models that leverage different performance and cost thresholds. So we've just noticed there are many vectors for choice.

We also build products, not only make our models available through APIs, where there is already a lot of choice. We have new product experiences through claude.ai and Claude for Work that people really love to use, that are of course built on our own models. So when you're deploying Claude Code, you're using our models to write your code. When you're adopting Claude for Work for sales use cases or writing or marketing use cases, you're using our models there, and that creates a lot more of a customer relationship and stickiness. There are also a lot of companies that are adopting both at the same time in a more holistic partnership with us, and those are really great customer relationships, because we get feedback loops on so many different dimensions. So I think there are just a lot of vectors of choice, and there are a lot of opportunities to be partnering on specific areas or vectors. So I'm not super concerned about that point.

One of the vectors that you talked about, to parlay it a bit, is personality. It's hard to have evals against this, something where you're hill-climbing on being a good-vibes personality, but it's something we take super seriously. Amanda Askell here is kind of the Claude character leader that we have, the guru genius behind Claude's character. We steer Claude's character and personality to have specific character traits that we want to encourage the model to have. These are things like curiosity and interest in multiple perspectives, striving to tell the truth, open-mindedness, and self-awareness. These are traits that we want to be reflected within Claude, and that's something that helps people adopt it and feel like it is a capable collaborator with them. And it is mercurial. You probably couldn't say why your friend is your friend. It's not because they have a really cool job or they're good at cooking food. You just like them, and some people just like Claude. That is not only a driver of how people adopt claude.ai or Claude for Work, because it is that capable, trusted collaborator or adviser, but it also drives customers through our API, for whom that character is really important to pass through to their end customers. Companies like Intercom have switched over to us, DoorDash, Lyft. These are companies that want to have a really strong customer relationship with their end customers, and that personality also drives that decision-making for them. So it's something that is both very important to us as a company, in terms of us building a model that embeds those traits and characteristics, which we think are adjacent to our constitutional AI perspective, and important for our customers as well. So I think that is a point of differentiation.

Yeah, I totally think it is. I don't really care if the model's really good at math competitions. I care that it's nice to me and that we can work together.

Yeah, absolutely, I feel the same way.

Okay,
22:34

How to work with AI researchers

let's talk a little bit more about how you guys build products internally. At Anthropic, and I'm sure at other companies, research is a big part of the job, right? The researchers improve the model. So how do you actually work with them? Is it like, "Hey, when is a new model coming out?" or "When should I get ready for this?" How do you work with the researchers?

Yeah, great question. If we think about traditional product development, it's been talked about as the three-legged stool: product, design, and engineering. But I really think it's now a four-legged chair. Research has to be there through the entire product development life cycle, just like you would have design or engineering, because we do have agency as a company to change the model. That is one thing that is unique and special about us, so it would be foolish for us not to be collaborating super closely with our partners in research to do that, and for research to treat us like a customer, because we are ultimately an adopter of the model and they can help make the model better for us. So there's this virtuous loop: we want to solve problems, and the research team will help us do that, but the research team also wants to make the model better, and we are their customer.

I think there are a few key ingredients to making that relationship and the product development cycle actually work. The first, I mentioned this earlier, is a consistent vision for where our product and our company are going: what are the capabilities that we need to make Claude good at to make it the most useful and safest? That shared vision about where Claude is going as a capable collaborator, a virtual collaborator, is important to ground our collective thinking, so that we have a little bit of shared language and we're prioritizing the same big chunks of things.

The second thing is de-risking research as early as possible and building very early prototypes: having a constant portfolio of bets from a research standpoint that we can get surprised by and then work towards productionizing. Things like Artifacts and computer use are examples of that. Being open to surprises is very important, so that we can capitalize on them. That's kind of the bimodal thing here: we have that shared vision and we work towards very concrete shapes of things that we collectively prioritize, but then we also take a lot of risks and are very open to surprises. Balancing those two things is an important ingredient for success.

The last thing I would say, and this is very similar to traditional product development and we've talked about it a lot, is feedback loops and having these stakeholders be on the team. We have researchers who are embedded in our product teams, so that we can be pipelining all of the learnings that we're getting into that team: understanding what's early and successful from the research side that we should be capitalizing on, but also finding the gaps that we are seeing in our attempts to build stuff, which we then need to pipeline over to research to make sure that we're actually addressing them before we end up shipping new models or new products that are interdependent with the model. That's the really unique thing: we're not just building models, we're building model features, and there's a lot of interaction between the feature and the model to make the model good at powering the capabilities of the feature. That's a unique Anthropic thing that we're super excited about. Having that connectivity means you really need single-threaded ownership through the process on research: someone to be a product expert on what you're trying to do, and then translate that into the reinforcement learning environments that we need to actually address the failure that we're seeing on the product side. I probably don't need to be that much of an expert, but having the researcher on my team, I can give them the shape of the problem that we're running up against, and then they can translate it. Having that end-to-end connectivity has been something that I've had to learn. It's a new skill, a new way of working, but it's been so much fun, and it's also been really transformational for how we build stuff.

Yeah, because I think everything ultimately starts with what the customer is trying to accomplish, right?

Yeah, absolutely. You have problems, and then you just have new tools to solve those problems, and then you have this new language of evals to be a proxy for how good you are at actually solving that problem. Once you have that common language, you can work towards the same set of outcomes.

So let's make this a little bit more
27:20

Step by step how Scott builds AI product

practical. I feel like you probably have a way of building a new product now that I think 99% of PMs can learn from, especially if they have access to Claude, right? You mentioned how you put everything into a project first or something. Can you walk through how you actually build a product?

Yeah, absolutely. I can talk about something like Styles, which is a feature that we launched a few months ago that helps tailor Claude's output to a specific style, whether it's concise, whether it's my writing style, or whether it's more explanatory. I think the first step is like any traditional product: you might have a product manager getting a lot of feedback on this being the kind of thing that we're excited about building, and then you would build a product spec based on that. I use a project to house all of the product spec templates that we have internally, and then I put in the customer feedback and speak to Claude through dictation on my mobile app, ranting about my vision for what this thing is: the problems that I'm trying to solve, the evidence for these problems existing, the fundamental product decisions that we need to make. Then I'll build out the fundamental early spec.

But the new thing that is really important to do after that is build a set of evals that are indicative of us actually doing the thing. As part of this Styles product, we also built something called user preferences, which is custom instructions that should, in every chat, guide Claude towards what you're trying to do. The way that I thought about building evals for this is honing in on specific use cases for this feature and building evals that are indicative of those use cases: things like Claude being a domain expert, so tailoring technical explanations to match your knowledge; output format templates, like how do I want Claude to respond in specific formats; or learning goals. These are specific use cases that we might want to solve with this product. Then you build actual evals against these, with real user preference strings that I would put into this product, which is just a simple set of custom instructions. So: what's a real preference string that I would put in there, and what are the prompts that I would use to evaluate whether it's actually solving the problem that I'm trying to solve? Then you have Claude actually grade the outputs on whether it is using the preference string and combining that with the prompt I'm giving it, to give me an output that actually tailors Claude's response to my domain expertise level or the output format that I'm asking for. So it's getting the use case into an eval framework that Claude can understand and grade for you, and then iterating on that until it is ultimately good at that thing. Now you have the eval language that the team can use to start iterating on things like the system prompt, or the model and its ability to be good at those things.

The really important other thing that it does: there's this whole new world of how we're doing product development that kind of feels like continuous integration and continuous delivery. In traditional software, you build a product and you write tests for your product, and then when you're deploying changes to production, you run your tests, and your tests either fail or pass. Now we have this whole evals thing: we're designing a new model, or we're updating our system prompt, and we have to make sure we're not breaking these features. So now we have these evals that we can run in the future to say, "Oh, is preferences still working?" And the reason I bring up preferences is because when we were training 3.7 and about to launch, it broke preferences.
and we discovered that before we uh we released it um because we had written these evals and so then we had to go back and make adjustments um to make sure that the feature now still continue to work in the next generation of the model so that's an example of how you take like a feature like a set of user problems and then make it a set of like use cases that then translate into evals that you then use to iterate you know as you get towards the release of the thing to make sure that EV valves are working then you establish those evals as part of a regression set through your continuous like future development Loop because as you build AI products like new features and new models end up interacting with each other in all of these uh like unpredictable ways and you need that common ground to be able to um to make sure that you're building a high quality experience and so that's those are a few of like that's one journey to building it and some of the new flavors of things that I think PMS need to get um upskilled on if they're going to be building AI products um uh can you give
32:37

Why every PM must master writing AI evals

a quick example of the eval for the Styles product? For example, you have an explanatory style or something, and so basically you write some ground truths, or you try to get Claude to grade it? An example makes it very practical.

Yeah, exactly. That would be a good example. You have an explanatory style, and you say: here are 10 prompts and outputs that I would want the explanatory style to actually provide on specific topics, like GraphQL and how it works, or how big the galaxy is. There are specific prompts that you write, and then you ground them in real truth. You can do it that way to measure what percent of the outputs actually match the ground truth. But you can also write a grader prompt, where you basically get Claude to evaluate, based on the intent of this feature, whether the output is fulfilling that intent. That is a way, when you don't have ground truth, to just get a proxy for how good the thing is. So instead of grounding the explanatory style in ground truth, you could also just get Claude to say: is it actually providing a good explanation based on what it is trying to do? Or, if we have a concise style: is it giving a concise answer while still maintaining a lot of clarity and answering the question? Grade this from 1 to 10, something like that. Once you have those grades, you have a new eval, and that is more of a synthetic eval as opposed to a ground-truth one.

So you get the AI to evaluate itself.

Yeah, that's right, with a lot of human oversight, because you're writing this prompt and making sure that it's representative of the use case.

Yeah, I've been building an AI product too, and in some ways it's nice, because now the PM can actually contribute to the product: the prompt is a key part of the product, and the evals are a key part of the product. But it can also get really painful, man. It's not like you just write requirements and you're done, because every day the ground is shifting beneath your feet.

Yeah, the ground is shifting, and again, there's this new world to me of model CI/CD: the ground is shifting, and you need observability into when it shifts so that you can jump in and make changes. So it's this new thing that you constantly need to have your eye on as a PM. I agree with you, it's a new challenge, and there will probably be new tools and new processes. CI/CD happened for engineering, and now it's the standard for how engineering teams work, in terms of test-driven development, test failures, automated test suites, and unit, integration, and end-to-end tests. I think there will be this new language for how you build model-driven products.

I mean, when I build apps with Cursor, there are no tests, man. I just go for it. There are no tests.

Well, hopefully it's writing the tests too.

Yeah. Okay, real quick, I want to talk about Mike Krieger. He joined recently.
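The eval workflow described in this segment (real preference strings paired with prompts, a grader that scores each output from 1 to 10, and a regression check when the model changes) could be sketched roughly as follows. This is an illustration only, not Anthropic's tooling: `EvalCase`, `run_evals`, `regressions`, the `must_include` field, and the two stand-in models are hypothetical names, and `toy_grade` is a keyword check standing in for a grader prompt that would ask Claude for the score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    use_case: str      # e.g. "domain_expertise", "output_format"
    preference: str    # a realistic user preference string
    prompt: str        # the prompt used to exercise the feature
    must_include: str  # hypothetical field used only by the toy grader


CASES = [
    EvalCase("domain_expertise", "I am a beginner; avoid jargon.",
             "How does GraphQL work?", "in simple terms"),
    EvalCase("output_format", "Always answer in bullet points.",
             "How big is the galaxy?", "- "),
]


def toy_grade(case: EvalCase, output: str) -> int:
    """Stand-in grader returning a 1-10 score.

    Assumption: a real setup would send a grader prompt to a model instead.
    """
    return 9 if case.must_include in output.lower() else 2


def run_evals(cases: List[EvalCase],
              generate: Callable[[str, str], str],
              grade: Callable[[EvalCase, str], int],
              threshold: int = 7) -> Dict[str, float]:
    """Return the pass rate per use case for one model or system prompt."""
    counts: Dict[str, List[int]] = {}
    for case in cases:
        output = generate(case.preference, case.prompt)
        passed = grade(case, output) >= threshold
        bucket = counts.setdefault(case.use_case, [0, 0])
        bucket[0] += int(passed)
        bucket[1] += 1
    return {name: p / t for name, (p, t) in counts.items()}


def regressions(baseline: Dict[str, float],
                candidate: Dict[str, float]) -> List[str]:
    """List use cases whose pass rate dropped relative to the baseline."""
    return sorted(name for name, rate in baseline.items()
                  if candidate.get(name, 0.0) < rate)


def old_model(preference: str, prompt: str) -> str:
    """Hypothetical model that follows the preference string."""
    if "bullet" in preference:
        return "- point one\n- point two"
    return "In simple terms, here is the idea."


def new_model(preference: str, prompt: str) -> str:
    """Hypothetical candidate model that ignores the preference string."""
    return "Here is a dense technical answer."


if __name__ == "__main__":
    baseline = run_evals(CASES, old_model, toy_grade)
    candidate = run_evals(CASES, new_model, toy_grade)
    print("regressed use cases:", regressions(baseline, candidate))
```

Run as a script, this flags both use cases as regressed for the candidate model, which is the kind of signal described above for catching a broken feature before a release.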
35:53

If Mike Krieger has brought Instagram's principles to Anthropic

I really love his principles around doing the simple thing first and doing fewer things better. Has he brought similar values to the product at Anthropic?

Yeah, absolutely, for sure. I think there are lots of areas where he has pushed us to double down and put more wood behind fewer arrows, especially on that sense of pragmatism I talked about earlier, the practicality of the model and the product. From that ethos comes making this thing indispensable for us, because we believe we are indicative of the kinds of customers we also want to work with, both on Claude for Work and our API. That helps with a lot of things: it creates closer feedback loops for us to make the model better, and it helps us build product features that we think will be valuable for ourselves and our customers. I think coding is a great early example of this. Mike has helped us hone in on doing the things that will make us want to use Claude, and on how that translates to making us successful in the market. That's a very clear way I have seen him drive towards closing the loop between us and our customers, which I think has really accelerated us on a number of dimensions.

Yeah, it's a privilege to build a product that you actually enjoy using yourself. I don't think every PM gets that opportunity.

Yeah, and that probably came from Instagram, right? He was building Instagram, he probably used it every day, and he saw the magic in doing that. You develop such a good intuition for your product, and that helps everybody at the company have great ideas. I think there's so much bottoms-up adoption and energy here, and that's another thing he has brought. He's traditionally been an engineering leader in previous roles, and he has such a good spirit for all of the bottoms-up energy that comes from teams like engineering, but also other teams. He fosters a lot of that. Having the combination of that energy, plus shared intuition, plus an objective to make this thing so much better for ourselves, creates all this amazing energy of bottoms-up ideation, iteration, and adoption. I get constantly surprised by crazy prototypes and things that people build, which then influence our product roadmap. And it makes my life easier, because I don't need to come up with all these ideas.

Yeah, you've got to keep the startup vibe alive. You can't get into this big tech situation where all the PMs call all the shots. You don't want that kind of thing.

Yeah, absolutely. You're curating vibes; you're not doing waterfall product development.

And if someone wanted to join a product team at
38:41

How to get hired at Anthropic

Anthropic, what skills do they need, or how should they get ready?

Yeah, great question. I'll talk first about overall things that I feel are pillars for Anthropic generally: putting our mission first, working very independently but also collaboratively, and committing to AI safety as a really core value. You should be in this for the right reasons; that's something we believe really strongly in. I'll give one that is my personal philosophy, which is relentlessness. We're in a really competitive market, in an industry and a time of rapid evolution, and I think we really need to show up. So there's a component of energy, relentlessness, and urgency that I think is really important for any role, especially product management at Anthropic. The last thing here is a one-team mentality. We've talked about the stool being more important and interconnected than ever, the product development stool, and now it's more like a chair, with research being a key part of it. That makes the interconnected nature of these teams, and the one-team vibe, more important than ever, because there's this new thing that we need more feedback loops on to improve. So having a global one-team mentality is really important for us here.

Outside of that, for product in different teams there are different skill sets that I would call out. If you're looking for something in enterprise, there are things like customer centricity and cross-functional skills; being really involved and invested in go-to-market, and being able to work with sales teams and get excited about working with customers, is really important on the B2B side. On the growth and consumer side, there's a marketing aspect, a data familiarity aspect, and an iteration loop with rapid development cycles that you probably want to have some experience and skills in, or want to flex your muscle in. And on the core product development side, like model feature development, upskilling on things like evals is really important: learning that language, learning that framework, and understanding how it applies to product development. If you're going to end up building model features, meaning a feature that has to closely use a model to drive towards a specific outcome, I think that's a really important new skill that PMs should be developing.

Yeah. You know, I've been doing a lot of interviews at different companies, and I feel like the best PMs just have core values that match the company. It's not about knowing how to answer some trivia question or product sense question. It's about: do you actually care about customers? Do you care about the craft? It's the values, right? That's what actually matters.

Yeah, and I think ultimately, if you're doing something that excites you and that you want to do, you're just going to do it better. You're going to want to put yourself into it more and do an excellent job. And working someplace where you look at the leadership team, or the values and mission of what they're trying to do, and that deeply resonates with you, is just going to make your life better too. It's going to make work feel more meaningful to you, in the way that I want my work to feel meaningful to me. So yeah, I think that's an excellent point.

Yeah, I want to work at a company where I'm not thinking about the career ladder. I just think about how to make this product really good, and then the company takes care of the career ladder, you know.

Yeah, and I just want to have fun. I also want work to be fun. I want to have fun with good people, so I think about that as well.

Yeah, you've got to get Claude to be good at generating memes, then it'd be more fun internally.

There's a lot of meme generation internally. This is a very meme-forward culture, more so than I've ever experienced.

Got it. Okay, last topic. When you guys published the new model, there was this really awesome chart.
43:01

Flipping the script on Claude's future

It says Claude assists in 2024, Claude collaborates in 2025, and Claude pioneers in 2027. Maybe you can give us a preview of what that means and what's next?

Yeah, absolutely. If I think about the evolution of how we've built Claude in my experience here: I've been here for a bit over a year now, and when I started we were very small. We were on the Claude 2 family of models, we had no API, we had no mobile app, and we were not available in many countries. It's been a crazy 15 months from when I started to get Claude adopted at such scale around the world and through businesses, but I think we're still just getting started. This last year, Claude felt like a very capable assistant: something you have to give information to, guide to a specific outcome, and know exactly how to prompt, with very specific information, to get what you want. In the next year we're going to flip the script a little bit, and I think it's going to feel a lot more like Claude is what I like to think of as a capable collaborator. To me that means it's moving up almost a career ladder on a few dimensions. What is it able to know, and how does it expand the horizon of its information and really understand your world? How is it able to communicate with you in novel ways, like a collaborator would? And ultimately, what is it able to do for you? Can it actually take things off my plate? Can I delegate meaningful tasks and meaningful work to this capable collaborator, as opposed to telling it exactly what it needs to do, taking the input, and then going and doing it in whatever other process I'm working in? How can I give the keys to Claude so that it actually takes the thing off my plate? That's what I think things should start to feel like this year. It's not just one feature that will do that; it's the confluence of many capabilities, both model and product, all moving in that direction. It means making really meaningful strides this year on the dimensions of its knowledge, its ability to communicate, and its ability to do stuff for you, so that it doesn't feel like an assistant anymore. It feels like something that's helping you solve your biggest problems and most pressing, timely needs, and meaningfully taking things off your plate. That's how I think about it.

It's interesting to see the tech industry's middle management layer get challenged, but at the same time I feel like in the future we're all going to be managing these models, or these AI agents. So it's interesting, right? It's two different forces that are kind of coming together.

Yeah, I think there is a component of collaborating with this thing and exercising your judgment. As you start to delegate more tasks to it, with more information, it might come back to you with questions, and you might want to work through it like in a meeting where you're brainstorming a particular solution, and then you start to hand more of it off. So I do think it'll eventually feel more like collaborating in a one-on-one, like having a one-on-one doc with Claude, as opposed to a synchronous chat experience.

Awesome, Scott. It sounds like you're having a lot of fun there. I'm sure it's intense. Any closing words of advice for people listening who want to get into AI product or build stuff?

Yeah, try to build stuff. I think intuition is important; I've said that a number of times here. Start making it very concrete: how do these models work, how do I implement with them, how are the tools people are adopting in the market changing? Just go try to build something fun, something that's timely or pressing in your life, or just fun, like a game, or something that helps you with your tasks at home or at work. Just go try to build something, and I think you'll develop the intuition for what it feels like to prompt models, to realize that what you've built isn't working, and that now maybe you need to think about evals for it. Putting the puzzle pieces together by actually trying to build something is, I think, the fastest way to get intuition and to learn whether you're really excited about this. And if you are, there are a lot of cool opportunities.

Yeah, you've got to have the curiosity, and you've got to have the time to tinker with stuff.

Yeah, absolutely.

Well, I've got to say again: I have a job and I try to do this podcast stuff, and I would not be able to do all this without Claude's help. Claude helps me summarize a bunch of interview notes, helps me come up with the questions, helps me with the transcripts, a lot of things, man. So thank you for building this, and let's stay in touch. I hope to see some new stuff coming soon.

Yeah, absolutely, and thank you. We couldn't be doing this without people like you using it and loving it, so I appreciate the advocacy.

Awesome, Scott.
