Try out the new model capabilities through Claude.ai & Claude Code on the Pro or Max plan: http://clau.de/tinahuang
#claudepartner @anthropic-ai
Want to get ahead in your career using AI? Join the waitlist for my AI Agent Bootcamp: https://www.lonelyoctopus.com/ai-agent-bootcamp
🤝 Business Inquiries: https://tally.so/r/mRDV99
🔗Affiliates
========================
My SQL for data science interviews course (10 full interviews):
https://365datascience.com/learn-sql-for-data-science-interviews/
365 Data Science:
https://365datascience.pxf.io/WD0za3 (link for 57% discount for their complete data science training)
Check out StrataScratch for data science interview prep:
https://stratascratch.com/?via=tina
🎥 My filming setup
========================
📷 camera: https://amzn.to/3LHbi7N
🎤 mic: https://amzn.to/3LqoFJb
🔭 tripod: https://amzn.to/3DkjGHe
💡 lights: https://amzn.to/3LmOhqk
⏰Timestamps
========================
00:00 — Intro
00:50 — Code with Claude Conference takeaways
02:20 — Claude Sonnet 4 & Opus 4 demos and updates
12:30 — Claude 4 prompting tips & prompt generator
13:39 — Quiz 1
13:51 — Claude Code vs. Firebase Studio vs. Windsurf
18:36 — Claude Code demo
21:46 — Using Claude Code SDK
22:31 — Pros & Cons
23:46 — Quiz 2
📲Socials
========================
instagram: https://www.instagram.com/hellotinah/
linkedin: https://www.linkedin.com/in/tinaw-h/
discord: https://discord.gg/5mMAtprshX
🎥Other videos you might be interested in
========================
How I consistently study with a full time job:
https://www.youtube.com/watch?v=INymz5VwLmk
How I would learn to code (if I could start over):
https://www.youtube.com/watch?v=MHPGeQD8TvI&t=84s
🐈⬛🐈⬛About me
========================
Hi, my name is Tina and I'm an ex-Meta data scientist turned internet person!
📧Contact
========================
youtube: youtube comments are by far the best way to get a response from me!
linkedin: https://www.linkedin.com/in/tinaw-h/
email for business inquiries only: hellotinah@gmail.com
========================
Some links are affiliate links and I may receive a small portion of sales price at no cost to you. I really appreciate your support in helping improve this channel! :)
Anthropic recently came out with two new Claude 4 models, the Claude Opus 4 and the Claude Sonnet 4, as well as major upgrades to Claude Code, which is their AI coding agent. So yeah, I am officially a Claude code fan now. Anyways, I was very honored to have been invited in person to the code with Claude conference in San Francisco where they unveiled these new models and all the cloud code stuff as well. So, in this video, I want to share with you guys what I learned about the new capabilities of these Claude 4 models and demonstrate some practical ways that you can use them on the Claw. ai UI as well as through Claude Code. I just wanted to say that this video is sponsored by Anthropic. I know I cannot believe it either. As per usual, it's not enough for me just to talk about stuff. So, throughout this video, there's going to be little assessments, which if you can answer, then congratulations. You're educated on the Claude 4 models as well as Claude Code.
During the code with claude conference, the anthropic CEO Daario basically said that they've pivoted away from building a generalpurpose chatbot in competition to like open AAI or Google, which makes sense. It's quite hard to compete with open AAI and Google. Instead, they're really focused now on creating the best coding models and that is what the cloud for models are. If you look at the benchmarks, you can see that Opus for Sonnet 4 far exceeds everything else in terms of software engineering specifically at agentic coding and agentic tool use. and they both have a 200,000 context window. So, I know that a lot of people are into the benchmarks, but me personally, I honestly don't really care that much about the benchmark. I'm much more into like what's the actual use that I can get out of these models. Let's first look at their subscription pricing. So, Anthropic does have a free tier, but honestly, it's like two prompts and you're done. So, realistically, you're going to have to either go for the Pro or the Max. With the Pro, it's $17 per month if you purchase the annual plan or $20 per month if you're build annually. the third tier which is the max at $100 per month and with that you get more usage limits and pro bunch of other things and here's an important one if you do want access to cloud code directly in your terminal you do need to have the max plan you also if you want to use advanced research and connect any context where tools through integrations early access events cloud features and priority access at high traffic times so without going into too much technical detail but if you want to be using a lot of the tools are available and take advantage of MCP to be connecting a bunch of different types of tools and context X2 Claude needed a max plan as well. Let's start off with some data
analysis. So I have this prompt over here and some bike sharing data. We will use the set for here. So what I added here is bike share data. So we can see that day. csv which has information per day from bike share and hour. csv which has information per hour for bike share. The prompt that I have is how should cities optimize bike sharing systems in 2025 based on usage patterns and what are the current best practices. I'm telling it to please examine the data structure first and develop a comprehensive analysis plan and execute it. Here I'm prompting it to go into the thinking mode and showcase extended thinking abilities. I also write for maximum efficiency whenever you need to perform multiple independent operations invoke all relevant tools simultaneously rather than sequentially. The reason why I wanted to add this part of the prompt is because the set for model should be able to invoke different tools in parallel and then efficiently combine them together. So I want to see that in action. Okay. So we see that there's a bunch of things that are happening in the artifact on the side and we'll check that out. Uh but first thing that we see is that it does examine the data structure first and it tells you its thinking process. So help you develop a comprehensive analysis of bike sharing optimization based on the uploaded data set. Let me start by examining the data structure and then create a thorough analysis plan. And it actually you can see the code as well if you want to as it analyzes the data. Now let me search for current best practices in bike sharing optimization to complement the data analysis. So now it's invoking the web search tool. Let me now search for more specific optimization strategies and recent technological advances. So more web search. It does more analysis like performing a comprehensive analysis of bike sharing data to understand usage patterns blah blah. And then over here it says let me recalculate the insights with proper variable scope. And this is actually doing it in parallel here. Then it combines everything together to create a comprehensive optimization report combining the data insights with current best practices and provides a summary which is what we see here in the artifact as well. So it has an executive summary and key recommendations. So this is nice. This is great. But maybe I have poor attention span and I cannot read. So what I'm going to tell it to do is please create this into an interactive dashboard. All right. So we have our little interactive dashboard now. So datadriven insights for smart city mobility optimization and it has these graphs that were created as an overview. Daily usage patterns. Key insight peak demand at 5 p. m. 461 trips per hour is 72 times higher than 4 a. m. low. You see that claw models are very logical and they do these calculations try to ground it in facts which I appreciate a lot. User distribution and you have usage patterns, seasonal usage patterns. We see that total average per hour mostly in the fall deployed 2. 1 times more bikes in fall versus spring and weekly usage patterns as well. And here are some optimization impact potential key optimization strategies like IoT powered fleet AI demand forecasting dynamic pricing smart rebalancing and user segmentation. So it goes on to have weather impacts and strategic insights as well which is pretty damn good. Like this honestly looks pretty good to me. And just because I am paranoid off camera, I did actually check these numbers from Cloud myself. So it is actually not just pulling these numbers out of its ass. These are actually correct. Now in comparison to Sonnet 3. 7, this is what we get. It's honestly not bad. As you can see, 5,600 plus daily rise and low usage and things like that. But you'll notice that it doesn't give you like specific detailed numbers. It would just say like show higher registered user activity. while weekends have higher casual writer participation. The numbers are still correct, but it's just not as insightful overall and not as detailed either. All right, so this is great. We see that Sonnet 4 has extended reasoning. It has the ability of running tools in parallel and it can create very nice dashboards. Okay, so I know that you might be like, it's not like that big of a difference. It's a little bit more detailed and it has like more numerical grounded facts. Fine. So, I do want to give you a taste for what the difference is when it comes to coding specifically to just really show you guys like how much of an upgrade it is. A couple months back, I did another Claude video where I gave this prompt to Sonnet 3. 7. Write a p5. js script that simulates an ant colony searching for food. Use pheromone trails and basic AI rules to show ants exploring and optimizing paths. Include controls the user can adjust in real time. So, first we're going to look at Sonnet 3. 7. So, it does all these things, blah blah, and then it has a popup here where you can actually see the ants here. So, you can adjust the number of ants. But you'll notice that for some reason when you adjust the number of ants, everything just kind of like moves around with it. And yeah, you can increase pheromone strain, pheromone evaporation, wandering behavior. You can increase the wandering behavior so the ants would wander around more. All this is fine. You can reset the simulation as well. Whenever you like add the food amount, it would kind of like glitch out a little bit here. Okay. And you can cancel obstacle mode. And you can also have add obstacle mode. So you can like actually add obstacles and the ants would not be able to like bypass these obstacles. So overall, not bad. Right now looking at from Sonnet 4, this is what it came up with. You have the ants and you can have number of ants. You can increase the number of ants. Ant speed. You can increase the speed. They're like zooming around really fast now. Pheromone strength. It has all the things that Sona 3. 7 has. But it can also say like add random food and you can add food just by clicking it which is something that you cannot do here. You can also toggle pherommones visibility. You can increase it toggle obstacle mode. So you can add like obstacles yourself. And you can actually rightclick to remove obstacles too which is I think pretty damn cool. And yeah you can clear all obstacles as well. You can just see that it made it in a way that is just a lot better. The experience is so much better for the user and it's just a lot more detailed as well. Compared to Sonic 3. 7, the Claw 4 models are also less overeager. With 3. 7, there's a tendency where if you ask it to do something, then it would like change a lot of other things as well when it's coding. But with Cloud 4 models, there's a significant decrease in this eagerness score. So, it's able to just like keep everything else the same and not like edit everything at the same time. This is actually a really big problem for me when it came to coding with Sonnet 3. 7. There's like a running joke that if you ask Sonnet 3. 7 to add a button, it will just vibe code you an entire new app. So, this behavior is significantly corrected for the Cloud 4 models. Another major upgrade models is improved memory. These Cloud Form models are able to have a goal and then consistently work towards that goal over an extended period of time. The example from Anthropic is Claude Opus 4's ability to play Pokémon. They said that cloud opus 4 dramatically outperforms all previous models on memory capabilities. When developers build applications that provide cloud local file access, Opus 4 became skilled at creating and maintaining memory files to store key information. This unlocks better long-term task awareness, coherence, and performance and agent task. So, we can see that Opus 4 is playing Pokemon and it's able to train the different Pokemons and remember to keep training the Pokemon and battle all the gyms and complete the game. While previous models would have trouble staying on task, it would start training Pokemon and then just like get confused and go do something else. Next up, Anthropic says that Claude has a increased ability to follow instructions. They say that Cloud for models are specifically trained to better follow instructions within complex long system prompts, even longer than 10,000 tokens. This is something that Son 3. 7 and many other language models out there do struggle with. If you have very detailed instructions, it would generally miss out on some of them. It kind of just like forgets in the process. So, to test this out, I'm going to have this prompt here that's going to write an email for me with a laundry list of requirements. The prompt is, I need to help me write outreach emails to potential YouTube live stream guests, but you must all this extensive list of requirements exactly. So, mandatory email format rules, subject line must be exactly YouTube live stream invitation, guest name, the expertise area, always start with their first name only, never dear or high, etc., etc. is like just very specific things. Like for example, number 18 is used live stream the word live stream exactly three times total. Oh, I don't even know what happened over here. The spacing got up, but it should be fine. Says, "My YouTube channel gets 950K views per month. Please write an email to Bob Smith, who works at Anthropic and gives amazing practical workshops on AI agents. I want to have him on my YouTube live stream to give his workshop to my audience. " This is actually true. Not Bob Smith, but I did meet someone at anthropic conference who gave a really good workshop. So, I'm really hoping we can get him on to the live stream to teach you guys about AI agents more. So, it came up with this email and it is true. I did check that it is able to follow exactly every single one of the requirements that was given. The way that the email is written is also super natural and humanlike. It doesn't have that like distinct AI feel from a lot of the other models. Cloud has always been one of the best when it comes to tone. That's why a lot of people use cloud for the writing. Bob, your recent practical workshops on AI agents and anthropic has been gaining significant attention developer community. I'd love to explore this topic with my audience. Your hands-on approach to teaching AI agent development would provide tremendous value to viewers who are looking to implement these technologies in real world applications. It just like sounds super supernatural. And finally, one more significant area of improvement for the cloud for models is reduced reward hacking. Reward hacking is a behavior where models might take shortcuts in order to achieve a goal without actually solving the underlying problem. It could be something like hard- coding test or commenting out test in code. In a game in which the agents would lose points if they're taking damage, they might just like end up not moving at all in order to bypass that. Or a cleaning robot where the reward is detecting less dirt. It might do something like just shutting off its camera lens so it doesn't see any dirt to maximize it reward without actually cleaning anything. It's like behaviors they do in order to hack the system. Anthropic reports that Cloud 4 models demonstrate an 80% reduction in reward hacking compared to 3. 7, which means that users should be able to better trust Claude in order to do things the proper way. So, I don't know if the Cloud 4 models are like 80% better than 3. 7, but I did notice when I'm coding with Claude, it does have a tendency to be more thorough and doesn't take as many shortcuts. Like it would actually like thoroughly test things and it wouldn't just like bypass certain
things. I'll put on screen now some prompting tips specific to the Claude 4 models that was shown in the code with Claude conference. By the way, in case you didn't know, Anthropic also does have a pretty good prompting guide as well as a prompt generator on their console. So, I'll also put the link for that on screen right now. As I've said many times before, prompting is the best return on investment skill that you can be learning. Knowing how to prompt well can make a huge difference in the responses that you get. Now, before I move on to showing you the true power of the cloud for models, which is through building and coding through coding agents, I want to explicitly call out these cloud form models are definitely better. But if you're only going to be using it through cloud. ai UI, the web interface, you are going to very, very quickly run into usage limits. Even if you're using the pro plan, or even the max plan, the context window of 200,000 tokens for these cloud models also becomes an issue. You're not able to give it massive amounts of information and very detailed prompts like you can using something like Gemini. And since Anthropic made a point that as a company they're going more towards coding agents and being really good at these specialized models as opposed to general chatbot interfaces. You shouldn't be expecting chatbot functionalities like voice functionalities or multimodal outputs. The power of these cloud for models really does become unleashed when you pair them together with code specific tools like windsurf cursor or
claude code. I'm going to put on screen now a little assessment to test your learning so far. Do put the answer in the comments. Let's now actually move on to seeing the full power of these cloud for models and talk about claude
code. Cloud code is the most versatile coding agent that is run directly through the terminal. So you can just use it directly to code in your terminal, but it also does have extensions for VS Code and Jetrain. So you can integrate Cloud Code directly to your ID. All right, so let's now actually jump into the coding experience. And I'm actually going to show you a comparison between Gemini 2. 5 Pro using Firebase Studio, Windsurf using Sonnet 4, and Cloud Code using Sonnet 4. so you can really see the difference in the experience and the results. The prompt that I'm using for all of them is build a gamified pixel art app where users set daily goals and earn XP for completing them. If they slack by having incomplete goals, their AI rival gains XP every 1 minute. Show the XP bar with numbers. It's inspired by pixel art games like Red from Pokémon and similar game mechanics. Each week, users face their rival in a battle to see who's stronger or it can be invoked at any time. I wrote the invoked at any time thing just primarily for testing so we can actually see it. Whoever has more XP will get 10% of their total XP added as a bonus. Users should be able to choose to customize the AI rival. And tasks that users would do would be like studying calculus for 50 minutes, drinking eight glasses of water, gym for 1 hour. I've attached an image here for reference. And here is the reference image mockup which I made using Chatbt. Okay, so here is Firebase Studio using Gemini 2. 5 Pro. And I'm not going to go through the entire process of doing this in this video, but if you want to see me vibe code using Firebase Studio, you can check out the link over here. I have a full video on it. And this is the app that I finally came up with. So, okay, it is called pixel progress and it has progress tracker, daily task as well as timers here. We can add something like vibe code app for 50 minutes and we can say like 15 XP, right? And it would add it here. And then if you check it off, you would get 15 XP over here. You can add something else like finish this video and call it like 30 XP. So on the timers here, it does have the time until daily reset and this is correct. It should be resetting every 24 hours and it's currently 8:48 a. m. where I am. However, this is after like multiple tries, right? I still could not manage to get the rival gains XP. Like, it should be a countdown from 1 minute, right? I have not been able to actually get this to work. So, your rival doesn't actually gain XP. And the only way that I think I can manage to get this work is actually go and dig into the code myself. At this point, the vibes are not enough here. Let's move on now to Windserve using Claude Sonnet 4. So, put the prompt over here. Image have that here. and press enter. Cool. It seems to have built something and populated things. All right, we're going to accept all and then I'm going to run it. Let's open an external browser. All right, and this is what we got from Windsurf with clot force on it. I mean, it looks much better. Oh, I forgot like with this one, it did not have the like every weekly battle situation. So, it looks like pixel goals here does have that. So, I've actually tested it out. Rival gains XP in 4 minutes and 8 seconds. So, there should be one minute. And we'll correct this in a bit and see if we can get manage to get it right. But here's blue. Here's me. And if I do this, I do get 20 XP. And you can see the XP bars over here as well. So, gym for 1 hour, that works too. And then if you add a goal like finish this video, and this is going to be like 50 XP. Save goal. It does go here. And if it does this, cool. And I also level up to level three as well. Very nice. All right. So, if I do challenge rival, it says I win. So, you gain 26 bonus XP. When you challenge the rival, you do have an XP increase over here. So, you do get the 10% bonus here. All right, cool. And let's look at what settings is. So, you can change the rival name because you want it to be customizable. Let's call it uh red. And it changes to red. Wonderful. So, the only thing that needs to be changed right now is supposed to gain XP every 1 minute. So, I'm going to see if I can fix that real quick. A few moments later. Refresh. Now it's static and it doesn't actually work. So, I'm going to screenshot this and try again. Now, it's just static. Stuck at 0348. It does not countdown 60 seconds. Attach a screenshot. Let's see if that works. All right, it is at 1 minute now. So, let's delete some of these. And if we uncheck this, it should start counting down, which it does. Yay, it works. Some other really nice things about WinSurf is that you can actually select like a specific UI component. Like for example, I can select something like what do I not like here? I kind of like I don't know like this button, right? Like settings an element was added to cascade. You can see that the element is added over here. So you can directly reference it. So change this button to a darker gray for example. Refresh. And it does do that. So it works pretty well. And it can also send console errors directly to cascade as well in windsurf. All right. Now
let's actually see the experience using claude code. Okay. So this is the terminal. To install claude code you just simply do this. In my case it has already been installed. And to activate it you just do claude. Do you trust the files in this folder? So press yes proceed. And we'll do that. So welcome to claude code. It gives you like tips for how to get started. Uh so I'm going to just say ask cloud to create a new app or clone a repository. So let's do that. And it says what would you like me to work on? By the way I'm doing this just directly through terminal but you can also you know use it through VS code as an extension for example. So same thing just copy pasted the prompt and put this image here as well. Press enter and here we go. So one cool thing about cloud code is that it says you have the option of yes and don't ask me again in this session. So you can do like shift tab here and cloud code does learn as you go along. It doesn't need to keep asking for permissions over and over again. Yeah. And it tells you the plan that it has which is set up basic HTML structure and project files. Create pixel RCSS styling UI layout. Implement XP bar components for user per player and rival. Build goal management systems. Create rival XP game mechanics. One XP per minute for incomplete goals, etc., etc. So it does have this plan here and it's doing its thing now. Okay. So ready to use. We can just do index. html in your browser. I'm just going to say open it for me. And this is what it came up with. Okay. So, it has the rivals, the levels here. Today's goals. Add new goal. Rival gains XP in 23 seconds. And you can challenge the rival here. So, let's actually test this out. So, study calculus. It does increase. Does increase here. Level increases too. Has a nice little popup which is quite pretty. Add a new goal. finish this video, which is supposed to be 40 XP here. So, your rival gains XP in 53 seconds. And time left is correct as well. You can challenge the rival. So, you defeated your rivals 10%. So, 30. That is correct. And your rival XP is 180. That is correct. I want to see if I click this, does it pause? It does not. So, you see that the rival gains XP. This should actually pause after everything is completed. So, I'm going to see if I can fix that. It also doesn't have a settings to customize it. Okay. And we will write the timer does not pause when all goals are completed. So I'll put the screenshot over here. When vibe coding is generally best to change one thing at a time and you can check my vibe coding 101 video if you want to learn more about vibe coding fundamentals. That's where we're going to do that first. Okay. So that works now. You know it's going over here. But if I click it, it says paused. And if I continue, it continues again. And the rival gains XP afterwards. Great. Now I'm going to see if I can have the settings there as well. We have a settings page. Now, let's call it red. And a color. Let's make it blue just to be confusing. The rival XP gain rate is every 1 minute. Let's say every 30 seconds instead. And 5 XP. We can save the settings here. And it looks like things are saved. And it gains XP by 5 XP. Yeah, looks good to me. All right
so this is just the coding part of things and you can see that windsurf and cloud code are both you they're using the same model. So yeah, they both are quite good at that. But what the big difference with cloud code is that there is a lot more functionality is associated with it. The biggest one being able to integrate cloud code through the cloud code SDK into the applications that you're building like the GitHub app for example. You can just do /install GitHub app. You can go through the entire installation process and voila, you're able to use Claude on GitHub to do your pull requests, to respond to people, and to even add new features. This is just a single example. There are so many other use cases that you can do by integrating cloud code into your application. If you are interested in using cloud code, I also do recommend that you check out this live stream that is called mastering cloud code in 30 minutes. It goes into a lot more detail about how to get started and all the things that you can do. All
right, to summarize things now and talk about some of the pros and cons. I think if you're a vibe coder, you're not a very technical person, I would personally just stick to using the clawed for models through something like Windsurf. It has everything that you need to get started and also has certain features like being able to select for different elements so you can change things and make it look better. But if you are a very serious developer working on an actual codebase and building products where you want to integrate cloud code into what it is that you're building, then I think you should check out cloud code. Should you consider switching over to Claude products at all? This is what I would say. If you are a casual chatbot user, your normal chat GBT user, your Gemini user, then no, it is not worth it. It is far more expensive and there's usage limits that are on the claw. ai UI interface. If you are someone who is vibe coding, who wants to build certain things, I would recommend using the Claw 4 models through something like Windsor cursor because it is significantly better than Sonnet 3. 7 and Gemini 2. 5 Pro. Then finally, if you are a power user, a developer, then you can try using cloud code. Cloud code really opens a lot of cutting edge possibilities for development. Plus, you're not restricted by the 200,000 context window because they internally actually implemented a mechanism that summarizes the prompts for you. So, you can actually use Cloud Infinitely without running into the context limit issue. All right, I hope
this video was helpful for you guys. Here is a final little assessment. Please answer the questions in the comment section to help you retain all the information that we covered today. Thank you so much for watching until the end of this video and I will see you in the next video or live stream.