Anthropic Just Dropped the Biggest Subagent Upgrade Yet

Ray Amjad · 23.04.2026 · 23,414 views · 577 likes


Video description

Level up with my Claude Code Masterclass 👉 https://www.masterclaudecode.com/?utm...
Use LASTCALL for 35% off lifetime (before lifetime plan is retired on Friday 24 April 23:59 PST)
My Claude Code newsletter 👉 https://www.masterclaudecode.com/news...
My Claude Code coaching 👉 https://www.masterclaudecode.com/cons...
I've never accepted a sponsor; my videos are made possible by my own products...

—— MY CLASSES ——
🚀 Claude Code Masterclass: https://www.masterclaudecode.com/?utm...

—— MY APPS ——
🎙️ HyperWhisper, write 5x faster with your voice on MacOS & Windows: https://www.hyperwhisper.com/?utm_sou... Use coupon code YTSAVE for 20% off
💬 AgentStack, AI agents for customer support and sales: https://www.agentstack.build/?utm_sou...
📲 Tensor AI: Never Miss the AI News on iOS: https://apps.apple.com/us/app/ai-news... on Android: https://play.google.com/store/apps/de... 100% FREE
📹 VidTempla, Manage YouTube Descriptions at Scale: http://vidtempla.com/?utm_source=yout...

————— CONNECT WITH ME
🐦 X: https://x.com/@theramjad
👥 LinkedIn: / rayamjad
📸 Instagram: / theramjad
🌍 My website/blog: https://www.rayamjad.com/

—————
00:00 - Introduction
00:15 - Why Subagents
01:24 - How Forked Subagents Work
02:03 - Where It's Useful
03:31 - The Prompt Cache
03:41 - Anthropic's Use of It
04:13 - How to Use It
04:50 - Example 1: 2 Forks
07:39 - Example 2: MCP
08:40 - Other Use Cases
10:13 - When to Not Fork
10:37 - Conclusion

Table of contents (12 segments)

Introduction

Okay, so yesterday Anthropic solved what I found to be the biggest problem with Claude Code subagents by introducing a new feature called forked subagents. I'll be going through exactly what this means and how you can leverage it to improve your own workflows. So just to make sure that we're

Why Subagents

all on the same page, the whole reason for having subagents is so we can delegate any noisy tool calling to a separate context window and get only the most relevant results back into the main session. If you were to do everything inside the main session, it would fill up with random noise and output that Claude Code did not need, and you would use up the main session's context window faster. And we know that as a context window fills up, Claude Code makes worse decisions. So to keep our context window lean, we usually delegate to subagents, like an Explore subagent to look through the codebase or a research subagent to search online, which, once done, return the most relevant things back into the main session.

And whilst in this example it happens at the start of the conversation, it can also happen in the middle of the conversation. Your main Claude Code session may be doing some stuff, then decide it needs to call a subagent, and a summary of everything the main session has done so far, alongside instructions, is passed over to that subagent. The subagent does whatever it needs to do, and the result is passed back into the main session. This can be handy in situations where a blank context helps Claude Code get a different perspective. But in many situations, you actually

How Forked Subagents Work

want everything that you've accumulated so far in the main conversation to also go over to the subagent. And that is what forked subagents allow you to do. The forked subagent gets the entire prior history of the main conversation as well as the instructions. It can then continue down a certain path and pass its result back into the main session. One of the benefits is that a forked subagent uses the same prompt cache as the main session, which means it can be cheaper. So essentially, the main difference between normal subagents and forked subagents is that forked subagents inherit the entire conversation history of the main session. Okay, so before going over how you can actually use the

Where It's Useful

feature, I'll go over a recent situation in which this would have been handy for me. Before forked subagents, there were instances where I would be doing some kind of design work with Claude Code, chatting back and forth: hey, we're gonna choose these fonts, what colors do you think would work, and so forth. This main conversation would accumulate a lot of nuance, and nuance is the key word here. Then I would say to Claude Code, okay, let's design 3 different variations of this in parallel with subagents, so it's faster and the variations stay isolated and aren't biased by one another. Claude Code would then give a summary of everything we'd done so far, with instructions, to 3 different subagents. They would do the design and give it back to me as, for example, 3 different HTML webpages.

The main problem is that the 50,000 tokens I'd accumulated in the main conversation to come up with a design were compressed into 2,000-ish tokens for the prompt of each subagent. That was too much compression for me, and it meant the subagents did a worse job because they didn't have all the details we had talked about so far. So during this compression, we lost a lot of nuance from the main conversation that would have helped the subagents make better decisions. But now, with the brand new forked subagents, I can take all the nuance we've accumulated in the main conversation and have each subagent keep it in mind when designing its variation. And the fact that for each forked

The Prompt Cache

subagent, it will be using the same prompt cache as the main session means that it won't be that much more expensive to send all that history again.
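The trade-off described so far can be put into rough numbers. The sketch below uses the figures mentioned in the video (about 50,000 tokens of history, about 2,000 tokens per summarized prompt, 3 subagents). The 10% cache-read rate is an assumption for illustration and should be checked against current pricing; the function names are hypothetical. It ignores each subagent's own new tokens, output tokens, and any cache-write cost, and it assumes the main session's history is already in the cache.

```python
# Illustrative arithmetic only; this is not Anthropic's billing logic.

# ASSUMPTION: cached input tokens are billed at roughly 10% of the normal input rate.
CACHE_READ_PERCENT = 10


def compression_ratio(history_tokens: int, summary_tokens: int) -> float:
    """How many times smaller a summarized prompt is than the history it replaces."""
    return history_tokens / summary_tokens


def full_price_equivalent(history_tokens: int, subagents: int, cached: bool) -> int:
    """Input tokens, expressed at full price, to give the whole history to every subagent."""
    percent = CACHE_READ_PERCENT if cached else 100
    return history_tokens * subagents * percent // 100


if __name__ == "__main__":
    history, summary, subagents = 50_000, 2_000, 3
    print("compression ratio:", compression_ratio(history, summary))
    print("summarized prompts:", summary * subagents)
    print("forks, no cache:", full_price_equivalent(history, subagents, cached=False))
    print("forks, shared cache:", full_price_equivalent(history, subagents, cached=True))
```

Under these assumptions, three forks reading from a warm cache cost far less than resending the history at full price, while still costing more than three heavily summarized prompts. That matches the claim that forking is "not that much more expensive" rather than free.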

Anthropic's Use of It

Now, Anthropic has been using this idea of forked subagents inside the recent features they added. For example, the Claude Code auto-dream feature, which I made a video about, uses forked subagents to do the memory consolidation. I'd recommend watching that video as well. It's also used in the recent recap feature in Claude Code: doing /recap triggers a forked subagent behind the scenes. Doing /btw (by the way), which I made a previous video about, also uses a forked subagent behind the scenes. Okay, so let's go ahead and use this

How to Use It

feature. We have to make sure we have this environment variable set for Claude Code. We can copy it, go back to the terminal, paste it in, and then run Claude like this, and that will enable forked subagents. You can see that in action because if I do /fork, it says "fork: spawn a background agent that inherits the full conversation." Alternatively, you can go to the settings.json for your project and put this in at the very top, so that if you run Claude normally, it is enabled for every session going forward. Doing /fork, you can see that in action again. Okay, now let's go through a few examples. So in this

Example 1: 2 Forks

project, I was adding some pre-warming for connections so that when a connection is needed, it is established faster. I basically don't know if the premise here is correct, and I want to know what changes have happened so far. So I can tell Claude Code: hey, can you spawn two forked subagents? One of them should make a Mermaid diagram of all the changes we've made so far, and the other should search online with the Exa MCP to check whether everything we've done so far is correct. Pressing enter, these two forked subagents inherit the entire previous conversation, with all of its context and nuance, which can lead to a better decision. It also keeps the main conversation clean, because we don't need all the information from generating a Mermaid diagram in the main conversation. That can go over to a fork, which uses all the context so far to make the diagram and then passes the URL back into the main conversation.

I can see these two forks running right now at the very bottom, along with how many tokens they're using. You can see they immediately started out with about 180,000 tokens, which is how much I've used in this conversation so far. If I click on this one, I can see what the fork is doing. Going up, I can go back to the main conversation. This is really handy because I can give a fork a follow-up as well: I can press Escape, go to the text input over here, and give that fork an additional prompt. So I can say, use a light theme instead, and press enter, and that goes over to the fork as a queued message. This kind of feels like agent teams in Claude Code, if you're aware of that.

So our verification fork is done, and it passed its information back into the main session. This information is much richer and more nuanced because the fork had the nuance we'd accumulated so far. And now the Mermaid diagram fork is done as well, so I can open this up and see the light-theme Mermaid diagram it made for me over here. So the key is to ask yourself: is the nuance of the main conversation so far useful to the subagent? If so, tell Claude Code to spin up a forked subagent. If it's not useful and could hinder or bias the subagent, then don't use a forked subagent.

Now another pretty interesting use case is for my Claude Code Masterclass, which is the most comprehensive Claude Code class that you will find on the internet. By the way, I will be removing the lifetime plan for the masterclass at the end of the week. So if you want to buy the lifetime plan before it disappears, now is the chance to do so. There is a discount as well if you are interested. And you're probably thinking, why would I buy the lifetime plan if, one year from now, I don't know if we'll still be using Claude Code? That is a fair point, because a year from now there will likely be a better tool available, in which case I will make a class about that as well. And you will get lifetime access to all future Agentic Coding classes that

Example 2: MCP

I make. Now, my class also has an MCP server, which is a really good use case for forked subagents. If you add the MCP server to Claude Code, you can tell Claude Code: with a forked subagent, can you use the Agentic Coding School MCP and recommend any videos to watch that you think would be most helpful based on the session we've just done together? Pressing enter, Claude Code will use all the nuance and detail of the current conversation in a forked subagent. It will do all the noisy tool calling in the fork to gather that information and then return just the result to the main session. We can see this fork in action by going down and pressing enter on it. It's doing all the noisy tool calling inside the fork, and then it will pass those recommendations back into the main session. You can see Claude Code has now recommended some videos from my masterclass that I haven't seen yet, based on this session, along with why they would be helpful. So yeah, a lot of people have found this MCP server helpful for deciding what they should watch next inside the class. Now to quickly go over some other use cases, you may want

Other Use Cases

to use this for some kind of tangent containment. A fork can take multiple steps: if you use /btw, that is a single-step turn, whereas a fork is a multi-step turn and it can use tool calls, like MCP servers, for example. So you can ask side questions without derailing the main conversation. And if you get an answer back and you don't think it is helpful, then of course you can do /rewind to go back to before you spun up that fork.

It can also be handy for things like considering an opposing view with all the context so far. You could get a forked subagent to explore whether the premise you have in the main conversation is wrong, so the fork would take the opposite of our working assumption. Like I showed as well, we can use it for any noisy tool calls, such as throwaway research. If you don't find the outcome of that research helpful, you can rewind the conversation. You could also say, okay, can you check the logs inside a fork to verify everything we've done so far?

Another interesting approach that I will be trying more of is parallel decision convergence: combining forks and non-forks in interesting ways to see where they agree and disagree. You can actually do this, because I just asked Claude Code, can you spin up two subagents to do further research? One should be a fork and one should not be a fork. You can see at the bottom here that one of them is a fork because it already has 200,000 tokens so far, and one of them is not a fork because it only has 35. And I can check up on each of them to see what they are doing behind the scenes. And a pretty common situation for not wanting

When to Not Fork

to use a fork would be a code review of sorts. By forking, you'd probably get a worse review compared to having a separate subagent do that review instead, because Claude Code would see the code it wrote earlier inside the fork and be like, "Oh, I wrote that code. Of course it's good." So it wouldn't do as detailed a review as one done inside a blank subagent. Anyways, I'm really excited by the feature and

Conclusion

I will be playing around with it much more and trying to find more interesting ways of using it to get better results. And I will be sharing all of that inside my Claude Code Masterclass as I discover those use cases. So if you do want to take advantage of the lifetime offer before it is removed, there will be a link down below. There's also a 14-day money-back guarantee if you're not satisfied for whatever reason, but I'm sure that you will be satisfied, because so far fewer than 0.2% of people have asked for a refund. Anyways, there will be a link down below if you're interested, and if you want to email me about it and ask me any more questions, my email will be down below as well.
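To recap the mechanics in code form, here is a minimal conceptual model of the two handoff styles described in the video, plus the rule of thumb for choosing between them. Every name here is hypothetical, and the truncation-based "summary" is only a crude stand-in for a model-written summary. This is not Claude Code's actual implementation or API.

```python
def regular_subagent_context(history: list[str], instructions: str,
                             summary_chars: int = 60) -> list[str]:
    """Fresh subagent: receives a compressed summary of the history plus instructions."""
    # Crude stand-in for a lossy, model-written summary: keep only the first few characters.
    summary = " ".join(history)[:summary_chars]
    return [f"Summary: {summary}", instructions]


def forked_subagent_context(history: list[str], instructions: str) -> list[str]:
    """Forked subagent: receives the full history verbatim, then the instructions."""
    return [*history, instructions]


def should_fork(nuance_helps: bool, history_may_bias: bool) -> bool:
    """Rule of thumb from the video: fork when the accumulated nuance helps
    and would not bias the task (a code review is the example of bias)."""
    return nuance_helps and not history_may_bias


if __name__ == "__main__":
    history = [
        "User: let's use a serif font for headings",
        "Assistant: agreed, paired with a muted green palette",
        "User: keep the hero section under 400px tall",
    ]
    task = "Design one variation of the landing page."
    print(regular_subagent_context(history, task))
    print(forked_subagent_context(history, task))
    print("design variations:", should_fork(nuance_helps=True, history_may_bias=False))
    print("code review:", should_fork(nuance_helps=True, history_may_bias=True))
```

In this model the fresh subagent loses the later messages entirely, which is the "lost nuance" problem from the design example, while the fork sees every message but also inherits any bias in the history.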
