# Agent Skills: Code Beats Markdown (Here's Why)

## Metadata

- **Channel:** Sam Witteveen
- **YouTube:** https://www.youtube.com/watch?v=IjiaCOt7bP8

## Contents

### [0:00](https://www.youtube.com/watch?v=IjiaCOt7bP8) Segment 1 (00:00 - 05:00)

Okay, so over the past couple of months, it's become pretty clear that Claude's skills, now also known as agent skills, have really become the killer tool for helping both the models and agent harnesses get really good at getting things done. Now, this started out as just Claude skills, but very quickly it became an open standard, and in many ways I would say it's a better open standard than MCP, mostly because of its ease of use and ease of understanding. Very quickly we saw companies like OpenAI jump on the bandwagon, realizing that this is something that could work for their models as well as the Claude models, and more recently we've seen people at DeepMind adopt agent skills for their coding harnesses like Antigravity and Gemini CLI. Now, it's cool that this has become an open standard. Not only have we seen multiple companies support it, but we've also seen a rise of directories like skills.sh, where you can come and search for skills, and skillsmpp.com, which is a marketplace for skills. Generally, this has caused a proliferation of abilities that models and harnesses can now use, authored by a whole bunch of different people. Now, in this video, I want to talk about some key things around skills, but I really want to focus on the concept of scripts: getting your skill to use code and a coding sandbox to get things done in a far more efficient way than just writing markdown instructions. All right, so first off, why do skills actually work? They're really built on the concepts that have taken off over the past year around context engineering. There are lots of different names for this; some people call it context engineering, some call it harness engineering. Basically, it's about getting the right tokens into a model at the right time to generate the right output. I often jokingly refer to this as conditional probability engineering.
But the idea with this sort of progressive disclosure is that, even going back to the days of the original tool calling for models, you need something that tells the model what tools it has access to. In this case, what skills it has access to. And because the models have actually gotten a lot better, we can now do that in a very small number of tokens with markdown. Then, once a model decides that it actually wants to use a particular skill, the actual SKILL.md file is loaded into context. And this brings us to the parts of a skill. The main thing that you must have is the SKILL.md file. This is basically your metadata, so that the model can identify what the skill is, and it also holds the core instructions of the skill. But then we've got a few other things going on here as well. First off, we've got references, templates, and examples, often in a folder called something like references. In there, you can have examples of the output that you expect, examples of how to actually call things, etc. You can also have an assets folder if you want to include images or other files. And then finally, the one that I really want to focus on in this video is the scripts folder. This is where your actual code files are going to live. Now, it's really important to understand that one of the things that has enabled all of this to happen is that many of the models, and certainly the agent harnesses, have access to sandboxes. So by giving the model example scripts, it can take a script, rewrite it, or use it just out of the box. It can then run it in a sandbox, and it can either generate or retrieve more context that gets passed back to the model, or produce a straight output that gets sent to the user. So you can think of these skills as structured instruction sets that tell the model how to approach a specific task.
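The layout described above can be sketched like this (only SKILL.md is required; the other folder names are common conventions, not a fixed spec):

```text
my-skill/
├── SKILL.md          # metadata + core instructions, loaded once the model picks the skill
├── references/       # example outputs, API notes, call patterns
├── assets/           # images and other static files
└── scripts/
    └── scrape.py     # code the model can run (or adapt) in a sandbox
```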
And then code execution allows it to use things like bash and command-line interfaces, run Python scripts, and generally empower our skills with the ability to talk to various different APIs. Now, while there are a bunch of different ways to build skills out there, one thing I often see people get wrong is that they never optimize those code scripts for the particular task that they're trying to do. You end up with a skill where the instruction set is for something specific, but the code is very vague and does nothing to reduce the number of tokens that are going to go back to the model. So, what I thought I would do in this example is take a look at some scraping skills. So, scraping

### [5:00](https://www.youtube.com/watch?v=IjiaCOt7bP8&t=300s) Segment 2 (05:00 - 10:00)

skills is one of the things that I see people want to build a lot, but it's also one of the things I see people make mistakes with, where they've got the most generic scraping script. Sometimes it works and sometimes it doesn't, and when it doesn't work, they often don't really know why. Scraping is a game where you can get some of the best context for models from particular sites, and you can even get it out in the format that you want, but you're often fighting sites that don't want agents coming to them, and they'll do anything they can to block you along the way. So, this is where the sponsor of today's video comes in. One of the big challenges we often face when building a scraping system like this is getting our IPs blocked, and that can happen for a variety of different reasons. To get around that, you need reliable proxies, and that's exactly what DataImpulse provides. They provide residential proxies, and honestly, the price-to-quality ratio here is pretty hard to beat. They start from $1 per gigabyte with no expiration. They have over 95 million ethically sourced IPs across 195 different countries. And the cool thing is, if you're using any kind of conventional scraping framework like Scrapy, Playwright, or Puppeteer, they've got tutorials for each of those on their site. So here we're going to bake these DataImpulse proxies into the code for our skill. And if we wanted to, we could even get mobile IPs. So if we want to make it look like our requests are coming from a mobile phone, we can set our script up to do that with a mobile proxy from DataImpulse. So, the link's in the description. Check it out if you're building any kind of data pipeline, if you want specific IP locations, or if you're just looking for a hassle-free way to do scraping. All right. So, mistake number one that people make can actually be the most expensive.
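As a rough sketch of what "baking a proxy into the skill's script" can look like, here is a stdlib-only version using `urllib`. The gateway host, port, and credentials below are placeholders, not real provider values; substitute the ones from your own proxy dashboard:

```python
import urllib.request

# Placeholder proxy endpoint and credentials (NOT real values) -- replace with
# the gateway and login from your proxy provider's dashboard.
PROXY_URL = "http://your_username:your_password@gw.example-proxy.com:823"

def proxied_opener(proxy_url: str = PROXY_URL) -> urllib.request.OpenerDirector:
    """Build an opener that routes both http and https traffic through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)

def fetch(url: str, proxy_url: str = PROXY_URL) -> str:
    """Fetch a page through the proxy, so the target site never sees your own IP."""
    with proxied_opener(proxy_url).open(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Hardcoding the proxy in the script means the model never has to reason about networking at all; it just calls `fetch`.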
And this is when your skill uses the web fetch tool to get back an entire page of HTML. Now, don't get me wrong. The web fetch tool is one of the tools built into Claude Code, and it can be very useful when the model quickly decides it needs to go and get some docs or some context from somewhere else. If you're trying to do this at scale, though, where you're going to have it go through 50 pages or something, the last thing you want is it getting every single piece of HTML and returning that back. The perfect example is something like this, where it goes off and gets all the HTML on a page and brings it back. You can see in this case, just for the Hacker News front page, we're already at 34,000 characters and approximately 8,000-plus tokens. Now, if we just specify in our scraper that we want to skip tags like script, style, and perhaps nav, footer, etc., we can go from around 8,000 tokens down to under 1,000 tokens. So you've literally got nearly a 90% reduction in the number of tokens. Something like that is not going to be a big deal when you're just talking about one page. But if part of your skill is to go and get news updates for you each day and you're scraping 100 pages, you don't want to burn through nine times as many tokens to get the same result. The next common mistake I see a lot: when people know what content they're actually going to be scraping, don't waste tokens having your model figure out the page structure every single run. If you know what the structure is and which classes you want, you're much better off passing those in and having your script just filter those out and return them. And here's the cool thing: if you don't know what those are, you can literally get Claude to do this once and give you that information so that it knows.
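A minimal sketch of that tag-skipping idea, using only the stdlib `html.parser` (in a real skill you might reach for BeautifulSoup instead; the tag list here is just the video's examples plus a couple of obvious additions):

```python
from html.parser import HTMLParser

# Tags whose contents are almost never useful context for a model.
SKIP_TAGS = {"script", "style", "nav", "footer", "noscript", "svg"}

class TextExtractor(HTMLParser):
    """Collects visible text while skipping boilerplate tags."""
    def __init__(self):
        super().__init__()
        self.depth = 0      # how many skip-tags we are currently nested inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every skip-tag.
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())

def clean_text(html: str) -> str:
    """Return the visible text of a page, minus script/style/nav/footer noise."""
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.chunks)
```

Running the raw HTML through `clean_text` before it ever reaches the model is what turns the roughly 8,000-token page into a sub-1,000-token one.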
For example, say we're trying to scrape Hacker News and we want the titles of the articles, the URL for each, the number of points, and perhaps the user or the number of comments. Those are things that we can literally ask a model for. So you can see here, I'm just putting this into Claude and asking it: hey, just extract out that metadata for me and tell me what it actually is. And sure enough, we get back the actual CSS class selectors: we've got the article title, we've got the user score, and so on. We can then use this to get Claude to write the script that gets exactly these things back. So rather than just telling it, hey, go to Hacker News every 30 minutes and get me a list of what's been posted, and burning around 8,000 tokens each time, you're much better off getting back structured data which you then pass into your model. All right, so third up is that now that we've
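To make that concrete, here is one way a script can hardcode the known Hacker News structure instead of having the model re-derive it on every run. The `titleline` and `score` class names match HN's markup at the time of writing, but site markup changes, so treat them as a snapshot; regexes are used only to keep the sketch dependency-free (a real script would more likely use CSS selectors via BeautifulSoup):

```python
import re

# Hacker News front-page structure: the title link lives inside
# <span class="titleline">, the points inside <span class="score">.
TITLE_RE = re.compile(r'<span class="titleline"><a href="([^"]+)"[^>]*>([^<]+)</a>')
SCORE_RE = re.compile(r'<span class="score"[^>]*>(\d+) points?</span>')

def extract_stories(html: str) -> list[dict]:
    """Pull title/url/points out of front-page HTML as structured records."""
    titles = TITLE_RE.findall(html)
    scores = [int(s) for s in SCORE_RE.findall(html)]
    return [
        {"title": title, "url": url, "points": pts}
        for (url, title), pts in zip(titles, scores)
    ]
```

The model now receives a short list of dicts rather than 34,000 characters of markup.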

### [10:00](https://www.youtube.com/watch?v=IjiaCOt7bP8&t=600s) Segment 3 (10:00 - 15:00)

established that you basically want your SKILL.md to act like an orchestrator. It decides what to scrape and what to do with the results, but the actual scraping is going to be done by a script. You want to make sure that script returns things in the easiest format for the model to use. The majority of the time, that's going to be either markdown or JSON. In this case, I'm going to have it literally do the parsing: take the output, go through it, convert it to JSON, and return the JSON. So, for example, if I'm looking for a card that has a title, a URL, and a summary, I'm going to extract those out, convert them to JSON, and return that back to the skill, which returns it to the model. Now, building on that even more, the other thing you want to do is define a strict output schema, ideally directly in your script. For example, if we're scraping pricing data from lots of different sites, we want to make sure that we at least get fields that we can compare against. In that case, we might have a title, product name, URL, price, and current discount. And if we can't put that in the script, we want to at least spell it out in our SKILL.md so that the model can extract that information as it's bringing it back. The biggest commonality here is that ideally you want the scripts to do as much of the heavy lifting as possible, so that the logic isn't the model having to make decisions. If those decisions can be pre-made in code, you're going to get a skill that's far more efficient. It's going to save you tokens, it means you can assign the skill to a lower-quality model via a sub-agent, and it's going to make it easier to run these in parallel as well. And running things in parallel is definitely one of the key things you want to do.
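A strict output schema in the script itself can be as simple as a dataclass; the field names below are illustrative for the pricing example, not taken from the video:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class PriceRecord:
    """One hypothetical schema for comparing pricing pages across sites."""
    product_name: str
    url: str
    price: float
    currency: str = "USD"
    discount_pct: float = 0.0

def to_model_payload(records: list[PriceRecord]) -> str:
    """Emit compact JSON so the model gets comparable structure, not raw HTML."""
    return json.dumps([asdict(r) for r in records], separators=(",", ":"))
```

Because every site's data is forced through the same fields, the model can compare prices directly instead of reconciling five different page layouts.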
So for example, with something like searches, you ideally want to do them in batches so that you're not waiting on 15 sequential searches. That's going to be 15 round trips, and each time you return a result, you're adding another call carrying the full conversation context. So ideally, you want your script to use things like threads to make multiple searches in a batch at the same time. All right. Next up, you want to handle things like limits and stop conditions. If you ask it to go and scrape news results, for example, and you've got something like pagination, or here where we can press "more" and get more results back, you don't want your scraping skill to go into a loop of trying to get every single one. You want some clear logic in your orchestrator. You want something like this in your SKILL.md, where you tell it the maximum number of web searches or web fetch calls that you actually want. This is one of the simplest ways people flood their context window while the skill is technically doing the right thing. It's going out and getting you more information, but if you've got pagination, it very quickly goes 5 to 10 pages deep, and you're picking up 30,000 to 40,000 tokens per page. Very quickly your context window is going to be full, as opposed to something much more efficient that only costs 2,000 tokens per page and is limited to three pages. You can always specify in your skill that if it doesn't find what it needs, it can then go deeper. And that brings us to another thing you can think about doing when you're scraping: designing for incremental runs. The idea here is that as your scraper scrapes things, it perhaps saves them to some kind of local file storage or database, and going forward, you don't want your skill to start from scratch each time.
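Those two ideas, batching with threads and a hard cap on fetches, combine in a few lines; `fetch_fn` stands in for whatever page-fetching function your skill's script actually uses:

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PAGES = 3    # hard stop so pagination can never flood the context window
MAX_WORKERS = 5  # how many fetches run in parallel per batch

def fetch_batch(urls: list[str], fetch_fn,
                max_pages: int = MAX_PAGES,
                workers: int = MAX_WORKERS) -> list:
    """Fetch at most max_pages URLs in parallel, not N sequential round trips."""
    capped = urls[:max_pages]  # enforce the stop condition before any network call
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_fn, capped))
```

The cap lives in code, so even if the model asks for ten pages deep, the script refuses before a single extra request is made.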
It should just check where it was up to in the past and then scrape anything that's newer than what it saved before. You can achieve things like this in your SKILL.md file by putting in something like an incremental mode. You can see the example here: before scraping, check if a previous report exists, look for the most recent matching file, and return what is newly scraped versus what you've already done in the past. And lastly, the other big advantage of getting this into an actual script is that you can hardcode certain things, like your proxy settings for doing the scraping, but also things like category names, usernames, anything where you know you just want to prefill it to get these particular pages. Just a quick recap: that's a handful of patterns you can use here. And really, the common thread is that every single one of these is about being intentional with

### [15:00](https://www.youtube.com/watch?v=IjiaCOt7bP8&t=900s) Segment 4 (15:00 - 16:00)

your tokens. What goes into the context window, what comes out of it, and what should never touch the context window at all. That's the game with scraping skills. So while things like Claude Code and the modern models can be really good at just using web fetch and extracting data out, you really want to think about whether that's what you actually want, especially if you're building a skill that you'll use every single day to do something very specific for your particular business or use case. As skills become more and more important in how you build agents and how you give agents the ability to do different tasks, I'm seeing two things happen. One is that it's amazing these things can just do so much so quickly. But also, I'm seeing people throw out best practices that have been around for years, practices that not only make your applications more efficient but, in this day and age where you're paying quite a lot of money for tokens, can really cut down your token bill. So, if you're looking for more best practices on how to build LLM apps, whether they're agents or flows, then check out this video next.

---
*Source: https://ekstraktznaniy.ru/video/22368*