What is Transformers.js?


Table of Contents (6 segments)

Segment 1 (00:00 - 05:00)

Hello everyone. Welcome to a new series on the Coding Train, all about working with machine learning models in JavaScript, in the browser, using the Transformers.js library. This is just the first video in the series. I'm going to give you a little background about what Transformers.js is and where it comes from, look at how you load it into a p5.js sketch, do a kind of hello-world demo, and talk you through the range of possibilities you could explore with the library. Then, hopefully, depending on when you're watching this, there will be many videos that follow, showing a whole bunch of demonstrations of different capabilities and different kinds of models you could use. Now, I also happen to have on my channel an entire playlist (I call it a track): A Beginner's Guide to Machine Learning in JavaScript with ml5.js. What's ml5? Is that different from Transformers.js? Yes, those are two different libraries. ml5 is built on top of a JavaScript library called TensorFlow.js. I know, confusing: they both could be abbreviated as TFJS. A lot of the models I've explored in that series are image classification, object detection, and, more recently, pose detection, 3D pose detection, et cetera. However, there are many new techniques, architectures, capabilities, and models that can run in the browser that are not supported by TensorFlow.js and are instead supported by a different library: Transformers.js. I am, in fact, working with collaborators at NYU and around the world on investigating how we might integrate Transformers.js behind the scenes into ml5. So that might be something I explore later on this channel. But for now, I'm just going to try to figure out Transformers.js in its raw form and see how I can integrate it with p5. This is the thing people say to me all the time: why aren't you doing this in Python?
Well, I would really like to do this in Python too. And in fact, if you are someone who is interested in becoming a machine learning researcher or engineer, Python is a language you're going to want to learn, and you'll probably explore the Hugging Face Transformers library, which is their framework for machine learning models: text, vision, audio, and multimodal models, inference and training, all in Python. For me, what I'm looking to do is explore what, from all of this research, from everything all of these people who work on this are doing, I can pull out and make work with p5.js in the browser. And what's so exciting about that is, when I say make it work in p5 in the browser, I mean the machine learning model is going to be running in the browser too. No cloud server, no API key, no credit card I have to pay to send my data somewhere else. Of course, the capabilities are going to be limited, but I think we're going to see, as I get into these tutorials, what kinds of exciting possibilities there are. So, where does Transformers.js come from? Hugging Face. It is a place where a massive online community of researchers, developers, and coders share models, datasets, and something called a Hugging Face Space, which I will show in a minute, all related to machine learning. Xenova (Joshua Lochner) is the creator of the Transformers.js library that we're using. This is his page on Hugging Face and his various Spaces. Spaces is a platform that allows you to host a live web app to demonstrate a particular feature of a model. So, for example, this is Whisper, which is a machine-learning-powered speech recognition model. And if I click record: choo choo, welcome to the Coding Train. Now, if I click transcribe audio, it's going to load the model. And then there's what I said, with time codes. All right, so you want to get started with Transformers.js. Here's the documentation page. Of course, take a look at the video's description.
I will include links to all the different things I'm showing you here. What I want to focus on is this. This is what makes the Transformers.js library special: this particular API called the pipeline API. You always first have to create this pipeline. Now, this is Python code, but you can see what happens: you are importing the library, you're creating the pipeline, and then you're sending some data into the pipeline and getting a result. And this kind of hello-world version is for sentiment analysis. Maybe that's what I'll do in JavaScript. If I scroll over, now there, this is the JavaScript version. You can see: ah, import pipeline. Great, we're importing the library. Await pipeline: okay, we're allocating a pipeline for the task sentiment analysis, and then we are passing some data in and getting a result out. Here's the thing. If you are a p5.js programmer, there are going to be some things here you don't recognize. Import: that's not syntax you've ever

Segment 2 (05:00 - 10:00)

seen before. What are those curly brackets doing there? What's this await keyword? Const, do I need const? This is the stuff I'm going to talk about and cover in this video. So, let's actually just take this and bring it into a p5.js sketch. Let's just put it in setup right now. The dream here is that I could somehow just get this code to run right here in the p5 web editor. First, if you are watching this video, you will want to update your version of p5 to 2.0. The too-long-didn't-read version of this is: just go click this little chip in the top right of the p5 web editor and make sure you've selected a 2.0 version. Before we tackle the await issue, we've got to tackle this import. JavaScript, JavaScript. Oh, JavaScript. As you might know, or if you've watched me for the ten-plus years I've been doing this, there are always 1,500 different ways to do the same thing in JavaScript. And JavaScript as a programming language is always changing and updating and gaining new features. This isn't really that new anymore, but it might be new to you if all you've used is p5. The way that you load a library in the p5 web editor is by looking at your index.html file and finding the script tag. It's loading the p5 library, version 2. There is something else called ES6, ES6 being the version of JavaScript where we got modules, which support something called import and export. Those are ways to bring another JavaScript file into your current JavaScript file, as well as export something you're coding in your JavaScript file to another JavaScript file. That's what you would use to make libraries and engage in the broader JavaScript ecosystem. Transformers.js uses import. p5.js does not use import. They aren't a perfect match in that sense. No problem. There is a way we can get around this.
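Here is a minimal sketch of where this is headed, assuming the jsDelivr CDN path for the library and the version tag mentioned later in the video (double-check the Transformers.js docs for the current URL and version):

```javascript
// p5.js global mode: no <script type="module"> needed, because we use
// the import() *function* instead of the import *keyword*.
let pipeline; // will hold the function destructured out of the library

async function setup() {
  createCanvas(400, 400);
  // import() returns the whole module as an object; destructure out
  // the one property we need. The extra parentheses are required when
  // destructuring into an already-declared variable.
  ({ pipeline } = await import(
    'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.7.3'
  ));
}
```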
We can ultimately keep the global-mode script tag and rewrite this import statement in a different way: essentially, call an import function to bring in the library. Instead of the import keyword being there, I'm going to say let, then declare this variable pipeline, and set it equal to await, calling a function import, which is supported in JavaScript. And let me close this out so I have more room. I'm going to talk about async and await in a moment. This should work. Actually, it's not going to work; we're missing a detail. Hold on, I'm going to get to all the points I need to get to. We need to have access to the pipeline in our p5.js sketch. The place we get the pipeline from is the Hugging Face Transformers library. So we're going to import it, and we're going to use await and async to do that. However, this @huggingface/transformers is not specific enough here, because I'm doing it this way. I can't just reference the library through @huggingface; I need to use the proper URL, just like the URL I'm using here for the p5 library. There we go. Now I am bringing in the p5 library from the jsDelivr web server, and I'm bringing in the Transformers library from Hugging Face. I am tagging a very specific version, 3.7.3. You don't have to; it will automatically pull the latest version if you don't do this. But I just want to be really explicit in the code and have people see which version I'm using at the time being. I want to talk about this syntax here, pipeline inside curly brackets, which might be unfamiliar to you, and this is as good a time as any for me to explain it. Let's say I have an object in JavaScript: inventory, curly bracket, close curly bracket. What is an object in JavaScript? It is a collection of key-value pairs. So maybe I have a property called mango with a value of 5, a property called blueberry, 100. One more: let's say apple, and I have 9. This is JavaScript object syntax.
Let's say I wanted to get the value 5 into its own variable. I might say this: let mango equal inventory dot mango. This is saying: look up the object inventory, go to its property mango, give me the value there, and set it equal to this new variable called mango. It's essentially pulling out this individual property and putting it into its own variable of the same name. Well, this is a common enough thing to want to do in JavaScript that I can actually do this: let, curly bracket, mango, close curly bracket, equal

Segment 3 (10:00 - 15:00)

inventory. This is known as object destructuring. I'm essentially taking a property called mango out of the object and putting it in its own variable. And in fact, one of the lovely things you can do with this is do it with a bunch of things at once. Now I've essentially taken this object and deconstructed it into three different individual variables. This is exactly what's going on in that import statement. In other words, that imported library is a giant collection of things. Pipeline; I mean, the only other one I know offhand is TextStreamer, but there are all these properties. All of the functions and classes and things that the library can do, they're all sitting there in this giant object. The only thing I want out of it is the pipeline. So I am just taking the pipeline property out of that library and putting it in its own variable. That's what's going on there. If you're paying close attention, you might notice that the pipeline is being imported with the keyword await, and the setup function has this keyword async in front of it. This is a whole new paradigm for handling asynchronous events in JavaScript, like loading data or loading a library, that is now supported in p5 2.0. If you want the full story behind what await and async mean and how they work, I would suggest pausing this and going to watch the separate video I just made all about that for p5 2.0, and then coming right back here. We got the idea of the pipeline from the library; now we need to create a pipeline. When you create a pipeline, you need to specify certain things. The only thing you are technically required to specify, although it's a good idea, as you're going to see as we do more examples, to specify a lot more, is the task you want to do. So I want to create a pipeline for the task sentiment analysis. And the reason why I don't love this is that one of the things that's important to me is to demystify this technology. It's not magic.
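The inventory example above, in runnable form:

```javascript
// An object is a collection of key-value pairs.
const inventory = { mango: 5, blueberry: 100, apple: 9 };

// The long way: look up one property and copy it into a variable.
const mangoCount = inventory.mango; // 5

// Object destructuring does the same thing in one step...
const { mango } = inventory; // mango is 5

// ...and works for several properties at once, which is exactly what
// `let { pipeline } = await import(...)` is doing with the library.
const { blueberry, apple } = inventory;

console.log(mangoCount, mango, blueberry, apple); // → 5 5 100 9
```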
It only exists because there's a very specific machine learning model, trained with specific data, that is doing sentiment analysis in a certain way based on its model architecture and how it was trained. So I really want to get very quickly to the point where we specify both the task and the model we're using. But Transformers.js will pick a default model behind the scenes for any given task. So in this case we allocate, we create, that pipeline. We have to do this asynchronously, with await, for the task sentiment analysis. And const is a label for declaring a variable, just like let; the difference is that if you use const, you can't reassign that variable, so it helps you protect yourself against yourself if you know a variable should never change: it's a constant. There's a lot more to it; I made a whole video about it. It's not that important. I'm going to be myself, I'm me, and I'm just going to use let. One thing I want to point out here is that pipe isn't really a great variable name. Sure, it's short for pipeline, but really what we're getting out of the pipeline function is a sentiment analysis pipeline. And I like to name the variable according to the task it's going to do. So in this case, I would probably call it analyzer. And then what I'm doing is waiting for the analyzer to process the text. I'm going to say results instead of out; I like to say that. So we create the pipeline, and then, if I have that pipeline, I can send data into it. And of course there are often a lot more properties you need to send along. But in this case, for sentiment analysis, what it's doing is taking a block of text and giving it a score between zero and one, one for being positive and zero for being negative. Oops, that's not correct. The score is actually the confidence score for whatever the sentiment label is. So if it's 0.98 positive, it's 98% confident the text is positive. Or if it's 0.97 negative, it's 97% confident it's negative.
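Putting those naming choices together, the hello-world sketch might look like this; a sketch under the assumptions above (the jsDelivr URL, the version tag, and letting the library pick its default model):

```javascript
let analyzer; // named for the task, rather than `pipe`

async function setup() {
  createCanvas(400, 400);
  const { pipeline } = await import(
    'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.7.3'
  );
  // Only the task is required; Transformers.js will pick a default
  // model behind the scenes (and warn you about it in the console).
  analyzer = await pipeline('sentiment-analysis');
  // Send data in, get results out.
  const results = await analyzer('I love the Coding Train!');
  console.log(results); // e.g. [{ label: 'POSITIVE', score: 0.99... }]
}
```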
Okay, let's run this and see if we can console log these results. So, a couple of things you'll notice. One of the things about using Transformers.js is that you're going to see a lot of different warnings and messages in the console. Don't be alarmed. Sometimes those messages might make you think that everything's broken. It's not necessarily. It's just like: hey, you're using this computer with this GPU, and I kind of thought I'd prefer this other one, but I'm going to do it this way; and, you know, I like big numbers with lots of decimal places, but let's just use fewer decimal places for this reason. A lot of messages like that will populate the console when you're working with machine learning models. Don't be alarmed. So, first we're seeing: no model was specified, it's using the default model. One thing I would say is, let's not use a default model. Let's go right here and put that in next. The second argument to the pipeline is the model you're picking. And I'm going to show you how and why you might pick different models. So we can see here we're getting the model that is in Xenova's account on Hugging Face, and it is called distilbert-base-uncased-finetuned-sst-2-english. Great model choice. Now it's giving me

Segment 4 (15:00 - 20:00)

another message about dtype. Let me come back to that in a little bit. Let's just look and see if we got the results we wanted. Positive, 99% positive. Awesome, we did it. We have Transformers.js working in p5. Let's make this interactive in a quick and dirty way, just to have a little demonstration. I'm going to keep the canvas; I think it's sort of fun to have the canvas. Let's make the canvas smaller. I'm going to quickly change the background color of the preview so you can see it more easily. And let's make a text input field. This is using the p5 DOM functions. I'm going to say createInput. So, I've got an input box with "unicorns and rainbows" as the default text. Let's make a submit button, and let's handle that submit button with an event. And just lower down here, I'm going to write the callback for that, a function called analyze. In this function, I need to do what? I need to get the text from the text input box. So I get that with value(), and then I need to get the results from the pipeline by sending in my text, and then we're going to put those results in a paragraph element to display them on the page. But because I went ahead and made this function outside of setup, I need access to the text input and the pipeline. Those need to be global variables. They were called txtInput and pipe; make txtInput a global variable and make pipe a global variable. But we've still got a problem: await is only valid in an async function. So anytime you write any function anywhere in your code that is going to use the await syntax, you've got to label it as async. So I'm going to add that in here. Okay, so these are my results. You can't just put a JavaScript object in a DOM element on the web page; it just renders as [object Object]. Remember, the object has a label and a score. Guess what? This is a good time for us to use object destructuring. I don't need the whole results object.
I want the label and the score pulled out of the results as individual variables. Then I can create a paragraph with the label. That did not work. You know why? I wasn't paying close attention. This is a very common thing. The pipeline does not assume that you will only give it one thing to analyze; you might give it, like, five things to analyze all at once. It's always going to give you back the results as an array, sized by the number of things you sent in to the model. In this case, I only sent one thing, but you can see the results are in the zero element of this array. So I need to be a bit more careful, and I think let's stick with getting the results first. Then I can say label, comma, score equals: destructure the object that's in the zero element. And now, when I hit submit: positive. And why was I doing this? I thought, well, just for demonstration purposes, let's use lerpColor. So, during the live recording I went on a long tangent trying to figure out how to color the canvas according to the sentiment score. I don't think it's worth including all of that here, but I will put up on the screen the solution, which you can see here. If I say "unicorns and rainbows": positive score. If I were to say "sad, sad puppies": negative score. And all I'm doing is using an if statement based on which label I get, and then coloring the background according to an RGB value multiplied by the score. So, let's talk a little bit more about what's going on when you create the pipeline, in terms of choosing the task and the model, and this other message here that says dtype not specified. Let's go back to the Hugging Face website. One of the things I like to do when I'm working with Transformers.js is go to the models page and search for models based on some constraints offered there. So, for example, I can click on Libraries, and if I go and find Transformers.js, I can click on that.
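Assembled from the steps just described, the interactive version might look something like this (the model name is the one shown in the video; the background coloring follows the if-statement approach described above rather than lerpColor):

```javascript
let analyzer; // global so the callback can reach the pipeline
let txtInput; // global so the callback can read the input box

async function setup() {
  createCanvas(200, 200);
  background(240);
  txtInput = createInput('unicorns and rainbows');
  const button = createButton('submit');
  button.mousePressed(analyze); // handle the submit event
  const { pipeline } = await import(
    'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.7.3'
  );
  analyzer = await pipeline(
    'sentiment-analysis',
    'Xenova/distilbert-base-uncased-finetuned-sst-2-english'
  );
}

// Any function that uses await has to be labeled async.
async function analyze() {
  const txt = txtInput.value();
  const results = await analyzer(txt);
  // Results come back as an array, one entry per input,
  // so destructure element zero.
  const { label, score } = results[0];
  createP(label + ' (' + nf(score, 1, 2) + ')');
  // Color the background by label, scaled by confidence.
  if (label === 'POSITIVE') {
    background(0, 255 * score, 0);
  } else {
    background(255 * score, 0, 0);
  }
}
```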
Now what I'm seeing over on the right are only the models that work with Transformers.js. There are 2,238 of them. So, you know, it might be hard to just randomly pick something out of there. However, I can also navigate over to Tasks. You can see here all the different tasks: multimodal, computer vision, natural language processing, audio, et cetera. The task I'm actually looking for is text classification. From what I understand, sentiment analysis is an alias for text classification, so you can kind of use either, and it was nice for the demo to say sentiment analysis. But let's click

Segment 5 (20:00 - 25:00)

on text classification. And now I can see: aha, here is that model that I'm using. So now I can look at all sorts of other kinds of models. Let's check out this one that does language detection. If I want to use the language detection model, I'm going to copy the name of the model and go back to my code. I'm going to duplicate this sketch, so I save the previous example. I'm also going to simplify the analyze function and just console log the results, since we don't know what format they will come back in yet. Let's remove the canvas. And now let's change the model, and I will change the task to text classification. Let's run this. I'm going to click submit on "unicorns and rainbows". We get a result: it is 29% confident it is English. Then I type in some French, "je suis fatigué" (I probably need an accent there, but there's my poor French), and it's 88% confident it's French. So this is what I mean: you actually have everything you need for any given task and any given model you want to use. You just need to create the pipeline. Now, different models might start to need different things in terms of what you pass in, and the results might be formatted differently. So there's some nuance here, but essentially this template should, in theory, allow you to access any model hosted on Hugging Face that is compatible with Transformers.js. Okay, so I'm wrapping up the basic introduction to Transformers.js itself. A few more things I want to cover. One is that when you're using a particular model, you probably want to, instead of just running and using it, do some research about what it is: who trained the model, what data it was trained on, and what are some of the questions and concerns you might want to think about in terms of how it could go wrong or be misapplied. One of the things that is not too uncommon, and that we're seeing here, is that this is a port of another model: it's converted to be compatible with JavaScript.
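The duplicated, simplified sketch might look like this. I'm guessing at the exact model name (a Xenova port of a language-identification model); substitute whatever name you copy from the model page:

```javascript
let detector; // a text-classification pipeline used for language detection
let txtInput;

async function setup() {
  noCanvas(); // this version only logs to the console
  txtInput = createInput('unicorns and rainbows');
  createButton('submit').mousePressed(analyze);
  const { pipeline } = await import(
    'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.7.3'
  );
  // Same template, different task and model.
  detector = await pipeline(
    'text-classification',
    'Xenova/xlm-roberta-base-language-detection' // assumed model name
  );
}

async function analyze() {
  const results = await detector(txtInput.value());
  // We don't know the result format yet, so just log it.
  console.log(results);
}
```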
But to actually look at the information about the model, it looks like I need to navigate to this other URL, which is the original model. Okay, let's click on this. And now we can see more about the model: the information about the architecture of the model and the dataset it was trained on. This is what I'm looking for. Two more important details when you are loading the model, beyond the task and the name of the model. One is the device you want it to run on. This has to do with where the numbers are going to be crunched: what computing resources of your computer are going to be used. Now remember, these computing resources are going to be mediated by the web browser, because this model is running in the web browser itself. And you might be wondering: wait, where are the model files stored? I don't see them here in the p5 web editor; all I'm doing is giving this username and name of the model. The model is being loaded from the cloud, just like the p5 library and the Transformers.js library are. Later, we'll probably see some very large models that take some time to load, and I will need to make examples that have a progress bar or some kind of loading information. These models, this language detection and sentiment analysis, are very small; they load so fast we don't really need to worry about that. But once the model is loaded from the cloud, right here on this laptop, the numbers are being crunched inside the web browser, and the web browser can make use of the computer's graphics hardware, the GPU, or its CPU, the central processing unit. And if I want to be specific, certain models are going to work better depending on what you're doing, but generally speaking, you might on your first try set the device to WebGPU. So we need a third argument: task, model, and then an object that has properties in it, like device, and I'm going to say webgpu.
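In code, the device choice is a property on that third-argument options object (same assumed model name as above):

```javascript
let detector;

async function setup() {
  createCanvas(200, 200);
  const { pipeline } = await import(
    'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.7.3'
  );
  detector = await pipeline(
    'text-classification',
    'Xenova/xlm-roberta-base-language-detection', // assumed model name
    { device: 'webgpu' } // ask for the GPU via the browser's WebGPU API
  );
}
```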
Now, once I've specified webgpu, here's another one of these warnings that looks very scary, like maybe everything broke. But this one you don't have to worry about. It's just giving you some information, like: I know you said WebGPU, but I'm going to do this part on the CPU because I think that's better, and I'll take care of it. Don't worry about it. Let's put the label in the canvas so we don't have to keep looking at the console. All right, let's make sure this still works. English. Great. So it's running on WebGPU. And then there's one more property that's useful to think about, and it has to do with this dtype, which is short for data type. Let's go back to the whiteboard and draw a little diagram of a neural network. Maybe we have some inputs, maybe there's a hidden layer, maybe there are some outputs. So this is a two-layer (hidden and output) vanilla neural network, of the kind I've covered and talked about in numerous other videos on this channel, which I'm sure you could find and I'll link to: a densely

Segment 6 (25:00 - 29:00)

connected, multi-layer perceptron neural network. Oops. Every node is connected to every other node. There's so much I could go back and talk about again in terms of how the data flows through the neural network, and activation functions, and all that stuff, and training, and blah blah. The point of what I'm trying to get at here is that when the model is finished being trained, whatever its architecture might be, there are lots of different nodes connected by weights. So in this case, there are three inputs connected to four hidden nodes: that's 12 connections, or 12 weights. We've got four hidden nodes connected to two outputs, so that's eight weights. And, you know, there are the biases and all sorts of stuff that I'm not counting, but you could kind of say, oh, this neural network has 20 parameters. You might see all these overly hyped models in the news that have 400 bazillion trillion parameters; that's what they're talking about: how many weights, how many numbers are being stored in what is essentially a giant spreadsheet. The thing you often don't think about is: well, what is the precision level of those numbers? Are they just, like, 2.135? Or is one of those numbers 2.1356789123467681? How precise do they need to be? Are they floating-point numbers with 32 bits of detail? Are they floating-point numbers with only 16 bits? There's even a way to store a weight in integer form. And the process of reducing the amount of memory you need to store each weight value is known as quantizing it. Quantization. I don't know; I'll put something on the screen that is the correct term. And of course, in an ideal world, we would all love to have the most precise numbers. But it turns out that if you kind of lop off a bunch of decimal places, the results are kind of just as good, and it runs faster.
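The weight counting and memory math above can be sketched as:

```javascript
// Count the weights in the whiteboard network:
// 3 inputs -> 4 hidden -> 2 outputs (biases ignored, as in the video).
const layers = [3, 4, 2];
let weights = 0;
for (let i = 0; i < layers.length - 1; i++) {
  weights += layers[i] * layers[i + 1]; // fully connected: every-to-every
}
console.log(weights); // → 20

// Quantization reduces the memory needed per weight. Approximate
// bytes to store this network's weights at different precisions:
const bytesPerWeight = { fp32: 4, fp16: 2, int8: 1, q4: 0.5 };
for (const [dtype, bytes] of Object.entries(bytesPerWeight)) {
  console.log(dtype + ': ' + weights * bytes + ' bytes');
}
// In Transformers.js this choice is made when loading, e.g.
//   pipeline(task, model, { device: 'webgpu', dtype: 'q4' })
```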
So this is a decision you can make when you are loading your model: you can choose the level of precision. Floating point 32? Floating point 16? Are we using 8-bit? Are we using 4-bit? In this case, it used the default type, fp32. But in a lot of cases, you're going to see that in my examples it specifies a different data type. Now, not all data types will be available for every model. Let's see if this still works. It did. So I'm using the quantized-down-to-4-bit-integer data type for this particular model. Are we noticing any difference in behavior or how fast it's running? No. But this is something that really can make a difference. So, that's the full story: what Transformers.js is, how you load the library into p5 using p5 2.0, how you allocate a pipeline with a task and a particular model, and thinking about specifying what device to do the computation on and what precision level to use for the data. And then it's up to you to build your interactive system that uses that model in some way: to receive data, produce a result, and use that result. So, I don't know, let's see. Maybe from this video you'll just go off running and pick some interesting models to try and create some projects. I don't know how long it will take me to build out this playlist, but hopefully, whenever you're watching this, you will see some videos ahead of you that show particular kinds of tasks. I could name a few that I might try. Depth estimation is a good one. Speech-to-text. Oh, I want to do a lot of stuff with embeddings; there's so much you can do with embedding models. And actually, working with small large language models is, I think, what you might see in the next video, which follows pretty immediately after this one. All right, thanks for watching, and I'll see you next time.
