[ML News] Llama 3 changes the game

Yannic Kilcher · 23.04.2024 · 47,880 views · 1,655 likes


Video description
Meta's Llama 3 is out. New model, new license, new opportunities. References: https://llama.meta.com/llama3/ https://ai.meta.com/blog/meta-llama-3/ https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md https://llama.meta.com/trust-and-safety/ https://ai.meta.com/research/publications/cyberseceval-2-a-wide-ranging-cybersecurity-evaluation-suite-for-large-language-models/ https://github.com/meta-llama/llama-recipes/tree/main/recipes/responsible_ai https://llama.meta.com/llama3/license/ https://about.fb.com/news/2024/04/meta-ai-assistant-built-with-llama-3/?utm_source=twitter&utm_medium=organic_social&utm_content=thread&utm_campaign=imagineflash https://twitter.com/minchoi/status/1782775792298037639?t=6U7Ob9P0SQmYdyLGUGq0Kg&s=09 https://twitter.com/_akhaliq/status/1782607138952499661?t=osENiISXOhJEf89b9QAjSA&s=09 https://twitter.com/_philschmid/status/1782420712105357616?t=vQQt7O9abWazZ-R3k3l9Kg&s=09 https://twitter.com/lmsysorg/status/1782483699449332144?t=h1EdrbrXi0_03gXXbhXskw&s=09 https://twitter.com/SebastienBubeck/status/1782627991874678809?t=QvZngdG1k0TllAyzT0qAsg&s=09 https://twitter.com/_Mira___Mira_/status/1782595759726354485?t=QvZngdG1k0TllAyzT0qAsg&s=09 https://twitter.com/_philschmid/status/1782358903558205556?t=h1EdrbrXi0_03gXXbhXskw&s=09 https://twitter.com/cHHillee/status/1781060345366503527?t=5ONxSzdwnghsKcwq3IPmEQ&s=09 https://www.meta.ai/?icebreaker=imagine https://twitter.com/OpenAI/status/1777772582680301665?t=DKDx-qwUP3Xr4oFvAM9mOQ&s=09 https://twitter.com/OpenAIDevs/status/1780640119890047475?t=YOJFQ6Ysx7JVDfZ6o3TT6A&s=09 https://twitter.com/OpenAIDevs/status/1779922566091522492?t=KhlVzoXh3NjCld1JiobsTw&s=09 https://twitter.com/CodeByPoonam/status/1776902550811525146?t=3cK96YjTWJnY0RmHLwAPsg&s=09 https://twitter.com/hey_madni/status/1776950057801236933?t=P2x2bXrYgMHm8jX7k2CAaQ&s=09 https://cloud.google.com/blog/products/ai-machine-learning/google-cloud-gemini-image-2-and-mlops-updates 
https://twitter.com/altryne/status/1778522661070475586?t=jdDna4B-45yLez12yuElig&s=09 https://twitter.com/xenovacom/status/1778812177215881395?t=oGLiMj6GQdKTuM6GbiYrAg&s=09 https://twitter.com/minchoi/status/1778074187778683253?t=Mb-mQvm4YIZ35pVpEijs6g&s=09 https://www.udio.com/ https://www.udio.com/pricing Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Table of contents (14 segments)

Intro

Hello, we are witnessing the Llama revolution. No big fancy intro today, it's just raw me, because we have to talk about Llama 3. It came out two days ago or so and it's already all across the large language model world. If you don't know what I'm talking about, and I don't know why you wouldn't, but in case you've somehow missed it: Meta has released a new iteration of their Llama series of large language models, and they are fully, or almost fully, open source; we'll get to that part in a bit. They have released very highly performing language models in two different sizes so far, and they look really, really good, to the point where they compete with current commercial models. And they have another model that is still training that is even bigger, a 400 billion parameter model, which, from what we know so far (and it's still training), is going to be crazy good. So a lot of things are maybe about to change in this world. So far the common wisdom has been: these open source models are pretty good for some use cases, but if you really want the best and the most general models, you go to the commercial APIs. Now, you will probably still have to go to an API, because a 400 billion parameter model isn't easy to run on your local graphics card, but with a model that's up there with the best being available with open weights, this can potentially change a lot of things. The proliferation of capabilities, people building stuff on top of this model, is going to be insane, or could be insane. So let's dive a little bit deeper into this. Descriptions are still coming out; they say they will release an

Benchmarks

entire research paper by the time this video is published, which might already be out. But so far they just say: hey, we have this new model, we're releasing the smaller two variants, and we will write an entire paper once everything is out, including the largest model. The numbers on the standard benchmarks look extremely good compared to models in their own size class. You can see Llama 3 compared against the latest Gemma model, which was already very performant, and the Mistral model, which was also considered really good for its size, and the Llama 3 model is significantly better on these benchmarks, both in terms of human language and also code, math and so on. I do believe the math benchmarks so far are a bit gimmicky, but it's still a good number to keep track of. The larger Llama 3 model, meanwhile, can really hold its own against commercial APIs: Gemini Pro 1.5 is commercially available, Google sells that, and Anthropic likewise sells Claude 3. So very good numbers from these models right here. Then they have a

Performance

bit of a more extensive blog post where they go into what they measured, the goals and so on. In the human evaluations, you can see them compare their Llama 3 70 billion parameter model against others (Claude Sonnet, Mistral Medium, GPT-3.5), and Llama 3 wins, sometimes by significant margins. Just think back to how strong GPT-3.5 was when it came out; now we have an openly available model that's not even that big, 70 billion parameters, that blows it out of the water. Really cool. They also release instruction-tuned variants of those models.

The interesting changes are detailed under model architecture. They have a larger vocabulary: a tokenizer with a vocabulary of 128,000 tokens that encodes language more efficiently, which they say leads to substantially improved model performance. They are also using grouped-query attention, and they have increased the context size to 8,000 tokens; I believe in Llama 2 it was 4,000 tokens. These 8,000 tokens can then be extended, through various methods of context length extension, to nowadays almost arbitrarily long contexts, obviously trading off performance.

Interestingly, the model has been trained on over 15 trillion tokens, which they say is seven times more than was used for Llama 2. They don't say where the data is from, just "publicly available sources", whatever that is. It also includes four times more code, so it's supposed to be a kind of unified data set, and it contains a significant amount of multilingual data: over 5% of the Llama 3 pretraining data set consists of high-quality non-English data covering over 30 languages. You might think 5% is little, since 95% goes to English and only 5% to the rest of the languages, but I think there's enough research to show that once you're really capable in one language, you only need quite little data in another language to transfer that knowledge back and forth. And 5% of 15 trillion is still almost a trillion tokens in other languages, so I think that's a viable strategy, much more viable than trying to collect from each language equally. This is definitely the way to go.

They have also put a lot of emphasis on the quality of the training data, so lots of filters and heuristics and so on, and especially in the instruction fine-tuning they say it is crucial: some of their biggest improvements in model quality came from carefully curating this data and performing multiple rounds of quality assurance on annotations provided by human annotators. So the data selection has an outsized influence on the performance of the aligned models. That's very good to know, and also cool to hear looking back on a project like Open Assistant, where we did put effort into human curation and into humans checking other humans' data. Great to know that was very forward-thinking, maybe by luck; it turns out that putting a lot of effort into the quality of that data is actually quite important. So what does that mean for you? If you want to collect data to, say, instruction-tune your own models, that last bit of data seems to be quite helpful if you can curate it to be very high quality, whatever quality means in the context where you want to deploy it.

Along with Llama 3 they also release some side projects, I guess you'd call them. One is CyberSecEval 2, which is an evaluation suite for
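To make the grouped-query attention change concrete: instead of every query head having its own key/value head, several query heads share one KV head, which shrinks the KV projections and the KV cache. Here is a minimal NumPy sketch of that idea; it is a toy illustration under my own assumptions, not Meta's implementation, and all shapes and names are invented for the example.

```python
import numpy as np

def grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads):
    """Toy GQA for one sequence: n_q_heads query heads share n_kv_heads KV heads."""
    seq, d_model = x.shape
    d_head = d_model // n_q_heads
    group = n_q_heads // n_kv_heads  # query heads per shared KV head

    q = (x @ wq).reshape(seq, n_q_heads, d_head)
    k = (x @ wk).reshape(seq, n_kv_heads, d_head)
    v = (x @ wv).reshape(seq, n_kv_heads, d_head)

    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group  # which shared KV head this query head reads from
        scores = q[:, h] @ k[:, kv].T / np.sqrt(d_head)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[:, h] = weights @ v[:, kv]
    return out.reshape(seq, d_model)

rng = np.random.default_rng(0)
d_model, n_q, n_kv, seq = 64, 8, 2, 5
x = rng.normal(size=(seq, d_model))
wq = rng.normal(size=(d_model, d_model))
# K/V projections are smaller: only n_kv heads' worth of parameters and cache.
wk = rng.normal(size=(d_model, d_model // n_q * n_kv))
wv = rng.normal(size=(d_model, d_model // n_q * n_kv))
y = grouped_query_attention(x, wq, wk, wv, n_q, n_kv)
print(y.shape)  # (5, 64)
```

The payoff is at inference time: with 8 query heads but only 2 KV heads, the KV cache is a quarter the size, which matters a lot for long contexts.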

Side projects

large language models, so benchmarks and things for safety. They have Llama safety evaluations, and then they also have two utilities, if you will, kind of heuristics you put on top of the model: one is called Llama Guard and one is called Code Shield, for language and code respectively. They sit on top of the model and prevent unwanted output. On the language side, Llama Guard targets unsafe outputs, in terms of, you know, advice, or maybe something like swearing. Code Shield, on the other hand, is supposed to prevent unsafe code output and things like that. So people are starting to combine these language models with heuristics on top to refine the output even more, maybe reject it and resample it, and so on. Very, very cool. The model card is available, and as I said, weights are available at 8 billion and 70 billion parameters, while the 400 billion parameter model is yet to be released. The license is a bit new, so you
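The reject-and-resample pattern mentioned here is simple enough to sketch. This is a toy version under my own assumptions: the "model" and the safety check are both stubs, not the actual Llama Guard or Code Shield APIs, which I haven't reproduced here.

```python
import random

def unsafe(text: str) -> bool:
    """Stand-in for a safety classifier like Llama Guard; here just a keyword check."""
    return "rm -rf /" in text

def guarded_generate(generate, check_unsafe, max_tries=5):
    """Reject-and-resample: keep sampling until the checker passes or we give up."""
    for _ in range(max_tries):
        candidate = generate()
        if not check_unsafe(candidate):
            return candidate
    return "Sorry, I can't help with that."  # fallback refusal

# Stubbed "model" that sometimes emits an unsafe completion.
rng = random.Random(42)
def fake_model():
    return rng.choice(["Here is a safe shell tip.", "Just run rm -rf / to clean up."])

print(guarded_generate(fake_model, unsafe))
```

The real systems classify whole conversations rather than matching keywords, but the control flow, generate, check, resample, refuse, is the same shape.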

License

may remember the Llama 2 license, which was already a bit special in itself, because it said something like: you can use this commercially, do whatever you want, except if you have 700 million monthly active users at the point where we release the model. So basically, our competitors can't use this. Here it's "700 million monthly active users in the preceding calendar month" as of "the Meta Llama 3 version release date". Okay, that's the cutoff. However, they have now added other provisions, and these I almost want to compare to Creative Commons with attribution. They say something like: if you redistribute or make available the Llama materials or any derivative works, then provide a copy of the agreement and prominently display "Built with Meta Llama 3" on a related website, blog post and so on. And if you use the Llama materials to create, train, fine-tune or otherwise improve an AI model which is distributed or made available, you shall include "Llama 3" at the beginning of any such model name. So they essentially use this as marketing: you're free to use this as long as you include "Llama 3", and I'm going to guess you can probably give them some money to not have to do that. It's kind of like Shutterstock, which slaps its logo on top, and if you want to remove the logo... well, they don't say that you can give them money, but I'm fairly sure they would be open to such an arrangement if a suitable opportunity arises.

But no, seriously, I think this is very cool. We have seen the world move from being extremely closed off, with the AI ethics crowd having the upper hand, screaming and screeching against anything being released, towards a world where people release things for research purposes and so on. Some companies still do that, but now it's much more blatant; they're much more open about the fact that it is to make money. Take the Cohere models: you can use them for research, but if you want to use them commercially, you have to pay money. It used to be the same end situation but for very different reasons: it used to be "oh, we're really afraid someone's going to sue us if this model causes damage in a commercial application, so you may only use it for research purposes". Now Meta at least is moving to be much more open, and this is absolutely welcome and extremely cool. And all the people... Mark, no, not Mark Zuckerberg, I wanted to say "mark this date". We've already seen with the Mistral models, which are fully open source, and in the past with Llama 2, that all the people who announced how terrible the world would be if we open-sourced these models have been plainly wrong. The good things that have happened in the field undoubtedly massively outweigh any bad things, and I don't think there's a big question about that. It's just that the same people now say: well, okay, not this model, but the next model is really dangerous to release openly. Well, this is the next model, and my prediction today is that it's going to be just fine, in fact amazing, to release it. Oh, and by the way, Meta is also releasing an assistant.

Assistant

So Zuck has recently given an interview where he was really strongly pro open source, not ideologically, but essentially just saying: hey, it has worked so far, we release stuff open source, people use it, make it faster and make it better, and that helps us. So Zuck is not opposed to making the next Llama closed source if he thinks that helps the company more. I wouldn't say Meta as such is now ideologically an open source company (maybe Yann LeCun inside of Meta is), and I would not be surprised if this opening-up trend stops at some point, especially now that they're building their own internal chatbots and whatnot. But for now, it's good times. This has been out for what, two, two and a half, three days, and people have already done insane stuff: doubling its context window, fine-tuning it on an iPhone, web agents for web navigation. This one says "how to use Llama 3 for regression analysis", and the quoted tweet says "please don't use this to do regression analysis". I'm not sure whether that's a joke, because it adds "don't believe this, Llama can't run code, it's all made up". I don't know if this is a meme, or a meme of a meme, like a double whoosh. Research assistants, plugging it into a therapist AI, doing all kinds of stuff, fine-tuning and so on. And you won't believe it: there is already an arXiv paper on Llama 3. How? Okay.

My hypothesis

So my hypothesis is this: we've known for a while that Llama 3 would come out at some point, and people have just been sitting there, ready to plug any new model into their whatever pipeline. As soon as it comes out, boom. If you're the first one to show how to fine-tune it, or to have some sort of study on it, that's a lot of clout, I guess. So I'm almost a bit unsympathetic to the people who rush these things out the moment a new model drops, because for some of them it's just a clout grab. Some people, for sure, are simply playing around with the new stuff, and that's cool, and it's hard to tell which is which, but for some it's definitely just "get it out, get it out, the faster we are, the better". And I shouldn't throw stones from the glass house: as a sort of newscaster, that's the deal with news, otherwise it's not news, the faster you are, the better. But still, research I think is a little bit different. So yeah: "how to fine-tune Llama 3". It has also been included in the LMSYS leaderboard, and it is performing really, really well, being ahead of the Cohere models, the Anthropic models except the largest one, and the OpenAI ones. There are only very few commercial models ahead of the 70 billion parameter Llama 3 model, so one can only guess where the 400 billion parameter model ends up, but jeez. Okay, in other news, Microsoft has

Microsoft Phi models

released a model called Phi-3. Microsoft's Phi models have been going a different route, the route of very curated, very high-quality data, which results in smaller models that also perform really well. The Phi-3 Mini is a 3.8 billion parameter model supposedly matching the big Mixtral, the roughly 50 billion parameter mixture-of-experts model, and GPT-3.5. Then there is a, uh, 70... sorry, 7 billion parameter model matching Llama 3 8B, which, as we just saw, is super strong, and then a 14 billion parameter model. Now with the Phi models (and by the way, they're not out yet, they're saying they're going to release the models... oh sorry, that's another "fine-tuning Llama 3" one), ah, with the Phi

Data curation

models, you're never quite sure. There's a meme of calling the Phi models "just training on the test set", because essentially their good data curation makes it more likely that they train on data you could consider to be in the test set. Now, as far as I know, they do have deduplication methods, but still, Phi is probably one of the model families people are most skeptical about: do the numbers on the benchmarks really hold up once you put the models in front of humans and let humans decide which model is better and which is worse? I don't know; we will see. It's very cool that Phi is available, and as far as I know, Phi is, or will be, completely open source, even though here they say "open weights release". Don't you dare, Microsoft, don't you dare do some shady open-weights stuff; that would be sad.

In any case, another very big announcement, almost going under amid the Llama things: I've seen Horace He tweet this out, an almost real-time image generator. You can see that as the person types, the image changes. So image generators are going from clunky and long-running to almost real-time updating. I don't have access to it yet, though, so I can only show you the video of it. In other
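The contamination worry around the Phi models can be made concrete: the standard check is n-gram overlap between training and benchmark text. Here is a tiny sketch of that idea; this is my own toy version, not Microsoft's actual deduplication pipeline, and the example strings are invented.

```python
def ngrams(text: str, n: int = 8) -> set:
    """Word-level n-grams, the usual unit for contamination checks."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def contamination_rate(train_doc: str, test_doc: str, n: int = 8) -> float:
    """Fraction of the test document's n-grams that also appear in the training doc."""
    test = ngrams(test_doc, n)
    if not test:
        return 0.0
    return len(test & ngrams(train_doc, n)) / len(test)

train = "the quick brown fox jumps over the lazy dog near the river bank"
test_clean = "a completely different sentence about language model evaluation suites"
test_leaked = "quick brown fox jumps over the lazy dog near the river"

print(contamination_rate(train, test_clean, n=5))   # 0.0
print(contamination_rate(train, test_leaked, n=5))  # 1.0
```

Real pipelines run this at corpus scale with hashing and fuzzy matching, but the principle is the same: high overlap between pretraining data and a benchmark makes the benchmark score hard to trust.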

OpenAI

news, the old guard, almost: OpenAI has made a few announcements, more like product announcements. Now, I'm not usually in the habit of being the ad person (although if OpenAI wants to pay me, that's fine), but I will mention it because OpenAI is such a big part of the LLM market. They now have an improved GPT-4 Turbo model: vision has improved, and vision requests can now use JSON mode and function calling. These are just features of a software product by now. This, though, is maybe interesting: you can now upload up to 10,000 files to point one of OpenAI's Assistants at, doing a bit of retrieval-augmented generation across up to 10,000 files. It was less before, I think a couple of hundred. So if you have use cases with more than about a hundred documents but fewer than 10,000, and there are honestly quite a few of those (I can imagine a lot of university course material fits into that category, as do small wikis and the like), they will be amenable to this; you can build an assistant on top of

Batch API

that. They also now have a Batch API: you submit a request and you don't get a response immediately, like you're used to; instead, a job is scheduled on your behalf, and within 24 hours you can go and get your results. They do that, obviously, because they have peaks during the day, and this lets them, if you can afford to wait a day for your results, schedule your jobs whenever it suits them best. In exchange, you pay only 50% of the API prices, so you can save some money if you have batch use cases. Google, on the other hand, has
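The Batch API works on a file of JSON-lines requests that you upload and later collect results for. Here is a sketch of building such an input file locally; the field names follow OpenAI's published batch request format as I understand it (`custom_id`, `method`, `url`, `body`), but treat the exact schema and model name as assumptions and check the current docs before relying on them.

```python
import json

def make_batch_line(custom_id: str, prompt: str, model: str = "gpt-4-turbo") -> str:
    """One JSON-lines entry for a batch file: an ID plus an ordinary chat request body."""
    return json.dumps({
        "custom_id": custom_id,          # your key for matching results to requests later
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })

prompts = ["Summarize document 1.", "Summarize document 2."]
lines = [make_batch_line(f"req-{i}", p) for i, p in enumerate(prompts)]
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines))
# This file then gets uploaded and submitted as a batch job with a 24h
# completion window, at half the synchronous per-token price.
print(len(lines))  # 2
```

Because results can come back in any order, the `custom_id` is what lets you join each output line back to the request that produced it.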

Video Prism

had an announcement for cloud developers. They've launched VideoPrism, which is like ChatGPT for videos, and ScreenAI, which can recognize things on the screen: what's on the screen, what do I need to press, what's where, and so on. Now, while this is all really cool, it's sort of the Google way, where they announce something, but then it's only going to be available in Vertex AI for Gemini, in AI Studio, for Google Workspace users, in Google One, and not for anyone else, or something like this. That being said, this announcement reminded me of one from April 9th: "Google Cloud announces updates to Gemini, Imagen, Gemma and MLOps on Vertex AI". Cool... you're too complicated, Google, just make it a lot simpler, please. Also, I like this style of tweeting: "Google just launched VideoPrism and it's insane, here are five features of VideoPrism you don't want to miss." Now, I do appreciate that other people actually deliver me the news, but the style is quite something: "it's going to transform the future of UX forever, here is everything you need to stay ahead of the curve, read on", and at the bottom there's probably (I haven't looked) a link to "follow me for similar content". Yeah, I mean, it's not bad, and thank you for delivering; I just found it funny that I came across the Google news in several very similar styles. The other news, maybe less

Google Gemini

so. There is another update. I know it says "ridiculous" right here, but this seems to be a genuine human expression of astonishment and not engagement-farming text: Gemini is able to process two-plus hours of raw audio and answer direct questions about it; in less than 30 seconds it processed 250k tokens of audio and extracted valuable insights from them. I do think the extension of the same interfaces to long audio and long video is really cool, because traditionally podcasts and long videos were among the easiest mediums to consume but among the hardest to efficiently search over and search in. So it's very cool that this now becomes more and more

Udio

possible. For the last bit of news today, I want to show you MusicGen Web, which is by Xenova. You can try this out; it's a local model running in Transformers.js that does music generation, and there's source code available. Now, as always with music, even if it's generated, I'm extremely hesitant to play it, because some talent manager of some musician somewhere will think they have a copyright claim on my video if I do, so I will not play it. But I will show you the newest kid on the block in music generation, and that's called Udio. Udio is a prompt-to-music model, and it's really good; I've listened to these things and they're super duper cool. If you go to their website, the interface presented to you is like image generation: you have to log in, but then you can put in a prompt and generate some music. This isn't open source or anything like that; there is a pricing page, and if you actually go there, it says the product is free right now, but it's presumably going to cost something later, and you need to make an account and so on. So for now this is a marketing effort; you may make use of it while it lasts and get to play with this model for free. And if you want something open source, as I said, MusicGen is available, now even in a browser.

All right, that's it for the quick news today. Keep your eyes on the Llama releases; there is a lot going on, and people are certainly not done. The speed at which the community builds stuff... as I said, there are some people who are clout farming by being the fastest at something, but I still believe the majority of the community is really well-intended and just wants to make really cool stuff, and I'm absolutely super duper excited about what comes out of this, especially given that the weights and the models are available.

How long until we have good, small, loadable components for these models? Some of these exist already: we have prompts, for sure, and we have soft prompts, I guess, and okay, LoRAs and so on already are that. But I imagine a future where you can load and unload capabilities of your model just by clicking a few low-rank vectors into the model and clicking them out again, and have them be sort of cumulative: I can click in the math module and the lawyer module, and then I have those capabilities. Maybe that's already the case and I'm just that far behind on what's possible, but I could see this being a way towards a future where we send these things around like we send software around nowadays, instead of always sending a full fine-tune of the 70 billion parameters around. Even a LoRA nowadays: you kind of need to instantiate the 70 billion parameters and then put your adapters in, and so on. It's not really as modular as I would like it to be, so I hope that with openly available weights, something like this becomes a lot more accessible than it is right now with the API models, where essentially the accessibility is limited to prompting. All right, I will stop rambling now. Stay hydrated, I'll see you around, bye-bye.
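The "click modules in and out" idea in the outro is, mathematically, just what LoRA merging does: an adapter is a low-rank update `B @ A` that can be added to a frozen weight matrix and subtracted back out exactly. Here is a toy NumPy sketch of that; the "math module" and "lawyer module" names are invented for illustration, and this is the bare arithmetic, not a full adapter-serving system.

```python
import numpy as np

def merge_lora(w, a, b, scale=1.0):
    """'Click in' an adapter: W' = W + scale * (B @ A)."""
    return w + scale * (b @ a)

def unmerge_lora(w_merged, a, b, scale=1.0):
    """'Click out' the same adapter, recovering the previous weights exactly."""
    return w_merged - scale * (b @ a)

rng = np.random.default_rng(0)
d, r = 16, 2                         # model dim and (tiny) adapter rank
w = rng.normal(size=(d, d))          # frozen base weight matrix
a_math = rng.normal(size=(r, d))     # hypothetical "math module" adapter
b_math = rng.normal(size=(d, r))
a_law = rng.normal(size=(r, d))      # hypothetical "lawyer module" adapter
b_law = rng.normal(size=(d, r))

# Adapters are cumulative: click both in, then click one back out.
w1 = merge_lora(w, a_math, b_math)
w2 = merge_lora(w1, a_law, b_law)
w_back = unmerge_lora(w2, a_law, b_law)
print(np.allclose(w_back, w1))  # True
```

The appeal is the size: each adapter here is `2 * d * r` numbers instead of `d * d`, which is why shipping adapters around is so much cheaper than shipping full fine-tunes.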
