BIG AI News : Open Source CRUSHES Everything, GPT-5 Paramters Leaked, AGI Could BeDecades Away?

25:19

BIG AI News : Open Source CRUSHES Everything, GPT-5 Paramters Leaked, AGI Could BeDecades Away?

TheAIGRID 07.09.2024 59 510 просмотров 983 лайков

Machine-readable: Markdown · JSON API · Site index

Смотреть на YouTube

Поделиться Telegram VK Бот

Транскрипт Скачать .md

Анализ с AI

Описание видео

Prepare for AGI with me - https://www.skool.com/postagiprepardness 🐤 Follow Me on Twitter https://twitter.com/TheAiGrid 🌐 Checkout My website - https://theaigrid.com/ 00:00 - Introduction to Reflection 70B open source model 00:51 - Discussion of model performance and benchmarks 02:09 - Explanation of reflection tuning technique 04:34 - Implications of Reflection 70B's performance 05:18 - OpenAI considering higher-priced subscriptions 07:56 - Discussion on the value of AI models for businesses 10:22 - Interview clip about AI cost reduction and improvement 11:32 - Xdai team's Colossus AI training system 13:01 - Andrew Ng's comments on AGI timeline 15:23 - Debate on AGI definition 16:33 - Ilya Sutskever's new AI startup focused on superintelligence 18:18 - Speculation about GPT-5 model size 19:39 - One X Robotics CEO on scaling robot production 22:20 - Introduction to Google's AlphaProteome 24:10 - SEAL leaderboard results for AI models Links From Todays Video: https://x.com/elonmusk/status/1830650370336473253 https://x.com/tsarnick/status/1830175922411966755 https://x.com/TheHumanoidHub/status/1830331686653149628 https://x.com/tsarnick/status/1830045611036721254 https://www.reuters.com/technology/artificial-intelligence/amazon-turns-anthropics-claude-alexa-ai-revamp-2024-08-30/ https://x.com/GoogleDeepMind/status/1831710991475777823 https://x.com/mattshumer_/status/1831767014341538166 https://www.theinformation.com/articles/openai-considers-higher-priced-subscriptions-to-its-chatbot-ai-preview-of-the-informations-ai-summit?utm_campaign=Editorial&utm_content=Newsletter%2CAI+Agenda&utm_medium=organic_social&utm_source=twitter&rc=0g0zvw https://www.reuters.com/technology/artificial-intelligence/amazon-turns-anthropics-claude-alexa-ai-revamp-2024-08-30/ Welcome to my channel where i bring you the latest breakthroughs in AI. From deep learning to robotics, i cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos. Was there anything i missed? (For Business Enquiries) contact@theaigrid.com #LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience

Оглавление (15 сегментов)

Introduction to Reflection 70B open source model

from a new stunning open-source model that can beat current state-of-the-art models like GPT 40 to revised timelines about the future of AGI this video will show you what's happened in the past few days in terms of what you missed so this is arguably one of if not the biggest story that's happened this week in terms of AI announcements this is Matt Schumer the CEO of hyperight AI and he tweets that I'm excited to announce reflection 70b the world's top open-source model trained using reflection tuning a technique developed to enable llms to fix their own mistakes this model manages to surpass everything and with 45 billion parameters coming next week we expect it to be the best model in the world now it shows some further details

Discussion of model performance and benchmarks

but the implications for this are staggering I spoke a little about this in my school Community earlier and I think this shows us what the future of AI is truly going to be so on these charts what we can see here is that we have many different AI systems so on the right we have the Llama 3. 1 the 405 billion parameter model which is of course the base one then we've got Gemini Pro GPT 40 Claude 3 Opus and Claude 3. 5 Sonet and remarkably the only model that manages to come even remotely close to reflection 70b that's 70 billion parameters and that's pretty incredible because we do know that all of these other closed Source state-of-the-art models are actually much larger than 70 billion parameters but what we can see here is that Claude manages to surpass it on the GP QA and on the human eval but it's not by much now it's pretty crazy because I think the most interesting thing that we have here is the way that this model was tuned provides us an insight into how an AI can actually get smarter in terms of its reasoning ability and of course its responses so you can see right here that

Explanation of reflection tuning technique

this is basically from the reflection paper which is basically where earlier I think last year we had a paper that basically spoke about when you have an AI system give out a response you don't just use that response you get the model to respond to its own output and then ask it what it could have done better and in doing so the model kind of critiques itself and it manages to improve its own responses so right here you can see that the model is given a task write the first sentence of the preamble to Declaration of Independence but do it in mirrored writing so you can see right here it has thinking this is something that I have seen in Claude 3. 5 sonnet which is probably one of the reasons that the model is so advanced in terms of its ability to perform correct tasks and then we can see it outlines a plan and it says let's break this down into steps then it says let's go through this plan and then this is where the magic actually occurs we've got reflection before proceeding I should clarify that there are two ways to interpret mirrored writing simply reversing the order of the letters right to left and reversing the order and flipping each character for this task I'll use method b as it is more challenging and true to the concept of mirrored writing that's the end of the ref reflection then you can see it gives it then it reflects one more time and it says I've double checked the mirror text and it appears to be correct each character has been flipped and the order has been reversed the punctuation marks remain as they would in a true mirror refle and then of course you can see after all of this internal thinking it then gives a final output so the user usually wouldn't see all of this stuff if you ever wondered you know what goes on in an AI recently like I said before we saw with claw 3. 5 that you were able to prompt it to see exactly how the model was thinking in terms of its stepbystep process and this kind of gives us an insight into how these models get better responses and here we have the output which says here is the first sentence of the Preamble yada Y and it gives a really good response you can see here it says it holds its own against even the top closed Source models including claw 3. 5 Sonic GPT for 4 the top llm in mlu and other important benchmarks and a beach GPT 40 on every

Implications of Reflection 70B's performance

Benchmark tested and allegedly it clubs llama 3. 1 405 billion parameters and he speaks here about how it uses Chain of Thought So we separate planning into separate step improving the Chain of Thought accuracy SL potency and keeping the output simple and concise for end users so you can use reflection 70 billion parameter here and the thing is that they currently experiencing high traffic so if you're watching the video on the day of the release you probably won't be able to use this AI System since they didn't expect to get that much traffic but I suspect that as the week moves on a lot more people are going to be using this model now

OpenAI considering higher-priced subscriptions

interestingly enough if this model actually does surpass state-of-the-art systems for everyday tasks and in terms of the coherency of the models and how smart it is I think this is just going to completely change the game because people will have no reason to use Frontier models and pay a $20 a month subscription for something that's actually quite worse I do think that yes there are certain scenarios where these AI systems do excel such as tool use and of course certain Windows where you can have interactive demos but for an effective AI system that actually goes through all of these steps I think this is going to be something that is remarkably useful for the average person and it's quite surprising ing that these benchmarks are here so soon now something else that was rather fascinating was the fact that open AI is considering higher price subscriptions TT chat. a now I think that this one has one of the most staggering implications because I don't think openi would do this if monumentally powerful technology wasn't on the horizon so it starts out by saying how much would you be willing to pay for chat GPT every month $50 $75 how about $200 or $2,000 and that's monthly so it says that's the question facing open AI whose Executives we here have discussed high price subscriptions for upcoming large language models such an open AI reasoning Focus strawberry and the new flagship llm dubbed Orion and they state that in early internal discussions subscription prices ranging up to 2,000 per month were on the table said one person with direct knowledge of the numbers though nothing is final and of course we have strong doubts that the final price would be that high now I think there are two main reasons for this because one of the main reasons that this might be and I don't think that this is a main reason but it's just one that could be there is because these models are actually quite expensive to train and run than prior rles so you can see that it says for instance we've reported that when given additional time to think the strawberry model can answer more complicated questions or POS puzzles than open eyes current models can that additional thinking or processing time could mean more computing power and therefore more costs and if that's the case open ey is going to want to pass along some of that to its customer and it basically says here a high price would also mean that openi believes its existing white color customers of chat GPT will find these upcoming models a lot more valuable to their coding analytics or engineering

Discussion on the value of AI models for businesses

work now I think that this is not going to be an issue provided that openi can actually showcase that these models are worth their weight in gold if openi can showcase that these models are truly advanced in either coding engineering or whatever task that requires a lot of cognitive capabilities then I think mainly companies and small businesses are going to be paying a lot for these models that can solve their tasks at 1 100th of the average work now I know that most people might be valuing these models like okay $200 a month that's crazy but I think that's not how you want to look at the value of these models you have to think about it in the other way think about it the fact that most salaries are costing like you know 3,000 4,000 5,000 $6,000 a month and if you can replace that with an AI system that can cost $200 a month companies are not going to bat an eye because that is a fraction of the price that you're currently paying so when you think about it from that aspect you can say okay now I can see why companies and individuals might want to actually pay more for these models if we know that they can actually do a lot more in terms of their ability now of course you do have the cost issue which is facing many of these companies because these models are not free to train it does require millions of dollars in computes and energy and Chip infrastructure but I think the fact that they're discussing this kind of stuff tells me that the next level of AI that's going to be here is going to be really incredible I know that many people are doubting the abilities of strawberry and of course Orion but I'm not doubting that these models are truly going to put us in the next Frontier of AI considering the fact that literally earlier in the video we did see a model that just managed to surpass closed Source AI as a 70 billion parameter model that was fine-tuned on solving problem so I think that the future models we're going to get are going to be absolutely incredible now there's one caveat to this I am wondering if opening ey does have some insane models that are quite expensive how are they going to maintain those customers and those prices when we know that the price and cost of intelligence has been dropping quite a lot and I'm going to show you guys a recent interview where the CEO of perplexity actually talks about this issue more in depth all we know is that the SPID is

Interview clip about AI cost reduction and improvement

not really trying to grow it's growing look at the trend uh the growth rates are clearly there yes the cost per query is very high that's why we want to raise money if it was not high we would need that much money so we are betting on the fact that this technology is on an increasing curve the models are going to keep improving the cost per query is going to go down tremendously and since the D said it it's gone down by 100x and so with the same amount of money you'll be able to serve a lot more users over time so you're betting on something that's really going to work and when the cost per query goes down and the models are going to get more and more capable in smaller sizes then we are heading towards a where whatever is like you know one in a 10 queries hallucinate today would be one in 100 in a year one in a, in 2 years one in 10,000 3 years it's going to like increase uh its quality exponentially so you're like you cannot understand that world today it doesn't exist yet so you're buing on the potential of getting there and in terms of the world that we might not understand in the future this weekend the x. a team brought on Colossus which is 100,000 h100 s in their Data Center Online from start to

Xdai team's Colossus AI training system

finish this was done in just over 4 months Colossus is the most powerful AI training system in the world and moreover it will double its size to 200,000 h20s in just a few months this is absolutely incredible and I think this is why many people continue to tell you including myself that you shouldn't underestimate Elon Musk because he's shown time and time again that he's able to consistently disrupt certain industries with his ability to rapidly put together different teams of specialists in different fields and build great products that people actually want and love if you look at what he's been able to do with SpaceX compared to Boeing it's absolutely incredible that SpaceX has managed to beat Boeing to the punch on several different occasions even as to you know bring back these astronauts that are currently stuck in space that's a whole another video but for those of you who know what I'm talking about you'll know that Elon Musk isn't someone to underestimate so I can't imagine how crazy the future is going to be now whilst yes I think a bit crazy some individuals think not now Andrew NG machine learning Pioneer stated that AGI is many decades away and maybe even longer and companies that say it is only a year or two away are using non standard definitions to lower the bar take a look

Andrew Ng's comments on AGI timeline

at this because this is thought-provoking and may rethink your timelines but I give my opinions in a moment regarding AGI the standard definition of AGI is AI that could do any intellectual task the human so when we have AGI AI should learn to drive a car or learn to fly a plane or learn to write a PhD thesis in the universe for that definition of a I think we're many de I hope we get there in the life one of the reason that there's hype about AGI in just a few years is there's some companies that using very non-standard and if you redefine AI to be a low power then of course we could get there in one or two years but using the standard definition of AGI of AI to could do any intellectual task of human I think we're still many but I think be great regarding AGI the standard definition of AGI is AI that could do any intellectual tasks the human so when we have AGI AI should really learn so I think this one was rather fascinating because many people have you know pushed back on this video because I think what it does is it says that AGI is you know something that's able to do a million different things now there's completely no disagreement with Andrew NG in his prediction but I think the problem with AGI is that there are so many different definitions of AGI that when the talk of AGI is here so many people disagree on what that thing is that we can't ever get a statement from anyone that seems to all converge in one single point like there isn't one single thing that we can all agree and AGI can do which leads to various different definitions for what this system is going to be capable of now of course many people always to refer back to deep Minds levels of AGI paper and this actually gives us a useful guideline for where AGI is you can see that we already have superhuman narrow Ai and you can see that we are trying to get to level two which is comp or at least 50th percentile of skilled adults now I think if we do have an AI that can manage to you know learn how to fly a plane learn how to drive a car do all of that thing I'm not sure with as to why that wouldn't be like level four which is you know virtuoso AGI or near level five which is superhuman AI there's actually one tweet I did agree

Debate on AGI definition

with it says Andrew NG's definition of AGI seems to aim for a Godlike all-encompassing intelligence rather than General human intelligence can you name a person who can drive a car fly a plane write a PhD thesis and do other very specific things why set the bar for AGI so high instead let's define AGI as common sense intelligence across domains with specific expertise in certain areas creating multiple agis rather than one now I do think that this is rather important because if there was someone that could do all of these things they'd definitely be in the 1% of the 1% and if we say that AGI is only a thing that can do that it then kind of confuses people in terms of how these definitions are so I do think that with respect to Andrew and G what he's describing is probably something that is a virtuoso AGI which is outperforming 99% of humans and is probably quite close to something that many could Define as a super intelligent but in speaking about super intelligence if Andrew NG thinks that super intelligence is far away remember ilas satova the man who was integral to opening eyes success you can see here

Ilya Sutskever's new AI startup focused on superintelligence

that it says exclusive opening eyes co-founder satova new safety focused AI startup SSI raises $1 billion so this company is valued at $5 billion and these funds will be used to acquire computing power and basically safe superintelligence Inc founded by ilas satova is basically just going to focus on super intelligence if you miss this basically on their web page they basically said that this is something that they think they could do and it was something that they're going to be pursuing you can see here that the most important sentence from this entire website is that super intelligence is Within Reach now I don't think that if super intelligence wasn't in reach they wouldn't be doing this and whilst yes some people could say that oh it's just another company doing hype marketing I think that the likes of ilas satova are not to be underestimated in terms of his genius to be able to do things that we might not think about so I think it's rather fascinating because they say our singular Focus means no distraction by management overhead or product Cycles which is quite clearly referring to you know chat gbt where you need to do certain things in a certain times you can release your model ahead of competitors and they're basically just going to completely focus on super intelligence and nothing else so the fact that this company founded by Ilia satova is just focusing on superintend intelligence and not even focusing on AGI gives us the kind of insight to where things might be in terms of certain industry insiders and their differences of opinion we also had this image brought to my attention by the legendary Jimmy apples which shows that GPT 5 compared to GPT 4 I know this is

Speculation about GPT-5 model size

quite hard to see but I will zoom in but you can see the image of GPT 5 in terms of the model size it seems that it might be 3 to5 trillion parameters now if you remember GPT 3 was 175 billion GPT 4 was 1. 8 mixture of experts so GPT 5 it looks like it's going to be double the size in terms of how big the model is now this is quite fascinating because this is the first time that we've got any sort of details with regards to the size of GPT 5 and I do know that yes there is quite a lot of misinformation out there with regards to how these models are in terms of their size if you all remember that very disingenuous image that was posted across social media and I'm referring to this image which shows that one model is just incredibly bigger than the other one but this image was proven to be fake and I think visual diagrams like this are completely useful but not useful when they present misinformation GPT 5 it seems like it's only going to be twice the size of GPT 4 but I think the size isn't the main thing like I said before with models like strawberry and Orion the main thing that we want to really understand is of course the reasoning capabilities and the reliability so then we have the CEO

One X Robotics CEO on scaling robot production

of 1x robotics the robot that was recently revealed in an interview stating that they are targeting to scale by 10x every single year tens of thousands in 2026 hundreds of thousands in 2027 and millions in 2028 if this is true and accurate this scale of humanoid robotics is going to be absolutely incredible considering the implications for this we've had like an internal mantra for quite a while which is scale by 10 so we did like 10 EES then we did close to 100 and then we're going to do thousands of Neos tens of thousands of Neos so this means thousands of Neos in 2025 tens of thousand in 2026 hundreds of thousand in 2027 millions in 2028 you can do the math that's hard so far On Target uh but it's really hard it's really and it's been really painful I mean it's not like we're there yet but it's also like not like we haven't done some of this before so for E for example we peaked between 10 and 20 units per month on the previous line and for Neo we're going to basically 10x St right and it's a lot of organizational pain you need to build out how it is to execute as manufacturing company which one thing is all those systems that needs to just be in place for you to be organized and efficient at a SC at a scale with respect to like supply chain and materials and process and traceability and all these things it is step by step yeah I think a mistake people often do is think that like they can go manufacturing from like one to 1 million and like you clearly you can't PR you have to walk the steps and you can walk them pretty fast but you still have to walk the steps so we're taking I'd say a pretty a bit of a humble approach to this I think Supply here sorry demand for Al strips Supply but you still have to walk the steps so feel very confident right now since we're building pretty large batches of Neo now already on the new Factory line that we will hit our targets for next year very confident about 2026 something kind of magical happens when you go from like tens of thousands to hundreds of thousands of millions and it's traditionally like you see a lot of companies fail it's very painful and you just you need to make sure that you have the best people and people have done this before and that you really leverage that you have the full understanding of your product all through your organization right and that if anything goes wrong you can redesign you can fix it and that's like again kind of like back to controlling your own destiny while being vertically integrated and controlling your own supply chain you actually have a you can't blame anyone else you have the power to fix this if something wrong be a lot of work yeah in some really good news from Google we also had Alpha Proto generates novel proteins for biology and health research so basically Alpha Proto is a new AI System created by Google deep mind it's designed to make new proteins

Introduction to Google's AlphaProteome

that can stick to specific Target molecules in our body now this is important because it could help scientists develop new medicines and understand diseases better now why is this important well you know the proteins are like tiny workers in our bodies and they do all sorts of jobs from helping cells grow to fighting off diseases and sometimes we want to create new proteins that can attach to specific Targets in the body like viruses or cancer cells and this helps us to study these new diseases or to create new treatments now if you're wondering how this works it's been trained on a huge amount of information about how proteins work and stick together and when the scientists give it a Target molecule it can design a new protein that will stick to that Target now this is special because it is actually faster and more successful than the older methods of Designing proteins the proteins it designs stick more strongly to their targets and it can actually create create proteins for targets that were difficult to work with before now this is great because this could be used for developing new medicines creating tools to study diseases making senses to detect specific molecules in the body and helping crops resist pests now this does have some limitations for example it couldn't design proteins for one particularly challenging Target and they're still working on improving that but I think that this is rather fascinating because the implications are profound now the team Google are actually being careful about how they share this technology because it is powerful and could potentially be misused and they are working with experts to ensure it is used responsibly

SEAL leaderboard results for AI models

now lastly I will leave you all with this is the leaderboards currently for seal which is a company that focuses on the private evaluations so this is where you have models that are currently evaluated on a private data set so there's no contaminations models can't be trained on these answers so it's just raw Intelligence being measured and we can see that in coding claw 3. 5 Sonet is ranking first and they've added GPT 40 Gemini Pro and mistro large 2 and surprisingly mistro large 2 encoding actually beats out Google Gemini 1. 5 Pro and I've told you guys many times before that mist large 2 is one of the models that is severely underrated you can see here that in math GPT 4 ranks second and even instruction following GPT 40 doesn't rank first so it's clear that by private metrics GPT 40 has been dethroned and I am wondering how long is it going to be before openi reclaims their title I do think that they do have a decent lead in terms of their brand image and the popularity of chat GPT

Другие видео автора — TheAIGRID

Ctrl+V

Экстракт Знаний в Telegram

Экстракты и дистилляты из лучших YouTube-каналов — сразу после публикации.

Подписаться

Лучшие методички за неделю — каждый понедельник