# Meta's New AI Statement Just CONFUSED Everyone!

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=0XMlljUmLvo
- **Date:** 04.06.2024
- **Duration:** 33:03
- **Views:** 36,278
- **Source:** https://ekstraktznaniy.ru/video/14270

## Description

Join My Private Community - https://www.patreon.com/TheAIGRID
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Checkout My website - https://theaigrid.com/


Links From Today's Video:


Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries)  contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

## Transcript

### Segment 1 (00:00 - 05:00)

"There is no such thing as AGI, because we can talk about human-level AI, but human intelligence is very specialized, so we shouldn't be talking about AGI at all." Meta's chief AI scientist has said something that has ruffled some feathers in the AI community. Yann LeCun leads a division of Meta's AI team that is working on something different, and his recent comment has, as I said, made some people wonder what's really going on with this whole AGI thing. Just take a listen: "We should be talking about what kind of intelligence we can observe in humans and animals that current AI systems don't have. There's a lot that current AI systems don't have that your cat or your dog has, and they don't have anything close to general intelligence. So the problem we have to solve is how to get machines to learn as efficiently as humans and animals. That is useful for a lot of applications. This is the future, because we're going to have AI assistants that we talk to, that help us in our daily lives, and we need those systems to have human-level intelligence. That's why we need to do it right."

His statement there was rather fascinating: he said we shouldn't even be talking about AGI at all, but about what kinds of intelligence we observe in humans and animals that current AI systems don't have. It's a very interesting perspective, and it makes AI one of the most interesting industries, because people simply don't agree on what AGI is and what it isn't. A lot more is explained in a recent Financial Times article: Meta's AI chief says large language models will not reach human intelligence, and Yann LeCun argues that current AI methods are flawed as he pushes for a world-modeling vision of superintelligence. Take a read, because I think it's important to hear other AI perspectives showing that the path everyone is currently on might not be the path that gets us to superintelligence. And remember, this is one of the most respected AI researchers, someone who has been in the field for a very long time and has made numerous contributions to it.

Meta's AI chief said that the large language models that power generative AI products such as ChatGPT will never achieve the ability to reason and plan like humans, as he focuses instead on a radical alternative for creating superintelligence in machines. I think it's remarkable that he's focused on something aimed at superintelligence while basically stating that LLMs, I wouldn't say have plateaued, but are not the kind of intelligence you can effectively scale. He said that LLMs have a very limited understanding of logic, do not understand the physical world, do not have persistent memory, cannot reason in any reasonable definition of the term, and cannot plan hierarchically. Honestly, you have to agree with some of what he says. Very limited understanding of logic: if you've actually tested GPT-4, sometimes it doesn't truly understand certain things. For example, if you ask it to count the number of R's in "strawberry", it gets it wrong, but if you ask it to count letter by letter, it then gets it right. If you present certain logic problems that are very simple for a human, GPT-4 will sometimes get them wrong and sometimes get them right, and that is apparently a limitation of these large language models. They also do not understand the physical world, and understanding the physical world, how things interact, is an important component of how we learn and digest information. We have senses of touch, smell, and sight, all of these sensors; what do LLMs really have? Just text, which is very limited in terms of how much intelligence it lets you absorb. And then there is persistent memory, which is really important if we want a system to effectively become smarter and effectively learn. Labs may be making some strides on it, but the only way we'll know is if research papers are published in this area, or if any of the big labs announce something. So there are a lot of limitations on LLMs that most people don't think about.
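The strawberry example is worth pausing on: counting characters is trivial in code, and the failure says more about how LLMs see text (as tokens, not letters) than about arithmetic. A minimal illustration of my own, not from the video:

```python
# Letter counting is trivial when you can see individual characters.
# An LLM operates on tokens, so "strawberry" may be one opaque unit to
# the model, which is one plausible reason the count goes wrong.
word = "strawberry"
print(word.count("r"))   # prints 3
print(list(word))        # the letter-by-letter view the model lacks
```

Asking the model to spell the word out first effectively forces it into this letter-by-letter view, which is why that prompt tends to succeed.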

### Segment 2 (05:00 - 10:00)

If you're actually trying to get to AGI, all of these things matter: understanding of logic, understanding of the physical world, persistent memory, being able to reason in a decent definition of the term, and being able to plan hierarchically. When you really break it down, the average human can do these things, not with ease exactly, but it isn't impossible; humans just have an innate understanding of basic logic and of the physical world, and of course a persistent memory. There were also some other things said: LeCun is developing an entirely new generation of AI systems that he hopes will power machines with human-level intelligence, although he said this vision could take ten years to achieve. He hasn't exactly given up on generative AI, but he's working on something completely different. One key indicator: every time a Llama 3 paper is released, people sometimes say "congratulations, Yann LeCun," and he always points out that it wasn't actually him who worked on it, because his group is working on something completely different. He describes it here: "This is the article in the Financial Times where I explained that autoregressive LLMs are insufficient to reach human-level intelligence, or even cat-level intelligence, but alternative architectures", and this is where the meat of the video comes into play, "objective-driven ones, may reach human-level intelligence one day. They use world models based on JEPA, joint embedding predictive architectures, which are not generative. With this we may have systems that understand the physical world, have persistent memory, can reason, and can plan, perhaps hierarchically." So this is where we take a look at his new architecture, which is, I guess you could say, the path to superintelligence and AGI.

"Today, machines require thousands of examples and hours of training to learn a single concept. The goal with JEPA, which means joint embedding predictive architecture, is to create highly intelligent machines that can learn as efficiently as humans. V-JEPA is pre-trained on video data, allowing it to efficiently learn concepts about the physical world, similar to how a baby learns by observing its parents. It's able to learn new concepts and solve new tasks using only a few examples, without full fine-tuning. V-JEPA is a non-generative model that learns by predicting missing or masked parts of a video in an abstract representation space. Unlike generative approaches that try to fill in every missing pixel, V-JEPA has the flexibility to discard irrelevant information, which leads to more efficient training. To allow our fellow researchers to build upon this work, we're publicly releasing V-JEPA. We believe this work is another important step in the journey towards AI that's able to understand the world, plan, reason, predict, and accomplish complex tasks."

Very fascinating, and something you need to know: V-JEPA uses self-supervised learning. It's trained on unlabeled data, meaning it doesn't need predefined labels to learn; self-supervised learning is more efficient and mirrors how humans naturally learn from their environment. V-JEPA also works with abstract representations, a higher-level understanding: instead of focusing on every tiny detail, it learns to understand videos in an abstract way. For example, it can recognize actions like picking up or putting down a pen without needing to analyze every single pixel. The model learns by masking parts of videos and predicting the missing parts, which helps it develop a deeper understanding of the sequences and interactions in the video. He describes why this is a lot more effective than what we are currently doing: "We started experimenting with systems that could do video prediction, and we got nowhere. The systems can sort of predict a little bit of the video, but the representations of the world they learn from this are useless. What completely changed my mind about three or four years ago was realizing that the systems that work best at learning image representations are not generative. They're not systems where you take an image, or video, corrupt it, and restore it. They are systems called joint embedding: you take an image and a corrupted version of it, and you train encoder neural nets to encode the images so that you can recover the representation of the full image from the corrupted one. That's not generative; you're not reconstructing pixels. It turns out you can't reconstruct pixels in an image, it's just too complicated, so you have to use those non-generative architectures. So the future of AI is non-generative." He's clearly stating that the future is, of course, non-generative.
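To make "predict in representation space, not pixel space" concrete, here is a deliberately tiny sketch of where a JEPA-style loss lives. This is my own toy illustration with random, untrained linear maps, not Meta's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(patches, W_enc):
    """Map raw patches to abstract embeddings (a single linear layer here)."""
    return patches @ W_enc

n_patches, patch_dim, emb_dim = 16, 32, 8
patches = rng.normal(size=(n_patches, patch_dim))  # stand-in for video patches
W_enc = rng.normal(size=(patch_dim, emb_dim)) * 0.1

mask = np.arange(n_patches) % 2 == 0               # hide every other patch
targets = encoder(patches[mask], W_enc)            # target *embeddings*

# A crude predictor: summarize the visible patches into one context vector,
# then map it toward the masked-patch embeddings.
context = encoder(patches[~mask], W_enc).mean(axis=0)
W_pred = rng.normal(size=(emb_dim, emb_dim)) * 0.1
predictions = np.tile(context @ W_pred, (mask.sum(), 1))

# JEPA-style loss: distance in embedding space; no pixels are reconstructed.
loss = np.mean((predictions - targets) ** 2)
print(loss)
```

The real V-JEPA uses transformer encoders, a predictor conditioned on mask positions, and a separate target encoder to prevent representation collapse; the sketch only shows that the training objective compares embeddings, not pixels, which is exactly the "discard irrelevant detail" property described above.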

### Segment 3 (10:00 - 15:00)

His further statements on LLMs are also pretty interesting, because in a recent podcast interview he basically said that LLMs are useless. Take a listen, though I don't want to misrepresent him: they're useful for text generation, of course, but in terms of reaching AGI they're not as useful as people might think. "A good guess is that the players there will be the ones that can invest in long-term research, not just fine-tuning current systems, but changing the paradigm, which is basically what I personally work on: the next generation of AI systems. To people like me, LLMs are the past. They're kind of boring now. Of course they're very useful, and there is a whole industry that should be built around them, I'm not saying the opposite. But if you're interested in the future, there's a lot more that will happen, and it may completely change what we perceive. It's entirely possible, for example, that systems will become a lot smarter with a lot less data. Humans are not exposed to nearly as much text data as current AI systems, yet we're still a lot smarter than those systems."

And here is another important clip, where he explains why we're fooled by their fluency and why they don't truly understand how the world works. He's saying that the current collective consensus on how great LLMs are is mistaken; people are fooled by the current state of things. "They really are not good for reaching human-level intelligence. So the idea that we're going to scale those up and train them on even more text, which we don't have, because they're already trained on basically the entirety of the public text on the internet, the idea that we're somehow going to scale them up and reach human intelligence, can't possibly work. So AGI is not around the corner, if you believe in AGI. We're easily fooled by their fluency into thinking that they are smart, but they really aren't. They're useful, there's no question; we can build an entire industry around them, and more power to the people doing this. They're going to make our search engines better, writing aids, everything. But we're not going to get to human-level intelligence by just scaling up LLMs, and the question is: what are we missing?"

And here is why perhaps we're doing all of this wrong, why LLMs will never get to where we want: "A typical LLM today is trained on 10 trillion tokens, 10^13. Each token is about 0.75 words, so you can sort of evaluate it that way, and each token is about two bytes: in a typical tokenization of language you have around 30,000 possible tokens, so that's two bytes. Do the math: that's 2 x 10^13 bytes, the size of the training set used to train the large LLMs. It would take 170,000 years for any human to read through this, reading eight hours a day at 250 words per minute. That's an enormous amount of information, right? Well, not really, because if you take a human child, a four-year-old has been awake about 16,000 hours. Try to quantify how much information has gotten into the visual cortex of that four-year-old in his or her life: you've got a million fibers in each optic nerve, and each fiber carries maybe 10 bytes per second, so that's 20 megabytes per second. Multiply by 16,000 hours and by 3,600 seconds per hour, and the four-year-old child has seen about 10^15 bytes. That's 50 times more than the biggest LLMs in the world, in four years."

What that tells you is that we're not going to reach anywhere close to human or even animal intelligence by just training on text; we've already saturated the amount of text we can train on, since 10 trillion tokens is basically the entire public internet. Then he responds to a student who asks whether we still want to use text at all, and he explains that the V-JEPA architecture is completely different: when you look at how animals and people learn, it's not really based on text. Things are non-linguistic and involve kinds of information you otherwise wouldn't get from text. "No, I'm not talking about text representations of anything. In fact, what I've been arguing for is that you don't want to go through text; text is a terrible representation of knowledge. The kind of manipulation you do in your mind when you think about, say, building something out of wood has nothing to do with language. It's not connected with language. Anything that any animal does has nothing to do with language, at least for most animals."
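LeCun's numbers are easy to sanity-check. A quick back-of-the-envelope script; the per-fiber byte rate and waking-hours figures are his estimates, reused here as-is:

```python
# LLM training set: ~10^13 tokens at ~2 bytes per token
llm_bytes = 1e13 * 2                        # 2e13 bytes

# Time for one human to read it: ~0.75 words/token, 250 wpm, 8 h/day
reading_years = (1e13 * 0.75 / 250) / (60 * 8 * 365)
print(f"{reading_years:,.0f} years")        # ~171,000 years, matching the talk

# A four-year-old's visual input: 2 optic nerves x 1e6 fibers x ~10 bytes/s,
# over ~16,000 waking hours
child_bytes = 2 * 1e6 * 10 * 16_000 * 3_600
print(child_bytes / llm_bytes)              # ~58x, the "50 times" after rounding
```

The exact multiplier depends on the rounded inputs, but the conclusion survives any reasonable choice: a toddler's visual stream dwarfs the public text corpus.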

### Segment 4 (15:00 - 20:00)

"So how do you take the modality, if I'm thinking in images, how do you make a system think in images, to think of the next action, if it doesn't go down to text?" "Well, that's the idea of this JEPA architecture. You train the system to produce a representation within which it can do prediction: given an action, what's going to be the next state of the world? And that doesn't have to have any relation to text or language; it's just some abstract representation of the state of the world."

So those were three of the most important points he made in a talk. It runs about an hour and ten minutes, but I only included around three minutes, because a lot of the other material is about different architectures and other topics; this is the stuff that's really important for this video: him saying that text, while not useless, can't get you to AGI, and that the amount of data LLMs actually have is much less than what humans are absorbing. Looking at scale going forward, this is why people think the scaling laws are probably real, the idea that putting in more data yields bigger and bigger results. It's going to be really interesting to see, with the next level of systems, whether OpenAI has managed to solve the data problem, if data really is the bottleneck, what they're doing to get more out of their systems, and where the limitations are.

Something else really interesting: in this interview on the Lex Fridman podcast, he gives a rough prediction for when he thinks AGI will come. Most people, and by that I mean people who watch videos like these, would say AGI is probably going to be here by 2030, although there are some very eager predictions saying it's going to be here within months, or by 2025. My prediction is still 2029, based on some of the things I've seen. But he says it's going to be here in roughly ten years. I wanted to include this because it's fascinating how he thinks about the future and AGI; maybe we're all just looking in the wrong direction, because all the eyeballs are on OpenAI at this moment, and theirs is a proprietary system. "It's not coming soon, meaning not this year, not the next few years, potentially farther away." "What's your basic intuition behind that?" "First of all, it's not going to be an event. The idea, popularized by science fiction and Hollywood, that somebody is going to discover the secret to AI, or human-level AI, or AMI, whatever you want to call it, and then turn on a machine and we have AGI: that's just not going to happen. It's not going to be an event; it's going to be gradual progress. Are we going to have systems that can learn from video how the world works and learn good world representations? Before we get them to the scale and performance that we observe in humans, it's going to take quite a while; it's not going to happen in one day. Are we going to get systems that can have large amounts of associative memory, so they can remember stuff? Yeah, but the same: it's not going to happen tomorrow. There are basic techniques that need to be developed; we have a lot of them, but getting them to work together in a full system is another story. How are we going to have systems that can reason and plan, perhaps along the lines of the objective-driven AI architectures that I described before? Yeah, but before we get this to work properly, it's going to take a while. And before we get all those things to work together, and on top of this have systems that can learn hierarchical planning, hierarchical representations, systems that can be configured for the many different situations at hand the way the brain can: all of this is going to take at least a decade and probably much more, because there are a lot of problems that we're not seeing right now, that we have not encountered, so we don't know if there is an easy solution within this framework. It's not just around the corner. I've been hearing people for the last 12 to 15 years claiming that AGI is just around the corner, and being systematically wrong, and I knew they were wrong when they were saying it. So yeah, AGI is not around the corner."

And as I said before, one of the most interesting things will be whether OpenAI's latest updates, GPT-5, since they're the market leader and have had about 18 months to develop the system, lean into what Yann LeCun is saying here. There are a lot of AI critics, including people like Gary Marcus, who have been very critical of the AI scene, and there are people like LeCun basically saying everyone's got it wrong. That's why I've included this statement.

### Segment 5 (20:00 - 25:00)

It's going to be super interesting, very soon, to see whether what he is saying is true. Then there is the superintelligence debate: he talks about how the emergence of superintelligence is not going to be an event, and how we don't have anything close to a blueprint for superintelligent systems. It's very interesting that he said this before members of OpenAI's superintelligence safety team recently quit. Maybe he's onto something, because if OpenAI has disbanded the team working on developing superintelligence safely, maybe he's right: if OpenAI were close to superintelligence, I'm sure they would be rapidly hiring for that team. It's very hard to judge from an outside position, so I'm not going to speculate too much, but this is also a response to Max Tegmark's tweet, where Tegmark says: "Yann, my position is markedly different: extreme power concentration must be avoided. I completely agree superintelligence is likely to kill us all if anyone builds it before figuring out how to make it safe; hence nobody should be allowed to build it before it can be safe." I probably should have shown that tweet first, but LeCun is basically responding that it's not one day, boom, we've got superintelligence; it's going to be quite gradual. There's a lot of disagreement with that, and it's hard to argue: on one side, superintelligence probably will be gradual, because things get smarter and smarter; on the other, the moment it becomes smarter than us is the moment we die, so we won't really know what happened. It's pretty crazy.

This is how he describes it: "The design will start by having the intelligence level of a rat or a squirrel. Then we'll ramp up the intelligence progressively, while simultaneously designing proper guardrails and safety mechanisms and testing it in simulated playgrounds. Then we will design it in such a way that its only purpose will be to fulfill goals specified by humans. I call this objective-driven AI, and it will be a diligent problem-solving servant for us. Max seems to believe in the unrealistic sci-fi trope of a suddenly appearing, superintelligent and superpowerful system that also wants to take over, which flies in the face of how we know everything works." There was a lot of disagreement with this, but in this next clip he talks about how, even with this development they're attempting, there's also a hardware problem we need to solve first: "Certainly, scale is necessary but not sufficient, absolutely. So we certainly need computation. We're still far, in terms of compute power, from what we would need to match the compute power of the human brain. That may occur in the next couple of decades, but we're still some ways away, and certainly in terms of power efficiency we're really far off. So there's a lot of progress to make in hardware. Right now a bit of it is coming from silicon technology, but a lot of it is coming from architectural innovation, and quite a bit from more efficient ways of implementing the architectures that have become popular, basically combinations of Transformers and ConvNets. There's still some way to go until we saturate; then we're going to have to come up with new principles, new fabrication technology, new basic components, perhaps based on principles different from classical digital semiconductors."

"Interesting. So you think, in order to build AMI, we potentially might need some hardware innovation too?" "Well, if you want to make it ubiquitous, yeah, certainly, because we're going to have to reduce the power consumption. A GPU today is half a kilowatt to a kilowatt. The human brain is about 25 watts, and a single GPU is way below the compute power of the human brain; you'd need something like 100,000 or a million of them to match it. So we are off by a huge factor here." He spoke about how the energy problem is a real thing, because the human brain is remarkably efficient, but there are other advances in chips that people are working on that really could make a difference. What I'm going to show you next is a short talk on photonic chips, which argues that supercomputing is no longer a niche field thanks to LLMs: now that LLMs are here, people are desperately trying to find ways to scale these systems, and that's hard to do when your computers are so inefficient compared to the human brain. Photonic chips use light instead of electrical signals to perform computations, offering another pathway to energy-efficient computing.
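The power gap LeCun quotes compounds quickly once you need many GPUs. A tiny illustrative calculation; the 700 W figure is my own assumption inside his half-kilowatt-to-a-kilowatt range:

```python
gpu_watts = 700           # assumed, within the 0.5-1 kW range quoted
brain_watts = 25          # LeCun's figure for the human brain
gpus_needed = 100_000     # his lower estimate to match brain compute

# Power draw of a brain-scale GPU cluster vs. the brain itself
cluster_watts = gpu_watts * gpus_needed
print(cluster_watts / brain_watts)   # 2,800,000x the brain's power budget
```

Even at his lower GPU-count estimate, the cluster draws millions of times more power than the 25 W brain it is meant to match, which is the efficiency gap motivating the photonic-chip discussion that follows.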

### Segment 6 (25:00 - 30:00)

Their energy efficiency comes from light-speed processing: light can travel a lot faster, with less resistance, than electrical signals, leading to faster and more efficient data processing. Optical signals also generate less heat than electrical signals, which can reduce cooling requirements and overall energy consumption. Photonic chips can handle higher bandwidths, making them ideal for data-intensive tasks, and they offer parallelism, meaning they can process multiple signals simultaneously, similar to neuromorphic chips, enhancing their efficiency. Some companies are actually working on this, and for LLMs, or whatever kind of system you're working on, some people think this might be the next breakthrough. What's crazy as well is that I believe Sam Altman may have invested in some of these companies. I wasn't sure whether to include this in the video, but I do think that when we look back on the breakthroughs that get us there, faster chips just make sense. Take a look at what's going on here.

"For performance out of computer chips: Jensen had a GTC announcement yesterday, I believe, where he showed a chip that was twice as big for twice the performance, and that's sort of what we're doing in terms of scaling today. The core technology that has driven Moore's law and Dennard scaling, that made computers faster and cheaper, democratized computing for the world, and made this AGI hunt we're on possible, is coming to an end. So at Lightmatter, what we're doing is looking at how you continue scaling, and everything we do is centered around light. We're using light to move the data between the chips, allowing you to scale much bigger, so that you can get to 100,000 nodes, a million nodes and beyond, to try to figure out what's required to get to AGI and these next-gen models. This is what a present-day supercomputer looks like: you'll have racks of networking gear and computing gear, and there are a lot of interconnections inside one of the computing racks, but then you get a spaghetti of a few links over to the networking racks, and this very weak sort of interconnectivity in these clusters. What that means is that when you map a computation, like an AI training workload, onto these supercomputers, you're basically having to slice and dice it so that big pieces of it fit in the tightly interconnected clusters, and you have a really hard time getting good unit-performance scaling as you get to 50,000 GPUs running a workload. So I would basically tell you that a thousand GPUs is not just a thousand GPUs; it really depends how you wire them together, and that wiring is where a significant amount of the value is. This is present-day data centers. What if we deleted all the networking racks? What if we scaled the compute to be a hundred times larger? What if, instead of the spaghetti linking everything together, we had an all-to-all interconnect? What if we deleted all of the networking equipment in the data center? This is the future that we're building at Lightmatter."

This is literally the future of where data centers are going, and that's why I say companies like Lightmatter, with their photonic chips, are really innovative and really important to what we're going to be looking at. That's why I wanted to include this: as we talk about energy efficiency, it's something people don't really discuss, but when you're training these models on hundred-billion-dollar supercomputers and the like, it's just so inefficient when you actually think about it. Eventually we will get there, but as LeCun just pointed out, it's an important point most people don't remember. "We're looking at how you get these AI supercomputers to the next model; it's going to be super expensive and it's going to require fundamentally new technologies. This is the core technology, called Passage, and this is how all GPUs and switches are going to be built. We work with companies like AMD, Intel, Nvidia, and Qualcomm, and we put their chips on top of our optical interconnect substrate. It's the foundation for how AI computing will make progress. It will reduce the energy consumption of these clusters dramatically, and it will enable scaling to a million nodes and beyond. This is how you get to wafer scale, the biggest chips in the world, and this is how you get to AGI." And a very recent thing here was what LeCun was saying about LLMs: he basically said, don't work on LLMs if you're getting into AI research and development right now.

### Segment 7 (30:00 - 33:00)

LLMs haven't exactly hit a plateau, but they are just the backbone of the AI systems we're currently integrating with our technologies; the next stage, I'm guessing, is what will lead us into a different era, where we actually, truly have human-level intelligence. I think he's basically saying: this is what I'm working on, and this is what you should be working on too. "My picture of the progress of AI: think of it as some sort of highway on the path towards reproducing perhaps human-level intelligence or beyond. On that path, which AI has followed for the last 60 or 70 years, there have been a bunch of branches, some of which gave rise to classical computer science, pattern recognition, computer vision, speech recognition, et cetera. All of those things had practical importance at one point, but they're not on the main road to human-level intelligence, if you want. I see LLMs as another one of those off-ramps. It's very useful; there's a whole industry building itself around it, which is awesome, and we're working on it at Meta, obviously. But for people like me who are interested in what the next exit on the highway is, or perhaps not even the next exit, in how to make progress on the highway itself, it's an off-ramp. So I tell PhD students, young students who are interested in AI research, in the next generation of AI systems: do not work on LLMs. There's no point working on LLMs; that's in the hands of product divisions in large companies, and there's nothing you can bring to that table. You should work on the next-generation AI systems that lift the limitations of LLMs, which all of us have some idea about."

So it's going to be interesting. Sam Altman predicts five years, give or take, maybe slightly longer, which kind of lines up with LeCun's prediction; Elon Musk has stated that AGI is going to be here in 2025, and Ray Kurzweil has said 2029. Across the spectrum, things are just different. But overall, stating that there's no such thing as AGI, that this is not the architecture we need, and that what we truly need is V-JEPA, is a bold claim, though he does have some real arguments. I think the point where things change will be either the GPT-5-level systems, where we'll truly see which limitations OpenAI has overcome, or LeCun making a breakthrough, with his team showing us they were right all along and this is truly where things are going. The next couple of years, I genuinely think, are going to be so interesting, because we're truly going to understand where certain things stand and where they are going to be.
