OpenAIs Surprising New Plan For Superintelligence...
28:20

TheAIGRID · 16.07.2024 · 23,556 views · 630 likes

Video description
OpenAI's Surprising New Plan For Superintelligence... Prepare for AGI with me: https://www.skool.com/postagiprepardness 🐤 Follow me on Twitter: https://twitter.com/TheAiGrid 🌐 Check out my website: https://theaigrid.com/ Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos. Was there anything I missed? (For business enquiries) contact@theaigrid.com #LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience

Table of contents (6 segments)

Segment 1 (00:00 - 05:00)

So there has been a recent Pastebin post floating around, and I don't usually cover Pastebin leaks like this, because they usually aren't true, and it doesn't help the AI space when a lot of unverified leaks are circulating. But I wanted to cover this one because it makes some claims that haven't really been picked up by the mainstream AI conversation in terms of how impactful these changes could truly be. So I'm going to walk through some of these claims, because when we look back at what AI has done and look forward to what it could do, I think this Pastebin post — Pastebin is essentially a website where you can anonymously paste a piece of text, and we have previously gotten verifiable leaks from it — is more interesting than it is necessarily real. It basically covers how AI achieves superhuman intelligence through video games.

So this is a document claiming that superhuman intelligence is going to come through video games, and that might actually be true, because so far the only settings where we've truly seen superhuman intelligence are video games. One of the things the document claims is that the superhuman AI that OpenAI is developing is essentially going to use reinforcement learning. Reinforcement learning is basically where the system receives feedback from the game environment and continuously improves its performance; this iterative process allows the AI to refine its strategies to superhuman levels. Because AI systems are virtual and can be trained in virtual environments, you can give them continuous feedback. Unlike humans, where we might make a mistake and it might take a day, a month, or longer to find out whether it was a mistake, an AI can make a million different mistakes a million different times, and it can play against itself millions and millions of times. As it keeps playing, it gets rewarded for good results and punished for bad ones, so it learns exactly what to do, and inside the simulation it can figure out remarkable things.

Now, reinforcement learning has been used before — and it was actually used by OpenAI. Some of you might know this, but OpenAI did work on video games around five to six years ago, in the pre-ChatGPT era, when they were focused purely on research. This was OpenAI and Dota 2: a project where they used reinforcement learning to train agents to get better at the game, and what was crazy was that this agent managed to beat top-level world champions. Take a look at the initial findings, because they show how important reinforcement learning is:

"Good game — the humans no longer think they can win, and they would be absolutely correct. Their dev team got absolutely crushed; I think it was the fastest casting gig of my life. Then it went into game number two. The humans had time to think about the game, and they got crushed even harder. The bots did exactly what I hoped for: they owned this area of the map. You take away two-thirds of the map — they didn't even touch these two bottom towers, and they would be 100% correct. This is one of the highest-level plays you can make: this side of the map is incredibly hard for the bots to control, so they're just playing the top side and the mid side, because they understand these are the two most important parts to control in the game. The ability to intuitively do this is insane. Doing it in one game I could maybe chalk up to dumb luck; doing it two games in a row, flipping the sides, means it's more than coincidence. It took me — and I'm reasonably good at the game — eight years before I learned some of the strategies the bot was intuitively doing."

I just want to interject here, because what he said shows how important this is: it took him eight years to learn what the bots did, and that's the point. With reinforcement learning you can compress years of knowledge into sometimes just a few hours — with large clusters and large training systems it might take a couple of days or weeks — but the point is that you can compress the time it takes to acquire knowledge in a simulated environment, and an AI system can do that remarkably well.

From OpenAI's own description: "To train our bots we use reinforcement learning with self-play. We run the game on over 100,000 CPUs, and our bots learn from every game they play. Because Dota is so complex to learn, even for a single player, we created a hyperparameter we call team spirit. The five bots start out completely selfish, but tuning this knob tells them to care about their teammates, so that they learn to play together as one unit."

Now, Dota 2 wasn't the only game
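The reward-and-punishment loop described above can be made concrete with a minimal sketch. This is a toy tabular Q-learning agent on a tiny one-dimensional corridor, not anything resembling OpenAI Five's actual training setup — all names and parameters here are illustrative assumptions:

```python
import random

def train_q_learning(episodes=2000, size=5, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy 1-D corridor: start at cell 0, goal at cell size-1.
    +1 reward at the goal ("rewarded for good results"), -0.01 per step otherwise
    ("punished for bad results")."""
    random.seed(seed)
    actions = (-1, +1)  # step left / step right
    q = {(s, a): 0.0 for s in range(size) for a in actions}
    for _ in range(episodes):
        s = 0
        while s != size - 1:
            # epsilon-greedy: mostly exploit the best known action, sometimes explore
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda act: q[(s, act)])
            s2 = min(max(s + a, 0), size - 1)
            r = 1.0 if s2 == size - 1 else -0.01
            # Q-learning update: nudge the estimate toward reward + discounted best future value
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in actions) - q[(s, a)])
            s = s2
    return q

q = train_q_learning()
# The learned greedy policy should walk right from every interior cell.
policy = {s: max((-1, +1), key=lambda a: q[(s, a)]) for s in range(4)}
```

The same pattern — act, get rewarded or punished, update, repeat millions of times — is what the Dota bots ran at vastly larger scale, with self-play providing the opponent.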

Segment 2 (05:00 - 10:00)

where OpenAI used reinforcement learning. There was also something that broke the internet at the time — it now has over 10 million views — multi-agent hide-and-seek. This was another project using reinforcement learning. They took two teams of AI bots — seekers (the little agents in red) and hiders (the little agents in blue) — ran millions and millions of games in a simulated environment, and made some pretty interesting discoveries. Take a look:

"On Earth, the simple rules of natural selection and competition led to the evolution of increasingly intelligent life forms. Today we ask if comparably simple rules and multi-agent competition can also lead to intelligent behavior. In a new virtual world, these agents are playing hide-and-seek. These agents have just begun learning, but they've already learned to chase and run away. This is a hard world for a hider who has only learned to flee. However, after training in millions of rounds of hide-and-seek, the hiders find a solution: they learn to use rudimentary tools to their advantage. By grabbing and locking these blocks, they can create their own shelter. The seekers are locked in place for a brief period at the start of the game, giving hiders a chance to prepare. Even so, the hiders must learn to collaborate, accomplishing tasks that would be impossible for any single individual. The hiders are not the only ones who can learn to use tools: after many generations of failing to break into the shelter, the seekers learn to jump over obstacles using ramps. However, after many millions of rounds of having their shelter breached, the hiders learn to take away the primary tool the seekers have at their disposal. Note that we did not explicitly incentivize any of these behaviors: as each team learns a new skill, it implicitly changes the challenges the other team faces, creating a new pressure to adapt. We've also put these agents into a more open-ended environment, randomizing the objects, team sizes, and walls. In this world they learn to construct their own shelter from scratch, requiring that they arrange multiple objects into precise structures. To prevent seekers from using the ramps, the hiders move them to the edge of the play area and lock them in place."

And here's where things get really interesting: after training millions and millions of times, these AI agents managed to figure out really unique and innovative ways to solve complex problems. The crazy thing is that you're not explicitly training them to do X or Y — you just define a reward function and let them do whatever works. That's how you get innovative solutions: you're not training for a specific behavior, you're basically just training the hider not to be caught by the seeker, and it can accomplish that in a variety of ways.

"We originally believed this would be the final strategy the agents learned. However, we found that after more training, the seekers discovered they can jump on top of boxes and 'surf' them to the hiders' shelter. In the last stage of emergent strategy we observed, the hiders learn to lock as many boxes as they can before constructing their fort, in order to defend against box surfing. So how do agents acquire these skills? They're trained using reinforcement learning, an algorithm inspired by the way animals on Earth learn. The agents play thousands of rounds of hide-and-seek in parallel for many days. They train against each other, as well as past versions of themselves, using an algorithm called self-play. Co-evolution and competition on Earth led to the only generally intelligent species known to date: humans. While this world is far less complex than Earth, we have found evidence that simple rules can lead to increasingly intelligent behavior from multi-agent interaction. We hope that with a much larger and more diverse environment, we'll see truly complex and intelligent agents."

So basically they're talking about how they managed to get the system to evolve, simulating in miniature what happened on Earth over millions and billions of years. I think this is why people are looking at this: you get emergent abilities that seem to spawn out of nowhere, and innovative abilities like AlphaGo's move 37, where, because these systems have played millions and millions of games, they can discover new knowledge we haven't found before. One thing that wasn't in this video but was really cool: these AI agents were able to literally break the physics of the simulation. The agents could grab the ramps and throw them out of the play area, glitch, fly up into the air, and jump straight back down — pretty crazy behaviors emerging after many more generations. With that physics-breaking and all these new capabilities coming out of a pretty sterile environment, some people argue this is a kind of superintelligence, and you have to wonder whether it can be applied to real life. I think that's why it was mentioned in the post.

Now, Dr. Jim Fan is talking here about DeepMind's SIMA, an agent that plays seven games and four 3D simulations by reading pixels and generating keyboard/mouse controls. This is the original promise of OpenAI Universe in 2016, which was way ahead of its time; after eight years, it's being delivered with the modern stack. So basically, DeepMind's SIMA is an AI agent in a

Segment 3 (10:00 - 15:00)

game where it can only play for about 10 seconds at a time — it can't play for hours on end, the data pipeline isn't very scalable, and you still need ways for the agent to explore autonomously and look for novel activities to engage in. So it can't do that much yet, but it can play by keyboard and mouse, which is a breakthrough in a sense.

The reason I've added this is that the document talks about generalizing skills: the skills and strategies learned in video games generalize to other domains — for example, the strategic thinking and planning required in games can be applied to fields such as mathematics, science, and complex real-world problem solving. Now, going straight from a video game to solving math equations might be a stretch, but the concept of generalizing these skills is real, and it's partly borne out by Google DeepMind's SIMA. They say that the more training worlds they expose SIMA to, the more generalizable and versatile they expect it to become. So as they get an agent to work in many different video game environments, it wouldn't be crazy to say we could eventually use this in robotics, in our real-world environment. They also say that with more advanced models they hope to improve SIMA's understanding and its ability to act on higher-level language instructions to achieve more complex goals — decomposing them into many different sub-actions.

They go on: learning to follow instructions in a variety of game settings could unlock more helpful AI agents for any environment. And this is something being actively worked on. AI agents are something we currently struggle with, due to the generative nature of our current AI systems — they just don't work reliably across many different tasks. But the point is that if we manage to get these AI agents to work in video games, we're probably going to be able to get them working in any environment. As they put it: "Our research shows how we can translate the capabilities of advanced AI models into useful real-world actions through a language interface. We hope that SIMA and other agent research can use video games as sandboxes to better understand how AI systems may become more helpful."

So this is clearly something Google is working on. The other piece is generalization of skills, and what was crazy is that SIMA generalizes better than models explicitly trained for a single task. In their evaluations, SIMA agents trained on a set of nine 3D games from their portfolio significantly outperformed all the specialized agents trained on each individual game — remarkable, because those agents were trained specifically for that one game and SIMA still beat them. What's more, an agent trained on all but one game performed nearly as well on that unseen game: it was never trained on it, yet it performed nearly as well as the agent that was. Basically, SIMA can function in brand-new environments and generalize beyond its training, which is a key requirement for robotics to succeed. One of the reasons autonomous driving and robotics struggle is that they can't generalize beyond their training data. If you've got an environment fully mapped out, that's fine, but the reason you need so much training data is that when a robot sees a million different scenes, it just can't understand what's going on and it struggles with the task. For example, if I asked you to place a cup on a desk, you could easily do that in any environment — in a desert, on a deserted island — and you could absolutely do it. But with a robot, if you switch the environment around, it starts to struggle to identify things. If we can get an AI agent that can handle that, it will be an important, integral piece of research.

One other crazy thing I quickly want to touch on: going back to the document, it talks about superhuman intelligence through video games, and the more research I do, the more I don't think that's completely impossible if these methods transfer. Recently we got a piece of information suggesting superintelligence isn't as far off as we think — and while that might be hype to drum up investment, I don't think it is. This is from Ilya Sutskever's company: if you haven't been paying attention, he started Safe Superintelligence Inc., and I think the biggest thing people missed is their statement that superintelligence is within reach. That's a pretty wild statement, considering that, allegedly, we don't externally have AGI yet. So I don't know if OpenAI have
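The held-out-game evaluation described above (train on all but one game, test zero-shot on the unseen one) can be sketched as a simple leave-one-out harness. Note that `train_fn` and `eval_fn` are placeholders standing in for SIMA-style training and scoring — assumptions for illustration, not DeepMind's actual API:

```python
def leave_one_out_eval(games, train_fn, eval_fn):
    """For each game, train an agent on every *other* game, then score it
    zero-shot on the held-out one -- the evaluation pattern described above."""
    results = {}
    for held_out in games:
        agent = train_fn([g for g in games if g != held_out])
        results[held_out] = eval_fn(agent, held_out)
    return results

# Toy usage with stand-in functions: the "agent" is just the set of games it saw,
# and "evaluation" checks it truly never saw the held-out game.
scores = leave_one_out_eval(
    ["goat_sim", "valheim", "no_mans_sky"],
    train_fn=set,
    eval_fn=lambda agent, game: game not in agent,
)
```

The interesting claim is that real agents evaluated this way score nearly as well as agents trained directly on the held-out game — that gap is the measure of generalization.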

Segment 4 (15:00 - 20:00)

achieved AGI internally, but the fact that one of their lead researchers is stating that superintelligence is within reach is of course an indicator that maybe we are a lot further along than most people thought. It matters because he was working on superalignment at OpenAI, and clearly they've been working on things that lead him to believe superintelligence isn't that far off.

What we have here is a graph from Leopold Aschenbrenner's document "Situational Awareness," where he argues that automated AI research is possible in around just three to four years. Three to four years is not much time. Think about the last three years of your life — maybe some things changed, but the economy, or the world, going through a huge transformative period in just three to five years would be monumental. Yes, it may take time to distribute the technologies — even if breakthroughs are made, you still have to do testing, manufacturing, compute build-out, energy, storage, and everything else needed to actually get to superintelligence — but the point is that this is no longer a sci-fi trope; it's possibly a reality, considering how far we've come. Once again, Leopold Aschenbrenner was working on superalignment at OpenAI — on aligning superintelligent systems — so this isn't a random blog post from a random blogger; it's from someone who arguably has some of the best insight into where we are on superintelligence. That trajectory isn't clickbait, and I think it will be one of the most important graphs to watch, because if we look at where we'll be in 2030, the kinds of systems that could be built are pretty insane — and not just by OpenAI, but by any nation racing to develop its own frontier models.

Something else this document dives into that I want to cover is Monte Carlo tree search evaluation. Monte Carlo tree search (MCTS) is a very popular search method that evaluates strategies and picks the best one based on simulations. It uses a search tree to represent the possible game states and actions, then runs simulations to determine the best strategy. If you were a human doing this, you would think about a scenario, enumerate the possible follow-on scenarios, mentally simulate them, look at the outcomes, and then decide which branch to take. That's basically how it works in simplistic terms, and it genuinely works — they used it in AlphaGo. As DeepMind describes it, the trained network is used to guide a search algorithm known as Monte Carlo tree search to select the most promising moves. For each move, AlphaZero searches only a small fraction of the positions considered by traditional chess engines: in chess, for example, it searches only around 60,000 positions per second, compared to roughly 60 million for Stockfish — and it beat Stockfish. The crazy thing is that a system searching only tens of thousands of positions per second plays far better, which means this kind of search architecture is something advanced AI systems will need. It's also something humans do automatically: if you're about to make a decision, especially in chess, you think through the different outcomes that may result — maybe 50 or 100 of them — and then you think: given all of those outcomes, and the outcomes of those outcomes, what position am I going to be in? From that you get a very good grip on the position. The reason people say human brains are so remarkable is that humans manage with a search process that considers only hundreds of moves, while current AI systems have to search through vastly more to reach the right decision. And this isn't just theory — as I said, it happened in AlphaGo — and it's been spoken about clearly; I've referenced this clip before:

"These foundation models are world models of a kind, and to do really creative problem solving you need to start searching. If I think about something like AlphaGo and the famous move 37 — where did that come from? All the data it had seen of human games? No, it didn't. It came from identifying a move as being quite unlikely, but possible, and then, via a process of search, coming to understand that it was actually a very good move. To get real creativity you need to search through spaces of possibilities and find these hidden gems — that's what creativity is. I think current language models don't really do that kind of thing: they are mimicking the data, mimicking all the human ingenuity they have seen in data that comes from the
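The select-simulate-evaluate loop described above can be sketched in a few dozen lines. This is a minimal, self-contained UCT-style MCTS for a toy Nim-like game (players alternately take 1–3 stones; taking the last stone wins) — an illustrative toy only, not AlphaZero's network-guided search, since the rollouts here are purely random rather than guided by a trained policy/value network:

```python
import math
import random

class Node:
    """One game state in the search tree."""
    def __init__(self, stones, to_move, parent=None, move=None):
        self.stones, self.to_move = stones, to_move
        self.parent, self.move = parent, move
        self.children = []
        self.wins = 0    # wins counted for the player who moved INTO this node
        self.visits = 0
        self.untried = list(range(1, min(3, stones) + 1))

def mcts_best_move(stones, player=1, iters=3000, c=1.4, seed=0):
    """Pick a move in 'take 1-3 stones, last stone wins' via Monte Carlo tree search."""
    random.seed(seed)
    root = Node(stones, player)
    for _ in range(iters):
        node = root
        # 1. Selection: descend through fully expanded nodes using the UCB1 formula.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ch.wins / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expansion: try one previously unexplored move.
        if node.untried:
            m = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(node.stones - m, 3 - node.to_move, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: play random moves to the end of the game.
        s, p = node.stones, node.to_move
        winner = 3 - node.to_move if s == 0 else None  # mover into a terminal node won
        while winner is None:
            s -= random.randint(1, min(3, s))
            if s == 0:
                winner = p
            p = 3 - p
        # 4. Backpropagation: credit every node on the path back to the root.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.to_move:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

With three stones left, taking all three wins immediately, and the search concentrates its visits on that move — the same "sample a tiny fraction of the tree, but spend it wisely" idea that lets AlphaZero get away with searching orders of magnitude fewer positions than Stockfish.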

Segment 5 (20:00 - 25:00)

internet and is originally derived from humans. If you want a system that can go truly beyond that — not just generalize in a novel way — well, these models can blend things: they can do Harry Potter in the style of a Kanye West rap, even though that has never existed. They can blend things together. But to do something truly creative, something that is not just a blending of existing things, requires searching through a space of possibilities and finding the hidden gems tucked away in there somewhere — and that requires search. So I don't think we'll see systems that truly step beyond their training data until we have powerful search in the process."

That was Shane Legg, who has long predicted AGI by 2028 and who is the co-founder and chief AGI scientist at Google DeepMind, so his opinion carries real weight in these discussions. The point is that this shows search is truly going to be an effective method, and Sam Altman has actually spoken about it too: if you get GPT-4 to search over many different candidate answers, you end up with better responses, because you can select the best ones.

Something else that's interesting and relates back to search is neuro-symbolic AI. As the document puts it, this approach combines neural networks with symbolic reasoning, enabling AI to handle abstract concepts and logic effectively; gaming experience helps develop these advanced cognitive abilities. Essentially, neuro-symbolic AI means giving an AI a piece of the system it never really had before — the piece that completes it in the way human cognition is complete. The neural network part — the "neuro" part — is the brain that learns from examples: think of a child learning what a cat looks like by seeing lots of pictures of cats. It's great at recognizing patterns and understanding things it has seen many times. The symbolic part is the library of rules: like teaching the system specific instructions — how to play a game by following the rules step by step. It's good at thinking logically and following clear instructions. When you combine the two, you get AI systems that are really effective, and many people have said you're never going to get to AGI without neuro-symbolic AI: combining the pattern-matching power of neural networks with the logical thinking of symbolic AI is how you get to truly advanced systems. That's what we had with AlphaGo — it was primarily based on neural network technology, but it also incorporated elements that can be considered symbolic AI: the Monte Carlo tree search component, searching through millions of different possibilities.

And AlphaGo isn't the only example. François Chollet recently spoke about his ARC-AGI test — basically a benchmark where, if an LLM can't do it, it isn't AGI yet. It tries to test how well an AI can actually reason, because the tasks it poses aren't really in the training data at all. Chollet was responding to a tweet where someone attacked ARC-AGI using an LLM combined with discrete search, and they managed to get 50%. You might think 50% isn't that good, but before this, people were saying it would never be done, and roughly 85% is the threshold claimed to indicate an actual AGI-level system. Chollet says this has been the most promising branch of approaches so far: leveraging an LLM to help with discrete program search, by using the LLM as a way to sample programs or branching decisions — "this is exactly what neuro-symbolic AI is, for the record" — and he adds that, of course, AlphaGo was neuro-symbolic as well. The point is that asking LLMs alone to do all of these things doesn't really make sense, but combining them with things like a code interpreter or tree search is where we're truly going to get incredible systems.

Now, Sam Altman did briefly speak about this — he hinted that you need AI systems that think in a different way, which is interesting, and of course this touches on the Q* rumors:

"I would love to ask your intuition about what GPT is able to do and not do. It's allocating approximately the same amount of compute for each token it generates. Is there room in this kind of approach for slower, sequential thinking?" "I think there will be a new paradigm for that kind of thinking." "Will it be similar architecturally to what we're seeing now with LLMs? Is it a layer on
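The "LLM as a sampler for discrete program search" idea Chollet describes can be sketched as a propose-and-verify loop: a learned model proposes candidate programs, and a symbolic checker verifies each one exactly against input→output examples. Here `propose` is a stand-in for an LLM sampler — an assumption for illustration, not any real ARC solver's API:

```python
def neuro_symbolic_search(examples, propose, max_candidates=100):
    """Neuro-symbolic loop: 'neural' proposals, 'symbolic' exact verification.
    Returns the first proposed program consistent with all examples, else None."""
    for _ in range(max_candidates):
        program = propose()                            # neural half: sample a candidate
        if all(program(x) == y for x, y in examples):  # symbolic half: exact check
            return program
    return None

# Toy usage: the stand-in "model" proposes simple arithmetic programs in order;
# the search keeps the first one consistent with the examples (here: doubling).
# Note x + 2 fits the first example but is correctly rejected by the second.
candidates = iter([lambda x: x + 2, lambda x: x * 2, lambda x: x ** 2])
solution = neuro_symbolic_search([(2, 4), (3, 6)], propose=lambda: next(candidates))
```

The division of labor is the point: the neural half narrows an astronomically large program space to plausible candidates, and the symbolic half provides a guarantee that whatever survives actually satisfies the specification.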

Segment 6 (25:00 - 28:00)

top of the LLMs?" "I can imagine many ways to implement that. I think that's less important than the question you were getting at, which is: do we need a way to do a slower kind of thinking, where the answer doesn't have to come immediately? I guess, spiritually, you could say that you want an AI to be able to think harder about a harder problem and answer more quickly on an easier problem, and I think that will be important." "Is that like a human — all kinds of research?" "We have said for a while that we think better reasoning in these systems is an important direction we'd like to pursue. We haven't cracked the code yet, but we're very interested in it."

What's interesting about all of this is that one of the notable AI skeptics, Gary Marcus, agrees. The reason I bring up Gary Marcus is that everything popular this year has been LLMs and transformers, and he has been a huge critic of that paradigm; he argues that you need neuro-symbolic AI. I actually like Gary Marcus, because I always say that if everyone's thinking in the same direction, then no one is really thinking. Even though he can sometimes be a bit too critical of AI, opinions like his are important, because if you start to think, "wait a minute, this guy might actually be right — maybe he's not just a critic but is pointing at a clear limitation of the models," and we actually work on that area, maybe we'll actually get to AGI. Here's what he says:

"So I have three last slides, if I can afford it. Summary one: scaling alone is not all you need. I wrote this slide a couple of years ago and it's still true — I didn't change a word. Current deep learning still struggles with reasoning and factuality, even given vast amounts of data and enormous models — even vaster amounts of data haven't changed that. Classical AI isn't the answer either: I'm not expecting pure symbolic AI to solve all this; we don't know how to generate knowledge at a fast enough clip, and so forth — there are lots of problems. We have to combine online learning with abstraction, and neuro-symbolic approaches, I think, offer our best hope here. There shouldn't be 50 people in this room; there should be 5,000. If we had a tenth of the resources for neuro-symbolic AI that we have right now for large language models, it would completely change the world. But advances in neuro-symbolic AI alone won't be enough, because intelligence is multifaceted, and we shouldn't expect any one-size-fits-all solution — which is why, going back to the three-year problem I was teasing you about, we're just not going to solve all of this in the next few years. There are many facets to this; we also need large-scale knowledge, including some that's innate."

With that being said, take a look at this diagram. Although the diagram is inaccurate with respect to language and reading — language actually corresponds to many different parts of the brain — the point is that if you're trying to represent the human brain, LLMs cover only one subsection of human intelligence, and there's a lot more that needs to go into the pipeline, including neuro-symbolic AI. With that being said, hopefully you all enjoyed this video, and I'll see you in the next one.
