Why AI is Harder Than We Think (Machine Learning Research Paper Explained)

Yannic Kilcher · 30.04.2021


Video description
#aiwinter #agi #embodiedcognition

The AI community has gone through regular cycles of AI springs, where rapid progress gave rise to massive overconfidence, high funding, and overpromising, followed by those promises going unfulfilled and the field falling into periods of disillusionment and underfunding called AI winters. This paper examines the reasons for the repeated periods of overconfidence and identifies four fallacies that people commit when they see rapid progress in AI.

OUTLINE:
0:00 - Intro & Overview
2:10 - AI Springs & AI Winters
5:40 - Is the current AI boom overhyped?
15:35 - Fallacy 1: Narrow Intelligence vs General Intelligence
19:40 - Fallacy 2: Hard for humans doesn't mean hard for computers
21:45 - Fallacy 3: How we call things matters
28:15 - Fallacy 4: Embodied Cognition
35:30 - Conclusion & Comments

Paper: https://arxiv.org/abs/2104.12871
My Video on Shortcut Learning: https://youtu.be/D-eg7k8YSfs

Abstract: Since its beginning in the 1950s, the field of artificial intelligence has cycled several times between periods of optimistic predictions and massive investment ("AI spring") and periods of disappointment, loss of confidence, and reduced funding ("AI winter"). Even with today's seemingly fast pace of AI breakthroughs, the development of long-promised technologies such as self-driving cars, housekeeping robots, and conversational companions has turned out to be much harder than many people expected. One reason for these repeating cycles is our limited understanding of the nature and complexity of intelligence itself. In this paper I describe four fallacies in common assumptions made by AI researchers, which can lead to overconfident predictions about the field. I conclude by discussing the open questions spurred by these fallacies, including the age-old challenge of imbuing machines with humanlike common sense.

Authors: Melanie Mitchell

Table of Contents (8 segments)

Intro & Overview

Hello there, welcome back! Today we're going to look at "Why AI is Harder Than We Think" by Melanie Mitchell of the Santa Fe Institute. This paper argues that the cycles of AI spring and AI winter come about because people make overly confident predictions, and then everything breaks down. Mitchell goes into why people make these overconfident predictions: she outlines four fallacies that researchers commit, details them, and gives some suggestions for what can be done better. So it's a bit of a different paper than we usually look at, but I'd still be interested in your opinions. Let me know in the comments what you think, share this video out, and of course subscribe if you're interested in machine learning content.

All right: "Why AI is Harder Than We Think." In the abstract, Mitchell makes the case that since the 1950s, when AI was beginning to develop, there have been repeating periods of what are called AI springs, periods of optimistic predictions and massive investment, alternating with periods of disappointment, loss of confidence, and reduced funding, called AI winters. She says that even today, when AI has a number of breakthroughs, the development of long-promised technologies such as self-driving cars, housekeeping robots, and conversational companions has turned out to be much harder than many people expected. One reason for this, she says, is our limited understanding of the nature and complexity of intelligence itself, and she describes four fallacies in common assumptions which can lead to these overconfident predictions.

AI Springs & AI Winters

If you know a little bit about the history of AI, you are aware that there is this cycle of springs and winters, and that it has been this way from the very beginning. She outlines very clearly that when, for example, the perceptron was invented, people thought we were going to do all these extremely cool things. Claude Shannon said: "I confidently expect that within a matter of 10 to 15 years, something will emerge from the laboratory which is not too far from the robot of science fiction fame." And Marvin Minsky forecast that "within a generation the problems of creating 'artificial intelligence' will be substantially solved." This was because they saw really good progress in a very short amount of time and simply extrapolated that progress, and that did not turn out to be the case. Then, of course, there was a winter, a downturn in enthusiasm, after all these promises didn't materialize. Again in the 1980s more AI systems came up, there was an upswing, and then disappointment again, and in the 1990s and 2000s machine learning was introduced. The 1980s, by the way, were the time of expert systems. So first people developed the perceptron and thought that was the best; then, with expert systems, people thought that if we just develop these rules and have rule solvers and rule-searching algorithms, we can build AI; that did not work out. And now we're in the machine learning paradigm, where people develop machine learning algorithms and think, okay, that's the way to go. She makes the case that this time, too, we might be in a period of overconfidence. She writes that in the 2010s, deep learning, in which brain-inspired multi-layer neural networks are trained from data, emerged from its backwater position and rose to superstar status in machine learning. The methods have been around since the 1970s, but recently, with big datasets and big compute, we can scale up to a large number of previously unsolved challenges and solve them: speech recognition, machine translation, chatbots, image recognition, game playing, protein folding, and many more. People call this AI; in essence, machine learning and AI are almost synonymous nowadays. But we shouldn't forget that AI is a different thing from machine learning; it's just that many people today believe you can use machine learning to achieve AI. And so "there was all at once a new round of optimism about the prospects of what has been variously called general, true, or human-level AI."

Is the current AI boom overhyped?

She goes through a bit of what tech CEOs say. A co-founder of Google DeepMind predicted in 2008 that human-level AI "will be passed in the mid-2020s" (I guess that's soon), and Mark Zuckerberg declared that one of Facebook's goals for the next five to ten years is to "basically get better than human level at all of the primary human senses: vision, hearing, language, and general cognition." That would also be very soon; those ten years are coming to an end. She says that in spite of all this optimism, "it didn't take long for cracks to appear in deep learning's facade of intelligence." So already she's calling it a facade of intelligence and not intelligence itself. It turns out that, like all AI systems of the past, deep learning can exhibit brittleness: unpredictable errors when facing situations that differ from the training data. These systems are also susceptible to shortcut learning; I've done a video on shortcut learning, if you're interested. It's a criticism of neural networks that is well summarized here as "learning statistical associations in the training data that allow the machine to produce correct answers, but sometimes for the wrong reasons" (one should add: correct answers on the test set). This stems in large part from how these datasets are generated. There was this famous paper where they tried to detect criminality from a facial portrait, and when they assembled their dataset they took all the criminal examples from mug shots, but all the non-criminal ones from, say, LinkedIn. The model could simply learn who is dressed well and who smiles, which had nothing to do with actual criminality. Shortcut learning essentially says: because of how you constructed the dataset, there might be something in it that lets the model give you the correct answer on your test set (which is constructed the same way) without learning the true thing you want it to learn. That certainly exists; however, I feel it is a dataset problem, not a problem with deep learning itself. In the paper's words, "these mechanisms don't learn the concepts we are trying to teach them, but rather they learn shortcuts to correct answers on the training set, and such shortcuts will not lead to good generalizations." Note that humans do this as well. Think of branding: if you've ever bought a pair of Nike shoes, you probably didn't exactly check their quality or evaluate them. Maybe some of you do, but others just think: it's this brand, that tells me something about the quality of the shoes, they're not the cheapest manufacturer, even though that might not be true. You attach all of this to the brand symbol, so essentially humans perform shortcut learning all the time. But point taken: these networks are brittle and sometimes learn the wrong thing. Of course they're also vulnerable to adversarial perturbations, though I don't think that's an exact criticism; it just means the networks see the world in a slightly different way than we do, and you can exploit that difference to make them do weird things. But you need to really target that; it doesn't happen by itself.
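As a toy illustration of the shortcut-learning point, here is a minimal sketch. It is entirely my own synthetic construction, not from the paper or the video; the "smiling" feature and all the numbers are invented:

```python
# Toy shortcut learning: a spurious feature correlates with the label in
# the training data but not at test time (hypothetical synthetic example).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, spurious_corr):
    """10 weakly informative 'real' features plus one spurious feature."""
    y = rng.integers(0, 2, size=n)
    real = rng.normal(loc=0.2 * (2 * y[:, None] - 1), scale=1.0, size=(n, 10))
    # Spurious feature ("smiling in the photo"): matches the label with
    # probability spurious_corr.
    match = rng.random(n) < spurious_corr
    spurious = np.where(match, y, 1 - y).astype(float)
    return np.hstack([real, spurious[:, None]]), y

X_train, y_train = make_data(2000, spurious_corr=0.98)  # shortcut works here
X_test, y_test = make_data(2000, spurious_corr=0.50)    # shortcut is useless here

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("train acc:", clf.score(X_train, y_train))  # high: the shortcut is learned
print("test acc: ", clf.score(X_test, y_test))    # drops: the shortcut no longer helps
```

The point of the sketch: the classifier gets most of its training accuracy from the spurious feature, and that accuracy evaporates once the test distribution breaks the correlation.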
The big challenge, I think, is what she says next: "However, it seems clear from their non-human-like errors and vulnerability to adversarial perturbations that these systems are not actually understanding the data they process, at least not in the human sense of 'understand'. It's still a matter of debate in the AI community whether such understanding can be achieved by adding network layers and more training data, or whether something more fundamental is missing." A couple of comments right here. On this "understanding" (and she does this correctly, putting "in the human sense of understand" in quotes): I don't think I've yet met anyone who can actually tell me what understanding means, or suggest a rigorous test for it. I think Walid Saba came the closest to actually saying: look, if this and this happens, then I claim it understands. Most people just say something like "I'll know it when I see it," which seems a bit like moving the goalposts of what it means to understand. But I agree: most people wouldn't say that today's AI systems actually understand the data the way humans do, for whatever definition of "understand" is commonly used. The other point is whether that understanding can be achieved by adding network layers and more training data, or whether something more fundamental is missing. You have to remember that human intelligence, however smart it might be, runs on hardware: it runs on neurons. Later the author makes the case for embodied cognition, but ultimately it is an algorithm implemented in hardware, and it's all neurons; sure, they're specialized in various fashions, but ultimately you only have the chemistry that you have, and we know for a fact that intelligence arises from an algorithm on that hardware. So yes, you can ask whether the current neural network architectures are going to be sufficient, but I don't know what fundamental thing might be missing. There might be better, more efficient approaches, but ultimately the human brain is hardware too. If we learn that something specific is missing (maybe a different network structure, or a different type of algorithm on the hardware), we can build that in with more purpose-built architectures.

Okay, so from here she goes into her four fallacies. Remember: she claims that because these fallacies exist, people make overconfident predictions about the future of AI, and we shouldn't do that, because if we make overconfident predictions, we won't meet our goals, the funding will dry up because we've set too-high expectations, and we'll go into another AI winter. That's a valid thing to say. Though at some point she also quotes Elon Musk on self-driving cars: in 2019, Musk promised "a year from now we'll have over a million cars with full self-driving software and everything," and, as she writes, "despite attempts to redefine 'full self-driving' into existence, none of these predictions have come true." The reference here is to a report where Tesla, towards the DMV, towards the regulators, says they're actually not doing full self-driving. I think it's a bit weird to single out Tesla on that.
I'm sure no other company has ever had a different tone in its marketing messaging than when it talks to the regulators; I'm sure that never happens anywhere on the planet, except at Tesla. That being said, Elon Musk does overpromise all the time. On the other hand, he also achieves things that no one else achieves. I think it drives certain people mad that, even though he overpromises so much, he still achieves insane results, just not as insane as the ones he promises. I like that it makes people mad a bit.

Fallacy 1: Narrow Intelligence vs General Intelligence

Okay. The first fallacy: narrow intelligence is on a continuum with general intelligence. The fallacy is thinking that if we develop something like Deep Blue, it's "hailed as the first step of an AI revolution," or GPT-3 is called "a step towards general intelligence," as if there were a continuum where getting better on individual tasks means making progress towards general AI. She quotes: "The first-step fallacy is the claim that, ever since our first work on computer intelligence, we have been inching along a continuum at the end of which is AI, so that any improvement in our programs, no matter how trivial, counts as progress. It was like claiming that the first monkey that climbed a tree was making progress towards landing on the moon." This has connections to Kenneth Stanley's work on exploration in reinforcement learning (goal-undirected, exploration-based learning), where you can deceive yourself by just marching towards a goal; maybe you need an entirely different approach. So the fallacy here is to interpret whatever successes we have as steps towards general AI. And honestly, I get it: Deep Blue is not general AI, and I get that with a minimax search tree and a bunch of handcrafted rules you cannot get to general AI. However, the principles are still in use; Deep Blue isn't so different from AlphaGo, and the concept that you need an internal search that goes to a certain depth, as a look-ahead, in order to achieve AI is not stupid. The demonstration that such systems can beat humans at a previously unbeaten task is, I think, definitely progress towards general AI; I doubt we'll find a general AI that does not have something that at least resembles such a module. The same with GPT-3: I'm fairly convinced that a general AI will have some type of self-supervised learning of language going on. To not call GPT-3 a step in the direction of general intelligence, despite all the criticism ("it's just interpolating training data," yada yada): it's undeniable that GPT-3 and that family of models are tremendous progress, and, I would argue, progress towards general AI. The better question is how much progress: is it halfway there, or one percent there? In a way, a monkey climbing a tree is a little bit of progress towards the moon, because it sees the moon, and it may want to go to the moon. So I agree a little bit; I don't know how valid the fallacy is, though.
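Since the "internal search with a look-ahead" module is doing a lot of work in my argument, here is a minimal depth-limited minimax sketch. It is my illustration only; Deep Blue's real search adds alpha-beta pruning, move ordering, and a hand-tuned evaluation, AlphaGo uses Monte-Carlo tree search with learned networks, and the stone-pile game is invented purely for the demo:

```python
# Minimal depth-limited minimax: search to a horizon, back values up,
# fall back to a static evaluation where the search stops.
from typing import Callable

def minimax(state, depth: int, maximizing: bool,
            moves: Callable, apply_move: Callable, evaluate: Callable) -> float:
    """Value of `state` with optimal play down to `depth` plies, using the
    static evaluation at the horizon or at terminal states."""
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state, maximizing)
    values = [minimax(apply_move(state, m), depth - 1, not maximizing,
                      moves, apply_move, evaluate) for m in legal]
    return max(values) if maximizing else min(values)

# Toy game: a pile of stones, each player removes 1-3, taking the last
# stone wins. Invented purely to exercise the search.
def stone_moves(n): return [t for t in (1, 2, 3) if t <= n]
def stone_apply(n, t): return n - t
def stone_eval(n, maximizing):
    if n == 0:  # previous player took the last stone and won
        return -1.0 if maximizing else 1.0
    return 0.0  # horizon reached mid-game: no heuristic knowledge

# With 5 stones the player to move can force a win (value 1.0).
print(minimax(5, depth=10, maximizing=True,
              moves=stone_moves, apply_move=stone_apply, evaluate=stone_eval))
```

The skeleton is the point: that basic shape survives, in some form, from Deep Blue's search to AlphaGo's.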

Fallacy 2: Hard for humans doesn't mean hard for computers

Fallacy 2: "Easy things are easy and hard things are hard," where the corrected version would actually be: easy things are hard, and hard things are easy. This is all about the assumption that problems that are hard for humans are also hard for computers, so that whenever a computer solves one, we think, wow, it must be super smart, because only a super-smart human could achieve such a thing. "For example, researchers at Google DeepMind, in talking about AlphaGo's triumph, described the game of Go as one of the most challenging of domains." But, as the paper correctly asks: challenging for whom? For humans, perhaps. But, as psychologist Gary Marcus pointed out, there are domains, including games, that, while easy for humans, are much more challenging than Go for AI systems; one example is charades. This is a valid criticism, and people do fall victim to it. How often have you seen someone interact with, not even an AI system, but anything technical, and ask: why can't the stupid computer just do this? How easy is that? If you've coded before, you recognize that it's not that easy, even though it seems super easy to a human. So that's a correct criticism. I do think deep learning has brought us a lot closer here: in all these areas where human-ness shines, especially in the perception domain, deep learning has brought us a lot closer. Though the paper argues there's still a kind of common sense that isn't yet there for machines, which I also agree with.

Fallacy 3: How we call things matters

Fallacy number three: the lure of wishful mnemonics. This is about what we call things. The argument, in a passage she quotes: "A major source of simple-mindedness in AI programs is the use of mnemonics like 'UNDERSTAND' or 'GOAL' to refer to programs and data structures. If a researcher calls the main loop of his program 'UNDERSTAND', he is (until proven innocent) merely begging the question. He may mislead a lot of people, most prominently himself. What he should do instead is refer to the main loop as 'G0034' and see if he can convince himself or anyone else that G0034 implements at least some part of understanding. Many instructive examples of wishful mnemonics by AI researchers come to mind once you see this point." So this is about how we talk about AI systems and the fact that we name things the way we do. She gives more recent examples; for some reason DeepMind features a lot (IBM Watson is here too), and granted, they do make a lot of claims about intelligence and their systems. Demis Hassabis says that "AlphaGo's goal is to beat the best human players, not just mimic them." David Silver said, "we can always ask AlphaGo how well it thinks it's doing during the game... it was only towards the end of the game that AlphaGo thought it would win." The italicized words here are "goal," "thinks," and "thought it would win," and the fallacy is that by using these words we ascribe human tendencies (human wants, human needs) to those systems. The author argues that AlphaGo doesn't have a goal per se, we just say it does; that AlphaGo doesn't think anything about itself; and that winning doesn't mean anything to it. Now, I agree that by calling things certain names we implicitly suggest that something human is happening; we ascribe a humanness to these machines that might not exist. However, I don't necessarily agree that AlphaGo, for instance, has no goal. What does it mean to have a goal? How can you even measure that humans have a goal? Unless you can ask someone what their goal is, you observe their behavior: they seem to be acting to achieve a certain result. AlphaGo does the same. I don't see why AlphaGo doesn't have a goal in the same sense; at least, no one can give me a tangible definition of "goal" that does not include AlphaGo, unless it's explicitly carved so that AlphaGo is excluded. The "how well it thinks it's doing during the game" one is a bit more dicey, because AlphaGo isn't even estimating how it would fare against the current opponent in the current game: it's evaluating its value function, which is trained against itself, i.e. against the best opponent it knows, so it constantly underestimates its chances of winning, unless the opponent actually is better than AlphaGo. And again: of course winning doesn't mean anything to AlphaGo. But what does winning "mean" to a human? Who knows. AlphaGo does have a concept of winning a game, of getting positive reward; there is a clear state in its state space that corresponds to a won game position. So it's a valid criticism that we shouldn't attribute humanness to these machines, but I think a lot of these examples are not as clear-cut as presented.
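To make the "thought it would win" point mechanical, here is a toy sketch of what such a readout reduces to: a scalar coming out of a value function. Everything in it (names, shapes, the tanh readout) is my invention, not DeepMind's code:

```python
# Hypothetical sketch: "the system thinks it will win" as a value readout.
import numpy as np

def value_net(features: np.ndarray, params: np.ndarray) -> float:
    """Stand-in value function: maps board features to a score in (-1, 1),
    the expected outcome assuming both sides play like the agent itself.
    A real system would use a deep network plus tree search."""
    return float(np.tanh(features @ params))

def win_probability(features: np.ndarray, params: np.ndarray) -> float:
    """Map the value in (-1, 1) to a 'probability of winning' in (0, 1).
    Against a weaker opponent this is an underestimate, because the value
    assumes the strongest opponent the agent knows: itself."""
    return 0.5 * (value_net(features, params) + 1.0)

rng = np.random.default_rng(0)
board_features = rng.normal(size=64)   # toy feature vector for one position
params = rng.normal(size=64) * 0.1
print(f"'thinks it will win' with p = {win_probability(board_features, params):.2f}")
```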
The clearer examples, for me, are the datasets and tasks further down: the Stanford Question Answering Dataset (SQuAD for short), the RACE reading-comprehension dataset, the General Language Understanding Evaluation (GLUE), and its derivative SuperGLUE. These are named like this, and if you work with them, you know fairly quickly that what they test is a very limited, very specific kind of question answering, not the general ability to answer questions. But you have to give the thing some name. The thought here is that to the public it can appear overly optimistic when the press writes things like "Microsoft's AI has outperformed humans in natural language understanding," which is of course true. However, I feel the researchers are only mildly to blame for this. Of course there's marketing in research, but there's a high chance that in this particular article it was the journalist who massively inflated those statements to gather more clicks. I agree, though, that to the public it reads as overpromising. Maybe a politician reads it and directs more funding (because: wow), and then you get this overpromising-and-disappointment cycle. To see how specific "question answering" is here, a schematic SQuAD-style record follows.
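The record below follows the structure of the public SQuAD data (a context, a question, and answers given as character offsets into the context); the passage and question themselves are made up. The task is extractive: the answer must be a span copied out of the given passage, which is a much narrower skill than answering questions in general.

```python
# Schematic SQuAD-style record (structure after the public dataset;
# the passage and question here are invented).
example = {
    "context": "The Santa Fe Institute is a research center in New Mexico.",
    "question": "Where is the Santa Fe Institute located?",
    "answers": {"text": ["New Mexico"], "answer_start": [47]},
}

# A model "solves" this by predicting start/end indices into the context:
start = example["answers"]["answer_start"][0]
answer = example["answers"]["text"][0]
assert example["context"][start:start + len(answer)] == answer
```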

Fallacy 4: Embodied Cognition

Fallacy four: intelligence is all in the brain. This is about embodied cognition and the claim that we should pay more attention to it. The fallacy, then, is that intelligence is all in the brain, and she criticizes the information-processing model of the mind: "the assumption that intelligence is all in the brain has led to the speculation that, to achieve human-level AI, we simply need to scale up machines to match the brain's computing capacity and then develop the appropriate software for this brain-matching hardware." Geoff Hinton is quoted in this vein: in the brain we have so-and-so many connections, so at some point this becomes a hardware problem. However, researchers in embodied cognition have been gaining steam since the mid-1970s, and they have a lot of evidence. Embodied cognition means "that the representation of conceptual knowledge is dependent on the body: it's multimodal, not amodal, symbolic, or abstract. This theory suggests that our thoughts are grounded, or inextricably associated with, perception, action, and emotion, and that our brain and body work together to have cognition." There is a lot of evidence that we work that way, that our intelligence works that way. However, if I have to level some criticism here, I would say that maybe the author has a bit of a humanness fallacy of her own in making this argument: just because human intelligence has those properties doesn't mean that this is the only way to reach intelligence, even human-level or human-like intelligence. Just because humans don't work without a body doesn't necessarily mean we can't build intelligence otherwise. There are good arguments for embodiment, don't get me wrong, but the argument as given is: all the intelligence we ever see is body-based; human intelligence is the only intelligence we know, and it is intelligence that interacts with a body, acts in the world, and so on. She writes: "Instead, what we've learned from research in embodied cognition is that human intelligence seems to be a strongly integrated system with closely interconnected attributes, including emotions, desires, a strong sense of selfhood and autonomy, and a common-sense understanding of the world. It is not at all clear that these attributes can be separated." I want to leave the common-sense understanding of the world aside for now and focus on the embodiment. In the same vein, you could say: all human intelligence we've ever encountered looks something like this. There's a brain stem, there's the frontal lobe (I am terrible at drawing brains, but this is a brain), maybe there's the spine, and there are the nerves: a nervous system. All human intelligence looks like this, so shouldn't our computers also have to look like this, since all the intelligence we ever see looks like this? I get it: all the intelligence we see comes with a brain, a central nervous system, and a body. That doesn't mean we need all of it. It might be that the evolutionary pressure on humans, given their bodies, made their intelligence super-entangled with the body, and made the development of intelligence dependent on having one.
But ultimately we have to acknowledge that intelligence is something implemented in hardware, and it is the case that, for instance, paraplegics have intelligence. I get it: things like emotions and desires are still there, and they might play a role in the development of intelligence. Still, paraplegics have intelligence, whereas what doesn't have intelligence is someone who has been to the guillotine; there's no intelligence in the body part. So there's fairly good evidence, I'd say, that intelligence exists independently of the body: we can remove nearly every part of the body and still have intelligence, except the brain. However, the body and embodiment might be necessary to efficiently develop intelligence. The same goes, in my view, for common sense. "Common sense" is a bit of a mystery word that people use; they mean the things that you just know. But I would say that this common sense people mean is the result of an enormous span of evolution, built into your brain, or at least making your brain extremely adept at learning these things really quickly. That's what evolution has done. In that way it is very much a scale problem, a data-plus-scale problem, maybe plus some clever neuromorphic algorithms or something like this. It's not that we have to explicitly put common sense in; we could accelerate things by directly programming it in, but it's not a qualitatively different thing, at least I feel. I do agree that embodiment is probably a good way to go to develop general AI and to push the next boundary of AI, especially multimodal, multi-sensory intelligence, and also reinforcement learning, i.e. models that act in the world and observe their own actions. In a sense we have that already: a recommender system like YouTube's takes actions that influence the very system it then observes; it just doesn't handle that feedback loop super well for now.
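Here is a minimal sketch of the kind of action-observation feedback loop I mean, in the spirit of a recommender system. The ToyFeed environment, its dynamics, and the "engagement" reward are entirely made up for illustration:

```python
# Toy agent-environment loop: the agent's own actions change the world
# it subsequently observes (invented example, not a real recommender).
import random

class ToyFeed:
    """A 'feed' whose future state depends on what the agent recommended."""
    def __init__(self):
        self.user_interest = 0.5  # drifts with what the agent shows

    def step(self, action: float):
        # Showing content shifts the user's interest towards that content.
        self.user_interest += 0.1 * (action - self.user_interest)
        reward = 1.0 - abs(action - self.user_interest)  # engagement proxy
        return self.user_interest, reward                # observation, reward

env = ToyFeed()
observation = env.user_interest
for t in range(5):
    # Naive agent: recommend close to the last observed interest,
    # plus a little exploration noise.
    action = observation + random.uniform(-0.05, 0.05)
    observation, reward = env.step(action)
    print(f"t={t}  action={action:.2f}  obs={observation:.2f}  reward={reward:.2f}")
```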

Conclusion & Comments

So those were the four fallacies. In the conclusion, she lays out a bit of a plan for the future, focusing especially on these points: we need to get these machines some common sense, which is still missing; we attribute too much humanness to them; we should maybe go more after embodied cognition, because that seems very promising; we shouldn't use wishful mnemonics, so we shouldn't call our routines something like "attention," because it's not the same kind of attention that we normally call attention; we shouldn't assume that the same things that are hard for humans are hard for machines; and finally, we shouldn't assume that just any newly solved task is a step towards general intelligence. Those are the four fallacies, and that was this paper. I invite you to read it in full; it has some good stuff that I didn't cover right now. Go check it out, tell me what you think in the comments, and I'll see you next time. Bye!
