Investigating Human Priors for Playing Video Games (Paper & Demo)
11:01

Yannic Kilcher 20.05.2020 2,647 views 112 likes

Video description
Why are humans so good at video games? Maybe it's because a lot of games are designed with humans in mind. What happens if we change that? This paper removes the influence of human priors from a game and ends up with a pretty fun experience.

Paper: https://arxiv.org/abs/1802.10217
Website: https://rach0012.github.io/humanRL_website/
Code: https://github.com/rach0012/humanRL_prior_games

Abstract: What makes humans so good at solving seemingly complex video games? Unlike computers, humans bring in a great deal of prior knowledge about the world, enabling efficient decision making. This paper investigates the role of human priors for solving video games. Given a sample game, we conduct a series of ablation studies to quantify the importance of various priors on human performance. We do this by modifying the video game environment to systematically mask different types of visual information that could be used by humans as priors. We find that removal of some prior knowledge causes a drastic degradation in the speed with which human players solve the game, e.g. from 2 minutes to over 20 minutes. Furthermore, our results indicate that general priors, such as the importance of objects and visual consistency, are critical for efficient game-play. Videos and the game manipulations are available at this https URL

Authors: Rachit Dubey, Pulkit Agrawal, Deepak Pathak, Thomas L. Griffiths, Alexei A. Efros

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher

Table of contents (3 segments)

Segment 1 (00:00 - 05:00)

Hey there, what's going on? Today we're looking at "Investigating Human Priors for Playing Video Games" by Rachit Dubey, Pulkit Agrawal, Deepak Pathak, Tom Griffiths and Alexei A. Efros. There is a paper to go with this, but I actually don't want to get into the paper too much, in order not to reveal too much of what's coming. Basically, they're trying to investigate what makes video games work for humans: what humans pay attention to, and what priors humans bring into a video game. The fun thing is they've created these games where they ablate these individual priors, and we are going to play them.

The original game right here, as you can see, is kind of a Montezuma's Revenge type of game. You only need the arrow keys. If you go to a bad blob like this, then you die, and you can jump on them, and you can use the ladders, and the spikes will hurt you if you jump on them, and also if you fall down between the platforms. So what you've got to do is basically get the key, then go to the door over here, and bada-boom. Cool, so let's try it out.

They basically ablate different things here. Masked semantics means that you don't know what the objects are anymore. So you might go over here and be like, what's this green thing, can I jump on it? Oh. We're probably a bit biased because we've seen the game before, so we know that the pink ones are the bad ones and that's the key, so we should probably get it. You can imagine that it is a bit harder, but you could still solve it, right?

Reverse semantics is very interesting if you play it for the first time, because all of a sudden now there are the coins, oh, and the fire. But I think humans could probably still figure it out with some minimal trial and error, right? This ice cream cone, you realize.

Okay, now it gets interesting, because until now we've always known where there's an object and where there's no object on the platforms, but now these are masked, so basically you don't know what's a relevant object and what isn't. So I know that there's a bad thing here and a bad thing down to the left, so I'm going to guess these light, aha, these light pink things are the bad things. Yeah, these are the ladders, cool. Bad thing right here. Hey, we are rocking this. Okay, key, where's that? That's the key, and the door. So it gets harder because you kind of have to remember the colors, right? I know that the light pink ones are the bad squares. Still solvable.

So let's jump over these on the left, because this gets really, actually, let's go here. In masked affordances, what they're saying is that from the way something looks you can tell what you can do with it. For example, the platforms, you can jump on them, and the background is sort of empty space, so you know that there's nothing much happening there. So they're trying to take that away by simply re-texturing all the objects here, such that you don't know how you can interact with them. And it does get significantly harder, because, okay, so these green ones are the platforms right here, but I can still see that must be the ladder, right? You can imagine, if you were playing this for the first time, that this is significantly more difficult, but you still see the key, and the green ones being the platforms. We got this.

Now it gets harder: masked visual similarity. This is where they say that maybe, as we did so far, you as a human can kind of make out that things that are visually similar to each other are things you can do the same things with. Like we said, all the green ones are probably the platforms. So they took that away. Haha, gee. Okay, so that must be, okay, can't go here. Fell down. Let's try again. This is a platform. Is this one here? Yes, these are platforms. That was easy, too easy. Running into that bad blob there. Okay, the ladders are still like this, but then, okay, yeah, this gets harder, as you can tell. Okay, I'm too dumb to remember from before. I'm like the ideal subject because I don't remember. How did this work? I'm going to solve this, just so you know: even if this video gets to 50 minutes, I'm going to make it through this. Okay, ah, here we can, okay. See, my short-term memory is so bad. Okay, we got the key, now just get over to the door. Door's over there, yeah, okay.

Let's wipe the short-term memory again. Here, changed ladder interaction, where they basically say, okay, one of the

Segment 2 (05:00 - 10:00)

things that you could know from the real world is how these objects work in the real world. So there aren't really any pink blobs with evil faces; there might be spikes, yes. But a ladder is something where you know how it works. So if you want to go up here, that doesn't work. So you kind of have to figure out, okay, you have to go left and right to go up the ladders. That one I actually tried before, and I figured it out pretty quickly. I think humans are able to figure that out fairly quickly, because you're kind of on the ladder, right? You can actually go down easily, you're on the ladder, and then if that doesn't work, you kind of try to wiggle. Because there's two of them, I don't think that's necessarily super hard.

And now it feels a bit like this Super Mario Maker thing, where people just try to make levels as hard as possible and trick you with trick blocks and invisible stuff. This is hard. So the direction of gravity is another one. Left key jumps, right, and here, this key. So this is extremely hard, ah, because I have to think about every move I make before I do it. Okay, now, this is so unintuitive. For real. Yeah, got it. Okay, really, try this out, this is crazy.

Okay, so the last thing is we combine all of it, I guess except changed gravity and changed interaction. So now all the priors, all the visual object priors, are removed. This is the king's discipline right here. Okay, so we figured out where the blue, cool. So where's the next? This is the next platform we step on. Okay, there must be like, yeah, take that. Okay, but we can't really generalize from this, because we know the next bad blob isn't going to be the same color, right? Okay, white, then this. I know there's a bad thing here, but we'd have to figure this out.

So basically the point of the paper, I think, is to say that this is what you're doing to (oh no, the spikes) RL algorithms in most cases. They simply have to go and basically try every single thing and remember what worked and what didn't work. Now of course the RL algorithms can also use the visual similarity. That was the key, yeah, let's get the door. Don't know where it is. There's like a bad thing here, right? No, spikes.

So either we build these priors into the RL algorithms if we want to get them to human level, or we have some sort of learning of these priors beforehand, or, I don't know, we just take it that RL algorithms have to figure all of this out by themselves.

So they ablate these things right here. You can see that the masked object identity makes kind of the biggest difference in terms of time, number of deaths, and the number of states explored. Reverse semantics: I believe these are humans that are trying it for the first time, and they're just like, oh, an ice cream. So it can also hurt, right? Your algorithm wouldn't be super impressed by it looking like an ice cream, but the human is, very much. And the crazy thing here: you can see exploration in the original game and then exploration in the no-object-prior game. Especially if you play this for the first time, this is just mad, like, no freaking way.

I would actually love to see video games like this coming out. This would be the worst-selling video game of all time, where dynamically it just removes these kinds of priors. But I think it's a really fun way to investigate what humans learn and what they already bring into the game.

So here they have another game, and they do this same thing on an RL agent. You see here the RL agent just doesn't care about any of these things, except visual similarity. Visual similarity helps the RL agent to generalize across the game: if you see a bad blob, the next bad blob will look similar, and that's sort of an invariance that we know they can exploit, since they're using

Segment 3 (10:00 - 11:00)

convolutional neural networks and so on. But I think it is really drawing attention to the importance of priors, prior knowledge, in reinforcement learning and human knowledge. So in this game right here, where you have these hidden rewards that the human doesn't see, right, but if they touch them, they're kind of coins, the human performs way worse than the RL agent, because the RL agent will actually try those things out, and the human, having the prior that the black thing (they don't see the yellow boxes) is just empty space, won't even explore that. So maybe that is something to think about with respect to building RL agents.

All right, I don't want to go into the paper too much. It's a very cool paper, but we're here to play games, and I invite you to read the paper, check out the website, and try these games for yourself. They're a lot of fun, especially if you try them for the first time and
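
The hidden-reward point can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the grid, the symbols, and the two explorers are hypothetical and are not the game or the agents from the paper. It only shows how a prior that "empty-looking space is not worth visiting" can make an explorer miss rewards that an explorer without that prior would find.

```python
# Toy illustration only: the grid, symbols, and both explorers are
# hypothetical, not the environment or agents used in the paper.
GRID = [
    "..K.",  # K = key
    ".#..",  # # = visible obstacle
    "...D",  # D = door
]

# Hidden rewards sit on cells that look like empty space ('.').
HIDDEN_REWARDS = {(0, 0), (1, 2), (2, 1)}


def looks_interesting(symbol):
    """Human-like prior: only visible objects are worth visiting."""
    return symbol != "."


def explore(grid, hidden, use_prior):
    """Visit cells row by row and count the hidden rewards that are found."""
    found = 0
    for r, row in enumerate(grid):
        for c, symbol in enumerate(row):
            if use_prior and not looks_interesting(symbol):
                continue  # skip anything that looks like empty space
            if (r, c) in hidden:
                found += 1
    return found


with_prior = explore(GRID, HIDDEN_REWARDS, use_prior=True)
without_prior = explore(GRID, HIDDEN_REWARDS, use_prior=False)
print("found with prior:", with_prior)
print("found without prior:", without_prior)
```

In this sketch the explorer with the prior never steps on an empty-looking cell, so it finds none of the hidden rewards, while the exhaustive explorer finds all of them. A real RL agent explores stochastically rather than sweeping the grid, so this is only a caricature of the effect described in the video.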
