# This AI Does Nothing In Games…And Still Wins!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=u5wtoH0_KuA
- **Date:** 09.05.2020
- **Duration:** 6:57
- **Views:** 1,413,689

## Description

❤️ Check out Weights & Biases and sign up for a free demo here: https://www.wandb.com/papers 

Their instrumentation for this paper is available here:
https://app.wandb.ai/stacey/aprl/reports/Adversarial-Policies-in-Multi-Agent-Settings--VmlldzoxMDEyNzE

📝 The paper "Adversarial Policies" is available here:
https://adversarialpolicies.github.io

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Haro, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Daniel Hasegan, Eric Haddad, Eric Martel, Javier Bustamante, Lorin Atzberger, Lukas Biewald, Marcin Dukaczewski, Michael Albrecht, Nader S., Owen Campbell-Moore, Owen Skarpness, Rob Rowe, Robin Graham, Steef, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh
More info if you would like to appear here: https://www.patreon.com/TwoMinutePapers

Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=u5wtoH0_KuA) Segment 1 (00:00 - 05:00)

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today, it is almost taken for granted that neural network-based learning algorithms are capable of identifying objects in images, or even writing coherent sentences about them. But fewer people know that there is also parallel research on trying to break these systems. For instance, some of these image detectors can be fooled by adding a little noise to the image, and in some specialized cases, we can even perform something called the one-pixel attack. Let's have a look at some examples. Changing just this one pixel can make a classifier think that this ship is a car, or that this horse is a frog, and, amusingly, be quite confident about its guess. Note that the choice of this pixel and its color is by no means random: it requires solving a mathematical optimization problem to find out exactly how to perform the attack.

Trying to build better image detectors while other researchers are trying to break them is not the only arms race we are experiencing in machine learning research. For instance, a few years ago, DeepMind introduced an incredible learning algorithm that looked at the screen much like a human would, but was able to reach superhuman levels in playing a few Atari games. It was a spectacular milestone in AI research. They also just published a follow-up paper on this that we will cover very soon, so make sure to subscribe and hit the bell icon to not miss it when it appears in the near future. Interestingly, while these learning algorithms are being improved at a staggering pace, there is a parallel subfield where researchers endeavor to break these learning systems by slightly changing the information they are presented with. Let's have a look at OpenAI's example. Their first method adds a tiny bit of noise to a large portion of the video input, where the difference is barely perceptible, but it forces the learning algorithm to choose a different action than it would have chosen otherwise. In the other one, a different modification was used that has a smaller footprint but is more visible. For instance, in Pong, adding a tiny fake ball to the game can coerce the learner into going down when it was originally planning to go up. It is important to emphasize that the researchers did not do this by hand: the algorithm itself is able to pick up game-specific knowledge and find out how to fool the other AI using it. Both attacks perform remarkably well. However, it is not always true that we can just change these images or the playing environment to our desire to fool these algorithms. So, with this, an even more interesting question arises: is it possible to just enter the game as a player and perform interesting stunts that can reliably win against these AIs?

And with this, we have arrived at the subject of today's paper. This is the You Shall Not Pass game, where the red agent is trying to hold back the blue character and not let it cross the line. Here you see two regular AIs duking it out: sometimes the red wins, sometimes the blue is able to get through. Nothing too crazy here; this is the reference case, which is somewhat well-balanced. And now, hold on to your papers, because the adversarial agent that this new paper proposes does this. You may think this was some kind of glitch and I put the incorrect footage here by accident. No, this is not an error, you can believe your eyes: it basically collapses and does absolutely nothing. This can't be a useful strategy, can it? Well, look at that, it still wins the majority of the time. This is very confusing. How can that be? Let's have a closer look. This red agent is normally a somewhat competent player; as you can see here, it can punch the blue victim and make it fall. We now replace this red player with the adversarial agent, which collapsed, and it almost feels like it hypnotized the blue agent into also falling. And now, squeeze your papers, because the normal red opponent's win rate was 47%, and this collapsing chap wins 86% of the time. It not only wins, but it wins much more reliably than a competent AI. What is this wizardry? The answer is that the adversary induces off-distribution activations. To understand what that exactly means, let's have a look at this chart. This tells us how likely it is that the actions of the AI against different opponents are normal. As you see, when this agent named Zoo plays against itself, the bars are in the positive region, meaning that normal
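The one-pixel attack mentioned above is, at its core, a small optimization problem: search over a pixel position and color so that the classifier's confidence in a wrong label is maximized. Here is a minimal toy sketch of that idea, where a random linear softmax model stands in for a real trained network and plain random search stands in for the differential evolution used in the actual research; the image size, weights, and trial count are all illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained classifier: logits are a linear function
# of the flattened 8x8 RGB image. W and b are hypothetical weights.
W = rng.normal(size=(2, 8 * 8 * 3)) * 0.1
b = np.zeros(2)

def predict(img):
    """Return softmax class probabilities for an image in [0, 1]."""
    logits = W @ img.ravel() + b
    e = np.exp(logits - logits.max())
    return e / e.sum()

def one_pixel_attack(img, target, trials=2000):
    """Search over (x, y, color): change exactly one pixel so that the
    classifier's probability for `target` goes up as much as possible.
    A real attack would use differential evolution over the same space."""
    best, best_p = img, predict(img)[target]
    for _ in range(trials):
        x, y = rng.integers(0, 8, size=2)
        cand = img.copy()
        cand[x, y] = rng.random(3)          # new color for that one pixel
        p = predict(cand)[target]
        if p > best_p:
            best, best_p = cand, p
    return best, best_p

img = rng.random((8, 8, 3))
orig_class = int(np.argmax(predict(img)))
target = 1 - orig_class                      # try to flip the toy label
adv, p = one_pixel_attack(img, target)

# At most one pixel differs from the original image.
changed = np.argwhere((adv != img).any(axis=2))
print(len(changed), p)
```

The point of the sketch is the constraint, not the optimizer: the attacker may touch exactly one pixel, so all of the "power" comes from choosing which pixel and which color, just as described in the video.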

### [5:00](https://www.youtube.com/watch?v=u5wtoH0_KuA&t=300s) Segment 2 (05:00 - 06:00)

things are happening; things go as expected. However, that's not the case for the blue lines, which are the actions when we play against this adversarial agent, in which case the blue victim's actions are not normal in the slightest. So the adversarial agent is really doing nothing, but it is doing nothing in a way that reprograms its opponent to make mistakes and behave close to a completely randomly acting agent. This paper is absolute insanity. I love it. And if you look here, you see that the more the blue curve improves, the better this scheme works for a given game. For instance, it is doing really well on Kick and Defend, fairly well on Sumo Humans, but there is something about the Sumo Ants game that prevents this interesting kind of hypnosis from happening. I would love to see a follow-up paper that can pull this off a little more reliably. What a time to be alive!

What you see here is an instrumentation of this exact paper we have talked about, which was made by Weights & Biases. I think organizing these experiments really showcases the usability of their system. Weights & Biases provides tools to track your experiments in your deep learning projects. Their system is designed to save you a ton of time and money, and it is actively used in projects at prestigious labs such as OpenAI, Toyota Research, GitHub, and more. And the best part is that if you are an academic or have an open-source project, you can use their tools for free. It really is as good as it gets. Make sure to visit them through wandb.com/papers or just click the link in the video description, and you can get a free demo today. Our thanks to Weights & Biases for their long-standing support and for helping us make better videos for you. Thanks for watching and for your generous support, and I'll see you next time!
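The "off-distribution activations" measurement behind the chart can be pictured as fitting a density model to the victim's activations during normal play and then scoring the activations it produces against the adversary: normal play scores a high likelihood, hypnotized play a very low one. The sketch below is a toy version of that comparison, with synthetic Gaussian vectors standing in for real network activations; the dimensions and distribution parameters are made-up assumptions, not numbers from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical activation vectors: "normal" play vs. play against the
# adversarial agent. The paper fits a density model to real recorded
# activations; these synthetic Gaussians merely stand in for them.
normal_acts = rng.normal(0.0, 1.0, size=(5000, 16))
adv_acts    = rng.normal(2.5, 1.5, size=(200, 16))

# Fit a diagonal Gaussian to activations recorded during normal play.
mu  = normal_acts.mean(axis=0)
var = normal_acts.var(axis=0) + 1e-6

def mean_log_likelihood(acts):
    """Average per-step log-likelihood under the 'normal play' density.
    Strongly negative values indicate off-distribution activations."""
    ll = -0.5 * (np.log(2 * np.pi * var) + (acts - mu) ** 2 / var)
    return ll.sum(axis=1).mean()

ll_self = mean_log_likelihood(normal_acts[:200])   # playing against itself
ll_adv  = mean_log_likelihood(adv_acts)            # facing the adversary

print(ll_self > ll_adv)  # → True: self-play looks "normal", the other doesn't
```

This is exactly the shape of the bar chart described in the video: positive (high-likelihood) bars for self-play, and far lower scores for the victim's activations once the collapsed adversary enters the arena.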

---
*Source: https://ekstraktznaniy.ru/video/14134*