Adversarial Attacks on Neural Networks - Bug or Feature?

Two Minute Papers · 10.09.2019 · 4:57 · 96,278 views · 4,280 likes

Video description
❤️ Support us on Patreon: https://www.patreon.com/TwoMinutePapers
📝 The paper "Adversarial Examples Are Not Bugs, They Are Features" is available here: http://gradientscience.org/adv/
The Distill discussion article is available here: https://distill.pub/2019/advex-bugs-discussion/
If you wish to play with some of these Distill articles, look here:
- https://distill.pub/2017/feature-visualization/
- https://distill.pub/2018/building-blocks/
Andrej Karpathy's image classifier - you can run this in your web browser: https://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Bruno Brito, Bryan Learn, Christian Ahlin, Christoph Jadanowski, Claudio Fernandes, Daniel Hasegan, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, James Watt, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Levente Szabo, Lorin Atzberger, Lukas Biewald, Marcin Dukaczewski, Marten Rauschenberg, Matthias Jost, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil. https://www.patreon.com/TwoMinutePapers
Thumbnail background image credit: https://pixabay.com/images/id-2981865/
Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu
Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (1 segment)

Segment 1 (00:00 - 04:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. This will be a little non-traditional video where the first half of the episode will be about a paper, and the second part will be about…something else. Also a paper. Well, kind of. You'll see.

We have seen in previous years that neural network-based learning methods are amazing at image classification, which means that after training on a few thousand training examples, they can look at a new, previously unseen image and tell us whether it depicts a frog or a bus. Earlier we showed that we can fool neural networks by adding carefully crafted noise to an image, which we often refer to as an adversarial attack on a neural network. If done well, this noise is barely perceptible and, get this, can fool the classifier into looking at a bus and thinking that it is an ostrich.

These attacks typically require modifying a large portion of the input image, so when talking about a later paper, we were thinking: what could be the lowest number of pixel changes that we have to perform to fool a neural network? What is the magic number? Based on the results of previous research works, an educated guess would be somewhere around a hundred pixels. A follow-up paper gave us an unbelievable answer by demonstrating the one-pixel attack. You see here that by changing only one pixel in an image that depicts a horse, the AI will be 99.9% sure that we are seeing a frog. A ship can also be disguised as a car, or, amusingly, with a properly executed one-pixel attack, almost anything can be seen as an airplane by the neural network.

This new paper discusses whether we should look at these adversarial examples as bugs or not, and of course, does a lot more than that! It argues that most datasets contain features that are predictive, meaning that they help a classifier find cats, but also non-robust, which means that they provide a rather brittle understanding that falls apart in the presence of adversarial changes. We are also shown how to find and eliminate these non-robust features from already existing datasets, and that we can build much more robust classifier neural networks as a result. This is a truly excellent paper that sparked quite a bit of discussion.

And here comes the second part of the video with the something else. An interesting new article was published in the Distill journal, a journal where you can expect clearly worded papers with beautiful and interactive visualizations. But this is no ordinary article; it is a so-called discussion article where a number of researchers were asked to write comments on this paper and create interesting back-and-forth discussions with the original authors. Now, make no mistake, the paper we've talked about was peer-reviewed, which means that independent experts have spent time scrutinizing the validity of the results, so this new discussion article was meant to add to it by getting others to replicate the results and clear up potential misunderstandings. Through publishing six of these mini-discussions, each of which was addressed by the original authors, they were able to clarify the main takeaways of the paper, and even added a section of non-claims as well. For instance, it has been clarified that they do not claim that adversarial examples arise from software bugs. A huge thanks to the Distill journal and all the authors who participated in this discussion, and to Ferenc Huszár, who suggested the idea of the discussion article to the journal.
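To make the idea of "carefully crafted noise" a bit more concrete, here is a minimal sketch of one common attack of this kind, the fast gradient sign method (FGSM), written in PyTorch. This is an illustration rather than the specific method from either paper discussed above, and the `model`, `image`, `label`, and `epsilon` names are placeholders: any pretrained classifier, a single normalized input image, its true class label, and the perturbation strength.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.01):
    # image: (1, C, H, W) tensor with values in [0, 1]; label: (1,) tensor
    # holding the true class index. model is any differentiable classifier.
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss, then clamp the result
    # back into the valid pixel range so it remains a displayable image.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```

The smaller `epsilon` is, the less perceptible the added noise; on classifiers that rely on non-robust features, even tiny values are often enough to flip the predicted class.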
I'd love to see more of this, and if you do too, make sure to leave a comment so we can show them that these endeavors to raise the replicability and clarity of research works are indeed welcome. Make sure to click the links to both works in the video description and spend a little quality time with them. You'll be glad you did.

I think this was a more complex than average paper to talk about; however, as you have noticed, the usual visual fireworks were not there. As a result, I expect this to get significantly fewer views. That's not a great business model, but no matter: I made this channel so I can share with you all these important lessons that I learned during my journey. This has been a true privilege, and I am thrilled that I am still able to talk about all these amazing papers without worrying too much about whether any of these videos will go viral or not.

Videos like this one are only possible because of your support on Patreon.com/TwoMinutePapers. If you feel like chipping in, just click the Patreon link in the video description. This is why every video ends with, you know what's coming… Thanks for watching and for your generous support, and I'll see you next time!
