How Do Neural Networks Memorize Text?
2:39


Two Minute Papers · 20.04.2019 · 27,411 views · 1,217 likes


Video description
📝 The paper "Visualizing memorization in RNNs" is available here: https://distill.pub/2019/memorization-in-rnns/

❤️ Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: 313V, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Bruno Brito, Bryan Learn, Christian Ahlin, Christoph Jadanowski, Claudio Fernandes, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, James Watt, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Levente Szabo, Lorin Atzberger, Marcin Dukaczewski, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Morten Punnerud Engelstad, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Richard Reis, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Thomas Krcmar, Torsten Reil, Zach Boldyga, Zach Doty. https://www.patreon.com/TwoMinutePapers

Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook: https://www.facebook.com/TwoMinutePapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (1 segment)

Segment 1 (00:00 - 02:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. This is an article from the Distill journal, so expect a lot of intuitive and beautiful visualizations. It is about recurrent neural networks. These are neural network variants that specialize in dealing with sequences of data. For instance, processing and completing text is a great use case for these recurrent networks. So, why is that? Well, if we wish to finish a sentence, we are interested not only in the latest letter in the sentence, but in several letters before it, and of course, the order of those letters is also of utmost importance.

Here you can see, with the green rectangles, which previous letters these recurrent neural networks memorize when reading and completing our sentences. LSTM stands for Long Short-Term Memory, and GRU means Gated Recurrent Unit, both of which are recurrent neural networks. You can see here that the nested LSTM doesn't really look back further than the current word we are processing, while the classic LSTM almost always memorizes a lengthy history of previous words.

And now, look, interestingly, with the GRU, when looking at the start of the word "grammar" here, we barely know anything about this new word, so it memorizes the entire previous word, as that may be the most useful information we have at the time. And now, as we proceed a few more letters into this word, it mostly shifts its attention to a shorter segment, that is, the letters of the new word we are currently writing.

Luckily, the paper is even more interactive, meaning that you can also add a piece of text here and see how the GRU network processes it. One of the main arguments of this paper is that when comparing these networks against each other in terms of quality, we shouldn't only look at the output text they generate. For instance, it is possible for two models that work quite differently to have very similar accuracy scores on these tests.

The author argues that we should look beyond these metrics and look at this kind of connectivity information as well. This way, we may find useful pieces of knowledge, like the fact that the GRU is better at utilizing longer-term contextual understanding. A really cool finding indeed, and I am sure this will also be a useful visualization tool when developing new algorithms and finding faults in previous ones. Love it. Thanks for watching and for your generous support, and I'll see you next time!
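The gating behavior the narration describes can be made concrete with a sketch. Below is a minimal, single-unit GRU cell in plain Python, using the standard GRU update equations (update gate, reset gate, candidate state). The scalar weights are hand-picked for illustration only; they are not trained and are not the networks from the Distill article:

```python
import math

def sigmoid(x):
    """Squash a value into (0, 1) so it can act as a gate."""
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, w):
    """One GRU update for a scalar input x and scalar hidden state h.

    w holds illustrative scalar weights (a real GRU uses matrices).
    """
    z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])        # update gate: how much to rewrite memory
    r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])        # reset gate: how much old memory to consult
    h_cand = math.tanh(w["wh"] * x + w["uh"] * (r * h) + w["bh"])  # candidate new state
    return (1.0 - z) * h + z * h_cand                       # blend old memory with the candidate

# Hypothetical, untrained weights purely for demonstration.
weights = {"wz": 0.5, "uz": 0.3, "bz": 0.0,
           "wr": 0.8, "ur": 0.2, "br": 0.0,
           "wh": 1.0, "uh": 0.7, "bh": 0.0}

h = 0.0  # empty memory before reading anything
for x in [0.1, 0.9, -0.4]:  # stand-ins for encoded characters
    h = gru_step(x, h, weights)
# h now summarizes the whole sequence; the gates decided, letter by letter,
# how much of the earlier context to keep — the behavior the green rectangles visualize.
```

When the update gate z stays near zero, the old state passes through almost unchanged (long memory, as with the word preceding "grammar"); when z moves toward one, the cell largely overwrites its memory with the current input (short memory, once enough of the new word is known).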
