# OpenAI Outperforms Some Humans In Article Summarization! 📜

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=RUDWn_obddI
- **Date:** 30.03.2021
- **Duration:** 8:29
- **Views:** 93,184

## Description

❤️ Check out Weights & Biases and sign up for a free demo here: https://www.wandb.com/papers 
❤️ Their mentioned post is available here: https://wandb.ai/openai/published-work/Learning-Dexterity-End-to-End--VmlldzoxMTUyMDQ

📝 The paper "Learning to Summarize with Human Feedback" is available here:
https://openai.com/blog/learning-to-summarize-with-human-feedback/

Reddit links to the showcased posts:
1. https://www.reddit.com/r/AskAcademia/comments/lf7uk4/submitting_a_paper_independent_of_my_post_doc/
2. https://www.reddit.com/r/AskAcademia/comments/l988py/british_or_american_phd/

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Haro, Alex Serban, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Haris Husic, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Joshua Goller, Kenneth Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2

Thumbnail background image credit: https://pixabay.com/images/id-1989152/

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=RUDWn_obddI) Segment 1 (00:00 - 05:00)

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. This paper will not have the visual fireworks that you see in many of our videos. Oftentimes, you get ice cream for the eyes, but today, you'll get ice cream for the mind. And when I read this new paper, I almost fell off the chair, and I think this work teaches us important lessons. I hope you will appreciate them too.

So, with that, let's talk about AIs dealing with text! This research field is improving at an incredible pace. For instance, four years ago, in 2017, scientists at OpenAI embarked on an AI project where they showed a neural network a bunch of Amazon product reviews and wanted to teach it to generate new ones, or continue a review when given one. Upon closer inspection, they noticed that the neural network had built up a knowledge of not only language, but had also learned that it needed to create a state-of-the-art sentiment detector along the way. This means the AI recognized that in order to continue a review, it needs to understand English and efficiently detect whether the review seems positive or negative.

This new work is about text summarization, and it really is something else. If you read reddit, the popular online discussion website, and encounter a longer post, you may also find a short summary, a TLDR of the same post, written by a fellow human. This is good not only for the other readers who are in a hurry; less obviously, it is also good for something else.

And now, hold on to your papers, because these summaries also provide fertile ground for a learning algorithm: it can read a piece of long text and its short summary, and learn how the two relate to each other. This means the pairs can be used as training data and fed to a learning algorithm. Yum!
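The pairing described above can be sketched in a few lines. This is a hypothetical illustration, not the paper's actual preprocessing: the `TL;DR:` marker, the `split_tldr` helper, and the sample posts are all invented for the example.

```python
# Hypothetical sketch: turn reddit-style posts that end with a "TL;DR:"
# line into (body, summary) training pairs for a summarization model.
def split_tldr(post):
    """Split a post into (body, summary) at the last 'TL;DR:' marker;
    return None when the post carries no usable summary."""
    marker = "TL;DR:"
    idx = post.rfind(marker)
    if idx == -1:
        return None  # no summary to learn from; skip this post
    body = post[:idx].strip()
    summary = post[idx + len(marker):].strip()
    if not body or not summary:
        return None
    return body, summary

posts = [
    "I spent a year applying to PhD programs... TL;DR: apply early and broadly.",
    "Just a rant, no summary here.",
]
pairs = [p for p in (split_tldr(post) for post in posts) if p is not None]
```

Collecting enough of these pairs is what lets a model learn the mapping from long text to short summary.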
And the point is that if we give enough of these pairs to such a learning algorithm, it will learn to summarize other reddit posts.

So, let's see how well it performs. First, this method learned from about a hundred thousand well-curated reddit posts, and was then tested on other posts that it hadn't seen before. It was asked to summarize this post from the relationship advice subreddit, so let's see how well it did.

If you feel like reading the text, you can pause the video here, or if you feel like embracing the TLDR spirit, just carry on and look at these two summarizations. One of them was written by a human, and the other by this new summarization technique. Do you know which is which? Please stop the video and let me know in the comments below. Thank you! So, this one was written by a human, and this one by the new AI. And while, of course, this is subjective, I would say that the AI-written one feels at the very least as good as the human summary, and I can't wait to have a look at the more principled evaluation in the paper.

Let's see... the higher we go here, the higher the probability of a human favoring the AI-written summary over a human-written one. We have smaller AI models on the left and bigger ones on the right. This is the 50% reference line: below it, people tend to favor the human's version, and if a model gets above the 50% line, the AI does a better job than the human-written TLDRs in the dataset. Here are two proposed models: this one significantly underperforms, and this other one is a better match. However, whoa! Look at that! The authors also propose a human feedback model that, even at the smallest size, handily outperforms human-written TLDRs, and as we grow the AI model, it gets even better than that. Now that's incredible, and this is when I almost fell off the chair while reading this paper.

But! We're not done yet, not even close.
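The graph plots how often human judges prefer the AI summary. Reward models in this line of work are typically fit to such pairwise judgments with a Bradley-Terry-style model: the probability of preferring summary A over summary B is the sigmoid of the difference of their scalar reward scores. A toy sketch, with made-up reward values (the function name and the numbers are illustrative, not from the paper):

```python
import math

def preference_probability(reward_a, reward_b):
    """Bradley-Terry model: predicted chance a judge prefers summary A
    over summary B, given a scalar reward score for each. The scores
    here are invented; a real reward model would produce them."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))

# If the AI summary scores higher than the human one, the model predicts
# it wins the comparison more than half the time -- the 50% line in the plot.
p = preference_probability(reward_a=1.2, reward_b=0.7)
```

Equal rewards give exactly 0.5, which is why the 50% line marks parity between the AI and the human-written TLDRs.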
Don't forget, this AI was trained on reddit and was also tested on reddit. So our next question is, of course: can it do anything else? How general is the knowledge that it gained? What if we give it a full news article from somewhere else, outside of reddit? Let's see how it performs. Hmm... of course, this is also subjective, but I would say both are quite good. The human-written summary provides a little more information, while the AI-written one captures the essence of the article and does it very concisely. Great job.

### [5:00](https://www.youtube.com/watch?v=RUDWn_obddI&t=300s) Segment 2 (05:00 - 08:00)

So, let's see the same graph for summarizing these articles outside reddit. I don't expect the AI to perform as well as with the reddit posts, as it is outside its comfort zone, but... my goodness, this still performs nearly as well as humans. That means it indeed derived general knowledge from a really narrow training set, which is absolutely amazing. Now, ironically, you see this Lead-3 technique dominating both the humans and the AI. What could that be? Some unpublished, superintelligent technique? Well, I will have to disappoint you: this is not a super sophisticated technique, but a dead simple one. So simple that it just takes the first three sentences of the article, which humans seem to prefer a great deal. But note that this simple Lead-3 technique only works in a narrow domain, while the AI has learned the English language, probably knows about sentiment, and a lot of other things that can be used elsewhere.

And now, the two most impressive things from the paper, in my opinion:

One, this is not just a neural network, but a reinforcement learning algorithm that learns from human feedback. A similar technique has been used by DeepMind and other research labs to play video games or control drones, and it is really cool to see it excel in text summarization too.

Two, it learned from humans, but derived so much knowledge from these scores that, over time, it outperformed its own teacher. And the teacher here is not humans in general, but people who write TLDRs alongside their posts on reddit. That truly feels like something straight out of a science fiction movie. What a time to be alive!

Now, of course, not even this technique is perfect. This human-versus-AI preference score is just one way of measuring the quality of a summary; there are more sophisticated methods that involve coverage, coherence, accuracy, and more. In some of these measurements, the AI does not perform as well.
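The Lead-3 baseline mentioned above really is as simple as it sounds: the "summary" is just the article's first three sentences. A minimal sketch, with the caveat that splitting sentences on end punctuation is a crude approximation of real sentence segmentation:

```python
import re

def lead3(article):
    """Lead-3 baseline: return the first three sentences of the article.
    Sentence boundaries are approximated by splitting on whitespace that
    follows '.', '!' or '?' -- good enough for a demonstration."""
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    return " ".join(sentences[:3])

article = ("The rover landed safely. Engineers cheered. "
           "Data began streaming within minutes. More instruments "
           "will come online next week.")
summary = lead3(article)
```

News articles front-load their key facts, which is why this trivial baseline is so hard to beat in that domain, and why it says nothing about other domains.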
But just imagine what this will be able to do two more papers down the line.

Thanks for watching and for your generous support, and I'll see you next time!

---
*Source: https://ekstraktznaniy.ru/video/13948*