❤️ Check out Linode here and get $20 free credit on your account: https://www.linode.com/papers
📝 The paper "International evaluation of an AI system for breast cancer screening" is available here:
https://deepmind.com/research/publications/International-evaluation-of-an-artificial-intelligence-system-to-identify-breast-cancer-in-screening-mammography
❤️ Watch these videos in early access on our Patreon page or join us here on YouTube:
- https://www.patreon.com/TwoMinutePapers
- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Haro, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Daniel Hasegan, Eric Haddad, Eric Martel, Javier Bustamante, Lorin Atzberger, Lukas Biewald, Marcin Dukaczewski, Michael Albrecht, Nader S., Owen Campbell-Moore, Owen Skarpness, Rob Rowe, Robin Graham, Steef, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh
More info if you would like to appear here: https://www.patreon.com/TwoMinutePapers
Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2
Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/
Table of contents (2 segments)
Segment 1 (00:00 - 05:00)
Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. These days, we see so many amazing uses for learning-based algorithms, from enhancing computer animations and teaching virtual animals to walk, to teaching self-driving cars depth perception, and more. It truly feels like no field of science is left untouched by these new techniques, including the medical sciences. You see, in medical imaging, a common problem is that there are so many diagnostic images out there in the wild that it is becoming infeasible for doctors to look at all of them. What you see here is a work from scientists at DeepMind Health that we covered a few hundred episodes ago. The training part uses about 14 thousand optical coherence tomography scans; this is the OCT label you see on the left, and these images are cross sections of the human retina. We first start out with this OCT scan, and then a manual segmentation step follows, where a doctor marks up the image to show where the most relevant parts, like retinal fluid or elevations of the retinal pigment layer, are located. After the learning process, this method can reproduce these segmentations really well by itself, without the doctor’s supervision, and you can see here that the two images are almost identical in these tests. Now that we have the segmentation map, it is time to perform classification. This means that we look at this map and assign a probability to each possible condition that may be present. Finally, based on these probabilities, a verdict is made on whether the patient needs to be seen urgently, needs just a routine check, or perhaps needs no check at all. This was an absolutely incredible piece of work. However, it is of utmost importance to evaluate these tools together with experienced doctors, and hopefully on international datasets. Since then, in this new work, DeepMind has knocked the evaluation out of the park for a system they developed to detect breast cancer as early as possible.
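The segmentation-then-classification pipeline described above can be sketched roughly like this. Everything here is an illustrative stand-in, not DeepMind's actual code: the function names, the probabilities, and the triage categories are all made up, and the two neural-network stages are replaced with stubs so only the control flow remains.

```python
# Rough, illustrative sketch of the two-stage OCT pipeline described above.
# The real system uses deep neural networks for both stages; here both are
# stubbed out so the overall control flow is visible.

def classify(segmentation_map):
    """Stage 2: map a segmentation map to per-referral probabilities (stub)."""
    # A real classifier would inspect the segmented tissue types here.
    return {"urgent": 0.70, "routine": 0.25, "no_check": 0.05}

def triage(probabilities):
    """Final verdict: pick the referral decision with the highest probability."""
    return max(probabilities, key=probabilities.get)

# Toy segmentation map: fraction of the scan covered by each marked-up feature.
segmentation_map = {"retinal_fluid": 0.4, "pigment_elevation": 0.1}
decision = triage(classify(segmentation_map))
```

With the stubbed probabilities above, `decision` comes out as `"urgent"`, i.e. the patient would be referred for an urgent examination.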
Let’s briefly talk about the technique, and then I’ll try to explain why it is sinfully difficult to evaluate it properly. So, onto the new problem. These mammograms contain four images that show the breasts from two different angles, and the goal is to predict whether a biopsy taken later will be positive for cancer or not. This is especially important because early detection is key to treating these patients. And the key question is, how does it compare to the experts? Have a look here. This is a case of cancer that was missed by all six experts in the study, but was correctly identified by the AI. And what about this one? This case didn’t go so well: it was caught by all six experts, but was missed by the AI. So, one reassuring sample, and one failed sample. And with this, we have arrived at the central thesis of the paper, which asks the question: what does it really take to say that an AI system has surpassed human experts? To even have a fighting chance of tackling this, we have to measure false positives and false negatives. A false positive means that the AI mistakenly predicts that a sample is positive when, in reality, it is negative. A false negative means that the AI thinks a sample is negative when it is positive in reality. The key is that in every decision domain, the permissible rates for false negatives and false positives are different. Let me try to explain this through an example. In cancer detection, classifying a sick patient as healthy is a grave mistake that can lead to serious consequences. But if a healthy patient is misclassified as sick, the positive cases get a second look from a doctor, who can easily identify the mistake. The consequences in this case are much less problematic, and can be remedied by spending a little time checking the samples that the AI was less confident about.
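To make the false positive / false negative distinction concrete, here is a minimal sketch of how the two error rates are computed from binary ground-truth labels and predictions. The labels below are toy values for illustration, not data from the study.

```python
# Toy illustration (not DeepMind's code): computing false positive and
# false negative rates from binary ground truth and predictions.

def error_rates(y_true, y_pred):
    """Return (false_positive_rate, false_negative_rate) for binary labels."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    fpr = fp / negatives if negatives else 0.0  # healthy flagged as sick
    fnr = fn / positives if positives else 0.0  # sick missed as healthy
    return fpr, fnr

# Example: 1 = biopsy-positive for cancer, 0 = negative.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
fpr, fnr = error_rates(y_true, y_pred)
```

Here one of the five negatives is flagged (false positive rate 0.2) and one of the three positives is missed (false negative rate one third), and as the text explains, these two numbers carry very different consequences in cancer screening.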
The bottom line is that there are many different ways to interpret the data, and it is by no means trivial to find out which one is the right way to do so. And now, hold on to your papers, because here comes the best part. If we compare the predictions of the AI to those of the human experts, we see that the false positive cases in the US have been reduced by 5.7 percent, while the false negative cases have been reduced by 9.7 percent. That is the holy grail! We don’t need to weigh the cost of false positives against false negatives here, because the system reduced both at the same time. Spectacular! Another important detail is that these numbers came out of an independent
Segment 2 (05:00 - 08:00)
evaluation. This means that the results did not come only from the scientists who wrote the algorithm; they were thoroughly checked by independent experts who have no vested interest in this project. This is one reason why you see so many authors on this paper. Excellent. Another interesting tidbit is that the AI was trained on subjects from the UK, and the question was how well this knowledge generalizes to subjects from other places, for instance, the United States. Is this UK knowledge reusable in the US? I was quite surprised by the answer, because the system never saw a sample from anyone in the US, and it still did better than the experts on US data. This is a very reassuring property, and I hope to see more studies that show how general the knowledge is that these systems are able to obtain through training. And now, perhaps the most important point. If you remember one thing from this video, let it be the following. This work, much like most other AI-infused medical solutions, is not made to replace human doctors. The goal is, instead, to empower them and take as much weight off their shoulders as possible. We have hard numbers for this, as the results concluded that this work reduces the workload of the doctors by 88%, which is an incredible result. Among other far-reaching consequences, I would like to mention that this would substantially help not only the work of doctors in wealthier, more developed countries, but it may single-handedly enable proper cancer detection in developing countries that cannot afford to have these scans checked. And note that in this video we have truly just scratched the surface; whatever we can discuss here in a few minutes cannot be as rigorous and accurate a description as the paper itself, so make sure to check it out in the video description. And with that, I hope you now have a good feel for the pace of progress in machine learning research.
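The earlier point about the AI reducing false positives and false negatives at the same time can be made precise with a tiny sketch: if both error rates drop, the expected cost drops under any non-negative cost weighting, which is why no cost trade-off analysis is needed. The rates and cost weights below are made up for illustration and are not the paper's numbers.

```python
# Illustrative sketch (made-up numbers, not from the paper): a model that
# lowers BOTH error rates is cheaper under ANY non-negative cost weighting.

def expected_cost(fpr, fnr, cost_fp, cost_fn):
    """Expected misclassification cost for given error rates and unit costs."""
    return cost_fp * fpr + cost_fn * fnr

human = (0.10, 0.20)   # hypothetical (fpr, fnr) for the human readers
model = (0.08, 0.15)   # hypothetical model rates, lower on both axes

# Whatever relative costs we assign, the model comes out cheaper,
# because the cost is a non-negative weighted sum of the two rates.
for cost_fp, cost_fn in [(1, 1), (1, 10), (10, 1)]:
    assert expected_cost(*model, cost_fp, cost_fn) < expected_cost(*human, cost_fp, cost_fn)
```

This is exactly why the simultaneous reduction in the paper is such a strong result: it holds regardless of how severe we consider a missed cancer to be relative to a false alarm.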
The retina fluid project was the state of the art in 2018, and now, less than two years later, we have a proper, independently evaluated AI-based detection system for breast cancer. Bravo, DeepMind. What a time to be alive! This episode has been supported by Linode. Linode is the world’s largest independent cloud computing provider. Unlike entry-level hosting services, Linode gives you full backend access to your server, which is your step up to powerful, fast, fully configurable cloud computing. Linode also has One-Click Apps that streamline your ability to deploy websites, personal VPNs, game servers, and more. If you need something as small as a personal online portfolio, Linode has your back, and if you need to manage tons of clients’ websites and reliably serve them to millions of visitors, Linode can do that too. What’s more, they offer affordable GPU instances featuring the Quadro RTX 6000, which is tailor-made for AI, scientific computing, and computer graphics projects. If only I had had access to a tool like this while I was working on my last few papers! To receive $20 in credit on your new Linode account, visit linode.com/papers or click the link in the description and give it a try today! Our thanks to Linode for supporting the series and helping us make better videos for you. Thanks for watching and for your generous support, and I'll see you next time!