# Gemini: ChatGPT-Like AI From Google DeepMind!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=ex1GeX0IhJQ
- **Date:** 07.12.2023
- **Duration:** 10:52
- **Views:** 147,201
- **Source:** https://ekstraktznaniy.ru/video/12874

## Description

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers

Try it out now! https://bard.google.com/

📝 The paper "Gemini: A Family of Highly Capable Multimodal Models" is available here:
https://storage.googleapis.com/deepmind-media/gemini/gemini_1_report.pdf

Update - A little more context on some of the results on the hand motion videos: https://developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html

📝 My latest paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD 

Or this is the original Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5

John Petrucci - Gemini:
- Live: https://www.youtube.com/watch?v=6Dq34haNhTs&t=112s
- Album version https://www.youtube.com/watch?v=sdV9s5-9V40

Listen to this too! https://www.youtube.com/watch?v=6EgdYWiwt14

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Al

## Transcript

### Intro [0:00]

My goodness, today might be a historic day, because Google DeepMind’s ChatGPT competitor, Gemini, is here. We were wondering what kind of secret AI Google has up its sleeve, and today, you will know all about it. I am so excited, and I was also surprised by this appearing as I was catching up on some music albums to listen to, and found a new favorite song with the same name, what are the chances! So, what can the Gemini AI do? Boy, did they come out with guns blazing. Gemini has a lot to offer. According to the marketing materials, it is absolutely insane, and fortunately, look!

### Multimodal AI [0:49]

Yes! Haha! We also have a paper to have a look at together. This is a multimodal AI: it can have a look at your math homework, with handwritten notes, no need to type in anything, just a photo, and look! It already found a silly little mistake in this physics derivation. It explains beautifully what went wrong and how to do it right. It can even give you new practice problems and analyze your answers to those. A personal teacher. That is amazing. But note that ChatGPT can do this too. We’ll see in a moment if it can do things that ChatGPT cannot. And we are not done yet. Not even close. In fact, we are just warming up. Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér.

### Meta-analysis [1:44]

Now, have a look at this paper. This is a meta-analysis on different genetic variants of particular diseases. This means a research paper written by hand by looking up tens of thousands of papers, finding the handful of papers out of those that have relevant information on a given topic, aggregating and comparing their findings, and drawing a conclusion. That is a herculean task.

### Papers [2:12]

And now, hold on to your papers Fellow Scholars, because Gemini can do this automatically. It can evaluate papers rapidly, and see how relevant they are. Now, little AI, read the paper and extract the relevant data from it. Look! No hallucinations allowed, because we are seeing where it got all this data from. According to DeepMind, it read through hundreds of thousands of papers in just one lunch break.

### Multimodal [2:39]

I have to say they have incredibly productive lunch breaks over there at DeepMind. Bravo! Now, it can also update the studies, and the fact that this is a multimodal AI means that it can deal with code, audio, images, video, you name it. Not just text. Fantastic. But wait a second, if it can do all of that, is it possible that it can even write a paper? Interesting question. Well, little AI, update the graphs with the new information.

### Update graphs [3:20]

Wow! So far, we needed this kind of herculean effort to summarize this genetic research data up to 2019, and now, we have all this data up to 2023. And all this in just one lunch break. Automatically. And AlphaCode 2 is also here.

### AlphaCode 2 [3:40]

The first version was roughly as good as half of the human competitors in coding, and now, it is better than 85% of them. All this in just one version change. So, what can it do? Well, it can break down a written task, and come up with efficient solutions. Here, it came up with a solution using dynamic programming, a college-level computer science technique that breaks down a big, difficult problem into smaller, easier problems. Such a short, elegant solution. And make no mistake, this is not just coding. This is reasoning, understanding, and synthesis. Loving it. And now, you can have this incredible digital mind as a companion, or even as a teacher in your journeys. And by double-checking and repairing its own answers, it can now write you a web app in less than a minute. It is not going to be perfect, but it is an incredible first crack at a problem. Once again, ChatGPT can likely do this too for you at least as well. Now, I promised that we will see if it can do things that ChatGPT cannot. So, can it? Well, it is advertised as a universal AI, let’s talk about that and compare it to ChatGPT
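For readers who have not met dynamic programming before, here is a minimal sketch of the idea in Python. It is not the solution shown in the video (the transcript does not say which problem was solved); the coin-change problem and the name `min_coins` are chosen here purely for illustration. The answer for a large amount is built from the already-solved answers for smaller amounts.

```python
def min_coins(coins, amount):
    """Return the fewest coins needed to make `amount`, or -1 if impossible."""
    INF = float("inf")
    # best[a] is the answer to the smaller subproblem "make amount a".
    # Making 0 needs 0 coins; everything else is unknown (INF) at first.
    best = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for c in coins:
            # Reuse the solved subproblem for (a - c) and add one coin c.
            if c <= a and best[a - c] + 1 < best[a]:
                best[a] = best[a - c] + 1
    return best[amount] if best[amount] != INF else -1


if __name__ == "__main__":
    # A greedy choice (4 + 1 + 1) would use three coins; DP finds 3 + 3.
    print(min_coins([1, 3, 4], 6))
```

Each subproblem is solved once and stored in the table, which is what keeps the solution short and efficient.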

### Universal AI [5:05]

because there are absolutely insane insights there. They showcase incredible performance on the MMLU dataset. When I heard this, my eyes basically popped out of my head. So what does this mean? This is the Massive Multitask Language Understanding dataset. It contains questions about anatomy, astronomy, chemistry, formal logic, marketing, management, sociology, virology, you name it. So, how good is it? Well, this is a good human performance, and this is Gemini. Yup. That sort of means superhuman performance. Now this graph is from the marketing materials, but we are Fellow Scholars here, we like research papers better, so there you go. Ah, much better.

### Gemini vs ChatGPT [6:04]

Now let’s look at the comparison. Yes, it claims to beat ChatGPT, but this comparison has its limits. ChatGPT was benchmarked with 5-shot, which means that the AI was given 5 somewhat similar examples to learn from, whereas Gemini was benchmarked with CoT, which stands for Chain of Thought. This means guiding the AI model through a step-by-step reasoning process. These are important details: this is not entirely the same process as 5-shot. But it gets better! Beyond that, in 30 of 32 benchmarks it came in first. And many of these comparisons are apples-to-apples comparisons, you can see the equivalent number of shots in the comparisons on many datasets. With this, I think it is fair to conclude that on these datasets, yes, it can do things that ChatGPT cannot.
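To make the difference between the two prompting styles concrete, here is a rough sketch in Python. The function names, the `Q:`/`A:` layout, and the "Let's think step by step" cue are assumptions made for illustration; they are not the prompts used in the benchmarks discussed above, and no model is called.

```python
def five_shot_prompt(examples, question):
    """Few-shot: show up to 5 solved question/answer pairs, then the new question."""
    shots = [f"Q: {q}\nA: {a}" for q, a in examples[:5]]
    return "\n\n".join(shots + [f"Q: {question}\nA:"])


def chain_of_thought_prompt(question):
    """Chain of thought: ask the model to reason step by step before answering."""
    return f"Q: {question}\nA: Let's think step by step."


if __name__ == "__main__":
    # Hypothetical worked examples, only to show the shape of the prompt.
    solved = [(f"What is {n} + {n}?", str(2 * n)) for n in range(1, 6)]
    print(five_shot_prompt(solved, "What is 7 + 7?"))
    print()
    print(chain_of_thought_prompt("What is 7 + 7?"))
```

The first prompt teaches by example, the second asks for intermediate reasoning, which is why scores obtained with one are not directly comparable to scores obtained with the other.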

### Universality [7:02]

I told you, Google is coming with guns blazing. And wait, if we can have this kind of universality in an AI… do you remember the example with the genome data? Yes! This means that it can do this in any area of science. Any topic. Same with the math lessons. Any lesson works.

### Sizes [7:25]

My goodness! And Gemini comes in three sizes: Ultra, for the hardest of the hard tasks; Pro, the bread and butter for most tasks; and Nano, a tiny version that you can run in your pocket. Yes, right on your phone. The Pixel 8 Pro is the first smartphone to be able to run this. Now, good news, I’ll tell you in a moment how you can try it, but first, two insane things from the paper: One, the training took place across multiple data centers. Depending on the methodology, that would possibly require a computer network that is so fast that it basically makes my head fall off. That is incredible. Two, look at this.

### Specialist AI [8:13]

This is something that I could not believe for years, but it is here, and it is true. Specialist AIs exist, these are techniques that are incredibly good at one thing, but cannot really do anything else. Like AlphaGo at Go, or AlphaFold at protein folding. And then, there are generalist AIs like GPT-4 and Gemini. These can do a lot of things, but not as well as the specialist AI can do the one thing. Until now. And this is a historic moment - this is a universal AI that can beat these specialist AIs at their own game. Perhaps because knowing a little less about your area, but more about the world in general makes you a better specialist. That is incredible. A huge philosophical thought that keeps me up at night. So much to think about! So, when can we try it?

### When to try it [9:16]

Well, according to Google, right about now! According to them, the Pro version is available right now in Bard. However, not for me as of the making of this video, so I am assuming there is a staggered rollout of this system. Likely, it will gradually be deployed to all of us very soon.

### Conclusion [9:37]

I will note that we do not have any business ties with Google or DeepMind or OpenAI. None of our videos were sponsored by them, including this one of course. According to what we hear from internal documents, getting this AI and Bard right might be an existential issue for Google, so they are putting all their might into it. And I have to say, it shows. Wow. Very excited to give this a try soon, and if you see this video, make sure to try Google Bard, it might already be infused with Gemini by the time you are watching this. The link is in the video description. What a time to be alive! And if you enjoyed this and if you wish to see more, become a Fellow Scholar by subscribing and hitting the bell icon. Checking out our sponsor is also a great way to support us. Thank you so much!
