DeepMind’s New AIs: The Future is Here!

Two Minute Papers · 13 March 2025 · 6:07 · 154,092 views · 5,424 likes

Video description

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers
Guide for using DeepSeek on Lambda: https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video

📝 The Gemma 3 paper and the rest are available here:
https://blog.google/technology/developers/gemma-3/
https://developers.googleblog.com/en/experiment-with-gemini-20-flash-native-image-generation/
https://deepmind.google/technologies/gemini-robotics/
https://aistudio.google.com/

Sources:
https://x.com/thepushkarp/status/1899874826669744425/photo/1
https://x.com/Angaisb_/status/1899852603107721388
https://x.com/alexanderchen/status/1900013570575794414

📝 My paper on simulations that look almost like reality is available for free here: https://rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations: https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

My research: https://cg.tuwien.ac.at/~zsolnai/
X/Twitter: https://twitter.com/twominutepapers
Thumbnail design: Felícia Zsolnai-Fehér - http://felicia.hu

Contents (2 segments)

Segment 1 (00:00 - 05:00)

I am really surprised today, because Google DeepMind just released their new Gemma 3 AI. Gemma 2 before it was okay, but nothing spectacular. Now check this out. Wow. It seems nearly as good as the full-size DeepSeek, but look. Running the full DeepSeek typically requires a bunch of graphics cards, but this one? Only one. And it is an open model. This is unbelievable. We are also going to look at 2 other stunning things they released lately, including generated images that are just fantastic, and a new robot that can pack lunch for you.
Now, Gemma can also look at images. For instance, if you have a huge check, but you would only like to pay for your part, with a nice 18% California tip, it can calculate it for you. It is an absolute killer at creative writing, getting second place globally. Now, just to demonstrate how much of a stunner this is: this was Gemma 2, small, not super smart. Then came the big guns like Llama and DeepSeek, super smart but really big. And now, Gemma 3 gives you nearly that kind of quality, but it is 20 times smaller. And it can still keep up with them. This is an amazing gift for all of us. This gift comes in 4 sizes; most speak 140 languages and can have a look at images. So if you have a Japanese remote controller, and you wish to put it in heating mode, it will be able to tell you how. So cool. For the smallest model, no images and only English, but in return, it almost runs on a toaster if you got a good one. And as a +1, ShieldGemma can evaluate the safety of text and images if you define a set of policies. You can try them in Google AI Studio, the link is in the description, or if you wish to run it yourself like I do, it is super easy to do in Lambda GPU Cloud.
But it gets better. They also showcased conversational image generation. Regular image generation is not that interesting, anybody can do that. However, here you chuck in an input image, and ask it to add flowers to that table.
Now, with previous techniques, the problem was that we got the flowers, but the scene changed a lot. So what I wanna see here is… oh my, that is exactly what I wanted to see. Same scene. And you can keep iterating on it, if you wish to see tulips instead, not a problem. Now there are some game changers that you can build on top of this. For instance, recipes. You can ask for a recipe for chocolate chip cookies, but after each step, you get an image of what you should be seeing. And all this is generated on the fly. Incredible. You Fellow Scholars are also using this to great effect. Adding some chocolate drizzle here, yummy. Not a problem. And it really is the same image. This is not amazing, this is beyond amazing. I’ve never seen anything like it. Now I would also like to see some images created with text in them. Something that works kind of okay with a word or a couple of words with other systems, so let’s see… wow, that is superb. And some of you Fellow Scholars are already using this here. Now pair it up with their Imagen 3 AI image generator, which is one of the best around and rivals many other top tier paid models. I mean, just look at these images, huge text prompts are okay, and it even has a hint of creativity. I am kind of blown away.
So really, what happened at Google DeepMind? Imagen 2 was good, not the best, but Imagen 3 is absolutely fantastic. Gemma 2 was okay, not the best, but Gemma 3 is incredible. I mean, look at this one from the paper. 3 is better in every way, and not by a little. And the list goes on. Something must have happened at Google DeepMind lately, because they are coming out with these bangers.
And now, the new robot. I had the honor of having a look at this before release. I will note that we have no business relationship with Google DeepMind, and of course, I always say that I only do it if I can say whatever I want about it. The Papers are beholden to no one. Ha!
You see, last time I was there, they had these robots playing football, and they won the world championship, but it’s a world championship for robots, they are not Ronaldo, not even close. It was a very impressive project, until you see this one. Now, the best features, in my opinion. It runs in real time and reacts to the world changing around it. Or in other words, you can troll it all you want. Super fun. It is also good at high-dexterity tasks, which is pretty stunning. I think of robots folding laundry as a thing that

Segment 2 (05:00 - 06:00)

is perhaps a decade away, and when I see this…wow. And finally, it generalizes to new tasks. That is what we want from an intelligent being. It was asked to slam dunk this ball here, well, it’s not Michael Jordan, but it gets the job done. Absolutely incredible. And yes, it packs lunch too. Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. So, what do you think? Let me know in the comments below.
