# Elon Musk CHANGES AGI Deadline..Googles Stunning New AI TOOL, Realistic Text To Video, and More

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=dasHfMnxbJg
- **Date:** 19.06.2024
- **Duration:** 24:33
- **Views:** 34,635

## Description

Learn A.I With me - https://www.skool.com/postagiprepardness 
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Checkout My website - https://theaigrid.com/


Links From Today's Video:
https://x.com/GoogleDeepMind/status/1802733643992850760
https://www.tiktok.com/business/en/blog/tiktok-symphony-ai-creative-suite
https://x.com/AIatMeta/status/1803107817345393136
https://ai.meta.com/blog/meta-fair-research-new-releases/?utm_source=twitter&utm_medium=organic_social&utm_content=video&utm_campaign=fair
https://x.com/hyperparticle/status/1802093188012011990
https://ai.meta.com/blog/meta-fair-research-new-releases/?utm_source=twitter&utm_medium=organic_social&utm_content=video&utm_campaign=fair
https://x.com/runwayml/status/1802691475391566108
https://runwayml.com/blog/introducing-gen-3-alpha/
https://x.com/hedra_labs/status/1803095713112580475
https://x.com/Uncanny_Harry/status/1803098094516437318

Welcome to my channel where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries)  contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

## Contents

### [0:00](https://www.youtube.com/watch?v=dasHfMnxbJg) Segment 1 (00:00 - 05:00)

There were actually so many different AI stories this week that I nearly wasn't able to keep up, but I'm going to show you some of the most important ones, because these were pretty pivotal.

Coming in at number one, we have Google DeepMind giving us a very fascinating update. Google DeepMind said they're sharing progress on their video-to-audio generative technology: it can add sound to silent clips that matches the acoustics of the scene, accompanies on-screen action, and more, and they shared four examples. Basically, if you have a video that doesn't have any sound effects, you can use Google's new model to add the sound effects and the theme tune behind it. It's really cool, so what I'll do right now is show you a couple of the examples, and then we'll do a deep dive into why this is probably happening.

What's pretty fascinating about these examples is that we can actually see the prompts that were used as well, so the model takes a combination of the prompt and, of course, the video. Right here the prompt for the audio was "wolf howling at the moon", and that one was pretty nice; it actually sounded like a really high-quality scene. I do think Google have shown us this demo in a way that makes me believe it's pretty much near ready, because the limitations they talk about don't seem to be that serious, and after all, this is just an audio model. Then we had "a slow, mellow harmonica plays as the sun goes down on the prairie", which I thought was really fascinating too, because the sound on that one actually felt realistic. The audio for "jellyfish pulsating underwater, marine life, ocean" didn't sound as good as the others, but I think it was still pretty decent with regards to what it was able to do.

And then there's the one I want to talk about the most, because the prompt for the audio was "a drummer on stage at a concert, surrounded by flashing lights and a cheering crowd". The reason I liked this one so much is that the audio actually synced up with the hits of the beat. Keep in mind that these output generations, if you didn't know, were also AI generated: a lot of the footage you're looking at here was generated by Google's in-house video system, Veo. If you aren't familiar with Veo, around two weeks ago Google showed us some more of the capabilities of their video model, and here you can see it in action generating a drummer hitting the drums. Like I said, the most impressive thing is that the audio model didn't just produce a rough approximation of what was going on in the video; it managed to actually line the drum hits up with the audio it generated, which I think is remarkable.

They also have an additional web page where they explain how generating audio for video works. Essentially, the system uses the video pixels together with the text prompt to generate rich soundtracks, and I think this is where Google actually has the edge, because most systems we have today only take a text prompt: you generate as much audio as possible and then hope it syncs up with what's on screen.
Usually, if it's just an ambient scene, for example a peaceful meadow or a relaxing park where you're only hearing crickets or something like that, you don't really need much on-screen syncing, but in certain scenarios Google's approach comes in very useful, so I'm wondering what kind of creative pipeline Google are going to build around this.
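To make that difference concrete, here is a minimal, purely illustrative PyTorch sketch of the general pattern DeepMind's page describes, where the audio generator is conditioned on both the video frames (pixels) and a text prompt rather than on text alone. The module names, sizes, and token scheme are my own stand-ins, not Google's architecture.

```python
# Hypothetical sketch of the video-to-audio conditioning idea (not Google's code):
# the audio decoder attends over embeddings of BOTH the video frames and the
# text prompt, instead of being driven by text alone.
import torch
import torch.nn as nn

class ToyVideoToAudio(nn.Module):
    def __init__(self, d_model=256, n_audio_tokens=1024):
        super().__init__()
        # Turn each frame into one embedding (stand-in for a real video encoder).
        self.frame_encoder = nn.Sequential(nn.Flatten(start_dim=2), nn.LazyLinear(d_model))
        # Stand-in for a text encoder that would normally be a pretrained LM.
        self.text_encoder = nn.Embedding(30_000, d_model)
        # Decoder attends over the combined (frames + text) conditioning and
        # emits discrete audio codec tokens (causal masking omitted for brevity).
        layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.audio_embed = nn.Embedding(n_audio_tokens, d_model)
        self.audio_head = nn.Linear(d_model, n_audio_tokens)

    def forward(self, frames, text_ids, audio_tokens):
        # frames: (batch, time, channels, height, width); text_ids: (batch, text_len)
        cond = torch.cat([self.frame_encoder(frames), self.text_encoder(text_ids)], dim=1)
        x = self.audio_embed(audio_tokens)       # (batch, audio_len, d_model)
        h = self.decoder(tgt=x, memory=cond)     # audio attends to frames + text
        return self.audio_head(h)                # logits over next audio tokens

# Toy usage with random inputs (8 frames, 12 text tokens, 50 audio tokens):
model = ToyVideoToAudio()
logits = model(torch.rand(2, 8, 3, 32, 32),
               torch.randint(0, 30_000, (2, 12)),
               torch.randint(0, 1024, (2, 50)))
```

Because the audio tokens attend over per-frame embeddings, the generator has a path for lining events like drum hits up with what's happening in the picture, which is exactly the behaviour the drummer example shows off.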

### [5:00](https://www.youtube.com/watch?v=dasHfMnxbJg&t=300s) Segment 2 (05:00 - 10:00)

You can see that they've got Veo and they've got this, and it seems they're building out the entire infrastructure for their multimodal AI.

Now, staying with Google, we need to talk about Google's major shift: Google have recently made a major shift from research lab to AI product factory. If you've been watching the channel, this is something I've spoken about for quite some time. I've talked about how Google has everything they need to take on companies like OpenAI and ChatGPT, but the problem is that they've had a situation on their hands where they were focused on safety and not pushing out products, and the craziest thing about all of this is that Google DeepMind has been consistently losing researchers because features wouldn't get shipped. I made a 20-minute video explaining the details of all the reasons and all the failings Google had, but if you didn't believe me, here is someone who used to work at Google and went to join Luma; you know how Luma just recently released their new Dream Machine that people can actually use for text-to-video. He says: "Now you know why I left Google to join Luma. I was in the team that developed Veo early on, but I knew it would never be shipped to the masses for quite some time. The same for Sora, not until a company like Luma forces their hand, that is. At least I hope. Give me access." He's basically saying he knew Google weren't going to ship anything for such a long time, so he decided to leave and join Luma. And remember, this isn't the first Google employee to do this; many of them have left to join OpenAI or to start their own startups, so Google has had this brain-drain problem where they're losing their best talent to other companies, and to people who want to start their own thing, because they don't like how Google is handling this AI craze.

You can see here it says that over one week in mid-May, two companies introduced AI products built using one of Google's major breakthroughs: OpenAI announced the new model that underpins ChatGPT, and Google announced AI Overviews, but remember the Overviews didn't go well, which was pretty embarrassing. The article continues: the discontent about pushing too hard on commercialization is a mirror image of the internal critique from the last two years, when Google was struggling to bring generative AI to consumers. Researchers who wanted to ship products departed for startups because they thought the company was moving too slowly, according to people familiar with the lab, and Brain researchers have also mourned the loss of their brand, with some even welcoming the prospect of stronger leadership. Basically, some of them felt that the Google brand isn't what it used to be, and that maybe they need new leadership if the current one can't navigate this AI industry. Overall, the article details how the company has combined its two AI labs to develop commercial services, a move that could undermine its long-running strength in foundational research. The problem is that Google is so good at doing research and making breakthroughs; like I've already said, they made the breakthrough that powers ChatGPT, but now they're asking whether they should spend more of their time commercializing some of those breakthroughs rather than just experimenting in the research labs on things that could potentially lead to more interesting results.
Overall it's a very hard call; it's very hard to prioritize at a company when the choice could decide the fate of your business for the next 10 to 20 years. On one side, if you don't prioritize product, people get frustrated and just flock to ChatGPT or whatever the state-of-the-art system happens to be; but if you don't make incredible breakthroughs, you're never going to get to that next level of AI where you truly have outstanding products that stand the test of time and have an incredible moat. It's a very hard position for Google to be in, especially since they're a bigger company, and the problem is they don't have the newness that lets you make mistakes without being critiqued. If OpenAI does something wrong, it's like, ah, they're the newer kid on the block, but Google is a behemoth, a company that's been around for years and has a pretty impressive reputation to uphold. So it will be interesting to see how Demis Hassabis and Sundar Pichai manage to work together to get Google off the ground; I do hope they manage to ship good products and keep a steady hold on their research initiatives.

### [10:00](https://www.youtube.com/watch?v=dasHfMnxbJg&t=600s) Segment 3 (10:00 - 15:00)

Someone once said, and I don't remember who, that if there's any company that should achieve AGI, and that you'd probably trust with AGI, it's most certainly going to be Google.

In a bit of stranger news, TikTok introduced Symphony, their new creative AI suite. I don't usually cover AI social media tools, sometimes I do, but I wanted to cover a broad range of what's going on in the AI industry. Basically, Symphony is designed to elevate your content creation journey every step of the way; it blends human imagination with AI-powered efficiency. The tool is an evolution of their Creative Assistant, an AI-powered virtual assistant that basically just helps you make better videos overall. You know how people will look around social media trying to find trends, asking what's going on, what the best practices are, what kind of ideas they should be making; well, this is a platform that uses generative AI to analyze all of those things and come up with something effective. You can see it in action right here: you have your product, you have the description, and then of course you have the media assets, and you're able to create AI-driven TikToks in just seconds. Once you import your content, it lets you create AI-generated content pretty quickly. Now, this is a bit different, because it's not creating AI-generated content from scratch; it's synthesizing your content with AI in order to produce that content at mass scale, and I think that's probably the right approach, because if you can get content out faster with the help of AI, that helps everyone; who wouldn't want to save time? Of course, people do not want AI slop on their timelines, which is why stuff like this, I guess, still works. Something else they have is translation for global reach: tailor your message for audiences around the world by translating the script and dubbing the voiceover in multiple languages with just a few clicks. That's something that's going to bring down the global barrier. And they have a prebuilt AI avatar selection that you can use for commercial purposes; like stock photos, these provide quick, accessible, cost-effective narration for brands bringing a product to life.

Next, we had Meta release a huge batch of open models; it was so many that it's hard to even fathom what's going on at Meta. These aren't game-changing models, but they will matter to the community, because a lot of the community innovations we see are built off the back of open-source models, and that entire ecosystem of open-source development continues to thrive thanks to Meta. Take a listen to what they had to say, and then we'll dive into some of the more intricate details of what they released: "Today at Meta we are sharing some of our latest research models and datasets. We've shared some of the papers for this work, but by sharing more of the artifacts we should enable the community to innovate faster and also develop new research. This is part of our decades-long commitment to open science and sharing our work publicly. What's included in the release today? There's a multi-token prediction model that can reason about multiple outputs at a time, enabling faster inference. There is Meta Chameleon, a model that reasons about images and text using an early-fusion architecture. There is Meta AudioSeal, a new technique for watermarking audio segments. There's Meta JASCO, a technique for music generation that allows better conditioning on chords and tempo, and there's PRISM, a dataset that enables better diversity across geographic and cultural features. There are a few more things as well. I'm really excited to continue our work towards open research, I look forward to seeing what the community will build with these latest artifacts, and I look forward to sharing more with you over time."
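The multi-token prediction model is the easiest of those to picture. Here is a minimal, hypothetical PyTorch sketch of the general technique being described, a shared trunk with several output heads, each trained to predict a token further into the future; the names, sizes, and training loss are illustrative stand-ins, not Meta's released implementation.

```python
# Illustrative multi-token prediction sketch (not Meta's actual code):
# a shared trunk produces one hidden state per position, and k independent
# output heads each predict a different future token (t+1, t+2, ..., t+k).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTokenPredictor(nn.Module):
    def __init__(self, vocab_size=32_000, d_model=512, n_future=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        trunk_layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.trunk = nn.TransformerEncoder(trunk_layer, num_layers=2)  # shared trunk
        # One linear head per future offset (causal masking omitted for brevity).
        self.heads = nn.ModuleList(nn.Linear(d_model, vocab_size) for _ in range(n_future))

    def forward(self, tokens):
        h = self.trunk(self.embed(tokens))        # (batch, seq, d_model)
        return [head(h) for head in self.heads]   # k sets of next-token logits

def multi_token_loss(logits_per_head, tokens):
    # Head i is trained to predict the token i positions ahead.
    total = 0.0
    for i, logits in enumerate(logits_per_head, start=1):
        pred = logits[:, :-i]                     # positions that have a valid target
        target = tokens[:, i:]
        total = total + F.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))
    return total / len(logits_per_head)

tokens = torch.randint(0, 32_000, (2, 64))
loss = multi_token_loss(MultiTokenPredictor()(tokens), tokens)
```

At inference time the extra heads let the model propose several upcoming tokens from a single forward pass, which is where the faster-inference claim in the announcement comes from.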
So here you can see Meta sharing new research models and datasets from Meta FAIR, and it's pretty remarkable stuff. Meta seems to be one of the only companies backing the open-source area from a position of strength; yes, companies like Google have released things like Gemma at 7 billion parameters and other smaller models, but Meta are taking a clear stand. I think Meta's goal is to be the leader of the open-source community, and it makes sense, because what they've released is genuinely impressive. You can see right here they've released Meta Chameleon, a family of models that can take any combination of text and images as input and output any combination of text and images, with a single unified architecture for both encoding and decoding.

### [15:00](https://www.youtube.com/watch?v=dasHfMnxbJg&t=900s) Segment 4 (15:00 - 20:00)

This is very fascinating, so take a look at the video they had on their page, because I think Meta is going to become one of the major players that sneaks up on everyone, especially since they're supposed to release Llama 3. As they describe it: "Meta Chameleon is a unified multimodal model with joint modeling of text and images in one transformer. It's able to handle any combination of interleaved text and images in its input and output, without the need for modality-specific modules. Most current late-fusion models use diffusion-based learning for image tasks and tokenization for language tasks. Built upon an early-fusion architecture, Meta Chameleon uses tokenization for text and images, making for a more unified approach. We believe this approach can scale better than late-fusion or modular models, while being easier to design and maintain."
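To make the late-fusion versus early-fusion distinction a bit more concrete, here is a small, purely hypothetical PyTorch sketch of the early-fusion idea the video describes: images are reduced to discrete codes that live in the same vocabulary as the text tokens, so one transformer models the interleaved sequence end to end. The names, sizes, and tokenizer are stand-ins, not Meta's implementation.

```python
# Illustrative "early fusion" sketch (not the released Chameleon code):
# text and images are both turned into discrete tokens in ONE shared
# vocabulary and modeled by a single transformer, instead of routing
# images through a separate diffusion model (late fusion).
import torch
import torch.nn as nn

TEXT_VOCAB = 32_000        # ids 0 .. 31_999 are text tokens
IMAGE_CODES = 8_192        # ids 32_000 .. 40_191 are image codebook entries

class ToyEarlyFusionLM(nn.Module):
    def __init__(self, d_model=512):
        super().__init__()
        self.embed = nn.Embedding(TEXT_VOCAB + IMAGE_CODES, d_model)  # one shared table
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.body = nn.TransformerEncoder(layer, num_layers=2)        # one shared model
        self.head = nn.Linear(d_model, TEXT_VOCAB + IMAGE_CODES)      # one shared output

    def forward(self, tokens):
        # tokens may freely interleave text ids and (offset) image-code ids.
        return self.head(self.body(self.embed(tokens)))

def image_codes_to_tokens(codes):
    # A real system would get `codes` from a learned image tokenizer
    # (e.g. a VQ autoencoder); here we only shift them into the shared id space.
    return codes + TEXT_VOCAB

# Example: a caption followed by the image it describes, as one sequence.
text_ids = torch.randint(0, TEXT_VOCAB, (1, 16))
image_ids = image_codes_to_tokens(torch.randint(0, IMAGE_CODES, (1, 64)))
sequence = torch.cat([text_ids, image_ids], dim=1)
logits = ToyEarlyFusionLM()(sequence)  # next-token logits over text AND image ids
```

The design choice the quote is pointing at is that once everything is a token in one sequence, the same model can take or emit either modality at any position, which is why Meta argue it scales more cleanly than bolting separate image and text modules together.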
So yeah, Meta have released some pretty interesting stuff. AudioSeal is the first audio watermarking technique designed specifically for localized detection of AI-generated speech, making it possible to pinpoint AI-generated segments within a longer audio snippet, and even the PRISM dataset, which increases the diversity of the data across geographic and cultural features, fits the same pattern. I think Meta is taking a different stance: people have realized that OpenAI has the LLM area locked down, so Meta are doing things that are genuinely innovative. They have V-JEPA, a distinctive architecture that might actually lead us to systems that truly understand what's going on, and they have the open-source area, where people are going to keep building on top of their releases and we'll get innovations on top of that.

Now, if you were living under a rock, you may have missed one of the most important announcements in video generation: Runway introduced Gen-3 Alpha. Gen-3 Alpha is the first of an upcoming series of models trained by Runway on new infrastructure built for large-scale multimodal training, and long story short, their text-to-video model is absolutely insane. I covered this earlier in the week, and it was truly impressive. One of the key features of Runway's model that you should be paying attention to is the photorealistic humans. Everything about their system is really good, but I have to be honest with you, the photorealistic humans genuinely look better than OpenAI's Sora, and that isn't clickbait or some sort of exaggeration. I've looked at both sets of videos, and every time I look at the photorealistic humans that come from Runway, I struggle to see any real issues with the quality of the content; in terms of how it looks, it is extremely realistic, and I still struggle to believe that what I'm looking at is truly text-to-video. It's pretty strange that we now have text-to-video that's this high quality, without the imperfections that would normally give it away; I just can't spot enough mistakes in these clips to tell that they're AI generated, whereas with other models you usually can. So I'd argue that with the photorealistic humans, which seem to be a special capability they decided to train for, whatever training mechanism they used must have relied on a lot of high-quality datasets, and as we know, data definitely affects the output. What we can see is that Runway is going to be the leader in photorealistic humans, and I think this marks a new stage where not even I can tell whether something is AI generated; I'm going to be questioning nearly everything I see online, especially when there's an HD video of a human, because I won't be able to know whether it's real or AI generated. Unfortunately, we don't actually have access to Gen-3 Alpha yet, but it looks very interesting and very effective: it covers a wide range of styles and essentially merges things that haven't been merged before in a very effective way. So it will be interesting to see who releases first: will it be Runway, or will it be Sora?

### [20:00](https://www.youtube.com/watch?v=dasHfMnxbJg&t=1200s) Segment 5 (20:00 - 24:00)

We've got a fascinating few months ahead for the rest of the year, both in terms of what the releases are going to be like and in terms of AI development generally, so it will be interesting to see what kinds of models come out of this area.

Now, in AI video news, Hedra announced the research preview of their foundation model, Character-1, available today at hedra.com. Basically, this is something that can do reliable talking-headshot generation that can tell a story, which, if you didn't know, is something AI has particularly struggled with. We've seen AI systems do this before, with Microsoft's VASA-1, but Hedra have gone ahead and released theirs for the public to use for free, and it's something that's going to enable next-level content creation with emotionally reactive characters. There is one example I want to show you that demonstrates just how far this technology has come. Uncanny Harry AI writes: "Meet my mate Dave, he's had a belated Father's Day message for everyone, sound on please. I made Dave on Father's Day with a Midjourney image, ElevenLabs voice-to-voice, and a new tool called Hedra Labs. It's the closest thing to acting that I've seen from an AI-generated video." Out of all the examples, even the ones on Hedra's own page, this by far seems like the most interesting one I've seen so far. The clip goes: "My old dad, he was old school, you know, British stiff upper lip. He loved me and I loved him, but we didn't talk about it much, didn't have that sort of relationship. So when he got cancer and my mum was looking after him, I'd see him, but I always thought I had time. One day I got the call and I rushed to the hospital. I was going to tell him everything, but it was too late, he'd already passed. We never got the chance to tell him thank you for being a good dad, thank you for teaching me how to be a man. If your dad's still around, go and tell him now that you love him, before it's too late." So yeah, that is pretty incredible. If we look at how the face is moving, how the skin is folding, it just looks remarkably impressive, and it doesn't look like they've only isolated the face; you can see that the rest of the frame is moving as well. This is a pivotal moment, and I think maybe they're going to have the cartoon versions be a lot more realistic, because of course, if you could just put in any real person's face and have it animated, that's going to present a lot of issues, and there could even be some liability for Hedra themselves. This is the kind of technology that starts to blur the lines between fantasy and reality, and it raises the question of what we're actually going to be looking at in the future when we consume this kind of content: if we imagine a stage where AI latency drops down to, say, a few milliseconds, are we going to be interacting with humans over the internet, or with these things that are basically digital twins? Something to think about.

And for the last clip, we have Elon Musk talking about Tesla's new announcements and, of course, his revised AGI date. Here's what he said: you'll be able to access Grok through Tesla, through your Tesla, and you'll also be able to ask it to do whatever you want. You could ask your car to go pick up a friend, or anything you can think of; the car will be able to do it. Yeah, you'll be able to ask your Tesla to go pick up groceries, pretty much anything. Optimus is really going to be next level.
You'll be able to skin Optimus in, you know, pretty much anything. People on the internet have asked me to make catgirls real, and actually you can make catgirls real if you have a robot catgirl. Yeah, Optimus will be able to pick up your kids from school, and Optimus will be able to be the school if you want; it'll be able to teach kids anything. Yeah, it'll support any language too. I think AGI will be next year, probably; if it's not next year, I'd say 2026 at the latest. For AGI, at the latest. Oh, I hope it's nice to us. So if you define AGI as smarter than any human, I think we're less than 24 months away from that. Yeah, please be nice to us, AI. Humanoid robots will usher in a level of abundance that was hard to imagine; there will be no shortage of goods and services.

---
*Source: https://ekstraktznaniy.ru/video/14238*