# MASSIVE AI NEWS #13 Everything Just CHANGED! New AI Robots, AI Reads Minds, New AI Text To Video

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=0a5e0iCW6bE
- **Date:** 20.08.2023
- **Duration:** 22:51
- **Views:** 24,464

## Description

Welcome to our channel where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos.
https://twitter.com/fffiloni/status/1688593255720456192
https://twitter.com/lukebelmar/status/1689212989637828608
https://techcrunch.com/2023/08/09/anthropic-launches-improved-version-of-its-entry-level-llm/?guccounter=1
https://www.reddit.com/r/singularity/comments/15mn1rf/nvidia_unveils_more_powerful_ai_chip_coming_next/
https://www.reddit.com/r/singularity/comments/15mk8p8/text_to_music_suno_ai/
https://www.reddit.com/r/singularity/comments/15mzo0b/playht_just_announced_a_conversational/
https://twitter.com/Jessewelle/status/1690051035723976704
https://twitter.com/TheAIAnonGuy/status/1692462508412420549
https://twitter.com/rowancheung/status/1691856185228706181

Was there anything we missed?

(For Business Enquiries)  contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience
#IntelligentSystems
#Automation
#TechInnovation

## Contents

### [0:00](https://www.youtube.com/watch?v=0a5e0iCW6bE) AudioLDM 2

So welcome back to another week in AI content. This week was actually quite interesting, because a lot of surprising tools managed to come out, so let's take a look.

Starting with AudioLDM 2: "Learning Holistic Audio Generation with Self-supervised Pretraining". For the people who can't be bothered with technical jargon, it's basically text-to-audio, and text to other stuff as well. In the video they released, you put a text input in and you can get several different kinds of output, not just one kind of audio. On the page you can see "Section 1: Text-prompted audio generation", which is essentially sound effects. You can also get music generation and speech generation, so this one model gives you three entirely different outputs. That's really useful, because a big reason people use different models is that each one has to be fine-tuned and specialized for a certain purpose. With this model, it seems they've managed to combine state-of-the-art performance in text-to-audio, text-to-music and text-to-speech generation, which is pretty insane.

I think this is going to be one of the models that is widely used, because people working with audio often have a project where they need text-to-music, text-to-speech and text-to-sound-effects all at once. In a moment I'll play some of the examples so you can hear that this isn't awful, and it isn't just a demo. This is really good. I did use this tool live, and it gave me something really cool as well. So right here we have text-to-sound-effects, and I'm going to play some of these now so you can hear just how good this is.

Now that you've heard the examples, the reason I think AudioLDM 2 is game changing is something like the Adobe suite. Imagine I'm editing a video: I want to add a background track, then a speaker, then some sound effects, and I can do all of that in one simple tool. There is a video that they released, and it's really cool. Leave a comment down below and let me know if you try it out and whether you like the results, because my personal results were actually pretty good.

Then we had something game changing from Google DeepMind's CEO: its next algorithm will eclipse ChatGPT. The company that built AlphaGo, with many game-changing breakthroughs in artificial intelligence, is working on a new system called Gemini. They stated that it will tap into the techniques that helped AlphaGo beat a Go champion in 2016, and that it's going to eclipse ChatGPT. What this means is that they're probably going to release a tool that is somehow more advanced than ChatGPT or GPT-4. When they say it's going to eclipse ChatGPT, I don't doubt it, because if you've ever seen AlphaGo, or watched our video about it, you'll understand just how crazy it was. The reason this is so crazy is, number one, it's from Google, and Google DeepMind have done insane things with AI before, if you've been around the AI space. Number two, they're pumping billions of dollars into this, and I think they're using a new method of training it. So I think this is where we're going to get some real innovation that leads to the next step.

This article is really good and I would recommend you all read it. As you'd expect, it does dive into the grave risks, because when you're leveling up an AI into something we've never seen before, there definitely could be risks. I think they didn't really understand what they were saying when they said they're going to make sure this large language model can plan and think long term. If someone had said that to me a year ago, I would have said: isn't that super dangerous, isn't that what we're trying to avoid? But now I guess they're running forward with it, so we will see. They did say that it might be released sometime in 2024 or 2025, so it will be interesting to see how the developments come into play and how they're going to be working on

### [4:14](https://www.youtube.com/watch?v=0a5e0iCW6bE&t=254s) Luma Labs

this system. Then we had Luma Labs talk about how they're trying to accelerate the rate of advancement and grow their community with their sponsor Nerfstudio and Berkeley AI Research. I think they're open sourcing this tool, and they're trying to get a lot more people invested in this kind of software and making this kind of content with it. I do believe that this kind of content and these kinds of tools are really underrated, and what they're about to do for technology is really impressive.

What they also announced around two to three days ago was an app called Flythroughs, an app that shows off your space with AI-generated cinematic videos that look like professional drone captures. All you have to do is do your LiDAR capture, however you do it, and then it does the fly-through itself. I think you can also do some of the fly-throughs yourself, and it makes them into a really cool montage.

One thing that I haven't seen anyone talk about with Luma Labs AI is the application to crime scenes. Tons of crime channels are popping up on YouTube where you can watch documentaries about crime scenes and how detectives analyze them. Wouldn't it be interesting if someone used Luma Labs to take a snapshot of the crime scene at that specific time, so that detectives around the world could analyze it and think, okay, maybe this happened, maybe that happened? Sometimes crime scenes get contaminated, sometimes things happen, and I think this would be a really good application. I haven't really seen many others, apart from real estate, when you're trying to view a house, or cinematic montages. Of course, this doesn't handle moving objects, which is unfortunate, but it's still really cool. You can use Luma Labs on your phone, you can do it for your house, many different things. So yeah, really cool, and I would definitely check this out.

Then of course we have Nvidia finally releasing their model Neuralangelo. If you don't know what Neuralangelo was, it was where you could take any photo and convert it into a 3D model. It's shockingly accurate compared to previous methods in terms of how accurate the 3D models end up being. This will be interesting because of how the AI cycle works: an AI tool gets released or announced, usually with a research paper, and after that they open source it. That allows people to build on top of the tool, and that's where you get all of these AI tools and AI apps. For example, when LLaMA was open sourced, we got tons of smaller large language models. Now that this is open source, we're likely to get some kind of 3D recreation tools where people fine-tune it, build better user interfaces and make products. I do think that's how many businesses are going to make money: taking open source tools and turning them into full-fledged products. It definitely seems like a gold rush opportunity, but at the same time I can't wait to use this, because there are some pictures that I want as 3D scenes.

It will be interesting to see how this is used in different industries. I know this might sound weird, but I think something like this could be used on other planets. For example, say the Mars rover is analyzing another planet and can't venture into a certain area: it could take a snapshot of that terrain, analyze it in 3D, and get more data about that specific area. That would be really cool. There are many different applications. We did release an entire video on that, so you might want to check that out in the description below. I'm excited for this, and I'm excited to

### [8:00](https://www.youtube.com/watch?v=0a5e0iCW6bE&t=480s) Image to 3D

see what people build. Then of course we had another AI tool release: "One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors". This one is super interesting, because I've seen tons of different AI tools come out, even one from OpenAI if you can believe that, and this 3D tool surpasses even that.

One thing I want people to understand about AI, and I really do want to get the debate going in the comment section because I think it's important: any company that focuses solely on solving one AI problem is going to be vastly better at it than an AI company that tries to solve many problems. Time and time again we see this. Among the other examples you're seeing, there's Shap-E, which I think was from OpenAI, and some others from companies that do other AI stuff as well, and the problem is that they aren't that great at converting an image to a 3D model. But the company that's focused on doing this, with Magic123, has worked extensively on it, and you can see just how good it is. Look at that teddy bear, look at this statue, look at that horse. Those actually look game ready. I would argue that some of them are game ready, maybe not the horse, because the horse's face does look a bit weird. But these are objects generated from just an image, and that's pretty impressive.

They use a different strategy. I'm not going to get too much into the details, but it shows that when companies simply focus on one thing and decide to make these high-quality meshes, specialization stands out. So if there are any AI companies watching this, and I highly doubt it, I would say specialization is what's going to set you far apart. If there's a company that can do image-to-3D with perfection, they're going to be worth a lot, because image-to-3D is huge, and this is very close. This teddy bear, this statue, this horse, this mug or whatever it is right here: that's really close. Whichever company manages to solve this quickly is going to do really well. So there's that, but then we had

### [9:57](https://www.youtube.com/watch?v=0a5e0iCW6bE&t=597s) Music to Image

something called music to image, which was really interesting because it's not something you would expect, or something that even has demand, but I do think some creatives will use it. You select an audio track and then generate an image from that music. I'm not sure exactly what it's based on, but it's interesting to see the kinds of AI tools that are constantly released, because music-to-image wasn't something I'd even thought about. My guess is that they have something that describes the music track, then converts that description into a text prompt that is fed into something like Stable Diffusion, and you get an image output from that. I think this could work, because sometimes people make music and want to transform it into other mediums of art, and sometimes people want to see what their music looks like: what does this sound like, what does it feel like, is it aggressive, does it have a certain theme? It's not crazy, it's not game changing, but I still think it's interesting. You can check it out and use it, because there's a Hugging Face Space. Let me know what you think about this as well.

Okay, so this is probably the biggest thing that I've ever seen in AI. I think this is going to transform online content, and I think there's literally going to be a law made about it, because what we're looking at is the next evolution in how news presenters, YouTubers and media companies, including people like myself, may choose to deploy their content. What you're seeing on screen is not a real person at all. This person didn't get up and record this content, and they didn't record the voiceover. This is 100% AI-generated content, and it kind of blew my mind. I did mess around with this software a year ago and it was shockingly good, but now it's indistinguishable from reality. Essentially this is an AI avatar: you can clone yourself, make videos, and make it say whatever you want. I know you've just been looking at some guy and haven't heard anything yet, but it's pretty insane. I'm going to talk about some of the tricks they used, because I think I picked up on some of them. It's still been playing on my mind every single day, because I keep thinking, how on earth is this even possible? But enough rambling, take a look:

"In the early days of computer programming, a significant historical anecdote is known as Grace Hopper and the bug. In 1947 at Harvard University, Rear Admiral Grace Hopper was working on the Mark II computer. One day the system stopped working, and technicians discovered a moth trapped between the contacts of a relay, causing the malfunction. Hopper then coined the term debugging to describe the process of fixing computer errors. She even pasted the moth in her logbook, which is now on display at the Smithsonian National Museum of American History."

Now understand that this

### [12:50](https://www.youtube.com/watch?v=0a5e0iCW6bE&t=770s) Jesse Wellens

YouTuber, Jesse Wellens, is someone you might know if you've been on YouTube for more than five years. He was a prankster back in the day and was really popular. He also did this with his own avatar, which he cloned, and that goes to show how good this really is. I think his video is not as good, but the movements, some of the inflections and some of the pauses do make it a lot more realistic. When these AI companies start to realize that the mistakes are actually what make a lot of this stuff human, that's when I think it's going to be a game-over moment. A lot of the time AI tries to be perfect, and perfection isn't great, because life isn't perfect. For example, if you were to draw a car, there are usually rough things on the car. It's not usually a car fresh out of the showroom: there are usually flaws, there are usually marks. When people speak, they sometimes have little odd pauses, but generated speech is usually all perfect, clear and concise. That's why it's a little bit scary that this kind of tool is already getting some of that nuance right. It will be interesting to see how far this tool goes, because once it gets released worldwide (currently it's in beta or alpha) it's going to be absolutely insane. So take a

### [14:01](https://www.youtube.com/watch?v=0a5e0iCW6bE&t=841s) Claude 2 improvements

look out for that one, because that definitely did scare me. Also, Anthropic did improve Claude 2, which isn't really big news, but I think it's important to note that Claude 2 is coming up on the level of ChatGPT in terms of usability. As you know, ChatGPT is currently facing a crisis where many people say it isn't as good as its counterparts because its quality has declined. I think it's a PR problem, because ChatGPT is most certainly better than every other AI tool out there, but because it has declined in usability, many users feel frustrated that they paid for something that has gone down in quality. Imagine buying a phone that got worse as you used it. Although that does happen, it's not what you'd expect from software. You wouldn't expect your software to get worse. It's not a hardware issue, and I think they just decided to do that because of pricing issues. It will be interesting to see if many people move over to Claude 2, because you also have a 100,000-token context window, which means you can input PDFs and large bodies of text, and I know that's something you can't do with ChatGPT. Another annoying thing with ChatGPT is that not only do you have to pay for it, it's also not fully available worldwide: a lot of the features they release are just for the US at the moment. That's quite frustrating for many users, so it'll be interesting to see what happens in the future. I do think Claude 2 is up there now, and the context window especially sets it apart.

### [15:19](https://www.youtube.com/watch?v=0a5e0iCW6bE&t=919s) Nvidia

Also, Nvidia unveiled a more powerful AI chip coming next year. Honestly, guys, when it comes to Nvidia, a lot of people are saying they're overvalued, that it's a stock market bubble, but I think the price is actually justified. I'm not a stock advisor or anything like that, but I've looked at Nvidia time and time again and they just keep doing everything right. They are really in the game when it comes to AI: they have the graphics cards, the software, the robots, the specialization and the team. They have so much that I wouldn't be betting on any other company when it comes to AI. They also provide a lot of the information that we pass on to you guys. So when they release this GH200 superchip, it's going to be absolutely incredible. I didn't think they would be able to keep increasing the quality of their chips, but somehow they do.

If you're thinking, okay, it's a chip, what is that going to change, think about it like this. Earlier this year they announced that they could train entire models on one chip, when previously that wasn't possible. If we get to a stage where you can train entire models on one chip in just a couple of days, think about how quickly we're going to get updates to AI models, and how quickly someone will be able to train their own model. Think about future models like GPT-6, GPT-8, GPT-10 or Google's Gemini, and how much easier they will be to train because the compute time is that much shorter. I think that's really underrated, and people aren't realizing how much it's going to increase the speed at which this kind of stuff comes out. So keep an eye out for this, because these are the announcements that don't break the headlines but really do move everything along. People say, oh my god, all this stuff is coming out so quickly. Well, it's thanks to Nvidia, thanks to all these fast chips that get everything processed so quickly. All these cloud companies are going to be blowing up. So yeah, Nvidia once again doing crazy things. Then of course we

### [17:13](https://www.youtube.com/watch?v=0a5e0iCW6bE&t=1033s) Suno AI

had something called Suno AI. It's text-to-music, and we have seen loads of different text-to-music tools, but this is the first one that is actually out there and released. Thank God for that, because there are so many tools we talk about in these videos that never get released. For example, earlier in the video we talked about text-to-3D, or image-to-3D really, but the problem is that we can't actually use those tools, and what use is a tool if we can't use it? This is something you can use: it's in alpha and you can sign up. It runs on Discord, like Midjourney. For those of you who are creators and want uncopyrighted soundtracks, this is definitely going to shake up the music industry and the royalty-free soundtrack industry. I do want to know your thoughts on this, because although it doesn't sound absolutely insane yet, I think it's going to change a lot. Once we get full three-minute soundtracks that sound perfect, like Midjourney's text-to-image, things are about to change. With that being said, let's move on to the

### [18:09](https://www.youtube.com/watch?v=0a5e0iCW6bE&t=1089s) PlayHT

next one. Then we have PlayHT, which is a low-key player in the AI space. They released their conversational model, which actually challenges ElevenLabs. Everyone knows what ElevenLabs is: it's that privately funded company you can use to generate AI voices that sound indistinguishable from human voices. But PlayHT really did just change the game with this conversational stuff. On their website you can see AI text-to-speech, AI voice cloning and a voice generation API, but what they released with the conversational model is crazy. Don't take my word for it, look at this video, because I think this is going to be the second company that most people use. I'm just going to play the video, and you let me know if you can tell which one is AI, because I can't, and the comments on other videos can't either.

"I honestly can't believe what I'm seeing right now." "Uh, I honestly can't believe what I'm saying right now."

"Hello, Play support speaking." "Hey, yes, so I've been on your basic plan for like a few months and I think I want to upgrade to the, um, the professional plan, the one that's 99 per month." "Yeah, sure, glad you're liking it enough to, you know, consider an upgrade. Let me just, um, pull up your details real quick. Can you shoot me your account email or, like, your phone number?" "Sure, it's 650-451-2218." "Uh, just give me a sec. Okay, there you are. So what are you actually looking for in the upgrade? Any, uh, specific features or stuff that you've got your eye on?" "Yeah, well, I've been running out of storage a bit."

And then we

### [19:55](https://www.youtube.com/watch?v=0a5e0iCW6bE&t=1195s) Toilet Robot

had something that was, once again, shocking. Every week it seems like there's something in AI that just blows my mind. I did think these workers were going to be safe, but now I really don't know which job is going to be safe, other than entertainers. What you're seeing on screen now is an AI robot that cleans toilets. It isn't a basic thing where you press a button and it just wipes the toilet. It literally goes into the toilet cubicle, opens the door, shuts the door, hoovers and cleans. I think this is crazy, because everybody knows that nobody really wants to clean toilets, especially in some settings, like a bathroom in a bar or public bathrooms, which are particularly not nice. And you have to understand how capitalism works: if a company can get robots made cheaply enough that it doesn't have to pay people to clean bathrooms, I think this becomes a job that's done completely by robots. This robot is literally opening the door, cleaning, and moving on to the next one. I didn't think we'd see that this year. Maybe it's something that has been going on for a while, but it even grabs some tissue. I'm not going to lie, guys, this was something I really didn't see coming just yet. Let me know what you think, because it was crazy to see how good it was. It wasn't tripping over, and it's got a whole base on it. You're watching the video, you're seeing what I'm seeing. Maybe I'm overreacting, and it's just a robot trying to clean a toilet, not revolutionary by any means, but the accuracy with which it does it and the fact that it can open doors are the interesting part.

I think this shows us that once robots get really good at something, they're really going to become part of the workforce. I did do some research, and they have been working on these robots for a long time. What's going to be crazy is when economies of scale reach the point where these robots become super cheap. I think that's when, in your everyday life, you'll just be walking down the street and you'll interact with a robot. Let me know what you think about this, because I do think it's quite interesting. I want to say one more thing before we move on. Most people are going to say, dude, it's just a robot that can clean toilets. Just remember that these things get better every year. Now they can clean toilets perfectly, so what are they going to be able to do next? Just remember that continual progression and evolution. Then we have

### [22:13](https://www.youtube.com/watch?v=0a5e0iCW6bE&t=1333s) Pink Floyd

this one, where AI just reconstructed a Pink Floyd song from brain activity. I don't know, guys, I don't even think we're living in reality anymore. This stuff is happening, and these videos might get 10 to 15K views or whatever, but a large majority of people don't realize how fast this technology is advancing and just how much money is going into it. I really do think it's going to be interesting to see how much the world changes in the next 10 to 20 years based on AI alone. Something like this is really crazy. I don't know what to say other than that this would have been basically science fiction if you had told someone about it 10 years ago. So yeah, crazy.

---
*Source: https://ekstraktznaniy.ru/video/14741*