HUGE AI NEWS #16 DALLE-3 , New BARD, Midjourney 3D, New Self Driving, Youtube AI and More
19:08


TheAIGRID · 22.09.2023 · 5,952 views · 184 likes


Video description
Copilot: https://twitter.com/_akhaliq/status/1704916883164614761
Midjourney 3D: https://twitter.com/nickfloats/status/1702721761790357733
NExT-GPT: https://twitter.com/_akhaliq/status/1701435537356124200
AI senate: https://twitter.com/unusual_whales/status/1702122741829169302
Elon tweet: https://twitter.com/elonmusk/status/1704919985246601320
LLM driving: https://twitter.com/DrJimFan/status/1702718067191824491
YouTube AI: https://twitter.com/bilawalsidhu/status/1704874993039913051
YouTube AI 2: https://twitter.com/YouTube/status/1704865999357403143
Bard: https://twitter.com/Google/status/1704119261566800278
MVDream: https://mv-dream.github.io/index.html
https://twitter.com/JaimeYassif/status/1702448587949453400
https://www.youtube.com/watch?v=514IZJENQ3s
Midjourney vs DALL·E

Welcome to our channel, where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos. Was there anything we missed?

(For business enquiries) contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience #IntelligentSystems #Automation #TechInnovation

Table of contents (10 segments)

Intro

So one of the things that was just recently released was DALL·E 3. DALL·E 3, if you don't know, is similar to DALL·E 2: it's essentially image-generation software, quite like Midjourney, but this one is a bit different. OpenAI actually did something I didn't predict they were going to do: as you can see on screen, it says "Larry is so cute, what makes him super duper?" Essentially they've integrated the chatbot with DALL·E 3, which is of course image-generation software, so you can see they're able to design stickers and talk through many different ideas. I think this is effectively GPT-4.5, and you might be wondering why I'm saying that. Previously we talked about how ChatGPT is going to be incrementally upgraded, with more features released over time, and we know it's going to become more multimodal than anything else. So while this is technically DALL·E 3, it's also an upgrade to ChatGPT. Everyone keeps asking when GPT-5 is coming, and it seems like we're getting these updates to ChatGPT that are slowly but surely moving towards GPT-5. This is a major update for ChatGPT, because you can now actually iterate on images through conversation, and ChatGPT can give you an image back. This isn't like the previously shown image capabilities, where you can analyze images; this is about creating them, quite like Midjourney, which is really cool. We aren't sure how good it is compared to Midjourney in terms of realism, but for stickers and various artistic styles it looks pretty good. We don't know exactly when this will be released; I think OpenAI said sometime in September. It will be really interesting to see how it works in practice.
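To make the chat-to-image flow above concrete, here is a minimal sketch of building an image-generation request from a conversation turn. This is an illustration, not OpenAI's confirmed DALL·E 3 API: the `dall-e-3` model identifier, the field names, and the default size are assumptions.

```python
# Hedged sketch: turn a chat prompt into a request payload for a
# hypothetical image-generation endpoint. Field names are assumptions,
# not OpenAI's documented API.

def build_image_request(prompt: str, n: int = 1, size: str = "1024x1024") -> dict:
    """Build a JSON-ready payload for an assumed image-generation endpoint."""
    if not prompt.strip():
        raise ValueError("prompt must be non-empty")
    return {
        "model": "dall-e-3",  # assumed model identifier
        "prompt": prompt,
        "n": n,               # number of images requested
        "size": size,         # assumed resolution string
    }

# Example: the sticker-style prompt from the demo.
payload = build_image_request("Larry the hedgehog as a super-duper-cute sticker")
```

In a real integration the chatbot would fill `prompt` from the ongoing conversation and post this payload to the image model.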

DALL·E 3 vs Midjourney

Let's look at some comparisons between DALL·E 3 and Midjourney. In the first pair, the top image is actually DALL·E 3 and the bottom one is Midjourney. The prompt was a heart with a universe inside it, with clouds coming from the sky, and I do think the DALL·E 3 result beats Midjourney in this area. But once again, what we're starting to see is that each image generator is best at certain styles; I don't think there's one generator that is best at all of them. Even within Midjourney you use V4 for a more artistic look, V5 for more realism, and version 5.2 is, I guess, both combined. For example, here's a prompt about leaves dressed up as dancers, anthropomorphic leaves as country-folklore singers: this is DALL·E 3, which looks pretty good, and this is Midjourney. Like I said, I'm not sure which version the person used in Midjourney, so I can't really judge it; different software, and different versions of it, will generate quite different results. Then we have a close-up of a hermit crab nestled in wet sand: the top is DALL·E 3 and the bottom is Midjourney. Midjourney does look more realistic, insanely realistic; I would believe it's a real photo. The DALL·E 3 one also looks realistic, although I'd say it looks like a normal crab rather than a hermit crab. It's up to you which one you think is better; I'd love to know your thoughts in the comments. Now let's move on.

NExT-GPT Any-to-Any

This was something I wanted to cover last week, but the video was already about 28 minutes, so we didn't have time to put it in. This is NExT-GPT, an any-to-any multimodal LLM, and it's generally where I think GPT-5 and these other large language models are going. Many people in the industry think we'll end up with one large language model that can do absolutely everything, and maybe that will be possible, but I think what we see here with NExT-GPT, like Microsoft's JARVIS and similar systems before it, is much more realistic: one large language model in the middle that calls on other specialist models for specific requests. For example, when it needs to generate an image it might use Midjourney or DALL·E 3; when it needs text, ChatGPT; when it needs audio, it could call the ElevenLabs API; and when it needs vision, the best vision model available. With all of that combined, it feels more like an AGI than one AI that does absolutely everything itself. I think it will be similar to how the human body works: the brain calls on the nose for smelling, the tongue for tasting, the eyes for seeing, rather than doing everything itself. So it's going to be kind of like a body. This was a paper, and it says: "As we humans always perceive the world and communicate with people through various modalities, developing an any-to-any large language model capable of accepting and delivering content becomes essential to human-level AI." That's going to be very interesting, because any-to-any even includes video. It will be interesting to see how it works, but I think this is where these large language model systems are headed; we just saw ChatGPT already using DALL·E 3, and that's exactly the direction. If you're wondering what the future of AI is going to look like, this is a good sign.
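The "one coordinator model calling on specialist models" idea above can be sketched as a simple dispatcher. The modality names and the placeholder handlers below are invented for illustration; they are not the actual NExT-GPT architecture.

```python
# Hedged sketch of a modality router: a central dispatcher forwards each
# request to the specialist handler for that modality. Handlers here are
# placeholders standing in for real models (image, text, audio, ...).

from typing import Callable, Dict

def make_router(handlers: Dict[str, Callable[[str], str]]) -> Callable[[str, str], str]:
    """Return a dispatcher that routes a request to the right modality handler."""
    def route(modality: str, request: str) -> str:
        if modality not in handlers:
            raise ValueError(f"no handler for modality: {modality}")
        return handlers[modality](request)
    return route

# Placeholder specialists; in the vision described above these would be
# calls to an image model, a text model, an audio API, and so on.
router = make_router({
    "image": lambda req: f"[image generated for: {req}]",
    "text":  lambda req: f"[text answer to: {req}]",
})

print(router("image", "a heart with a universe inside"))
```

The design choice here mirrors the "brain and organs" analogy: the router holds no modality logic itself, so new specialists can be plugged in without touching the coordinator.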

MVDream Multi-View Diffusion

Another thing I wanted to include is MVDream, multi-view diffusion for 3D generation. This is really good: I've seen tons of different 3D model generators, and this one is approaching the level of detail where you could actually use the output right now. You can see "Viking axe, fantasy weapon, blender", and this is just a text-to-3D prompt. Look at the "Gandalf smiling, white hair" examples; they're really good. Before, with other projects, you'd see results like the examples here: this one's blurry, that one's not too bad, but these aren't as good, and you can see MVDream has managed to fix various quality issues. This is really promising, because when you compare the level of detail, it goes to show that with 3D we're seeing improvements every week. Someone on Reddit made a point, I'm not sure which post, that people often look for one major breakthrough, and of course that does happen, but sometimes all the smaller breakthroughs add up to a much smoother transition towards wherever we're heading. So 3D might be getting, say, 10% better every month, and eventually you have something crazy. Remember Midjourney: every month it got better, and now we have something that is pretty crazy. Like "Jack Sparrow wearing sunglasses", boom, a 3D model; that's pretty crazy to me. I'm pretty sure there's a GitHub page or a Hugging Face space you can use. This shows promising results, because once it's perfected it's really going to change things.
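The point about small gains compounding is worth making concrete: a hypothetical 10% improvement every month multiplies rather than adds, so over a year it comes to roughly 3.1x, not +120%.

```python
# Worked example of compounding improvements: repeated fractional gains
# multiply, so 10% per month over 12 months is about 3.14x overall.

def compound(monthly_gain: float, months: int) -> float:
    """Total multiplier after `months` of repeated fractional improvement."""
    return (1.0 + monthly_gain) ** months

yearly = compound(0.10, 12)
print(f"{yearly:.2f}x after one year")  # about 3.14x
```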

Tech Tycoons Discuss AI

Then of course you can see right here: "Tech tycoons with a combined net worth of roughly $550 billion gathered in the same room today for a Senate forum on the future and regulation of AI", from Bloomberg. These are the leaders and pioneers of the AI space, the people who pretty much own the decision on where this technology goes, discussing AI. What was interesting was one part of the conversation. The conversation overall wasn't that great, but there was something I wanted to pick up on, because I don't think enough people are paying attention to it, and once again it shows that we are in a race to the bottom. According to the Washington Post, one of the 22 tech titans at that Senate meeting, Tristan Harris of the Center for Humane Technology, told the room that with $800 and a few hours of work his team was able to strip the safety controls off Meta's open-source large language model Llama 2, and the AI responded to prompts with instructions to develop a biological weapon. You have to understand that this is a problem. Meta chief Mark Zuckerberg reportedly replied that those instructions are already available on the internet, and that this is an example of AI doing sophisticated research. Fair enough, but as we said, this is just the beginning. Remember that DeepMind was able to develop AlphaFold, which solved problems that would have taken us years. The point is that right now these AI systems aren't at the level where they can design completely new biological agents, but if they can do that five or ten years from now, that's going to be serious, because open-source tools are accessible to anyone. That accessibility is good in one sense, but it also opens things up to bad actors: maybe someone wants to develop something that could ruin a whole town, or parts of the world. It's definitely something we need to be careful of, because without safeguards and regulations this kind of capability will fall into the wrong hands. They equate it to giving a nuclear bomb to everyone on the planet, which is a recipe for disaster, because it only takes one person to set it off, and with eight billion people there are definitely at least a few who are crazy enough to do it just to see what happens. So although open-source AI models are good, I don't think unrestricted access is the best approach, because the risk of people using them for fraud and many other things is just too high. But it will be interesting to know what you think: should these open-source AI models still be allowed, or is the risk from bad actors just too great?

Autonomous Driving with Chain of Thought

Next, we have something very interesting that I'm glad is starting to get more recognition: autonomous driving with chain of thought, an autopilot thinking out loud in text. The tweet says LINGO-1 is the most interesting work the author has read in autonomous driving for a while: before, it was perception then driving action; now it's perception, then textual reasoning, then action. If you don't know what chain-of-thought prompting is, it's essentially where you ask a large language model a question. For example, if I ask ChatGPT "what's two plus two?", it might just say "four". But if I ask "what's two plus two?" and add "let's think step by step; show your reasoning and explain it before you give your answer", it will say something like "two plus two is four, because when you add two and then add two more, you get four". It's meant for more complex questions, of course, and they're now applying it to driving. LINGO-1 trains a video-language model that comments on the ongoing scene, and you can ask it to explain its decisions: why it stopped, what it's planning, and what it's going to do next. At the start we can see it says "I'm edging due to the slow-moving traffic", and then as it moves on, "I'm overtaking a vehicle that's parked on the side", "I'm accelerating now since the road is clear", and "remaining stationary as the lead vehicle is also stopped". I think this is interesting because it gives us insight into how these models make their decisions. It might be a breakthrough, I'm not too sure. I think Elon Musk made a comment on this, because he did talk about LLMs, but I can't find the actual tweet. It will be interesting to see if this is more successful than other decision systems, because contextual reasoning, as we know, can improve a model's responses by around 20 to 30 percent, or in some cases even five times. What's also very interesting is that Elon Musk said something literally an hour ago that suggests he has inside information.
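Chain-of-thought prompting as described above can be sketched as a simple prompt wrapper. The instruction wording below is just one common phrasing, an assumption for illustration, not what LINGO-1 actually uses.

```python
# Sketch of chain-of-thought prompting: the same question, wrapped with
# an instruction to reason step by step before answering.

def with_chain_of_thought(question: str) -> str:
    """Wrap a question with a step-by-step reasoning instruction."""
    return (
        f"{question}\n"
        "Let's think step by step and explain your reasoning "
        "before you give the final answer."
    )

plain = "What is 2 + 2?"
cot = with_chain_of_thought(plain)
print(cot)
```

Sent to a model, `plain` tends to get a bare answer, while `cot` elicits the intermediate reasoning, which is the behavior the driving commentary above relies on.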

Midjourney 3D

Elon Musk said that Midjourney will be releasing something significant soon. This was in response to a tweet about how DALL·E 3, once deployed, will improve at a faster rate, and Musk replied that Midjourney will be releasing something significant soon. That suggests he has insider information about what's going on at Midjourney, which is no surprise. I'm trying to think what exactly Midjourney is releasing: is it finally going to be 3D, because apparently that's what they're working on, or is it finally the desktop browser area they were working on, which was leaked? I do have the screenshot, but I won't show it, because they don't want anyone to see that stuff; I'm guessing they're watching the competition and want to save it. There's a tweet here that says, according to David Holz, Midjourney V6 will be a bigger jump from V5, with better image quality and text prompting, and Midjourney 3D should come out in the next six months. If you want to know what 3D might look like, we have an image trailer from the account @nickfloats. I'm not sure if he made it himself; I'm pretty sure it was him, because I didn't manage to find the source. It does look very interesting, and it's plausible, because the software and technology to do this already exists. I wouldn't be surprised if Midjourney actually ships it, because being able to do this, like the crime-scene snapshots we discussed before that would be interesting for detectives, or real estate, where you can virtually explore a space and figure out where you'd place things, would definitely be that next step.

Robot Factory

RoboFab: introducing the world's first factory for humanoid robots. This is pretty wild, because as you know, humanoid robots are becoming more and more popular, and this is a factory for the humanoid robots that are basically going to be everywhere, which is what they want; they're trying to reduce the cost of these automated units. What's even crazier, which I forgot to mention, is that they're going to use the robots to assist in the very factory in which they're building robots, so I guess you could say it's somewhat exponential, and very interesting, because I didn't expect this announcement to happen so quickly. I'm fairly sure these companies are going to get more funding, because investors definitely want to benefit from an industry projected to be worth billions, maybe trillions, of dollars. It will be interesting to see how it plays out.

Then we had a surprise: YouTube said they're going to release a bunch of new AI tools to help creators and anyone who wants to create on the platform. It will be interesting to see how this works, how it changes content creation on the platform, and whether it's actually good or pretty bad. It mentions AI images for Shorts and more. I'm going to leave some of the announcement video in here so you can see exactly how it looks and works, because their explanation is vastly better than mine: "Are you actually rolling? We're actually... oh, we're rolling, okay, we're good to go. YouTube just announced this set of AI and editing tools that are going to revolutionize the platform, making creation easier and more fun for everybody. The aim is to unlock more creativity for more creators than ever before. The most exciting part to me is that what was announced is supposedly just the beginning. First up, let's talk about Dream Screen, the new image and video generation experiment making its way to YouTube Shorts. Powered by amazing AI technology, Dream Screen lets you bring your imagination to life by simply typing ideas as text prompts; it then generates super fun images and videos you can use to set the scene. These new tools are expanding the boundaries of digital art, and that's not all: meet YouTube Create, a new app YouTube is building to make editing easier for everybody, free of charge. It includes access to thousands of royalty-free tracks and sound effects, and you can automatically create captions for your video with just one tap. Last but not least, my personal favorite: there's a feature that lets you clean up and remove background noise. I live in New York; that would be really helpful. The beta for YouTube Create is available first on Android to creators in select countries right now, so go check it out. These announcements show that YouTube is really starting to transform the way content is created, helping more creators make more content in more ways than ever before, shrinking the gap between our wildest ideas and what we can actually create. Until then, just keep making things; I'll see you on YouTube."

Then of course Microsoft announced Copilot, the Windows 11 update, which brings a ton of AI updates, including Paint, Photos, Clipchamp, and more, to your Windows PC. They also said Bing is going to add support for the latest DALL·E 3 model from OpenAI and deliver more personalized answers based on your search history; essentially a whole update that makes everything a lot better. This is going to be interesting, because Microsoft is pushing AI into the entire system, while Apple's operating software doesn't utilize any of this at all; Apple currently seems to be getting left behind in the AI race, and they haven't even spoken about or signalled anything yet. It will be interesting to see how much ground Microsoft gains, because things in AI move a lot quicker than people expect, and you can get left behind; after all, Google Chrome did catch Internet Explorer off guard.

Google Bard

I might make a dedicated video on this, but Bard can now connect to your Google apps and services: "Use Bard alongside your Google apps and services, easily double-check its responses, and access features in more places." What does this actually mean? You know how you have Google Drive, Gmail, and YouTube: you can now use Bard to check your Google Drive, your Gmail, your YouTube, Google Flights, pretty much everything. This is like actually having a great personal assistant, and in this respect it's more valuable than ChatGPT, because one of the big problems with ChatGPT is that if it needs information, I have to give it that information, and that's time-consuming, especially if you're using ChatGPT to respond to email or do work for your business or company; constantly feeding it new information, especially after updates, takes a lot of time. This is where Bard comes in: it's already connected to Gmail, it can already help with shared conversations, just so much stuff, and it's really cool that they have this now. Once again, this is a step up. When you go into Bard you'll see the Bard extensions; if you click next, you'll see "Bard meets Google Workspace", because it has access to all your stuff. What's also really cool is double-checking Bard's responses: you can check how accurate Bard's responses actually are, so if you're not confident about an answer, or Bard isn't confident about something, you can click a button and it will show you how well-supported that response really was. And as you know, you can now upload images. For one of Bard's new features, I took a random picture of a car that I found on the internet and literally just said "YouTube this"; if you don't know what something is, you can say "Google this" or "YouTube this", and it will give you YouTube videos about it. It will be interesting to see which cars it identifies and what gets linked here. It also says YouTube video views will be stored in your YouTube history. I think this is going to be used a lot more than people expect.

Amazon Alexa

Amazon's Alexa voice is about to become a lot more natural and a lot clearer. This is something we did expect: earlier in the year we had Bedrock, which brought a bunch of different foundation models to their APIs and services, but this is for their lead product, Alexa, and, as many of you may know, Amazon's Astro. Take a look at this clip, because I think it showcases exactly what Amazon is doing. Amazon Alexa hasn't really had much of the spotlight since Siri and Google Assistant arrived, but I think whichever company first ships a home device that is really good, integrated with a large language model, sounds natural, is actually useful, and can probably even make jokes, is definitely going to take this next wave by storm.
