💬 Access GPT-4, Claude-2 and more - chat.forefront.ai/?ref=theaigrid
🎤 Use the best AI Voice Creator - elevenlabs.io/?from=partnerscott3908
✉️ Join Our Weekly Newsletter - https://mailchi.mp/6cff54ad7e2e/theaigrid
🐤 Follow us on Twitter https://twitter.com/TheAiGrid
🌐 Check out our website - https://theaigrid.com/
https://twitter.com/Rahll/status/1738286342390374882
https://twitter.com/StabilityAI/status/1737588219863339206
https://twitter.com/StabilityAI/status/1735009826814513342
https://twitter.com/btibor91/status/1737057924025811236
https://twitter.com/CircleRadonqq/status/1737338671219843076
https://twitter.com/stevenheidel/status/1736817896314351873
https://twitter.com/adcock_brett/status/1736450188464599216
https://twitter.com/tdinh_me/status/1735827841978364096
https://digi.ai/
https://twitter.com/LinusEkenstam/status/1735825681966113020
https://twitter.com/Rahll/status/1738087100317118867
https://twitter.com/suno_ai_
https://sites.research.google/videopoet/
https://twitter.com/rabbit_hmi/status/1738000404791566687
https://twitter.com/ylecun/status/1737988581732274672
https://twitter.com/ai_for_success/status/1738558340744249368
https://twitter.com/MartinNebelong/status/1739714101280960518
Welcome to our channel where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos.
Was there anything we missed?
(For Business Enquiries) contact@theaigrid.com
#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience
Table of contents (6 segments)
Segment 1 (00:00 - 05:00)
So this was actually quite an interesting week in AI, so let's not waste any time and get into some of the key news stories you might have missed in artificial intelligence. One of the first things that was released was a text-to-video model for zero-shot video generation by Google, called VideoPoet. Essentially this means that unlike other video models, where you need to input an image to then get out a cool video, this one generates video directly from text. Now, I think this one went a little bit under the radar. I'm not sure if I've released a deep-dive video on it just yet, but if it is released, I'll leave a link to it in the description below. Essentially, the reason this was so cool is that it has many features that other video models simply don't have yet. One of the cool features is video-to-audio, where the video is generated by the model and then the model also manages to produce the correct audio for it. For example, in this clip right here you can see a cat playing a piano, and if I unmute it, you can hear that there is a piano sound playing as the cat hits the keys. There are multiple examples of this; video-to-audio is just one of them. There was also a little story by Google Research based on a raccoon, which is kind of cute, and I'll show you that in a moment. This was something many people didn't notice, because everyone right now is focused on Gemini, so when Google releases other things, they don't get as much attention. But the main takeaway about Google's VideoPoet is that the accuracy of this model is really surprising. Sometimes when we try prompts like these in other models, they're not that accurate, but in these two examples you can see that, for example, "two raccoons on motorbikes, a meteor shower falls behind the raccoons and the meteor explodes right behind them" looks pretty accurate. And in this one, "lightning flashes in the background, purple smoke emits from a figure of water," that is also really accurate: the lightning flashes, then the purple smoke comes out after. There were also some other interesting things, like longer video generation, which was really good. What's also interesting is that literally just after this was released, Pika released something where they have this as well, but that doesn't really matter; I'll be showing you Pika in a second. VideoPoet also has controllable video editing, where you can follow different motions and different dance styles, and interactive video editing, where you can extend input videos from a list of examples and change them via text conditioning, and you can see that the text conditioning makes the videos so much better. And of course, it has really decent image-to-video generation. I even tried this image in other AI video generators, and it didn't go as well as this one, so clearly Google might have something secret here. I'm not sure what framework they're using; I have looked at the transformers they used for this, but I still think that even though they've shown what they're using, they might have some secret sauce in here.
And then of course we have zero-shot stylization, which is like Runway's one-style-to-another-style feature, but I would say this one is a little bit better; you're going to see that later on in the video. There is also another text-to-video model that was released, and it's really surprising, because when I looked at the community, it was insanely small on Twitter: the posts about it only had around 10 to 20 likes, and the tool is incredible. I can't wait to show you that in a moment. But you can see that applying a visual style and effects here on this part is really, really nice. So overall, there may be many different things coming in the future; I'm not sure if other companies are going to use this or if Google is ever going to release it, but it is something Google put out that I do want you to know about. Next, we had Pika Labs finally release their model to pretty much everyone, so if you want to sign up for Pika, you can, and you're going to get immediate access. What's good about Pika, unlike Midjourney, which uses Discord (and this is no hate to Midjourney whatsoever), is that they have a web app, which I think is really cool. You can see here that you can also explore everyone else's generations, so you can see just how good Pika's quality is. Comparing this to some of the other stuff, it definitely takes the cake. One of the things I really like about Pika is that when we look at these videos, you can see exactly how crazy the consistency is, especially in these anime styles. I think what Pika Labs actually did with their video generation is maybe train the model specifically, over many generations, on many different anime styles, or maybe on animation styles in general, because so far what we've seen is that ordinary styles look pretty average; they look okay. But the ones that look really amazing are the anime styles, like this Santa, and of course the 3D animation styles, like this raccoon. So for those two, I'm not sure what Pika
Segment 2 (05:00 - 10:00)
Labs did; maybe they just kept putting Disney movies into it. But those ones do look really, really effective. And like I said, what you can do here as well is add to a clip: you can click edit, and then, I think it's here, yeah, you can click "add 4 seconds." So if you've got a clip, you can just keep adding 4 seconds to it. And this is where I tried the same generation in Google's model and in Pika, and it didn't work, which goes to show that Pika and Google are using different techniques to generate their videos from text. Next up is Domo. It's video-to-video, and I think the quality of this is crazy, because although we did see this kind of thing with Stable Diffusion and with Runway, whatever they're using is absolutely incredible. I've seen some examples online and played around with this myself, and it looks absolutely outstanding. So it's called Domo AI, and like I said, I don't know what architecture they're using; maybe they're using Stable Diffusion, I'm not entirely sure, but it looks so consistent, and some of the styles they've chosen are very, very effective. When I was looking at this AI tool on Twitter, you can see it only got around 50 retweets and 100 likes, but it's really, really effective at what it does. Now I'm going to show you a quick Twitter thread that showcases some of the cool stuff, and here it is. We have the source video, and you can see here that this one is just crazy; it's very surprising how good it is, and the consistency is completely there. I'm not sure if this is going to be used to make animated movies; I'm sure there are other ways to make this stuff, but like I said, this kind of result goes to show how far the technology has come. Usually when you see stuff like this, it looks completely AI-generated, but if I saw this, I would initially think that someone animated it by hand. Like I said, it's a new AI tool that doesn't really have that much coverage out there, but it is really good. I'm not sure if other companies are going to get in on this as well, but you can see that the consistency is pretty incredible, and even this pixelated version right here looks pretty crazy. I'll leave a link to the full thread in the description below, but yeah, Domo AI is definitely something that caught me off guard. And there's one more example I wanted to show you from Twitter: if I saw this, I would definitely not say it was AI-generated. I would say, yeah, someone animated that in 2D; they spent their time animating it. But of course, this is completely different, so it will be interesting to see how this progresses in the near future.
Then we had something that was rather frustrating, or I don't even know what the real take on this is, but Midjourney has been involved in a little bit of drama, and it's not the good kind. Everybody knows that Midjourney is pretty much top tier when it comes to AI image generation, but Midjourney has been caught up in some bad Twitter drama recently over alleged copyright infringement, because what people are claiming is that Midjourney has essentially ripped off these movies. The way people are trying to prove it is by prompting the model to produce a screenshot from a movie. So the prompt someone put in was "Thanos Infinity War 2018 screenshot from a movie, movie scene 4K," and Midjourney output the image on the right-hand side, which is eerily similar to the image on the left-hand side, and the images on the left-hand side are taken directly from the movie. So people are saying that what we have here is essentially a model that's lifting material straight from these movies. Now, Midjourney hasn't been sued by any of these companies, but I do think it's interesting, because we have a situation where Midjourney hasn't really responded in the best way. The problem is that apparently they banned this guy just for making the Twitter thread, and I don't think that's a good look for Midjourney. Gary Marcus, a leading voice on AI who actually spoke at the US Senate AI hearing, weighed in on this, saying: "Dear Midjourney, reinstate this guy or drop your new terms of service. The people have spoken and it's not even close." So basically he's saying they should reinstate him. I'm not sure if they banned him or if they were just working on their systems; either way, it would be nice to get some kind of statement from Midjourney. But I do know that Midjourney is in a pretty precarious situation, because if they explicitly state that this was a mistake, they open themselves up to litigation: if they admit this is from Infinity War and that they're using this kind of material to train their models, then, like I said, they're potentially admitting to copyright theft. So it is a tricky situation, but I do hope this guy's account gets reinstated. The only thing I can say is that this is Midjourney's version 6, which is currently in alpha, so I'm not surprised these issues are surfacing. I also don't think this is a major surprise; I'm pretty sure everybody knows that all these AI companies just scrape data, as you're about to see later in the video, so I'm not sure why this is such a big issue. And of course, there was this last example, where we have an actual image from the Joker movie and then a Midjourney
Segment 3 (10:00 - 15:00)
output image. So it will be interesting to see what happens; if the story develops, please do let me know, but it's definitely something I want to see resolved. Then we had Stability AI introduce their video foundation model, which can generate around 2 seconds of video, and it looks pretty good, I'm not going to lie. It might not be better than Pika Labs or the others, but like I said, text-to-video is improving: so far we're getting higher quality, increased frames per second, and better stability and consistency, and that's exactly the direction I want to see us going in. So I won't be surprised if Stability eventually releases another model; of course, 2 seconds isn't great right now, but it's better than nothing. Stability did release this, and you can use it if you want, but it's another development you probably missed. What they also released, and this was something I was supposed to talk about as well, was Stable Zero123: quality 3D object generation from single images. Essentially, you just use images to generate 3D models. I've seen this technology get better and better, and I honestly can't imagine where we're going to be just three years from now, going straight from an image; will people even need to model things by hand anymore? So it will be interesting to see where this goes. Like I said, I think Stability is pushing out all of this stuff because they recently had some problems with their finances; they were trying to raise more money and were in a pretty precarious situation because they're not making as much money as the other AI companies. So I'm sure that if they manage to put out more stuff, they'll eventually be able to survive, and of course, as you know, AI is a growing field, the demand is there, and they should be able to work through this. Then, of course, we had ChatGPT's "Project Sunshine." If you haven't been paying attention, I made a video on this around 3 or 4 days ago that talks about the advanced features coming to GPTs. This is a kind of big story that most people missed, and if you think this is some crazy leak, you can actually go to OpenAI's official web page and check the code yourself to see that these strings really do exist in the JavaScript. It's not open source, but you can open the page inspector (inspect element), press Ctrl+F, and find these feature strings right there on the web page; a small script version of this trick is sketched below. Essentially, these are features that OpenAI might ship. They're on the web page right now, but of course nothing is confirmed, because nothing has been released yet; they could be alpha testing with a set group of users or testing internally, but it is something we know is there and is probably the future of GPTs. It says here that your GPTs can now learn from your chats.
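For the curious, here's that Ctrl+F trick as a minimal Python sketch: fetch a page's HTML, pull out its script bundle URLs, and search each bundle for a keyword. The target URL and the keyword are placeholders for illustration, not confirmed endpoints or feature names, and note that a page behind a login or bot check won't return its real bundles this way.

import re
import requests

PAGE = "https://chat.openai.com/"  # placeholder target page (assumed, not confirmed)
KEYWORD = "Sunshine"               # placeholder feature string to search for

html = requests.get(PAGE, timeout=30).text

# Collect absolute script bundle URLs referenced by the page.
script_urls = re.findall(r'<script[^>]+src="(https?://[^"]+\.js)"', html)

for url in script_urls:
    js = requests.get(url, timeout=30).text
    for match in re.finditer(re.escape(KEYWORD), js):
        # Print a little context around each hit, like Ctrl+F would show.
        start = max(0, match.start() - 60)
        end = min(len(js), match.end() + 60)
        print(f"{url}\n    ...{js[start:end]}...")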
And I'm going to include a clip from the other video, because there's no point in me going over this again; I'll include one or two minutes where I pretty much explain exactly what's going on, but the TL;DR is that GPTs are going to be really incredible. I'll leave a link to the full video, because it's a full deep dive; there's a lot you might have missed, but it's something you do want to see. We also had this tweet from someone who works at OpenAI, Steven Heidel, who fine-tunes LLMs, saying: "Brace yourselves, AI is coming." And this was in response to OpenAI's preparedness framework. Once again, I did a deep dive on this, and it's really important, because these measures essentially show us what kinds of models we're going to be getting. With this framework, we basically know that really dangerous models are never going to be deployed; somewhat dangerous models may be developed by OpenAI internally, but those will never be deployed either. If a model is even remotely that dangerous, OpenAI won't ship it to the public. Of course, as you know, there are ways for people to run prompt injection attacks and to jailbreak these systems, but as far as OpenAI's process goes, they're never going to release a model rated above the "medium" risk stage. On this scale you can see they grade models from "low" to "critical": if a model scores "high" or "critical," they will never release it, and if it scores "low" or "medium" in terms of capabilities, they will release it (a toy sketch of that deployment rule follows below). So it goes to show that future models aren't just going to be AI murder-bots, which of course we didn't think would happen anyway, but at the same time it shows they're taking AI safety really seriously, even though a lot of people don't think they are. I'm really glad they released this, because although OpenAI has been criticized for going full steam ahead with all of their projects, it's good to see they're still maintaining a strong stance on AI safety despite the previous concerns.
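Here's that deployment rule encoded as a toy Python check. This is my own reading of the public framework, not OpenAI's actual code, and the category names are made up for illustration:

RISK_LEVELS = ["low", "medium", "high", "critical"]

def can_deploy(category_scores):
    # Deployable only if every tracked category's post-mitigation score
    # is "medium" or lower, per the framework as described above.
    return all(RISK_LEVELS.index(score) <= RISK_LEVELS.index("medium")
               for score in category_scores.values())

print(can_deploy({"cybersecurity": "low", "persuasion": "medium"}))  # True
print(can_deploy({"cybersecurity": "high", "persuasion": "low"}))    # False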
Segment 4 (15:00 - 20:00)
Now, something that was rather frustrating slash interesting slash fascinating slash innovative was this AI romantic companion that people were talking about, Digi. The thing is, AI romantic companions have been around for quite some time, and this one really stirred up a huge debate about what the future is going to look like with these kinds of models. Now, the app itself isn't that good; you can go check the reviews, and this isn't me dogging on the app, it's not paid hate or anything like that, it's just that they're essentially all one-star reviews. But the point is, I don't think we're going to have a future where everyone is simply obsessed with AI models. At the end of the day, though, we do know that large language models are getting more capable in terms of persuasion, how they sound, and how real they feel, and I do think that within an ostracized part of the population there are definitely going to be people who use these services, whether or not we like it. So it will be interesting to see how this plays out, but it is one of those really controversial topics with a lot of arguments on both sides. Now, we had something really cool here from Suno AI: they released this text-to-music generator, and I'm going to show you the clips from the trailer, because it's really cool, so take a look. "I'm headed back. That's where I belong." I think that is really fascinating stuff, because music generation is kind of weird in the sense that it's something you really want to come from a person; the whole point of music is that you're listening to someone. So I'm not sure if people are going to take this on board and start listening to AI-generated songs. What I do think this video showcased is that we might be headed for a future of AI-generated entertainment: the guy is walking home, these girls are sitting in a car, and rather than picking their favorite song (and I know this is just a demonstration), they simply put in a text prompt and listen to the result. Of course, in the future, I think an AI system that knows all of your favorite music, all of your interests and hobbies, and absolutely everything about you could possibly generate some of the best songs, and you might never listen to another artist again. At the end of the day, it's fascinating because it raises the question of how much AI will actually impact us and these industries. I do think this tool will mostly help musicians make stuff faster rather than replace them, but it's something you should probably play around with. I also think it helps with the copyright issue: some creators, even people like myself, struggle to find copyright-free soundtracks, and you could just generate one with AI and use that. So, like I said, this is going to be pretty fascinating, and I'd love to know your opinions on this one. Then, of course, we had Leonardo AI's Motion feature. Essentially, it's image-to-video: you put your image into the system and it turns it into a video. You can see here it's image to motion.
You can set the motion strength and then generate, and in a couple of seconds, using your credits, you get a video out. I think it's really decent considering this is the first video model we're seeing from them. I'm going to show you another thread as well, but it goes to show that competition is good, and Pika Labs has some very real competition in terms of this model's quality. There's also a thread here where this guy was doing a sketch-to-video project, and you can see, for example, this doughnut: that looks pretty good. Then he has this robot that was upscaled by Magnific; if you haven't seen that one, it's a really insane AI tool. See this one right here, you can see how it turned out. And I wonder about the day we're going to get really high-quality videos, high quality enough to be on the same level as Midjourney images; that would be absolutely incredible. He's also got an apple here; the first one was really cool. And of course, we've got this Viking dude in a forest, and this drawing of a car. It actually is a car: if you look at the sketch, that's what it interpreted the sketch as, and then it turned that into this. So I think this is really interesting, because it helps the creative process out so much. Imagine you're a kid, four or five years old, and you want to make some cool stuff, and your drawings look like that: you could essentially turn them into videos like this. I know that's not a crazy application, but it's still something you can creatively enjoy and probably use. And of course, we had OpenAI getting sued by the
Segment 5 (20:00 - 25:00)
New York Times. Essentially, they're getting sued because OpenAI allegedly trained on proprietary data from the New York Times, and when you prompted GPT-4 with certain things, it output the exact same text as the New York Times. It's really interesting because, like I said before, Midjourney is facing the same issue, and you can see here that the output from GPT-4 is the exact same text as the New York Times article. It's fascinating because this is something we hadn't seen before, and copyright law here is so strange: the problem is these AI companies didn't ask for permission, and they've already trained on the data, so it's like going back in time to sue them for something they already did. I'm not even sure if the New York Times is going to win this case, but if they do, what does that mean for OpenAI? Does it mean that any company that ever trained on data it didn't ask permission for can be sued whenever someone finds a prompt that reproduces the exact text? That would open up the floodgates. Would it slow down LLM development? Would it increase synthetic data production? All of these are things you definitely have to think about, because it could change the landscape; who knows. So the ruling in this case will be one to watch, because the problem with generative AI and all these new technologies is that they raise problems we haven't seen before, which means there are laws that don't account for certain things, which means these cases are going to be fundamental to shaping how future cases are tried and ruled. That's why I'm really going to be paying attention to this. Essentially, the New York Times is suing them for billions of dollars, arguing that OpenAI has cost them billions because people are using OpenAI's software rather than theirs. So it will be interesting to see how this is ruled, because it will set a precedent for future companies. Then we had a viral video where someone said, hey, this robot, "Rob," just recognized himself in the mirror. Essentially, what this person did was put ChatGPT into a small robot built on the Raspberry Pi platform, which is just a platform you can build really cool stuff with. Watch the video first, and then I'm going to tell you why it went the way it did; there is some music in it, so I'm not sure if I'll be able to play it. The point is that in this video they say, "Hey Rob, let's test your vision," and the robot essentially says, "Wow, this is crazy, I can't believe I'm seeing myself for the first time," that it feels a sense of, not consciousness, but realness, yada yada. At first, people were freaking out about this clip; they were like, yo, this robot recognized itself, and there were a bunch of comments on TikTok, YouTube, and Twitter saying that this is the moment people have finally been waiting for, where systems we don't think are conscious turn out to be really conscious. However, there is a caveat to this result, and you have to be honest about how it was actually achieved.
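As the creator's own explanation below spells out, the "recognition" was steered by the prompt, not discovered by the robot. Here's a minimal sketch of that kind of setup, assuming the standard OpenAI Python client; the model name, prompt wording, and helper function are illustrative, not the creator's actual code:

from openai import OpenAI

client = OpenAI()

# An over-described system prompt like this is what produces the "mirror
# recognition" effect: the model is told outright that the robot in the
# image is itself. (Illustrative wording, not the creator's real prompt.)
SYSTEM_PROMPT = (
    "You are Rob, a small Raspberry Pi robot with a camera. "
    "You are standing in front of a mirror, so any robot you see in the "
    "image is you. If you don't see a robot like yourself, simply "
    "describe what you do see."
)

def react_to_camera(image_url: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",  # illustrative vision-capable model
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": [
                {"type": "text", "text": "What do you see right now?"},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]},
        ],
    )
    return response.choices[0].message.content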
So you can see right here that this guy actually admitted: "Yes, for this test I overly described the mirror thing in the prompt. When he wasn't in front of a mirror, he mentioned that he didn't see anything that resembled himself and then proceeded to describe what else he saw. Either I simplify the prompt, not mentioning anything about a mirror, or I'll add in a statement to tell it that if he doesn't see himself, he describes what he sees." So basically, this guy did, unfortunately, prompt the large language model inside ChatGPT to tell it what was going on. It wasn't just him turning on this robot, turning on vision, and the robot realizing, oh my god, I'm awake, this is crazy. If it were, I'm pretty sure it would have been an even more viral moment, but that's why this clip went viral and why a lot of people were saying, okay, this is absolutely insane. That's the honest version. Of course, there will be more tests, and there will be further moments like this, because this isn't going to be the last time LLMs and robots are combined with vision. Brett Adcock, the founder of Figure, a robotics company (figure.ai), talks about 2024 being the year of embodied AI in robotics, and it definitely is, because there are so many companies; I've seen so many different press releases and announcements that don't even get that much attention, and I will do a full video on robotics because there's so much news people are missing out on. Basically, he said that robotics is going to be that crazy next step, and that's exactly what his company is focusing on. And he wasn't the only one who said this: Dr. Jim Fan, the lead of AI agents at NVIDIA and a senior NVIDIA researcher, basically said that 2024 is going to be absolutely incredible, that the next step is robotics, and that it's going to be the thing, other than LLMs, that changes the game. He said we are 3 years away from the ChatGPT moment for physical AI agents; we've been cursed by Moravec's paradox for too long, the counterintuitive phenomenon that tasks humans find easy are extremely hard for AI, and tasks humans find hard are extremely easy for AI. So he basically
Segment 6 (25:00 - 26:00)
talks about the future foundation models and platforms for robots: multimodal LLMs with robot arms and physical I/O devices, such as VIMA, RT-2, RT-1, PaLM-E (Google), and RoboCat (DeepMind), plus the many different algorithms that bridge the gap. And of course they always talk about the data, and NVIDIA's Isaac Sim can actually simulate reality at a thousand times real-time speed. So he's basically saying that 2024 is going to be incredible for robotics, and I honestly can't wait. And the last thing we have here is that Sam Altman is working with Apple's former design chief to build one of the first AI devices. Of course, there already is an AI device, Humane's AI Pin, but this is supposed to be something next level. So it will be interesting to see how this develops, because although it isn't actually Apple itself, it is Apple's former design chief, and it will be interesting to see what kind of AI project they develop, what kind of physical product it becomes if it actually goes to mass market, and whether people adopt it or reject it, because making something everybody uses, like a mobile phone or a bracelet, something that really takes off, is really hard to do, since people are so attached to their current devices. But if any wearable tech does take off, it will start an entire new industry, and it will be interesting to see if AI actually is able to do…