New AI 'VIDEOGPT' SHOCKS The ENTIRE INDUSTRY (FINALLY RELEASED!)
10:30

TheAIGRID 29.04.2023 73,488 views 1,547 likes

Video description
New AI 'VIDEOGPT' SHOCKS The ENTIRE INDUSTRY (FINALLY RELEASED!)
https://github.com/OpenGVLab/Ask-Anything
https://twitter.com/_akhaliq/status/1650151007022620676
Welcome to our channel where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos. Was there anything we missed?
(For Business Enquiries) contact@theaigrid.com
#LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience #IntelligentSystems #Automation #TechInnovation

Contents (3 segments)

Segment 1 (00:00 - 05:00)

After an insane week of AI releases, we have another AI release, which is VideoGPT. Take a look at this: you can see right here that this is VideoGPT, essentially, where you're able to chat with a video. So let's take a look at one of the first demos released by the company that released this, and then after that I'm going to show you exactly how you can use VideoGPT and show you exactly its capabilities.

So let's take a look at OpenGVLab, which released this video demo. You can see right here that this is the user, and then this is of course the robot. So they say, "In this video there's a person eating leaves. Why?" Okay, now this is the video on the left-hand side, which is clearly of a panda that is eating leaves, and of course it's not a person. So we can gauge the AI's understanding by its response, and when it says, "The video does not contain any person eating leaves. It only contains a giant panda sitting on the ground eating bamboo," we can clearly understand that this AI isn't able to be tricked when it comes to visualizing exactly what's in the image and identifying certain objects. It can see that this is a panda, that it's sitting on the ground, and that it's eating bamboo, so this shows that it knows exactly what's going on in this image.

Then of course it asks, "How is the panda feeling?" and the answer is, "The panda in the video appears to be content and relaxed as it sits on the ground eating bamboo," which is very advanced. I mean, some people wouldn't understand how crazy that is, because gauging emotions is definitely a high-level skill, and you do have to have a certain level of emotional intellect to be able to understand what emotion someone else is feeling, whether it is a human or an animal. So showing this capability so early on in this development is definitely very surprising.

Then of course we have some more questions, where it says, "Why does the panda raise its hand?" Now of course this is something that I'm not sure about. It says, "In the video, the panda appears to be raising its hand to reach for more bamboo," and that's what we can see on the left-hand side. "Can it eat out all of the bamboo?" And then of course we have a response here.

Now of course we have another video here, where it says, "Is there anyone in the video? What gender?" and it says, "Yes, there is someone in the video. The person is a woman." "Where is she? What is she doing?" "The woman is on a roof and she is doing yoga." I mean, it might seem pretty basic, but considering how far we've come in such a short space of time, it's honestly just shocking how quickly this stuff is moving. So yeah, it's a woman on the roof doing yoga, which is of course what you can see right here. And then, "How do you know it's on the roof?" "I can tell because in the background of the video the sky is visible and there is a large green tree in the background." So that is also something the AI is able to see: that she is on the roof, that she's not on the ground. That is a key piece of information. So I'm also wondering if this AI technology is going to be able to be translated into, you know, self-driving cars and many different applications. I mean, there's honestly a vast number of ways that this can be translated, but let's continue looking at it.

"What is she wearing?" "She's wearing a black shirt and pants and a black swimsuit," which is of course cool. "When was the video taken?" As you guys can see, they're also trying to get more information out of the video. It says the time of the video is not known, which is of course true, and then that it's likely taken during the day. So it's clear that this AI also has some deductive reasoning skills, which is going to be very useful. I mean, imagine you're trying to classify a large amount of videos someone sent you. Imagine you're trying to watch some CCTV footage, there's a large amount of it, and you're trying to look for something specific. You can literally get an AI and be like, "Hey, find out when there was a man in a red shirt in this footage," and it could literally identify it. So it's crazy.

So it says, "What is the elephant doing here?" and we have this example, and if you don't know about this, it's a pretty insane example of an elephant painting itself. The AI just simply knows that the elephant is using its trunk to paint a picture of a baby elephant on an easel. And then it says, "Are you sure? I think it's drawing a kettle," and the answer is that the painting may not be perfect, but it's actually an elephant. So this goes to show that this AI is not just one of those AIs that you can gaslight into thinking that, hey, this is an apple, this is a kettle. Many large language models that we have used before, such as Bing and ChatGPT in the early days, we could actually gaslight, and even some models that are being released now we can trick into believing certain things which aren't true. But what I find interesting is that this AI is able to easily identify exactly what's in the image. So it's very, very interesting, guys. This is honestly crazy.

Okay, this is Ask Anything. Now, you can actually use this. Ask Anything with GPT is available, and it is free, so you can actually demo this, and I'm going to show you exactly how you can use it. Essentially, just click the link in the description, then you can paste your OpenAI key here. I'm going to go ahead and do that, and then I'm going to show you guys what it's like when you have a video.

So I've gone ahead now and uploaded this video, and this is the input video. When you see this video, we're going to ask the AI a bunch of questions. I downloaded this video from Storyblocks. This is just simply a video of a young boy kicking a football in a plain grass field. And he kicks it. Let's just count how many times he kicks it: once, then I think he's about to kick it twice, maybe. I want to see if the AI is able to identify how many times he kicks the ball as well. So just two times, and then of course the video stops. So if we click "Watch it," you're going to see right here that this is

Segment 2 (05:00 - 10:00)

where there is the queue, and of course this stuff is loading, and then you're going to be able to chat with the video. So we just have to give it a moment, and of course down here are some more examples that you can use as well. Now that this is loading, depending on how long it takes, we should get this. I've already done this with this video, so I know that it's actually pretty smart, but it's still very interesting to see that this AI can process these videos. I'm not sure why it's taking a while; I think maybe now that the tool is getting a bit more popular, that's why we're having these loading times. So now it is loading the video, at 20, and this is nearing its completion, so we just have to wait because it's now understanding the video.

And now it says "kicking soccer ball," and then, "You uploaded a video about a young boy playing soccer in the field of grass in the sun. Click the button to chat." So all you need to do now is click "Let's chat," and then you can see right here it just simply loads in with, "I've watched the video, let's chat."

So let's just ask: "What type of ball is the boy kicking?" We all know that it's a soccer ball, so I'm just asking this. "The boy is kicking a black and white soccer ball." That's very true. Okay, and then let's ask, "What color are his shoes?" His shoes are white in this, as you can see right here. "He's wearing white soccer cleats with blue stripes." And that is actually true. I didn't actually see that those are blue. I mean, I wasn't really able to identify blue stripes on the shoe, but there might actually be some, because of course this is an advanced AI and I'm just a human trying to understand.

Next I'm going to ask, "How many times does he kick the ball?" because I think this is a different kind of question to ask: it's mainly about understanding exactly how many times something has happened. He kicks it only two times, but it should be able to understand that. "The boy kicks the ball several times throughout the video." So you can see right here, this is obviously a more difficult question. He does kick it twice, but this is still pretty accurate, so I'm guessing that it does struggle with that.

I'm thinking of some other questions that we can ask, like, "What time of day is it?" and it should be able to give us a prompt answer. And yeah, this is honestly kind of mind-blowing software. It says, "It is daytime in the video, with the sun shining in the background." Okay, and let me put, "What is the boy wearing?" and I'm also going to ask again for an exact number of times. "Blue and green hooded jacket and black pants." That is true.

Now I'm going to say, "Give me an exact number of times the boy kicked the ball." "Four times in the video." I'm not sure that's right. Four times? There's one kick... I don't think that's four times. So I think the AI did get that wrong, because he only kicks it twice. I'm guessing there might just be some misinterpretation there, but like I said, more difficult questions are going to require, I guess you could say, a deeper level of understanding. I'm not exactly sure how this works. I didn't read the research paper just yet; that is something that I will look into. But I do think that this is definitely a very advanced piece of software.

Now I'm going to load in another video just to show you exactly how this works, and then we're going to see if this is able to identify what's going on in this video. So I'm going to click "Watch this," and this is of course some guys playing baseball, and they're just simply at, I guess, a practice facility. I'm not even too sure. I mean, this is why this AI tool is good, because there are going to be some videos where you don't understand exactly what's going on, and the AI tool is going to have a more vast knowledge of experiences and things that it's seen, especially since it's going to have pretty much seen every video on the internet. So it's going to know about almost anything, and it's definitely going to be really helpful.

So we're going to say, "How many times does he hit the ball?" That's always a hard question to ask it. "He hits the ball twice in the video." So I think this one it does get right. There's one hit, and then if we come up again, that is two hits. So yeah, this one was a little bit more distinctive, so I'm pretty sure the AI was able to get it right. Then I'm going to ask, "What is he wearing?" and of course it should be able to describe a gray shirt with black shorts. "Blue and green hooded jacket and black pants." I don't think that is actually true. It isn't able to distinguish that this is actually a gray jacket, or a gray shirt, with like a cap. But yeah, it definitely is very interesting.

This next one was a bit more interesting, because I decided to test this software's capabilities to see if it understood something that isn't usually understood. I mean, what we have is a woman wearing, I guess you could say, hand wraps, and she is essentially shadow boxing. So you can see right here that the, I guess, keyword that it gets from this is "punching bag," and then it says, "You uploaded a video about a young woman with red boxing glove standing in a dark room. Click the video button to chat." Let's ask, "What action is she doing?" and let's see if it's able to identify that this is actually a technique called shadow boxing. And it says, "This woman is playing with the ball." So I'm guessing that it's only able to see what is going on if the footage is essentially bright and the person is performing a relatively normal action, because I think how this works is that of course there are many different

Segment 3 (10:00 - 10:30)

things working together. As you can see right here, it's working together with ChatGPT, action recognition, and visual captioning. So when you have a lot of different pieces of software working together, what tends to happen sometimes is that certain things are essentially lost as they pass from one application to another, or from one large language model to another. So I'm guessing that's what happened here. But nonetheless, this is still a very impressive demo, considering that only recently, you know, it feels like last month, we just had GPT-4, and now we're getting software like this.
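
The guess above, that detail gets lost when separate components hand text to one another, can be illustrated with a minimal Python sketch. This is not the Ask-Anything code: `VideoSummary` and `build_prompt` are hypothetical names made up for illustration, and the only assumption is the one stated in the video, namely that an action-recognition model and a visual-captioning model produce text that is then given to a chat model. The example values are the keyword and caption shown for the shadow-boxing clip.

```python
from dataclasses import dataclass, field


@dataclass
class VideoSummary:
    """Text-only description of a video (hypothetical structure)."""
    action_label: str   # assumed output of an action-recognition model
    caption: str        # assumed output of a visual-captioning model
    frame_captions: list[str] = field(default_factory=list)


def build_prompt(summary: VideoSummary, question: str) -> str:
    """Assemble the text a chat model would receive; it never sees the pixels."""
    lines = [
        "You are answering questions about a video.",
        f"Detected action: {summary.action_label}",
        f"Overall caption: {summary.caption}",
    ]
    for i, cap in enumerate(summary.frame_captions):
        lines.append(f"Frame {i}: {cap}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Keyword and caption as shown in the video for the shadow-boxing clip.
summary = VideoSummary(
    action_label="punching bag",
    caption="a young woman with red boxing glove standing in a dark room",
)
prompt = build_prompt(summary, "What action is she doing?")
print(prompt)
```

Under these assumptions, nothing in the assembled prompt mentions shadow boxing, so a chat model working only from this text has no way to recover it. Any mistake made by an upstream component is carried straight into the answer.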
