
Nvidia's NEW 'AI Perfusion' Takes the Industry By STORM! (NOW ANNOUNCED!)

TheAIGRID · 13.05.2023 · 78,251 views · 1,424 likes


Video description
Nvidia's NEW PERFUSION Takes the Industry By STORM! (NOW RELEASED!)
Simulated Tennis: https://www.youtube.com/watch?v=ZZVKrNs7_mk
Nvidia Blog Post: https://blogs.nvidia.com/blog/2023/05/02/graphics-research-advances-generative-ai-next-frontier/?=&linkId=100000200652087
Nvidia Texture Compression: https://research.nvidia.com/labs/rtr/neural_texture_compression/
Nvidia 3D Images From One Image: https://research.nvidia.com/labs/nxp/lp3d/
SIGGRAPH Event: https://s2023.siggraph.org/register/
One thing we did miss was this: https://research.nvidia.com/labs/rtr/neural_appearance_models/
Welcome to our channel, where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos. Was there anything we missed?
(For Business Enquiries) contact@theaigrid.com
#LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience #IntelligentSystems #Automation #TechInnovation

Contents (5 segments)

Intro

Nvidia just released around 20 different research papers detailing the next advancements in generative AI and how they are going to impact us. In this video we're going to cover four of the key ones and why they are so interesting, so stay tuned, because honestly you're going to be surprised at what Nvidia has been able to do, especially with these advancements in AI.

Tennis

One of Nvidia's first groundbreaking papers is about learning physically simulated tennis skills from broadcast videos. In the video and the research paper, they show that it is now possible to take broadcast footage, perhaps an NBA game, perhaps just a simple tennis match, and use it to mimic the players' skills and bodily movements and map them accurately onto a 3D character. You have to understand that this isn't just some fake motion-capture data; it is really accurate, really precise, really good data that is then applied to the 3D-rendered character you can see.

We know that motion capture is very expensive to produce at scale, and some companies that want to use it simply cannot, because it is outside their budget. Motion-capture data is also usually quite large, which brings its own issues. And the more real recorded footage you have, for example games of athletes actually playing their sport, the more natural the result will look compared with simply recording motion-capture footage in a studio. What this paper and the video (linked in the description) explain is how they extract that data from the footage and translate it onto a 3D character that accurately reproduces exactly what was done on screen, to the point that the characters can hit the ball into several different locations at specific times and place it in a particular spot on demand. This is clearly refined work; it doesn't feel basic, unlike many other projects out there, and the system they've developed is very good at smoothing out jagged movements. They describe the various issues that come with this kind of data and how they managed to fix them.

Some of you might be wondering what other tools like this exist. One that comes to mind is Wonder Dynamics, which recently released something quite similar that essentially tries to remove the need for mocap animation. It's very interesting, there are many demos you can see online, and it's in a beta stage where you can sign up for the waitlist, so this is another part of the motion-capture space being disrupted by AI. But I have to be honest: this research paper by Nvidia shows a very fine-tuned version of what Wonder Dynamics is trying to do, and if it is fully released at scale it will most certainly change the way motion is captured, especially in video games and in any other industry that needs this kind of data.

Now, like I said, Nvidia has even more coming, and this next one, I have to be honest with you, is very impressive.
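Before moving on, here is a minimal, hedged sketch of the first stage any video-to-animation pipeline like this needs: extracting per-frame body poses from ordinary broadcast footage. It uses the open-source MediaPipe Pose estimator rather than Nvidia's method, and the file name and downstream use of the keypoints are assumptions for illustration only.

```python
# Minimal sketch: pull per-frame pose keypoints from broadcast footage with MediaPipe.
# NOT Nvidia's pipeline; "broadcast_clip.mp4" is a placeholder, and a real system would
# retarget these keypoints onto a physically simulated 3D character.
import cv2
import mediapipe as mp

cap = cv2.VideoCapture("broadcast_clip.mp4")
keypoints_per_frame = []

with mp.solutions.pose.Pose(static_image_mode=False, model_complexity=1) as pose:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB frames; OpenCV decodes video as BGR
        result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.pose_landmarks:
            # each landmark carries normalized x, y, a rough depth z, and a visibility score
            keypoints_per_frame.append(
                [(lm.x, lm.y, lm.z, lm.visibility) for lm in result.pose_landmarks.landmark]
            )

cap.release()
print(f"extracted poses for {len(keypoints_per_frame)} frames")
```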

Perfusion

Nvidia titles this one "Key-Locked Rank One Editing for Text-to-Image Personalization", and they call the method Perfusion. In their words, Perfusion is a new text-to-image personalization method with only a 100 KB model size, trained in roughly four minutes, that can creatively portray personalized objects while allowing significant changes in their appearance. Trust me when I say you're about to be shocked at how good it is at learning a personalized concept from sometimes just one image and then letting you restyle it with a text prompt. Let's look at some of the examples, because this is by far one of the most game-changing things I've seen in AI image generation.

In the first example, Perfusion generates eight appealing images in several seconds while transforming the nature of the scene: it takes the original table and renders it covered in snow. You can see this isn't just fluff; the result is genuinely high quality and legible. This is why I believe it could end up more widely used than applications like Midjourney: personalization is what really drives the usefulness of these models, because if you can personalize something, it has far more practical value. There is also a large degree of consistency across the images, which is something a lot of people have wanted from Midjourney for quite some time, and Perfusion does it very well.

There are other incredible examples where you train on the images on the left-hand side, the teddy and the teapot, and then transform each single concept into a new image, and the results look very good. If you look closer, though, they've also combined two trained concepts: on the right-hand side, under "inference, combined concepts", there is a teddy sitting by the fire with the teapot, and a teddy sailing on a teapot in a lake. That's really cool, because it has many different applications, and I'm pretty sure Adobe will be scrambling to get this kind of capability embedded into Firefly; the only comparable thing I've seen is DreamBooth, and it doesn't reach this level of consistency and quality yet, so this is honestly truly groundbreaking work from Nvidia. Some of the other examples are great too: a teddy dressed in a blue suit looking at a gourmet meal, which looks very accurate, and a dog wearing a sombrero, another accurate output.

One thing many people are wondering is how this compares against the other models that do exactly this. The paper shows other methods handling the same prompts, and I think it's clear that Nvidia's model is simply the best; even though several prompts were used, the others just don't seem to get the task, and they fail, not spectacularly, but Nvidia is that one step ahead.

This is what I really wanted to show you, because I think one-shot personalization is where the future of AI image generation is headed. Some people might only have one image of something, they want consistency of that subject, and they want to manipulate it further, and with Nvidia's Perfusion that is exactly what you get. It's truly game changing. Perfusion is one of the models I think they're going to talk more about at the event, and it's definitely something to pay attention to, because once it gets embedded into their Nvidia Picasso cloud service, we could see it rolled out into many other applications, since Nvidia is going to let many different companies and software products build on Picasso. Definitely something to look out for.
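To give a rough sense of why a personalized concept fits in roughly 100 KB, here is a minimal, hedged PyTorch sketch of the core rank-one idea: adding a single outer-product update to a frozen cross-attention projection. The dimensions, names (`W_v`, `k_concept`, `v_concept`, `lam`), and scaling are illustrative assumptions, not Nvidia's actual code.

```python
import torch

# Hedged sketch of a "rank-one edit" to a cross-attention value projection, in the spirit
# of Key-Locked Rank One Editing. All sizes and names below are illustrative assumptions.
d_text, d_attn = 768, 320            # Stable-Diffusion-like text and attention dimensions
W_v = torch.randn(d_attn, d_text)    # frozen value-projection weight of one cross-attn layer

k_concept = torch.randn(d_text)      # text-embedding direction for the new concept ("my teddy")
v_concept = torch.randn(d_attn)      # learned output direction that renders that concept

# Only this outer-product update is stored per concept, which is why the edit is tiny.
lam = 1.0
W_v_edited = W_v + lam * torch.outer(v_concept, k_concept) / k_concept.dot(k_concept)

def project_values(token_embeddings: torch.Tensor) -> torch.Tensor:
    """Apply the edited value projection to a batch of token embeddings (B, T, d_text)."""
    return token_embeddings @ W_v_edited.T

values = project_values(torch.randn(2, 77, d_text))   # (2, 77, d_attn)
```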

Live 3D Portraits

This one is very interesting and once again shows where Nvidia leads the way: live 3D portraits, from the paper "Real-Time Radiance Fields for Single-Image Portrait View Synthesis", or in simpler terms, a one-shot method to infer and render photorealistic 3D representations from a single image in real time. They can take one image of a person and produce a photorealistic 3D representation of it. You can see the inputs versus the outputs, and if you haven't seen the competition's level of detail, you may not appreciate why this is truly groundbreaking: inferring 3D structure from a single image is very hard, because there are many complex things at stake when deciding what goes where, and it is easy to get wrong. Nvidia has mastered it, and the paper and accompanying video go over in detail how they managed it and the techniques they used. It's honestly fascinating, because a couple of years ago you wouldn't have expected this level of accuracy, and as AI continues to develop at this rapid pace we're going to see innovations we didn't think possible appear at a ridiculous rate.

What I found really interesting is the real-time effect: compared with the input, the output novel view gives you a completely different angle on what the speaker is saying. There is a driving video, an input single image, and then the 2D talking head versus the 3D lifting produced with their method. In some respects this is comparable to D-ID, where an image is converted into a moving, talking head, but this is on a completely different level. You can see Nvidia's result at the top left compared with the others, and it is clearly more accurate and more detailed, clearly superior to anything else currently being developed. Hats off to Nvidia, because with every blog post and research paper of theirs that I read, compared with papers from just a couple of weeks before, we're seeing them push the boundaries of what's possible.

What's also cool is the live demo of someone using a phone: RGB video to 3D in real time. Of course, it's running on an RTX 4090, one of their top-tier graphics cards, but it shows the potential applications. Imagine wanting to see different angles of someone in a more 3D way; this could make video calls more lifelike and more realistic. I'm not exactly sure what the applications will be, and right now that's pure speculation, but being able to generate this amount of depth and detail from a simple 2D image is truly incredible.
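For intuition on the "single image in, radiance field out" pattern that methods like this build on, here is a hedged PyTorch sketch of lifting an image to triplane features and decoding sampled 3D points to density and color. The architecture, layer sizes, and names are placeholder assumptions, not Nvidia's actual model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriplaneLifter(nn.Module):
    """Toy single-image-to-triplane model; all sizes are illustrative assumptions."""

    def __init__(self, feat_ch: int = 32):
        super().__init__()
        self.feat_ch = feat_ch
        # tiny image encoder: RGB portrait -> three stacked feature planes (XY, XZ, YZ)
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3 * feat_ch, 3, stride=2, padding=1),
        )
        # tiny MLP: concatenated triplane features -> (density, R, G, B)
        self.decoder = nn.Sequential(nn.Linear(3 * feat_ch, 64), nn.ReLU(), nn.Linear(64, 4))

    def forward(self, image: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W); points: (B, N, 3) query positions in [-1, 1]^3
        planes = self.encoder(image)
        planes = planes.view(image.shape[0], 3, self.feat_ch, *planes.shape[-2:])
        feats = []
        for i, dims in enumerate([(0, 1), (0, 2), (1, 2)]):   # project onto XY, XZ, YZ
            uv = points[..., dims].unsqueeze(1)                # (B, 1, N, 2)
            sampled = F.grid_sample(planes[:, i], uv, align_corners=True)  # (B, C, 1, N)
            feats.append(sampled.squeeze(2).transpose(1, 2))   # (B, N, C)
        return self.decoder(torch.cat(feats, dim=-1))          # (B, N, 4)

model = TriplaneLifter()
out = model(torch.randn(1, 3, 256, 256), torch.rand(1, 4096, 3) * 2 - 1)
```

A full novel-view renderer would then march camera rays through the volume and composite these density and color samples, which is the part that has to run in real time.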
The next one is really cool. If you're someone who plays games, you know hair is something games really struggle with, because of the complex computations needed to calculate how each hair strand moves. Nvidia has come up with a solution: in this new paper they describe a method that can simulate tens of thousands of hairs in high resolution and in real time using neural physics, an AI technique that teaches a neural network to predict how an object would move in the real world. Essentially, they use neural networks to learn how the hair should behave, and it runs in real time, so it will be really cool to see how it plays out in an actual game.
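Here is a hedged sketch of what such "neural physics" for hair could look like: a small network takes the current strand state plus the head's motion and predicts the strand positions for the next frame, so tens of thousands of strands become a single batched forward pass. The strand discretization, input features, and layer sizes are assumptions for illustration, not Nvidia's method.

```python
import torch
import torch.nn as nn

POINTS_PER_STRAND = 16   # assumed discretization: 16 vertices per hair strand

class HairStepPredictor(nn.Module):
    """Predicts next-frame strand positions instead of running a full physics solve."""

    def __init__(self):
        super().__init__()
        in_dim = POINTS_PER_STRAND * 3 * 2 + 6   # positions + velocities + head lin./ang. velocity
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, POINTS_PER_STRAND * 3),
        )

    def forward(self, positions, velocities, head_motion):
        # positions, velocities: (num_strands, POINTS_PER_STRAND, 3); head_motion: (6,)
        n = positions.shape[0]
        x = torch.cat([positions.reshape(n, -1),
                       velocities.reshape(n, -1),
                       head_motion.expand(n, -1)], dim=-1)
        return self.net(x).reshape(n, POINTS_PER_STRAND, 3)

# tens of thousands of strands in one batched call per frame
model = HairStepPredictor()
pos = torch.zeros(20_000, POINTS_PER_STRAND, 3)
next_pos = model(pos, torch.zeros_like(pos), torch.zeros(6))
```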

Neural Textures

Then of course we have the neural textures compared with block-compressed (BCx) textures. Here, neural networks are used to compress textures that are usually quite large, with no additional cost in GPU memory: both versions are around three megabytes in file size, yet the one on the left, Nvidia's new neural texture compression, delivers up to 16 times more texture detail at the same file size. That's very interesting, because it means you get higher quality within the same memory budget, which is genuinely game changing for Nvidia.

SIGGRAPH, the annual conference on computer graphics, is happening in August, and that's where they're going to present all of these ideas, especially the ones connected with AI. It will be interesting to see what else Nvidia has to talk about, especially as they expand on some of the topics we covered today.
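As a final illustration, here is a hedged sketch of the neural-texture idea: store a small learned latent grid plus a tiny MLP decoder, and reconstruct texels on demand instead of storing a full block-compressed texture. Resolutions, channel counts, and names are illustrative assumptions, not Nvidia's compression format.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTexture(nn.Module):
    """Toy neural texture: low-res latent grid + tiny decoder; sizes are assumptions."""

    def __init__(self, latent_res: int = 256, latent_ch: int = 8):
        super().__init__()
        # learned latent grid, far smaller than a full-resolution RGBA mip chain
        self.latents = nn.Parameter(torch.randn(1, latent_ch, latent_res, latent_res) * 0.01)
        self.decoder = nn.Sequential(nn.Linear(latent_ch, 32), nn.ReLU(), nn.Linear(32, 3))

    def forward(self, uv: torch.Tensor) -> torch.Tensor:
        # uv: (N, 2) texture coordinates in [0, 1]
        grid = uv.view(1, 1, -1, 2) * 2 - 1                            # grid_sample expects [-1, 1]
        feats = F.grid_sample(self.latents, grid, align_corners=True)  # (1, C, 1, N)
        feats = feats.squeeze(0).squeeze(1).transpose(0, 1)            # (N, C)
        return torch.sigmoid(self.decoder(feats))                      # (N, 3) reconstructed texels

tex = NeuralTexture()
rgb = tex(torch.rand(1024, 2))   # decode 1024 texture samples
```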
