Mind-Blowing AI Turns ANY Image Into 3D Worlds! (Worldlabs Spatial Intelligence)
9:32

TheAIGRID 03.12.2024 17,183 views 492 likes

Video description
Prepare for AGI with me - https://www.skool.com/postagiprepardness
🐤 Follow me on Twitter https://twitter.com/TheAiGrid
🌐 Check out my website - https://theaigrid.com/

Links from today's video: https://www.worldlabs.ai/blog

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For business enquiries) contact@theaigrid.com

Music used
LEMMiNO - Cipher https://www.youtube.com/watch?v=b0q5PR1xpA0 CC BY-SA 4.0
LEMMiNO - Encounters https://www.youtube.com/watch?v=xdwWCl_5x2s

#LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience

Table of contents (2 segments)

Segment 1 (00:00 - 05:00)

So World Labs is a spatial intelligence AI company that is building large world models to perceive, generate, and interact with the 3D world. This company aims to lift AI models from the 2D plane of pixels to full 3D worlds, both virtual and real, endowing them with spatial intelligence as rich as our own, and they want to give AI this ability in the near term. The company was founded by visionary AI pioneer Fei-Fei Li along with Justin Johnson, Christoph Lassner, and Ben Mildenhall, each a world-renowned technologist in computer vision and graphics. And today they announced their newest AI system, where you can generate 3D worlds all from a single image.

We can take a look at this tweet from World Labs. They say that World Labs aims to address the challenges many creators face with existing generative AI models, which is of course a lack of control and consistency. If you've ever messed around with AI image generators, you know exactly what they're talking about here. Sometimes you get really good results and other times it's just way off, and getting it to match exactly what you want is basically like playing the lottery: you don't know what you're going to get.

This is where World Labs state that they are doing something interesting. They explain that when you feed their system an image, it does several mind-blowing things. First, it estimates the 3D geometry, which means it's not just looking at your image as a flat picture but probably understanding the depth and the spatial relationships of everything in the scene. You know how your brain naturally understands that a table goes back in space and that a room has depth? This is basically what the AI is doing. Now it gets kind of wild, because this system doesn't just understand the parts of a scene that you can see. It actually fills in the parts that you can't see. So imagine taking a picture of your living room from one angle. Their AI can figure out what's behind you and what's around the corners, literally completing the entire space. And what they specifically mention is that you can turn around in these scenes, which suggests that this isn't just static: you can actually explore these AI-generated environments.

The cherry on top is that they're basically saying this works for pretty much any kind of scene or artistic style that you throw at it. So whether you're working on photorealistic architecture, a fantasy environment, a stylized scene, or anything in between, these systems can maintain consistency throughout the entire space. This is a big deal, because one of the major headaches with current AI tools is getting them to maintain a consistent style or look across different generations. And for you creators out there, it could be an absolute game changer, because you can take a single concept art piece and turn it into a fully explorable 3D environment, or take a photo of a real location and move around in it freely, seeing angles that weren't even in the original shot. So this is pretty crazy, because it's got a variety of different applications: game design, virtual production, architecture visualization, virtual reality. It's pretty mind-blowing when you think about all of those scenarios.

World Labs goes on to state that "our 3D scenes can be rendered in real time in the browser with full camera control," and this is a really big deal. When they say real time in the browser, they're talking about being able to instantly explore AI-generated environments right from your web browser. That means there's no need for any fancy or high-end software. This isn't waiting for an AI to generate individual images; this is smooth, instant movement, just like you'd expect in a video game. And speaking of camera control, they specifically mention that you can explore these scenes with a freely moving camera, like in a video game. That means you're not stuck with preset angles and limited movement. You can literally fly around the space however you want, look at things from any angle, get up close to any details, or pull back for a wide shot.

Now here's where it gets really interesting for filmmakers. They mention that you can simulate 3D camera effects like shallow depth of field or even a dolly zoom. If you're into cinematography, you know that these are some of the most powerful tools for creating mood and impact in your shots. Shallow depth of field is that beautiful blur effect where your subject is sharp while the background is smoothly blurred out. A dolly zoom, also known as the Vertigo effect, is that mind-bending shot where the background seems to stretch or compress while your subject stays the same size. The fact that you can play with these professional camera techniques in real time in an AI-generated environment is pretty revolutionary for content creators and filmmakers.

The next tweet really gets into the technical magic of why World Labs' approach is different. They point out that most generative models just predict pixels, which basically means that traditional AI image generators are creating a flat image pixel by pixel, like a digital painting. But World Labs is doing something fundamentally different: they're generating actual 3D scenes. So I need to break down for you guys why this is a big deal. When they say the scene won't change if you look away and come back, they're highlighting a common frustration with current AI tools. You know how if you generate an image with AI and then try to generate another angle of that same scene, it's going to look completely different? That's because pixel-based generation has no memory or understanding of the 3D space. It's

Segment 2 (05:00 - 09:00)

starting fresh each time. But with World Labs' approach, they're creating a consistent 3D environment that stays the same no matter how many times you look at it from different angles. And when they mention that it obeys the basic physical rules of 3D geometry, they're basically talking about how their scenes make spatial sense: things stay the right size as you move closer or further away, the shadows fall correctly, and objects maintain their proper relationships to each other in space.

They mention something called a depth map as the simplest way to visualize this. You can think of it like a heat map where every pixel's color shows how far away it is from the camera. The closer something is, the brighter (or darker, depending on the color scheme) it appears in the depth map, giving you a clear visualization of the 3D space. It might sound technical and boring, but it's really crucial for creating these realistic, explorable environments that don't break your brain when you move around in them.

This next part is where things get really interesting. World Labs says that generating consistent 3D geometry "allows us to interact with the scene in 3D-aware ways." When we break down what that means for creators, it means that when you're working with 3D scenes you can do a lot of things. One of the things they mention is changing the scene's lighting and appearance. Think about how in a video game or 3D software you can move lights around, change the time of day, or adjust the mood of a scene with different lighting setups. That's essentially what they are talking about here. You're not stuck with the original lighting from your input image; you can play with the lighting like a virtual photographer.

Then they talk about modifying the geometry, which means you can change the shape and structure of things in your scene. So if you want that wall to be a bit taller, or you need to adjust the size of the doorway, the crazy thing is that you can actually do that, because these are actual 3D objects with depth and dimension. It's not just an AI image made of pixels. They also talk about how you can insert other objects into the scene, and this is pretty huge, because it means you can add new elements to your environment that fit naturally into the space. The system understands depth and perspective, so when you drop in a new shiny object it'll look like it belongs there, with the correct lighting, the correct shadows, and the correct scale. This is miles ahead of photoshopping something into a flat image, because everything integrates naturally into the 3D space.

On their website they have a few different interesting things. One of them is this area where you're able to change the depth of field. You can see right here I can change it from near all the way to far, and this is a scene that I can move around in, which is really cool. So I can change the depth of field in a scene, and all of these things are really interesting and really creative because they allow you to explore 3D worlds in a much richer way. It's something that I really do like, because when we take a look at AI-generated images, sometimes we don't have the control over the complexity that we want. We also have different scenes where we're able to move the sliders to focus on different things. I think this kind of control is going to be very much needed if we're going to move to a world where we use these tools in a way that's really effective. You can also see here that if we look at these dolly zoom scenes, we can move the slider to wide or to narrow, and these scenes look more and more interesting. We can move it up, we can move it down. This is something that is really effective.

They also allow you to use the 3D scene structure to build interactive effects. When you click on a scene and use your keys to move around, you can have a bunch of different effects. One of them is sonar, which is basically, I guess you could say, echolocation, like a bat. You can see that it manages to ripple across all the textures. This is something that you couldn't really do with pixel-based generative AI, but with a real 3D scene you're able to do it really effectively. If I click over there, you can see it manages to ripple across all those surfaces right there. It's just really accurate. Then of course we've got the spotlight, where we're able to add lighting to a scene. If I want to add light to the roof or the ground, I can add that light over there, which is really interesting. You can also build effects that passively animate the scene, for example a color wave, which you can see right here. Then there's waves. I think this one is the coolest, because you can see the scene ripples in this really weird 3D way, but it's something that I really do like.

And lastly, they show us how you can step into paintings. I'm not familiar with all of these paintings, I do apologize, but you can step into them and explore them. I do think that in the future this is going to get a lot better, which means it's going to be crazy when we're able to explore completely 3D virtual worlds. I think this kind of thing has incredible applications for VR, because imagine inputting a text prompt and exploring that virtual world in your entire home.
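
To make the sonar effect from this segment more concrete, here is a minimal sketch of why it needs real 3D geometry. This is not World Labs' implementation, which the video does not describe. It assumes an ideal pinhole camera with a known horizontal field of view, and the function names `unproject` and `sonar_ring` are hypothetical. The idea is to turn a depth map into 3D points, then brighten every surface point whose 3D distance from the clicked point matches how far the pulse has travelled.

```python
import math


def unproject(depth, fov_deg):
    """Turn a depth map (rows of z-distances from the camera) into 3D points.

    Assumes an ideal pinhole camera with horizontal field of view `fov_deg`
    looking down the +z axis; this camera model is an assumption.
    """
    height, width = len(depth), len(depth[0])
    # Focal length in pixels for the given horizontal field of view.
    focal = (width / 2) / math.tan(math.radians(fov_deg) / 2)
    cx, cy = (width - 1) / 2, (height - 1) / 2
    return [
        [((x - cx) * z / focal, (y - cy) * z / focal, z) for x, z in enumerate(row)]
        for y, row in enumerate(depth)
    ]


def sonar_ring(points, click, radius, width):
    """Per-pixel brightness in [0, 1] for a pulse that has travelled `radius`.

    `click` is the (column, row) of the clicked pixel. Brightness is 1 where a
    surface point lies exactly `radius` from the clicked point in 3D and fades
    linearly to 0 at `width` scene units away from the ring.
    """
    col, row = click
    origin = points[row][col]
    return [
        [max(0.0, 1.0 - abs(math.dist(p, origin) - radius) / width) for p in line]
        for line in points
    ]


# A flat wall 2 units away, seen through a 90-degree lens, with one pixel
# (row 2, column 3) belonging to something much farther back.
depth = [[2.0] * 5 for _ in range(5)]
depth[2][3] = 10.0
points = unproject(depth, fov_deg=90)
frame = sonar_ring(points, click=(2, 2), radius=1.6, width=0.5)
```

In this toy scene the pulse lights up the wall pixels that sit about 1.6 units from the click, while the pixel right next to the click stays dark because its surface is far away in depth. A flat image has no way to make that distinction, which is the point the video makes about effects that ripple across surfaces.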
