AI Realism Revolution & More AI Use Cases
Duration: 19:31

The AI Advantage · 06.09.2024 · 11,893 views · 571 likes · updated 18.02.2026
Video Description
Get started with Stack AI today for free: https://dub.sh/ai-advantage | Send an email to specialdeals@stack-ai.com mentioning The AI Advantage to get 20% off a paid account! Today, we'll be focusing on how incredibly realistic AI image and video tools are getting. This week we got new updates to Luma Labs Dream Machine, Runway Gen-3, Playground AI and more. I'll break it all down for you in this video. Links: https://hailuoai.com/ https://x.com/dAAAb/status/1813159246646943879 https://playground.com/ https://x.com/runwayml/status/1829591480664768993 https://x.com/javilopen https://lumalabs.ai/dream-machine https://reflux.replicate.dev/ https://github.com/replicate/reflux?tab=readme-ov-file#readme https://huggingface.co/spaces/Qwen/Qwen2-VL https://github.com/QwenLM/Qwen2-VL https://x.com/AnthropicAI/status/1831348822775042374 https://youtu.be/r7R5IAveG3g?si=NxtEMQ3D8t2_pUE3 Chapters: 0:00 What’s New? 1:02 Playground 3:42 Stack AI 7:00 Minimax 10:28 LumaLab’s Camera Motion 11:34 Runway’s New Feature 13:51 AI Use Case in Real World 14:36 ReFlux 17:05 Qwen2 18:00 Claude for Enterprise #ai #news This video is sponsored by Stack AI. Free AI Resources: 🔑 Get My Free ChatGPT Templates: https://myaiadvantage.com/newsletter 🌟 Receive Tailored AI Prompts + Workflows: https://v82nacfupwr.typeform.com/to/cINgYlm0 👑 Explore Curated AI Tool Rankings: https://community.myaiadvantage.com/c/ai-app-ranking/ 🐦 Twitter: https://twitter.com/TheAIAdvantage 📸 Instagram: https://www.instagram.com/ai.advantage/ Premium Options: 🎓 Join the AI Advantage Courses + Community: https://myaiadvantage.com/community 🛒 Discover Work Focused Presets in the Shop: https://shop.myaiadvantage.com/

Table of Contents (10 segments)

  1. 0:00 What’s New? (229 words)
  2. 1:02 Playground (649 words)
  3. 3:42 Stack AI (734 words)
  4. 7:00 Minimax (798 words)
  5. 10:28 LumaLab’s Camera Motion (254 words)
  6. 11:34 Runway’s New Feature (554 words)
  7. 13:51 AI Use Case in Real World (180 words)
  8. 14:36 ReFlux (567 words)
  9. 17:05 Qwen2 (210 words)
  10. 18:00 Claude for Enterprise (340 words)
0:00

What’s New?

So the generative AI conversation, ever since the release of ChatGPT, has been mostly about LLMs and their capabilities. Over time the conversation shifted to some of the tooling available within them, like image recognition, code generation and others, but over the past month that area has become a little stagnant, and we're now seeing a veritable renaissance for all the AI imaging and video generation tools. That really is the story of this week, and I find it so exciting that we're finally getting tools that are not just hyper-realistic and customizable to your own images and styles, but also easy to use, with real-world use cases. For example, there's a brand new tool that is free to use for 10 images: you just select from a list of use cases, and with two clicks it generates logos, posters and marketing materials for you. Or there's a brand new Chinese video model that is also free to use, and some of the outputs are literally Sora level; this thing is free and usable today. Beyond that, there are various new features, an LLM that can actually take 20 minutes of video as an input while being fully open source, and so many more innovations in this week's episode of AI news that you can use.
1:02

Playground

Playground over here has a super simplistic interface, and I guess it's trying to be Canva AI, but simpler, without all the extra bells and whistles that Canva has. This does one thing: helping you design things with the power of AI and a super simple interface. It's free to try, you get 10 generations right now, and you can just go to the website and start designing right away. As you can see, you get the simple interface and you can create all of these different assets, and I think this is where AI imaging starts getting really exciting. As you might know, it's only been a few weeks since text inside of AI images got fully unlocked; before that we had one model that kind of worked, but it wasn't state of the art. Now there are multiple models that can do text, and that's why we see these user-friendly applications like Playground pop up, where it's becoming a design studio that is really intuitive to create in. Now, is this going to allow you to do the most advanced things possible with AI image generation? No, probably not; it gives up some features in favor of user-friendliness.

But you just go in here and pick what you want to create. Let's say we want to do a poster: you pick one of the styles or something you like, let's go with Mario over here, and look at this, all I need is a Google login and I can start creating 10 images right away. And check this out: you can prompt as you would inside of ChatGPT, on top of AI visuals, and it just does it for you. "Change the poster to Luigi and adjust the reward to 20,000 coins." Let's see what this comes up with on our first generation; we won't cut to a different result, I just want to see what Playground gets us. And there you go: 20,000 coins, and it changed it to Luigi. How about we try to create a logo with it? I think this could work really well. I really like the look of this one: "Change the text to AI Thunder and the logo to Zeus holding a lightning bolt, maintain the style." Quite straightforward prompting here; let's see what we get on our very first shot, and I've been told by my team that it's really good at maintaining styles. So look at that, it kind of works, doesn't it? Zeus holding a lightning bolt, "AI Thunder", really similar to the original style. I find this quite amazing, and I think if you have some basic prompting skills you'll have a lot of fun with this interface because it's so simple. You could also go ahead and apply a certain style in here, like so, and then you can download the image for free too; premium features are upscaling and background removal. But yeah, look at that, "AI Thunder", this actually looks quite good. I'll go ahead and download it, and there it is: no watermark, nothing. Right now you can just create things like this for free. Isn't that incredible? Two more notes: first, it has a feature where you can upload an image and start from your very own design that you might already have, which is fantastic, and then you can just talk to it and prompt on top of it; secondly, there is an iOS app, so you can actually use this on your phone. Honestly, I'm a fan of this, and these segments are never sponsored, I always disclose that clearly; I just think this is a super simple implementation that anybody can get some value out of. And yeah, there you go, you can just go ahead and try this, very powerful and super simple to use.
3:42

Stack AI

Next up, we're going to look at a bit more advanced use case: I'll be building an enterprise-grade Perplexity clone, your own custom AI-powered search engine that takes no coding skill to build and that you can deploy however you like, inside a chatbot, a web app and more. We'll be doing that with the sponsor of today's video, Stack AI. At this point I just want to say I'm so grateful for all the sponsorship opportunities we get, because it means we get to be really picky, and I get to pick the ones that have actual merit and enhance these videos; I really think this is one of those cases. What I'll be showing you here is inside a brand new account that you can create on a free trial. We'll be building this simple little Perplexity clone that takes an input, runs a Google search, uses your own documents, stores all of that in a vector database, and runs an LLM like ChatGPT on top of it to give you custom outputs, which you can then easily turn into your own ChatGPT-clone website, a chatbot like so, or a simple form that anybody can use, independent of their AI expertise. And we'll be doing that on the free 7-day trial that you can get here.

After creating an account, I'll create a new template. Basically we have an input and an output here; these will represent the two fields that the employees of our company, or any other users of this app, will be seeing. First I'll pull in the obvious one and grab this Google Search node from Data Loaders. Secondly, I'll close this up, open up Knowledge Bases, and pull in a Documents + Search step, which allows me to search over documents that I'll upload by clicking this button. I have all the welcome emails you get when you sign up for the AI Advantage newsletter; this is fantastic context for my search engine, so I'll add it to this document search node, like so. Beautiful, it's uploaded. Then I want to add two more things: first, we'll add a vector database, which is basically just a very efficient way of storing data so LLMs like ChatGPT can work with it; and last but certainly not least, we need an LLM to process all of this. You can pick from all the big model providers; I'll take GPT-4o mini for efficiency here. Now it's time to hook all of this up. The last thing we need to do is customize the prompt; I'm going to keep it very simple, just referencing the Google search results and the document search to define the tone and voice. That is it: we've built a personalized Perplexity. I can save this and add an input prompt; all I need to do is add an input here: "Write me a newsletter segment about upcoming AI events in Lisbon in 2024." I'll hit Run up here, and it will run a Google search, look at my document, and bring all of that into ChatGPT to produce an output over here. And there you go: here are the custom results using all the Google data, with my custom tone and the context from the document I uploaded. Now, obviously there are many more things you could add, and you could use different LLMs. Once you're done, you just go to Export, and here you instantly have a fully functional website that you can send to anybody on your team, or if you prefer this chat assistant interface, you can do that too: website, chatbot, hook it up to your WhatsApp, Slack and more. This is amazing, as you get to customize all of these for your exact needs. So again, this is an enterprise-facing solution, but you get a 7-day free trial with the first link in the description. You can also get 20% off a paid account: sign up for Stack AI, then send an email to specialdeals@stack-ai.com requesting 20% off and tell them the AI Advantage sent you. Thank you so much to Stack AI for sponsoring this video, and now let's get back to some more AI news that you can use.
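The flow described above (input → Google search → document search → vector store → LLM → output) can be sketched in plain Python. Everything below is a hypothetical stand-in for Stack AI's drag-and-drop nodes, not its actual API; the function names and the naive keyword match are illustrative assumptions.

```python
# Minimal sketch of the no-code pipeline described above, as plain Python.
# All functions are hypothetical stand-ins for Stack AI's visual nodes,
# not real Stack AI, Google, or OpenAI API calls.

def google_search(query):
    # Stand-in for the "Google Search" data-loader node.
    return [f"(web result about: {query})"]

def document_search(query, documents):
    # Stand-in for the "Documents + Search" knowledge-base node:
    # a naive keyword match instead of a real vector-database lookup.
    words = query.lower().split()
    return [d for d in documents if any(w in d.lower() for w in words)]

def llm(prompt):
    # Stand-in for the LLM node (e.g. GPT-4o mini); it just echoes here.
    return f"LLM answer based on:\n{prompt}"

def run_pipeline(user_input, documents):
    # Wire the nodes together exactly as in the visual editor:
    # search the web, search the uploaded docs, then prompt the LLM
    # with both sources so it can match the documents' tone.
    web = google_search(user_input)
    docs = document_search(user_input, documents)
    prompt = (
        "Use the web results and documents below; "
        "match the tone of the documents.\n"
        f"Web: {web}\nDocs: {docs}\nQuestion: {user_input}"
    )
    return llm(prompt)

newsletter_docs = ["Welcome to the AI Advantage newsletter!"]
answer = run_pipeline("AI events in Lisbon 2024", newsletter_docs)
print(answer)
```

The point of the sketch is the wiring, not the components: each node is swappable, which is exactly why the visual editor lets you switch LLM providers or add more data loaders without touching the rest.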
7:00

Minimax

Okay, so this is probably the most surprising piece of news that you can actually use this week, because there's a brand new Chinese video model, and what's interesting about it is that, first of all, it's free, and secondly, it's Sora level. I say that with a question mark at the end because I'm still surprised, but the quality seems to match up. And there's one super interesting fact about it: it's really good with humans. We'll talk about that more later on, and you'll see it throughout the examples here, but I think this might be the best available AI tool today for telling stories, as it's super good at keeping human anatomy realistic, even throughout the 6 seconds it can generate. Matter of fact, my team and I went in and created accounts. If you're using Chrome, by the way, you can translate the site to English with the translation tool up top; by default the site is in Chinese, but if I switch over I can read everything. And when you log in, you can create an account with a phone number that is not Chinese, and then use this for free.

The way we went about comparing this to Sora is the following: remember, when OpenAI announced Sora, they released a blog post with various stunning visuals and the prompts underneath. We took those prompts and threw them into MiniMax AI, which is what this is called. So here's the first flagship one; this is what got everybody so excited, as it seemed to do humans consistently. If I judge this strictly, I would say the Sora version is actually better on the environment; I really couldn't tell on the woman, though I suppose the face in the Sora one looks a bit better. But then we tested multiple examples. The next one is a hyper-realistic close-up shot of an eye, and this one is just stunning, it's lifelike; this is Sora level. As an ex full-time video producer, I can tell you that to my eyes this looks exactly like footage you would get from a mirrorless camera; in other words, it's indistinguishable from real life. And again, keep in mind this tool is freely available: you just have to sign up with your phone number and off you go.

Okay, next up we have this shot of an astronaut with a red wool knitted motorcycle helmet. I don't know, if I'm being super critical, maybe the movement at the end is slightly better in the Sora example, but it's close enough. And on these TVs here, as the next shot, I have to say I might even like this one more than the Sora example. It's kind of hard to judge, super subjective, but nobody could really tell that this isn't an actual fisheye shot of some TV sets with various shows running; in other words, the output quality is excellent. And lastly we have this train shot, also looking super good, though I do have to point out there's some artifacting here. But you have to consider that the Sora examples were surely cherry-picked; those weren't examples they generated in one shot and put on their website. I'm sure they generated dozens, if not hundreds, and presented the best ones. We just ran each prompt two or three times and picked the best of the bunch.

So what's our team's first impression of this tool after testing it extensively? Well, first of all, it only has text-to-video; there are no image-to-video capabilities, no camera motion, no advanced features. It's just text-to-video, which is how all video tools usually start; they add things later on. The generations are 6 seconds long with no manual controls; again, this is the first release of the tool, and that's how it usually goes. And lastly, and this was the thing I was most worried about: the prompting can actually be done in English, because it has probably been trained on footage annotated in English. And there you go, here are a few more examples of what we generated with it; it's amazing. I'll leave you with one last comment on why I think this tool might actually be even better than the current state of the art with Runway Gen-3, Luma Labs Dream Machine and Kling: the anatomy and the hands in this model are almost flawless most of the time. They must have really watched out for that in their data set, and it shows, because this is the first model we have access to that generates humans really well. All right, on to the next one.
10:28

LumaLab’s Camera Motion

And while we're on the topic of AI video generators, some useful features have been added by the two current leading models, Runway's Gen-3 Alpha and Luma Labs' Dream Machine. Let's start with Luma Labs, because this one is quite simple. As predicted a few months ago, all the features we've seen in past generations of AI video generators are coming to these newest models, and camera motion is one of them. You can access it inside Luma Labs' Dream Machine now, simply by prompting one of the keywords that triggers this drop-down menu, like so: "push in", for example. There are many more, and they even have little previews of them, like so. We went ahead and tested these. Now look, this doesn't change anything about the base model; it's still not the best model at generating humans, though it's really good at animations, as I pointed out before, and our testing results showed exactly that. This new feature does give you an increased level of control over how the shot behaves, and having this level of manual control allows you to do things that were quite hard before, when the model didn't understand camera motion as well. Now it's a baked-in feature, making it easier to tell stories. By the way, for a full list of all the movements, you can just type in "camera" and it shows you all of them. All right, so that's Luma Labs.
11:34

Runway’s New Feature

Now let's talk about Runway's new feature, which is the ability to extend clips even longer. This is super interesting, because Runway goes a little crazy sometimes, and if you give it these longer run times, it has more space to come up with something unexpected. But in our testing of this feature, we found that simply extending a clip without adding any additional prompts will not yield the results you might expect or want; it won't even go crazy. You need to be very concrete about what you want. At this point I'd like to point out a recent post from Javi Lopez, friend of the channel, that went super viral all across the internet. It's this little vacation video that he ran through an AI video generator and then extended, and it turned into absolute nightmare fuel, like so. I won't even show the whole thing, but it's definitely worth checking out; I'll include a link in the description. The way you do this is you take a clip and then extend it with a prompt. Without one, we found, the feature doesn't work so well. But what it did work super well for is time lapses, and this is something that, I've got to admit, frustrates me a little, because I've put so much time into learning how to do time lapses, particularly long-term time lapses, back in the day when I was running my video production company. One of the jobs I took on and successfully completed was a 4-month time lapse of solar panels being installed on top of a Coca-Cola factory, the facility where they were bottling for all of Central Europe. It was the most technically challenging video project I ever did, because at a certain point a worker even disconnected the wire I had run all across the factory, even though I had put up signs. It all worked out in the end and I managed to piece it together, but the point is that long-term time lapses are so much work, and with Runway you can just prompt "long-term time lapse", let it run, extend a clip to 40 seconds, and look at this example: it does day-to-night on a shot like it's nothing. Two caveats I need to add: this only works for Gen-3 Alpha, not the smaller version, Gen-3 Alpha Turbo; and secondly, it doesn't let you upload an image yet. Oh, and one more note: if you're trying to do something like Javi did over here, you can actually re-prompt every time you extend, up to the 40 seconds, because you always add 10 seconds at a time; so you can always describe what you want the next few seconds to be about. That's probably very close to what Javi did with this clip, although he didn't share the exact details. And a quick question: would you like a tutorial on how to turn a vacation video into something absolutely insane with AI video like this? If so, leave a comment below. And now let's move on to the next piece of AI news you can use.
13:51

AI Use Case in Real World

Okay, now I want to show you an AI use case that popped up in the real world. I love featuring these, because people haven't caught on to the fact that you can do things like voice cloning with ElevenLabs, for example, and here it was used in Taiwan's legislature. Specifically, a legislator used it when she was sick and couldn't fully use her voice: she created an ElevenLabs voice clone to present her speech, which left the other members stunned; they had never seen anything like it before. I just wanted to point this out as an amazing use case of AI in everyday life. In this video they even show her clicking the buttons in the interface, just like we do on this show, which I found really entertaining. I just love seeing these helpful use cases in the real world, and if you weren't aware of it, you can take inspiration from this and do the same yourself if you're sick and can't speak. And now let's move on to the next story.
14:36

ReFlux

Okay, next up we have a new release that takes all of this AI imaging realism to the next level, literally, because last week we talked about the release of FLUX, which does hyper-realism super well. And because it's open source, you can fine-tune it with your very own images, which means you can train the model to produce images of you; you could also fine-tune it with images of somebody else, or with the logo of your company, and then you have an image generator that produces custom images. What released here is called ReFlux, and it's a visual interface that actually lets you use multiple fine-tunes, meaning if you've trained it on your company's logo and on yourself, you can use both of those together and generate images of you wearing a t-shirt with your company logo, and many more use cases. Even more customizability, and the interface is super simple.

Now, there is a problem with this, though, and that's the fact that it can only use models created here in this fine-tuning tab, or other people's fine-tunes, or Hugging Face fine-tunes. Hugging Face is probably your best bet, but it's also the most demanding workflow: in my video on fine-tuning, which I'll link below, I showed you a workflow where you do it all on one website, and it's really straightforward; as soon as you start storing the LoRAs on Hugging Face, pulling them into Replicate and generating there, it gets a little more convoluted. There's also the option of fine-tuning in here, and that's what I want to show you, because this is the new tool that you can use, and that's what this show is about. The problem is that these fine-tunes only run on a single image, and if you look into the back end, which you can do because the code is available on GitHub, it actually creates multiple synthetic images from the one image you provide. For a well fine-tuned model you should give it at least 10, ideally 20 or 30, images of yourself; here it only takes one, so it has to manufacture at least nine other images itself. The problem is that those usually don't look exactly like you. We fine-tuned a few models on me, and every time the result was only halfway there; I mean, look at that, each one of them is sort of similar to me, but it's just not quite right. That comes from the fact that we only uploaded one image to the fine-tuning, so the results are just mediocre. Matter of fact, right now I have a model training right here, but again, the results are not great, because one image is not ideal. So I'll definitely play with this more; I'll have to add multiple LoRAs to Hugging Face and pull them in here, which you can do just by adding the links, and then you can use multiple fine-tuned models to generate custom images. If you're enjoying the video, leave a like, it really helps out the channel. And with that being said, let's move on to the next piece of AI news that you can actually use.
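Conceptually, combining two fine-tunes the way ReFlux does comes down to attaching more than one LoRA reference, each with a weight, to a single generation request. The sketch below is purely illustrative: the function, field names, and weight normalization are assumptions for explanation, not the actual ReFlux or Replicate API.

```python
# Illustrative sketch of combining multiple LoRA fine-tunes in one
# generation request, as the ReFlux interface allows. All names and
# the payload shape are hypothetical, NOT the real ReFlux/Replicate API.

def build_generation_request(prompt, loras):
    """Attach one or more LoRA references to a generation request.

    loras: list of (name, weight) pairs, e.g. a "me" fine-tune plus a
    company-logo fine-tune, each ideally trained on 10-30 images.
    """
    if not loras:
        raise ValueError("at least one fine-tune is required")
    total = sum(weight for _, weight in loras)
    # Normalize the weights so the combined influence of all the
    # fine-tunes stays balanced regardless of how many are attached.
    normalized = [(name, weight / total) for name, weight in loras]
    return {"prompt": prompt, "loras": normalized}

request = build_generation_request(
    "me wearing a t-shirt with my company logo",
    [("my-face-lora", 1.0), ("company-logo-lora", 0.5)],
)
print(request["loras"])
```

The relative weights are the interesting knob: giving the face LoRA twice the weight of the logo LoRA, as above, expresses that keeping the person recognizable matters more than reproducing the logo perfectly.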
17:05

Qwen2

Now let's switch gears and talk about LLMs. There haven't been too many releases this week, so we'll cover this briefly. First of all, we have a brand new model with Qwen2-VL, a vision model coming out of Alibaba, and the interesting thing here is that it has video recognition too. Now, the problem is that with this Hugging Face interface, the video upload doesn't really work properly; I'm sure the model itself does. What's interesting is that it actually performs super well on these vision benchmarks, and it's open source under the Apache 2.0 license, meaning if you're building something, you can plug it in for commercial purposes. I'm covering this because it's really a new type of model: we haven't had many capable open-source vision models that can process up to 20 minutes of video. Something like this would be perfect for all of these robots that are being developed and marketed heavily all across the internet. I don't really cover them on the show, because this show is about AI news that you can use, and those robots haven't really shipped yet. But this model has, and something like it will be very useful under the hood for hardware that needs video recognition.
18:00

Claude for Enterprise

So that is Qwen2. And then there's this minor piece of news that you can use, I suppose, if you sign up for a Claude Enterprise account. Anthropic did something interesting here that I want to point out: they expanded the context window for enterprise users. The paid plan of Claude gives you 200,000 tokens; this one gives you half a million. And a little side note: the current king in that category is Gemini 1.5 Pro, with a context window of 2 million tokens. All of these companies keep shipping incremental updates, slowly catching up to, or in this case even overtaking, OpenAI on specific metrics. But overall, OpenAI still has the most well-rounded offering; we've talked about this many times, but between everything they have, they're probably still the best choice if you had to pick one thing. There are specific tasks and use cases, though, where some of these other models start to excel, with things like expanded context windows, or the Google Gems integrations with Gmail, Google Maps or YouTube that we covered last week. And just as a little side note, Anthropic actually partnered with Amazon to power their Alexa, and this is a theme we'll see a lot on this show over the coming weeks and months as these things ship into production: Alexa is going to have Claude, on Android phones you can already use Gemini, and Apple is going to ship the ChatGPT integration plus their proprietary large language model in September. On top of that, we have OpenAI Dev Day coming up in three weeks, where I suspect they'll be upgrading the Assistants API and their GPTs, and they'll probably ship something similar to Claude 3.5 Sonnet's Artifacts. As per usual, the AI Advantage team and I will pull together all that info, test all of it, and present you with the findings in an episode just like the one you're watching right now. And that's all I've got for this week. See you soon.
