# Google's NEW 'StyleDrop' Takes Everyone By SURPRISE! (NOW ANNOUNCED!)

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=q_ebiphq2Pk
- **Date:** 12.06.2023
- **Duration:** 17:05
- **Views:** 42,773

## Description

Google StyleDrop - https://styledrop.github.io
StyleDrop paper - https://arxiv.org/pdf/2306.00983.pdf
Adobe - https://research.adobe.com/publication/multi-concept-customization-of-text-to-image-diffusion/
Exclusive deal VIDIQ Deal - https://vidiq.com/theaigrid


Welcome to our channel where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos.

Was there anything we missed?

(For Business Enquiries) contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience
#IntelligentSystems
#Automation
#TechInnovation

## Contents

### [0:00](https://www.youtube.com/watch?v=q_ebiphq2Pk) Segment 1 (00:00 - 05:00)

So Google has once again released a new research paper, and this one is in image generation. I think this paper is really interesting in terms of what it provides. It's a unique kind of AI software that I didn't see coming, because many AI tools follow the same standard model: you're either text-to-music, text-to-video, or image-to-text, which is very simple. This one is quite different. It's called StyleDrop, which is text-to-image generation in any style. I know this falls under text-to-image generation, but it is still very interesting, because I think this one will actually be used widely, like Midjourney and alongside many other creative tools. Stick around for the examples, because some of them are really interesting.

"We present StyleDrop that enables the generation of images that faithfully follow a specific style, powered by Muse, a text-to-image generative vision transformer. StyleDrop is extremely versatile and captures nuances and details of a user-provided style, such as color schemes, shading, design patterns, and local and global effects."

So essentially what we have here is a tool that can learn any style and then replicate that style in any other artwork. That's why I think this will be used a lot. With tools like Midjourney, one problem that everyone has is consistency. If you don't know what that means: sometimes when you generate an image it comes out in a certain style, but it's very hard to replicate that style unless it's a well-known one, for example a specific artist like Van Gogh or an artistic style like steampunk. If you create your own style, there's not going to be a specific name for it. That's where StyleDrop comes in.

So let's take a look at some of the examples they've provided, because we want to see if this is actually good or not. In the first example we have "colorful flowers in V style" and the style reference image, which is the image they've used for reference. If you don't know what style this is, it's kind of like a watercolor style. If we Google "watercolor", you can see that this is likely what Midjourney would think we were talking about, and this is why this tool is very important. If someone wanted to generate an image with a text-to-image generator and selected the watercolor style, StyleDrop lets you customize it to the specific style of the reference image. If I wanted an image in exactly this style, I likely wouldn't be able to get it, but that's where StyleDrop comes in. Here you can see a baby penguin in the specific style, a banana in the same style, a bench in that style, and many other examples.

I think this is going to be really useful for those who are running creative projects. If you have a creative project that requires a specific style or brand theme, this is going to be the tool that people use, so I'll be excited for this release. Looking at these watercolor examples so far, it does look very effective. Just a caveat that you should know: when research papers like this are published, the authors will of course cherry-pick certain outputs to showcase the best that the software has to offer. It still shows us that this software is very effective.

You can see right here the style that we talked about in the beginning, which is the Van Gogh style, and this is going to be something that people want to reference. I think this is one of the styles that people who know about art are going to be more familiar with. This is by the famous artist Van Gogh, a painter who is really known for this style.

I also wanted to see just how effective this was compared with other image generators like Midjourney. If you don't know what Midjourney is, it's a really cool AI tool that you can use to generate realistic and hyper-realistic images from simple text prompts. What Midjourney is also good at is this: if you have a text prompt and you type in whatever style you're referencing, it's going to be able to get that. You can see that there are four examples at the top, and I decided to take just these top four because I wanted to see how they would look if I used Midjourney. Remember, this can't be done for every kind of image; it works here because Midjourney has millions of images in Van Gogh style that can be referenced. So let's take a look at what a baby penguin would look like in Van Gogh style and then compare it back to StyleDrop, Google's new tool, just to see the different nuances in these AI tools.

Here are the first four images generated with Midjourney. If you're curious about the text prompt, I typed in "a baby penguin in Van Gogh Starry Night style", and this is what Midjourney gave me. If you're wondering which engine I was using, this is version 5.1, which is currently the default. Compared to the StyleDrop one, I've got to be honest, Midjourney's is a little bit more detailed. But here's the thing: this is not a dig at StyleDrop, because we have to understand that if you don't have a well-known reference, like Starry Night style, then you're not

### [5:00](https://www.youtube.com/watch?v=q_ebiphq2Pk&t=300s) Segment 2 (05:00 - 10:00)

going to be able to get any specific style for your specific artwork, which means that StyleDrop is going to be more useful in the future as newer styles get created and newer color schemes are involved. In one-off scenarios where you manage to reference an already popular style, Midjourney can work, but what is cool about StyleDrop is that it aims to replicate a style from one single image. What would be cool is if Midjourney were somehow able to integrate this into their software. Maybe Google releases this with an API, you get a style from an image in Midjourney, and then you can use that to create different pieces of content. I think that would be really effective, but as we know, Google does a lot of research and doesn't always release these projects.

Let's look at the second example, which is a banana in the specific style. One thing I'm starting to notice as I look at these is that all of the StyleDrop images look pretty much on point with each other. If someone said these were all from the same book, I would 100% believe it. With Midjourney, I've got to be honest, some of the outputs look a little more different from one another, which would probably lead to you having to select from many different images and regenerate them. For example, the top one on the left looks the most like Starry Night, whereas the bottom one on the right doesn't look that great compared to the original style. When you take a look at the banana, some of these artistic pieces are really interesting to look at, and we do like them. But if we compare them to the original, I think StyleDrop does this much better, because it gets the complete style in terms of consistency, which is something you want if you are going to be creating something. It's going to be interesting to see how this evolves and how it comes into actual use in the future.

Then we have a bench in that specific style, and here are the benches that we generated. What's also interesting is that the one on the bottom left didn't actually generate the bench correctly; it just added the bench outside, which I found pretty interesting. Nonetheless, it goes to show that with a specific tool like StyleDrop you're going to have a lot more consistency. I'm not hating on Midjourney at all; I think both of these tools are incredible. But when we look at the reference image and at consistency, StyleDrop is definitely going to match that specific style.

The last one here is a boat in that specific style, and you can see the examples here. I do think that with more fine-tuning you are going to be able to get the perfect style from Midjourney or whatever tool you use, but this is an interesting way to show how these different models interpret the same kind of style and how they arrive at their final result.

We can also look through some more examples. This is what I'm saying: if you want to manipulate an image and have a specific style, this is going to be really interesting. Look at all these different kinds of images. I don't know what specific style this is, and that's why I'm saying Midjourney wouldn't be able to help you here. You might be able to describe it by saying it uses only black and yellow, that it's a pastel drawing, or whatever descriptive words you can find, but I think StyleDrop is going to be much more effective. This one right here is something I don't even know how you would get done in Midjourney.

Some of these images are, I wouldn't say surprising, but just so interesting, because I'm guessing this is maybe a kind of breakthrough in how they've done it. I've seen many different AI tools, and many of them are incremental gains on others, but I haven't really seen one that implements style like this.

One thing I want to talk about is other tools that are quite similar to this. In the paper they reference other work like DreamBooth, and StyleDrop is actually different from DreamBooth. If you don't know what DreamBooth is, let me show you what it is and why I'm referencing it. DreamBooth is "Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation". Essentially, DreamBooth, which was also done by Google, takes an image, but rather than taking the specific style of that image, it takes the subject from the image. For example, here we have a nice little corgi, and you can take that corgi and put it in the Acropolis, swimming, sleeping, in the doghouse, in a bucket, or getting a haircut. That's what DreamBooth is. StyleDrop is similar but very different: with StyleDrop you take just the style, and then you can generate a completely new image in that style.

According to the paper, StyleDrop outperforms other recent methods such as DreamBooth and textual inversion using Imagen and Stable Diffusion as pre-trained text-to-image backbones. The evaluation was based on prompt and style fidelity metrics using CLIP and on a user study, so on those measures StyleDrop comes out ahead of the other methods in output quality. As for efficiency, StyleDrop is built on a few crucial components, including a Transformer-based text-to-image generation model, adapter tuning, and iterative training with feedback. The authors of the paper also claim that their method is efficient and only requires a small number of trainable parameters. If you're wondering about some of the nuances of how this works, essentially StyleDrop captures the nuances of the user-provided style in

### [10:00](https://www.youtube.com/watch?v=q_ebiphq2Pk&t=600s) Segment 3 (10:00 - 15:00)

text-to-image generation by fine-tuning a pre-trained text-to-image model with a single style reference image. The method efficiently learns a new style by fine-tuning very few trainable parameters, less than one percent of the total model parameters, and by improving the quality via iterative training with either human or automated feedback. The resulting personalized text-to-image model can capture details such as color schemes, shading, design patterns, and local and global effects.

If we continue to look at some of these examples, these are really cool, because this is something that people want. For example, this is an illustration that I didn't show you earlier in the video. It says "a person drowning into the phone" in the given style, and this is the one reference image that they have. Then you can see all of these different generations that are in the exact same style, and I've got to be honest, it captures the style perfectly. Imagine you're someone who's trying to create an illustration for a kids' book, or for a website, or for whatever brand you're working on: this is going to be a really effective tool.

Here we have some really cool illustrations. Some illustrations are in certain styles, but I think this is absolutely incredible in terms of the consistency of the style. It even gets the shape right and the color scheme right. What's crazy about this, and what most people don't take into account, is that certain things are just natural to this AI tool, like certain color schemes, certain hues, certain variations, certain widths, and certain lengths that would be very hard for a normal person to get immediately if they don't have some kind of artistic background.

For example, even now, when people use images from Midjourney and similar tools, they'll try to get images that are all kind of similar. If you use a website like Freepik, where maybe you're trying to get a few vectors for the front page of your website, a lot of the time you have packs that are in a similar style. Sometimes you don't get them in the same style, though, and you have to sift through many different images and fine-tune many of them to match that style with the same color scheme. With StyleDrop this is really effective, and after seeing more of these images I'm definitely impressed. They have fine-tuned this model to be very effective. Look at this sticker picture: these look incredible.

We're going to continue to look through these, and what's absolutely incredible is this one right here. This was one that I saw before, and I thought, okay, this is crazy, because if someone asked me to do this I wouldn't even be sure how to go about it. I don't consider myself an artist, but it still seems pretty crazy. A baby penguin in this style and a banana in this style go to show you just how cool this is. What we often see with many AI tools is easy examples, things that are very simple for that large language model or AI tool to do. This, however, would arguably be a very difficult style to emulate. What style would you even call this? It's just colored smoke, maybe, on a background. Yet we already have results that look really effective: you can see a Formula One car, a Christmas tree, a boat, a bench. It looks really cool. Some more of these, like I said, would seem quite difficult to do. I'm not even sure what kind of style this is, but StyleDrop manages to capture it perfectly.

Something I also noticed when I was thinking about recently released AI tools was something from Adobe. Adobe recently announced Firefly, but something on their web page that they didn't really announce was really interesting. If you go to the Adobe page and scroll down, you can see all of this amazing 3D modeling AI stuff, which is all cool. If we scroll down to where it shows what they're working on, they say, "We're actively researching new ways generative AI can help creators express their ideas." That's when I realized there was a new thing, called customizable diffusion. It says: "Apply your unique style with AI. In image generation, it can be challenging to achieve the precise visual aesthetic required for a project. With customizable diffusion, the creator can select which images inform the generative AI. This offers more creative control over individual images and a simpler way to apply creative choices across a body of work."

So essentially Adobe is working on a tool that is quite similar to this. I'm not going to get completely into Adobe's tool. I think that if they manage to fine-tune it to the point where it works completely, it's going to be added to the Creative Suite immediately. I will leave a link to this in the description. Their "Multi-Concept Customization of Text-to-Image Diffusion" is essentially something where you can take your specific subject, just like DreamBooth, take a specific style, like drawings by Aaron Hertzmann, and then get the subject in that specific style. So I guess it's combining DreamBooth and Google's StyleDrop into one tool that they are going to be creating. It's definitely something interesting that we should watch out for. Also, if you are interested in AI, there is CVPR 2023, which is a conference about computer vision. So

### [15:00](https://www.youtube.com/watch?v=q_ebiphq2Pk&t=900s) Segment 4 (15:00 - 17:00)

essentially, the 2023 Conference on Computer Vision and Pattern Recognition will have a large number of papers published, and you can see right here that the publication date is June 18th, 2023. That means that around seven days after the publishing of this video we're going to see a lot more interesting generative AI tools, largely ones related to computer vision.

Back to StyleDrop. As we continue on, there are many different examples that look absolutely tremendous. Looking at these current examples, I don't think there's a tool that beats this in terms of consistency and the level of quality it is outputting. Google definitely works on some pretty crazy stuff. What's also cool here, and I think this is going to be one of the game changers, is that this is a 3D style. 3D is really hard to generate, especially when it comes to shading and ambient occlusion. Everything that goes into generating a 3D image is quite hard, especially from 3D images, and to be able to generate these 3D models, even though they are just 2D images, is definitely worthy of recognition.

So I would say that StyleDrop is one of the most recent tools I have seen that has really impressed me, because this stuff is just really crazy, and the more examples I look at, the crazier it gets. I don't know why this tool wasn't more widely recognized and why I didn't see more people covering it. I guess it's because it's not a tool that you can actually use right now; that's probably why it wasn't as widely covered. But if you go to Google's website and look through these examples, even just on your phone, you'll see that this looks really good. Not only does it cover a base level of 2D images, it covers 3D images, it covers different styles, it covers quite a lot. Whatever tool they are using, I think Google should release this, and if it does get released, it will be a product that people use literally every day.

With that being said, are you going to read the paper? Do you think this is going to be really cool? Are you excited for CVPR 2023? I think Google has once again broken the mold and shown us exactly what we didn't think we would see: another AI tool that goes ahead and changes what we thought possible. If you enjoyed this video, it'd be great if you could subscribe to the channel.
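
The paper summary in the video mentions three ingredients: adapter tuning that trains less than one percent of the model's parameters, CLIP-based fidelity scores, and iterative training with human or automated feedback. The sketch below is only an illustration of those ideas, not the authors' implementation. The function names, the rule-of-thumb parameter counts, the model sizes, and the toy embeddings are all assumptions made for this example; a real pipeline would take its embeddings from an actual image-text model.

```python
import math


def adapter_fraction(d_model, n_layers, bottleneck):
    """Rough share of trainable parameters when only small adapters are tuned.

    Assumption: each Transformer layer holds about 12 * d_model^2 frozen
    weights (attention plus MLP), and each adapter is a down-projection
    and an up-projection with biases. Real models differ in the details.
    """
    backbone = n_layers * 12 * d_model * d_model
    adapter = n_layers * (2 * d_model * bottleneck + d_model + bottleneck)
    return adapter / (backbone + adapter)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_for_next_round(candidates, style_emb, min_text_score, top_k):
    """Automated-feedback step (hypothetical): keep generated images that
    match their prompt well enough, then rank them by closeness to the style
    reference and return the names of the best top_k for the next round."""
    kept = [
        c for c in candidates
        if cosine(c["image_emb"], c["text_emb"]) >= min_text_score
    ]
    kept.sort(key=lambda c: cosine(c["image_emb"], style_emb), reverse=True)
    return [c["name"] for c in kept[:top_k]]


# Illustrative sizes only; the video does not state the model's dimensions.
fraction = adapter_fraction(d_model=1024, n_layers=24, bottleneck=32)
print(f"trainable share: {fraction:.4%}")

# Toy 2-D vectors standing in for real image and text embeddings.
style_reference = [1.0, 0.0]
generated = [
    {"name": "a", "image_emb": [1.0, 0.0], "text_emb": [1.0, 0.1]},
    {"name": "b", "image_emb": [0.6, 0.8], "text_emb": [0.6, 0.8]},
    {"name": "c", "image_emb": [1.0, 0.0], "text_emb": [0.0, 1.0]},
]
picked = select_for_next_round(generated, style_reference, min_text_score=0.5, top_k=2)
print(picked)
```

In this toy setup, image "c" copies the style but ignores its prompt, so the prompt-fidelity filter drops it before the style ranking. That mirrors the two scores the video describes, prompt fidelity and style fidelity, being checked separately.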

---
*Source: https://ekstraktznaniy.ru/video/14817*