# Character Consistency with 4o Image Generation

## Metadata

- **Channel:** OpenAI
- **YouTube:** https://www.youtube.com/watch?v=PFsOUNfBhzI
- **Date:** 25.03.2025
- **Duration:** 1:49
- **Views:** 103,514
- **Source:** https://ekstraktznaniy.ru/video/11323

## Description

Speaker: David Medina (DMED)

## Transcript

### Segment 1 (00:00 - 01:00)

How's it going? Pretty good. One thing that I'm really excited about with image gen is the ability to keep consistency in characters. I'm David Medina, or DMED, and I work on multimodal.

What I want to show is one of my favorite prompts, which is: "Can you create a low poly penguin mage? Make it very low poly." Surprisingly, it's sometimes hard to get very good low poly outputs. It's not like other image generation models, where it tries to generate something based on just the text. Instead, it uses the large language model's understanding of what the user wants, what the intent is.

I also like some board games, so miniature-like games. So what I'll do now is generate a miniature from this. Ideally, we'll see a penguin that looks like this, with the same staff and a hat. So: "Can you make me a realistic miniature, as if a professional made this and painted it?"

This is what I think excites me the most about image gen. The other image generation models will try to create literally what you said, but what's special about this is, one, it'll keep the context of this character, and then two, it'll understand what I'm trying to ask it for and generate a very similar model, but in a realistic miniature style. It infers what I want. I don't have to tell it every little detail.

One other realistic thing we could do is: "Can you make a crystal version of this, with light reflecting and very realistic?" Again, I'm just giving it very simple instructions. Normally, this is not enough for other models to generate something very detailed, but the model understands what I'm asking for. It'll think about what type of style it should have.

So, this ability to really understand what the character is, make edits, and understand what the user wants: for me, it's just an amazing capability.
