# ChatGPT'S New 'VISION AND AUDIO' Stuns The ENTIRE AI Industry! (Now RELEASED!)

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=hgqhdX3CoW0
- **Date:** 26.09.2023
- **Duration:** 11:12
- **Views:** 12,720
- **Source:** https://ekstraktznaniy.ru/video/14729

## Description

Welcome to our channel where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos.

https://openai.com/blog/chatgpt-can-now-see-hear-and-speak

Was there anything we missed?

(For Business Enquiries)  contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience
#IntelligentSystems
#Automation
#TechInnovation

## Transcript

### Segment 1 (00:00 - 05:00) [0:00]

So, ladies and gentlemen, boys and girls, ChatGPT finally has the ability that was announced several months ago. ChatGPT not only has vision; it can also see, hear, and of course speak. OpenAI managed to roll out this update yesterday and spoke about the new capabilities and how they are intertwined with ChatGPT, as well as the various applications. One of the new applications is vision, and with vision you can do a multitude of helpful things that you may not even have thought of. In this video we're going to take a look at every single application and use case for ChatGPT's new abilities, and exactly how the future of AI is going to transpire now that this landscape is changing so quickly.

On the OpenAI page it says: "We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you're talking about."

In addition, you may be wondering about the rollout, because one thing I do know people want to know is whether this is available right now. Currently, parts of this are available and parts aren't. For example, voice isn't available right now because, as you know, ChatGPT rolls this out slowly. Usually when ChatGPT or OpenAI roll out an update, it's USA first, and they usually roll it out to certain states first. I'm not sure exactly which states, but I do know that some get priority. Then it's usually followed by Europe and then the rest of the world. Additionally, for those in the United Kingdom, we do get it some time after, so it might be a month after, or it might be three weeks after the first two-week period. In terms of which users it's going to, it says: "We're rolling out voice and images in ChatGPT to Plus and Enterprise users over the next two weeks."

OpenAI have shown us a small demo in which you can use voice to engage in a back-and-forth conversation with your assistant. I do think that this is going to be really good, because this is something that is quite like Siri, only much more advanced. Let me know what you think about this demo. What you're about to hear is a normal person speaking into the chatbot first, and then ChatGPT speaking back using one of its natural-language voices. I'm not sure which service they're using for these voices. It doesn't sound like ElevenLabs, but it does sound pretty decent, and it sounds a lot better than Siri. Let me know what you think of the demo, whether you think it's good, and whether you're going to be using it, because I certainly am.

"We want to hear a bedtime story. Tell us a story about the super-duper sunflower hedgehog named Larry. Start with telling us a little bit about him."

"Larry was a unique hedgehog. Unlike any other, he had bright sunflower petals instead of spines. Residing in Meadowville, Larry spread joy and color wherever he rolled. People cherished his radiant presence."

"What was his house like?"

"Larry's house was a cozy burrow beneath the sunflower field. Inside, the walls shimmered with golden petal patterns. Natural light streamed through tiny sunflower windows, creating a warm, glowing ambiance. It was a sun-kissed haven."

"Who is his best friend?"

"Larry's best friend was Luna, a luminescent firefly. At night, while Larry's petals rested, Luna's glow kept them company. Their bond was special: Larry brought sunshine by day and Luna starlight by night. Together they illuminated Meadowville."

"Okay, it's bedtime for Larry. What does he do to go to sleep?"

"Larry curls up in his petal blanket, letting its softness embrace him. Luna sings a gentle lullaby, her glow dimming to mimic twilight. The scent of sunflower fills the air, and Larry drifts into peaceful dreams."

"Good night, Larry."

So from that demo you can see exactly why this is really useful. Sometimes you have an idea and you just want to hit the record button and talk to ChatGPT rather than typing everything out. One thing that I do know as well is that this is likely to be very accurate. This is because ChatGPT has a special model that they are most likely going to be using, which can identify every single word in the human dictionary and is trained on a large data set of human audio.

Another thing you can see on the webpage here is that ChatGPT actually comes with five different voices: there is Sky, there is Juniper, there is Cove, there's Ember, and there is Breeze. Each one is slightly different, and you can choose which one is going to read your text or talk to you, whichever one you find the most pleasing. I think in the future they might allow third-party applications to provide their own voices, but I'm guessing for now this is all they have to offer. It would be cool if ChatGPT allowed users to customize these voices, maybe by gender, by age, by pitch, or by tone, because that would really make these AIs as personal to us as possible.

If you do want to use voice, it might be rolled out now, but I'm not sure because I don't use the mobile app. It does say to head to your settings, then new features, and then opt into voice conversations, then tap the headphone button, and then choose your preferred

### Segment 2 (05:00 - 10:00) [5:00]

voice out of five different voices. And like we stated before: "We also use Whisper, our open-source speech recognition system, to transcribe your spoken words into text." This is going to be something that makes AI much faster in terms of getting your information and, of course, relaying it back to you. So what I'm going to do now is play the five different voices, and let me know which one you think sounds most appealing to you.

"Once in a tranquil woodland, there was a fluffy mama cat named Lila. One sunny day, she cuddled with her playful kitten, Milo, under the shade of an old oak tree. 'Milo,' Lila began, her voice soft and gentle, 'you're going to have a new playmate soon.' Milo's ears perked up, curious. 'A new playmate?' Lila purred, 'Yes, a baby sister.' Milo's eyes widened with excitement. 'A sister? Will she chase tails like I do?' Lila chuckled. 'Oh, she'll have her own quirks. You'll teach her, won't you?' Milo nodded eagerly, already dreaming of the adventures they'd share."

"Once in a tranquil woodland, there was a fluffy mama cat named Lila. One sunny day, she cuddled with her playful kitten, Milo. 'Milo,' Lila began, her voice soft and gentle, 'you're gonna have a new playmate soon.' Milo's ears perked up, curious. 'A new playmate?' Lila purred, 'Yes, a baby sister.' Milo's eyes widened with excitement. 'A sister? Will she chase tails like I do?' Lila chuckled. 'Oh, she'll have her own quirks. You'll teach her, won't you?' Milo nodded eagerly, already dreaming of the adventures they'd share."

"Once in a tranquil woodland, there was a fluffy mama cat named Lila. One sunny day, she cuddled with her playful kitten, Milo, under the shade of an old oak tree. 'Milo,' Lila began, her voice soft and gentle, 'you're going to have a new playmate soon.' Milo's ears perked up, curious. Milo nodded eagerly, already dreaming of the adventures they'd share."

Then, of course, we have images. It says: "You can now show ChatGPT one or more images. Troubleshoot why your grill won't start, explore the contents of your fridge to plan a meal, or analyze a complex graph for work-related data. To focus on a specific part of the image, you can use the drawing tool in our mobile app."

This is something that I didn't expect from ChatGPT, because it wasn't actually demoed in the research paper. The drawing tool is going to be a game changer, because it means that in an image you can highlight a specific area to give ChatGPT a sense of what you're trying to look at. I think this is really cool, because a lot of the time when you're using something like Bing Chat, although it is of course good, the problem is that it sometimes struggles to locate certain things within the image and infer the context from them.

In the actual demo, what we can see is that someone has taken a picture of their bike, and it says "help me lower my bike seat." I'm glad they actually managed to do this, where you can mark a focus area for ChatGPT, because a lot of the time when you are taking an image of something, you largely don't know what you're taking an image of, and you mistake certain things for other things. So what you're able to do is point out certain things and then ask questions about them. You can also see that ChatGPT manages to respond and takes note of that: that's not a lever, it's a bolt, and you'll need this to loosen it. It went on further to describe the exact type of tool you need. And of course, by inputting the manual as a PDF and taking a picture of the toolbox, ChatGPT was able to say yes, you have the right tool, or no, you have the wrong tool.

I think that this has a vast range of applications, so long as it works. I'm pretty sure they took a very long time to release this because they wanted to ensure that it was really safe to use in a wide range of applications. Now, for those

### Segment 3 (10:00 - 11:00) [10:00]

Now, a question that might come up, and that some people might be wondering about, is whether this is the same as Bing Chat. I have to say no, this isn't going to be the same as Bing Chat, because Bing Chat is limited not only in its responses, but you also can't highlight certain things. Usually what's wrong with Bing Chat is that you can't prompt it further and ask it certain questions about certain images. It seems like Bing Chat was a very limited test run of a tool that was obviously going to be rolled out into ChatGPT, and we're now seeing just how good this really is.

Another thing as well, and this is just something small, is that they did name this GPT-4V, and that did confuse some people, because they thought it meant that this was GPT-5. In fact it isn't, but I do think that this is definitely GPT-4.5 or 4.4. This just isn't GPT-4 anymore, because the things that we're now getting are definitely a step above what we were initially told GPT-4 was going to be. So I do think that, as they stated before, as they're upgrading the system, we're going to continue and continue until the point we get to GPT-5. It's not going to be some giant leap; it's largely just going to be an upgraded version of GPT-4, along with some video creation capabilities.
