# OpenAI's NEW "VOICE ENGINE" Project Is STUNNING! (Open AI Voice Engine Explained)

## Metadata

- **Channel:** TheAIGRID
- **YouTube:** https://www.youtube.com/watch?v=AOjeFlFWkiU
- **Date:** 22.03.2024
- **Duration:** 19:23
- **Views:** 37,079
- **Source:** https://ekstraktznaniy.ru/video/14444

## Description

✉️ Join My Weekly Newsletter - https://mailchi.mp/6cff54ad7e2e/theaigrid
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Check out my website - https://theaigrid.com/

Links From Today's Video:

https://uspto.report/TM/98456635
https://www.reddit.com/r/singularity/comments/1bkosng/openai_voice_engine_was_trademarked_two_days_ago/
https://www.reddit.com/r/artificial/comments/12cczbg/building_a_kind_of_jarvis_openai_karpathys_twitter/

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries) contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

## Transcript

### Segment 1 (00:00 - 05:00) [0:00]

So there is a new project that OpenAI is allegedly working on, and in this video I'll give you all the details you need to stay up to date with OpenAI's new Voice Engine project, which they seem to be working on in relative secrecy. Let's take a look at exactly what the details are.

One of the first pieces of information that led us to this discovery is, of course, the trademark. Many people have speculated about trademarks before, myself included, and trademarks do involve a little bit of speculation. But later in the video I'll show you a lot more evidence that leads me to believe this is a real thing coming very soon, and once you see all of the information you'll understand why this is very likely a project coming in the future.

We can see that they trademarked Sora before, and then the day after that they released the Sora announcement, their text-to-video model. And you can see that very recently, on the 19th of March, they trademarked Voice Engine. It has a very interesting description, and from the name Voice Engine we know it has something to do with voice. I'll get into exactly what the description covers in a moment, but the point is that this trademark was filed recently. The only problem with the trademark being filed recently and nothing being released is that we don't know exactly when this is going to release. However, we do know that a product is coming later this year, because Sam Altman said in a recent interview that later this year they're going to be releasing a bunch of different products, and they don't know what they're going to be calling them. I'm guessing one of those products is going to be called Voice Engine, as it was recently trademarked. It could be here within the next month or the next two months, but we do know that this is going to be coming out this year.

So let's get into the actual description of this trademark and what it possibly means for future products. If we take a look at the trademark description, we can see voice and speech recognition, processing commands, and converting between text and speech. There are a bunch of different things here that go to show that this is likely going to be some kind of voice integration software, one that either uses our voice to direct an AI system or lets us generate lifelike speech. There are many different things here, as you can see: automatic speech and voice recognition and generation, and then of course building digital voice assistants, which is one of the main things.

I'm going to come back to this page because there's a lot to dive into, but I want to give you some more information as to why I think this project is coming very soon. As you can see, all of this is audio and voice generation, and it also includes some speech recognition. But there are some key pieces of information that OpenAI employees have tweeted that give us further insight.

The first piece of information, which many people may not have been aware of, is from Andrej Karpathy. Andrej Karpathy used to work at OpenAI. The screenshot you're currently looking at is an old one, but I've used it to show you his bio, which says "building a kind of Jarvis at OpenAI". The reason that's such a hint about this project is that this was his bio, I think, a year ago or a couple of months ago. Since he no longer works at OpenAI, I'm using the old screenshot because it has this piece of information.

If you don't know what Jarvis is, you might be confused as to why I even read out that statement. Jarvis is from the Iron Man movies. It's essentially an AI assistant that is pretty much an AGI-level (artificial general intelligence level) system that is really, really smart. Tony Stark, the main character, uses voice commands to control this AI system: he uses it to control his house and his suit, and he pretty much uses it for absolutely everything. So when someone who works at OpenAI says they are building a kind of Jarvis at OpenAI, that gives us insight into what kind of products we might be getting. I can't include any Iron Man clips here, but you can think of it as a version of Siri that's just 10 times better, one that can do pretty much anything you could think of in terms of data analysis or controlling your entire house, and of course if you did own an Iron Man suit, it could control that as well. This is the first piece of information that I'm sure most people would have missed, because his bio has changed and he doesn't work at OpenAI anymore.

There was also another piece of information that I found on Reddit. Someone who previously worked at Google also stated that OpenAI will release this year the best personal assistant ever built; nothing will

### Segment 2 (05:00 - 10:00) [5:00]

compare to this thing; it will make Jarvis, Samantha, and HAL look like something from the past. Those are all fictional AI voice assistants. It also says a good function-calling model is all they need, and by now they seem to have one.

We do have to be skeptical about this tweet, but it ties into pretty much everything we've seen so far. Maybe this person has some kind of inside information, because when they tweeted this it was far before the trademark and far before the rumors started to circulate, which means this person potentially knows something more than we do. What they're basically saying is that this is going to be a personal assistant that will be the best thing ever built, and that nothing is going to compare to it at all. That means that if we do get some kind of personal assistant from OpenAI, the kind of system we're looking at is potentially going to be more advanced than Jarvis, Samantha, and HAL, more advanced than Siri, more advanced than those home devices we have, and it seems like we can't even begin to conceptualize what OpenAI is working on.

I think it's just going to be some kind of superpowered assistant, but I do think it's going to be very fascinating to see how they integrate this into the rest of the ChatGPT stack and how this kind of thing works. This definitely gets me pretty excited, and there's even more information you haven't seen yet that's going to get you super pumped, because there is a lot more that OpenAI employees have said about this very thing.

So even if we are skeptical of this tweet, and we think maybe this person is just building hype (I'm actually not skeptical; I do believe this statement), we can take a look at something I worked on earlier this year. I made a video, I think around a month or two ago, and I spoke about this person right here. This is a screenshot from an article on the website The Information, and it said: "Last month Ben Newhouse, an OpenAI employee who worked on computer-using agents at the startup, according to a person familiar with his role, posted on X that he was hiring for his team and building what I think could be an industry-defining zero-to-one product that leverages the latest and greatest from our upcoming models." He didn't elaborate, and of course someone else said that it would change everything.

So essentially two OpenAI employees tweeted this, and at the time I thought it had to do with agents. But what we could be seeing here is something that combines AI agents with voice generation, or something along those lines. So imagine voice generation along the lines of Siri: a personal assistant that is a complete AI agent and comes with some incredible voice software. The reason I think this is going to be true is that AI agents are essentially autonomous personal assistants that do things on your behalf, and we haven't really seen one that is really good yet. We've seen some in the media, like the Rabbit one; we've seen that Open Interpreter was literally just released; we've seen Humane's AI Pin. But apparently this is going to be even better, and considering OpenAI's history of building great products, this is something I'm really excited about.

You can see that this article isn't completely fabricated. Ben Newhouse said: "I'm hiring at OpenAI. We're building what I think could be an industry-defining zero-to-one product that leverages the greatest from our upcoming models," and if you like product, deep technical challenges, "my DMs are open." So this person is showing us what they are building, and this was tweeted earlier this year, on January the 25th, 2024. I did retweet this, and I did think it was about agents, and I don't think I'm wrong. I think this is going to be something that combines agents with the lifelike, human-like nature of these systems and then possibly releases it in a very fascinating way that helps us do a lot of household tasks, or a lot of annoying things like checking emails and responding to them.

What you can also see here is the additional tweet. This is Peter Welinder, the VP of product at OpenAI, and he says that this product will change everything. Him quote-tweeting that and stating that this product will change everything means that it is likely to be a tremendous change from what we are used to, because while voice assistants are quite good, I think there is a lot of room for things to change. We saw how viral Rabbit went and how viral Humane's AI Pin went, so one of the things I'm considering is that OpenAI might actually be leaning into a physical product. I don't know if that is going to be the case, but we do know it's a possibility. I'm leaning more towards the side that they're not going to do this, because I think OpenAI is by default a software company and they mainly like to focus on software. But that doesn't mean they can't do this, because OpenAI has already had talks with some people at Apple to build some kind of product, and we do know that this

### Segment 3 (10:00 - 15:00) [10:00]

is something that many companies have already gone to market with. I've already ordered a few products, and I really cannot wait for them to arrive.

If you're wondering what on earth you're looking at, this is Alexa, Amazon's AI device, where you can speak to the AI and control your home devices. It's really cool, but the reason I say a physical product is that what people often like to do, if they have something like an Amazon Alexa or a speaker they can talk to, is just yell commands, have it recognize them, and then have it complete the task. Since OpenAI has its Whisper software, which is automatic speech recognition and the very best in the game, I would not be surprised if they released some kind of small product.

The reason I said they might release a product in the future is that products take quite a long time to get done if they are done really well, and it takes a long time if you're trying to get things really perfect. Considering OpenAI's track record, I think they have a kind of Apple-esque approach of making things more perfect than they originally need to be, and I think that will take a little longer than we may originally think. I do think they might just release some really cool software first. They could just release the voice software, and I think one of the features that is going to be really cool is the human-like nature of the speech.

Many people didn't pay attention to this, but if we go back and look at this video right here, listen to the voices. If you're not sure what this is, it's Figure 01, the robot that was using OpenAI's software, and it sounded really, really good. I'm going to play a small part of the clip, and when you hear it you're going to be like, wow, okay, that was really, really good.

> On it. So I gave you the apple because it's the only, uh, edible item I could provide you with from the table.
>
> I think I did pretty well. The apple found its new owner, the trash is gone, and the tableware is right where it belongs.
>
> I agree. Thanks, Figure 01.

I think that goes to show us, considering that this is using OpenAI's technology on the back end, that whatever voice software we get in the future is going to sound remarkably realistic. We already know that this is being done by ElevenLabs. If you aren't familiar with ElevenLabs, it's AI software that allows you to clone not only your own voice but other people's voices, and it allows you to create custom, realistic, human-sounding voices from simple text prompts.

I've titled this "Coming for the Full Stack" because I believe that right now, if you are a company in AI, OpenAI is pretty much coming for every single company out there with regard to anything related to artificial intelligence along any of the big five main, I guess you could say, systems: for example text, image, video, and audio. I think OpenAI is truly trying to be state-of-the-art in all of these. I don't think they just want LLMs; I think they want agents, audio, and video, and considering how realistic their previous voice sounded, that is not surprising.

Something else that is really fascinating is that there was a recent trend on Twitter after the Figure demo. People started to realize that this thing sounds super realistic, and they started to use the ChatGPT app. So, for example, if we go to the ChatGPT app, like Siri, you're going to listen to this live. This is a live demo, and I'm going to say: "Hey ChatGPT, can you give me a marketing plan for a YouTube channel that is about AI? But I want you to make your response sound as human as possible, so please use ums and ahs and filler words so that you sound as realistic as possible." And then once it gives me my response, it sounds very realistic.

> ...timely, with how fast tech is moving these days. All right, um, first things first, you want to, uh, define your target audience, right? Are they, like, tech enthusiasts, students, professionals? Knowing this helps a ton with, um, tailoring your content. Now, for the content itself, you're going to want to, uh, mix it up a bit, maybe throwing in some tutorials, deep dives into AI news, interviews with, you know, experts in the field. And oh, don't forget, um, share snippets or, like, teasers of your videos on, let's say, Twitter, LinkedIn, or even Instagram to, you know, draw more viewers in. And, um, I guess collaboration is pretty huge on YouTube. Maybe reach out to other content creators for, you know, guest appearances or co-hosted videos. It's a great way to, uh, get exposed to each other's audiences. Lastly, um, SEO: make sure to, you know, optimize your video titles, descriptions, and, uh, tags with relevant keywords. It helps a ton with, um, getting your videos to show up

### Segment 4 (15:00 - 19:00) [15:00]

> in searches. So yeah, I think that's, um, kind of the gist of it. Just, you know, keep your content fresh, engage with your audience, and, uh, use all the tools at your disposal to, like, spread the word. And, uh, mistakes are totally okay; they make you seem more, um, relatable, you know? Just, uh, keep at it and you'll, um, find your groove.

So right there you heard just how crazy OpenAI's standard, state-of-the-art voice software is. That's what it sounds like now, and that's them just giving out their very basic voice software. I was actually quite surprised there because it sounded super realistic, and I think that goes to show how crazy the future product will sound and how insane this product could be.

So, option one: we could be getting some kind of physical product. I think OpenAI might opt for this route, and I say might because, number one, they don't want competitors to surpass them, and number two, if they do a physical product and it sells a lot, they could become a really profitable company. Let's say they price these devices as some kind of premium product, and let's say it works with all of the ChatGPT stuff: they could become a really big company. I'm talking about $500 billion or a trillion dollars, just like Apple, because Apple makes so much money off of its hardware, its branding, and its entire stack, where it's able to rake in the profits because of the way it markets its products and the fact that it led the way in terms of style. I think if OpenAI does go down the physical product route and decides to sell consumer products, that would be a huge thing for them, because they already have wide brand recognition: it was the fastest-growing consumer product, the fastest-growing social-media kind of thing, and I think that would be something crazy in the future.

Now, I don't think they will do this now, but in the future. They have talked about this already, and for those of you who don't think this is a reality, you can see here that Bloomberg reported: "Apple's iPhone Design Chief Enlisted by Jony Ive, Sam Altman to Work on AI Devices." So you can see that they are joining OpenAI to work on AI devices. It says here: "Legendary designer Jony Ive and Sam Altman are enlisting an Apple Inc. veteran to work on a new artificial intelligence hardware product, aiming to create devices with the latest capabilities. As part of the effort, outgoing Apple executive Tang Tan will join Ive's design firm LoveFrom, which will shape the look and capabilities of new products, according to people familiar with the matter. Altman, an executive who has become the face of modern AI, plans to provide the software underpinnings, said the people, who asked not to be identified because the..."

So it seems very likely, since OpenAI is recruiting people to design some kind of AI device, that a physical product is not out of the picture. We could be getting some kind of device in the future, or the first thing we could be getting is some kind of major voice update. I'm not sure if this is going to be similar to the Google Home or the Amazon Alexa, but either way this trademark gives us further insight into the fact that OpenAI is going to be releasing something later this year. The fact that Andrej Karpathy said "building a kind of Jarvis at OpenAI", the fact that some OpenAI employees have said this is going to make certain other things obsolete and absolutely change everything, and the VP of product saying that this product will literally change everything: that is pretty crazy when you think about it.

So I think we are at a very interesting point right now, and if we also look at the capabilities (multilingual speech recognition, translation, transcription, generation of audio, basically a digital voice assistant), I think this is going to be absolutely incredible. Let me know what you think about this. Do you think this is just a rumor? I think this is going to be the next big thing. It might be the biggest product of 2024, considering how far behind Siri and Amazon are, since they haven't updated their systems. I have no idea why they haven't yet; they could be doing incredible things, but I guess some companies are asleep at the wheel, and we're going to have to wait and see exactly what happens with this technology.
