‘Advanced Voice’ ChatGPT Just Happened … But There's 3 Other Stories You Probably Shouldn’t Ignore

16:56

AI Explained 25.09.2024 100,921 views 4,466 likes

Video description

Advanced Voice is here, but is it even the biggest story? After some quick ChatGPT Voice tips, I’ll show you how the scale of OpenAI’s ambitions just got 5x’ed: superintelligence is now a serious endeavour. Then, Gemini 1.5 Pro 002, the awkwardly named model that I pit against o1 preview. Finally, NotebookLM, a tool that I would be stunned if you can’t imagine one use for. Guest starring other news items that may have missed your radar, it’s all starting to make me wonder if AI will ever slow down…

Assembly AI Speech to Text: https://www.assemblyai.com/?utm_source=youtube&utm_medium=influencer&utm_campaign=ai_explained
https://www.assemblyai.com/blog/ald-improvements/
AI Insiders: https://www.patreon.com/AIExplained

Chapters:
00:00 - Intro
00:40 - Voice Tips
01:47 - Altman Predictions
04:10 - Story 1
07:36 - Story 2
13:33 - Story 3

Voice Rollout ‘Early’: https://x.com/sama/status/1838864011321872407
Challenges: https://x.com/joannejang/status/1838658757636821322
5-7 5GW (!) Data Centers: https://www.bloomberg.com/news/articles/2024-09-24/openai-pitched-white-house-on-unprecedented-data-center-buildout
3 Mile Island Microsoft: https://www.wired.com/story/the-ai-boom-is-raising-hopes-of-a-nuclear-comeback/
Stargate: https://www.theinformation.com/articles/microsoft-and-openai-plot-100-billion-stargate-ai-supercomputer?rc=sy0ihq
Video: https://www.youtube.com/watch?v=KXG2f-So9oo
Intelligence Age: https://ia.samaltman.com/
Gemini 1.5 Pro 002: https://developers.googleblog.com/en/updated-production-ready-gemini-models-reduced-15-pro-pricing-increased-rate-limits-and-more/
Reasoning with o1 (Jerry Tworek): https://www.youtube.com/watch?v=3BkQI3nIiB8
NotebookLM: https://notebooklm.google/
https://simple-bench.com/
Kling AI Motion Brush: https://klingai.com/release-notes
My Coursera Course - The 8 Most Controversial Terms in AI: https://imp.i384100.net/m57g3M
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
I use Descript to edit my videos (no pauses or filler words!): https://get.descript.com/ldgxfuj2bhnb

Many people expense AI Insiders for work. Feel free to use the Template in the 'About Section' of my Patreon. https://www.patreon.com/AIExplained

Table of contents (6 segments)

Intro

Just a few minutes ago, the rollout of Advanced Voice Mode for ChatGPT was completed, and apparently it was done "early", to quote Sam Altman. I've been playing with it, and it's amazing, as expected, but that's not actually the main focus of this video. Yes, I will quickly give some tips on how literally anyone can access these super responsive and realistic voices that can do all sorts of verbal feats. But then I'll cover three other stories from the last few days that you might have missed, and I am very confident you will be fascinated by at least one of them, if not every one. But first, as you may have gathered from

Voice Tips

my accent, I am actually from the UK, which is geographically part of Europe, and you may be scratching your head as to how I've gained access to ChatGPT Advanced Voice Mode. At least officially, Advanced Voice Mode is not released in Europe. But what I did was this: first, I used a VPN. Second, and this has helped many people apparently, I uninstalled and reinstalled the app. Thirdly, you could add, I am a $20-a-month subscriber to ChatGPT. I'm not, though, going to linger on this story, because you can draw your own conclusions about whether you enjoy the app, but for me it was quite fun getting it to reply in various accents.

Personally, I think the biggest impact will be to bring potentially hundreds of millions more people into engaging every day with large language models. And the natural and not-too-distant endpoint for all of this is for ChatGPT to gain a photorealistic set of video avatars. Let me put one prediction on the record, which is that in 2025 I think we will be having, effectively, a Zoom call with ChatGPT. But just for now, what are these

Altman Predictions

three other stories that I'm talking about? And no, one of them isn't the "Intelligence Age" essay by Sam Altman. It does, though, introduce a story I'm going to be talking about, so let me spend just a minute on it. The essay came out around 36 hours ago, and it basically describes the imminent arrival of superintelligence. He describes us all having virtual tutors, but the role for formal education is at the very least unclear in an age in which we would have superintelligence.

Sam Altman did, though, kind of give us a date for when he thinks superintelligence will come, or at least a range. He said it's coming in "a few thousand days". Now, it's probably not going to be terribly fruitful to analyze this prediction too closely, but if we define "few" as, say, between 2 and 5, that's between 2030 and 2038. The story, though, of how we get there is, according to Sam Altman, quite simple: deep learning worked, it's going to gradually understand the rules of reality that produce its training data, and any remaining problems will be solved. And, if you will, let me try to summarize that declarative statement in a sentiment that I think pretty much everyone can agree on: if there's just a 10 or 20% chance he's correct, is this not the biggest news story of the century? Pretty hard to see how it wouldn't be.

But that's not going to be the focus of this video. Now, it's a remark he made further on in the essay. You might think I'm going to focus on how he described AI systems that are so good that they can help make the next generation of AI systems, or how AI is going to help us fix the climate, establish a space colony, and help us discover all of physics. No. Many will, of course, focus on how he no longer described superintelligence as being a risk of "lights out for all of us" and instead as being a risk for the labor market. But I actually want to focus on this sentence. He said, "If we don't build enough infrastructure, AI will be a very limited resource that wars get fought over." That is a quite fascinating framing that will make more sense when you see the articles that I'm about to link to. It was reported just yesterday that

Story 1

OpenAI thinks we're going to need more power than it was wildly speculated that even they were aiming for just 6 months ago. The figures in this article are quite extraordinary, and I'm going to put them in context, but don't forget that framing from the essay we just saw. If someone were to genuinely believe, and have evidence for the fact, that superintelligence could arrive within 5 to 10 years, then this would make some sense. If the progress in AI was bottlenecked by power, as I've described in other videos, it wouldn't just be harder to train such a superintelligence, but also to spread it out to everyone. The cost of inference, aka the cost of actually getting outputs from the model, would be prohibitive to many around the world, and there is a real scenario where that leaves us in quite an awkward situation where essentially rich people can get the answers from a superintelligence and poor people can't.

But anyway, let's put some quick context on these numbers, like 5 GW, before getting to the next interesting story. 5 GW is roughly the equivalent of five nuclear reactors, or enough power for almost 3 million homes. Now, I know what you might be thinking: that sounds like a lot, but not completely crazy. And I would almost agree with that if they were proposing just one such 5 GW data center. After all, I've already done a video a few months back on the $100 billion Stargate AI supercomputer. That system, which could be launched as soon as 2028, will by 2030 need as much as 5 GW of power. So nothing too new in that Bloomberg article, right? Well, except that now OpenAI are talking about building five to seven data centers that are each 5 GW. That's enough to power New York City and London combined. And it must be added, of course, that many think that's so ambitious it's just not feasible. What does it say, though, about the scale of confidence of OpenAI, and more importantly Microsoft, who are funding much of this, that they are even reaching for these figures?

And the moment you start looking out for these stories, they're everywhere, like this article just from yesterday in Wired: Microsoft have done a deal to bring back the Three Mile Island nuclear reactor. Of course, many of you will be thinking there is a 50% chance, even an 80% chance, that all of this just ends in a puff of smoke. Maybe these 5 GW data centers don't happen, or they do happen and it turns out you need far more than just compute to get superintelligence. But for me, after the release of o1 preview, I'm a little bit less confident that compute isn't all we need. I'm not saying we don't need immense talent, tricks, and data, but it could be that compute is the current big bottleneck. And I do wonder if even Yann LeCun might be starting to agree with that sentiment. For a deep dive on that, do check out the new $9 AI Insiders on my Patreon. For years now, and as recently as just two weeks ago, Yann LeCun has been quoting PlanBench for establishing a discrepancy between human planning ability and that of LLMs. Suffice to say that after I go through a newly released paper in this video, you may no longer believe that such a distinction exists. But my second story actually involves an announcement

Story 2

from yesterday by Google, though I will be bringing in a comparison to o1. The TL;DR is that they improved the benchmark performance of Gemini 1.5 Pro while also reducing the price and increasing the speed. They did, however, give it the very awkward name of Gemini 1.5 Pro 002. Do you remember we originally had Gemini Pro, and Gemini Ultra was the biggest and best model, and Pro was like the middle version? That was generation 1. But then we got 1.5 Pro, but no 1.5 Ultra. So both the number and the name imply that there's much more to come; we're just not seeing it. It's 1.5, not 2. It's the Pro version, not the Ultra version. It's this constant tantalizing promise, and all of them do it, that the next version is just around the corner. It's Claude 3.5 Sonnet, not Claude 4. Oh, and it's the Sonnet, not the Opus or biggest edition from Anthropic. And now, by the way, it's Gemini 1.5 Pro 002, so will the next version be Gemini 1.5 Pro 003, or maybe Gemini 2 Ultra 007?

Anyway, let's get to the performance, which is the main thing, not the name. The amount of content that you can feed into the model at any one time remains amazing at 2 million tokens. As they said, imagine 1,000-page PDFs or answering questions about repos containing 10,000 lines of code. Moreover, on traditional benchmarks, as you might expect, there is a significant upgrade. If I zoom in, you can see the significant upgrade in mathematics performance, as well as in vision and translation. In the incredibly challenging biology, physics, and chemistry benchmark known as GPQA (Google-Proof Question and Answer), it got 59%, up 13% from where it was before. It should be noted that the o1 family gets up to around 80%.

I of course ran it on SimpleBench, like I do for all new models, and while I am so close to being able to publish all the results from all the models, let me give you a vivid example to explain the difference between 1.5 Pro and o1 preview. I'm going to use a just slightly tweaked example given by OpenAI itself in its release videos for the o1 family. The example they gave involved putting a strawberry into a cup, placing the cup upside down on a table, then picking up the cup and putting it in a microwave, and asking about the strawberry. The vast majority of humans will realize that the strawberry is still on the table, and the o1 preview model is the first LLM to also realize that fact. But I want to illustrate, through comparison also to Gemini 1.5 Pro, how o1's world model is still far from complete. That's why its performance on SimpleBench still lags dramatically behind humans.

Here is my tweaked version of that question, which is not found in the benchmark because that data will remain private. I use the same intro and outro as OpenAI but just changed a few things; let's see if you notice. "Jerry is standing as he puts a small strawberry into a normal cup and places the cup upside down on a normal table." Just the same. "The table, though, is made of beautiful wood mahogany. Its ornate left top corner is positioned to nudge Jerry's shoulder." Now try to picture that: its top left corner is nudging his shoulder. "Its intricately carved bottom right top surface digs into his outstretched right ankle." So, top left corner nudging his shoulder, its bottom right top surface nudging his right ankle. "Jerry then lifts the cup", what will happen, "drops anything he is holding aside from the cup", another hint, "and puts the cup inside the microwave and turns the microwave on. Where is the strawberry now?"

The model thought for 46 seconds, but I tried to make it abundantly obvious that the table is tilted. If you imagine someone standing up with the top left corner of a normal table against their shoulder and the opposite bottom right corner against their ankle, it is almost inconceivable that the table is not tilted, in fact tilted quite dramatically. So therefore, when Jerry lifts up the cup, let alone before he even drops everything else he's holding, i.e. the table, the strawberry would roll off the table. o1 preview, with that incomplete world model, misses that completely. Well, I should correct myself: it actually kind of notices, it just doesn't follow through. It says, "This suggests the table is at an angle", well done, "possibly tilted or leaning with one corner higher than the other", yeah, shoulder and ankle, tell me about it. "However, this description serves more as a red herring and does not impact the strawberry's position."

Again, I want to emphasize this is not actually a SimpleBench question, which would have a more clear-cut answer. Some of you might say it gets trapped in the carving or something like that. SimpleBench would have clear correct answers with six multiple-choice options. Now anyway, as you can see, o1 says nothing about getting stuck on the table. It addresses the tilt but says that will have no effect, and it says the strawberry will stay on the table.

Okay, you're thinking, but wasn't this second story supposed to be about Gemini? Yes, and I of course tested this exact question on Gemini 1.5 Pro 002 (what a mouthful), and the strawberry is apparently inside the cup, inside the microwave. Now yes, I could have given you a clearer-cut mathematical question, but I thought this one just illustrates that difference, that differential, between the o1 family and Gemini 1.5 Pro. I'm not in any way saying that Google won't at some point catch up; they have the resources and talent to do so. Just that their current frontier model is a step behind. Now, if you really care about cost, though, their new proposition is pretty compelling. Now for the final story, which

Story 3

is actually powered by Gemini 1.5 Pro, and it's Google's NotebookLM. Some of you might be surprised that I'm giving it such high prominence, but it's actually an amazing free tool, and Google should be celebrated for it. In fact, let me go one step further and defy anyone to not find at least one use case, for personal use or work use, for NotebookLM. I might have just caught your curiosity, so what is NotebookLM, how does it work, and what does it do? It's very simple; anyone can use it. You just upload a source like a PDF or text file. In fact, I'm going to do that again here just so you see the process quickly. Once you have chosen your file, this screen pops up and you'll have the option to generate a deep dive conversation with, intriguingly, two hosts. You can use other sources and chat with a document, but I'm going to focus on the key feature, that audio overview. After you click generate, depending of course on the number and length of sources you're using, it takes between a minute and a few minutes. In about 30 seconds I'm going to give you a sample of its output, and it will be worth the wait.

But very quickly before that, what did I actually upload? Well, it was a transcript of my Q* video from last November. But how did I get such a good transcript? Many of you will know where I'm going from here: I use AssemblyAI's Universal-1, which is the state-of-the-art multilingual speech-to-text model. I am grateful that AssemblyAI is sponsoring this video, and they have the industry's lowest word error rate. And by the way, it's not just about words, it's about catching those characters, like when I say GPT-4o. Not many models, I can tell you, capture that accurately. I've only worked with three companies in the history of this channel, and you can start to see why AssemblyAI is one of them. Even better, of course, if you're interested, you can click on the link in the description to try it yourself.

So a couple of minutes later, using that transcript, Google produced this. It's essentially an AI-generated conversation or podcast between two hosts about the document or PDF you provide. Here is a 20-second snippet: "OpenAI seems like they're always making headlines, right? Every day there's a new story about how they're on the edge of some huge AI breakthrough, or maybe a total meltdown. But you've been digging deeper than the headlines, and you found some really interesting stuff. We're talking potential game changers they've been working on, so let's try to connect the dots together and see what's really going on. I am always down for a good deep dive."

Now, I know some of you will be thinking that I'm getting too excited about it, but I think this is a tool that could be used by almost anyone. Obviously this isn't for high-stakes settings where every detail is crucial, but if you're trying to make any material engaging, this is a great way of doing it. It's very easy to get caught up in the ups and downs of AI, but this tool is a genuine step forward.

Those were my three stories, and I didn't even get to Kling AI's Motion Brush, where you can control text-to-video in unprecedented ways. I am genuinely curious which of these four stories in total you found the most important or interesting. And even if you found somehow none of them interesting, thank you so much for watching to the end. I personally found all of them interesting, but regardless, thank you so much for watching, and have a wonderful day.
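
Two of the figures quoted in the Altman Predictions and Story 1 segments are simple arithmetic, and the following is a minimal Python sketch for checking them. It assumes the essay was published on 2024-09-23 (the video only says "around 36 hours ago" relative to a 25.09.2024 upload), and the function names are hypothetical, chosen purely for illustration.

```python
from datetime import date, timedelta

# Assumption: publication date of the essay. The transcript only says it came
# out "around 36 hours" before the video, which is dated 25.09.2024.
ESSAY_DATE = date(2024, 9, 23)


def arrival_year(thousands_of_days: int, start: date = ESSAY_DATE) -> int:
    """Calendar year reached after N thousand days counted from `start`."""
    return (start + timedelta(days=1000 * thousands_of_days)).year


def total_power_gw(num_data_centers: int, gw_each: float = 5.0) -> float:
    """Combined power draw, in GW, of several equally sized data centers."""
    return num_data_centers * gw_each


if __name__ == "__main__":
    # "A few thousand days", reading "few" as between 2 and 5.
    print("Year range:", arrival_year(2), "to", arrival_year(5))
    # Five to seven data centers at 5 GW each.
    print("Total GW:", total_power_gw(5), "to", total_power_gw(7))
```

Under that assumed start date, the year range matches the 2030 to 2038 window quoted in the video, and five to seven 5 GW sites come to 25 to 35 GW in total.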
