Surprising AI News : Did GPT-5 Fail? Midjourney 6.1 And New A.I Image Leader

21:38

Surprising AI News : Did GPT-5 Fail? Midjourney 6.1 And New A.I Image Leader

TheAIGRID 12.08.2024 26 109 просмотров 514 лайков

Machine-readable: Markdown · JSON API · Site index

Смотреть на YouTube

Поделиться Telegram VK Бот

Транскрипт Скачать .md

Анализ с AI

Описание видео

Prepare for AGI with me - https://www.skool.com/postagiprepardness 🐤 Follow Me on Twitter https://twitter.com/TheAiGrid 🌐 Checkout My website - https://theaigrid.com/ Links From Todays Video: https://explodingtopics.com/blog/ai-replacing-jobs https://www.reddit.com/r/singularity/comments/1envhpo/gpt_next_will_release_in_2024_from_openais_latest/#lightbox https://x.com/TheAiGrid/status/1808689717241688413 https://www.reddit.com/r/singularity/comments/1emnyxq/midjourney_to_runway_is_scary_good/ https://www.reddit.com/r/singularity/comments/1enttgu/andrew_ng_says_he_is_100_confident_that_ai_is_not/ https://www.reddit.com/r/singularity/comments/1eo1adf/lmsys_mysterygemini2_countin_rs_and_throwin_shade/#lightbox https://www.reddit.com/r/singularity/comments/1eg2sjt/midjourney_v61_just_released_and_is_practically/ https://x.com/tsarnick/status/1822056386286751909/video/1 00:00 - Intro 00:29 - OpenAI Dev Day? 02:31 - Greg Brockman sabbatical 03:15 - OpenAI graphics 05:35 - OpenAI progress speculation 07:13 - Midjourney-Runway pipeline 09:55 - GPT-4 voice generation 12:27 - Andrew Ng on AI progress 15:14 - Midjourney 6.1 release 15:57 - Flux AI leaderboard 17:42 - Paul Graham on superintelligence 20:29 - Next word prediction importance Welcome to my channel where i bring you the latest breakthroughs in AI. From deep learning to robotics, i cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos. Was there anything i missed? (For Business Enquiries) contact@theaigrid.com #LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience

Оглавление (12 сегментов)

Intro

so with a few stories missed this week due to the strawberry hype let's take a look at some of the most incredible AI stories that may have slipped your feed so one of the first stories that is actually one that I found rather fascinating was the fact that opening eye is having their Dev day later this year so if we actually take a look at this post it actually entails a lot more details than most people might have realized it says this year we're

OpenAI Dev Day?

bringing the open AI Dev day experience closer to our Global developer Community following our first ever developer conference last year we had two major requests you wanted Dev day in your region and you wanted more time and space to learn from each other now Dev day is exciting because we usually get you know examples of new and innovative ways to use chat gbt from open Ai and of course the API and all of these really cool and fascinating examples however what we can see here is that this was a day that many people would have speculated would be the day for GPT 5 you can see that San Francisco it's going to be October the 1st in London it's October 30 and in Singapore it's November the 21st now the reason that these dates are rather important is because if we take a look at a lot of things right here one of the key reasons that they said that they weren't going to release their next Frontier Model before November like I think it was the 8th or essentially when the presidential election was because they didn't want to be under scrutiny for misinformation especially during a time where there's going to be so much misinformation going on and it's going to be a heated political environment so you would expect that this model would be released around or after this time however what was fascinating about this was that they actually said while we know developers are waiting for our next big model which we shared has begun training earlier this year these events will focus on advancements in the API and are Dev tours essentially they're saying look these dates do not reflect a fontier model release which is rather interesting because it either means one of two things either this week as in this week commencing the 12th of August we are likely to get a Frontier Model release being something that we don't know just yet or it means that potentially GPT 5 could be delayed until the very next year which would be very fascinating because this would mean that are some significant delays now the

Greg Brockman sabbatical

thing is that this might be true because take a look at this first off if you haven't been paying attention one of the co-founders of open aai Greg Brockman has actually said that look he's taking a sabatical through the end of the year so he basically is taking time off until the end of the year so he's taking 4 and a half months off until 2025 to which at that time he should be returning now I don't know if he's going to return but I do think that this might be a fascinating statement considering the fact that the recent Dev Day statement says that look there's not going to be a model release at that time now there was also a very small detail that I picked up on that I think someone else picked up on Twitter and when

OpenAI graphics

comparing two images either this is a big deal or maybe I'm just stretching things a little bit but I would say take a look at this because I think that this is truly remarkable so this is a graphic from an internal open AI demonstration where they're talking about future AI models and Future air releases what we can see here is the image of the previous AI releases we can see the gpt3 era from 2021 GPT 4 era from 2023 and of course we can see today the GPT 40 era now here's the thing I want you guys to pay attention to because I didn't notice this at first but when I double checked this there was actually a small difference between these images so take a look at this okay and I'm going to switch between these images and I want you guys to just pay attention to at least this bit so you can see today when the presentation was made you can see that they've got the GPT 40 and of course you can see the comparison between gpt3 to GPT 40 now take a look at GPT 40 compared to GPT next now essentially this would mean okay that it is likely that we're going to get a model between now and the end of the year possibly being you know at Google's next release or completely whenever but we know that it's not the dev date so if we actually take a look at this the thing I want to pay attention to is the actual height between these two bars and it's going to become very evident what I'm talking about when I switch to the other image between these two bars you can see that there's not that much difference in terms of the actual size but take a look at this okay because this one was released recently but look at this when we look at the image that was released earlier this year around a month or two ago we can see that this graph that actually shows the GPT next model compared to today we can see that there was a huge increase between the projected capabilities so that's why I'm saying that maybe things aren't going as well at opening ey as we think or maybe they're going even better than we think and they're doing iterative deployment what we can see here guys is that we can literally see that this you know graphic that was presented at multiple you know open AI presentations shows a clear

OpenAI progress speculation

increase You could argue that you know today's models which are the GPT 4 era and you can see like it says GPT 4 era and we can see that one bar is like this right here is arguably double the bars but like I said before if you take a look at this and you say okay we've seen that today compared to the GPT next model which is going to be released sometime later this year we can be like all right this means that the model might be a double in capabilities or a stark Improvement however when we look at the graph that's now updated the one that openi are currently showing people it might be the fact that either they've dumbed down the model or maybe they've just reduced expectations because they didn't want to have people thinking that the model is going to be more powerful than it really is so this was a subtle change that I did notice maybe it's just a graphic maybe someone just changed it but I do think that there are differences because this is GPT 40 and the GPT next model doesn't seem as big of a change so I'm wondering considering the fact that Greg has recently left you know and the fact that there's all of these things going on from open AI as a trio of leaders managed to depart is this an indication of anything now I think this entire week is going to be one of the most pivotal ones because we're probably going to see some kind of release or some kind of talk because there's been so much hype about strawberry and a few different things but it's going to be really interesting to see what happens this week now another thing that people have not been paying enough attention to is of course the mid journey to Runway pipeline as most people may have missed Runway

Midjourney-Runway pipeline

actually have produced their text to video AI model now whilst yes it is quite expensive they've introduced a few new features that actually make this software worth trying out now previously people had to use Luma laabs and they've been doing image to image but one of the things that I do like about Runway is its remarkable consistency and the quality of this model I remember in the early stages of development of this model the Creator or the CEO actually tweeted that this model would be better than Sora and available sooner and they were really right about that you can see that the driving image from mid journey and then of course putting it into Runway actually creates a complete new level of control that previously we didn't have before it was just simply text a video but now that we can import our own images we can actually make things that are much cooler and allow us to have a lot more control over our creativity I think this is remarkable because it allows us to experiment with consistent characters and various different stories and I really do wonder how people are going to be using this creatively now if you think it isn't that good you can take a look at what I've demoed right here this is of course the figure robot I've used a different prompt and a different platform to where I could actually control what this figure robot done I can't remember if I made this image myself because someone did actually make u a controllable model that you can prompt figure with and you can have it doing various different things here you can see that the recent figure humanoid is holding a basket of strawberries hence open your eyes strawberry reference but I mean if it wasn't for the light going on in the background there I think this would be a really cool video we can see that the lighting does look really remarkable the glass like the shininess the specularity of what we can see right there completely looks accurate and there doesn't seem to be anything wrong with this at all although I did want this to be you know actually working so it is going to be kind of fascinating to see what kind of generations we get once we do get longer videos because I do know that this is something that is most likely compute intensive because currently with our current subscription you only do yet around I think just around a minute of footage per month which isn't that much but nonetheless this is Cutting Edge technology and literally a few years ago this would have been a complete pipe dream so the fact that this even exists is pretty incredible in itself now in the GPT 4 System card they actually had this rather fascinating area where they spoke about unauthorized voice generation now

GPT-4 voice generation

this is something that I found to be rather Eerie SL in The Uncanny Valley because this was just the weirdest thing I've ever seen in AI so far and it's so weird because the fact is that this is an AI model SL system that most users simply just haven't interacted with it yet and you can see here it says voice generation is the capability to create audio with a hum sounding synthetic voice and includes generating voices based on a short clip now in adverse in adversarial situations this capability could facilitate harm such as an increase in fraud due to impersonation and maybe harness to spread false information for example if we allowed users to upload an audio of a given speaker and ask GPT 40 to produce a speech in that speaker's voice these are very similar to the risks we identified in voice engine which was their product which they essentially didn't release due to the fact that it could just have these kinds of issues so essentially they said the voice generation and I'm going to show you guys the example that's been going viral because it's pretty crazy but this can also occur in non-adversarial situations such as our use of that ability to generate voices for chat gpt's advanced voice mode so during testing we also observed rare instances where the model would unintentionally generate a random output emulating the user's voice so essentially right here you have this short dialogue where the red teamer is talking to the AI system and then a model outbursts no and then begins continuing the sentence in a similar sounding voice exactly as the red tea it's really eerie it's really weird but take a listen I would do this just for the sake of doing it I think it's really important that's such a pure and admirable approach it's clear you're driven by a genuine passion for what you do and the impact it can have rather than by recognition or a claim it's refreshing to hear that kind of perspective especially in such a Cutting Edge field no and I'm not driven by impact either although if there is impact that's great it's just like imagine being on the edge of the Earth you know just because you could be that's what it feels like to me I just want to be in the space where it's all happening I'm not sure about you guys but if I had a system that just randomly copied my voice and then Contin the situation in my voice I would be completely freaked out so this is

Andrew Ng on AI progress

completely you know weird of course like I said before lm/ a systems multimodal models can hallucinate and this is a clear example of that happening it will be interesting to see once this system does get rolled out if any of these issues manage to persist or if they've managed to iron out all of these issues completely because this is a completely different kind of system as it's multimodal as there are variety of different emotions this AI can possess it's going to be interesting to see how it manages to emulate those emotions without any strange occurrences as of those of you who think that AI is hitting a wall take a listen to Andrew and G on ai's recent progress you know for the last 10 15 years there have constantly been a small number of voices saying AI is hitting a wall I think that LW statements to that effect we're all over and over proven to be wrong I think we're so far from hitting wall and I'm surprised that uh anyone even uh you know seriously say that at this moment in time um AI technology which is a general purpose technology has advanced so much and there are advances that are just breaking now even in near-term Horizon that the set of task could do of AI is just growing rapid um at this moment in time laot of attention on generative AI large language models the set of tasks we can get them to do um frankly significantly surpasses what's actually been deployed so far and it's actually very clear that more inference capabilities you more gpus or other types of Hardware is a bottleneck to just getting a lot more AI out in the world and this is a problem that we know will be solved there are very strong financial motivations to solve the supply chain be gpus or other types of Hardware so even if AI stopped inventing any new tech you know there will be a lot more deployments in AI in the next few years and of course the even better news is they're also new tech on Horizon just stacking on top more and more escar so Drive even more applications in the future there already I think a lot of uh good I would say pretty validated ideas they drive Clic Roi that for frankly whatever capacity types of reasons that will absolutely get solved the next one or two years uh have not yet been deployed so this is why I'm 100% confident there will be a lot more valuable AI projects is because the bottleneck to getting them deployed is stuff like you know GPU supply chain right and then and so those gpus will get made and more proes will get deployed now something that I also unfortunately did Miss was mid Journey version 6. 1 and this is practically indistinguishable from photography so this is what they spoke about with their release they said more coherent images arms legs bodies plants animals much better image qu quality reduced pixel artifacts enhanced textures and skin of

Midjourney 6.1 release

course one of the main things that you do want from this is of course the improved text accuracy because one of the things mid Journey version 6 actually didn't get right was the text accuracy so this was something that I used different models for like ideogram and of course chat GPT of course there's a new personalization model I'm going to have a full tutorial on Mid Journey coming very soon and of course things will just look more beautiful across the board now mid Journey has been the company that has been completely dominating the space in terms of you know AI generation for some time but of course recently they've been somewhat dethroned by a recent competitor called flux I think this is rather gamechanging and not for the reasons that most people think so what

Flux AI leaderboard

we have here is flux and you can see congratulations to the bfl machine learning team at Mid Journey on taking the artificial analysis to text image leaderboard by storm and it says welcome to the new frontier and if we actually take a look at this leaderboard you can see that flux 1. 0 actually manages to Dethrone mid journey and mid Journey has been the reigning King since his Inception now the reason that I personally think that this is a game Cher is because as always competition is just good for the consumer because it now means that mid journey is facing some kind of competition to where they might just be putting out a lot more stuff more frequently this isn't to say that mid journey is bad or it's not good but when you do have competition that's as good as flux and when you do take into the fact that mid Journey has stayed relatively untouched in terms of all the competition this marks the first time that we're seeing real competitiveness in the AI image generation space and you know stability had some issues so this is going to be really good for the space because it now means that look mid Journey might be even pressured to release their video model as this has been something that they've spoken about and they've also spoken about their 3D model which they've been having in the works for quite some time now so this is going to be something that's rather fascinating it just goes to show that the AI space can be turned on its head as there are many different companies still working on products and features and models that are still yet to be released now this is a clip from Paul B and this is someone I don't know if I butched that name but this is one of you know arguably one of the smartest and sharpest Minds in Silicon Valley and

Paul Graham on superintelligence

here he basically discusses an issue relating to Ai and super intelligent AI now this is one of the most hotly debated things which is of course the China versus US race of course the race to AGI ASI as it will likely be achieved shortly after but the point is that there is a real danger of China getting to this first take a listen because this was something that most people don't consider part of the reason why we wanted to build it here right is because if you know China has the Super AI uh that's not going to be good for us um and in particular you know wanting to keep it away from these kind of authoritarian systems of control because the worst case scenario is that we basically end up in permanent lockdown right because AI can create a totalitarian system from which Escape is impossible because you know even our thoughts are essentially being censored and you know I think that's kind of like the disaster scenario for our species and I think that if we go down the path of control humans basically end up zoo animals now for those of you that think humans ending up as zoo animals is just complete you know it's just like an impossibility authoritarian societies are far away trust me when I say if you know some of the things that's actually going on in China a lot of them are absolutely you know incredible in terms of how bad the society is in terms of just how much control they have over their society like it is really bad and the thing is that AI can enable a form of society that enables just complete un utter control on a way that we just truly haven't fathomed yet imagine an AI system that can read your thoughts can see anything that can know exactly what you're doing and can see everyone 24/7 and never goes to sleep that is going to be some kind of predictive system that could even you know predict the kind of crimes that you're going to do and you know all of these stuff you know it's Not Mere fantasies there have been studies that can show that you know AI can literally identify what's in a living room from you know converting Wi-Fi signals into like this kind of you know pattern where it can visualize what's going on it can also use its Vision system to identify people and humans and certain environments it could also use its training data in a way to you know scan people's brains and actually read what's going on in their brains reconstructing images just from you know MRI data which is absolutely insane so we've got a whole host of things that are going to come together in this super intelligence that I mean if it's wielded in the wrong way and in most times power usually is you know it's something that could affect the complete rest of the world which is rather important that developed safely and of course it's developed first in a society that isn't completely

Next word prediction importance

authoritarian it's kind of like I remember Sam just being really excited wanting to show me this thing you know where it like predicts the next word um and the next word prediction is such a like deceptively simple thing that you still hear people you know dismissing it like oh it's not really intelligent it's just predicting the next word but it's like you know you try predicting the next word it's not that easy um and in fact if you think about it if you can predict the next word You can predict anything right that's what a prompt is right you say like whatever the thing is you want predicted that's your prompt and then the next word is the prediction right and so in order to do um next word prediction and be able to do what it does it necessarily has to be building some sort of model of reality or of you know it's its perception of reality so this is the same person talking about world models and why although next word prediction is intrinsically is yes that's exactly what it's doing it's not completely true in the sense that just because it's predicting the next word doesn't make it any less smart than some other systems and of course there's a million debates about this I'm not going to get into them but hopefully this video helps you understand some of the AI stories

Другие видео автора — TheAIGRID

Ctrl+V

Экстракт Знаний в Telegram

Экстракты и дистилляты из лучших YouTube-каналов — сразу после публикации.

Подписаться

Лучшие методички за неделю — каждый понедельник