AI News: New AI Race Has Begun!

20:01

AI Master 23.03.2024 7,959 views 227 likes updated 18.02.2026

Video description

#sponsored by
🚀 Become an AI Master – All-in-one AI Learning https://aimaster.me/pro
📹 Get a Custom Promo Video From AI Master https://collab.aimaster.me/
Learn more about Akool here! https://akool.com/?via=artur

Over the last week there has been a huge amount of AI news that I'm ready to discuss with you: a presentation from NVIDIA, a global update of Bard AI from Google, Gemini with Apple AI, news from Sam Altman's OpenAI with their ChatGPT, and what Elon Musk has to do with Grok AI. We'll get to the bottom of it today. Make yourself comfortable, AI Master is here with you!

Chapters
00:00 - Hello!
00:03 - GPT-5 on the doorstep.
00:30 - GPT 5 DEMO
02:00 - Will GPT 5 be ported?
02:20 - Ultimate AI Tool...
03:25 - AGI from Sam Altman.
04:20 - Biggest Gemini AI Update.
05:30 - HUGE CONTEXT
06:01 - How Gemini Pro Architecture Works.
07:21 - Will AI now be able to analyze movies?
07:40 - Elon Musk and Grok-1
08:38 - Grok-1 User Tests
08:55 - Stable Video 3D (what?!)
11:05 - GitHub Copilot's Code Scanning
12:30 - NVIDIA turns the AI world upside down
13:00 - Real World Simulation in AI
14:40 - Apple brings Gemini AI to the iPhone
15:47 - Apple MM1 AI Model
17:04 - Secret Japanese AI Model...
19:15 - Important information (everyone needs to watch)

Contents (20 segments)

Hello!

It's been a crazy week in the world of AI, so let's get right to it.

GPT-5 on the doorstep.

There's been a ton of articles surfacing claiming that OpenAI will release GPT-5 later this year. From what I found, this leak comes from two people familiar with the company, which leaves some space for debates and doubts. However, the original source that published this news, Business Insider, has confirmed the identities of these people, so even if some of the following info is incorrect, at least it comes from valid sources. These people claim that some enterprise customers have already received demos of

GPT 5 DEMO

GPT-5, and here I'm just going to quote Business Insider, the initial source of this leak: "'It's really good, like materially better,' said one CEO who recently saw a version of GPT-5. OpenAI demonstrated the new model with use cases and data unique to his company, the CEO said. He said the company also alluded to other as-yet-unreleased capabilities of the model, including the ability to call AI agents being developed by OpenAI to perform tasks autonomously." An AI agent is basically a program that can act on its own to complete tasks or solve problems. Think of it like a superpowered virtual assistant, like Jarvis. Rabbit R1 as a device is, in my opinion, the most direct representation of what AI agents can be. Developers claim that it can book flights, manage your inbox, and do other things that require AI to act like a person and do things step by step. But back to the article: "OpenAI is still training GPT-5. After training is complete, it will be safety tested internally and further 'red teamed,' a process where employees and typically a selection of outsiders challenge the tool in various ways to find issues before it's made available to the public. There's no specific time frame when safety testing needs to be completed, one of the people familiar noted, so that the process could delay any release date." And this is a very important thing to keep in mind: AI isn't a new shirt that can just appear on the store shelves. GPT-5 is a super sophisticated system that needs thorough testing and fine-tuning, and we'll talk about fine-tuning a bit later when we discuss Musk's AI. Even if OpenAI aims to

Will GPT 5 be ported?

release GPT-5 this year, if the testing goes poorly or the issues are too difficult to fix, the release date could get pushed to 2025. But I want to be optimistic and hope that GPT-5 is just around the corner. Sometimes I think to myself when I read all this news: I wish there was one single AI platform with

Ultimate AI Tool...

all the important tools, and recently I just found one. Akool unites a ton of AI tools under one roof, providing a comfortable and convenient experience. They're sponsoring this video, and I particularly liked the face swapping tool. I can swap faces among multiple people, tweak beauty settings, and even play with age transformation. The interface is easy to use, making the process smooth and enjoyable. Or how about video translation? It supports translations into 29 languages, and the voice cloning feature makes translated audio sound like the original speaker, which is pretty neat. The lip sync is accurate, and you can edit the translated subtitles directly, which is super handy. Then there's also an image generator: you give it a simple prompt and it turns that into studio-quality visuals. Want to tweak the style or colors? No problem, this tool makes complex edits feel easy and simple. Akool is more than just an AI platform, it's a creative powerhouse. Whether you are into digital marketing or just love playing with cool tech, these tools can transform how you create. I will leave a link in the description, so be sure to check it out. What I find

AGI from Sam Altman.

really interesting is the recent interview of Sam Altman with Lex Fridman, where they talked not only about GPT-5 but also about the project Q*, AKA AGI from Sam Altman. In this interview there were a couple of interesting bits. Take a listen: "No, I think it is an amazing thing, um, but I think it kind of sucks. You know, now we have GPT-4, and I expect that the delta between 5 and 4 will be... live a few years in the future and remember that the tools we have now are going to kind of suck, looking backwards at them." In the same interview, Sam was later asked about GPT-5 and its release date, and here he was quite secretive and didn't share any specific time frames: "So, uh, when is GPT-5 coming out again? Is it... Blink twice if it's this year." "We will release an amazing model this year... we'll release over the coming months..." What could this be? Can it really be GPT-5, or will it be something as groundbreaking as Sora? I think only time

Biggest Gemini AI Update.

will tell, but I'm really optimistic here. On a side note, Google has started giving access to Gemini 1.5, and it's wild. A week ago the company published a blog post, and it has a few interesting things in it. First, the performance got better: the model now does more while using fewer resources. This is possible thanks to the new architecture called Mixture of Experts, but we won't get into that right now. "The first Gemini 1.5 model we're releasing for early testing is Gemini 1.5 Pro. It's a mid-size multimodal model, optimized for scaling across a wide range of tasks, and performs at a similar level to 1.0 Ultra, our largest model to date. It also introduces a breakthrough experimental feature in long-context understanding." To me this sounds crazy. Gemini Ultra 1.0, when it was released, was seriously wild. If you remember, Google made a series of videos highlighting its image recognition capabilities. Yeah, those videos turned out to be fake, but still, everyone I talked to was seriously impressed by Gemini Ultra 1.0. My own testing of Gemini Pro 1.0 was also quite impressive, so I'm pretty excited about this update. What's also

HUGE CONTEXT

interesting about it is that it comes with a standard 128,000-token context window. That's the same number of tokens as in GPT-4 Turbo. But hear this: Google now gives chosen developers and enterprise customers access to a much larger context window of 1 million tokens. This is astronomical, absolutely mind-boggling. 1 million tokens equals 1 hour of video, 11 hours of audio, over 30k lines of code, and over 700k

How Gemini Pro Architecture Works.

words. These numbers must impress you, because if they don't, then I don't know what else to tell you. Now let's read what this new architecture means for users and how it all works. If you remember, Gemini from the get-go was a multimodal model, which means it was trained on different types of data, and the model itself has smaller models under its supervision and control. "While a traditional Transformer functions as one large neural network, MoE models are divided into smaller 'expert' neural networks. Depending on the type of input given, MoE models learn to selectively activate only the most relevant expert pathways in its neural network. This specialization massively enhances the model's efficiency." So in simple terms, it's like your typical model but on steroids: smarter, bigger, better. The company also made a few videos showing the features of the new model, like complex reasoning. I hope these videos aren't fake or staged this time. So in one of the videos, Google shows how the model works with a 400-page file, finds funny moments, and identifies moments in the document based on an uploaded hand-drawn image. A 400-page document is a huge amount of data to process, and seeing processing times of around 20 seconds per prompt is wildly impressive. When given a 44-minute silent Buster Keaton
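
The "selectively activate only the most relevant experts" idea quoted in this segment can be sketched in a few lines of Python. This is only a toy illustration, not Gemini's actual architecture: the gating rule (softmax scores, keep the top 2), the function names `moe_forward` and `make_expert`, and the trivial "experts" that just scale their input are all assumptions made up for the example.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    """Hypothetical expert: a stand-in network that just scales its input."""
    return lambda x: [scale * v for v in x]

def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and mix their outputs."""
    # One gate score per expert: dot product of that expert's gate row with x.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)
    # Only the top_k experts run; the rest stay inactive for this input.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        weight = probs[i] / norm  # renormalise over the active experts
        y = experts[i](x)
        out = [o + weight * v for o, v in zip(out, y)]
    return out, chosen

# Four toy experts, of which only two are used for any given input.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [-1.0, 0.0]]
output, active = moe_forward([1.0, 0.0], gate_weights, experts)
print("active experts:", active, "output:", output)
```

The point of the design is that compute per input depends on `top_k`, not on the total number of experts, which is how a model can grow in parameter count without every parameter being used on every token.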

Will AI now be able to analyze movies?

movie, the model can accurately analyze various plot points and events, and even reason about small details in the movie that could easily be missed. The only problem I see with Gemini 1.5 rolling out is that it's only for developers for now. So, Google, hurry up, I want to test it all myself. Another big thing that happened is Elon Musk's xAI releasing

Elon Musk and Grok-1

Grok to the public. They published a great blog post that I think we should look at: "Base model trained on a large amount of text data, not fine-tuned for any particular task. 314 billion parameter Mixture-of-Experts model with 25% of the weights active on a given token. Trained from scratch by xAI using a custom training stack on top of JAX and Rust in October 2023." To put its size in perspective, even at 314 billion parameters it still has some catching up to do with OpenAI's GPT-4, which had 1.76 trillion parameters at last count. Also, "not fine-tuned for any particular task" doesn't sound too good, to be honest. The model's benchmarks have been around for a few months already, and its capabilities are pretty well known: in all cases it's about as good as GPT-3.5. This may not sound all that impressive, but the fact that Grok is open source now makes it the biggest open-source AI model out there. On forums, people who

Grok-1 User Tests

tried the model have been leaving not very flattering reviews. Considering how poor it is compared to other models, it really emphasizes how important fine-tuning is. Models with much smaller parameter counts are outperforming it in many metrics, so even though this is interesting news, the AI itself isn't

Stable Video 3D (what?!)

something to go crazy about. What's really interesting is that Stability AI has just released their new AI model, Stable Video 3D, that can render 3D videos from 2D images. And although their blog post is quite technical and looks more like a research paper, it gives a glimpse into what this model can really do. Stable Video 3D comes in two versions: there's SV3D_u, which generates orbital videos based on single image inputs without camera conditioning, and SV3D_p, which builds on that by accommodating both single images and orbital views, allowing for the creation of 3D video along specified camera paths. With Stable Video 3D, the devs have pushed the boundaries of what's possible with 3D generation from a single image, making SV3D not just an interesting tool but also a significant quality jump. It's kind of like that situation with Sora: before, 3D generation from an image was like that video of Will Smith eating spaghetti, but with SV3D it's like high-quality Sora videos. According to the devs, SV3D delivers greatly improved quality and multi-view when compared to the previously released Stable Zero123, as well as outperforming other open-source alternatives. One of the core advantages of SV3D is its approach to creating realistic and consistent 3D models. The model uses improved 3D optimization, leveraging the powerful capability of Stable Video 3D to generate arbitrary orbits around an object. This mumbo jumbo basically means that the generated objects are not static anymore but smoothly animated and realistic from every angle. What's even more impressive from this post is that SV3D addresses common challenges in 3D model generation, like lighting and detail in obscured areas. By using a new masked score distillation sampling loss function, SV3D somehow builds up the details and parts that aren't directly visible in the initial image. From what I see, this is huge. The model is already publicly available if you have a Stability subscription, and I think I'm going to test it eventually, so hit the like button if you want to see this and sub to the channel to not miss that video.

GitHub Copilot's Code Scanning

Another thing that happened is GitHub has launched its new code scanning autofix, AI for finding and fixing security vulnerabilities during coding. This code scanning autofix promises significant improvements in how vulnerabilities are handled. The system is capable of fixing over two-thirds of the vulnerabilities it detects, often without requiring developers to make any manual changes to the code. I assume it works in the shadows and you don't even know that it fixes your blunders. GitHub also states that its code scanning autofix will address over 90% of the types of alerts it detects, supporting languages such as JavaScript, TypeScript, Java, and Python. Yep, I did not just repeat myself: two-thirds of errors and 90% of alerts. Those who code will understand. This new tool is powered by the CodeQL engine, GitHub's tool for semantic code analysis, capable of identifying vulnerabilities even before the code runs. In this new tool CodeQL plays a crucial role, but GitHub also mentions the use of a combination of heuristics and GitHub Copilot APIs for proposing fixes. And what I find absolutely hilarious, in a good way, is that GPT-4 is responsible for fixing the errors and explaining them. This all looks quite interesting and promising. Recently we made a video about the best AI tools for developers, so be sure to check it out, and if I remember correctly, GitHub's Copilot has been really great, so I have high hopes for this code scanning autofix. Nvidia

NVIDIA turns the AI world upside down

has just announced GR00T, its new general-purpose foundation model for humanoid robots. This platform will be used by many leading robot makers, including familiar names like 1X Technologies, Agility Robotics, Apptronik, Boston Dynamics, Figure AI, and more. That covers nearly every prominent humanoid robot maker at the moment, with a few notable exceptions like Tesla. Project GR00T isn't just software: Nvidia is also significantly upgrading its existing Isaac robotics platform. These upgrades

Real World Simulation in AI

include generative AI foundation models, which will allow robots to learn and adapt faster. Not entirely sure if I'm fully on board with that. The platform will also include tools for simulating real-world environments, helping robots train safely before interacting with the physical world. I assume it will be kind of like tutorials in video games, or like those flight simulators that we use to test pilots. One of the key features of GR00T will be its ability to understand natural language. Robots powered by GR00T will be able to take verbal instructions and translate them into actions. They will also be able to learn new skills by observing human actions and copying them. Am I the only one who feels like we're all inside a sci-fi movie? Just think about all the robots running around acting like us. That will be creepy. To power these robots, Nvidia also showed a new computer specifically designed for them: the Jetson Thor. Yeah, someone at Nvidia is a hardcore Marvel fan. This computer is built around the Nvidia Thor system on a chip, which should give the processing power needed for complex AI tasks. The specs of this computer are wild, but to us they're somewhat irrelevant, so we won't dwell on that. Nvidia made a couple more robot-related announcements at that event, but I think that part about embedding AI in robots was the most important part of it all. I like the idea of using AI to train robots, make their movements better and responses faster and more accurate. Something just doesn't feel right, something feels off. Call me a tinfoil hat guy, but I have doubts that AI robots walking the streets or working in manufacturing will be a significant improvement of our lives. But nonetheless, I know one thing for sure: Nvidia stock is hot right now and it's about to get even hotter. And Apple apparently is in

Apple brings Gemini AI to the iPhone

talks with Google to bring Gemini features to iPhones. Sounds crazy, right? But a recent report by Mark Gurman, a reputable analyst, citing anonymous sources, suggests Apple is indeed in active discussions with Google to potentially license their generative LLMs. This is all very interesting, especially to me, since I'm interested in both AI and Apple tech, so let me bring you up to speed with all this. Apple plans to add some AI features to iOS 18, which comes out this June, but their own internal development is lagging behind. Gurman's previous reports mentioned Apple's internal testing of an "Apple GPT" competitor potentially rivaling OpenAI's ChatGPT. Additionally, Apple has designed an Ajax framework specifically for large language models and reportedly spends millions of dollars daily on conversational AI research due to the hardware-intensive nature of training these models. However, the technology is still not as advanced as tools from Google and other rivals, making a partnership look like the better option, according to the latest report. And by the way, we have a great video about an AI from Apple where we dive deep into every detail, so be sure to check it out.

Apple MM1 AI Model

Also, a few days back a new research paper was published by Apple, detailing a new method for training large language models that seamlessly integrates both text and visual information. Let me just, uh, read you a bit of it. Apple's research focuses on the combination of different types of training data and model architectures, which enables the AI to understand and generate language based on a mix of visual and linguistic cues. The paper also highlights the MM1 model's exceptional in-context learning abilities, particularly in the largest, 30 billion parameter configuration of the model. This version apparently exhibits remarkable capabilities for multi-step reasoning over multiple images using few-shot chain-of-thought prompting, a technique that allows the AI to perform complex open-ended problem solving based on minimal examples. By utilizing a diverse dataset comprising image-caption pairs, interleaved image-text documents, and text-only data, Apple claims that the MM1 model sets a new standard in AI's ability to perform tasks such as image captioning, visual question answering, and natural language inference with a high level of accuracy. This does sound a bit too pompous to me, but hey, that's Apple. I don't know whether the company could actually come up with something that OpenAI and Google haven't tried already, but let's just hope for the best with this one. And

Secret Japanese AI Model...

since we've shifted into this technical side of things, there's been another relatively minor but very interesting breakthrough. Tokyo-based AI startup Sakana AI has just announced a new method for creating generative AI models. The release is obviously all in Japanese, so I'm just going to read an article in English: "Our method can automatically create a new underlying model with the capabilities specified by the user. Models can be created very efficiently because they leverage the vast collective intelligence of existing open models." In development, Sakana took three open-source AI models and bred them together to create more than 100 offspring, which were then benchmarked to determine which ones performed best. Those were then used to create a second generation of offspring. This process was repeated for several hundred generations until a final model was chosen. I'm really curious to see how it all looks. Is it all like merging code together? What about frameworks, and do they share training data? Or is it all an AI that just talks to them, imitating the user? After all the evolution and selection, only three models remained. The first model combined the capabilities of Japanese language fluency and a mathematics LLM. The researchers said that not only was it very good at math, but it was also good at general Japanese language ability. Even as a 7 billion parameter model, they said, it achieved excellent performance compared with other models of the same size and exceeded even those with 70 billion parameters. The second model, which provided text-to-image capabilities, also managed to show good benchmarks and handle Japanese cultural knowledge using images and Japanese text. And the third model proved capable of swiftly generating vivid images in only four steps of inference. I think it would be really interesting to test this one. I know this news doesn't sound super important in a world dominated by OpenAI and Google, but often such niche products are the ones pushing the envelope further.
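
The breed, benchmark, select, repeat loop described above can be sketched as a small evolutionary search. This is a toy under loud assumptions, not Sakana AI's actual method: here a "model" is just a short list of numbers, "merging" is a weighted average of the parents' weights, and the benchmark is a made-up `fitness` function. The names `merge`, `evolve`, and `fitness` are hypothetical.

```python
import random

def merge(parents, mix):
    """Weighted average of the parents' weights (one simple merging recipe)."""
    total = sum(mix)
    return [sum(m * p[i] for m, p in zip(mix, parents)) / total
            for i in range(len(parents[0]))]

def evolve(parents, fitness, offspring_per_gen=100, survivors=10,
           generations=50, seed=0):
    """Search for mixing coefficients whose merged model scores best."""
    rng = random.Random(seed)
    n = len(parents)
    # First generation: random merge recipes (positive mixing coefficients).
    population = [[rng.random() + 1e-9 for _ in range(n)]
                  for _ in range(offspring_per_gen)]
    for _ in range(generations):
        # Benchmark every offspring and keep the best performers.
        ranked = sorted(population,
                        key=lambda mix: fitness(merge(parents, mix)),
                        reverse=True)
        best = ranked[:survivors]
        # Next generation: survivors plus mutated blends of survivor pairs.
        population = list(best)
        while len(population) < offspring_per_gen:
            a, b = rng.sample(best, 2)
            child = [max(1e-9, (x + y) / 2 + rng.gauss(0, 0.05))
                     for x, y in zip(a, b)]
            population.append(child)
    best_mix = max(population, key=lambda mix: fitness(merge(parents, mix)))
    return best_mix, merge(parents, best_mix)

# Toy setup: three "parent models" and a benchmark that rewards being
# close to a target none of the parents reaches on its own.
parents = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
target = [0.3, 0.3]

def fitness(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, target))

best_mix, merged = evolve(parents, fitness)
print("best recipe:", best_mix, "merged model:", merged)
```

In this toy, the only thing that evolves is the merge recipe; no parent is retrained, which matches the claim in the quote that new models are built from existing open ones rather than trained from scratch.
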
The world doesn't revolve only around English, so different cultures and languages have their own specifics that require a more focused approach. So even with a small-scale project like Sakana, they're making a great contribution to the development of the technology. This has been a really exciting week, and I can't wait to see

Important information (everyone needs to watch)

what comes next. Will OpenAI release GPT-5 or come up with something crazier? Will Google release Gemini 1.5 to the public? And will Elon's Grok fail or prevail? Only time will tell, so let's keep an eye on AI together. Sub to the channel, like the video. Thank you for watching!
