GPT-4 Gets IMAGES, Midjourneys Massive Update, MAJOR ChatGPT HACK (+More AI NEWS)
29:23


TheAIGRID · 26.06.2023 · 26,967 views · 491 likes


Video description
Links:
Robocat - https://twitter.com/DeepMind/status/1671171448638144515
Zooming Out - https://twitter.com/_akhaliq/status/1672721480139014144
Perplexity.Ai - perplexity.ai/pro?referral_code=0TU1F6PW
Bard Concern - https://www.forbes.com/sites/anafaguy/2023/06/15/google-warns-employees-about-chatbots-including-its-own-bard-out-of-privacy-concerns-report-says/?sh=4617d2b7b613
Stability AI - https://stability.ai/blog/sdxl-09-stable-diffusions
Salesforce - https://www.salesforce.com/uk/news/press-releases/2023/03/07/einstein-generative-ai/
GPT 4 Gets Images - https://twitter.com/AiBreakfast/status/1672165808921890816

Chapters:
3:36 Text To Video
6:35 Salesforce
9:28 Midjourney
14:35 Stability AI
17:42 PerplexityAI
22:08 Deepmind Robocat
24:10 Gpt-4 Gets Images
26:57 Meta Voicebox

Welcome to our channel, where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos. Was there anything we missed?

(For Business Enquiries) contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience #IntelligentSystems #Automation #TechInnovation

Table of contents (8 segments)

Text To Video

Do you remember Runway Gen 2? Gen 2 was a text-to-video model from the company Runway, which has been the dominant force in text-to-video models, something that is particularly hard to do, especially in the AI landscape. Well, last week something changed in the marketplace: this company now has very big competition, and genuinely this seems like the most realistic text-to-video we've seen, and that is taking into account Google's work and that of other companies still in the early stages of their text-to-video efforts. What you're looking at is something called Zeroscope v2 XL, a watermark-free, ModelScope-based video model capable of generating high-quality video at 1024 by 576. The model was trained with offset noise using 9,923 clips and 29,769 tagged frames at 24 fps. This looks absolutely incredible. I wouldn't say it's particularly realistic in terms of the material currently on screen, because of course none of these creatures exist, but in terms of quality it looks absolutely incredible, the smoothness also looks great, and in terms of coherence it definitely takes the cake. If this model does get fine-tuned in the future and we actually get outputs that are quite realistic, I could see it becoming the leading video model. There are also some other examples that show just how great this text-to-video looks, and remember, this will slowly be refined over the coming years, as many technologies are. You can see it generating in many different styles, and what we have here definitely looks promising. My question to you: which do you think looks better? Does this synthesis of video clips look better than Runway's Gen 2 text-to-video, or does this new Zeroscope XL match or exceed what
we've seen in previous video generations? If I'm being completely honest and totally unbiased, this software does seem to generate more coherent and more fluid video than Runway's Gen 2. Gen 2 is very impressive, but this is definitely quite impressive in its own regard, and I would recommend checking out the videos and links below for further examples and more documentation. Then, of course, we had something that is once again quite concerning but at the same time quite innovative. There's

Salesforce

this company called Salesforce. You may have heard of them before; essentially it's a company that helps a giant number of companies across the United States with their entire sales process. If you don't know what sales is, it's essentially where someone calls you up, sometimes cold calls you out of the blue, to sell you a product you might need, or the process you walk through when you're trying to buy something before you finish buying your product, and this happens in many different industries. The announcement is that this very large, multi-billion-dollar company recently unveiled its own generative pre-trained transformer AI, which it's going to embed into its sales processes. What they're doing is truly interesting, because essentially they're personalizing every campaign and shopping experience with generative artificial intelligence. You know how, when you browse Google, or maybe you're on Snapchat or TikTok, you see advertisements that may be broad in their generalizations? Sometimes you click them because sometimes they do relate to you, but what if that advertisement had your name on it, or was really tailored to you? This is what generative AI is set to do. Not only is this truly interesting and groundbreaking, some people are saying it's also going to lead to significant job loss. Let me explain. They also introduced something called Einstein GPT, the world's first generative AI for CRM. CRM stands for customer relationship management: a set of integrated, data-driven software solutions that help manage, track, and store information related to your company's current and potential customers.
What makes this so crazy is that, as you're seeing on screen right now, Einstein GPT is personalizing these sales processes. Many people were already concerned about their jobs being taken by AI, and Einstein GPT is going to be able to generate leads for you, add a sign-up form for you, and do many other tasks, so people are starting to wonder: if this generative AI tool can do all this for us, what use is our labor? That is definitely something to talk about in another video, but I do think a generative-AI-driven CRM is going to have a wide range of impacts. Then, you know the company

Midjourney

Midjourney, a company focused on text-to-image generation that has pretty much solved the common problems many text-to-image generators have. They've announced a recent update including a game-changing feature that changes what we can realistically do with Midjourney. Before we talk about that feature, we first need to talk about the update itself. A couple of days ago they announced version 5.2: they improved aesthetics and allowed for sharper images, slightly improved coherence and text understanding, and increased diversity. Increased diversity essentially means that when you try to generate something, you sometimes get images that are far too similar, and when you ask for variations, they sometimes aren't true variations, they're just far too similar; they also introduced something called High Variation Mode, which makes all variation jobs much more varied. The new feature that has taken everyone by storm is called Zoom Out. A zoom-out feature is something we've seen right across the industry; if you're not sure what I'm referencing, just take a look at some of these clips, because they show exactly how it works. Every time you upscale an image, it has a Zoom Out button underneath that you can use to reframe that image. You've got two versions, Zoom Out 1.5x
and Zoom Out 2x, and essentially what they do is pull the camera out and fill in all the details on the sides. When it comes to demonstrating the capability of an AI tool, it's best to show you some of my personal examples. I'll also show some of the community's examples, because they are far better and far smoother, but take a look at this example I quickly generated with the prompt "Apple headquarters in New York, a white sleek futuristic building". This is, of course, a standard image we get from the likes of Midjourney, but what's interesting to delve into are the new features. With the Zoom Out feature, we're able to zoom out on this image and create multiple different variations, and you can now see what it looks like when we do. Going back over here, this is the close-up of the image, what you get by default when you enter your prompt, and here is the zoomed-out version of that same image. What's also cool is that Midjourney gives you more than one result: you get four different zoomed-out looks, and it's very interesting to compare them side by side, because you immediately see the different renditions for your specific project. If I switch between these generations, you can clearly see the differences in the zoomed-out pictures: with the variations Midjourney gives you, the exterior of the image is a little bit different every time, which is really good for generating variations on what would otherwise be
a pretty standard concept. I do think this Zoom Out feature is very good and very effective, but one interesting thing would be to simply test it against Adobe's Generative Fill. If I'm being completely honest with you, although Generative Fill is pretty good, I do think Midjourney's approach, including the Zoom Out feature, is going to be far superior, since it is a native feature. We're not entirely sure how Midjourney does this, but we do know that Midjourney is by far the most powerful text-to-image generator at the moment, the most realistic, and the most diverse in terms of the many different models it offers, all the way from version 4 up to the newly released version 5.2. What will be interesting is to see whether Midjourney implements something like Adobe's Generative Fill on its platform. If you don't know what that is, Generative Fill lets Adobe take any existing image, not just one generated by Midjourney's text-to-image generator, maybe one of your own, extend it, and merge newly generated content into it. Let me know your thoughts on that, because it is definitely interesting to see this feature being added. Then we had Stability AI launch Stable Diffusion XL
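Under the hood, a zoom-out feature like this is a form of outpainting: the original image is placed on a larger canvas and the model invents the border. As a minimal sketch of just the canvas math (a hypothetical helper, not Midjourney's actual code):

```python
def zoom_out_canvas(width, height, factor):
    """Given an image size and a zoom-out factor (1.5 or 2.0),
    return the new canvas size and the offset at which the original
    image is pasted, centered, before the border is outpainted."""
    if factor < 1:
        raise ValueError("zoom-out factor must be >= 1")
    new_w, new_h = round(width * factor), round(height * factor)
    # Center the original frame; the surrounding border is what the
    # model must fill in ("fill in all the details on the sides").
    off_x = (new_w - width) // 2
    off_y = (new_h - height) // 2
    return (new_w, new_h), (off_x, off_y)

# A 1024x1024 upscale with Zoom Out 2x becomes a 2048x2048 canvas
# with the original pasted at (512, 512).
print(zoom_out_canvas(1024, 1024, 2.0))  # ((2048, 2048), (512, 512))
```

The generative model then only has to synthesize the ring of new pixels around the pasted original, which is why the center of a zoomed-out result matches the upscale exactly.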

Stability AI

0.9, which they described as a leap forward in AI image generation. On the 22nd of June they announced that the most advanced development in their Stable Diffusion text-to-image suite of models is finally here. This is a huge upgrade compared to their prior model, with much higher quality than previous versions, and what's also great is that it now has the hyper-realism we've seen in Midjourney version 5 and beyond. They showcase some key examples where we get to see the differences on simple prompts, and to be honest it does seem quite good. For example, as you can see from this prompt, "aesthetic aliens walk among us in Las Vegas, scratchy found film photograph": on the left we have the Stable Diffusion XL beta, and on the right the newly released Stable Diffusion XL 0.9, and honestly this does look like what we've seen in Midjourney versions 5.1, 5.2,
and 5. Let me know if you're going to be using this over Midjourney; I do doubt it, because many people are quite accustomed to Midjourney, but I do think these new examples are pretty good. You can also see this additional prompt they added with these two wolves: on the left, once again, the Stable Diffusion beta, and on the right the newly released version, a hyper-realistic wolf with almost minimal chance of you realizing it was AI generated. And of course we have the big deal for Stable Diffusion, the reason they released this new AI model: they could finally generate hands. Hands are a very tricky thing for AI to generate because they are particularly confusing, and as we've seen in the past, it took a very long time for models to get them right. Although it does seem strange, this does seem a bit too realistic to me, because if I saw this in my feed I would arguably say there's no way it was AI generated, but of course we know it is. You can see that on the left-hand side, that version of whoever's hands those may be doesn't look very real at all. The contrast we see at the time of recording this video is honestly so surprising; it just goes to show that with every single major upgrade to these artificial intelligence tools, it's always interesting to see the large differences that get made. But then we had a very interesting AI tool that I saw being demoed across apps such as TikTok and Twitter, and this

PerplexityAI

was being touted as an AI research tool that could arguably be better than Microsoft's Bing. That is in and of itself a very bold statement, but here we are in Perplexity Pro, at perplexity.ai, and this is something you can try for yourself. I've got to be honest: this seems like the most comprehensive AI research tool we currently have. Let's do a test, because of course you want to understand how exactly this tool works and what it can be used for. Let's say I wanted to research something, which I recently did, and I wanted that information immediately: all I'd have to do is go over here and enable this Copilot button, and you can immediately see that this is powered by GPT-4. Of course, as you know, Bing is also powered by GPT-4, but I like the way this information is presented better. One question I did want to ask it, because as you know we are an artificial intelligence channel: what are the top 10 things that happened in artificial intelligence this week? We hit the search button, and you can see that first it seeks to understand my question, then it considers eight results, and eventually it gives me an answer. If it does struggle, you can give it more information, but more often than not I've found this to be faster and more accurate than the GPT-4 in OpenAI's own product, and it's very interesting that Perplexity have managed that. What I think we will see over time is specialized AI tools for specialized tasks, more commonly known as narrow AI. A lot of people have the idea that we are moving towards an AI that can do everything, and whilst this is possible, I think this showcases that if something like
Perplexity AI can immediately get you a lot of different research papers and various sources faster than OpenAI's GPT-4, then people are most likely to use these specific tailored versions in other applications, and I think that is fine. This isn't really a knock on GPT-4; I just think people are going to build individual applications like this one that are better than the base model, and that's something we're going to see. You can see here why I like this much better than ChatGPT: it actually gives me a lot more references. The problem with GPT-4's Browsing with Bing is that it usually references one or two articles, and it takes a lot of time to read each page; and remember, with GPT-4 you only get 25 messages per day, but with this you get 597. So it's definitely very interesting: you can see all the different articles referenced, you can see just how many pieces there are, and usually it gives you the information straight away. Another feature of Perplexity AI I found very cool is specified research. For example, you can search Reddit, and this is something a lot of people already do on Google; if you're someone who uses Google a lot and uses Reddit for certain research, although it does seem uncanny, it is something people do, so this is a very useful tool. You can also use it to search YouTube, and you might be thinking, why not just use YouTube's own search? What this does is crawl YouTube videos and search through the transcripts of those videos to get you your specific answer, and that is why it's so effective. I'm going to do this again to show you how quickly it works: it simply understands your question, searches the news, considers the results, wraps it up, and just like that we have the data. To be honest, if you're someone who needs
information reliably and quickly, with sources, this is what you want to use. I know the first example wasn't that promising, but this is what it is usually like, and this is definitely going to be what I use now
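The understand-search-consider-answer loop described above is a retrieval-augmented pattern. As a toy sketch only, with stand-in `search_fn` and `llm_fn` stubs (Perplexity's real pipeline is not public):

```python
def answer_with_citations(question, search_fn, llm_fn, k=8):
    """Toy Copilot-style loop: run a search, keep the top-k results,
    hand them to a language model as context, and return the answer
    together with numbered references."""
    results = search_fn(question)[:k]  # "considers eight results"
    context = "\n".join(
        f"[{i}] {r['title']}: {r['snippet']}"
        for i, r in enumerate(results, start=1)
    )
    prompt = f"Answer using the sources below.\n{context}\nQuestion: {question}"
    answer = llm_fn(prompt)
    references = [f"[{i}] {r['url']}" for i, r in enumerate(results, start=1)]
    return answer, references

# Stubs stand in for a real search API and a real LLM call.
fake_results = [
    {"title": "AI news", "snippet": "Top stories this week",
     "url": "https://example.com/1"},
    {"title": "More AI news", "snippet": "Further stories",
     "url": "https://example.com/2"},
]
answer, refs = answer_with_citations(
    "What happened in AI this week?",
    search_fn=lambda q: fake_results,
    llm_fn=lambda p: "A summary citing [1] and [2].",
)
print(refs)  # ['[1] https://example.com/1', '[2] https://example.com/2']
```

Because the model only sees the retrieved snippets, every claim in the answer can be traced back to a numbered source, which is exactly the referencing behaviour praised above.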

Deepmind Robocat

on a day-to-day basis when I'm doing my research online, because I do think that whilst Bard and ChatGPT are good, this is a specialized research tool that lets you search YouTube transcripts, Reddit, Wikipedia, and pretty much everything we want to see. Then of course we had DeepMind's RoboCat, which is essentially something out of science fiction. It's a self-improving robot that is eventually going to reach a stage where it needs fewer than 100 demonstrations to perform an action successfully, and you have to understand just how crazy that is, because self-improving robots are exactly what people imagine when they think about Terminator-style robots that get scarily smart and then put the human race out of existence. DeepMind's RoboCat is based on a DeepMind multimodal framework called Gato, an AI model released last year that can perform around 600 different tasks across a huge range of domains. I'll play a small segment from the video; I had seen this in earlier papers from Google before, but it was still nice to see, even in a program that is still at a relatively early stage. It means these robots are going to be very effective in real-world scenarios, because as you know, the real world isn't just a test facility with a few objects: there are always going to be things that don't go according to plan, and it's important for these robots to quickly and robustly adapt to those scenarios, which is what we see demonstrated here. To wrap it up: it's pretty much a robot that can self-improve, doesn't need that many demonstrations to get tasks done, and ushers in a new way for robots to learn very quickly. Now, this is something

Gpt-4 Gets Images

that didn't get the recognition it deserves: this is GPT-4 with actual multimodal capabilities, the first instance we've seen online. Credit to AI Breakfast for this tweet, because Bing managed to break its own rules by solving a CAPTCHA. This multimodal capability of analyzing images is apparently only available to around five percent of users at the moment, but strangely enough I haven't seen anyone talk about it, which is why it's in this video. You can see right here the image is a typical CAPTCHA; it says "type the two words". We can read "overlooks" and "inquiry", because of course we are human, but the way these words are drawn on screen is designed so they can't be identified by a standard computer system. Here, though, you can see GPT-4, via Bing Chat, easily identifying the words "overlooks" and "inquiry"; it is also able to recognize that this is actually a CAPTCHA test, and then it says "I'm afraid I can't help you with that". I think this shows that very soon, maybe next month, maybe the month after, we're likely going to be slowly introduced to the GPT-4 version that was actually announced, you know, the version that could really easily identify what was going on in images. I think that version will truly be the next level in AI, because although text is great, it's only one modality, and there were tons of examples in the GPT-4 paper where they showed exam questions, literal screenshots, and GPT-4 aced those exams. Once this feature gets rolled out to everyone, which it's supposed to be, it is going to be truly incredible. I think the reason it's only out to around five percent of users is so they can collect feedback, see what people are doing with it, refine it, make sure it's safe, and then put it out into the open. Then of course we had Meta AI release something truly game-changing, but at the same time there is
something else that is quite like it, which I will explain later in the video, so just keep that in mind, because although there are tons of different AI models being released, when you have a true understanding of every single AI model out there you start to see certain comparisons, and Meta's is very similar to a tool that was always AI-based but just hasn't been receiving the hype it deserves. Meta recently announced something called Voicebox, a multilingual, high-quality text-to-speech AI. Voicebox can remove background noise from a clip: "hi guys, thank you for tuning in, today we

Meta Voicebox

are going to show you". It can also fix incorrectly spoken words by re-synthesizing a specific segment via text-to-speech, eliminating the need to re-record: "hi everyone, thank you for tuning in, today we are going to show you". These are just a few examples of how Voicebox can perform across a variety of tasks. Like to hear a sample of what Voicebox can do first-hand? Well, you already have, because all of the voiceover featured in this video was generated using Voicebox, and apparently the quality is so good that they're not making the Voicebox model code available to the public yet, because they want to avoid misuse. Essentially, if you know what ElevenLabs is, that's something that can clone your voice from maybe even three to five seconds of you speaking into a mic, and with this they can do the same. For example, I'll just play a few clips from the official Twitter, and as you can see, you can use different styles of text and, I guess you could say, different references; it is truly the ultimate tool. But I do think this is very similar to an AI tool released one to two years ago, something I actually messed around with: "better then edit all the blather out of your videos because my time is very precious", "oh that's fire", "it's been said that manatees are the Cadillac of marine mammals". Descript was a tool released quite some time ago, and it was really cool because it allowed you to essentially edit your voice without having to re-record it. Let's say I made a mistake whilst talking: I could simply look at the transcript, edit the text, and it would edit my voice at the same time. I do want to play a small clip from the Descript trailer, because it perfectly encapsulates what this software can do and how similar it is to Meta's Voicebox. It will be interesting to see how this tool develops over the next year, and how they change in response to ElevenLabs and Meta's new Voicebox being added to the new toolbase of AI text-to-audio.
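The Descript-style "edit the text, edit the audio" trick rests on word-level timestamps from a forced aligner: deleting a word in the transcript maps to cutting the matching span of audio. A minimal sketch under that assumption (hypothetical helper, not Descript's or Meta's actual code):

```python
def cut_words(aligned_transcript, words_to_cut):
    """aligned_transcript: list of (word, start_sec, end_sec) tuples,
    as produced by a forced aligner. Returns the words that remain
    and the audio spans (in seconds) to splice out, so that deleting
    text deletes the matching audio."""
    kept, cuts = [], []
    for word, start, end in aligned_transcript:
        if word.lower() in words_to_cut:
            cuts.append((start, end))  # span of audio to remove
        else:
            kept.append((word, start, end))
    return kept, cuts

transcript = [("um", 0.0, 0.3), ("hello", 0.3, 0.8),
              ("um", 0.8, 1.0), ("world", 1.0, 1.5)]
kept, cuts = cut_words(transcript, {"um"})
print([w for w, _, _ in kept])  # ['hello', 'world']
print(cuts)                     # [(0.0, 0.3), (0.8, 1.0)]
```

What Voicebox adds on top of this is the reverse direction: instead of only cutting audio, it can synthesize replacement audio in the speaker's own voice for newly typed words.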
