Googles AI Bard Can Now SEE! (Major Bard MULTIMODAL Upgrade Shocks Everyone!)
12:56

TheAIGRID · 18.07.2023 · 14,150 views · 328 likes

Video description
Googles Bard Can Now SEE! (Major Bard MULTIMODAL Upgrade Shocks Everyone!) https://twitter.com/ammaar/status/1679939953956929538 https://twitter.com/dr_cintas/status/1680207475201417217 Welcome to our channel where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos. Was there anything we missed? (For Business Enquiries) contact@theaigrid.com #LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience #IntelligentSystems #Automation #TechInnovation

Contents (3 segments)

Segment 1 (00:00 - 05:00)

One of the most underrated AI technologies recently got an update that will fundamentally change how users interact with it. The update adds multimodal capability to Bard. As many of you know, Bard has largely been overshadowed by ChatGPT's various abilities, but many people haven't understood that Bard is just as good as ChatGPT in terms of usability and in certain use cases, one of those being internet research and now, of course, image capabilities.

Bard continuously gets updates, and you can see right here the changelog of what was added. This was only four days ago: Bard has added over 40 new languages, you can now upload images alongside text in your conversations with Bard, "allowing you to boost imagination and creativity in completely new ways", and, what's also cool, Bard can read responses out loud.

Now, I know that many people are simply going to state that if we look at the human evaluation tests and what GPT-4 is able to do on certain technical tests, Bard isn't even in the realm of possibilities. But you have to understand that what we are now seeing is a diverse AI landscape. What that means is that every AI tool you use is going to have fundamentally different strengths and weaknesses, and different areas where it simply outclasses the others. For example, we know that ChatGPT is one of the smartest AIs, but the problem is that it isn't connected to the internet, and of course they disabled browsing with Bing. Browsing with Bing in and of itself was actually quite slow. I remember using it, and it wasn't that effective at searching through many different articles. Whereas if you use Bard to search the internet, it's much quicker with certain responses and has up-to-date information. When using ChatGPT's browsing with Bing, sometimes it would click on a link and report that reading the content had failed, and this is just something that you don't get with Bard.

So Bard does have certain strengths that I'm going to show you, which will hopefully change your opinion. I don't think people should limit themselves to using one artificial intelligence tool. There are many different AIs for many different uses, and you're going to be a much more effective individual if you know which is best for which scenario. Currently the only multimodal AI that's released to the public is Bard, and like we've stated before, it's quite underrated. So let's take a look at what some of the community is doing with Bard, and why I think it is much better than ChatGPT if you are trying to search for up-to-date information on the internet, and that includes in comparison to Bing.

When you come over to Bard, you'll see a new user interface. You'll see this plus button right here, which lets you upload a file, and when you click the upload button you can pretty much upload any image. I was messing around with this, and one thing that Bard finally added, which should have been there from the beginning, is this "Recent" section, where you can search through your chats and view your chat history.

It's important to remember, although this is going to be really fun to play with, that if you are going to be messing around with images, please try not to submit any personal information, because they are currently reviewing the data that is submitted and are going to be making adjustments based on Bard's responses. For example, if I were to enter a prompt and Bard were to give me a response that wasn't suitable or that Google didn't approve of, they're going to ensure that in future updates Bard's responses are much better. That essentially means that human reviewers are going to be seeing the data that you enter.

Here you can see two images that I submitted to Bard, because I wanted to quickly see how good Bard is with image recognition. The first image is a 3D render of a house. I've got to be honest, it gets this mostly right. It states: "The image you sent me shows a 3D model of a house with a garage. It is a transparent PNG image, which means that the background is transparent and you can see through the house." There is some minor hallucination, though, as it says that the house is surrounded by trees and shrubs, which isn't true. That is quite unfortunate, because if this software is going to be used to help someone who is visually impaired and they rely on it, that wouldn't be too good.

Of course, we know that Bard is still undergoing early tweaks for hallucinations. But what this has shown us is that the AI race between Microsoft and Google means we are going to get more and more AI features quicker than we anticipated. Even though Google did announce at Google I/O that Bard was going to get images, we didn't really realize that it was

Segment 2 (05:00 - 10:00)

going to get shipped out this quick, because as you know, OpenAI has been working on images for quite some time and hasn't shipped that model out yet. Images have only been shipped to a select group of users in the Bing experience. I'm guessing they're alpha testing it to ensure that the AI remains safe for everyone to use and to prevent abuse, since, as you know, many kinds of jailbreaks and abuse can happen with AIs.

Then I decided to send it a basic picture of Sonic, and it says: "The image you sent me shows a blue cartoon character standing on a black background." Once again, it's not a black background; it is completely transparent, so maybe that's where it got this wrong, because transparency can sometimes appear black. It does, however, identify that the character is Sonic the Hedgehog.

I'm guessing, according to some sources, that what Google has done is essentially combine its reverse image search feature with whatever image you input. If you don't know what Google's reverse image search is, you can see right here that there is a button that says "Search by image". If I were to put that same picture of Sonic in here, I'm guessing a bunch of different search results would pop up, and that what Google has done to expedite this process is have Bard summarize the top search results. I'm guessing that is going to be very effective, since it seems to be working decently. What this also proves is that Google is willing to do anything at the moment to beat OpenAI, which shows us that Google has stepped it up a notch in terms of development speed and in terms of what they want out of their AI programs.

Now it's time to take a look at what the community has done with Bard, because although my examples are quite basic, some of the examples floating around on Twitter showcase that Bard has a vast range of capabilities that most people don't currently explore. On Twitter, Dr Cintas did one of the very best experiments that you're likely to see, and the reason is that it is a direct comparison with what GPT-4's image input is able to accomplish. In the GPT-4 trailer, Greg Brockman, the co-founder of OpenAI, demonstrated the future multimodal capabilities of GPT-4. He took a picture of a rough hand-drawn note and immediately converted it into a website. I'll play the clip from the developer livestream, because it's better if he explains it than me:

"Hand-drawn mock-up of a joke website, definitely worthy of being put up on my refrigerator. So I'm just going to take out my phone and literally take a photo of this mock-up. The thing that's amazing in my mind is that what's going on here is we're talking to a neural network, and this neural network was trained to predict what comes next. It played this game of being shown a partial document and then predicting what comes next across an unimaginably large amount of content, and from there it learns all of these skills that you can apply in all these very flexible ways. So we can take this output now. Literally, we just said to output the HTML from that picture, and here we go: actual working JavaScript, filled in the jokes. For comparison, this was the original of our mock-up. So there you go, going from hand-drawn beautiful art, if I do say so myself, to working website."

Now that you've seen the clip, let's take a look at this example from someone who tried to do the same thing with Bard, and see whether they were successful. He decided to do his own sketch of the website, and apparently it did work. He added his sketch, which you can see, and the prompt he added was: "Write a brief HTML/JavaScript to turn this mock-up of a colorful website where you replace the jokes with two real jokes." What he also shows you here is that if you get some drafts that aren't particularly accurate (for example, ChatGPT sometimes generates code that doesn't work), all you need to do is click the regenerate drafts button or look at the other drafts to see whether that code works, because one thing

Segment 3 (10:00 - 12:00)

that's really great about Bard is that, rather than having to regenerate the entire response like ChatGPT, it automatically generates three different variations, which is very good. If you don't know why they do that, I think one of the reasons they have three separate drafts is that this might be a quiet form of reinforcement learning from human feedback. When people pick a specific draft over the others, I think Google registers that and then looks at what made that response better than the other two, which over time is going to improve Bard's responses.

You can see the final version here does work, and when he pushes the button to reveal the punch line, it works completely. This goes to show that although Bard's coding capabilities aren't widely touted as the best, they are still pretty decent, as you can get usable code that works for basic stuff. Of course, we aren't building websites like this anymore, so applications may be limited, but it does go to show that Bard is still being upgraded at a pretty quick speed.

Another Twitter user, by the name of Ammaar Reshi, showcases something really cool with Bard. I was about to say GPT-4, but this is, surprisingly, Google's Bard. What he wanted to do was recreate the iPhone Timer app. You can see right here that he wrote: "Recreate the app you see in this screenshot. Make sure it's fully functioning. Provide all of the necessary code for it to work. Write your code in SwiftUI. Do not make any mistakes." Then we can see Bard get to work. He then pastes the code into a SwiftUI project, and of course there are certain things that you do need to fix. If you've ever coded anything before, you'll understand that coding is never a one-trick pony. Then you can see that his build succeeds, which means that the app works, and just like that he was able to get this simple app coded within seconds.

For those of you who are going to say that Bard doesn't always work, or "I can't get the code to work", understand that this is pretty impressive considering Google didn't really have a large language model product a year ago. Already they've managed to create a product under intense heat and intense competition, and managed to ship multimodal coding features and capabilities before OpenAI did. Of course, everyone's going to argue that Bing's multimodal capabilities are going to be absolutely incredible, which they are, but it's just very interesting that Google, who was very behind on AI, managed to somehow catch up to ChatGPT.
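
For readers wondering what "recreating the Timer app" involves under the hood, here is a minimal sketch of the countdown logic such an app needs. It is not the SwiftUI code from the tweet and not Bard's output; it is written in Python for brevity, and every name in it (`CountdownTimer`, `tick`, `display`, and so on) is a hypothetical chosen for illustration. A real app would wire `tick` to a once-per-second UI timer and `display` to the on-screen label.

```python
from dataclasses import dataclass, field


@dataclass
class CountdownTimer:
    """Hypothetical countdown state, loosely modelled on a phone timer screen."""

    total_seconds: int
    running: bool = False
    remaining: int = field(init=False)

    def __post_init__(self) -> None:
        # A fresh timer starts with the full duration left.
        self.remaining = self.total_seconds

    def start(self) -> None:
        # Starting (or resuming) only makes sense while time is left.
        if self.remaining > 0:
            self.running = True

    def pause(self) -> None:
        self.running = False

    def cancel(self) -> None:
        # Stop and reset to the originally chosen duration.
        self.running = False
        self.remaining = self.total_seconds

    def tick(self, seconds: int = 1) -> None:
        """Advance the clock; ignored while paused, stops itself at zero."""
        if not self.running:
            return
        self.remaining = max(0, self.remaining - seconds)
        if self.remaining == 0:
            self.running = False

    def display(self) -> str:
        """Format the remaining time as MM:SS, or H:MM:SS past one hour."""
        hours, rest = divmod(self.remaining, 3600)
        minutes, secs = divmod(rest, 60)
        if hours:
            return f"{hours}:{minutes:02d}:{secs:02d}"
        return f"{minutes:02d}:{secs:02d}"
```

For example, `CountdownTimer(90)` shows `01:30` before it starts, and after `start()` followed by one `tick()` it shows `01:29`. Keeping the state separate from the screen like this is one common design choice; the version in the tweet may well be structured differently.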
