AI Drone Kills Operator, GPT 4.2 Leaks,Bard Gets SUPERCHARGED, And Much More [AI NEWS #5]
23:14

AI Drone Kills Operator, GPT 4.2 Leaks,Bard Gets SUPERCHARGED, And Much More [AI NEWS #5]

TheAIGRID 16.06.2023 114 219 просмотров 1 477 лайков

Machine-readable: Markdown · JSON API · Site index

Поделиться Telegram VK Бот
Транскрипт Скачать .md
Анализ с AI
Описание видео
GPT 4.2 Leaks, AI Drone Kills Operator, Bard Gets SUPERCHARGED, And Much More [AI NEWS #5] ORCA - https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/ IMAGE DIFFUSION - https://huggingface.co/papers/2306.03881 LIMA - https://arxiv.org/abs/2305.11206 GPT LEAK- https://twitter.com/ankur_maker/status/1668315145833967618 Drone Kills Operator - https://www.theguardian.com/us-news/2023/jun/01/us-military-drone-ai-killed-operator-simulated-test YOLO V7 - https://www.youtube.com/watch?v=bt3JLYfFaKE&pp=ygUXWW9sb1Y3IFBlcmZvcm1hbmNlIERlbW8%3D MULTION - https://twitter.com/MultiON_AI/status/1662921770289209344 Welcome to our channel where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos. Was there anything we missed? (For Business Enquiries) contact@theaigrid.com #LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience #IntelligentSystems #Automation #TechInnovation

Оглавление (5 сегментов)

Segment 1 (00:00 - 05:00)

another giant weak in artificial intelligence and let's get straight into the news because boy oh boy is there a lot to cover so you know the company meta formerly Facebook the one that Mark Zuckerberg owns they have actually been very good in terms of AI development although they've been a little bit under the radar because they've been overshadowed by companies like Google and of course open AI in conjunction with Microsoft they've still been releasing large language models that have been doing insane things now one thing that they released earlier this year was they did release a model called llama and this was a lightweight model it was a 65 billion parameters model now what was interesting was that recently they actually released something called Lima okay so essentially Lima was a fine-tuned version of llama with only a thousand carefully prompted responses without any reinforcement learning or human preference modeling now what's interesting about Lima is that it's one of those models when it's trained using the Llama architecture essentially it means that it's going to be much more scalable and can be ran on much smaller devices what's great about Lima is that it actually achieved remarkably strong performance on a range of tasks and outperform forming state-of-the-art language models such an open ai's Chachi bt3 and Google's Bard now the reason Lima is such a groundbreaking piece of technology is because number one it's actually much more lightweight and it's actually a much more scalable model now what's incredible about this model as well and llama is that I've actually seen people running different versions of llama on their laptops locally meaning that you don't need any servers it could literally just be run on a laptop which of course as we know the future implications are huge now what's also interesting about this as well is that there are several implications for Lima success for the future of natural language processing and artificial intelligence firstly it suggests that pre-training is a powerful approach for training large language models and that it may be possible to achieve strong performance without relying on reinforcement learning or human feedback which means that this could make it much easier and more efficient to train large language models for a wide range of application secondly Lima's ability to generalize well to unseen tasks suggests that large language models may be able to learn to perform a wide range of language understanding and iteration tasks with only limited instruction tuning data finally Lima's strong performance in a controlled human study suggests that large language models may be able to produce high quality output that is equivalent or even preferred to human generators responses in some cases and in these human-controlled studies responses from Lima were either equivalent or strictly preferred to gpt4 in 43 of cases and as high as 58 when compared to Bard and 65 when compared to chat gbt which was trained with human feedback then what we had was YOLO an image recognition tool that was upgraded from previous versions compared to YOLO version 6 version 7 and version 8. here we can see a demonstration of YOLO being used on a clip of a popular TV show called the office and it's easily able to identify every single thing in the image now the thing about this is that of course people aren't going to be using this for TV shows for memes but this has many different applications maybe you're going to be using this for a company's Warehouse May identifying things in a public setting maybe you want to see if certain animals are still in their enclosure it definitely is something that we haven't really seen too much when it comes to artificial intelligence but image recognition is something that is going to be useful and as we do know as future AI systems are evolved we do know that many different AI systems are going to rely on a multimodal capability which essentially means that they will have the ability to process different kind of inputs and outputs now with yolo here and image recognition tool what happens if we do have this kind of software embedded into a physical robot that is going to be powered by a multimodal AI I do think this kind of software is going to be very useful for identifying many different things and as we do know many different companies are working on their artificial intelligence robots then we had something that was also quite interesting something that I didn't see coming we had this research paper called prolific dreamer High Fidelity and diverse texture 3D generation with variational score distillation so what's really interesting about this paper is that they've nearly cracked the code for perfect texture 3D generation you see previous versions of texture 3D generation suffered from many other problems such as over smoothing and over saturation so this paper actually proposes a new technique called variations called distillation which is a particle-based variational framework that treats the 3D parameter as a random variable and infers its distribution with this new technique they're able to generate high quality 3D Nerfs which look very realistic you can see here Michelangelo's style statue of a dog reading news on a cell phone a delicious croissant we've also got things like an elephant skull a blue tip a small Cactus planted in a clay pot we've also got things like a pineapple snail on a leaf and if you don't know what you're seeing on the right hand side and if you are confused by that is essentially the 3D models normal map which is essentially a map technique which allows the model to have certain characteristics which make it look more realistic once the actual texture is

Segment 2 (05:00 - 10:00)

applied essentially it just gives it more depth and that texture tells the computer how to render it and how to infer certain characteristics of that 3D model now there are other 3D models that are also made and there are also other Nerfs that they were able to generate that do look pretty realistic we may actually be quite some time away from actually having these as usable assets but I do think that this technology is still progressing very quickly and it does have a whole host of other different applications what's also good about this model as well something that I forgot to mention is that it does cover a wide range of diverse results which means you can actually prompt this with many different things unlike other previous generations so one of the papers that did get released earlier this year was Google's music LM and it was honestly quite shocking so essentially what you were able to do was generate audio from Rich captions so think of it like this it's simply text to music but it isn't text to music as we traditionally think about music it's more so text to background music I would say or text to music samples that can be used enforced your Productions so for example you could use music like this the main soundtrack of an arcade game it's fast-paced upbeat with an electric guitar riff then we have another example of regatron and electronic dance music then we have some other examples here and there are too now we already did cover this in another video but what was actually interesting was that this tool actually started to roll out public use now you do still have to sign up for early use I'm not sure what the wait list is but I do know that many more people are now getting accepted if you head on over to AI Test Kitchen and you do sign up and you are accepted this is what you're presented with it says how to make a good prompt be very descriptive electronic or classical instrument sounds best mention the vibe mood or emotion you want to create certain queries that mention specific artists or vocals will not be generated which is something that many people do want to emulate for example their favorite musician that's obviously something that has its issues picked largely because of copyright and what's also interesting as well is that the model actually allows human feedback so essentially it gives you a selection where it says which track is better give it a trophy so let's say for example I wanted to generate Lo-Fi music that I might hear in an elevator and I click generate it's going to generate me two different music tracks and once I see those I can then play both of them and see which one sounds better so let's play track number one now these tracks are only around 12 seconds long which is currently what's been set but I do think that they're still very decent okay so we're gonna go ahead and we're going to click the trophy which gives us track number one feedback and of course if I want I can literally just click here click shareable link and then I can also click download now I did find that this was really good this was another example that I did in another video hahaha and then we can hit a track two as well so yeah definitely something that we can actually use as background music and I've got to be honest if I did hear this music I wouldn't instantly be able to tell especially since I would be focusing on whatever the main piece of content Focus would be so I think we are a little bit off when it does come to music generation simply because for now we are in that stage where we are actually giving it human feedback so we need to just constantly give it these trophies and there's more and more users give it this human feedback eventually the AI model is likely to improve over time and likely to Output longer pieces of songs but it's also important to note that the music LM is definitely capable of doing five minute songs and a whole host of many different things such as conditioned Melodies where you give it a specific song prompt and then it uses that driving song in order to make a generated audio then we had this browser agent which was actually quite effective it's called multi-on it's something that works as an autonomous agent we've seen this before with gpt4 and gpt3s auto GPT feature which many people just decided to code but this is I guess you could say a fine-tuned version of that so the user has asked to book a Delta flight from June 11th to June 14 and you can see right here this is still an early version but this is what the AI agent is doing so it tells you exactly what it's doing in the bottom right hand corner it's actually searching Google and then it's going to click on the matching desired date and it's going to search which ones it wants for you as the AI agent continues to go around it's going to continue to visit various different websites sort through many different

Segment 3 (10:00 - 15:00)

things to book and things like these are quite useful you see these tasks are what you describe as something that are quite tedious because many people when doing things that we have to do honestly spend and waste I would say a lot of time because we have to do these mundane tasks that aren't really productive I mean it would be much more effective if we could have an AI that could sift through millions of different prices for us and then find us the best one and that's exactly what we do have here now once this is fine-tuned it's going to be interesting to see if these are what we're going to be using in the future as normal because I do believe that many different tasks are going to be completely automated and we're going to have ai scrolling the web and booking us various different things then what was also interesting is that Chad TPT has been getting a bunch of subtle updates essentially if you've been using chat GPT on a day-to-day basis you may not know of some of the small changes that chat gbt actually has had one of the more interesting things is that in chat gbt you can now actually share the conversations that you have with chatgpt now this is actually really great because it allows you to share prompts and continue conversations that already have an extensive history I know that this is something that I'm going to be using when I want to share a conversation that I've had and it's very good because you're now able to share dialogues that otherwise people could have fixed so for example if someone says that chat TPT has said something they can now simply share this feature and this was something that interestingly enough Open the Eyes CEO Sam Altman did say that he was surprised that Chachi BT was such a success because he actually said at YC combinator that literally every startup he advised them to have some kind of viral sharing feature in which apps would be able to explode and it's funny that he didn't add that feature and now with chat TPT he's slowly just sharing up the sharing feature which I generally haven't seen that much on social media but it is a cool feature that will be useful for the workspace then we have right here which is an interesting chat GPT leak now remember this is currently speculation and this was from Reddit however it's been all over Twitter and it seems like it's public news and right now apparently a user that previously leaked some of the chat gbt updates before has come out with this screenshot where essentially you do have this my profile area and this my files area where which it could mean that you're going to be able to upload files to chat gbt and you're going to be able to have chat TPT remember certain preferences about you and certain information now I've got to be honest here I've done some digging and I've seen this upload file feature on chat TBT before you see there was a user on YouTube who actually made this tutorial in which you could actually submit a file to chat GPT it was a really cool workaround but it just goes to show that this is something that is actually possible so I wouldn't be surprised if these features were added because a lot of the times when you do start a new chat you would like chat TPT to remember certain things and there are many third-party applications like chat PDF in which you can actually talk with PDF files and you can actually upload many different files to chat gbt so this is something that I would expect and just for reference this was the actual original video here it actually has over 500 000 views and it's something that you can still use today then of course we had a major update which was one that we really couldn't wait for which was Adobe Firefly being released to the public so if you don't know what Adobe Firefly is it's essentially a tool like mid Journey but it's created by the company Adobe if you don't know who Adobe are they created Photoshop which is used by many different people and of course many people who are familiar with the Creative Suite know about illustrator and of course Premiere Pro and of course After Effects now Adobe Firefly is really good because it has many different features that mid-journey doesn't have and it has many different creative features that many of the creative individuals who work in the creative industry already do need now some of these features aren't actually in the Adobe Firefly app just yet but they still are currently being worked on so many different things such as texture 3D are still currently in production but the video that was released by Adobe does actually showcase a bright future for those content creators who are looking to create content very quickly along with AI assistance and of course that brings us to something that adobe actually did release which surprisingly took Everyone by surprise and I've definitely seen it a lot on social media because it has very effective effects and of course this is adobe's generative fill feature if you don't know what that was it was basically a revamp of an old feature that they had but when using the AI technology you're actually able to generate a completely unique thing within that image that was completely seamlessly added this was something that many people wanted originally and for it to be simply embedded into Photoshop was a complete Lifesaver because you can simply outpat in-paint ad objects and although there are some AIS out there that can actually do this right now none of them were able to do this as effectively as Photoshop and as natively as Photoshop did so Adobe did a great deal by releasing this and in this video you're going to see me actually messing around with it and generally it was actually effective it wasn't just some kind of interesting fad it was something that I can guarantee you people are

Segment 4 (15:00 - 20:00)

using because it saves you hours and hours of editing work generative fill is something that many people have used to expand pictures old pictures to just simply reframe things and of course it's something you can do essentially called out painting with things like stable effusion and things like leonardo. ai but this is something that many people do like now because it's just in Adobe and it seems really effect and the thing with generative fill as well is that it also gave you three different output versions that you could use another thing that was released well I wouldn't say release but it was a research paper that was really interesting because this research paper talked about emergent correspondence from image view so essentially they talked about an emergent ability from text to image which like we talked about in a previous video where we talked about how gbt5 would be extremely risky and how AI does present a whole host of new emerging capabilities This research paper showed us that this is actually true even in a text to image so essentially you know how when you look at two pictures and you can point out things that are similar between them like maybe they both have dogs or trees or a big blue sky you're able to understand that those things are the same even if the pitches are a bit different now imagine we want a computer to do this so we want it to look at two pictures and understand what's the same and what's different that's called finding correspondences between images in the world of computers so in this paper the smart scientists found out that computers can actually do this really well when they use something called image diffusion models this is like a superpower the computers use to diffuse or spread out the image and understand its details what's amazing is that the computer can do this without needing any Specific Instructions or supervision it just kind of figures out and this superpower can be used to create special features which are like little hints or clues about what's in the picture so they named these Clues diffusional features or diff for sure these diffs are super helpful because they can be used to compare the two images and find out what they have in common and the best part the computer doesn't need any extra help or fine-tuning to do this in fact when scientists tested this they found out that dift was even better than other methods of finding these correspondences when looking for the same meaning in different pictures what they call semantic correspondence gift performed better than some other popular methods so in essence these scientists discovered a way for computers to be super smart at looking at pictures and understand standing was the same and what's different about and they don't need any help or instruction now what's also cool as well is that they managed to use this on video data even though video data it wasn't trained on so I do think that this is an emerging capability which is actually really interesting and the web page that they have also does have some interactive demos so I think this is a real win for computer vision and it just shows that even on the back end even on certain categories in which you might not think there's a lot of AI progress there are small incremental changes and hundreds of different research papers being published weekly and which these small changes are going to amount to a huge gradual change over the next coming years the paper also did show us image editing with diff where essentially if you can plot certain points on certain images then of course you can translate this across into many different other things and like I said I think this has a vast majority of different applications and it will be useful to see one day once all of these AI tools come together and produce a highly complex AI system such as AGI or even as there was also a small update to Bard which actually increased the effectiveness by 30 so Sundar pichai the CEO of Google said we're updating Bard with a new technique called implicit code execution now it runs code in the background when it detects computational prompts improving the accuracy of word and math problems by 30 and this was something that Bard did struggle with for quite some time and it was literally one of the main reasons as to why nobody really used Bard over chat gbt so with this issue being solved it's likely to give Bard some more ground in terms of usability there is also a longer blog post which they talk about how Bard has improved its logic and reasoning skills and they said that essentially they have a new method so the new method allows it to execute code to boost its reasoning and math abilities and this approach takes inspiration from a world study dichotomy and human intelligence notably covered in the book thinking fast and slow so system one is Thinking Fast intuitive and effortless and system two is slow deliberate and effortful and essentially what they've done is they've actually combined these two systems in order to improve the accuracy and Barge responses now what I always found cool about Bard was that it always was super quick so if it's much quicker than chat GPT and is now going to be much more effective with three different prompts I think that over time Google could eventually gain some ground then we had a stunning research paper by Google's Deep Mind in which Alpha Dev discovers faster sourcing algorithms essentially they figured out a way to use new algorithms to transform the foundations of computer and essentially this is going to mean that systems over time get more efficient and more streamlined there was also Microsoft's Orca which was released which I'll describe now it might seem complicated at first but this is actually really simple all Microsoft did

Segment 5 (20:00 - 23:00)

was they published a research paper in which they managed to build a system that was basically nearly just as good as gpt4 but a lot less intensive meaning that we could have a lightweight easily scalable version of gpt4 that as just effective now the paper does go into how good this new language model is called Orca and honestly it is quite effective now I think what this research paper shows is that the way how they train this new AI they essentially just trained it based on the explanations that were from gbt4 and basically what they did was they got these complex explanations got gpt4 to explain them in super simple manners then train the language model on that and there were only a thousand of these and then essentially that language model is very effective so what we're seeing is rapid advancement in the scalability and the level of these language models which means that in future we're going to get large language models with less parameters more lightweight more scalability which means eventually all these large language models which are hosted across tons of servers won't be the norm and orca is pretty much on par with one of opening eyes earlier models which is very similar to chat gbt so it will be interesting to see where this model is in the future and just how quickly these models are able to surpass gpt4 what's also cool was that Boston Dynamics robot actually did manage to get an upgrade which actually allows it to see various different things and allows it to also be an additional worker in terms of many different Industries for many different Apple locations you can see the video right here and it just goes to show that many of these actual robots that are being integrated with AI are a lot more effective than just these large language models that we can prompt because they actually have real daily use and reaching things that we can't reach and seeing see and in analyzing certain situations with the many different senses in their Hardware what is very interesting was a recent test which was also done with the US Air Force in which they actually deny running a simulation in which an AI drone killed an operator so according to this article by the guardian it goes into details on how an AI drone that was deployed used highly unexpected strategies to achieve its goal so the article starts out by saying that the US Air Force has denied its conducted an AI simulation in which a drone decided to kill its operator to prevent it from interfering with efforts to achieve its Mission call Tucker Cinco Hamilton described a simulated test in which a drone powered by artificial intelligence was used was advised to destroy an enemy air defensive systems and ultimately attacked anyone who interfered with that order the system started realizing that while they did identify the threat at times the human operator would tell it not to kill a threat but it got points for killing that threat so what did it do it killed the operator because that person was keeping it from accomplishing its objective so it goes on to say that we train the system hey don't kill the operator that's bad you're gonna lose points if you do that so what does it start doing it starts destroying the communication tower that the operator uses to communicate with the Drone to stop it from killing its Target now of course no real person was harmed and Hamilton who is an experimental fighter pilot has warned against relying too much on AI and said the test showed you can't have a conversation about artificial intelligence machine learning and autonomy if you're not going to talk about ethics and AI now the article continues to state that the US Air Force spokesperson denied any such simulation had taken place

Другие видео автора — TheAIGRID

Ctrl+V

Экстракт Знаний в Telegram

Экстракты и дистилляты из лучших YouTube-каналов — сразу после публикации.

Подписаться

Дайджест Экстрактов

Лучшие методички за неделю — каждый понедельник