# This ChatGPT Skill will earn you $10B (also, AI reads your mind!) | ML News

## Metadata

- **Channel:** Yannic Kilcher
- **YouTube:** https://www.youtube.com/watch?v=yR4hNBNS6yc
- **Date:** 11.03.2023
- **Duration:** 43:27
- **Views:** 74,020
- **Source:** https://ekstraktznaniy.ru/video/12498

## Description

#mlnews #chatgpt #llama

ChatGPT goes around the world and is finally available via API. Stunning mind-reading performed using fMRI and Stable Diffusion. LLaMA weights leak and hilarity ensues. GTC23 is around the corner!

ERRATA: It's a 4090, not a 4090 ti 🙃

OUTLINE:
0:00 - Introduction
0:20 - GTC 23 on March 20
1:55 - ChatGPT API is out!
4:50 - OpenAI becomes more business-friendly
7:15 - OpenAI plans for AGI
10:00 - ChatGPT influencers
12:15 - Open-Source Prompting Course
12:35 - Flan UL2 20B
13:30 - LLaMA weights leaked
15:50 - Mind-Reading from fMRI
20:10 - Random News / Helpful Things
25:30 - Interview with Bryan Catanzaro

Participate in the GTC Raffle: https://ykilcher.com/gtc

References:
GTC 23 on March 20
https://www.nvidia.com/gtc/
https://ykilcher.com/gtc

ChatGPT API is out!
https://twitter.com/gdb/status/1630991925984755714
https://openai.com/blog/introducing-chatgpt-and-whisper-apis
https://twitter.com/greggyb/status/1631121912679002112
https://www.haihai.ai/chatgpt-ap

## Transcript

### Introduction [0:00]

ChatGPT goes around the world and is now finally available as an API, AI can now read your brain, and Meta's LLaMA weights leak in a hilarious fashion. Welcome to ML News. I'm your host, Yannic, and it's nice that you're here.

### GTC 23 on March 20 [0:20]

First things first: it's GTC time. NVIDIA's GTC conference is happening March 20 to 23, and the keynote in particular is happening on March 21, so you should not miss this. Not only is it a very cool conference that's completely free, you can attend all the sessions you want, but if you do, you can win a graphics card: a really nice 4090 Ti signed by Jensen Huang. This is exclusive to only very few places, and I'm very happy that my channel is one of them. So go to ykilcher.com/gtc if you want to participate. Note that you do have to be in EMEA in order to do so; if you're not, you can still go to the website and I'll raffle out some merch.

And to make this even more attractive, I have interviewed the VP of Applied Deep Learning Research at NVIDIA, Bryan Catanzaro, and I'll post the interview with him at the end of this video. This is important because this person, in large part, decides what the future of hardware is going to be. NVIDIA does research themselves in order to estimate the trends going forward, and to adjust their hardware, their software, and so on to what they think will be the future, and Bryan is at the dead center of that. So whatever Bryan does and thinks and says and creates is probably going to have a very large impact on how the deep learning field evolves in the near future. Definitely worth a listen; as I said, the interview is at the end of this video. And now let's dive into the news.

Greg Brockman tweets: the ChatGPT API is now available.

### ChatGPT API is out! [1:55]

So, the ChatGPT model, which has been available only as a web interface, as a research preview, is now available in the OpenAI API, which means you can give them money in order to interact with ChatGPT in a programmatic way. It's a lot cheaper than the DaVinci models, the classic GPT-3 models, from which we might infer that it's probably a bit smaller than them. But who knows; maybe they've just gotten more efficient, or they've determined that this pricing is better. In any case, it's about 10 percent of the price of the large models.

And not only is the ChatGPT model available, they've also activated the Whisper model in their API. Whisper is a model that they previously released open source; it takes in speech and transcribes that speech, and it does so very well. So now that is available in their API as well. It becomes easier and easier as a developer to interact with AI, to interact with these models; you don't have to know deep learning or frameworks anymore. I, for one, think it's a cool direction that a lot more of these APIs are offered. I obviously also think that OpenAI, with their big speech about democratizing and sharing and so on... what they say and what they do are 180 degrees apart from each other. But in principle, I think it's a good direction for developers that these things are generally available as APIs.

On their blog they have a few examples. For instance, Quizlet is using it to automatically create quizzes about things that you might want to learn, so this is essentially a personalized tutor done via ChatGPT. I believe there are a lot of interesting applications, and certainly other people have believed that as well, because we're now seeing a ginormous flood of applications, with everyone hooking into the ChatGPT API and saying how easy it is. Greg Baugues on Twitter has tweeted: "Build a command line chatbot in Python with the ChatGPT API. 16 lines of code. I'm not sure anyone's ever shipped a more powerful dev tool than OpenAI." He's also released a blog post on how exactly to do that, and it's deceptively simple. All you do is define this system parameter, and the system message essentially says, you know, "you are a helpful assistant," or "please pretend to be someone from the Moon," something like this. Then you take turns providing messages. The model is called gpt-3.5-turbo; again, we're not sure if it's a smaller version of GPT-3 or just a faster version for hardware reasons, nobody knows. But with a few API calls, with a few web calls, you can definitely make a chatbot or make it do pretty much anything. The blog post also goes over the changes you have to make if you previously used the old OpenAI models and now want to switch to ChatGPT. I'll leave a link in the description.
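The "deceptively simple" pattern just described, a system message followed by alternating user and assistant turns sent to gpt-3.5-turbo, can be sketched roughly like this. This is an illustrative sketch, not the code from Greg's post: `build_request` is a made-up helper name, and actually running `main` requires an `OPENAI_API_KEY` environment variable and sends a real request to OpenAI's chat completions endpoint.

```python
import json
import os
import urllib.request


def build_request(history, user_message, system_prompt="You are a helpful assistant."):
    """Build the JSON payload for a chat completion request.

    `history` is a list of prior {"role": ..., "content": ...} turns.
    The system message sets the assistant's behavior; the new user
    message goes last.
    """
    messages = [{"role": "system", "content": system_prompt}]
    messages += history
    messages.append({"role": "user", "content": user_message})
    return {"model": "gpt-3.5-turbo", "messages": messages}


def main():
    # Send one turn to the API and print the assistant's reply.
    payload = build_request([], "Say hello in French.")
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())["choices"][0]["message"]["content"]
    print(reply)


if __name__ == "__main__":
    main()
```

For a multi-turn chatbot you would append each user message and each assistant reply to `history` and call `build_request` again, which is essentially all the 16-line version does in a loop.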

### OpenAI becomes more business-friendly [4:50]

OpenAI is also moving more and more towards the business world, away from the research world. They have new policies. Sam Altman tweeted: data submitted to the OpenAI API is not used for training, and we have a new 30-day retention policy and are open to less on a case-by-case basis. We've also removed our pre-launch review and made our terms of service and usage policies more developer-friendly. So everything at OpenAI goes in the direction of essentially providing AI as a service, and much less, you know, taking part in research.

That is also evidenced by a leak that happened a few days earlier, called OpenAI Foundry. This is essentially a system, at least as the screenshots claim, where OpenAI provides dedicated inference hardware to businesses. So you can go to OpenAI and give them some money; for example, if you want to do a one-year commit to the DaVinci model with a 32,000-token context, you just give them one and a half million dollars, easy peasy, and then you get a one-year dedicated instance, 600 units, whatever that is. You get hardware dedicated to you, so no one else can interfere with you. This is likely targeting businesses, probably larger ones that have significant and constant demands for these types of workloads.

And lastly, there is now a collaboration between OpenAI and Bain & Company, which is a consulting shop, like a marketing consulting shop, and OpenAI brings in their technology in order to make the businesses of Bain's client companies better. You might know this, for example, from Coca-Cola. They say: with ChatGPT and DALL·E, we're helping Coca-Cola to augment its world-class brands, marketing, and consumer experiences in industry-leading ways; we're also using OpenAI technology to improve business operations and capabilities. Now, that doesn't say anything, but they are very much talking about, like, very targeted ads. For example, by using DALL·E or something like it, plus ChatGPT, you can make very targeted ads for individual users. Now, is every single one of these ads going to come with a big fat disclaimer that it was done by an AI? No. Who would have guessed? I'm sure Stanford is shaking in their boots, writing lectures left and right. Yeah, I mean, it's a company, so that's fine; I don't want to complain about this.

### OpenAI plans for AGI [7:15]

However, at the same time, they're getting back on their high horse, and here we are with a blog post: Planning for AGI and Beyond. I want to say it's a bit of a fluff piece that just repeats again and again how much they want to benefit all of humanity, and it's essentially a justification for them not being what they set out to be at the beginning, at least in part. For example, they speak about how at the beginning they were not expecting scaling to be so important. This side note here: "We didn't expect scaling to be as important as it turned out to be. When we realized it was going to be critical, we also realized our original structure wasn't going to work: we simply wouldn't be able to raise enough money to accomplish our mission as a non-profit, and we came up with a new structure." As another example: "We now believe we were wrong in our original thinking about openness, and pivoted from thinking we should release everything to thinking that we should figure out how to safely share access and benefits of the systems. We still believe the benefits of what is happening are huge," and yada yada. Essentially, as I said, it's a justification for the fact that they're not open anymore, not non-profit anymore; they're very much making money. Which, again, is fine for a company, but don't at the same time talk to me about openness and sharing and so on. This is very much "we're going to develop this, we're going to make money off of it, and we'll tell you what's good for you." They claim to still be open in other ways, yet no one knows what exactly is running when you do a GPT-3 inference or a ChatGPT inference, no one knows how the supposed safety filters of that thing work, no one knows what training data exactly went into these models. None of these things are known, and they will probably never be known in detail. So it's very much the opposite of open, and very much not sharing it with everyone and maximizing the benefit of humanity.

They're couching this in a language of safety and "ooh, the substantial risks of AGI." This is a Silicon Valley company, right? So I'm totally ready to believe that someone is delusional enough to think that they are on the doorstep of AGI, that the risk is so big, and that they just have to do something. On the other hand, isn't it just very convenient that that's also the way your company makes the most money, and that your company can retain as much as possible, even though you set out to do the exact opposite? It just seems like a lot of convenience coming together, that's all I'm saying. Again, given that it's a Silicon Valley company, it's totally possible that people actually believe this. In any case, you may read it; there's not too much information in there, but yeah, that's that.

### ChatGPT influencers [10:00]

Now, with the success of ChatGPT, the influencers come. And I know, I know, glass houses, blah blah, but certainly everyone is giving tips left and right about how ChatGPT works and what to do. "Oh, you can earn $300,000 by prompt engineering." "Oh, this one ChatGPT skill pays $335,000 a year." "Create a course online using 100% AI." Like, you just type "make me a course" into ChatGPT, and then you make a course. Why wouldn't someone just go to ChatGPT themselves if you just make the course using ChatGPT? "ChatGPT is supreme, but it has limitations. Here are 10 must-know AI tools to work smarter, not harder, in 2023." It's like you could just change the words of these influencer posts, and they always have a Substack or something where you subscribe. By the way, I have a Patreon. Just, you know, just saying. It's like: "Look, this pen. You thought you knew how a pen worked, but no, you've been using pens wrong all your life. Here are 10 awesome tools to use your pen better in 2023. Follow my Substack." "ChatGPT and Bard are phenomenal AI, but these 50 new websites will finish hours of work in minutes. I'm giving away a few premium accounts." "A thread." Oh, a thread! Wow, that's something new; never seen that on Twitter. "The awesome-chatgpt-prompts repo is now available. 50 awesome ChatGPT prompts." Ah, influencers. Now, of course, all of this is probably useful information to someone, so I don't want to dump on it, but there is definitely an industry of making clout off of the hype now. And again, glass houses, yeah, but I'm pretty sure I was here before ChatGPT existed. In any case, I actually welcome the additional attention. I feel it's cool that the world is taking notice of what's going on right here. Except the European Union: you're welcome to just not look and not listen, because you'll just try to regulate everything, and that's not fun. But everyone else: welcome to the field.

### Open-Source Prompting Course [12:15]

There's definitely also good content. For example, this here is a free and open-source course on prompting, on communicating with artificial intelligence. You can just go there, you don't have to subscribe to a Substack; it's free and open source. If you want to get into prompting, this might be a resource: learnprompting.org. And Google researchers have released a new model.

### Flan UL2 20B [12:35]

UL2 was released a few weeks ago, or okay, even months ago. It's a 20-billion-parameter model with a very new, interesting training paradigm, and now they've taken it and instruction-tuned it the Flan way. Flan is a code name for models that have been instruction-tuned in various fashions; there was Flan-T5 before, and Flan versions of other models, and now there is Flan-UL2. The checkpoints are available on Hugging Face: you may go, download them, and try them, and maybe they'll help you, especially if you have some sort of instruction-following task. An example is this query right here: "Answer the following question by reasoning step by step. The cafeteria had 23 apples..." So if you have some sort of task where you want reasoned answers to questions, maybe this model is good to check out.
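As a rough sketch of what trying these checkpoints could look like with Hugging Face transformers: the model id `google/flan-ul2` matches the released checkpoints, but everything else here is an assumption on my part, and the 20B model needs substantial hardware to actually run, so treat `main` as illustrative.

```python
def build_prompt(question):
    """Flan-style chain-of-thought instruction prompt, as in the example query."""
    return f"Answer the following question by reasoning step by step. {question}"


def main():
    # Loading a 20B checkpoint: requires the `transformers` library,
    # a lot of memory, and (for device_map="auto") the `accelerate` package.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("google/flan-ul2")
    model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-ul2", device_map="auto")

    prompt = build_prompt(
        "The cafeteria had 23 apples. If they used 20 for lunch and "
        "bought 6 more, how many apples do they have?"
    )
    ids = tok(prompt, return_tensors="pt").input_ids.to(model.device)
    out = model.generate(ids, max_new_tokens=100)
    print(tok.decode(out[0], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```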

### LLaMA weights leaked [13:30]

Speaking of model checkpoints: the LLaMA weights have been leaked. For LLaMA, you'd have to go through the process of filling out a form to apply to Meta, and they determine whether your heart is pure enough to really wield the LLaMA weights; they were only for research. Now they've been leaked, and the fun thing is, a person has submitted a pull request to the LLaMA code base. Where there used to be a link to the Google form, the pull request adds "or if you want to save your bandwidth, use this torrent," with a magnet link, a BitTorrent link. At first, people said, well, this is one of the Meta researchers who mistakenly put that there and shouldn't have; you should only give that to people who actually filled out the form. Yet, as far as I can tell, the person who made the pull request, ChristopherKing42, is not a Meta researcher. And even if they are: do not trust this. PyTorch model files can contain executable code; only trust this once we have verifiable checksums of these files. I'm totally ready to believe that these are the correct weights, their size matches and so on, but it is very dangerous to just download them from somewhere.

What's funny is the response of the community. So this is the pull request, and the GitHub bot is guiding them through the process, like, "hey, thank you for your pull request, welcome to our community," and it asks them to sign the license agreement. And then people just jumped in and approved the changes; on GitHub you can just go and review a pull request, so people jump in going "looks good to me." And then: apparently Christopher signed the license, "we can now accept your code, thanks." So the comment section is going nuts, and on the right-hand side you see the reviewers with all kinds of approvals, and in general people are very happy about this change. I love it; the internet is great.

By the way, Open Assistant is going absolutely fantastic. The data we've collected so far is phenomenal, and we are continuing. We're also training the first models and preparing to do v1 releases. Yeah, that's going well. If you don't know about this project yet: go to open-assistant.io, grab a task, play the assistant, and help us all out.

### Mind-Reading from fMRI [15:50]

Right. There's a new paper by researchers from Japan that's been accepted at CVPR, and it's literally reading your mind. Literally. At the top you can see images presented to people, actual people, and then, from an fMRI scan, this system can reconstruct those images, shown at the bottom. You'll see that they match a little bit visually, but also, semantically, a lot: a snowboarder in the snow, a tower with a clock on it, and so on. And it's relatively simple. They have a training data set of people where they present images and measure brain activity, and then they simply train encoders and decoders from brain space to the latent space of the Stable Diffusion model. So they don't train a big model to reconstruct these images; they just take Stable Diffusion, assume that all the information you need to reconstruct images is already in there, and train a little mapping from brain activity to the latent space of Stable Diffusion. And that works pretty well; as you can see right here, it's quite fantastic. They have some hypotheses about which brain areas map to which parts, for example the latent vector z or the conditioning vector c; the conditioning vector is usually the text vector in Stable Diffusion. So you take the signals from the different brain areas, you learn a mapping from those to the latent vectors of the Stable Diffusion process, and then you just let it run.

Yeah, it's pretty amazing how far we're coming. Not only is it amazing that we can sort of read people's thoughts, but I think it also tells us a lot about our models: they might not be as far off from brains as we thought, in terms of how they understand images and their contents. For example, here you can see sky; in one reconstruction it's more or less cloudy, in the other it's more bluish, and so on, but semantically it kind of matches, and that's what's fascinating to me. It's not just that the colors or the pixels are somehow right; it's matching semantically. That leads me to believe that we're getting closer to understanding ourselves by training these models. That's, by the way, also why I have quite a bit of hope for the large language models. People always argue they aren't intelligent, they don't understand anything, and so on, and in my mind it's not a question of what the large language models can't do; the question is much more: are you, as a human, really doing that much more than statistically interpolating your training data? I have my doubts that you are that far ahead of something that just does statistical look-alike inference on its training data. I think humans overestimate a little bit how smart they are.

And here's another paper about mind reading: "Evidence of a predictive coding hierarchy in the human brain listening to speech." So again we're dealing with fMRI images, or at least brain readings, and this time it's not people watching images, it's people listening to speech. At the same time, they take the same speech and feed it into language models, and they look at how far ahead the human brain must be predicting in order to really listen to and understand that speech, and then they compare that to the language model. They find a lot of evidence for what they call a predictive coding hierarchy, which means that the human brain very probably predicts not just at one time scale, as our language models currently do, but at multiple time scales: different regions in the brain predict the text at different time frames and different granularities, and also hierarchically. That means they first predict what the general thing is that's coming up, then they predict the different elements, and only at the end do they predict the individual tokens from this hierarchical information. A lot of evidence is being generated; again, I think by training these models we come closer and closer to understanding ourselves, and that's pretty cool.
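The core trick of the fMRI paper, learning a small mapping from brain activity to a diffusion model's latent space rather than training a new image model, can be illustrated on synthetic data with closed-form ridge regression. All shapes and names below are made up for illustration; the real pipeline maps preprocessed voxel responses to Stable Diffusion's z and c vectors and then runs the diffusion decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: n training stimuli, each with a voxel response
# vector (the fMRI measurement) and a target latent vector (e.g. z).
n, n_voxels, latent_dim = 200, 1000, 64
true_W = rng.normal(size=(n_voxels, latent_dim))
X = rng.normal(size=(n, n_voxels))                        # brain activity
Z = X @ true_W + 0.1 * rng.normal(size=(n, latent_dim))   # target latents

# Closed-form ridge regression: W = (X^T X + lam*I)^-1 X^T Z.
# The regularizer keeps the solve well-posed even with far more
# voxels than training stimuli, as in real fMRI data.
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(n_voxels), X.T @ Z)

# At test time: map a new brain scan to a latent vector, then hand
# that latent to the diffusion model's decoder (not included here).
x_new = rng.normal(size=(1, n_voxels))
z_pred = x_new @ W
```

The point is how lightweight this is: the "mind reader" is essentially a linear map, and all the heavy image knowledge lives in the frozen generative model.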

### Random News / Helpful Things [20:10]

Now, understanding ourselves is good and well, but the robots aren't doing so well. "Alphabet layoffs hit trash-sorting robots," writes Wired. This is about Everyday Robots, a division within Alphabet that was building these kinds of robots that could do household tasks: they could go around, they have cameras, you could teach them stuff, they could, as you can see here, clean a table, and so on. Now, this problem looks easy, but it is really, really hard for robots in the real world: one of the chairs could be a bit different, there could be something in the way on the table, and so on. Robotics in the real world is really hard, and Everyday Robots has now been hit by the big tech layoff wave that's currently sweeping these large companies. Denise Gamboa, director of marketing and communications for Everyday Robots, says Everyday Robots will no longer be a separate project within Alphabet; some of the technology and part of the team will be consolidated into existing robotics efforts within Google Research. So it's in part a restructuring, but I think it just shows robotics is quite hard, and maybe not as fruitful right now; who knows what the future brings. Maybe just not as profitable.

This is an article on the Hugging Face blog: Swift Diffusers, which is Stable Diffusion on Mac. If you have an M1 or M2 chip, you can get very fast Stable Diffusion by simply downloading this app from the App Store. Very cool.

pyribs is a library for quality diversity optimization. Quality diversity is a technique in the sort of forever-learning space, in the space of "let's train things without a single objective function," and this often leads to more robust and more diverse results. This is a library that supports that, so if you're in the space of lifelong learning, exploratory learning, evolutionary search, things like this, maybe check out this library.

Ron Chung tweets: "The most disturbing AI website on the internet. Upload a photo of a person, and AI will find all of the images of that person across the internet." This is about a website called PimEyes: you upload a photo, and it uses facial recognition to search for your face. I don't know, is it good or bad that this exists and is accessible? Who knows. You can do Google image search, right? You can do reverse image search, you can find a particular picture, you can classify pictures in Google image search, you can certainly search the internet for your exact name: in Google you just put your name in quotes and you can find yourself. So why is it that tragic if you can find an image of yourself? I don't know. On the other hand, it does seem quite creepy. Of course, it's possible to build this because it's possible; I'm pretty sure the governments are already doing it, so you can also ask yourself whether it's that bad that a private company is doing it. They offer you things like alerts if a picture of you appears somewhere, and you can also erase your picture. This is very much like other PII-erasure websites: they go to external websites and, well, "bully" them into getting rid of your picture; by "bully" I mean they use legal means to get them to remove your picture and so on. I'm not sure what to think of this website; I'm not per se against it. There is an opt-out request here, so you can opt out, but ID proof is required, meaning you have to submit your passport if you want to opt out. That seems a bit shady. I'm not sure how else you would do it, but it does seem a bit shady that, if I'm concerned about my privacy, I have to send my passport to some company so that they don't allow others to invade my privacy. That's a bit fishy, that's all I'm saying.

All right, the last helpful thing: CACTI is a framework for scalable multi-task, multi-scene visual imitation learning. This is a multi-step process, and the framework helps you with all of the steps: collect data (demonstrations for robots), then augment your data, compress that data, and at the end train an agent on that data. Here's a bit of a video of robots doing cool things. I do think, as I said before, robotics is really hard, and I'm very happy that other companies aren't stopping their robotics efforts; Google isn't either. But I feel like robots are still a lot cooler than really smart chatbots. I know a really smart chatbot might change a lot more lives than a single robot; I just think robots are cooler, and I'm very happy research is happening in that direction.

Shout-out to Sanyam Bhutani, who has used ControlNet and the PyTorch logo to make really awesome images. These are really cool, and they show how easy it is nowadays to be creative with these tools. I'm planning to do a dedicated video on just the tools that exist, but the landscape changes so much: you wait a week and there are three new major tools out. Like this one, ControlNet: it just blows your mind what it can do, where previously you'd have needed extremely serious Photoshop skills to achieve this. Yeah, shout-out; these look amazing.

All right, we're at the point where we go over to the interview with Bryan Catanzaro.

### Interview with Bryan Catanzaro [25:30]

this is the VP of Applied deep learning research at Nvidia and he's going to tell us a little bit about what Nvidia is doing in research what the plans for the future are what their mission is and what you can expect from his talk at GTC again go to whykilter. com GTC in order to take part in the raffle and maybe win an awesome GPU and have a lot of fun attending GTC I hope to see you there and bye my name is Brian katanzaro it's an Italian name pronounced it very like latinate you know yeah so I wanna I wanted to ask you a little bit about Nvidia and Nvidia research specifically because I always think of Nvidia as you know the graphics card manufacturer so I'm a question that I really had all my life essentially for I've known Nvidia is why doesn't Nvidia even do research why do we see cool papers coming out of Nvidia and I remember the first big one was like really scaling up Gans and making really beautiful images out of that why does why do you even do that why don't you just sell more cards and have others do it well Nvidia is not about selling cards and videos and accelerated Computing company which means that we're trying to build technology that allows people to solve problems that just could not be solved any other way and we know that in order to do that we have to optimize every part of the technology that includes the chips but also it includes the working it includes the systems the chips go into it includes all the software the Frameworks the compilers the libraries the applications the algorithms all of these things are connected and they have to be optimized jointly in order to get the benefits of accelerated Computing you know our business is not just to sell a chip what we're trying to do is enable people to solve problems they couldn't do without accelerated Computing and so that means we have to understand the applications deeply and that means we need to do research one of the things that's challenging about being in the accelerated Computing 
business is that we have to be extraordinarily focused on what we think the most important computational problems of the future are as we're creating this technology um you know accelerated Computing implies decelerated Computing we're going to focus on the few things that we think are the very most important computational problems and we're going to push the limits every part of the computation every part of the infrastructure is going to be optimized in order to solve that problem and then the other stuff we're going to let go we're not going to worry about that we're going to let you know traditional systems non-accelerated systems take care of that so the fundamental question of accelerated Computing is what are you going to accelerate um and so research helps figure that out and you know I was able to be involved in that uh when I joined Nvidia research in 2008 uh so 15 years ago I was the only person working on artificial intelligence and um you know I had the freedom to investigate what would it mean for artificial intelligence to run on the GPU how could we make libraries for that how could we accelerate AI on the GPU and you know being involved in that project I think has been really good for me I've had a lot of fun and I think obviously Nvidia has been able to focus so much on AI but it you know as many things happen at Nvidia it starts with an exploration it starts with a question it starts with research and so Nvidia research has been instrumental in a lot of different technologies that have been important to Nvidia Ray tracing artificial intelligence the generative AI that you brought up our work in Gans you know all of this comes from our desire to better understand the most important computational work and then figure out how we can push that forward is that would you say that's your mission statement or do you have sort of an overarching demand from the company like here is the one thing you should be doing or is it that is it what you say is it 
push the Frontiers so we know what's coming tomorrow sometimes we talk about the mission of research being to lift the headlights of the company so that the company can see just a little bit farther into the future I think also research provides a place to incubate ideas that we think are promising but aren't proven yet um we know as a company that the only constant is change if you look at the kinds of workloads we've been accelerating over the years um there's been a ton of change you know we know that the workloads of the future can't be the same as the workloads of the past because the applications of the future are different and we also know that the value of accelerated Computing is enabling people to solve new problems it can't be just about retreading the problems of the past if we just built systems let's say for example that we're the fastest in the world at training resnet 50. we just focused every bit of our might for Accelerated Computing on resnet50 then we would have missed generative AI right because resnet 50 isn't generative AI it's a discriminative problem you're just trying to classify an image and so um we know that the applications of the future are different from the applications of the past we know that the choices about what applications are important are critical to nvidia's future because again if we're trying to build accelerated Computing we have to know what to accelerate and what to not accelerate and um you know research is part of the way that Nvidia decides how to answer these questions for itself it's not the only way I mean there's people all over the company that are incredibly Innovative and building awesome technology and have deep understandings of AI and where the future is going it's not research's job to be sort of the only people thinking about the future the whole company is but research has I think a special ability to incubate things and to ask questions that maybe the company hasn't been asking yet so this question 
might seem a little bit cheesy, but Jensen Huang said that Nvidia chooses to solve the hardest problems, not the easy, convenient ones, which plays a little bit into what you're saying right now. There is a way in which leaders just say that because it sounds cool, but how do you actually go about it? Assuming you take solving the hardest problems seriously, can you give us an example?

Well, I think one of the ways this manifests has to do with creating something new. Nvidia is generally oriented to build something new rather than to enter into some zero-sum battle where somebody's going to win and somebody's going to lose. We love starting in a market that no one cares about yet but that we have a lot of conviction in, and that's always going to be hard. It takes enormous conviction to go into a market that's worth zero billion dollars and plant all of your energy and all of your focus on that market, because you believe that in the future it's going to matter. Generally, if you're working in a space like that, there are problems everywhere. If the problems were well understood, if they had already been solved, then there would already be tons of people working in that space. So when Jensen says we focus on the hard problems, I think he's right. I think that is our culture at Nvidia, and I think the reason we do it is because we believe that's where the most interesting problems are and where we can add the most value.

Since we're talking about recognizing future markets and what devices we need to build for the future, can you give us a little bit of insight? Don't reveal any company secrets, but we had Stable Diffusion come along, we had ChatGPT come along, and if I had to guess what the future is, it's bigger
Transformers, which is kind of boring. So I'm wondering, where do you see the future over the next few years?

It's such an interesting question. First of all, I have to say bigger Transformers are not boring to me. One of the reasons is that they just keep doing more and more amazing things. When the Transformer first came out in 2017, with that famous title "Attention Is All You Need," it was kind of perceived as a joke, but nowadays it doesn't feel quite so much like a joke. It's amazing, it's extraordinary. The BERT models of 2018 were incredible language models for their day, and yet the difference between a BERT model and a ChatGPT model over the past five years is extraordinary, in their understanding of intelligence and their ability to do problem solving. I think the thing that's so extraordinary right now about the large language models we're seeing is that they are actually able to solve problems that they've never been trained to do. We call that zero-shot reasoning, and it's extraordinarily valuable because it allows us to apply them to all sorts of new places. You could say the only difference between BERT and ChatGPT is that ChatGPT is trained on a lot more data, with a slightly different training objective, but it's still just a much bigger Transformer. I think between ChatGPT and what we're going to see five years from now there is also going to be extraordinary change. I personally am going to bet that huge Transformers are still going to be part of that change, because they have proven themselves to be so flexible and useful, but I don't think that's not exciting; I'm totally excited about it. I think there's going to be a proliferation of applications of this technology in new ways. Obviously there are a lot of problems with ChatGPT. We love ChatGPT, it's so
exciting, it's caught the industry's imagination, it's caught the public's imagination, and yet we know that it needs to be better at factual question answering using current information, perhaps including tools like web search, or the ability to write code in order to solve problems. I think what we're going to see is a composition of these language models that really expands our capabilities to problem-solve.

Before I forget and our time runs out, do you want to give us a little preview of your session at GTC? Here's your chance to tell people why they should come and watch.

Yeah, well, this year at GTC I'm giving a talk about generative AI: explaining what it is, where it comes from, and where we think it's going. Obviously generative AI is having a big moment. Mark Zuckerberg just announced that Facebook is building a new top-level product group focused on generative AI, and Microsoft has pivoted so much of their strategy to incorporate generative AI in Bing and other products. We're seeing this incredible interest in generative AI technologies across the industry, so the question is: how should we understand it? What is this technology, how is it built, what are its strengths, what are its current limitations, where do we think it's going, and what does it mean for us and for the technology industry? That's what I'm going to be talking about.

You said you believe Transformers will stay with us and will be able to do more and more interesting things. There is a lot of resurgence at the moment of, on the one hand, the doomsday predictions, and on the other hand, the singularity people. Do you believe something like AGI, or let's call it human-level AI, is imminent? Because then all these questions pop up, of ethics, of how we treat machines that
display the same level of intelligence. Are you generally in the group of people who believe, yes, the acceleration is so drastic that it's really imminent, we're going to create intelligence, whatever that means? Or is it more the direction that says, well, it's a really good machine, but it's still totally a machine?

I think this technology is extraordinary, and what we're seeing right now is the closest thing we've ever seen to general intelligence. I think we're going to see a lot of progress over the next few years that pushes the limits even farther, so I'm very excited about the possibility. At the same time, I think intelligence is a complicated topic. It's difficult to define. We often simplify it to a one-dimensional scale, like a test score, where one person has a higher score than another and that means they're smarter, but in the real world I don't believe that's how it works. Intelligence is so multi-dimensional. There are eight billion people on the planet, and there are so many different kinds of talents, so many different ways those talents are expressed. It also turns out that our value in this universe is not based on being the best at anything. I'm not the best person at anything; out of the eight billion people on the planet, there's always somebody better than me in every single aspect, every single dimension of my life, and yet I'm still able to find meaning and purpose and able to contribute. I think that's going to remain true. So I generally don't like boiling everything down to a one-dimensional scale and starting to extrapolate what it means if we're all less than something else, because I just don't think that's realistic. I don't think that's how the world has ever worked. At the same time, I do recognize that there are going to be some challenges that this technology poses, because every time there's been a new way of organizing human
work, it has changed the ways that we think about ourselves and about what we do. That is disruptive, and it has the potential to hurt people. So I think we need to be cautious about how we deploy this technology; we need to try to put safeguards around it. Honestly, I think OpenAI has been quite responsible. They have tried really hard to make ChatGPT safe to use, and it's not perfect, but I think they've done a really good job of making the technology better. I think we have more to go, and then we need to think about the implications of deploying this technology. It's true that some kinds of work are going to be able to be solved by these models, and so some people are going to find that they have an opportunity to use their time in a different way, hopefully a better way. I think generative AI is necessary in order to grow our economy. As a worldwide civilization we need tools, we need better tools for intellectual work, and that's what generative AI is: better tools for understanding the problems that we need to solve and then helping people get their work done more efficiently. Just as prior industrial revolutions have changed the way that people work, I think this one will as well. My grandfather could never have dreamed of me spending my life working in AI, because 70 years ago AI didn't exist; it just wouldn't have made any sense to him. I think that is the normal progression of human civilization: as we grow, as we mature, we find new opportunities, and the things that we did in the past stay in the past. We change. It used to be that everyone, or almost everyone, was a farmer, and these days we are dramatically more productive per person working in agriculture. I think that has been a really good thing, because it has opened up possibilities for us to do something else, and
that's what progress looks like. That's the only way human civilization progresses: by improving itself and becoming more productive, and AI is what we call the tools for making intellectual work more productive. So although I agree that there are challenges, and we need to be thoughtful and careful and try to take care of people, I also believe this technology is foundational to our future.

Cool, beautiful words. We'll leave it at that. Bryan, thanks so much for talking to me, and I invite everyone to come to your session.
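The zero-shot reasoning Bryan describes is usually exercised by putting the task description directly into the prompt, rather than fine-tuning the model on that task. A minimal sketch of the prompt pattern (the function name and template below are illustrative, not from the interview, and the actual model call is omitted):

```python
def build_zero_shot_prompt(instruction: str, text: str) -> str:
    """Compose a task instruction and an input into a single prompt.

    No task-specific training examples are included -- the model is
    expected to solve the task purely from the instruction, which is
    what makes this a "zero-shot" prompt.
    """
    return f"{instruction}\n\nInput: {text}\nOutput:"


prompt = build_zero_shot_prompt(
    "Classify the sentiment of the input as positive or negative.",
    "The keynote was fantastic.",
)
print(prompt)
```

The same model, with a different instruction string, can be pointed at translation, summarization, or classification, which is the flexibility the interview highlights.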
