Why and When You Should Use OpenAI o1
57:17

The AI Advantage 08.10.2024 18,987 views 539 likes updated 18.02.2026
Video description
Check out The AI Advantage Community for weekly lectures like this plus curated Learning Plans, Guides, and Resources all designed to teach you how to get the most out of tools like OpenAI o1. 👉 https://myaiadvantage.com/community?nab=0 I held this lecture on the best OpenAI o1 use cases in our private community recently and wanted to share it with you here on YouTube as well! Links: https://myaiadvantage.com/community?nab=0 https://openai.com/index/introducing-openai-o1-preview/ https://chatgpt.com/ https://platform.openai.com/playground https://docsbot.ai/tools/gpt-openai-api-pricing-calculator https://poe.com/ Chapters: 0:00 Introduction 1:21 My #1 Use Case 12:21 01 Use Cases Lecture #ai #openai #o1 Free AI Resources: 🔑 Get My Free ChatGPT Templates: https://myaiadvantage.com/newsletter 🌟 Receive Tailored AI Prompts + Workflows: https://v82nacfupwr.typeform.com/to/cINgYlm0 👑 Explore Curated AI Tool Rankings: https://community.myaiadvantage.com/c/ai-app-ranking/ 🐦 Twitter: https://twitter.com/TheAIAdvantage 📸 Instagram: https://www.instagram.com/ai.advantage/ Premium Options: 🎓 Join the AI Advantage Courses + Community: https://myaiadvantage.com/community 🛒 Discover Work Focused Presets in the Shop: https://shop.myaiadvantage.com/

Table of contents (3 segments)

  1. 0:00 Introduction 317 words
  2. 1:21 My #1 Use Case 2142 words
  3. 12:21 01 Use Cases Lecture 8682 words
0:00

Introduction

So today I have a bit of a special upload for you. As you know, OpenAI recently released the o1-preview model along with o1-mini, and let me tell you: after a few weeks of using this model in my day-to-day workflows and leading endless discussions around what to use it for, I came to some interesting and unique conclusions about what you actually want to be using this model for. I collected all of that knowledge into a lecture that I presented to the AI Advantage Community. This is one of 15 to 20 events that we run inside of the membership every single month. As you can see here in the schedule, this is the LLM Innovations format, but we also have the same thing for the Creative District, which is all about image and video generation, and the No-Code District, which is all about external applications that use AI for various workflows. Then we also have foundational events for beginners, panel discussions, and office hours, which are like group consulting sessions.

Now, as you might know, this is our paid membership, and usually all the lectures in there are behind the paywall. But this one in particular I really wanted to share with all of the subscribers on YouTube, because I feel like it works by itself. A lot of the community content falls into a longer learning path that takes multiple hours to complete and to actually build a skill, but this event really talks about the basics of o1: where to use it, how to use it, and concrete examples of when it's superior to any other LLM out there. That's why you're seeing it here on YouTube. So without further ado, let's get into this month's edition of LLM Innovations, the lecture series that I hold every single month inside of the AI Advantage
1:21

My #1 Use Case

Community. I want to start by showing you what got me so excited to do all the research for this lecture: I found that one of the main tasks that I keep returning to with ChatGPT is actually better performed by o1. So without further ado, let me share my screen and show you the result of what o1 can do for you here. I'm always going to do it in this format: I have two conversations that I prepared in advance. One of them is in 4o, let's keep that on the left side, and one of them is in o1, let's keep that on the right side.

I have some other examples, and we're going to talk about real-world use cases, that's really the main focus of these presentations, but the task that I'm referring to is rewriting text. It's state-of-the-art at rewriting. Let me just back up for a second and tell you that already with something simple like this, just "make it better" (sure, you could be more precise with that, but I kept it quite generic here), I always prefer this o1 result, concretely o1-mini, to o1-preview and GPT-4o and Sonnet and anything else on the market. It's actually crazy how consistently good it is. I'll point out the nuances that make it better in a second, but let me first tell you about my general workflow with writing these days.

It always starts with original thought. I do not begin inside of the AI. We talked about this before, but we're just at a point in time where if you let AI do all the writing for you, it's not going to be good enough, certainly not by my quality standards, and I think society in general agrees that AI-written text, standalone, is probably not good enough if you're using it in a professional environment. So I always start with my own writing, even if it's super simplistic, even if it's just a brain dump. And look, don't worry, this is not going to be some writing-with-AI masterclass. I just want to
outline the process I usually follow. Even if it's just this brain dump or a bullet point list where I throw down all the ideas that I have, I take those, then I go to Sonnet 3.5. I use a very simplistic prompt there to rewrite it. Maybe I specify a little bit about myself, sometimes I give it my custom instructions, but usually I just tell it "make this flow well". That's a really good universal prompt that just works. Sonnet does that, then I take the result, maybe I make slight edits, mostly not, I bring it into ChatGPT, and then I let ChatGPT be the editor. This is the flow that I used a week ago. Then I take the result from Sonnet and the edited result from ChatGPT, and I play a little puzzle game where I piece it together in a way that works for me. That's how I do my writing these days.

Now, the first draft would usually already have been usable, even if I just dump it down. With all the content creation, I have quite some experience with formalizing my thoughts, so the first draft is not bad already. I bring that into Sonnet, which improves the flow and makes it better every time, and then I used to bring it into ChatGPT to edit. Now I bring it into o1 to edit. We're going to go step by step through this, but I just wanted to show you this one example of why I will bring it into o1-mini every single time from here on out, no discussion.

The reason I will do that is the following. This example shows well why this is going to be worth spending 45 minutes of your time here with me: it's just better. Simple prompt: "make this better". If you're not familiar, this is the first question in our FAQ on the community sales page. I believe it's something like "Why should I subscribe to the community if I can learn about AI for free online?" It's one of the main points raised by people. I think it's a fantastic point, but I think we also have a fantastic answer, and the
answer so far has been something along these lines. Bear with me for a second here and let's qualitatively review this. The answer that we usually give is: "Learning about AI for free is possible but comes with the challenge of every piece of content being isolated information that ends once the video/article is over, and the lack of thoughtful discussion surrounding all aspects of AI." So every piece of content, every YouTube video, stands by itself, and it doesn't allow for a nine-hour-long learning experience like we offer with the learning paths, for example. That's the first point. "For example, the AI Advantage Community YouTube channel does provide quality content," and then it goes into these points. You can have a look at this; I don't want to waste your time with all the details here. I really want to focus on the first paragraph, the way I, or rather me and the team in collaboration, wrote this originally: "Learning about AI for free is possible but comes with the challenge of every piece of content being isolated information that ends once the video is over," and then there's a second point here.

Now, this is the main difference between 4o and o1, and you can see it right here in the very first paragraph. Again, identical prompt. This one is o1-mini, this one is 4o. I believe you can see it here: yeah, 4o, o1-mini. So what's the difference? First of all, it's more concise. Fair enough, that doesn't mean a lot by itself. But then check this out: "Learning about AI for free is definitely possible, but it often comes with the challenge of fragmented content. Each video or article stands alone," blah blah blah. I want you to notice this: "comes with the challenge of fragmented content". It's rewriting it, it's editing it, it's making it better. Why does it say "challenge of fragmented content" here, and over here, learning AI is entirely possible, but it
often presents challenges? Why is this? Any guesses in chat why this one would say "challenge", singular, and this one says "challenges"? And this is not an anomaly. If you look across all the generations in 4o: "challenge", singular. This was my long conversation, but it's the same thing, "challenge", singular. So it's consistent. And then if I look across all the o1-mini and o1-preview generations: "challenges". We can look at more, but trust me on this: "challenges", every single time. From the same prompt, it always points out that there are challenges. See that?

So I'll give you the resolution here. The point is that I pointed out multiple challenges here, and actually, when learning AI just off the internet, it's not going to be just one challenge. That's a complex problem. The internet is a wide place, and we're talking about human education, so there's going to be more than one challenge. Maybe I should have come up with that when writing this original copy, but I didn't, and 4o also didn't, because what 4o is doing is just generating token by token. It goes ahead, it creates this, then it creates this, and so on until it arrives here, and then it looks at all the context it has and it says, "Okay, I need to rewrite this: 'comes with the challenge'. Okay, let's rewrite it, 'but the main challenge'." It does not consider the bigger picture, and it certainly doesn't iterate and think multiple times over the bigger picture, whereas o1, o1-mini and o1-preview certainly do think about it. As you can see, this is not many steps. This one took a few seconds. Using o1-preview takes longer because it's slower; this took multiple steps. But every single time, as you could see across all six generations, and this is a really important point, it considered the fact that there are actually multiple challenges. This is a small example. I could nitpick a few other points, and I could show you other examples, which we'll do at the end
of the lecture here, but the point is this. For 4o and all other language models, this is the main difference: they generate token by token, a.k.a. word by word. This one considers the full text and then thinks about it. It "revises the entire passage", it "navigates all AI educational challenges", it "explores the scattered and brief nature of free AI content". It explores it, it thinks about it. It's like a person sitting back in the armchair. Before rewriting this, an experienced writer wouldn't just go ahead and rewrite it word by word. He would think about it: what's the nature of AI education, what's the nature of education, how do people learn? There's going to be more than one challenge, even though the original author said there's one challenge and then later goes on to list another one. There are multiple, and that's what's happening here. That's really the core, the essence, of o1, and that's why it's better at rewriting. It doesn't just consider the full context better, it also mulls it over and does all these extra steps to come up with the best copy possible.

That's why I don't think I'll ever be using 4o for rewriting again, or Sonnet or anything else, because I want this. Arguably somebody could come in here and be like, "Igor, but that's one word." Well, we could spend another hour here picking this apart, and you could do that. I think this is superior copy to this. How much better is it? That's hard to quantify, but it might mean the difference between somebody joining or not joining, and it might mean the difference between it actually influencing their life in a meaningful way. Depending on what you do with your copy, it could literally change lives. I'm perfectionistic in certain ways, but I just have high quality standards, and I really value it if something is better. This is, to me, clearly better, and not
just once but consistently, because it thinks. So that is my first and overarching point. Keep that in mind as we go: there's value to this thinking process that goes deep, but not for everything. I'm highlighting this because this is a thing I do all the time. I use AI to edit all the time. Again, I write, and then I rewrite with AI. ChatGPT is my favorite editor, and now o1-mini is my best editor. Why is o1-mini slightly better than o1-preview? I don't know, it's just what I found in my testing, and a lot of people online seem to share this opinion. Larry is asking a question: "Igor, do you have any custom instructions on for 4o?" No, I don't. This is just a raw output.

So let's get into the little presentation I prepared here. All the visuals in today's presentation are courtesy of Microsoft Copilot. They're not great, but they're okay, and it was super efficient to generate them with it. As I said, I'm working on my Copilot skills, but either way, what you'll see today is all Copilot-generated. So there you go, let me share my screen. All right, so yeah, I think what we started
12:21

01 Use Cases Lecture

there with that use case really illustrates it so well: it just thinks before it continues, and that's the overarching story here. So I prepared that little presentation that summarizes things and gives you a few extra tools, a few ways to think about it. My goal today is going to be, one, to actually fit it within the hour, because a lot of people scheduled for this hour, and two, to end this in an open-ended manner that might inspire further conversation within the community. As you'll see in the end, I raise a problem that you guys can grapple with on your own.

But let us begin here. What's the plan, what are we going to do throughout the next 30 to 40 minutes? First of all, I'm going to give you a super quick overview just to catch everybody up. Then we're going to talk about limitations. Important topic: there are so many limitations with AI. In the course we dedicate a chapter to it, and we talk about this all the time. You need to be cognizant of these to get the most out of these models. Then we're going to talk about usage strategies, so a few tips and tricks and a few platforms that I would recommend. There's one that allows file uploads with o1 today already. It should also be noted that this lecture is dated October 1st, 2024, so at a later point in time some of these things might change. As of today this is all up to date, but AI moves fast. Then we're going to talk about very rough prompting guidelines that I came up with myself and in conversations with others. And last but not least, we're going to talk real-world applications. We could spend hours on that, because there's a lot to discuss there, but there are some good ones.

So let's get into the overview. Very simple: this is a brand new family of models. It's not just a new model, it's a whole new clothing line, so to say, by OpenAI. It's a whole new brand, however you want to kind
of, whatever you want to compare this to. But yeah, these reasoning, think-step-by-step, chain-of-thought type of models are a new line, and they're titled o1, and then probably consecutively o2, etc. They're models that engage in thought before responding. I think I illustrated this quite well with the initial example: they think over the entire process, they think about the beginning, middle, and end, they do additional prompts, additional steps, before they even start rewriting, and they use all of that in their context to get you better responses. That's not always needed. The downside, obviously, is that it's way slower. Some of these generations take 40 seconds, whereas with 4o it might take two.

The different models that we have today are o1-preview, o1-mini (the faster and cheaper alternative), and then the upcoming o1 that we should get, I don't know, maybe in October, maybe later, who really knows. It employs chain-of-thought reasoning. We talked about this in previous prompt engineering lectures where we discussed various papers. Basically it's this concept of "think step by step", where it prompts itself a few times before it does the final prompt. That's the summary. Welcome on board, Gart, good to have you here.

And then it's trained on multilingual datasets. This is an interesting reference point that we talked about in the office hours. I'm not sure this is too useful, but it came up in a conversation recently with community member rasis duava. He pointed out, and I noticed this often too, that in the thinking process it sometimes thinks in Mandarin or in Hindi or in other foreign languages, and you're like, why would it do that? So we had a discussion about this, and he also prompted GPT in various ways, and he came up with this theory, which seems to be true, that behind the o1 model it's not just English datasets that do this reasoning, because a certain type of reasoning and a certain type of thinking
is better done in other languages. Let me go on a super small personal tangent here. I grew up trilingual, in a very unusual setup in my childhood, and everybody who's at least bilingual will know that other languages hide these specific cultural gems and expressions and idioms that you just cannot translate. It's very hard, and they're unique to the language. This is also super tangential, but if you look at the history of psychotherapy, it seems to be really heavily weighted towards the German-speaking world, with a lot of the greats coming from Switzerland and Austria. There's something about the German language, maybe it's in the culture, but I think a lot of it is also embedded in the language, that just goes deeper than what you could express in other languages. Now, certainly there's a lot to that discussion, and we could discuss it forever. I just want to point out that in my experience, and it seems also in OpenAI's testing and development, using knowledge and writing from various languages seems to yield superior results when it comes to reasoning compared to purely going for English-only datasets. Or maybe it's just that there's a lot of English data, but if you include datasets in other languages, it simply performs better. Whatever, tangent over. I just wanted to point that out. And if you look at the reasoning steps, especially in the beginning, it way more often went into other languages. Just an interesting side point there.

All right, so what are the current limitations? Really quickly: inside of ChatGPT, 50 messages per week for o1-preview and 50 per day for o1-mini. Quite simple. You do need a premium subscription: Team, Plus, Enterprise, or Edu, those are the four that can access it. The knowledge cutoff is
in October 2023. The full feature set, with all the tools, is expected at the release of o1. Marcus, in yesterday's office hours, actually surfaced a little screenshot from the OpenAI forum where one of the OpenAI employees confirmed that once OpenAI's o1, the full model, not the preview, not the mini, is released, it will come with all the features. What are the features? I relisted them here for you on the right side. There's memory and custom instructions, there's data analysis, there are file uploads, web browsing, the GPT integration and usage, the vision capabilities, and the advanced voice mode. I'm very excited for this, and I really think that giving all these tools to a model that can go that much further, giving it a bit more context and autonomy, is going to be a major step forward: a bigger step forward than the switch from 4 to 4o, for example, and a bigger step forward than the addition of any single one of these tools. I think it's going to be a major jump, and it should be very interesting here within the community as that happens.

I don't usually talk about the rumor mill too much; we spend time on cold, hard practical workflows. But some of you might know Mr. Strawberry Man here from Twitter, the hype master behind this o1 release, which before was nicknamed Project Strawberry. Now he's back at it, tweeting again. I wanted to feature this one tweet here because I do agree with this point: "The jump from o1-preview to o1 is similar to the jump we saw from GPT-2 to 4. You'll be in a state of shock for months." Now, okay, that might be a little bit overhyped there at the end, but who knows. At the end of the day, I think adding all of these functionalities to a model that does perform better is going to be incredible. You're going to be able to upload all of your personal context, you're going to be able to give it screenshots of your website, you're going to be able to tell it "make this better" and give it
a random screenshot, and it's just going to do its thing. That's quite incredible. And when the quality of doing that is going to be, I don't know, 20 to 30% better than what we got with 4o, that's a huge win. Because let's be real: with the quality of 4o, so many of the use cases that we look at here on this channel and in the community are slightly below the threshold where they become incredibly amazing and you want to use them often. They're just like, "Ah, this is really good, but it's not quite there." That's the feeling as I would describe it in so many cases. If we get a 20 to 30% boost, and if this holds true, then I think we're going to be at a very interesting point in time in generative AI, because as I said, I think the bulk of the use cases that we explore and prompt for are just below this threshold of being useful, and just this little boost in ability and in quality might tip them over. Even more mass adoption would be the logical consequence of that, and then everybody in this room will be not just one but multiple steps ahead, because we're dealing with all this stuff already.

If you use it through the API, right now it's only available to users that are tier four and above. What does tier four mean? This is a screenshot of the OpenAI website. You can see that if you spent $250 or more in the API, then you have it. I believe this is throughout the last month; certainly tier five was, so you had to spend $1,000 or more in the last 30 days, and then you had o1 on release. Right now they've put it down to tier four. If you don't use the API a lot, then you're not going to have it. That's the gist of it. And these are the final numbers; you can check this out if you want. That's the pricing. It's the most expensive model OpenAI has, and it's one of the most expensive models available. Pretty much only Opus is a bit more expensive.
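Editor's sketch: per-token pricing like this turns into a project budget with simple arithmetic, namely calls times tokens times price per token. The Python below is a minimal illustration, not the pricing calculator linked in the description. The prices are assumed list prices in USD per million tokens as of October 2024, the workload (10,000 calls, 20,000 input tokens and 500 output tokens per call) is a hypothetical example, and `Price`, `PRICES` and `project_cost` are names made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per one million tokens (assumed list prices, October 2024)."""
    input_per_m: float
    output_per_m: float


# Assumption: verify these against the provider's pricing page before relying on them.
PRICES = {
    "o1-preview": Price(15.00, 60.00),
    "o1-mini": Price(3.00, 12.00),
    "gpt-4o-mini": Price(0.15, 0.60),
}


def project_cost(model: str, calls: int, input_tokens: int, output_tokens: int) -> float:
    """Estimate the total API cost in USD for `calls` requests of a fixed size."""
    price = PRICES[model]
    total_input = calls * input_tokens    # all prompt tokens across the project
    total_output = calls * output_tokens  # all completion tokens across the project
    return (total_input * price.input_per_m + total_output * price.output_per_m) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000 calls, 20,000 tokens in and 500 tokens out per call.
    for model in PRICES:
        cost = project_cost(model, calls=10_000, input_tokens=20_000, output_tokens=500)
        print(f"{model}: ${cost:,.2f}")
```

Under these assumptions the estimate comes to $3,300 for o1-preview, $660 for o1-mini and $33 for gpt-4o-mini. Treat the o1 figures as a lower bound: the o1 models also bill their hidden reasoning tokens as output tokens, which this sketch does not model.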
It's really interesting to check out this little calculator over here. I'm just going to head on over and show you the website, but this is a nifty little tool that we've also been using in yesterday's office hours. Not sponsored or anything, it's just a cool little tool that's completely freely available. You can go in here and say something like, okay, I have 800 tokens of input, which is pretty common, and it outputs 500 tokens. I was looking at a project where I hook up a form and recommend various free resources that we have to my YouTube audience. My estimation would be that we're going to require maybe 8,000, maybe 10,000 API calls for that, so I can put that in there. The input tokens would actually be a lot, because the prompts that I crafted there include a lot of context and multi-shot examples, so it would be something like this. Output tokens, this would be my estimation. And then here I can quickly check how much it would cost with the various models. If you check it out, o1-preview right here: this little fun project would cost me $3,300 in API costs. Not so great. If I go with the mini, it's just $670, way better. If I go with 4o mini, it's $33. Ha, incredible. And actually the new king in this category is Gemini 1.5 Flash, as updated last week: $3.36. This is how cheaply you can actually generate 10,000 custom emails with a 20,000-token input, which is a lot, as you know. So this is an incredible price. Is the quality good enough for writing emails? Eh, probably not. My preference here, qualitatively, would be Sonnet, coming in at a juicy $690. So just a good calculator, a good tool to bookmark. The presentation and all the links will be in the event recording as per usual, so no worries about that. I just thought this was useful.

Okay, back to business. Let's get to the interesting part here. Let me skip over these; we were right here. So, the interesting part: how do you use this, how do we get more out of this, what have I and the team found so far? First of all, you can use it via ChatGPT Plus. This is the traditional way. You're only going to have 50 messages per week, so I ran into that limit very quickly, especially when I was testing certain things and rerunning o1 three times. You test one prompt three times, then another, and another, and all of a sudden you're done for the week. Not for the day, for the week, with o1-preview. Luckily, for things like rewriting and things that are not related to the STEM fields, mini is actually better, and there we have 50 per day. That's the initial way. Secondly, if you're tier four or higher, which most people are probably not, then you can access it through the API and use the Playground with it as per usual.

But here's my tip if you hit this limit and you want to go further: Poe is your friend. I tested various tools, and multiple community members actually recommended various tools. I know various people are using a lot of these new platforms that have popped up, platforms where you basically pay a monthly subscription, they do the API calls for you, and you get a chat interface. There are many of these, but the one
that I found to be actually the best with o1, and I say "the best" carefully, because that's quite a strong term, but I think it's objectively the best, is Poe. Wait, I have to pull up Poe. Where did I place my Poe? I'm going to reopen Poe, like so. All right, why is Poe the best? It's the only one that has working file upload. Also voice recording, but the file upload is the main thing. You can actually upload PDFs to this and it works with them. Many of the other platforms that I've tested offer file upload, but then with o1, concretely with o1-preview, it just doesn't work, because it's costly. If you upload something, they have to vectorize it, they have to store it, they have to put all those tokens into the context window, and then also pay for it all, so they'd rather disable it. o1-preview on Poe actually works. I could add a file here, run a prompt here, and it will work.

This does require a subscription. I believe it's $25 or $26 a month or something, but I am not entirely positive how many... okay, there you go. You have a million tokens per month, and one message of o1 is 10,000, so here you're getting twice the messages as in ChatGPT. Well, I suppose if you're maxing it out on a weekly basis, you're actually getting half, because in ChatGPT you would get 50 × 4, so 200 messages, and here you get 100 messages in total. But you get them right away and you can use them right away, so if you're stuck, I would recommend this. I know Durk is also saying here that he specialized in that some time ago, and I know Durk is a big fan of Poe in general, as so many people in the community are, because it allows you to use all of these various language models with some more advanced features.

Fun fact, and I'll just throw that in before we move on: the creator of Poe, the CEO of Poe, is also the creator of Quora, which you might be familiar with, a big online forum for answering questions. I believe his name is D'Angelo... Quora... what is
his full name? Adam D'Angelo, yes. And if you didn't know, anybody who has been following the OpenAI drama will know that Adam D'Angelo was, and still is, on the board of OpenAI, as you can see here. So, fun fact: the creator of this popular chatbot platform, which regularly ships features right before OpenAI does (this is not the first time; they always have these additional features before OpenAI), happens to sit on the board of OpenAI. Now, my little conspiracy theory, when I put my little tinfoil hat on here, would be that they actually use this as a testing ground before they ship things into ChatGPT, and this has been the case multiple times now. I don't know, I just wanted to point you towards that. Poe is a great platform, and you can use all of these different chatbots. Now let's move on, if my Mac allows me to. Yes, it does, thank you, Mac.

Okay, so let's get back to the little presentation here. Poe might be the way to go if you want to go beyond the limits. There's a way to pay for more tokens even if you don't have the API access. All right, so a few prompting guidelines, and then let's get into practical use cases, shall we? These are very rough, okay? This is definitely not law. But first things first, the one thing that is clear, and that actually is law, is what OpenAI recommended on release: don't tell it to think step by step. The entire thing is trained and fine-tuned for step-by-step thinking. If you tell it that, you're just going to make it worse. Secondly, as with all prompting, markdown formatting and clear delimiters really help. If you use certain formatting conventions that we talked about many times in the community, especially in the advanced prompt engineering learning path, where we go into how to structure your prompts, how to create bullet points and sub-bullet points, how to use H1, H2, H3 headings for the various parts of your prompts, it still works best, because it's the internet way of
communicating hierarchy. That's what it comes down to, and that just helps. My last point here is that it favors short to medium-length context rather than long-context prompts. As a matter of fact, if you give it super long prompts with five-shot examples, it often just completely disregards them, because in the thinking process it takes on a life of its own, which is kind of creepy, to be honest. It rethinks your input and it's like, "Okay, human, thank you for those instructions, but you know what, prompt number seven suggested that we should do it differently, so I'll just go with that," and all of a sudden it disregards all your initial instructions and does its own thing. That's why longer instructions are not recommended. But they certainly work. From what I've seen in the online discussions, people have condemned it, like, "o1 doesn't work with long instructions anymore, prompt engineering is dead." I think a lot of that is actually just theatrics, there to engage people and to enrage people, like, "Oh my God, now all my prompts are going to be useless." It's just there to elicit an emotional response; it's not really factual.

I have some examples here that we'll go into. The next slide is basically use cases, and we'll go one by one and look at some. But I have some examples where I gave it some of the most complex prompts that we teach within the community and the courses, and it performed really well on them. In some cases it just improved the prompt in a way where it thought it would be even better, went off of that, and then disregarded the templates that we gave it and the five-shot prompts and all the detailed instructions. But it certainly works with long prompts. I would never state otherwise: not a single one of the complex prompts that I tried didn't work. They all worked. They maybe didn't work exactly as they would in 4o, and it came up with its
own ideas of how to interpret all that, but it's aware of all the instructions. It's a very similar, if not identical, language model in the background; the outputs are similar to 4o, right? It just does this extra thinking in between.

So I wouldn't say long prompts don't work, but I will say that it works exceptionally well with short prompts. But not too short, right? In my other examples, remember how I told it "improve this text"? That would be short by itself, but I give it a lot of context with the text. I give it the text, and all of a sudden it's a medium-sized prompt, because there's two or three paragraphs of text. That's what I found to work best. If you just tell it, "Hey, I want to create a business plan, good luck," that doesn't work well. Even for my company, AI Advantage, that doesn't work so well. It works way better if you go towards this short-to-medium area where you tell it about the business plan, but then you also tell it about the company and about your goal with the business, and then you let it do its thing.

So don't keep it too short. Remember when last year we covered AutoGPT and BabyAGI and these different frameworks that popped up, the first agentic frameworks that were publicly available? There the prompting was really just: state the goal, hands off, and let it do its thing. That's not the case here. It actually works better if you tell it the goal but then also flesh the context out a little bit. Add that extra sentence or two, the target audience, flesh out the goal a little bit. That's just my personal take; come up with your own ideas. I'll be sharing things as I learn them, but this is what seems to work best for me and all the other people that I've consulted on this. By the way, this was such a fun journey over
the last two weeks. Sorry to anybody who I harassed with the question "And what are you using o1 for?" Every team meeting, every office hours, I brought this up and got a lot of interesting answers, and this presentation is sort of the culmination of that. It's been fun. I really enjoy doing these Innovations events once a month, where we get to dive deep into a specific topic, and the topic we're diving deep into today is OpenAI o1 real-world applications. That's what we're here for.

We have about 15 minutes left, and in those 15 minutes I'm going to show you various prompts that I ran through both 4o and o1 mini, and then I'm going to comment on the results, on why you would want to consider some of these, and on why this is probably the best model for the things that are listed on screen right now. So let's briefly go over them and then get into the concrete examples. I don't have concrete examples for every single one of them, because we don't have time for all that, but for several of them I do.

First of all, it's superior for content editing. This is the example that I showed right off the bat, where you give it a text and you just want it to rewrite. You're not using it for creative writing, you're using it for editing. Now, I did include creative writing here too, as you can see in bullet point number three, but especially storytelling. This is one that was raised by Daniel Pierce, a team member. He said it's really good at storytelling, and I can confirm. We had a little conversation about this, and it's the same idea: it abstracts away from your requirements for the story and thinks about what makes a good story, how it should be structured, and what other elements could be included to improve it. If it does all that, it's going to craft a better story for you. It's going to consider the beginning, the middle, and the end while writing the beginning, rather
than just going token by token and writing your story. Do you see the pattern here? Every time there's a larger unit that you need to respect before you start generating every single token, every single word, that's where this performs better. Stories are one example; translation tasks and content editing are others.

Bullet point number two here: if you're translating a phrase, it's one thing to translate a word, but it's another to translate some of the more peculiar phrases. I tried it on a lot of German phrases, and let me tell you, German has a lot of weird phrases that don't make any sense whatsoever if you just translate them word by word, but they're really effective in the context of the language. You have to extract the meaning of the phrase and then maybe come up with a whole new approach to communicate that in English. This is something that 4o can't really do well, because it just looks at the word: "Okay, first word, what does this translate to?" and it just generates that. Sure, it considers all the context, but these thinking steps add a layer of translation quality that you cannot get from Google Translate, and you cannot get it from other LLMs like 4o. When it comes to translation, I would even go as far as saying this is probably the best translator in the world that I've seen, certainly between German and English, which I tested it on extensively, and also between Slovak and English, which is the other language I speak. It's better than anything else I've seen. So it's up for discussion, but in my opinion it's the best translator we ever got. Sure, if you want to translate a phrase, waiting 30 seconds in some cases might not be ideal, but if you want the highest quality, that's the price you pay right now.

Okay, so next up we have strategic planning and problem solving, process optimization, and code review. So let's look at some
examples here, I would say. But as you can see, all of these, strategic planning, problem solving, and business planning, for example, are exactly the type of use case where the entire business plan should be considered more in depth, and a certain level of original thought would probably be beneficial, rather than just going word by word and writing the next sentence based on the last one. And sure, I don't want to understate the fact that 4o also considers the entire context and uses that in predicting the next token, but it just goes further if it takes this step-by-step reasoning approach.

Process optimization is an interesting one too. If you have an SOP and you want to improve it, it's probably good to consider the entire SOP, every single step of it from beginning to end, and to form some original thoughts about what could improve the SOP, rather than just going step by step and trying to make each step better. You want to make the whole better. That's why this works better.

Last but not least, code review. This is the one that is most up for debate out of this entire list. For code generation, I still use Sonnet 3.5; I think it's the best in terms of code generation. Sometimes o1 preview is really excellent too: it has longer output, so if you want to generate longer applications in one shot, then o1 preview is probably what you want to go with. But code review, again, is just like content editing, where you're trying to make sense of it, form original ideas, and then apply those. That's where it shines.

Okay, so let's look at some practical examples and some comparisons here, shall we? I have various dual tabs set up here, and I want to review them one by one. This is going to be my last example, so let's keep that for the end. Business planning, shall we? This is an interesting one. We looked at the rewriting example right in the beginning, so I think I've made my point there: it's just better at rewriting, no discussion. Let's look at some others, though. There you go, that's cleaner.

So this one is about this simple prompt. I would call this a small-context prompt, but it has enough in there: I fleshed out what the AI Advantage means a little bit, and I gave it a lot of freedom in formulating this. I ran it once in 4o and once in o1 preview. Let me take a sip of water here before we analyze this and compare some of the results.

All right, so this generation was done by o1 mini, as you can see. The first one is o1 and then o1 mini. And o1 mini, I would say, generally speaking, is superior when it comes to use cases that are not in the sciences, that are not physics or math problems. A lot of the OpenAI blog post is about that, and as you can see, in this presentation I didn't really focus on "Oh, it's so good at math," because I know that the bulk of the audience and the viewership here doesn't tackle math problems in their day-to-day. Okay, it's better at physics PhD problems. Well, that's
amazing, but I don't remember the last time I encountered a physics PhD problem in my everyday routine. If you do, then fantastic, o1 preview is your friend. But for some of these other things that are non-science related, I actually consistently find o1 mini to be better, which is interesting, because you have more messages and it's faster. It's just a win. So why would it be better in this case? Well, let's have a subjective and qualitative look at what it produces here.

Someone's saying, "If you have little kids, you constantly have physics problems." Yeah, fair enough. That would be a great real-world scenario there.

All right, so let's have a look at this. It generated more with o1 mini here. First things first... no, actually, here it has 10 points. I thought it generated more. Ah yeah, here in the second shot it generated 15 of them, and then there was slightly different formatting. So as you can see, there is a lot of variance among the generations. This one looks a lot like the 4o generation. And if I scroll down and go to example number three, this one looks almost identical; just the points are shuffled in a different order. You have three sub-bullet points here, two there. It's very similar, it's not night and day at all, but I would say that this goes a little more in depth. Again, this is slight and it's subjective, but I would just like to highlight that.

Let's take one point and compare it. So we're creating a business plan for the AI Advantage, a generative AI educational brand teaching practical LLM usage. What about the first point? What about the ordering of these? I feel like that matters. If I generate a business plan, it matters what I start with, especially if I'm using this to generate new ideas. I want the most important points to be first. I would argue that 4o was absolutely terrible at this. It kind of
shuffles it differently every time you regenerate. Now it's "Develop tailored learning paths," now it's "Create specialized niche programs," and if I regenerate again, it will be something different again. Let's have a look. But I would argue that curriculum and content development is actually the most important part, and look, it gives you four sub-points here instead of just two. So just from that, without even reading, I would say more detail and correct prioritization is a win and a better response. That's why I would prefer o1 mini here over 4o.

What did it say? "Differentiation and specialization." Okay: "Build interactive real-world AI projects." I mean, that's cute, that's good, but isn't it more important for the curriculum to be relevant to the various industries? I think it is. Here, it wants the curriculum to reflect the latest advancements in AI and LLM technologies, and then to incorporate real-world applications and case studies to ensure practical relevance, and then it gives you different strategies to do that. I think that's more important than designing some exotic interactive path. Again, this is subjective; you can make up your own mind. This is just my take, but I do prefer this, and if you look at it, it's consistently better. It's not just this one time. Again, it goes through curriculum first, because it makes up its mind as to what is better. So whenever it comes to planning and strategizing, I do prefer this.

Let's look at another point. How about student success? Let's see if it brings up this point. Ah yeah, "Focus on student success and support." Okay, so we can compare these two points. "AI-powered feedback: use AI to provide personalized feedback on student projects and quizzes." Really? That's your first take, 4o? I should be using AI, not a human touch, to give people feedback? All right, whatever, notes taken. What
about this? "Mentorship programs: establish mentorship opportunities where experienced professionals guide students through their learning journey, providing personalized advice and support." This is like 10 times better advice. If you pay a consultant and he tells you, "Hey, you should use AI to give all your students feedback," and the other one says, "Hey, what about starting a mentorship program?", this is just better advice. Again, subjective, but that's what I'm doing here, just sharing my opinion.

Okay, career support. Let's give it another chance. 4o, what have you got here? "Offer job placement services, resume-building workshops, and AI project portfolios." Okay. "Partner with companies hiring AI talent to help students connect with them post-graduation." Let's see, this is similar: resume building and job placement to help students transition into AI roles. This is very similar, so I would say no criticism here; they came up with the same thing.

"Live Q&A support: provide real-time support via live Q&A sessions or office hours where students can engage directly with instructors." Aha, luckily we do this. And let's see what it said here: "Create forums, discussion groups, and alumni networks to foster a sense of community, enabling peer support and networking opportunities." This is sort of similar but higher level. Again, I would say this is a tie. And then I have an extra point here, which is nice: "Implement regular feedback loops to understand student needs." Good point. I mean, that's how you achieve success: you try things, you keep the ones that work, and you have a feedback loop in place. I think this is great advice. There was no fourth point here, fair enough.

I would say, especially on the first one, this is a clear win qualitatively for o1 mini. So what did we find? It prioritizes the most important points at the top, and within the sub-points it raises issues that are more important
consistently. If you look at the whole thing, you'll find that this is pretty consistent across all the generations. And o1 mini is actually even better than o1 preview. Let's look at the strategic partnerships point in o1 preview and then move on to the next use case. There you go. This one only gives us two points, fair enough, and here there are three: academic collaborations and industry alliances. This one tells us "educational partnerships," which is kind of the same point: partner with universities and research institutions, co-develop courses, and gain academic endorsement for the curriculum. So that's comparable; maybe this resonates a little more with me than that does. What else? Industry: "Collaborate with tech companies for guest lectures, internships, and job placements for students. Develop corporate programs that train..." Here I just feel like it doesn't quite understand what we're doing. We don't want to be developing courses for others, we develop courses for ourselves. We don't want to develop corporate programs or in-house, like we're offline. This one just gets it, I would say. But again, not a landslide win; it's just slightly better overall.

We could also compare o1 mini to this, but in the interest of time we won't. Again, mini is just better for these cases where it's more about strategizing, planning, writing, and rewriting. When it comes to math, physics, and the sciences, sure, preview is the one to go with, but my default is mini these days.

Okay, good. So that's our little comparison for the strategizing use case. Maybe we have time for two more here. This is the prompt comparison use case, and this is the text, and then I think I have one or two more. Oh yeah, maybe I can just briefly show this off: this is the basic math use case,
simulating the outcome of 10,000 coin flips inside of 4o and then doing the same thing inside of o1. Obviously, this is where o1 preview actually shines. Every time it comes to math, you want to go with preview; it just works better than mini. And as you can see, 4o kind of struggles. It comes up with a simple table and then a methodology, and then it just says that it simulated it and gives you some sort of conclusion. You can look at the code it ran, so it used Code Interpreter. Unfair advantage here, right? It got to run some code. What did it use? A pandas DataFrame, and then it gives you this. So it did the calculation inside of Code Interpreter, which is the best way.

And then here, o1 just went ahead and mapped it out. It simulated the coin flips, analyzed the flips, and figured it all out. You can see a whole lot more steps here: expected value calculations and other calculations that I don't fully understand, and statistical significance right here. Certainly when it comes to learning, I would prefer this every time rather than it just giving me the result. The conclusion is kind of the same, it's going to be 50% of the time, but the approach is a very different one. 4o fires up a tool, and that's kind of its saving grace.

If I go ahead and disable the tooling, which I can do in this small interface, we're going to disable Code Interpreter like so and rerun this. It should be a night-and-day type of situation. Now it doesn't have Code Interpreter to fall back on, so it's going to have to do its own thing. By the way, they adjusted 4o, there's just no debate, to include a bit more step-by-step reasoning. A year ago it would have just given you an answer here, and it would have been terribly wrong. Now it actually makes some assumptions and gets you results, so that's nice to see. But nevertheless, I think
it's just... let's see what this comes up with. But this is just such an elaborate, amazing answer here that you couldn't ask for much more. But look at that: "Let's go with a hypothesis." They definitely infused some of the o1 learnings and chain-of-thought learnings into the original 4o, there's no doubt. Even a few months ago, on release, 4o did not do this. But there you go: "Although the difference between heads and tails is small, a difference of 80 flips in 10,000 trials is small, but it points towards potential bias." So it couldn't even test it properly. It even says, "Hey, this is limited; I would like to test it more rigorously, but I couldn't." There's a difference of 80 flips, which is small, and there might be a bias there. But as you can see, this is just the better result here, clearly. It gives you the expected number here, and just by this response, twice the length and more detailed, it's superior in these use cases, although 4o did get a whole lot better.

Okay, one last one to round it out. We're around the one-hour mark. I want to show you one last one, which is... this was the text editing, this was prompt generation, ah, this was custom instruction generation. Just a quick note on this. In the community, in the advanced prompt engineering learning path, we teach custom instruction creation: how to create custom instructions for yourself. It follows this template. The interesting thing is that o1 worked super well with our giga prompt here, which we provide in the community only. But again, o1 formed its own thoughts about how to do it, and it wrote it more like an essay, more like a paragraph, more like an email introduction, rather than following the template. And it's funny how consistent it is. Look at that, this is 4o: all of the generations follow the template. "Language preferences," colon, and then the language preferences. Every time it follows the
template. You see that over here: no matter how often I rerun it, we gave it a strict template to follow, and 4o does it. o1, no matter how many times you run it, always just writes this little essay, because it figures it's a better approach. And honestly, it might even be a better approach, because then you save context length and could put even more stuff in there. It's just another example of how o1 handles things differently.

And then one final example would be the prompt generation. I have to briefly locate this for you, but I ran our prompt generator through this, and the results were quite interesting. That's a lot of windows here. Oh yeah, there you go: this is the o1 result, this is the 4o result. Okay, so this is the final comparison, ladies and gentlemen: the AI Advantage prompt generator, as you know it from our advanced prompt engineering lectures in the community or from the Business Blueprint product we have. There's a total of over 1,000 different professions; each one of them comes with a GPT preset, 30 prompts, and the generator that creates these prompts. Now, this is the prompt generator. I have to briefly check here... it ends with the custom instructions. Social media brand manager, that was it; the social media manager is the preset here, and I entered the prompt into both 4o and o1 preview.

Now, here's the thing: there was so much variability across the generations in o1 preview that I was surprised. First of all, it thought for 30 seconds, which is like, wow, that's a long time. If I go to the other generations, I think it did even more. This was o1 mini; this one took 40 seconds. And look what it did here on the first shot, and I want to end on this: it formed its own idea of how to generate prompts. Wait, I'm scrolling a lot, I apologize for that. We're going to go to one concrete prompt, and I'm going to end on this. Okay, a classic marketing prompt:
creating a content calendar. So let's look for that content calendar. There you go, it was prompt number 10 in this case; the order is sort of random. I bet we're going to have it here too. Yes, "Create content calendar." So our generator is set up to generate prompts like this: easily editable, with space for variables. Everything in square brackets is a variable. It respects that here, that's good. But then look at what o1 did. It came up with its own idea of how to prompt. It came up with this approach of, "All right, let's give it the prompt, but then let's also specify the steps that we need to take to successfully complete this prompt." Probably somewhere in the fine-tuning or the model training, they set the whole thing up to take multiple steps before it gets to a result, so it reflects that in our prompt generator. That's why I thought this was an interesting point to end on: it reflects this approach of taking multiple steps before doing something, even in activities that have nothing to do with reasoning. We're creating prompts to solve different work-related problems, but it carries that core capability of the model, that core programming, let's say, over into the prompts.

Which one of these is better? I'll leave that up to you to decide, but I think it's an interesting note that you can run all the prompt generators that we gave you through o1 preview and get some interesting and unexpected results. All of a sudden you're going to be creating a content calendar in four steps: you're going to be listing key dates, assigning content themes, including post times optimized for audience peaks, and ensuring variety and balance across content types. Whereas here, well, there's no real talk about optimized posting times. Variety and balance we do have, I think, right? "Incorporating trending topics, special events."
Well, not really; there's balance, but there's not variety. So this is more detailed. Now, is this better? I don't know, I would have to test this. I would run this prompt through 4o again, Sonnet, and probably o1 mini. Those would be the three that I'd be testing and comparing to see which result I like best. But I just wanted to show you that it takes a completely different approach to solving problems than 4o did.

So when you encounter one of these use cases: when you're editing text, try out mini; you'll be surprised by how good it is. When you're translating, especially tricky phrases, try out o1. Storytelling, same thing. Strategic planning, business planning, problem solving, optimizing SOPs, reviewing code: all of these are cases where I found it to be superior. I hope this was helpful. That's all we've got for today. Thank you.
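
Editor's sketch of the coin-flip example: the lecture compares how the two models handle "simulate 10,000 coin flips" and quotes a result where heads and tails differ by 80 flips. The on-screen code is not in the transcript, so the following is only a minimal illustration of that kind of check, not what either model actually ran. The function names `flip_coins` and `z_score` are hypothetical, and reading the 80-flip gap as 5,040 heads versus 4,960 tails is an assumption.

```python
import math
import random


def flip_coins(n_flips: int, seed: int) -> tuple[int, int]:
    """Simulate n_flips fair coin flips and return (heads, tails)."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads, n_flips - heads


def z_score(heads: int, n_flips: int) -> float:
    """Standard deviations between the heads count and the expected n/2."""
    expected = n_flips / 2
    std_dev = math.sqrt(n_flips * 0.25)  # binomial std dev for p = 0.5
    return (heads - expected) / std_dev


if __name__ == "__main__":
    n = 10_000
    heads, tails = flip_coins(n, seed=42)
    print(f"heads={heads} tails={tails} diff={heads - tails}")
    print(f"z-score of this run: {z_score(heads, n):+.2f}")

    # Assumed reading of the lecture's case: an 80-flip gap,
    # i.e. 5,040 heads vs 4,960 tails.
    print(f"z-score for an 80-flip gap: {z_score(5040, n):+.2f}")
```

Under those assumptions, an 80-flip gap is 0.8 standard deviations from the expected 5,000 heads. A fair coin produces a gap at least that large in roughly 42% of runs, so by itself it is weak evidence of bias.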
