# Large Language Models Explained | LLM Basics for Beginners | How ChatGPT Actually Works| Edureka

## Metadata

- **Channel:** edureka!
- **YouTube:** https://www.youtube.com/watch?v=gnZ_Yg8qBDs
- **Date:** 06.05.2026
- **Duration:** 28:14
- **Views:** 1,347
- **Source:** https://ekstraktznaniy.ru/video/50517

## Description

🔥PGP in Generative AI and ML in collaboration with Illinois Tech: https://www.edureka.co/executive-programs/pgp-generative-ai-machine-learning-certification-training
🔥 Integrated MS+PGP Program in Data Science & AI https://www.edureka.co/dual-certification-programs/ms-data-science-pgp-gen-ai-ml-birchwood

In this *Generative AI Course*, we will dive into Generative AI, exploring its definition, key examples, and diverse applications. We'll also discuss the evolution of generative AI, highlight its potential future impact, and showcase an LLM project. Using the Python programming language, Streamlit, and the Google Gemini API, we will build a healthcare assistant that analyzes medical conditions such as chicken pox. This video is ideal for tech enthusiasts and beginners alike, as it unpacks generative AI's transformative role in the tech world.

✅ 00:00 - Introduction to Generative AI
✅ 01:26 - What is Generative AI?
✅ 02:02 - Generative AI Examples
✅ 02:38 - Applications of

## Transcript

### Introduction to Generative AI [0:00]

Good evening, everyone. I'm Arindam. We will try to understand language models: what the different kinds of language models are and, in terms of different vendors, what options are available to us. What we have from OpenAI, what we have in Gemini, what we have from other sources, we'll talk about those. All right, let's get started then. So, we will talk about large language models. In simple terms, can you imagine having an assistant in today's world which does different things for you? It can help you in writing code, verifying, doing research. It can help you by telling you which route to take, or, say, out of five different options, which is the best suited one, the most economical one. It is just like having a personal assistant at our disposal in the form of a large language model. So, what do we do using large language models? Large language models show human-like traits by understanding context, and they help you with different tasks wherever we leverage them. Previously, a few years back, there were certain products: Apple released Siri, Amazon

### What is Generative AI? [1:26]

released Alexa, right? Which was more of a voice assistant. Correct? We would say something like: Alexa, what is the weather? Alexa, how much time will it take for me to reach position B from position A? What is the best fare for this particular route? Back then, we used voice-enabled systems. A large language model understands voice-related input, and it also understands text, audio, and video. There are different large language models which have been trained on different schematics and tasks. Okay?

### Generative AI Examples [2:02]

As of today, we have 60-plus large language models available in the market. 60-plus. Imagine the extent of large language models available in the market. These 60-plus models are trained and focused on diverse tasks and diverse domains. Okay? Some are conversational AI models, some are image-related models. We will talk about some of them, and in today's session we will also try to cover how to differentiate between different large language models for different tasks.

### Applications of Generative AI [2:38]

Okay? Proceeding forward, some of the famous examples of large language models are Google's Bard and GPT-3. More recently, we have GPT-3.5, 4o, 4o mini, right? In Llama, we have Llama 2, Llama 3, 3.1. Microsoft has Turing; Microsoft has agentic frameworks as well. We have OpenAI's separately, right? So, likewise, we have multiple others. These are the famous ones. There are other models and providers like Stable Diffusion, Jamba, Cohere, Amazon's Bedrock, and Claude with Haiku and Sonnet. So, we will try to summarize, in terms of choosing a model, what kind of parameters we will consider in order to choose a particular large language model. Okay?

Now, the first application where we will leverage a large language model is language translation. Before generative AI, before agentic AI, do you remember having a feature like Google Translate? What was Google Translate doing? You scan a particular photo and it converts it from language X to language Y. You can convert from English to Marathi, English to Chinese, English to Russian, English to some other language, or Hindi to English, Hindi or Bengali to German, right? So, whenever we wanted to do language translation, we had to click a photo and upload it to Google Translate. It used to give us the relevant meaning based on the words that it could find. Do you remember using it? Google Translate? Around 5-6 years back? Somewhere around 2015-16, that's when Google Translate got this camera-based translation. Correct? So, that was a very early stage of translation, right? We were uploading a photo, and based on that photo, the translation was happening. In today's world, what we are getting is a much more detailed version of translation. Okay? It does not need you to upload a photo. You can simply copy-paste the text, copy-paste the message, or take a screenshot and put it in, right?
It gives you a better result and a faster response, and it is more user-friendly. So, language translation can be done very easily using large language models. This is one of the application areas. Okay? I hope language translation is

### Evolution of Generative AI [5:01]

clear. Next, we have summarization. Now, imagine going through a particular email, and that email has around 500 words, right? It is a very detailed and specific email, without covering other things. Now, what happens is that you do not have time to go through the email specifics in a very granular form, and you want to understand the summary of what the sender has tried to convey in this email. So, what do you do? You use text summarization. In Gmail as well, if you are using Gmail, there is an option, right? Summarize, which tells you in bullet points what the main summary of this particular email is. Is this an escalation email? Is this a refund-related query? What kind of tasks are we going

### What is LLM? [5:56]

to do over here? Or what is being requested in that email? Those things get specified as a part of text summarization. This is email summarization. Similarly, if you upload a PDF document or a Word document which contains some specifics, and you want to summarize it in the top five bullet points out of 10, it is in a position to do that and give you the output accordingly. Okay? Next, we have chatbots and virtual assistants. Okay? A chatbot, as we have seen, is where you type in a question to a large language model and it gives you a response. So, when you want to summarize a chat or a response, or get a query answered, in such cases you will use a chatbot. Okay? Next, we have virtual assistants. For example, we have Alexa, we have Echo, Echo Dot; multiple products are there, right? Likewise, different virtual assistants do tasks on our behalf when guided by voice or by other methods. Okay? So, we can use LLMs as a part of chatbots and virtual assistants. Next, we have content generation. You see a lot of social media posts floating around on the internet. On LinkedIn, on Facebook; not so much on Facebook and Insta, but particularly on LinkedIn. All of a sudden, after large language models came, you see an increasing number of influencers in the market. Why? Because creating or penning down a post has become very easy. People go to a large language model and say, "Okay, use a very nice catchphrase." And based on that, they get a very good result. Right? Previously, if you remember, people used to spend hours, and sometimes days as well, in order to write, and they would try to use

### Structure of LLM [8:04]

spell check, grammar check. In today's world, if you have 30 minutes of undivided attention, you can create very nice content because of the tools available in the market. So, for content generation, people are using large language models a lot, for example to generate Ghibli-style images. A few months ago, a lot of people were using large language models for that, and the image was getting generated. That was another case of content generation. So, what you can see here is how content is being generated using Ghibli photos or demo photos, and it was used for marketing content. Okay?

Next, we have conversational AI. A very clear-cut example of conversational AI is ChatGPT. ChatGPT is a tailor-made conversational AI bot, a large language model with which you can do diverse tasks. What is conversational AI? It can understand you just like a human assistant. It can do research on your behalf. It can check details. If you give a voice-enabled command, it can decode that and do the task. If you upload an image or video, OpenAI has image models like DALL-E, and its multimodal models can interpret images and help you with streaming-related data. Right? So, anything and everything that is required in terms of creating content or reading different forms of data can be done as a part of conversational AI. Okay? So, these are some of the applications that we can see around large language models.

Okay. Now, we will see how large language models work. Large language models can be used to write books and to read information from articles. Okay? Likewise, if you want to generate data for a particular website or write for a particular website, you can do it very easily using a large language model. If you want to write some articles, you can generate them using a large language model. These are some of the areas. Okay? Next, moving forward.
And as you can see, these are different layers of data; when the LLM sees it as a learning method, these are different layers. These pointers are different references. Likewise, the LLM learns from the different patterns, different

### How LLM Works? [10:23]

references that it is able to find in the training data set. Okay? Now, how does a large language model work? So, here we have a large language model. Into this model, we're going to feed some data, which is going to be tokenized. What are tokens? Tokens are the words, or pieces of words, in the text given as a prompt, an input, which is used to enhance the learning of the large language model. Once the tokens have been provided, once tokenization is done, we have to understand which model is to be used. If it

### Real Life example of LLM [10:52]

is an image-based model, I would say go with Amazon Titan. If it is an embedding-based model, I would say go with Google Gemini. If it is a text-based or paragraph-based model, which is more for conversational data, I would say go with GPT-4o or 4o mini, depending on whichever version is more compatible. So, likewise, from task to task, depending on the data at hand, we will have to choose which model is better suited for our task, and we will select

### LLM project [11:20]

and proceed with that. Okay? Now, as you can see over here, this is a demo representation of how the indexes are generated and how the embedding is done. Whenever a word or a set of characters is turned into a token, this token is fed to the LLM so that it can learn from this information and gather more specifics. Tokens will generally get a high value if they are more contextually mapped and a low value if they are less contextually mapped. So, we have to understand which is the best possible method for us, based on the value number.

So, if you see, your input text is passed through an encoder. What is the encoder? The encoder takes the input text and converts it into encoded numbers, a numeric representation that the model can work with. Then it goes to the decoder mechanism. What the encoder also does is make sure that, if you have a huge data file, the data is summarized into tokens in a small, compact form, so that a file of, say, 20 MB is saved as something much smaller, such as 212 KB. Based on this, you map that information and store it. The encoder makes sure that the mapping in the data, the relation between different points, is retained, and based on that, the encoded information is captured. What does the decoder do? The decoder consumes those encoded numbers and tries to recreate the structure, making sure that it is not skipping any part and not missing any domain area. Based on that, it will try to decode and regenerate what we initially had, what the initial data provided.

Proceeding next. Next, we will talk about a few topics, namely prompt continuation, text completion, and generating new text. What is prompt continuation? Whenever we are using a large language model, it is important for us to connect one particular source of information, or one particular information point, with another.
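
To make the tokenize, encode, and decode steps described above a little more concrete, here is a minimal sketch. It assumes a toy word-level vocabulary; real LLM tokenizers split text into subword pieces, and the model maps token IDs to learned embeddings rather than simple counters. The function names `build_vocab`, `encode`, and `decode` are hypothetical and exist only for this illustration.

```python
def build_vocab(texts):
    """Assign each distinct lowercase word an integer ID, in first-seen order."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    """Turn text into a list of token IDs (raises KeyError on unknown words)."""
    return [vocab[word] for word in text.lower().split()]


def decode(ids, vocab):
    """Turn token IDs back into text by inverting the vocabulary."""
    id_to_word = {token_id: word for word, token_id in vocab.items()}
    return " ".join(id_to_word[token_id] for token_id in ids)


# Toy "training data" used only to build the vocabulary.
corpus = [
    "large language models learn patterns",
    "language models predict tokens",
]
vocab = build_vocab(corpus)

ids = encode("language models learn tokens", vocab)
restored = decode(ids, vocab)

print(ids)
print(restored)
```

The round trip gives back the original words only because every word is already in the vocabulary; handling unseen words is one reason production tokenizers work on subword pieces instead.
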
So, what we are doing over here in terms of prompts: whenever we are writing prompts, we will have to connect them using a proper technique, which could be zero-shot prompting, one-shot prompting, or few-shot prompting, right? Depending on the prompting technique, we'll have to identify how we want to build synergy between multiple prompts. Likewise, we will continue and extend it to other areas as well. This is basically done as a part of the prompt continuation technique.

Next is text completion. Whenever we're trying to generate new data or process new information, what we have to make sure is that the text is in a complete state. Okay? In terms of training data, whenever we are using it for training a large language model or for customizing it, we have to give complete text. Why? Because if the texts are not complete, if the information is not up to the mark, the model will end up losing key information, right? Or losing key mappings. So, text completion should also be handled.

Then there is generation of new text. Whenever we're training a model on new or unforeseen data, we also try to make sure that it is generating new text, that it is generating relevant information. Those things are validated under generating new text. Okay?

Now, how does a large language model work? Imagine this case, okay? We have a large language model which is named Polly. So, Amazon has... This is used for different kinds of data sources, okay? It can be trained on different data sources at a particular point in time. So, what happens in this case is that, with the data at hand, it will try to learn this type of information, and it will try to process it and give us multiple versions of the same. It could be a text-based model. It could be an image-based model. Right? So, here it basically listens to individual words and phrases.
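
As a rough illustration of the zero-shot, one-shot, and few-shot prompting mentioned above, the sketch below builds the same prompt with zero, one, or several worked examples. The helper `build_prompt`, the `Input:`/`Output:` layout, and the sample emails are all assumptions made for this example; they are not a format required by any particular model or vendor.

```python
def build_prompt(task, examples, query):
    """Assemble a prompt: task description, optional worked examples, then the query."""
    blocks = [task]
    for text, label in examples:
        blocks.append(f"Input: {text}\nOutput: {label}")
    # The final block leaves the output empty for the model to complete.
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


task = "Classify the email as 'refund' or 'escalation'."
examples = [
    ("I want my money back for order 1182.", "refund"),
    ("This is the third time I am writing; get me a manager.", "escalation"),
]
query = "Please return the payment for my cancelled booking."

zero_shot = build_prompt(task, [], query)            # no examples
one_shot = build_prompt(task, examples[:1], query)   # one example
few_shot = build_prompt(task, examples, query)       # several examples

print(few_shot)
```

The only difference between the three variants is how many solved examples precede the final query; the string returned would then be sent to whichever model is chosen.
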
And it tries to understand how the text is broken down and how a token is formed. As I mentioned previously, whenever we provide input, what the LLM is going to do is break it down into parts of words and form tokens. It sees the individual words in a token-like format. Likewise, once the tokens are ready, it creates the embeddings. Okay? So, what it is doing is breaking the input down into small chunks and then doing the processing part. Okay? So, as you can see, it can learn from data, it can learn from sentences. What it tries to do here is map that information while creating the language model. Okay? Whenever we're creating a language model, it should be in a position where it can process and handle different information. So, what it does is it hears, it captures that information, and it uses it as a part of language model generation. Okay?

Some of the other areas of large language models deal with answering questions. Whenever we have a question and we want to get it answered, we can use a large language model. For this kind of question-and-answer response, we generally create a vector DB. Through vectors, we basically index and store the information in the documents. And whenever we ask a question, it is going to refer to those documents. You can see these are all different tokens, all right, here in this particular picture, and the contextually different ones are over here. So, we basically use these vectors to build our custom model scenarios, and using that, whenever we ask a question, it is going to give us a response.

Similarly, we have sharing as well. Whenever we're trying to share information from one particular source to another, we can use a large language model to summarize and share the information in response to the question asked.
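
The vector DB idea above (index the documents as vectors, then look up the closest one when a question arrives) can be sketched as follows. This is a simplified stand-in: it uses word-count vectors and cosine similarity in plain Python, whereas a real vector database stores learned embeddings from an embedding model. The names `embed`, `cosine`, and `retrieve`, and the sample documents, are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a word-count vector (real systems use learned embeddings)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


documents = [
    "refunds are processed within five working days",
    "the clinic is open from nine to five on weekdays",
    "chicken pox causes an itchy rash and fever",
]

# "Indexing": store each document next to its vector.
index = [(doc, embed(doc)) for doc in documents]


def retrieve(question, index):
    """Return the stored document whose vector is closest to the question."""
    q = embed(question)
    return max(index, key=lambda item: cosine(q, item[1]))[0]


best = retrieve("what are the symptoms of chicken pox", index)
print(best)
```

In a question-answering setup, the retrieved document would typically be passed to the LLM together with the question so that the answer is grounded in the stored text.
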
We can also use large language models for creating stories, as well as for sharing stories. This is a part of social media engineering, all right? Or you can use them to write a book or to summarize a book. There are different use cases where we would be able to do the same. Okay.

Now, how do we fine-tune the model? Generally, what fine-tuning does is allow you to adapt the original model so that it is able to do a dedicated task, whether that is translation, creating document-related details, or summarizing a particular document, right? Likewise, there are other dedicated tasks, such as creating a workflow model, a road map, or a flow diagram. These are different cases which can be trained on some demo large language model, using which we are able to see how the responses look. Okay? Let me show you the different options that are available. One second, let me create a new slide. So, we have a base model, okay? And we want to modify it to fit our tailored needs. This process of iterating on and modifying a large language model to fit our custom needs is a case of fine-tuning.

The first thing that we'll do is try to understand the different large language models that are available, starting with the different LLM providers. Okay? First is OpenAI. Then we have Gemini, then we have AWS, right? Then we have Cohere. Likewise, we have Anthropic, which we will cover over here. Let's try to understand what the different versions are. I said we have 60-plus LLMs available in the market, and growing. For example, we have Jamba. All right? Then we have Dolly, right? Okay, we'll mention it over here. So, OpenAI has different models, right? Under OpenAI, we have models like GPT-3.0. Llama is Facebook's, correct? So, that is Meta's offering.
We will capture Llama over here. So, for OpenAI we'll start with 3.5, GPT-4, GPT-4o, and GPT-4o mini. Okay? GPT-4 Turbo. Then we have the GPT-4 conversational models: 4o mini and 4o are there, okay? Then we have other versions as well, and streaming-related models.

Then, in Gemini, we have Gemini 1.5 Pro, Gemini 2.0, and the latest ones, Gemini 2.0 Pro and 2.5 Pro, right? 2.5 is for advanced tasks. You also know the OpenAI tool for coding-related tasks, correct? Gemini 2.5 is related to that as well. Then we have the Anthropic models: Claude Sonnet. Claude also has multiple versions: 3.0, 3.5, 3.7, 4.0. Right? Likewise, Cohere also has its own set of models. Then Jamba, Llama 2, and other Llama versions. Some of these are instruct models with 7 billion parameters, some with 70 billion, some with 5 billion; each is different from the others. All the models have some unique trait that differentiates them: one is for audio data, one is for another kind of data, and you select accordingly. Generally, whenever we are choosing, there are numerous models. Separately, we have Stable Diffusion. Right?

Then, for AWS, we have Amazon Titan. I missed Titan. Titan also has different versions. Sorry, the examples that I've given you here are a part of Anthropic. Amazon Titan is a part of AWS. Okay? So, around four or five more entries will be there. So, you can see that when we club all of them together, the ones from famous companies get more limelight. But we also have many other smaller-scale ones; Mistral is there. I hope you have heard about Mistral. Right? They give you large language models and small language models; different options are there. So, depending on whichever task you have, you can choose a relevant large language model for it. Okay?
The next thing that becomes a differentiator for us among large language models is the token limit. Okay? Different models have different token limits. Let me explain what a token limit is. So now we have multiple options, correct? Are they all relevant for us? No. It also depends on what kind of token limits we have. Okay. Now, one second, let me create a new slide. Can I remove this, the things that I've written over here? Perplexity is there as well, so there are multiple options, and you can choose whichever suits you.

Now, let's look at the distinction between different models. The first that we'll take is, let's say, the GPT-related engines. So we have 4o mini, 4o, and 4 Turbo. These are all large language models. Mini has 128k tokens. GPT-4 has 8,192. And this one has 16,000-odd tokens, something like that. Likewise, if you see, the input token and output token limits vary from model to model. Similarly, compare this with Gemini or, let's say, the Anthropic models. Anthropic has a considerably high token limit, and it is costlier as well. So we have Claude Haiku, Sonnet, Opus. They all have a considerably high token limit, which means they can process much bigger data and give you much more detailed output.

So, whenever you're choosing a large language model, you have to decide which model and what kind of techniques are required. Which model is best suited for my requirement? If you need to process a larger number of words, you can choose a model with a higher token limit; otherwise, you can choose a smaller one. So it depends on that trade-off, which we decide on a scenario-by-scenario basis. Okay. Thanks, everyone, for your time. I hope the session was informative. Thank you.
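
The token-limit comparison in the session can be turned into a small pre-flight check, sketched below. It uses only the two limits stated clearly above (128k for the mini model and 8,192 for GPT-4). The dictionary keys are illustrative labels, the four-characters-per-token estimate is only a rough rule of thumb for English text, and `estimate_tokens`, `models_that_fit`, and the 1,000-token output reserve are assumptions made for this example.

```python
# Context limits as quoted in the session; the keys are illustrative labels.
CONTEXT_LIMITS = {
    "gpt-4o-mini": 128_000,
    "gpt-4": 8_192,
}


def estimate_tokens(text):
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def models_that_fit(text, reserved_for_output=1_000):
    """List the models whose limit covers the input plus room for the reply."""
    needed = estimate_tokens(text) + reserved_for_output
    return [name for name, limit in CONTEXT_LIMITS.items() if needed <= limit]


short_email = "word " * 500      # roughly a 500-word email
long_report = "word " * 40_000   # roughly a 40,000-word report

print(models_that_fit(short_email))
print(models_that_fit(long_report))
```

For an exact count, the provider's own tokenizer should be used instead of the character-based estimate, since the real number of tokens depends on the model's vocabulary.
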