# Build an AI Chat Agent with AWS Bedrock + SageMaker (Full Project)

## Метаданные

- **Канал:** Cloud Guru
- **YouTube:** https://www.youtube.com/watch?v=8Yh-rq0VcgA
- **Дата:** 07.05.2026
- **Длительность:** 25:08
- **Просмотры:** 96
- **Источник:** https://ekstraktznaniy.ru/video/50025

## Описание

🚀 Watch a fully working AI Chat Agent built with AWS Bedrock and Amazon SageMaker — complete with prompt orchestration, inference workflow, scalable architecture, and real-time responses.

🔥 DevOps Career Boost
🚀 Linux Foundation Certifications – 30% OFF
💥 Get 30% OFF Linux Foundation Courses & Certifications
COUPON CODE: CLOUDGURU

☸️ Top Kubernetes Certifications (HIGH DEMAND)

👉 Enroll here for CKAD: https://training.linuxfoundation.org/certification/certified-kubernetes-application-developer-ckad/?source=aw&sv1=affiliate&sv_campaign_id=2797056&awc=85919_1773067559_bfa5e6ace1397c69f68bfb64ff35fa1f

👉 Enroll here for CKA: https://training.linuxfoundation.org/certification/certified-kubernetes-administrator-cka/?source=aw&sv1=affiliate&sv_campaign_id=2797056&awc=85919_1773067536_a1f91345cf951dbbbecaad9281bf8434

👉 Enroll here for CKS: https://training.linuxfoundation.org/certification/certified-kubernetes-security-specialist/?source=aw&sv1=affiliate&sv_campaign_id=2797056&awc=85919_17

## Транскрипт

### Project Introduction & Agenda []

And um this is my presentation for the final project. Um in this project uh our agenda is to uh deep dive into Amazon Bedrock and Amazon Sagemaker um as uh services um which are very powerful services provided by Amazon for the AI uh functionalities. And uh first the agenda is that we will deep dive into these services, uh look at their capabilities, what uh each of these services do, and then in the process we'll create a chatbot uh utilizing one of the pre-trained large language models. So, uh let's start with

### Getting Started with Amazon Bedrock [0:52]

Bedrock. Uh Amazon so I have my um screen uh open to Amazon Bedrock, but here's how you we go from there. You search in console uh AWS console home and search for Bedrock. And uh that's when you click on this link, this is what opens. So, um so to get started, so Amazon Bedrock is a very powerful uh platform which uh which in itself has lot of capabilities in their pre-trained or uh pre-configured towards like a marketplace. You can subscribe for a new model, you can train your own model, you can fine-tune any large language models. Uh you can test those models, you we can uh and then we can uh also uh do our billings, have Amazon take care of our billings. So, basically we pay the bill to Amazon and we don't have to uh keep a track of different models and whatnot. So, let's see what all uh are the capabilities of it. So, within the within Bedrock, these are all the providers that they have listed and I'm sure this list will keep on growing. There is a whole host of models from these providers or services which are hosted within Bedrock. So, Bedrock acts as a hub to where from where we can pull these models and use our inferences, fine-tune, train, and all those things.

### Explaining Foundation Models [2:29]

Now, let's see what is a foundation model. So, foundation model is something where we have all these models created which are basic models that most of us are also familiar with like there's a lot of hype around these. So, the all the latest and greatest models are over here. Um These are base trainable foundation LLMs from Llama, Anthropic, Amazon, and other providers. Um marketplace deployments is that these are the subscribable endpoints that we can get from marketplace and use as a API endpoint for our task. For example, if I wanted to create a language translator model, I can just host it from here within the Bedrock. This would provide me a URL and my for example, a web application or a mobile application can just call this as a URL and get the language translations done. Imported model is something which I can create a custom model. Let's say I created a fine-tuned model for just my need. Let's say I fine-tuned a chat GPT or a Anthropic LLM model and I can train it to with my specific company data to detect a customer's a transaction, right? And that can be my own hosted model, which only I have access to that's catered for my data only. That I can use it and put it in the hub here for only me to use. Prompt router is again something that can be created and there are some default prompt router from Anthropic and Meta. That we can train it and we can try and see how these responses look like and choose our model accordingly.

### Playground Features (Chat, Text, Image, Video) [4:27]

Now then after this there are some playgrounds that they have created for chat and text, image and video. Basically all this is a is a playground for us to as the name suggests obviously is to create or choose various models. See how each of the model perform for the same output. We can compare the input output latency and all that. And then decide what is the best fit best suited model for our need. They have two flavors of it. One could be a chat or text, the other could be an image based or video based model. Even within chat there could be a single prompt model which basically means that it is provided a single prompt and then it keeps on the conversation going from there. Or there could be a chat based model which it where you can interact with much more like a chat GPT. And this is how you can use this. So for example, I have my Mistral 7B used here. So I can use it. I can change the parameter and what not and then I can ask it you know Hello. It'll think and it will just interact with me and answer from the Mistral. So, what's amazing over here is that I did not directly subscribe from Mistral. I didn't have to configure and or see how the you know, Mistral would work for the same input with something else. For example, I can test it all here and decide what model is suited for my specific prompts and then I can use those for my needs.

### How to Subscribe to Models (Model Catalog) [6:11]

Um now let's jump into uh something called I mean, now that we are using this model, let's see how I can subscribe for these models. So, I'll go to model catalog and for example, I let's say I want to use Cloud 3. 5 IQ. Uh I can just go here. Uh it's it'll provide all the API request and sample for this, but I don't need to worry about that. What I'll do is I'll just go back here, click on these three dots, and say modify access. What this does is uh it is going through Amazon's marketplace. And here I can choose whichever model I want to use. So, Mistral, you can see I already have access. For example, if I wanted to use like a high Cloud Sonnet, um I can search for it. I can select this. I can request access to it. And boom, I have access for it. I can I may have to fill in some information depending on the model and the required the requirements, but that's all I need to do. Once I get approval for it, I have access to that model and then AWS takes care of billing and everything and I just get one bill for my entire AWS. Of course, depending on the billing and all that. Um

### Introduction to Builder Tools (Agents, Knowledge Bases) [7:36]

Um All right. So, after this, let's see into some builder jump into builder tools. There are builder tools which are like agents, flows, knowledge base, and prompt manager. So, uh, agent is something that will look into look into these, but what I wanted to mean in my opinion, from what I'm observing is I think the builder tools are also being moved towards the SageMaker platform. So, we'll deep dive into that over there and look into how those are being used or created to used to create these different agents and knowledge base. Um, uh, um, yep. So, let's see, uh, So, let's see what is a knowledge base

### Deep Dive: What is a Knowledge Base? [8:29]

actually. So, a knowledge base is something, like a Q& A knowledge base. So, for example, uh, let's, uh, I mean, if my company has a whole host of documentation on Confluence, and I want to create a chatbot based on that knowledge, I can have, uh, Amazon crawl my whole entire Confluence, um, depending on the needs, right? And create a knowledge base out of it. And then, basically, that knowledge base is base is passed into a vector store, and from there, another model, whatever is being used by my chatbot, will be uh

### Creating a Knowledge Base with a Vector Store [9:17]

referencing this knowledge base and answer my questions based off of that particular set. So, let's see how we can create a knowledge base. I'll say create with a vector store, and then I have an option to choose Amazon S3 web crawler. So, S3 is basically I can put my data in an S3 bucket and this can keep polling that and updating my knowledge base. I can do a web crawler where I can just provide a URL for a web page which is publicly available and Amazon has a services to crawl it and pull data out of it. So let's say I choose a web crawler. I go next and I say, you know, www. uh nasa. com. I can say default which will default as per the just the URLs that are provided and the page pages and subpages. Host would only be crawling the that particular host and sub domain will try to crawl everything and of course depending on which option we choose the cost will get affected. Once that is chosen, we need to choose a chunking

### Chunking Strategies (Fixed-size vs. Semantic) [10:45]

policy. So basically what this does is it will crawl that whole text as a big chunk of text. Then we need to parse that whole text into small chunks so we can decide or depending on the model we choose to chunk it, it will create a vector out of it. So for each model it will break into small manageable chunks and then for each chunk it will create a vector out of it and then that those vectors will be saved in the vector database for our more another model which is used by our chatbot to query and parse. And basically that's how it works or we can we have multiple options. We can do Amazon Bedrock default path. We can choose or we can choose either of these default models. And these models are will be shown by default for whichever we would have chosen or subscribed in the previous screens. And then we have to do chunking strategy. Default chunking could be a fixed size chunking which is my token is a fixed size. Could be a hierarchical chunking where you have a parent-child type relationship. Could be a semantic chunking for example like uh one sentiment could be for stored in together. The other sentiment could be stored in a different way. So these are all strategies we'll have to think about depending on what is our application. So for if my application is just to create a knowledge base for example and I want to create um the knowledge base on cloud versus let's say AWS versus Azure. Maybe I want to do a semantic searching semantic chunking so that all the AWS queries will be chunked together and Azure-based together. And that's about it. So you say next and let Amazon do it all. This takes some time and and then eventually our model is ready. Behind the scenes AWS will create indexes. It will create a knowledge base like this. This is how the data source looks like. It uh uh and then this is how the vector database looks like. And behind the scenes AWS have configured S3 bucket for me for this and it has uh Oh, I'm sorry. It has created a vector database for me and that is what it has prepopulated with the database like this website's data that I provided it earlier. All right. So now let's move on to uh

### Transition to AWS SageMaker [13:50]

SageMaker. So SageMaker is basically this whole I mean in the analogy that I can think of is SageMaker is that actually where all this action happens. So we can use the Bedrock as a hub to take all those models and then SageMaker provides all the compute functionalities where we can pull those models and train it or create the real world applications utilizing those. So in order to start with SageMaker we'll have to first come here, create a unified studio domain. So we we'll have to create a SageMaker domain first. Uh this takes a couple of minutes. So I have created the domain already. Uh so this is my domain and basically it has pre you know a lot of capabilities there. There's ML ops, there is model training, there is model fine-tuning, there is we can write or create our own Jupiter notebooks, we can create chatbots, we can create prompts and whole host of things. So I go to my domain it and here it it's almost like it's own whole new thing. Uh it has all the account association. We can even configure single sign-on with our company if this was used by an enterprise to do all this and it is pretty seamless. It provisions the roles and users automatically depending on the models that we create. And it has a knowledge of what all the models that I have subscribed for that region and what are the models that are available for me. So if you see the models that I have subscribed previously, it automatically is pulling from Bedrock and it is available for me to use. So now let's jump on to SageMaker Studio. So I'll open the

### Setting up a SageMaker Project [15:53]

unified studio. And I'll So for this once we go to the studio, what we have to do is depending on our project need, we'll have to create a project. So sorry, if I can give it whatever name I want, but what's important is the project profile. So if I my task is that I want to create ML experiments, I want to use EMRs, EC2, Lake House or I want to create Jupiter notebooks, things like that. I need to use data analytics. If I want to use chatbots and the toolings accordingly, I need to use generative AI application development and for my data Lake House development, Redshift and all, I need to use SQL analytics. So I have already pre-created a SageMaker project for my chatbot and I'll jump into that. So what I've done is I come here, I create click on this discover and I choose my model which is again a I need to have subscribed to this model from my Bedrock. And I choose my model and then I say, "Okay, I want to use a this chatbot I use this model to build a chatbot. I'll I'll select my Mistral 7B instruct model. And then it is telling me that okay, create a create the model, right? So it is saying quick some give providing some quick prompts up front, but I we can tweak these prompts and decide to initialize this model accordingly. Now, let's say that uh I did Mistral and I um Let's say we go to the generative AI playground. We go to the um choose our model and go to the model catalog.

### Creating a Banking Assistant Chat Agent [18:44]

So what I'll do is I'll go to this chat agent that I have created it previously. My intent is that to create a banking assistant which is going to help me choose the choose a proper credit card. So for this the intent is that it utilizes the uh um uh It has already has some prompts associated with it. So, if I say help me find a good credit card, it is going to answer me few things, right? help me answer and find a credit card for myself. Now, let's say that there is a problem, right? So, let's say that model start hallucinating or it starts doing what it's not supposed to do. If I'm If I want to put this chatbot on a on my website, I want to make sure that it's If I ask do some prompt engineering to it or have prompt pollution, I want to safeguard against it. If I ask it to write a Python code for something, I don't want it to do that because then I'll end up incurring cost for it, which is not my intended purpose. So, for the for this

### Setting up Guardrails (Content Filters & Safeguards) [20:01]

what we create what is called a guardrail. So, these guardrails are basically the the exactly how the name suggests, right? So, when when I say create a guardrail, I can call it whatever, right? Demo. And I can define enable content filter. I don't want my model to either take any hate words, insults, sexual abuses, violence, and all those things, right? I want to If and if those are getting blocked by this guardrail, I can prompt what if uh um what my model should answer, right? I can even use advanced filter and say let's say I mean even though this is a banking model, I don't want to want it to start providing financial information, financial advice. I can say don't provide financial advice, right? So, I can say no financial advice. No investment advice and I can provide this as a prompt and then it will block those. So this all takes some time to create, but I have pre-created those. So let's jump into the demo right away and then we'll see how it works.

### Demo: Testing Guardrails (Coding & Off-topic blocks) [21:39]

So I'll select my uh Um, let's say no coding. So I created a guardrail and it is not going to provide any coding answers to it. Uh let's say I say help me with credit card. It's going to start answering what what is a good credit card utilizing that Mistral 7 billion instruct model by the way. Now let's say that suggest I ask it you know help me write Python code for Fibonacci. Now mind you if I'm putting this chatbot in my website for uh chatting I don't want somebody to start misusing this to for some other purpose which is not in not intended purpose for this model. So this is how I stop it. Similarly, if there is a profanity or anything like that this model will will catch it and stop it. Uh now let's say I ask it okay suggest some action movies. — Now this is again going to work because I have not provided uh I have not told this model to not provide anything like that, but again that's not the intended purpose of this model. So we are going to tweak this uh guardrail again and say, you know, um no, only business. So now if I say, you know, um suggest some action movies to the same model, it's going to stop me and say, you know, that's not allowed. So this is how we can uh keep tweaking this model. So uh this seems very easy, but here it's doing multiple things. It's utilizing a Mistral model out of uh Bedrock um from SageMaker and again this model can be tweaked, fine-tuned, and whatnot. Again, put it into the Bedrock uh as demonstrated earlier, and then we can utilize it here uh seamlessly. Um it is uh using guardrails. It can also use the knowledge base that we showed uh showed up front. Uh so here in the data, we can uh depending on the model, right? So if I were to even pay more and choose a uh higher version model, I this data option will get enabled for me and I can say, you know, uh use only um only the data from the knowledge base. I can tweak this uh inference parameters and uh accordingly my responses will change. So uh here's all uh that I had to show for this uh thank you so much for the opportunity.

### Closing Remarks [24:52]

— Mhm.