# I Built a Complex RAG App Using Warp, the Agentic Development Environment 🤖🧠

## Метаданные

- **Канал:** Python Simplified
- **YouTube:** https://www.youtube.com/watch?v=bPNmmDPyGzk
- **Дата:** 01.04.2026
- **Длительность:** 14:55
- **Просмотры:** 10,497

## Описание

What if Python is no longer the closest thing to plain English? We’ve entered the era where we can build serious, full-stack software using nothing but natural language! 👉 Try Warp for free today → https://go.warp.dev/simplifiedytev

In this tutorial, we’re using a professional Agentic Development Environment (ADE) to build a production-ready RAG application from scratch! 🏗️💻

By the end of this video, you’ll see exactly how to take a tiny language model like Qwen and turn it into a world-class expert on any research paper or PDF you throw at it. No more manual debugging for hours—we’re using AI agents to architect our pipeline, handle our vector database, and even design a beautiful user interface! 🎨🤖

📚 What you'll learn:
• What RAG is (The "Open Book Exam" for AI) 
• Setting up Warp ADE on Windows
• Building a minimal RAG pipeline via natural language prompts
• Creating a Vector Database from PDF research papers (AlexNet, DeepSeek, etc.)
• Natural Language Debugging: How to fix LLM hallucinations without code 
• Designing a Custom GUI with Flask using a layout sketch
• Scaling your app to handle multiple documents

🛠️ Tools used:
• Warp ADE 
• Python 3.12
• LangChain
• Qwen Instruct LLM 
• FAISS
• Flask 

🔎 Helpful Resources:
⭐ Full code on GitHub: 
https://github.com/MariyaSha/RAG_GUI_GenAIApp.git

⭐ Hugging Face RAG Pipeline:
https://huggingface.co/learn/cookbook/en/rag_with_unstructured_data

⭐ AlexNet Research Paper:
https://arxiv.org/abs/1803.01164

⏰ Timestamps ⏰
01:13 - What exactly is RAG? (Retrieval-Augmented Generation)
01:52 - Setting Up Warp ADE 
03:02 - Prompting the Agent for a Specific RAG Pipeline 
04:34 - Reviewing and Manually Adjusting the Agent’s Code
05:35 - Simplifying the LLM Chat Interface 
07:45 - Debugging LLM Hallucinations with Natural Language 
09:31 - Designing the GUI with Flask using a Wireframe Sketch 
10:51 - Final UI Polishing and Testing with Multiple Research Papers 

📝 Prompts 📝

1. Initial RAG Pipeline Build [04:01]
Build a minimal RAG app from this guide:
https://huggingface.co/learn/cookbook/en/rag_with_unstructured_data
PDF-only
Extract and chunk Alexnet.pdf text
Use LangChain
Use a very small Qwen Instruct
Save pipeline as app.py

2. Automatic Module Installation [05:18]
suggest which modules to install in rag_env to match the requirements of app.py

3. Debugging Accuracy & Conciseness [07:53]
the pipeline works, but not perfectly.
- LLM gives wrong answers (AlexNet has 60 million parameters and was trained on a single GTX 580 GPU).
- LLM provides too much information.
- LLMs answers must be short and concise.
- review attached output and fix these issues.

4. Refactoring to vectorize.py [10:14]
please move the code that generates the vector database into a separate file - vectorize.py. We will run vectorize.py independently and get app.py to read from it.

5. GUI Design with Flask [10:51]
Design GUI for app.py.
- use Flask.
- use attached app layout image as inspiration.
- use logo.png.
- don't change the RAG pipeline itself - just add a GUI in app.py.

6. UI Refinements & Sidebar Scaling [12:05]
remove "RAG Chat" text next to the logo. make logo bigger. make sidebar 30% smaller in width and justify side bar content to the center.

7. Multi-PDF Embedding [12:44]
instead of embedding AlexNet.pdf only - please embed all the PDF files from the local folder /rag_app/research. don't change anything else.

#Python #AI #RAG #MachineLearning #Coding #ArtificialIntelligence #LLM #SoftwareEngineering #Flask #VectorDatabase

## Содержание

### [1:13](https://www.youtube.com/watch?v=bPNmmDPyGzk&t=73s) What exactly is RAG? (Retrieval-Augmented Generation)

So, first of all, what exactly is RAG? Imagine your language model is a student with a serious memory problem. If you give it an exam, it will fail badly because it cannot remember anything from the learning material. So, RAG for our model is like an open-book exam, where it doesn't need to remember the learning material because it is right in front of it during the exam, and all it needs to do is search quickly. And that's exactly what retrieval-augmented generation is all about. Intelligence that doesn't come from memory, but from access. And in this project, we are going to prove it. Yeah, and in a perfect timing, my mouse just died. So, my apologies for the cable, but you're going to have to live with it. Now

### [1:52](https://www.youtube.com/watch?v=bPNmmDPyGzk&t=112s) Setting Up Warp ADE

first things first, let's download Warp, our ADE. We will navigate to warp. dev and we'll go ahead and download it for Windows. Then we will run the executable and we will follow all the setup wizard instructions. And finally, we will launch warp. Then we will sign up, or in my case, sign in. And beautiful, we are officially inside warp and we can start exploring it. Now, if you are to have a interface, similar to traditional IDEs, we have a project explorer, a code editor, and a very futuristic terminal where we can attach images, contacts, give voice commands, and even choose the type of agent we want to talk to. In my case, I'll just leave it on auto to save some credits. Great, we can start building our rag application. Now, to keep things organized, let's create a project folder with mkdir rag app. And just like that, our agent kicks in and suggests that we change the directory. So, instead of typing the command, we just press the right arrow and enter. And okay, we have a project folder and now play with our agents. In my case, I'm just going to cut to the chase. I want to see a very specific pipeline and

### [3:02](https://www.youtube.com/watch?v=bPNmmDPyGzk&t=182s) Prompting the Agent for a Specific RAG Pipeline

I want it to work immediately without debugging anything. So, on my end, I already found the instructions for the rag pipeline I have in mind. It is the official hugging face documentation and one of the examples here is taking PDF documents and embedding them in a vector database, which is exactly what I need. There are other examples here like PowerPoint slides and websites, but I'll just ask warp to get rid of them. Now, to make it work, we also need some kind of a PDF document. In my case, the AlexNet research paper. That's how computers learn to recognize all kinds of stuff in pictures. You can find both of these links in the description as well as all the prompts that I'm about to type. But before we type anything, we'll just drag and drop our PDF document into the project folder and then back in warp, in the same terminal where our code went, let's go ahead and talk to an agent in a natural language. So, we will say, "Build a minimal rag app from this guide. " Pasting our URL, of course. "PDF only extract and chunk AlexNet PDF text. Use LangChain. Use a very small Qwen-Instruct and then save pipeline as app. py. " Beautiful. Let's give it a run. Now, the way it works, we first see a preview of what the agent built. And only after we accept it, the file will be generated and we'll see just how many credits it

### [4:34](https://www.youtube.com/watch?v=bPNmmDPyGzk&t=274s) Reviewing and Manually Adjusting the Agent’s Code

cost us. Now, to view our new file, we will click on open project explorer at the very top left, and then app. py, which will then pop the code on the right. When we scroll through the agent's code, if anything bugs you, you can just manually revise it. For example, if we're already trying to switch from CPU to GPU, we need to know if we succeeded, right? So, let's go ahead and quickly add a print statement right after we set the device, with print, and then running on followed by the device. Okay, and that way, if we run on GPU, it will print GPU. CPU, it will print CPU. Now, we can also tackle bigger issues that we don't want to handle manually. For example, we have all these command line arguments that nobody needs. I just want to chat with the model, and this has nothing to do with it. So, let's quickly select all of it, and let's write to our agent, "I don't need those command line arguments.

### [5:35](https://www.youtube.com/watch?v=bPNmmDPyGzk&t=335s) Simplifying the LLM Chat Interface

I just want a simple Q& A style chat with the LLM. " Okay, now, when we select a chunk of code, we can just pull it into our prompt by clicking on the attach context button. So, let's do so. Then, once we see that our block is attached, we will then run our prompt, and then the agent will know exactly what we're talking about. And perfect, we now get a GitHub-style code preview with added code in green and removed code in red as a diff. We see that the agent simplified our app, and I think we're ready to go. For this, we will create a new working environment with conda create -n. We will call this environment rag env, and we will install Python 3. 12 in it. Warp then suggests to activate it, so let's do so. And then we can just say suggest which modules to install in rag env to match the requirements of app. py. So, basically, figure out all the things that we don't have time to figure out. And it will automatically suggest the command, so with a right click, we will pick the run in terminal option followed by enter. And great, we are now ready to run the app with python app. py. Now, initially, we get a bunch of errors, but then Warp immediately starts resolving them, and it happens automatically without having to press any buttons. So, in my case, I'll just accept the changes, and let's give it another go. And perfect, everything seems to work, so let's quickly test our model with some questions. So, let's ask, how many parameters in AlexNet? And even though we get an answer, this is the wrong answer. AlexNet has 60 million parameters, and you can verify that on page five of the PDF. Another question we can try, in case something just randomly bugged out, is what GPU AlexNet trained on. And we get the wrong answer once again, plus a whole bunch of mumbo jumbo that follows. So, something is obviously wrong in our workflow, but instead of manually debugging it, we'll just debug it with natural language. So let's

### [7:45](https://www.youtube.com/watch?v=bPNmmDPyGzk&t=465s) Debugging LLM Hallucinations with Natural Language

collapse our app with Ctrl C, and let's say the pipeline works, but not perfectly. LLM gives wrong answers. AlexNet has 60 million parameters and was trained on a single GTX 580 GPU. And also, the LLM provides too much information. LLM's answers must be short and concise. And then review attached output and fix these issues. But before we press enter, let's select the entire output and we'll pick attach as agent mode context. And now, let's give it another run. So eventually, after a few minutes of debugging, Warp will figure out how to get the correct answers on both our questions. So as you can see, what GPU AlexNet trained on and video GTX 580, and how many parameters AlexNet has, 60 million. Perfect. And to make sure that nothing was hardcoded because I caught it trying, okay? We'll ask a few other questions like resolution of ImageNet images. Okay, beautiful, 256 by 256. That's correct. And then what's the dropout probability in AlexNet? 0. 5. Amazing. And perfect, these answers are correct. We can also scroll through the code and double-check that there are no special blocks for prompts with GPU or parameters. If you see them on your end, just select them and tell Warp it is not allowed to hardcode anything because the user will ask all kinds of questions, not just about GPUs or parameters. Now, once we are happy with the rag process

### [9:31](https://www.youtube.com/watch?v=bPNmmDPyGzk&t=571s) Designing the GUI with Flask using a Wireframe Sketch

we need to wrap it in a beautiful interface. Now, when it comes to designing the interface, I'm not a big fan of giving AI full control. I usually design a sketch or some kind of a wireframe that the agent can use as a reference. So, feel free to download my sketch and my logo. The link is in the description. Now, before we tackle the GUI, I just want to move the vector section into a separate file. And that way, we don't need to create a new database every single time we run app. py, which is what's going on right now. It makes way more sense to create the database independently and then get our app to read from it instead of generating it time and time again. So, let's type, "Please move the code that generates the vector database into a separate file. Let's call it vectorize. py. We will run vectorize. py independently and get app. py to read from it. " Great. So, now we generate the database outside app. py with python vectorize. py, which seem to be working like it should. Beautiful. So, let's move on with pasting the logo in our project folder and attaching my layout sketch to the terminal prompt with attach image. Then, we will type the following prompt, "Design GUI for app. py.

### [10:51](https://www.youtube.com/watch?v=bPNmmDPyGzk&t=651s) Final UI Polishing and Testing with Multiple Research Papers

Use Flask. Use attached app layout image as inspiration. Use logo. png. Don't change the rag pipeline itself. Just add a GUI in app. py. " Boom. Once we approve all the files, the only thing left to do is pip install Flask and then python app. py. And then we can navigate to localhost:5000, which we will just copy in our browser and boom, here's our beautiful application. So, let's verify it works with our two testing questions. So, how many parameters in AlexNet? Perfect. And then, what GPU AlexNet trained on? And perfect, it got both right. But, I do have some issues with the layout. So, first, we need to remove the rag chat text because our logo already includes it. And then, we need to include the rest of these research papers in our vector database, just like we've done with AlexNet. So, first, let's get rid of the text with remove rag chat text next to the logo. Then, make logo bigger. Make sidebar 30% smaller in width. And justify sidebar content to the center. And wow, this time, it's perfect. So, now, let's quickly download the rest of the research papers from the layout. And let's add them to our vector database. On my end, I'll just put everything in a research folder at the root of our project, and I'm going to ask Warp to do the following. Instead of embedding AlexNet PDF only, please embed all the PDF files from the local folder {slash} rag app {slash} research. And don't change anything else. And beautiful, once Warp is done, we will go ahead and run vectorize. py once again, just because we changed a few things, and then we will run our app. And let's give it a final test asking about something else, okay? What languages Deep Seek R1 is optimized for? And perfect, Chinese and English, okay? And how many layers in smallest VGG? Amazing. In both cases, it is 100% correct. And congratulations, we have officially designed a complex full stack rag application using natural language instead of Python. And the best part is this project is not only production ready, but this is one of the best things you can put in your portfolio. So, I highly recommend going over what the agent built, refining it manually, adding your own touch, and then pushing it to your GitHub as soon as possible. Now, if you'd like to review my code or the code that my agents built, you can find everything on GitHub. And if you have any questions, please leave them in the pinned comment below. And thank you so much for watching. If you found this video helpful, please share it with the world. And don't forget to leave it a huge thumbs up and all kinds of comments. Now, if you'd like to see more videos of this kind, you can always subscribe to my channel and turn on the notification bell. I'll see you very soon in an amazing cybersecurity tutorial. You're not going to believe it. So, in the meantime, bye-bye. Thank you very much. Mario is my stylist. Bravo. Oh, but my mouse dead. Bye-bye mouse. I'm going to do it with my thumb. I just need a break. I have coffee. Why do I say error? It was kind of slow. Fantastic. Amazing. Acceptable and fantastic. That will do. One more time, just in case. One more time. Done. High five. Well done, man.

---
*Источник: https://ekstraktznaniy.ru/video/52904*