AWS SA Bootcamp Hackathon Winning Project - Demo, Design, Learnings


Cloud With Raj · 19.05.2026 · 1,298 views · 37 likes


Video description

🚀 Join the real-world SA Bootcamp (fewer than 7 spots left; past students got high-paying jobs): https://app.cloudwithraj.com/
We ran a two-year anniversary SA Bootcamp Hackathon among 400 students! Students had one week to build something real. The winning team? Rishu, Shravanth, Taranmeet. They built a flashcard app for people with dyslexia. In this interview, we go over the design, demo, and learnings from this project.
🤝 Connect with the winners:
Shravanth: https://www.linkedin.com/in/shravanth-v/
Rishu: https://www.linkedin.com/in/rishu-gandhi-81862ab6/
Taranmeet: https://www.linkedin.com/in/tkindra/
💻 Try the app: https://www.auralearner.com/ , Code: AURA-SPRING-2026
00:00 - Intro
02:51 - Demo
10:15 - Design
16:31 - High Scale Consideration
18:23 - Challenges and Learnings
21:15 - What's Next?
22:30 - Connect With Them

Contents (7 segments)

Intro

Welcome, guys and girls. Today is a special occasion. We have with us the team that competed against 400 other SA bootcampers, took on a real-world challenge, and won the SA Bootcamp two-year anniversary hackathon. First things first: when we started this hackathon, I specifically told them that the project must not be a toy project or AI slop. It should be something that can be used in the real world, or something that production solutions architects actually face day to day. I'm super excited to showcase their solution, look under the hood at how they thought about it, how they built it, and what challenges they hit, and get to know the participants themselves. With that, let's do a short intro. Rishu, welcome. Why don't you introduce yourself?
Thanks, Raj. Hi everyone. My name is Rishu Gandhi, and I have seven-plus years of experience building production AI/ML systems, from deep learning models to full GenAI pipelines at enterprise scale. I was recently selected as an AWS Community Builder in the serverless track, so I'm going to build more technical depth in RAG architecture, generative AI, and cloud-native design. I currently work at Wells Fargo, deploying GenAI and RAG workflows for cybersecurity threat detection in mission-critical production environments. I'm also pursuing the Stanford LEAD executive program, where I'm combining deep technical expertise with business strategy and product thinking.
Excellent. Taranmeet?
Thanks, Raj, for the introduction, and great introduction, Rishu. I have 20-plus years of experience in IT. I started my journey as a Java developer and progressed to lead and solution architect. I have been working in the financial services domain in various capacities, as a lead engineer and as an architect, in geographies from London to Canada, at major banks such as Morgan Stanley, TD, and RBC. My focus has been on the Java side of things, and it is now shifting toward agentic and GenAI work, where I have been able to build some cool GenAI products and showcase them to senior leadership teams. Currently, I'm working as a solution architect with The Home Depot Canada.
Raj, thank you for having us on. It's been a pleasure working with you through the entire bootcamp. I'm a cloud solutions architect with about 8 years of experience building AWS platforms in healthcare with over 100 million users. My previous work was with Prime Therapeutics, where I was a principal software engineer and a cloud solutions architect. Yeah, over to you. Thank you for having

Demo

us on.
Okay, so let's get to the heart of it. Why don't you show us the solution you built and how it works? During the demo, I'm going to ask you some questions as well.
Sounds good. Thank you. So, this is the learning studio we built as part of Raj's hackathon. We had about a week and a half to come up with the MVP. We had a lot of ideas to go with, and we chose something that's close to heart for Taran and myself: we both have family members who have dyslexia. For people coming from India, you might have seen Taare Zameen Par, in which Ishaan has dyslexia. That's all the words jumping around. People with dyslexia have a lot of difficulty reading normal text as paragraphs, or as books with 100 or 200 pages. That's where I thought this would be the right support that 20% of the population would need. Let's get started. I will sign in.
Is this font also dyslexia-friendly?
Yes. Sorry, I forgot to mention: the font we are using is OpenDyslexic, a free font available for anybody to use. Unfortunately, nobody has decided to focus on this problem that 20% of our population has. This is our entry point. Once you sign in, we have flashcards, with questions and the answers behind them, and each of the questions can be spoken into. I have a few documents already preloaded, which is why we already have flashcards here. It's an interface for people coming back in to revise what they have already learned. We have focus mode, which is really important for people with ADHD, who need to focus on one question, figure out the answer, hyperfocus, and move on. We keep track of all of this: as you go through, you keep marking "got it", "okay", or "no", based on your understanding of the question. It's basically like a game at this point, which is something ADHD folks enjoy. And we get a full scorecard as an instantaneous response, which I feel is important. And here's our upload facility, which takes all the PDFs that the user wants to upload.
Yeah. So, here's a live example of the value we want to provide. Taran's younger daughter is dyslexic, so we wanted to see from the customer side what is going to help, and she gave us this example: these questions are not really readable because, as I mentioned earlier, dyslexia creates a lot of issues with reading. We said, "Okay, let's see if we can make this readable and more interactive, and gamify the entire learning process the kid has to go through." So, I'm going to drop in the PDF. Once we have the PDF, I can choose how many flashcards I want and hit generate. What happens in the back end is that it takes the document, creates embeddings out of it, and then uses the embeddings to build a RAG pipeline for us. Once we have the RAG pipeline, we're able to create the flashcards and return them immediately. As soon as you upload and get a response, we have the entire flashcard set here, ready for study, in less than 30 seconds.
So, this can be used for literally any topic, even for science and so on. If you have a PDF copy of the book, you can generate a bunch of flashcards before your exam.
Yes, that's correct.
I would just like to add: this is where users bring whatever extra knowledge they want, whether they upload a PDF, share a YouTube video link, or provide an MP4 file. They're actually bringing the knowledge base.
Did you guys and girls do anything to evaluate the ideal model? How did you all navigate it?
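The back-end step described in the demo (split the document, embed it, then retrieve from it when generating cards) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the team's code: the bag-of-words `embed` function is only a stand-in for a real embedding model, and `chunk_words`, `top_k`, and the sample text are hypothetical names invented for the example.

```python
import math
from collections import Counter


def chunk_words(text, size):
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical two-sentence "document" for illustration only.
document = (
    "Plants make food through photosynthesis using sunlight. "
    "The heart pumps blood through the body every minute."
)
chunks = chunk_words(document, size=7)
best = top_k("How do plants make food?", chunks, k=1)
```

Only the retrieved chunks, rather than the whole document, would then be placed in the prompt that asks the model for flashcards.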
Yeah, we ended up doing quite a bit of testing across all the models available within the Bedrock environment, because we wanted to keep everything as secure as possible and hold this to the highest standards, and nothing's higher than HIPAA. So, we decided to choose HIPAA, and I have a bit of experience in that. When we think of this from an SA perspective, everybody would say, "Oh yeah, let's just go for the highest reasoning model to get the exact value." But that's not the outcome we're targeting here. We're trying to make this as fast as possible so the user doesn't lose attention. When it comes to neurodivergence and ADHD, attention is the key parameter we need to look at from the customer's perspective. That's why we ended up going with Amazon Nova Micro, and from a hackathon perspective it helped us, because it's one of the cheapest models available.
So, the principles of architecture remain the same, even as we move to GenAI. I remember the time when we used to run VMs on-prem, and the solution used to be, "Hey, let's get the biggest possible VM. Hey, give it the biggest Java heap." But we have to start from the customer-first perspective and look at what enriches the customer. From the customer's perspective, it is the latency that matters, not whether you got the biggest model out there. We focus on that, and also, from a Well-Architected Framework perspective, on cost optimization. Cost is important. You might be getting 99.9% accuracy, but if the cost is 10 times higher, we can't deliver the solution, right? So, in this case, we looked at all the trade-offs and went with the Titan model.
Got it.
Okay, so I wanted to add: because the objective was to build this out as an MVP for the hackathon, as we scale and start adding more users, we're going to repeat our testing process to see what makes more sense, and we may choose a different model depending on whether it's meeting the customer's needs. It's always a changeable process. We're not stuck on any one model versus another, so we'll revisit that when we have to.
Excellent. Okay, so I see some of the other buttons, like history and ride, but before we go there, we are all

Design

architects here. Let's talk a little bit about the architecture. Can one of you walk us through the actual design of this application?
Sure, yeah, I can talk about that. We'll be sharing our architecture diagram as well, and I'll give a quick walk-through of what happens when a user uploads a PDF. The best way to think about it is in three stages: the user's browser, our serverless backend, and the AI layer. I do want to mention that none of this runs on a traditional server. When a user uploads a PDF, the first thing that happens is that the browser asks our upload-URL Lambda for a pre-signed URL, basically a time-limited permission slip to write directly to S3. The file never really touches Lambda. It goes straight from the browser to S3, which matters because Lambda has payload limits and is billed per millisecond. Once the file lands in S3, the browser calls our generate Lambda with just a file key, which is essentially a pointer to where the file is. The Lambda reads the file, extracts the text, and sends it to Amazon Nova Micro via Bedrock with a prompt that says, "Please generate flashcards, keep the language simple, and return structured JSON." The model responds, we save the session in DynamoDB, and we send the cards back to the browser. The whole thing also runs inside a private VPC, and the entire stack is serverless.
Yeah, which got me curious. Why serverless over containers or EC2?
Yeah, do you want to take that one, Taranmeet?
Thanks, Rishu. That's a very important and interesting question. It all comes back to the customer perspective. What is important to the customer? I would say it is the latency. The customer does not care whether you're running a cluster or serverless. So, for us, it was about looking at the customer first and optimizing for the best experience as well as cost. Well, think about it from our use case perspective.
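The pre-signed URL step in the walk-through above can be sketched as a small Lambda-style handler. This is a minimal sketch under assumptions, not the team's implementation: the bucket name, key layout, request fields (`userId`, `fileName`), and the five-minute expiry are all hypothetical. The only AWS call used is `generate_presigned_url`, which exists on the boto3 S3 client; a stub client is injected here so the sketch runs without AWS credentials.

```python
import json


def upload_url_handler(event, s3_client, bucket="example-uploads-bucket", expires_in=300):
    """Return a short-lived URL the browser can PUT the PDF to, plus the S3 key."""
    body = json.loads(event.get("body") or "{}")
    # Hypothetical key layout: one prefix per user.
    file_key = f"uploads/{body['userId']}/{body['fileName']}"
    upload_url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": file_key, "ContentType": "application/pdf"},
        ExpiresIn=expires_in,  # seconds until the "permission slip" expires
    )
    return {
        "statusCode": 200,
        "body": json.dumps({"uploadUrl": upload_url, "fileKey": file_key}),
    }


class StubS3Client:
    """Local stand-in for boto3.client("s3"); builds a fake URL instead of signing one."""

    def generate_presigned_url(self, client_method, Params=None, ExpiresIn=3600):
        return f"https://{Params['Bucket']}.s3.example.invalid/{Params['Key']}?expires={ExpiresIn}"


event = {"body": json.dumps({"userId": "u123", "fileName": "chapter1.pdf"})}
response = upload_url_handler(event, StubS3Client())
payload = json.loads(response["body"])
```

In a deployed function the stub would be replaced by `boto3.client("s3")`. The browser uploads the file directly to `uploadUrl` and then sends only `fileKey` to the generate function, so the file bytes never pass through Lambda.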
A user logs in, uses the app, and goes to bed at night, hopefully. Then, if the servers are running on EC2 machines, they sit idle and we're paying for all of that. Also, geographically, our customers are in North America at this point, and when they are not using the app, again your compute resources are idle. And the cost of maintaining a large containerized infrastructure and a pipeline would be high. So, it was not just the easiest route that we picked. It was a well-thought-out decision to go with the serverless architecture. In our architecture, we are using Lambda as the function-as-a-service layer, along with other serverless resources such as S3 and DynamoDB. So, our ongoing cost is really minimal, right? We only pay for usage as it goes, except for our NAT gateway, which we have to pay for.
Are you using Bedrock's native knowledge bases, or did you build something of your own? How did you solve that part?
So, GenAI is the core of what the product does, right? It's not bolted on. Let me walk you through how it actually works. When the user uploads content, our Lambda sends the text to Amazon Nova Micro via Bedrock, so the data itself stays secure within the AWS network in the region, right? With a carefully engineered prompt that we created, it generates X number of cards and keeps the questions under 15 words, which is really important because we're looking for speed when we give back the flashcards. And it has to answer within the 30-second limit that API Gateway enforces. We can talk about how to get that changed through a service request, but we want to work with what we have at this point for the hackathon. And we're using plain language only.
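The prompt constraints just described (a fixed number of cards, questions under 15 words, plain language, structured JSON back) might be encoded and enforced roughly as follows. This is a sketch under assumptions: the prompt wording, the `question`/`answer` field names, and the helper names are hypothetical, and the call to the model itself is deliberately left out.

```python
import json

MAX_QUESTION_WORDS = 15  # limit mentioned in the interview


def build_prompt(text, num_cards):
    """Assemble a prompt carrying the constraints described above (wording is illustrative)."""
    return (
        f"Generate exactly {num_cards} flashcards from the text below. "
        f"Keep each question under {MAX_QUESTION_WORDS} words and use plain language only. "
        'Return only JSON shaped like [{"question": "...", "answer": "..."}].\n\n'
        f"TEXT:\n{text}"
    )


def parse_cards(model_output, num_cards):
    """Parse the model's JSON and keep only well-formed, short-enough cards."""
    try:
        raw = json.loads(model_output)
    except json.JSONDecodeError:
        return []
    if not isinstance(raw, list):
        return []
    cards = []
    for item in raw:
        if not isinstance(item, dict):
            continue
        question = str(item.get("question", "")).strip()
        answer = str(item.get("answer", "")).strip()
        if question and answer and len(question.split()) < MAX_QUESTION_WORDS:
            cards.append({"question": question, "answer": answer})
    return cards[:num_cards]


# Hypothetical model output: one valid card and one with an empty question.
sample_output = json.dumps([
    {"question": "What do plants use to make food?", "answer": "Sunlight."},
    {"question": "", "answer": "Dropped because the question is empty."},
])
cards = parse_cards(sample_output, num_cards=5)
```

Validating on the way out means a malformed or overlong response degrades to fewer cards instead of an error reaching the browser.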
We're trying to make sure it is readable and comprehensible for students of a younger age. At the end, we get a JSON object back from the model itself. We parse that response and save it into DynamoDB for our history tab, which I can walk through. For larger documents, we use RAG, which is retrieval-augmented generation. I'm sure everyone knows about this. The analogy I use is that we avoid cramming an entire textbook into the AI's context window at once. We have massive textbooks, right? I can't even imagine how big they can get. The content gets chunked into 512-token pieces, and then we store those in a Bedrock knowledge base backed by S3 Vectors storage. That's something new that came out at re:Invent. We wanted to see if it fit our idea, and it really worked well for us. At generation time, we retrieve only the chunks that match the specific user's file, filtered by a metadata sidecar that we write at upload time. YouTube is different, right? Those are small enough to fit within a prompt, so we skip the knowledge base entirely and go directly to the model. Knowing when not to use RAG is more important than knowing when to use it. We save on the costs associated with it, and we're able to deliver value faster to the customer at the end.
Got it. No, I really like that you all have implemented a real-world way of doing RAG, because in real projects no one just ingests a full document. You have to do chunking, then semantic or similarity search, reranking if needed, and metadata, right? Very good. So, I don't know, can thousands of people use this application? What would you do when this

High Scale Consideration

application reaches high scale?
Yes, I can talk about the scalability piece. Right now, the way things are, we would definitely change some things. The first is that we would decouple the architecture. Right now, when someone uploads audio, our Lambda invokes itself with a fire-and-forget call. It works, but it's a workaround. I believe you can only invoke a certain number of Lambdas at a time, but what if you have 100,000 users all at once? In the real world, we don't have any control over how many users use it at once. If that happens, we would have to introduce SQS plus a Lambda consumer to decouple the architecture. SQS is built exactly for this pattern: decoupled, asynchronous processing. We can use its benefits, like automatic retries, dead-letter queues, and concurrency control. So that is something we would be doing. There's also an additional feature we've added: we're currently tracking consecutive study days in DynamoDB. You know how Snapchat has a streak function? We have that on our website as well, because we want to track how many users come in on consecutive days so that we can provide some incentives for them. For example, we can say, "Hey, if you're coming in every day, we can unlock maybe 20 flashcards." We want to do all those things as part of our future features. The next step would be wiring DynamoDB Streams to a Lambda that triggers SES emails when someone breaks a streak. The whole pipeline is designed for this, and we still have to connect this last piece. So these are some of the things we would be implementing from a scalability perspective.
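The "SQS plus a Lambda consumer" idea above could look roughly like this on the consumer side. It is a sketch under assumptions, not the team's planned code: `process_upload` and the message fields are hypothetical. The event shape (`Records`, `messageId`, `body`) and the `batchItemFailures` response follow Lambda's SQS integration, where partial batch responses apply only if `ReportBatchItemFailures` is enabled on the event source mapping.

```python
import json


def process_upload(job):
    """Hypothetical unit of work: fails when the job carries no file key."""
    if "fileKey" not in job:
        raise ValueError("missing fileKey")
    return f"processed {job['fileKey']}"


def sqs_consumer_handler(event, context=None):
    """Handle a batch of SQS messages and report only the ones that failed."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_upload(json.loads(record["body"]))
        except Exception:
            # Only this message is returned to the queue for a retry.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


# Hypothetical batch: one good message and one that will fail.
event = {
    "Records": [
        {"messageId": "m-1", "body": json.dumps({"fileKey": "uploads/u123/talk.mp4"})},
        {"messageId": "m-2", "body": json.dumps({"note": "no file key"})},
    ]
}
result = sqs_consumer_handler(event)
```

Messages that keep failing are retried by SQS and, once the queue's redrive policy limit is reached, moved to the dead-letter queue, which is the retry and DLQ behavior mentioned above.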

Challenges and Learnings

Got it. So, okay, the solution looks quite good, crisp, and very responsive. But I'm sure not everything went great from the get-go. What are some of the learning points, or what things broke so that you learned the hard way?
Honestly, this one is really easy. We got burned by it directly, right? And it's a lesson we'll carry into every other cloud project we face in the future. Our first instinct was to say, "Oh yeah, let's just use the vector store provided by Amazon OpenSearch Serverless." It's the standard recommendation you see everywhere for RAG on AWS. So we spun it up and everything worked technically, and then we looked at the pricing. This was one of our major mistakes: we never thought about the scope of the hackathon or the budget limit, which is something we should have done, and we'll remember this for our entire lives. Here's the fun piece of the puzzle: OpenSearch Serverless has a minimum capacity of two OCUs, which are OpenSearch Compute Units, at about 24 cents per OCU per hour. When you do that math, it's just 24 cents an hour, and we thought, "Oh, okay, it's nothing." But over a month it comes to about $350, even if zero users hit it. If no requests come in, we're still paying a default of $350 a month. For a hackathon project, that's not a cost, that's a wall, and we ran into it headfirst and got burned. "Serverless" is in the name, which is what AWS is betting on for customers, too. I think it is truly serverless, but there's that required minimum, and the floor is $350 a month. So, we switched to S3 Vectors in the middle of the hackathon. I was really scared doing this, but it made sense, because we can talk about the pitfalls we saw during our implementation phase, right? So, we switched to S3 Vectors, AWS's newer purpose-built vector storage, right?
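The arithmetic behind the roughly $350 floor quoted above is easy to check. The two-OCU minimum and the 24-cent rate are the figures given in the interview; the 730-hour month is an assumption (a common billing approximation), and actual prices vary by region and over time.

```python
HOURS_PER_MONTH = 730        # assumption: 8,760 hours per year / 12
MIN_OCUS = 2                 # minimum capacity quoted in the interview
PRICE_PER_OCU_HOUR = 0.24    # USD, figure quoted in the interview


def monthly_floor(ocus, price_per_hour, hours=HOURS_PER_MONTH):
    """Idle cost per month: capacity is billed every hour, even with zero requests."""
    return ocus * price_per_hour * hours


floor_usd = round(monthly_floor(MIN_OCUS, PRICE_PER_OCU_HOUR), 2)
```

Under these assumptions the floor works out to 2 × 0.24 × 730, or about $350 a month, before a single request is served.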
Pay per request, true zero idle cost. For our workload, it was orders of magnitude cheaper, and it worked well, almost as well as OpenSearch. The lesson: always read the pricing page, not just the feature page. And "serverless" is a marketing word in some parts of the cloud business as a whole. It doesn't always mean what you think, so always be sure of what you're looking

What's Next?

for.
That's a good one. So, what are your future plans to enhance this? Where do you all go from here?
Thanks, Raj. Since our hackathon, this project has already expanded a lot. As we saw in the demo, we added features like streaks and history. The project was going so well that we wanted to carry on with it. We saw that we are actually solving a customer's problem and adding value for customers, and that's why we want to continue. Now, we have a lot of things in mind. For example, we are thinking about a teacher mode, and maybe we'll expand it to schools, where it will become a monitor. So, we have a whole lot of ideas in our bucket, but what we want to do is go back to the customers, get some experience, get some feedback, right? So, the first feature we are putting in is a feedback tool, where we get feedback from the customer first and then build, not what we want, but what the customer wants. That's going to be our motto going forward.
Got it. Excellent. And I saw there is a feedback option there. Very happy to see that you all are building that. Okay, so how can folks

Connect With Them

connect with you all?
LinkedIn is the best way to reach me. I'll have my LinkedIn below in the description. Please feel free to connect with me anytime.
Excellent. How about you, Shravanth?
Yeah, same. Feel free to reach out to me on LinkedIn. I also write blogs, mostly on AWS, because I'm an AWS Community Builder, so that's another way to reach me as well.
Great.
Yeah, same. LinkedIn would be the best place. I'm active there. I write articles and am starting to write Medium posts about the community builders and community users here, with a lot of fun facts across the board. Solutions architects and people from multiple levels of software development come to our meetups, and the growth in perspective has definitely been immense. So I try to share my learnings through many of my posts. Feel free to connect with me.
Love it. Thank you for participating in this hackathon, building an amazing real-world project, and, more importantly, sharing it with the viewers so they can also get energized and build something cool. All right, folks, have a good one.
