# Why Hotstar Buffers During IPL Finals — DNS + GSLB Explained for Cloud Architects

## Metadata

- **Channel:** IT k Funde
- **YouTube:** https://www.youtube.com/watch?v=QQdXEKZt6zg
- **Date:** 08.05.2026
- **Duration:** 11:52
- **Views:** 837
- **Source:** https://ekstraktznaniy.ru/video/50048

## Description

Hotstar streams IPL to 50 million+ concurrent viewers without crashing.
But your friend's Hotstar buffers in the final over. Why?

The answer is DNS + GSLB — and it's one of the most common system design questions in Cloud Architect interviews today.

In this video, I break down:
✅ What is GSLB (Global Server Load Balancing) — in plain English
✅ How DNS + GSLB work together to route millions of requests globally
✅ The Amazon warehouse analogy that makes this click instantly
✅ Active-Active vs Active-Passive GSLB setup
✅ MEP Protocol — what it is and why interviewers ask about it
✅ Latency-based, Geo-based, Weighted & Failover routing policies
✅ The real reason Hotstar can buffer — TTL explained
✅ How to answer this in a Cloud Architect interview confidently

🎯 The interview question covered:
"Why does Hotstar buffer during IPL finals and how would you fix it 
using distributed architecture?"

If you're preparing for Cloud Architect, Solutions Architect, or 
Senior DevOps interviews — t

## Transcript

### The IPL Interview Question That Stumped a Cloud Architect []

One of my friends was recently in a cloud architect interview, and the interviewer asked him if he follows IPL. He said, "Yes, I'm a big fan." Then the interviewer asked him a scenario-based question: "Imagine that you are watching the IPL final between Mumbai Indians and Chennai Super Kings. It is the final over. Chennai Super Kings need 20 runs from six balls, you're glued to your Hotstar screen, and suddenly it starts buffering. Can you tell me why, and how we could prevent it using system design principles, using something which is very relevant in today's distributed architecture?" There was a pause. My friend took some time and then very gracefully said, "Sir, I would like to pass this one because I do not know the answer." Which is completely fine. If you don't know an answer, rather than giving fake definitions or mugged-up answers, it's better to accept that you don't know. This actually creates a very good impression, by the way. But nonetheless, this video is about what the interviewer was expecting, and he was expecting my friend to talk about how we combine DNS and GSLB. GSLB, by the way, stands for global server load balancing, okay? So, how do you use DNS plus GSLB to create this kind of an architecture where you can serve millions of requests, millions of viewers across the globe? Today we'll understand the architecture, and I'm pretty sure by the end of this video you will be in a position to know what it is at a basic level, and then you can expand on it and explore it further. So, without further ado, let's get started.

### Amazon Warehouse Analogy — How GSLB Thinks [1:34]

So, as always, let's start with an easy-to-digest analogy which will plant this idea in your brain, and that is about how Amazon works and fulfills our orders. Imagine this customer has ordered a product. That product is not shipped directly from a factory. No. Imagine that Amazon has an algorithm, a smart fulfillment router, and based on the behavior and location of this customer or group of users it automatically knows which products are most popular. Those products are already stocked and stored in warehouses. So, Amazon has a warehouse in Delhi, in Bangalore, in Chennai, Mumbai, okay, Pune for example. The router gets those products into the warehouses in advance, so when an order is placed and the router scans it, it is very rare that the item is not stocked in any of the warehouses. Now, for example, if this person is in Mumbai, then it will automatically know that Pune is the nearest warehouse from where this could be shipped, and this shipping will happen in just a few hours. And this smart algorithm is how GSLB, the global server load balancer, also works. The role of this fulfillment router is not to continuously talk to the customer. No, its role is only to ensure that the customer is served by the nearest warehouse. Similarly, global server load

### GSLB = Load Balancer of Load Balancers [3:04]

balancer can be called a load balancer of load balancers, okay? Something like metadata, which is data about data. Similarly, this is a load balancer which talks to other load balancers. So, the moment a request comes in, it will decide, okay, this is the nearest CDN, content delivery network, location from which this request has to be served. So, someone sitting in Delhi and watching the Mumbai Indians final, which is happening in Mumbai, will not be served from a Mumbai data center. He will be served from the Delhi data center. We will understand it in more detail in the next section, but understand that this is the whole concept: you build it with DNS, which is your domain name system, and DNS forwards the request to GSLB. Now, the caveat is that not every architecture needs GSLB. It is needed when you have distributed locations, you need very fast performance, and you are getting millions of requests which need to be served from different data centers. Then you need this kind of an architecture. So, now let's understand

### Step-by-Step Architecture: DNS + GSLB Flow [4:00]

it in a bit more detail in the detailed architecture of GSLB. Now, let's understand the actual architecture and how it works. So, imagine this user sitting in Delhi watching that same IPL final. He sends a request, and at step one that request generates a DNS query which goes to a DNS resolver. It could be any DNS resolver, for example Google's public DNS resolver. Okay? Once the query gets there, the resolver routes it further to the actual authoritative server. In this case, that is your global server load balancer, GSLB. So, this box is actually the DNS plus GSLB setup. Obviously Hotstar will have its own GSLB architecture, so the query goes to the authoritative server, which has GSLB enabled, and based on the incoming request it decides where to route it. Now, we have to take a pause here. You can see that GSLB one is in Mumbai and GSLB two is in Singapore. So, in here you can have different kinds of setup like
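The DNS plus GSLB flow described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the hostname `stream.example.com`, the edge addresses (taken from the documentation range 192.0.2.0/24) and the "serve the user's own region, else Mumbai" rule are all hypothetical and are not Hotstar's actual setup.

```python
from dataclasses import dataclass

# Hypothetical edge sites with documentation-range IPs (not real hosts).
EDGE_SITES = {
    "delhi": "192.0.2.10",
    "mumbai": "192.0.2.20",
    "chennai": "192.0.2.30",
}


@dataclass
class DnsAnswer:
    ip: str   # address of the chosen edge location
    ttl: int  # seconds the client may cache this answer


class GslbAuthoritative:
    """Authoritative DNS with GSLB enabled: picks an edge site per query."""

    def __init__(self, sites, ttl=300):
        self.sites = sites
        self.ttl = ttl

    def answer(self, hostname, client_region):
        # Assumed policy: use the client's own region, else fall back to Mumbai.
        site = client_region if client_region in self.sites else "mumbai"
        return DnsAnswer(ip=self.sites[site], ttl=self.ttl)


class Resolver:
    """DNS resolver (step one): forwards the query to the authoritative server."""

    def __init__(self, authoritative):
        self.authoritative = authoritative

    def resolve(self, hostname, client_region):
        return self.authoritative.answer(hostname, client_region)


resolver = Resolver(GslbAuthoritative(EDGE_SITES))
answer = resolver.resolve("stream.example.com", client_region="delhi")
print(answer.ip, answer.ttl)
```

After the answer comes back, the client connects to `answer.ip` directly; the resolver and GSLB are no longer involved until the TTL runs out.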

### Active-Active vs Active-Passive + MEP Protocol [5:07]

active-active, based on what kind of routing policies you are creating. It can also be active-passive. For example, for disaster recovery: if your GSLB one goes down, your GSLB two should automatically start receiving those requests. And these two talk over a protocol called MEP, the metric exchange protocol. You can imagine it as a kind of WhatsApp group which these two have joined, and they talk to each other using MEP. This is a good thing to know, since it might be asked in the interview. While all this is happening
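A minimal sketch of the active-passive idea, assuming two GSLB nodes that learn each other's status. It does not implement MEP itself; `learn_peer_status` is a hypothetical stand-in for whatever the sites exchange, and the node names are made up for illustration.

```python
class GslbNode:
    def __init__(self, name, role):
        self.name = name
        self.role = role      # "active" or "passive"
        self.up = True        # is this node itself running?
        self.peer_up = True   # last status learned from the peer

    def learn_peer_status(self, peer):
        # Stand-in for the status exchange between sites (not the real MEP format).
        self.peer_up = peer.up

    def should_answer(self):
        """The active node answers while it is up; the passive one only if its peer is down."""
        if not self.up:
            return False
        return self.role == "active" or not self.peer_up


gslb1 = GslbNode("gslb1-mumbai", role="active")
gslb2 = GslbNode("gslb2-singapore", role="passive")
before = (gslb1.should_answer(), gslb2.should_answer())

gslb1.up = False                # simulate a disaster at site one
gslb2.learn_peer_status(gslb1)  # site two notices that its peer is gone
after = (gslb1.should_answer(), gslb2.should_answer())

print(before, after)
```

In an active-active setup both nodes would answer all the time, and the exchanged information would feed the routing decision instead of a takeover decision.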

### Health Check Probes — How GSLB Monitors Edge Locations [5:39]

there is something happening frequently at step three, which is a parallel step running all the time, and that is your health check probe. Always remember that GSLB is your load balancer of load balancers, okay? So, it is continuously probing your different CDN locations. These are your CDN edge locations, okay? And this box is the CDN plus origin server, the origin server being where the content is actually fed from. So, this health check probe happens continuously so that GSLB knows whether all the edge locations are active, healthy and able to serve requests. For example, if there is a failure here and this particular health check fails, then any user from Bangalore who's watching the same IPL final will automatically not be sent here. He would be redirected to the next closest location, for example the Chennai data center or CDN location, okay? So, that happens continuously, and once we decide
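The health check step can be sketched as follows. The preference table `NEAREST_EDGES` and the `fake_probe` function are assumptions made for the example; a real probe would be an HTTP or TCP check against each edge location.

```python
# Hypothetical preference order: nearest edge first for each user city.
NEAREST_EDGES = {
    "bangalore": ["bangalore", "chennai", "mumbai"],
    "delhi": ["delhi", "mumbai", "chennai"],
}


def run_health_checks(probe, edges):
    """Probe every edge once and return {edge: is_healthy}."""
    return {edge: probe(edge) for edge in edges}


def pick_edge(user_city, health):
    """Return the closest edge whose last health check passed."""
    for edge in NEAREST_EDGES[user_city]:
        if health.get(edge, False):
            return edge
    raise RuntimeError("no healthy edge location available")


def fake_probe(edge):
    # Simulated probe: pretend the Bangalore edge location is down.
    return edge != "bangalore"


health = run_health_checks(fake_probe, ["bangalore", "chennai", "mumbai", "delhi"])
print(pick_edge("bangalore", health))
print(pick_edge("delhi", health))
```

In a running system `run_health_checks` would be called on a timer, so the `health` table that the routing decision reads is always recent.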

### DNS Response: IP Address + TTL Explained [6:38]

that okay, this Delhi user has to be served by the Delhi edge location data center, then the response carries the actual IP address and the TTL, time to live, which is how long this information should be cached at the client side, okay? So, the DNS response would be the IP address and the TTL, okay? After that, this whole GSLB setup is out of the picture. The connection will be direct between the user's laptop and this data center, okay? Now, the role of GSLB is completed. Its only role was to route the user to the right CDN location. And then routing policies
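To make the role of the TTL concrete, here is a small sketch of a client-side DNS cache. The `DnsCache` class, the injected `clock` and the fixed lookup result are assumptions for illustration; real operating systems and resolvers keep their own caches.

```python
class DnsCache:
    """Caches DNS answers on the client until their TTL expires."""

    def __init__(self, lookup, clock):
        self.lookup = lookup  # function: hostname -> (ip, ttl_seconds)
        self.clock = clock    # function returning the current time in seconds
        self.entries = {}     # hostname -> (ip, expires_at)
        self.lookups = 0      # how many times GSLB was actually asked

    def resolve(self, hostname):
        now = self.clock()
        cached = self.entries.get(hostname)
        if cached and now < cached[1]:
            return cached[0]  # still fresh, GSLB is not involved
        ip, ttl = self.lookup(hostname)
        self.lookups += 1
        self.entries[hostname] = (ip, now + ttl)
        return ip


now = [0]  # fake clock so the example does not have to sleep
cache = DnsCache(lookup=lambda host: ("192.0.2.10", 300), clock=lambda: now[0])

cache.resolve("stream.example.com")  # t=0: first lookup goes to GSLB
now[0] = 299
cache.resolve("stream.example.com")  # t=299: answered from the cache
now[0] = 300
cache.resolve("stream.example.com")  # t=300: TTL expired, GSLB is asked again
print(cache.lookups)
```

Between two lookups the client keeps using the cached IP address, whatever has happened to that edge location in the meantime.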

### Routing Policies — Latency, Geo, Weighted, Failover [7:11]

are different based on your needs.

First, latency-based routing, wherein the primary thing to ensure is that the response is the fastest. So, it would always detect which edge location can respond fastest, usually the closest one, and serve the request from there.

Geo-based routing is very important. Sometimes you have to have it, so that if a user from the UK is trying to watch IPL and opens the app, the request is routed to the CDN for that geography, because it might be that you're not allowed to watch IPL from an app which is registered in the UK, okay? Or maybe the pricing is different, for example. Or there are regulatory restrictions, such as GDPR data-handling rules or licensing limits on certain content in certain countries. That is where geo-based routing comes into the picture.

Then weighted routing means that, say, 80% of requests should go to one CDN location and 20% to another, based on the resources and needs of your overall architecture. It might be that one of your CDN locations has massive infrastructure and massive resources while the other is just coming up. So you would have weighted routing there. I'm just giving you random examples, but you will understand the concept, right?

Failover routing, again, means that the moment one particular data center is inactive or its health check has failed, traffic should automatically switch to another location.

And you can have multiple such routing policies in one GSLB as well. It's not a one-to-one mapping. You can have multiple features enabled. So once this is done, the content is now being served
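The four routing policies can each be written as a small function. This is a sketch with made-up latencies, country codes, weights and edge names; it is not how any particular GSLB product implements them.

```python
import random


def latency_based(latencies_ms):
    """Pick the edge with the lowest measured latency."""
    return min(latencies_ms, key=latencies_ms.get)


def geo_based(country, geo_map, default):
    """Pick the edge assigned to the user's country, else a default."""
    return geo_map.get(country, default)


def weighted(weights, rng):
    """Pick an edge at random, in proportion to its weight."""
    edges = list(weights)
    return rng.choices(edges, weights=[weights[e] for e in edges], k=1)[0]


def failover(ordered_edges, healthy):
    """Pick the first edge in priority order that is currently healthy."""
    for edge in ordered_edges:
        if edge in healthy:
            return edge
    return None


print(latency_based({"delhi": 12, "mumbai": 35, "chennai": 48}))
print(geo_based("UK", {"IN": "mumbai", "UK": "london"}, default="mumbai"))

rng = random.Random(42)  # seeded so the run is repeatable
picks = [weighted({"mumbai": 80, "pune": 20}, rng) for _ in range(10_000)]
mumbai_share = picks.count("mumbai") / len(picks)
print(round(mumbai_share, 2))

print(failover(["delhi", "mumbai"], healthy={"mumbai"}))
```

Policies can also be combined, for example by using `geo_based` to choose a list of candidate edges and then `failover` to skip the unhealthy ones.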

### How Content is Served from Edge Locations (CDN + Cache Miss) [8:34]

directly from your edge location in Delhi. Generally all the data is continuously getting cached here, okay? Only when there is a cache miss will this location go to the origin server, which is in Mumbai, maybe an EC2 instance, and get the data back. But generally the content is served from your edge location, and that is how you do distributed computing. So
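The cache-miss behaviour can be sketched as an edge cache in front of an origin. The `Origin` and `EdgeCache` classes and the segment name are hypothetical; a real CDN also handles expiry and eviction, which are left out here.

```python
class Origin:
    """Origin server: the source of truth for the content."""

    def __init__(self):
        self.requests = 0

    def fetch(self, key):
        self.requests += 1
        return f"data-for-{key}"


class EdgeCache:
    """Edge location: serves from its cache, goes to the origin only on a miss."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def get(self, key):
        if key in self.store:
            return self.store[key], "HIT"
        data = self.origin.fetch(key)  # cache miss: go back to the origin
        self.store[key] = data
        return data, "MISS"


origin = Origin()
delhi_edge = EdgeCache(origin)

first = delhi_edge.get("final-over-segment-1")
second = delhi_edge.get("final-over-segment-1")
print(first[1], second[1], origin.requests)
```

However many viewers in Delhi request the same segment, the origin sees only the first request.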

### The Real Reason Hotstar Buffered — TTL Trade-off [8:55]

coming back to the question where my friend got stuck, okay? He got stuck because he was asked why the buffering is happening. And the answer could lie in how you have set up your TTL, time to live. He could have answered it like this: "Sir, I assume, and this is my analysis, that while watching that IPL final, the user was already being served from one of the edge locations, and the time to live for that DNS answer was, for example, 300 seconds, which comes to 5 minutes. While he was watching, it is possible that this edge location suddenly went down and became unavailable, because of high load or something. The health check failed, but the TTL was still 5 minutes. So the client first has to let that TTL expire, let the rest of those 5 minutes pass, and only then will it go through the same process again. GSLB will come into the picture again, and then maybe the user would be served from a different edge location. So, to answer your question, I think the solution is to very, very closely monitor how you are setting your TTL. If you need fast recovery from failures, then your TTL should be lower. Maybe you can lower it to 60 seconds, but never set it to zero, because zero means every lookup goes back to the DNS servers and there would be severe load on them." So, you can reduce it to maybe 60 seconds, but there is a trade-off. As an architect, you
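A back-of-the-envelope sketch of this TTL trade-off. It assumes, very roughly, that each of the 50 million viewers mentioned in the description re-resolves the name once per TTL. Real numbers would be lower because resolvers cache answers on behalf of many clients, so treat the query rate as an upper bound and both helper functions as illustrative.

```python
def dns_query_rate(clients, ttl_seconds):
    """Upper-bound queries per second if every client re-resolves once per TTL."""
    return clients / ttl_seconds


def worst_case_stale_seconds(ttl_seconds, detection_seconds=0):
    """Longest time a client can keep using a dead edge: failure detection time plus TTL."""
    return ttl_seconds + detection_seconds


VIEWERS = 50_000_000  # figure taken from the video description

for ttl in (300, 60):
    rate = round(dns_query_rate(VIEWERS, ttl))
    stale = worst_case_stale_seconds(ttl)
    print(f"TTL {ttl:>3}s: about {rate:,} queries/s, up to {stale}s on a failed edge")
```

Going from 300 to 60 seconds cuts the worst-case time spent on a failed edge to a fifth and multiplies the DNS query load by five, which is exactly the balance the architect has to choose.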

### Architect Mindset: Always Think in Trade-offs [10:14]

have to always think in trade-offs. There is no one right or wrong answer between the two. If you're lowering the TTL, time to live, then you're increasing the pressure on your servers. So, while you're giving great performance, it is coming at a greater cost. You have to find a balance between the two, and that's where you have to define how much TTL you should be setting. So, this could be one of the ways to answer the question, but you can only answer these questions when you can explain this kind of an architectural flow. So, I hope, friends, this was useful, and next time when you are in any such interview, you can answer it better. Again, now your job is to go and study it in more detail, because our idea was only to bring this concept to you. If you like this video

### Free Resources + AWS Cloud Jumpstart [10:58]

give it a like, thumbs up, subscribe. If you want to learn AWS, I have created a master cheat sheet. So, if you're going for any cloud interview around AWS, this master cheat sheet explains AWS services in one line. You can imagine what a good refresher it could be for you before your exam or before your interview. And if you're serious about learning cloud from scratch as a beginner, I have a core program, AWS cloud jumpstart, where we understand step-by-step how to become a good cloud professional using AWS, and also how to use generative AI to reduce a lot of the work which we do on a day-to-day basis as cloud professionals, be it building architectures, building code or preparing for interviews. That new module has been added recently, so go check that out. The links will be in the description. And until next time, keep learning, keep sharing, and keep growing. Bye for now.