# Implementing DEL, EXPIRE, and Cleanup in Redis | Redis Internals

## Метаданные

- **Канал:** Arpit Bhayani
- **YouTube:** https://www.youtube.com/watch?v=yGdk0hmvkgo
- **Дата:** 02.05.2026
- **Длительность:** 27:23
- **Просмотры:** 1,687

## Описание

System Design for SDE-2 and above: https://arpitbhayani.me/masterclass
System Design for Beginners: https://arpitbhayani.me/sys-design

Build Your Own Redis / DNS / BitTorrent / SQLite - with CodeCrafters.
Sign up and get 40% off - https://app.codecrafters.io/join?via=arpitbbhayani

# Recommended videos and playlists

If you liked this video, you will find the following videos and playlists helpful

Redis Internals: https://www.youtube.com/watch?v=h30k7YixrMo&list=PLsdq-3Z1EPT0eElcdOON9fdaeaQjlyXDt
Redis Internals Reading Material: https://docs.google.com/document/d/1lHKI6bia3ZEPoAKgXWIm_SKv_Sii1qtN3EYMEdxKpEQ/edit?tab=t.0

System Design: https://www.youtube.com/watch?v=o7qLKfILuD8&list=PLsdq-3Z1EPT27BuTnJ_trF7BsaTpYLqst
Designing Microservices: https://www.youtube.com/watch?v=JPj6mhVLQN0&list=PLsdq-3Z1EPT0ug8eizS71G6LZb6-4FAFt
Database Engineering: https://www.youtube.com/watch?v=-htbah3eCYg&list=PLsdq-3Z1EPT2C-Da7Jscr7NptGcIZgQ2l&pp=gAQBiAQB
Concurrency In-depth: https://www.youtube.com/watch?v=2PjlaUnrAMQ&list=PLsdq-3Z1EPT3VjDhjMb5yBsgn0wn2-fjp
Research paper dissections: https://www.youtube.com/watch?v=LXhgFAZroG8&list=PLsdq-3Z1EPT2XEJ0AmF02LBK1RFNd-jK8
Outage Dissections: https://www.youtube.com/watch?v=LeT_s-UFw-U&list=PLsdq-3Z1EPT3_Z97svMs6S2y7tv1PcUmc

Hash Table Internals: https://www.youtube.com/watch?v=jjW8w8ED3Ns&list=PLsdq-3Z1EPT2UnueESBLReaVSLIo_BuAc
Bittorrent Internals: https://www.youtube.com/watch?v=v7cR0ZolaUA&list=PLsdq-3Z1EPT1rNeq2GXpnivaWINnOaCd0

# Things you will find amusing

Bookshelf: https://arpitbhayani.me/bookshelf
Papershelf: https://arpitbhayani.me/papershelf

# Other socials

I keep writing and sharing my practical experience and learnings every day, so if you resonate, then follow along. I keep it no fluff.

LinkedIn: https://linkedin.com/in/arpitbhayani
Twitter: https://twitter.com/arpit_bhayani

Thank you for watching and supporting! It means a ton.

I am on a mission to bring out the best engineering stories from around the world and make you all fall in love with engineering. If you resonate with this, then follow along. I always keep it no-fluff.

## Содержание

### [0:00](https://www.youtube.com/watch?v=yGdk0hmvkgo) Segment 1 (00:00 - 05:00)

So in the previous video, we implemented get, put, and TTL. In this video, we would be implementing delete, expire, and we would see how Redis does auto expiration of keys, which means auto deletion of expired keys. Right? So first, like always, let's see what the behavior of a normal Redis server is, and then we would be re-implementing it in our own Go lang based implementation. Like always, top right contains the normal Redis server running on port 6379, and bottom left is where we are connecting it through Redis CLI. Now let's say I set a particular key K with value V. Right? Now, if I do a delete, here you can see that delete is taking an argument, which is key. If I pass in K, the key would be deleted. The return value is one. What does this one mean? This one is the number of keys which are deleted. So delete, if I do delete K again, key K is already deleted, what would be the return value? It would return zero. So one is the number of keys that were successfully deleted. Because key K didn't exist, like after we deleted it, it does not exist anymore, when we hit the delete uh again on the same key, it returns zero. Right? Now if I set multiple keys, key K1 V1, set key K2 V2, and now if I do delete K1 K2 K3 K4. Two keys exist, K1 and K2, but I'm triggering delete of K1 K2 K3 K4. It returns me two. Two implies that out of this four, two keys were deleted, and two keys didn't exist. It's a no-op operation. So now if I do run this command again, we are getting zero because K1 K2 K3 K4, all four doesn't exist. Right? That's why it is returning me zero. This is what we would have to re-implement. Right? So this is how delete works. Second is let's see how expire works. Let me set a particular key K with value V, and now I'm triggering now what a use case of expire is, let's say I set a particular key. And while setting, I didn't uh provide any expiration, but then I realized, "Hey, now I want to set an expiration to this key. " So that's why what I can do is I can pass in expire function with K with key K, and the second argument is the number of seconds. Let's say I want to expire this key by 10 seconds. As I fire this, now if I do TTL of K, you can see the time elapsed, or the time to live for that is consistently decreasing, and decreasing, and decreasing. Right? So this is how your TTL was set with expire. Right? Now, if I what happens if I pass in a key that does not exist? Let's I pass in ABC. This key does not exist. First, because I didn't pass second argument, it gave me syntax error. Now if I pass in argument, it returns me zero. What does this zero mean? If Now this zero implies that the operation was unsuccessful. It could be because key does not exist. any reason. So if your Redis was able to set an expiration, it would return one. If it was unable zero. So here, if I do a set K {comma} V, and if I do expire K in 10 seconds, it returns me one because setting like setting the expiration was successful. Right? This is what the return value of expire is. Now given that we know how delete and expire behave, we would be re-implementing it on Go lang base. But this is not done yet because we also like after we go through that, we would be taking a look at how Redis does auto deletion. So if the key is expired, it would need to auto delete the key. Right? We would be implementing that as well. So that is a very interesting algorithm that we would look at the end. So first, let's start re-implementing this, and see and do a very and do an extremely detailed code walk-through. So what we would do is we would start with our normal eval. go file. Eval. go file is where we are implementing all the commands, and here we add delete and expire command. Delete, again all the arguments passed over here, and then I/O read writer, similarly for expire. So here if I check eval delete, what do we see? Eval delete, whatever argument we have passed are the keys, and we just have to iterate over them and explicitly delete it from the hash map that we have, or the hash table that we have. We are just doing that. We are iterating over all the keys, and we are triggering a delete. This delete is a store function written in store. go file. The objective of this is so that we mask the any complexities that might aware uh sorry, that might come up. So what it does is it just triggers a delete from the hash map store, delete the key K. Right? That's what it is doing. And it is returning true if the delete was successful, and is returning false unsuccessful. Right? And unsuccessful because key does not exist. As simple as that. So if key exist, we are deleting it and returning true. If key does not exist, we are returning false because this is what we need to do count plus because the return value of delete is the number of keys that are

### [5:00](https://www.youtube.com/watch?v=yGdk0hmvkgo&t=300s) Segment 2 (05:00 - 10:00)

deleted. Right? So we just iterated over the keys, trigger a delete. If the delete was successful, we do delete plus plus, otherwise we don't increment it there. And at the end, we just encode and return. So return value is an integer which states the number of keys that are deleted. Right? And now that we have this, if I check the encode function because it's an integer, just one change I've made over here is we earlier just added int64. Now I've added all flavors of integer because count deleted was a simple integer type. That's why we've added all flavors of integer over here, encoded in the exact same way, uh colon percentage D {slash} R {slash} N. Right? And this is how you would be implementing delete operation. Iterating over all the arguments, deleting it from the hash map. If deleted, count plus plus, and at the end, return the integer count. Right? Now let's see how expire works. Expire function sets the expiration. First of all, we do a basic check on the arguments because we require exactly two arguments over here. So if the length of arguments is less than equal to one, it's an error condition. Right? Second is the first argument becomes the key. And the second argument, you have to convert you have to consider it as an integer of 64, which would be the duration in seconds of expiry. Right? We set it over here. In case it is not an integer, error would be set. So we would can type error value is not an integer out of range. This is the exact Redis error that it throws. Then what we are doing is we are getting the key. Because to set the expiration, you have to get the object from the hash map. We got the object over here. In case the key does not exist, in case that object doesn't exist, what do we do? We return zero. Because if with expiration, if key we saw that if key didn't exist, we returned zero. Right? Because the operation was unsuccessful. We were not able to set. So what Redis document says that zero if timeout was not set, example key doesn't exist, or operation skipped due to provided arguments. Any reason if timeout was unsuccessful, uh setting timeout was unsuccessful, we return zero. Otherwise, we set the expiration. So expiration is set to what? We store expire set to absolute time at which the key would be expired. So it would be time. now. unixmillis plus expiration duration in seconds into 1,000 because we store it in milliseconds. Right? And at the end, we just return one as an integer response. A classic implementation of delete and expire. Right? So this is how delete and expires are implemented. So now let's look into how auto deletion happens. We saw how delete is implemented. We saw how expire is implemented. Now let's talk about how auto deletion would happen because now here up until this implementation, we deleted something from the hash table only when the delete command was issued. Right? While getting it, we were checking if it is expired, we were returning not found. Right? But we did not explicitly delete it. So let's see how deletion how Redis does key auto deletion. Because if you have set an expiry, after that expiration time is reached, I would want to actually delete the key from the hash table. How do we do that? That is where let's quickly take a look at how Redis expires the key. So in Redis, we can set expiration to the keys by typing expire K 10, expire key and the seconds. Right? Now here, why do we have to set expiration? So that there is no burden for manual deletion. You can just set and forget. You just set the expiration and forget, your Redis would do the clean up. We have to now write that clean up job. So how does that happen? So the first intuition that you might think, "Hey, it could be another thread that is running. " But remember, Redis is single threaded. That is where this is such a beautiful implementation. So Redis expires or Redis auto deletes the key in two modes. First is the passive mode. So And the second one is the active mode. So first thing, let's take a look at the passive mode. So what passive mode says is that a key is passively expired when some client tries to access it, and the key is found to be expired. Which means that if you're getting an object from the hash table, and if that object has some expiration set, right? And if that key is found to be expired, you would be immediately deleting it. Right? So you are accessing an object from the hash table, you got that object. If the expiration time is already passed, then you would be triggering an explicit deletion, and then returning null to your user that key does not exist. That is a passive one, which means that if your expired key is ever accessed, if an expired object is ever accessed, you are explicitly deleting it. Problem solved. But what about the keys that are never accessed? Which means a key's expiration is set, but that key is never get for some reason. Then key would be lying there only. So that is where you have another mode, which is an active mode. So what active mode does? This is a beautiful probabilistic or rather beautiful statistical algorithm that it

### [10:00](https://www.youtube.com/watch?v=yGdk0hmvkgo&t=600s) Segment 3 (10:00 - 15:00)

does. Now, bit of math, slight bit of math, but very interesting algorithm. So here what it does is given that the passive mode of deletion would take care of most the keys like most of the keys that needs to be deleted would be taken care from the passive mode, right? Because a key is very likely that you put it in the cache is very likely to be accessed and it that would be deleted. So now the keys that are in question are the one that are never accessed. That would not be much, right? many keys like this. So that is where what active mode does is active mode works with a very simple statistical theory that if you take a random sample from a population that random sample would represent the distribution of the population. Right? So here if you randomly pick a sample of 20 right? If you pick a If you randomly pick a sample of 20, the number of expired keys or the percentage of expired keys in this sample of 20 would be very close to the percentage of expired keys in the actual population. So that is where what it does is Redis runs a loop. Right? A simple loop that executes 10 times every second, which means every milli every 100 milliseconds it executes a loop. It's a timer It's an interrupt that happens. We'll see how it is implemented. 10 times every second it runs. What it does is it test random keys having some expiration set. So out of all the keys that are that you have added in the Redis, it would pick 20 keys at random whose expiration is set. Out of those 20 keys that it has picked at random, it would see how many keys of them are expired. It would first of all delete them. If these deleted keys or if these expired keys are more than 25% then it would repeat the process. Right? This is what would This is what is going to happen. So this would be a continuous process that is running whose job is to pick 20 key or pick 20 keys at random. Out of these 20 keys, find out how many of them are expired. First of all, delete them. See if these are more than 25%. If it is, which means that if your sample contains more than 25% of expired keys not deleted, your population would also contain more than 25% of keys that are expired but not deleted. This is what it plays with, right? And this is such a beautiful algorithm. Why? Because this is not the only way to delete it. You are anyway having a passive mode which would be doing the bulk of the deletion. The active mode is only taking care of the keys that are never accessed, that are expired but never accessed, right? So that would not be huge. And for a normal generic use case, this would not be huge. So this is where it plays very beautifully with statistics where it says that if my sample has 25% more than 25% of expired keys that are not deleted, my population will also have more than 25% of expired keys which are not deleted. So I have to repeat this process. Now why is it doing so? First of all, it does not have to iterate through all the keys because that is extremely costly. Redis is trying to minimize the overhead burden. You might think you hey, let me have another data structure in which I'm keeping all the expired keys to make it faster or something. Why to add that extra memory burden? Redis is already an in-memory database. It needs memory to store keys and values not some additional data structures, right? The second reasoning you can give is hey, if I'm not creating another data structure, then why can't I just iterate through all the keys? That is costly. Imagine you have million keys having expiration set of 1 year or something. Right? Why you are wasting time doing that? So that is where this quick sampling See, most cases the passive mode would do the job, right? For edge cases where you would have a chance of memory leak, you are just playing by So no additional data structures, no additional space requirements, just a random sampling, testing it and the here it does this exactly 20 keys. And how does it comes to this number 20? It's normal central limit theorem. Right? Detailed like proof of it is a little bit out of this scope, but if you Google it, you'll find an amazing resource on it. Right? Okay. So the 20 keys it samples out of the 20, it sees how many are expired but not deleted. It deletes them and repeats the process if it is more than 25%. As simple as this, right? It does not have to go through all this part. Now, where do we put this? A normal intuition would say hey, let this be a separate thread. Redis is single-threaded. You cannot have a separate thread. So then how would you do that? That is where it's all about structuring the code. Redis does it in a different way, but to make the explanation simpler, we would put it

### [15:00](https://www.youtube.com/watch?v=yGdk0hmvkgo&t=900s) Segment 4 (15:00 - 20:00)

in a place that makes it very sensible, very easy to understand and then I'll tell you how to make it complicated, right? Okay. So now let's see how this fares. So again, we'll do an extremely deep code walk-through on see and how would you write an auto deletion. So given that this cannot be a separate thread, where do we put it? The only place where we can put it in our event loop. Right? The event loop that we wrote, that is where we have to put it. The infinite for loop that is testing that is waiting on epoll_wait. Just before that we would add it. But what's the frequency? Although Redis runs it 10 times every second, we'll not run it very frequently. Let's say we run it at a frequency of at max once every second, right? So I'm setting up a cron frequency. This could be a configuration if you want to. A cron frequency of 1 second. So time duration of time 1. second and we would maintain last time when it ran, right? So let's say last time it ran was now when my server starting. This is exactly the time when it was executed first. Right? So what we can do is in the async server that we have written, the gigantic for loop that we wrote, which is where the first thing we would do is execute this cron. Where the thing is we would check if time. now is after your last cron execution time add cron frequency, which means that now that my time So when this loop would execute, when my control flow would come over here, it would check that hey, is this like is my current time more than last execution plus cron frequency, which means if the time if the enough time is passed, then I'm running this delete expired keys as a simple cron. And then I'm setting my last cron execution time equal to time. now. Very simple. And then our normal flow would work, which is where we are doing epoll_wait and reading the events and accepting the connections and doing that part, right? That would work just fine. We are just plugging our logic at the first thing in my infinite for loop that we are running, right? So if enough time has passed, delete the expired keys and This does not guarantee that it would run exactly once every second. Every time the control flow would come over here, it would run it. Right? A better way Now here you can see that you cannot because it is single-threaded. It's not separate thread that would run exactly 1 second after every second. It is single-threaded until and unless my code execution flow does not come over here or my control here, this would not execute, which is fine. Which is perfectly fine, right? There is nothing wrong in it, right? How Redis does it, it uses interrupts, right? And then you can configure a kernel level interrupt and then it would run it. Not making overly complicated, this would work just fine because we are just understanding how single-threaded high-performance database systems could be built. Right? So now let's see what my delete expired keys function look like. Everything else is pretty clear. When I'm doing time. now, if my time. now is after last cron time execution plus cron frequency, then I'm triggering this, right? So now what it would do? It is another infinite for loop in which although I should have had it bounded, but that's fine. We are just writing a dummy implementation of it. What we are doing is writing an infinite for loop whose job is to do an expired sample, right? And it is just doing expired sample and getting the fraction. If it is fraction is less than 0. 25, then break. So which means if I have deleted more than 25% of the keys, which means in my sample if there were more than 25% of the keys that were expired but not deleted, then I would have to continue, otherwise I'm breaking the loop, right? And I'm just printing in the log deleted the deleted expired but undeleted keys. Total number of keys which are present right now after the deletion is stored is the length of store, which the hash. I'm just printing it for our purpose that indeed the deletion has happened. Right? Okay. So now that delete expired keys is set over here, now what expired sample does? Expired sample does this part. So what we do is again writing a sampling code versus Golang's map iteration is pretty random. The order in which a hash map or a map of Golang is iterated is random depends on the hashing function that it is used. So that is where let's assume that iteration of a hash is not in any order but in a random order. So what we are doing is because we have a limit of 20, we would be and we can sample 20 keys that are expired that have some expiry set. So the flow is I'm iterating through the hash map, right? map. If object. expired_at is not equal to minus one, which means some expiry is set, I'm doing limit minus minus, and if expired at is less than equal to time. now. unixmillisec, which means the key is expired, already expired, I'm triggering a delete.

### [20:00](https://www.youtube.com/watch?v=yGdk0hmvkgo&t=1200s) Segment 5 (20:00 - 25:00)

And I'm doing expired count plus plus. And if I have found 20 such keys whose expired is not Oh, sorry. If whose expired is set, this limit would become zero. I'm breaking the loop here. Right? So, which means that this loop would run up until I find 20 such keys whose expiration is set. Right? And then uh because expired count would hold the number of keys that I've actually deleted, then I can just take a fraction of it. Expired count divided by 20. Right? So, if this fraction is less than 25 more less than 25%, which means less than 0. 25 or more than 70. 25, we would be taking up this column. Right? This is the active mode of deletion. Now, you see how it would work, right? In the single-threaded that are we in the event loop, the first thing that we are doing is seeing if my last execution of this cron was before or rather it was done well before, which means I've I'm well past my cron frequency, I would trigger this, in which I'm sampling 20 and seeing how many of them are expired. I'm deleting those expired and returning the fraction. If this fraction is less than 25, I'm breaking the loop. Otherwise, the loop would continue. A normal active deletion uh flow would happen like this. Right? Now, what about passive deletion? Now, here you see the beauty of it. That's why we have abstracted the implementation. So, here, if you check the store. go file, what we know that every time a file every time a key is accessed, we check for the expiration. If it is already expired, we are deleting it or we have to delete it. Right? So, that is where in the get function, and which is one of the reason why I didn't expose my hash map literally in my eval file, so that I we can have we can mask this thing over here. So, in the get function, we are getting the value for a particular key. If value is not equal to nil, which means value exists, I'm checking the expiration. If expiration is less than time. now. unixmillis, then I'm deleting the key and returning nil, which is what would be written over here if the key does not exist. If the value exists and it is not expired, I'm just returning V, which means an unexpired key would delete as is. If the key is made to expiration, we are passively deleting it. Right? So, this is like a lazy deletion. Right? So, with this lazy deletion, if an key is accessed and found to be expired, it's deleted. Right? If key is not expired, it is moved forward. And then periodically, we are running this job, which is an active deletion cycle in a single-threaded event loop, whose job is to periodically randomly sample 20 keys and see if like see how many of them are expired but not deleted, you delete them. And once you have deleted it, you re-repeat the loop until that fraction comes down below 25%. This is how Redis does expiration of keys, and this is how you would be implementing or you would be re-implementing it in Go lang. So, now let's look at a very quick demonstration of this. I'm running our server on port 7379. And now, what do we do is I'm just connecting the Redis client on port 7379. Here, you can see the first output is deleted the expired but undeleted keys. Total keys equal to zero. Right now, I've nothing. There were no keys to be deleted. The output of this is total number of keys are zero. Right? Now, let me do a set of key K with value V. No expiration set. It does Okay. If I check TTL of K, I get minus one. Right? Because no expiration is set. And now, what I'm doing is I'm setting expire of key K to value 10, which means in 10 seconds, this key would be expired. If I check this, here you can see the timer starting. And on the top, you can see total number of keys is one. Keys is one. Keys is zero. Right? So, we did not delete it. The loop automatically deleted the particular key. And here, you can see how my total number of keys are zero. Right? Because the loop that we are running, its job is to actively delete the keys. Right? While if the key is getting accessed, it would be automatically deleted if it is expired. So, let's take that as an example as well. So, let's say I set K {comma} V, do expiry 10, and I'm doing normal get. Here, you would see as soon as I'm get doing get get get, and if the key would have expired and I'm and I would have done a get on it, it would start returning me nil. So, without me accessing it, because I'm doing a get of it, which means I'm accessing an expired key Well, I'm accessing a key which is found to be expired, it is automatically deleted. Lazy deletion. So, passive deletion in action, loop in action. And this is how you would be implementing auto deletion. Right? And now that we are on the topic, let's quickly see how we can set expiration of it with all the edge cases. Set K {comma} V. If I forgot to set expiration, I can just do this. I can type in Let me first do a deletion normal deletion of key K1, K2, K3, K4. I

### [25:00](https://www.youtube.com/watch?v=yGdk0hmvkgo&t=1500s) Segment 6 (25:00 - 27:00)

should see zero because key K1, K2, K3, K4 doesn't exist, but key K exists. If I do this, it returns zero. But as soon as I am adding K to it, integer is one because one key was deleted. Right? Delete functionality working. If I do this again, output is zero. Right? Similarly, with expire, if I'm passing a key that does not exist, it would return me zero because operation is not completed. There was no key to set. Right? But if I do set K {comma} V, and then I set expire 10, the output is one. Right? A classic implementation of delete and uh expire. Right? Great. We implemented delete, expiration, saw how Redis does auto deletion, active mode, passive mode. We implemented everything. Right? Okay. So, what's next? Now, here, you can see a very nice thing which is pending. The pending is although we deleted it, right? We expired We deleted the keys explicitly by deleting. We did auto deletion for the keys which are expired. But what if because your Redis stores everything in RAM, what would happen if I'm just inserting keys bluntly? Right? Someday, your RAM would get filled. What happens when there is no more space to allocate? Right? No more like you run out You are running a malloc, but that has no space to allocate something. What would happen? In that case, your program crashes. But you don't want your server to crash, right? Because your end user requests are will get failed. So, what do you do? A classic guess thing is you would do eviction. Which means that when your cache is full, you do eviction. So, in the next video, we would look at how Redis does eviction. We would implement a very simple eviction strategy, which works on the number of keys. We'll not make it overly complicated. We'll just implement it that hey, at max my Redis server or our Redis implementation can have, let's say, five keys just as an example. Right? And we would see as soon as we are trying to add the sixth key, it would evict one and put our the new key that we are trying to insert in it. A classic eviction strategy. Right? This is what we would be implementing in the next one. I hope you found it amazing. That is it for this one. I'll see you in the next one. Thanks, everyone.

---
*Источник: https://ekstraktznaniy.ru/video/49981*