I Let Python Pick My March Madness Bracket - Bracket Simulation Tutorial
24:05

Corey Schafer · March 17, 2025 · 16,777 views · 536 likes

Video description
In this video, we'll be creating a March Madness bracket simulator using Python. We'll build a program that simulates tournament games with realistic probabilities based on team seeds, allowing for both expected outcomes and potential upsets. The tutorial covers using Python's dataclasses to create a Team class, implementing a weighted probability system for game simulations, and building a tournament function that processes all rounds until a champion is crowned. Whether you're filling out your own bracket or just interested in tournament simulations, this project offers a fun application of Python's random module with practical sports analytics. The code is available in the description below for you to customize and use. Let's get started... The code from this video can be found at: https://gist.github.com/27fcf83e5a0e5a87f415ff19bfdd2a4c ✅ Support My Channel Through Patreon: https://www.patreon.com/coreyms ✅ Become a Channel Member: https://www.youtube.com/channel/UCCezIgC97PvUuR4_gbFUs5g/join ✅ One-Time Contribution Through PayPal: https://goo.gl/649HFY ✅ Cryptocurrency Donations: Bitcoin Wallet - 3MPH8oY2EAgbLVy7RBMinwcBntggi7qeG3 Ethereum Wallet - 0x151649418616068fB46C3598083817101d3bCD33 Litecoin Wallet - MPvEBY5fxGkmPQgocfJbxP6EmTo5UUXMot ✅ Corey's Public Amazon Wishlist http://a.co/inIyro1 ✅ Equipment I Use and Books I Recommend: https://www.amazon.com/shop/coreyschafer ▶️ You Can Find Me On: My Website - http://coreyms.com/ My Second Channel - https://www.youtube.com/c/coreymschafer Facebook - https://www.facebook.com/CoreyMSchafer Twitter - https://twitter.com/CoreyMSchafer Instagram - https://www.instagram.com/coreymschafer/ #Python #MarchMadness

Table of contents (5 segments)

Segment 1 (00:00 - 05:00)

Hey there, how's it going, everybody? The March Madness brackets were released yesterday, and people are starting to fill out their brackets. This is a 64-team tournament, so filling out a perfect bracket is almost impossible. I actually don't think anyone has ever filled out a perfect bracket, at least not since they started tracking and verifying them. It's so unlikely that a lot of sites out there offer some pretty crazy prizes for filling out a bracket with a certain number of wins. So in this video, I thought it would be fun to make a bracket simulator that at least simulates games in the ballpark of what we might see in a real tournament. We're going to weight the teams by their seed so that better seeds are more likely to advance, and then we can adjust those weights if needed. At the end of the video, I'll run my script and use all the values that I get from this simulation to fill out my own brackets. Now, I don't really know anything about college basketball, so I figure the results it gives me will be just as good as anything I could come up with off the top of my head. I'll also be sure to post a link to this code in the description section below if anyone would like to run this themselves and play around with it a bit. Maybe one of us will get that perfect bracket, even if that's almost impossible. So, with that said, let's go ahead and get started.
First, let's import what we need for this project. We're going to need the random module to add some randomness to our simulations, and we'll also use dataclasses to create a simple Team class. Now let's go ahead and create that Team class using a Python dataclass. I haven't done a tutorial on dataclasses yet, but I plan to put one together in the near future. If you haven't used dataclasses before, they're essentially a shorthand way to create classes that are primarily used to store data.
They automatically generate certain methods, like the __init__ method, based on the fields that we define, so they're pretty clean and simple. For this project, we just need to track each team's name and seed. So I'll create a dataclass here called Team. Within it, we'll have a name, which is a string, and a seed, which is an integer.
Now we need to set up our tournament bracket. For a complete March Madness tournament, we'll need all 64 teams organized into their initial matchups. I've prepared this data ahead of time in a snippets file, so let me grab these teams. Hopefully I wrote all of these down correctly. I'll copy this in, and then we'll go over how I've arranged it. What we've created here is a list of tuples, where each tuple represents a matchup between two teams. Each team has a name and a seed number, like we defined in our dataclass. We've organized this by region so that it plays out just like a real tournament, and I've commented each region so that we know which is which: South, West, East, and Midwest. This is basically what the tournament is going to look like. Some brackets wait until the First Four games are done before they fill in the bracket. I just have it in here as is, since there are still a couple of games to be played. So this one is going to be either Alabama State or St. Francis, but either one of those will be the 16 seed, so we have a good idea of what this bracket is going to look like.
Now let's create a function to simulate a single game between two teams. We're going to update this to be more realistic later on, but for our first version we'll keep it really simple, just so we can test it. Let's set it up so that the team with the better seed always wins.
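A minimal sketch of the Team dataclass and the list-of-tuples bracket described above. Only the two matchups that are named in the video are included here; the full list has 32 first-round matchups, and the combined "Alabama State/St. Francis" name is just a placeholder for the undecided First Four game.

```python
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    seed: int


# Each tuple is one first-round matchup. This is a partial sample, not the
# full 32-matchup bracket used in the video.
first_round = [
    # South
    (Team("Auburn", 1), Team("Alabama State/St. Francis", 16)),
    (Team("Louisville", 8), Team("Creighton", 9)),
]

# The dataclass generates __init__ and __repr__ for us.
print(first_round[0][0])  # Team(name='Auburn', seed=1)
```

Because the dataclass also generates `__eq__`, two `Team` objects with the same name and seed compare as equal, which is handy when checking results.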
And if both teams have the same seed, then we'll randomly choose a winner between them. To do this, I'm going to create a new function called simulate_game that takes team1 and team2. First we'll check whether the seeds are equal. So I'll say: if team1's seed is equal to team2's seed, then we'll just return a random winner. I'll use random.choice and pass in both of those teams. Next, we'll check whether team1's seed is greater than team2's seed. The lower the seed number, the better, so we return team2 in that case. And finally, if it doesn't hit either of those conditionals, we just return team1, because that means team1 has the lower seed number. Okay, so now let's test this function with a single matchup to make sure that it works as expected. I'm going to grab the first matchup from our teams list and run our function on it. So, I'll say that the first matchup is equal to

Segment 2 (05:00 - 10:00)

first_round, since that's what we called the list, and I'll just grab the first value from it. Let's print out that first matchup, save that, and run it. And it says we have a problem. I think this is probably because I've updated my linter recently, and there are still a couple of things that I need to turn off. The code's fine. It's just telling me to use something other than random.choice if this is for cryptographic purposes, but it isn't, so that's fine. We can see that it gives our first matchup as Auburn versus either Alabama State or St. Francis. That's a number one seed versus a 16. So let's grab the winner of that. I'll say winner is equal to simulate_game, and we'll pass in Auburn as the first team, which is index 0 of that matchup, and then first_round with an index of 1. Let's print out the winner. The way we have it so far, the number one seed should win. Oops, I made a mistake. That shouldn't be first_round, it should be first_matchup. Let me rerun that, and we can see that our number one seed wins. So that's working so far. Like I said, I'll update the logic of the game simulation in just a bit to be more historically accurate, but I'm going to leave it like this for now so that we get predictable results while testing.
Now let's create a function that runs through all the games in the tournament. I'll create a function called simulate_tournament, and we'll pass in the first round as an argument. Oh, and I want to be sure that I define this as a function. We're going to keep track of the current games that need to be played. At the start of the tournament, this is just the first round of games, so I'll say current_games is equal to first_round. Okay. Now let's make a loop so that as long as we have games to simulate, we keep running through the code.
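The first, deterministic version of simulate_game and the single-matchup check walked through above could look roughly like this. It is a sketch reconstructed from the narration, so the exact identifier spellings (`team1`, `team2`, `first_matchup`) are assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    seed: int


def simulate_game(team1, team2):
    """First version: the better (lower-numbered) seed always wins."""
    if team1.seed == team2.seed:
        # Equal seeds: pick either team with equal probability.
        return random.choice([team1, team2])
    if team1.seed > team2.seed:
        # A larger seed number is the worse seed, so team2 advances.
        return team2
    return team1


first_matchup = (Team("Auburn", 1), Team("Alabama State/St. Francis", 16))
winner = simulate_game(first_matchup[0], first_matchup[1])
print(winner)  # Team(name='Auburn', seed=1)
```

The linter warning mentioned above is about `random` not being suitable for security-sensitive code; for a bracket simulation that concern does not apply.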
Each time it goes through this loop, it's going to be a new round. So I'll say while the length of current_games is greater than zero. For right now, I'll just print a newline followed by "New round". I'm also going to want to keep track of all the winners for this round, and right now that's just an empty list. Now let's loop through all the matchups in the current round, grab each team from each matchup, and simulate that game. So we can say for team1, team2 in current_games. team1 will be the first team in each of those tuples, and team2 will be the second. Then we can say winner is equal to simulate_game, passing in team1 and team2, and append that to our winners list. I could have done this as a list comprehension, but I think this is easier to read for people who don't know how to use list comprehensions. Now, outside of that for loop, if we're down to just one winner, that means we've found the winner of the tournament. So let's check for that. I'll say if the length of winners is equal to one, then return winners at index 0. Again, I've updated my linter here, so I think the error it's probably giving me is that I don't have a return outside of that while loop. I'll just return None for now, and that way the highlighting won't be so ugly. Okay. Back in our while loop, if we have more than one winner, that means we need to set up the next round. So let's create pairs of winners to face each other. The way we can do this is I'll create next_round equal to an empty list. And then I'll say for i in range, and I'll explain this in just a bit.
So for i in range, starting at zero, going up to the length of winners, with a step of two. Then we'll append to the next round. So I'll say next_round.append, and we'll append winners[i] and winners[i + 1]. Let me explain what we're doing here. Whoops, actually this needs to be a tuple. So now let

Segment 3 (10:00 - 15:00)

me explain this. If we remember, our first round is a list of tuples. What we're doing here is looping over our winners, starting at an index of zero, going up to the length of winners, with a step of two. Within the loop, we're appending to our next round. At the start, i is zero, so we take the winner at index 0, and i + 1 is 1, so we take the winner at index 1, and we append that tuple to the next round. Then we keep going until every winner has been paired with another team for a matchup. Let me just write a comment here: a list that looks like [1, 2, 3, 4, 5, 6] would get transformed into [(1, 2), (3, 4), (5, 6)]. So we can see that it takes our list of winners and appends them to the next round as tuples, which gives us our matchups. I hope that makes sense.
Finally, we need to update current_games with the next round of matchups that we just created so that the while loop continues with the next round. So I'll say current_games is equal to next_round. Oops, this should be current_games. There we go. We should never hit this return None at the end, but it's good practice to have a default return value.
Now let's take a look at this entire simulate_tournament function. Just to recap, this function takes our first round of games and simulates each matchup using our simulate_game function. It collects all the winners and then pairs them up for the next round, and it keeps doing this until there's only one winner left, which is the winner of the tournament. So let's run a quick simulation with our current setup to see what happens. First, I'm going to delete what we had up here where we were just simulating the one game.
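The pairing step on its own, pulled out into a small helper so it can be run in isolation. The name `pair_up` is hypothetical; in the video this loop lives inside simulate_tournament.

```python
def pair_up(winners):
    """Pair adjacent winners: index 0 with 1, index 2 with 3, and so on."""
    next_round = []
    # Step by two so each iteration consumes one pair of winners.
    for i in range(0, len(winners), 2):
        next_round.append((winners[i], winners[i + 1]))
    return next_round


print(pair_up([1, 2, 3, 4, 5, 6]))  # [(1, 2), (3, 4), (5, 6)]
```

This assumes an even number of winners, which holds for a power-of-two bracket. With an odd count, `winners[i + 1]` would raise an IndexError on the last iteration.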
Then down here, I'm going to say that our winner is equal to simulate_tournament instead of simulate_game. Instead of passing in just two teams, I'm going to pass in that entire first round. Then we'll print out that winner. If I save this and run it, let's see what we get. So, I messed something up. I should have seen this, because it underlined it for me: this should be winners instead of winner. Sorry about that. Now if I run this, that underline is still there. I think that's because it wants me to make it a list comprehension. Sorry, I'm still getting used to this new linter. I'm going to update my rc file after this video so that some of these errors don't pop up anymore. But I ran that, and we can see that it's printing each new round. We'll have more information about those rounds in just a bit. For the winner of the tournament, we can see that we got Florida, a number one seed. Since all of the number one seeds make it to the Final Four with the logic that we have now, it just randomly picks one of them to win in the semifinals and the final. So if I run this a few more times, we should just get a different number one seed. Sorry, I need to run the code up here. So we can see now we got Auburn, Florida, Duke, Houston. It's always going to be a number one seed with how we have it set up right now. Actually, let's make this print call an f-string. Instead of printing the whole team with the seed, let's start with a newline to split things up a bit, and then we'll say that winner.name wins the tournament. Okay, let's run that, and we can see that we have slightly better output. So our basic logic is working, and we can simulate a whole tournament.
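Putting the pieces so far together, here is a sketch of the full tournament loop with the placeholder "better seed always wins" rule. The bracket below is a hypothetical stand-in: the team names are made up from region and seed, and only the standard first-round seed pairings (1 v 16, 8 v 9, and so on) are assumed, not the actual 2025 field.

```python
import random
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    seed: int


def simulate_game(team1, team2):
    """Placeholder rule: the better seed wins, ties are a coin flip."""
    if team1.seed == team2.seed:
        return random.choice([team1, team2])
    return team1 if team1.seed < team2.seed else team2


def simulate_tournament(first_round):
    current_games = first_round
    while len(current_games) > 0:
        print("\nNew round")
        winners = []
        for team1, team2 in current_games:
            winners.append(simulate_game(team1, team2))
        if len(winners) == 1:
            return winners[0]  # only one team left: the champion
        next_round = []
        for i in range(0, len(winners), 2):
            next_round.append((winners[i], winners[i + 1]))
        current_games = next_round
    return None  # only reached if first_round is empty


# Hypothetical stand-in bracket with placeholder team names.
SEED_PAIRS = [(1, 16), (8, 9), (5, 12), (4, 13), (6, 11), (3, 14), (7, 10), (2, 15)]
first_round = [
    (Team(f"{region} {a}", a), Team(f"{region} {b}", b))
    for region in ("South", "West", "East", "Midwest")
    for a, b in SEED_PAIRS
]

winner = simulate_tournament(first_round)
print(f"\n{winner.name} wins the tournament!")
```

With this rule every region is won by its 1 seed, so the champion is always one of the four 1 seeds, chosen by the coin flips in the last two rounds. That matches the behavior seen when rerunning the script above.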
But if the better seed won every single game, that would be extremely boring and a crappy simulation, because in real March Madness we know that upsets happen all the time, which is what makes it so exciting. So let's update our simulate_game function to make it a bit more realistic. Instead of always picking the better seed, we'll weight the probability based on the seed numbers. Better seeds will still have an advantage, but upsets will be possible. I'm going to remove the function that we currently have, and I have an improved version in my snippets file that I'll paste in here, and then I'll explain what's going on. So, here within the snippets file, I'm going to grab

Segment 4 (15:00 - 20:00)

this, and then I'll explain exactly what we're doing here. Let me walk through it. Now we're calculating a weight for each team based on the inverse of their seed. So if team1 is a one seed, it gets a weight of 1/1, and if team2 is a 16 seed, it gets a weight of 1/16, which is 0.0625 as a decimal. That means the number one seed has 16 times the weight of a 16 seed. Then I convert those weights to probabilities. If I come down here, we're dividing each weight by the total of the two weights, and then multiplying by 100 to get a percentage. I don't actually have to multiply by 100 here. The probabilities work just fine as decimals, but I'm printing them out later on, so I'm doing that so we can see them as percentages. With these numbers, in a 1 versus 16 matchup, the number one seed has about a 94% chance of winning, while the 16 seed has about a 6% chance. Historically, I think a number one seed has about a 98% chance of winning against a 16 seed. I kind of like the idea of rooting for upsets, so I'll likely fill out my bracket with these weights as they currently are. But if we wanted to adjust these weights, I do have a commented-out section of code here that does that. In the commented-out code, we're basically using a power to adjust the weights. A higher power means a larger advantage for the better seed, because we'll be dividing by an even larger number for the worse seeds. I'm going to leave that commented out because I kind of like the idea of leaning more towards upsets, but the code will still be here, commented out, for anyone who wants to download this and play with these weights a little bit.
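The weight-to-probability arithmetic above can be checked with a short sketch. The helper name `win_probabilities` is hypothetical, and the `power` parameter is an assumption about how the commented-out adjustment works (weight = 1 / seed ** power); the video only says that a power is used and that a higher power favors the better seed.

```python
def win_probabilities(seed1, seed2, power=1.0):
    """Return each team's win chance as a percentage, from inverse-seed weights."""
    weight1 = 1 / seed1 ** power
    weight2 = 1 / seed2 ** power
    total = weight1 + weight2
    return weight1 / total * 100, weight2 / total * 100


p1, p16 = win_probabilities(1, 16)
print(f"1 vs 16: {p1:.1f}% / {p16:.1f}%")  # 1 vs 16: 94.1% / 5.9%

p8, p9 = win_probabilities(8, 9)
print(f"8 vs 9: {p8:.1f}% / {p9:.1f}%")  # 8 vs 9: 52.9% / 47.1%

# Assumed form of the power adjustment: squaring the seeds widens the gap.
p1_sq, p16_sq = win_probabilities(1, 16, power=2)
print(f"1 vs 16 with power=2: {p1_sq:.1f}% / {p16_sq:.1f}%")
```

For the 1 versus 16 case the weights are 1 and 1/16, so the favorite's share is 1 / (1 + 1/16) = 16/17, which is where the roughly 94% figure comes from.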
Now if I scroll down to the bottom, after we have these probabilities, we're using the random.choices function with these weights to randomly select a winner, and then we print out the matchup details so that we can see what's happening. Right here I'm printing out both teams: their name and the probability they had of winning that match. At the end, I'm printing out the winner of that match. Actually, just for clarity, let me also add in the team seed. I'll add team1's seed here and team2's seed here, right after we print each team name.
Okay, so I ran this exact logic with a large number of simulations earlier and tracked the stats. With the weights as we have them now, a number one seed wins the entire tournament about 75% of the time, a number two seed about 15% of the time, a number three seed around 5% of the time, and it gradually decreases for the other seeds. That's fairly close to historical outcomes. So now I should be able to run our code and get more realistic results than what we saw before. Just for fun, I'm going to run this one more time, and whatever results I get, I'm going to use to fill out my own actual brackets for this year. Fingers crossed that I don't end up with anything too insane, but I do have this set to slightly favor upsets, so it's possible I end up with a bracket that looks pretty wild. Let me run this and see what we get. Hopefully not a 16 seed winning the entire thing.
So I run this, and we can see that I have Florida winning the tournament. Now let's scroll up to the first round and see if anything crazy is going on. In our first round we have Auburn, a number one seed, and their probability of winning this match against the 16 seed is 94.1%, which is how we have it set up with our current weights. The winner was Auburn, so they had a 94% weighted chance of being picked by random.choices. Here we have an eight versus a nine, and we can see that this is weighted much closer: Louisville only had a 52.9% chance, and Creighton won with just a 47% chance. Now let's see if there are any major upsets. Right here we have an 11 seed beating Ole Miss. Let's see if there's anything else. We have Oklahoma, a nine seed, beating UConn. Colorado State, a 12 seed,

Segment 5 (20:00 - 24:00)

beating a five seed. Let's keep looking through here. Okay, this one is pretty unlikely: we have a 15 seed beating a number two seed. I said I was going to fill out my bracket with these results, so that's a little wild. That's probably not going to happen, but I'm going to fill it out like this anyway. Down here we have Troy beating Kentucky. Probably not going to happen, realistically. We're just using random values here, which most of the time produce realistic results. Normally I would run this a couple of times until it looked like something that I'd actually want to fill out on a bracket, but I promised I was going to go with the first result. So I'm going with this result, even if some of these picks are pretty unlikely. Let's scroll down to the Final Four and at least see if it looks like something that could actually happen. So we have a number two seed, a number one, a number one, and a number one. And in the top eight we have a four seed, two seed, one seed, three, one, two, one, two. That's fairly realistic if we look at historical data. This is usually about what you end up with in terms of the seeds that make it to those rounds. Okay, so after I finish recording the video, this is the bracket that I'm going to go fill out on a few different sites.
But before we finish up, there are a couple of things I wanted to mention about the code that we have here. First, let me close this and go up to our simulate_game function. Basically, I wanted to give you all a fun and easy way to simulate a tournament like this, and I thought this was the easiest way I could explain it in a short video. But there are tons of different ways that you could change the logic in the simulate_game function. Right now, we're just weighting by seed.
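A sketch of the seed-weighted simulate_game described above, plus the kind of repeated-run tally mentioned earlier. Several things here are assumptions: the bracket uses placeholder team names with the standard first-round seed pairings, the tournament loop is condensed with list comprehensions, the `verbose` flag is added so thousands of runs do not flood the console, and the trial count and random seed are arbitrary. The percentages it prints will vary and are not claimed to match the figures quoted in the video.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    seed: int


def simulate_game(team1, team2, verbose=False):
    """Pick a winner at random, weighted by the inverse of each seed."""
    weight1 = 1 / team1.seed
    weight2 = 1 / team2.seed
    total = weight1 + weight2
    prob1 = weight1 / total * 100
    prob2 = weight2 / total * 100
    # random.choices returns a list (one pick by default), hence the [0].
    winner = random.choices([team1, team2], weights=[prob1, prob2])[0]
    if verbose:
        print(f"{team1.name} ({team1.seed}) {prob1:.1f}% vs "
              f"{team2.name} ({team2.seed}) {prob2:.1f}% -> {winner.name}")
    return winner


def simulate_tournament(first_round):
    current_games = first_round
    while current_games:
        winners = [simulate_game(t1, t2) for t1, t2 in current_games]
        if len(winners) == 1:
            return winners[0]
        current_games = [(winners[i], winners[i + 1])
                         for i in range(0, len(winners), 2)]
    return None


# Hypothetical stand-in bracket with placeholder team names.
SEED_PAIRS = [(1, 16), (8, 9), (5, 12), (4, 13), (6, 11), (3, 14), (7, 10), (2, 15)]
first_round = [
    (Team(f"{region} {a}", a), Team(f"{region} {b}", b))
    for region in ("South", "West", "East", "Midwest")
    for a, b in SEED_PAIRS
]

random.seed(2025)  # arbitrary seed so reruns of this sketch are repeatable
trials = 2000
champion_seeds = Counter(
    simulate_tournament(first_round).seed for _ in range(trials)
)
for seed, count in sorted(champion_seeds.items()):
    print(f"seed {seed:>2}: {count / trials:.1%} of simulated titles")
```

Calling `simulate_game(team1, team2, verbose=True)` prints one matchup line with both seeds and probabilities, similar to the per-game output read through above.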
But if you wanted to get fancy, you could even add a power ranking for individual teams instead of relying only on the seeds. You could add a power attribute to the Team dataclass up here, and then when we initialize this first round, we could add in our own power value for each team. That way you could adjust the odds for specific teams that you think are underrated or overrated based on their seeds. So that would be one other way you could do it. Technically, you wouldn't even need to restrict the logic to this script. Another way is that we could connect to the ChatGPT API and have the simulate_game function send off each matchup so that AI chooses the winner for you. So you're not limited to the way that we've done it here. You can use any logic that you'd like in the simulate_game function, and however you do it, the rest of the tournament logic should still work just fine for running through the rest of the tournament. So feel free to update this function if you'd like to try your hand at improving these simulations.
But with that said, I think that's going to do it for this video. Hopefully you found this fun and interesting and have a pretty good idea of how you'd create your own bracket simulator using Python. Even if basketball isn't your thing, you could adapt this code for any tournament-style competition. And it doesn't have to be 64 teams. It could be a smaller tournament than that. If you have any questions about what we covered in this video, feel free to ask in the comment section below and I'll do my best to answer those. And if you enjoy these tutorials and would like to support them, there are several ways you can do that. The easiest way is to simply like the video and give it a thumbs up.
Also, it's a huge help to share these videos with anyone who you think would find them useful. And if you have the means, you can contribute through Patreon or YouTube. And there are links to those pages in the description section below. Be sure to subscribe for future videos. And thank you all for watching.
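The per-team power rating idea mentioned near the end of the video could be sketched as follows. This is one possible interpretation, not code from the video: the `power` field, its default of 1.0, and the choice to multiply it into the inverse-seed weight are all assumptions, as are the helper names `team_weight` and `win_probability`.

```python
import random
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    seed: int
    power: float = 1.0  # hypothetical multiplier; 1.0 keeps the plain seed weight


def team_weight(team):
    """Inverse-seed weight scaled by the team's individual power rating."""
    return team.power / team.seed


def win_probability(team1, team2):
    """Chance (0 to 1) that team1 beats team2 under these weights."""
    w1, w2 = team_weight(team1), team_weight(team2)
    return w1 / (w1 + w2)


def simulate_game(team1, team2):
    weights = [team_weight(team1), team_weight(team2)]
    return random.choices([team1, team2], weights=weights)[0]


# A 12 seed we believe is underrated, against an unadjusted 5 seed.
underdog = Team("Underrated U", 12, power=2.0)
favorite = Team("Favorite State", 5)

print(f"{win_probability(Team('Plain U', 12), favorite):.1%}")  # 29.4%
print(f"{win_probability(underdog, favorite):.1%}")  # 45.5%
print(simulate_game(underdog, favorite).name)
```

Because only simulate_game changes, the tournament loop that pairs winners and advances rounds can stay exactly as it is.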
