# Python Tutorial: Itertools Module - Iterator Functions for Efficient Looping

## Metadata

- **Channel:** Corey Schafer
- **YouTube:** https://www.youtube.com/watch?v=Qu3dThVy6KQ
- **Date:** 13.11.2018
- **Duration:** 45:49
- **Views:** 242,752
- **Source:** https://ekstraktznaniy.ru/video/12054

## Description

In this Python Programming Tutorial, we will be learning about the itertools module. The itertools module is a collection of functions that allows us to work with iterators in an efficient way. Depending on your problem, this can save you a lot of memory and also a lot of work. Let's get started...

Functions Covered in This Video:
count - 1:19
zip_longest - 6:48
cycle - 9:17
repeat - 11:09
starmap - 14:06
combinations - 15:34
permutations - 15:34
product - 19:45
chain - 21:40
islice - 23:37
compress - 28:50
filterfalse - 31:49
dropwhile - 32:24
takewhile - 32:24
accumulate - 34:54
groupby - 37:04
tee - 43:28

The code from this video can be found at:
https://github.com/CoreyMSchafer/code_snippets/tree/master/Python/Itertools

Iterators Tutorial:
https://youtu.be/jTYiNjvnHZY

Sorting Tutorial:
https://youtu.be/D3JvDWO-BY4



## Transcript

### Introduction [0:00]

Hey there, how's it going everybody? In this video, we're going to be taking a look at the itertools module. itertools is a collection of tools that allows us to work with iterators in a fast and memory-efficient way. Now, if you don't know what an iterator is, it's basically sequential data that we can iterate or loop over. I would recommend being familiar with the concept of iterators and generators before watching this video. I do have a video that I released a little over a week ago that I'll leave in the description section below if you want to watch that first. It's probably not absolutely necessary, but understanding how an iterator is exhausted, or how some only hold one object in memory at a time, is going to help with understanding how itertools works as well.

So the itertools module contains a number of commonly used iterators as well as functions for combining several iterators. The itertools module is available in the Python standard library, so there's nothing that you need to install in order to use it. If you're looking for a specific itertools function, I'm going to try to put a timestamp in the description section below for each of the functions that we go over in this video, so you can look there for a timestamp to whichever function you're looking for.

So first off, let's go ahead and import this so that we can see what we can do with it. First I'll just say `import itertools`. Okay, so right off the bat, let me show you one of the simplest itertools functions.

### count [1:19]

The first function that we're going to look at is called count, and just like it sounds, it simply returns an iterator that counts. We can use it just by saying `counter = itertools.count()`. By default, if we don't pass in any arguments, count will start at 0 and count up by one each iteration, and this is just going to go on forever. So right now, if we were to loop over this, it would start counting up from zero by one and never stop, and if you're not careful you can get stuck in an infinite loop. I'm going to show you what this looks like, but I wouldn't recommend you do this on your machine, because sometimes it's hard to stop it from executing and it can freeze up your computer or program. I'm going to try to run this and just stop it quickly. So I'm going to say `for num in counter` and we will just print out num. Okay, so I ran that and then quickly stopped it by hitting Ctrl+C, but even in that short time it already counted up by a lot; we already got up over two hundred thousand here.

Now, if you are familiar with iterators, then we can actually get the next value without using a for loop. We can simply use the next function and pass in our iterator. So instead of looping through it like this, I will remove this for loop and just say `print(next(counter))`. Let's save that and run it, and we can see that we get a 0. Now if I copy and paste this a few times and run it again, then we can see that each time we run next we get 0, 1, 2, 3, and it just keeps on going. This is one big key takeaway from these iterators: some of them can go on forever, but even if they do, we can still just get one item at a time.

Okay, so why is this useful? Well, there are a lot of different scenarios where this might be useful, but let's just look at one example. It's very common to have a list of values and want to have some kind of index assigned to them. To show an example, I'm going to comment out these print statements and make a quick list called data with 100, 200, 300, and 400, so here we just have a list of some random numbers. Let's say that we wanted this data to be paired up with an index value for some reason. For example, let's say that our data was collected on a daily basis and we wanted to graph it, so we want to pair it up by saying 100 is associated with day 0, 200 with day 1, 300 with day 2, and so on. If we don't know how much data there's going to be, then we can simply use the count function to provide index values for any amount of data. So I will say `daily_data` is equal to, and I'm going to use the zip function here, and then I will pass in `itertools.count()` as the first argument and our data list as the second argument.

Now if you don't know what the built-in zip function does, basically it combines two iterables and pairs the values together. So it will get the first value of the count function, which by default is 0, and pair it with the first value of data, which in this example is 100, and then it will move on and pair 1 with 200, 2 with 300, and so on. The zip function returns an iterator itself that needs to be looped over in order to get all the values. If we want, we can simply convert the results to a list and get them all at once, so I could just wrap all of that in a list function. Now that we've done that, let me print out our daily_data variable. We can see that it paired those values up in our result, and since we used itertools count, it just kept grabbing the next value until our data list had gone through all of its values. So that count function will work with any size of data.

Okay, so we can also pass some arguments into our count function in order to start at a different place, and we can step by a different amount as well. Let me uncomment these print statements, and for now I'm just going to comment out our data example and move these print statements up here. If I wanted to start from something other than zero, then I could simply pass in `start=5`, for example. If I save that and run it, we can see that now our counter starts at five and we get 5, 6, 7, 8. Now it's still counting up by one, and we can also pass a step argument to change how the counter is incremented. So if I started at five, then I could also pass in `step=5`. If I save that and run it, then we can see it now starts at five and counts up by five each time, so 5, 10, 15, 20. The counter is pretty versatile: it can also count backwards and by decimal numbers as well. So if I was to instead say that our step is -2.5 and run that, then we can see that it starts at 5 and subtracts 2.5 at each step and even goes into the negatives. Okay, so that's a look at the count function.
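The count examples from this chapter can be sketched roughly as follows. The names `data` and `daily_data` follow the walkthrough; `first_four` is a hypothetical helper name added here so the result can be inspected without an infinite loop.

```python
import itertools

# Sample data from the walkthrough.
data = [100, 200, 300, 400]

# zip stops at the shortest iterable, so pairing with the infinite
# counter is safe: it ends as soon as data runs out.
daily_data = list(zip(itertools.count(), data))
print(daily_data)  # [(0, 100), (1, 200), (2, 300), (3, 400)]

# start and step accept any numbers, including negative and fractional ones.
counter = itertools.count(start=5, step=-2.5)

# first_four is an illustrative name: pull four values with next()
# instead of looping over the infinite iterator.
first_four = [next(counter) for _ in range(4)]
print(first_four)  # [5, 2.5, 0.0, -2.5]
```

Pulling values with `next` (or bounding the loop some other way) is what keeps an infinite iterator like this from running forever.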

### zip_longest [6:48]

Now, since we've already seen an example using the built-in zip function down here, let me show you an itertools function that is just like this, except it doesn't end until the longest iterable is exhausted. Again, the built-in zip function ends on the shortest iterable. This itertools function is called zip_longest, so a pretty obvious name for what it does. Remember, this pairs iterables together, so if it doesn't end until the longest iterable is exhausted, then it will need to pair the leftover values with some placeholder. By default that's going to be a None value.

So let me comment out these lines up here and uncomment this example where we used the zip function. This worked before because zip ends after our shortest iterable is exhausted. We're not going to be able to use the itertools count function with zip_longest the way that we have it right here, because we're converting it into a list, and since count goes on forever, that would just run out of memory trying to convert it to a list. You could still use count and just not cast it to a list, but then you would need to get the next values one at a time. Let's leave this list here and simply replace count with a range of values instead, so I'm going to switch count out and just say `range(10)`. If I run this right now while we're still using the regular built-in zip, then we can see that it runs just like it did before: it pairs those values up until our shortest iterable is exhausted, and the shortest iterable here is data. So as soon as all of our data is paired, it cuts off and doesn't pair the rest of our range.

But if instead we use zip_longest from itertools, so I'll say `itertools.zip_longest`, and save that and run it, then we can see that it pairs those iterables together, but when our data variable runs out, it continues and pairs the rest of our range with None values. So that can be useful depending on the type of problem that you're trying to solve. We can see here that it paired up the values, and this is where our data ran out, but then it continued with the range and put None values here for 4, 5, 6, 7, 8, and 9.

Okay, so that was a quick detour looking at the zip_longest function, and now let's get back to looking at a couple more itertools functions that can go on indefinitely. So let's look at the cycle function.
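A small sketch of the comparison described in this chapter, assuming the same `data` list as before. The names `short_pairs` and `padded_pairs` are illustrative and not from the video.

```python
import itertools

data = [100, 200, 300, 400]

# Built-in zip: stops when the shortest iterable (data) is exhausted.
short_pairs = list(zip(range(10), data))
print(len(short_pairs))  # 4

# zip_longest: keeps going until range(10) is exhausted and fills the
# missing data values with None.
padded_pairs = list(itertools.zip_longest(range(10), data))
print(len(padded_pairs))   # 10
print(padded_pairs[3:6])   # [(3, 400), (4, None), (5, None)]
```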

### cycle [9:17]

Cycle also returns an iterator that goes on forever. Basically, it takes an iterable as an argument and will cycle through those values over and over. Let me get rid of this example that we have here and uncomment this so that we can see an example of the cycle function. Instead of count, I'm going to say `itertools.cycle`, and this takes in an iterable, and like I said, it's just going to cycle over these values over and over. So I will just pass in a list of 1, 2, 3, and 4. For our print statements, I only have four here right now, so let me make this six total so that we can see exactly what this is doing. I will save this and run it, and let me zoom in a little bit. Okay, so we can see that our output is just 1, 2, 3, 4 and then 1, 2 again: it loops through the list that we passed into cycle, but once it hits the end of the list, it just cycles back through.

So that's a pretty simple concept, but there's a lot that you could do with it. For example, if you wanted to simulate a switch getting turned on or off or something like that, then you could simply create a cycle with two values. That could either be a 1 and a -1 or 0, or you could even pass in strings of "On" and "Off". So if I came up here, instead of a list I'll actually use a tuple just to show a different data structure, and I'll say "On" and "Off" as our two values in that tuple. If I save that and run it and scroll up to the top here, we can see that it just cycles between the values On, Off, On, Off each time the next value is fetched. So that's a look at the cycle function. It's super simple, but there are a lot of different ways that we could use it. Okay, so now let's look at the last infinite iterator in itertools.
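The on/off switch idea can be sketched like this. The names `switch` and `states` are illustrative, and the list comprehension stands in for the six separate `print(next(...))` calls used in the video.

```python
import itertools

# cycle accepts any iterable; a tuple works just as well as a list.
switch = itertools.cycle(("On", "Off"))

# Fetch six values one at a time; the iterator wraps around after "Off".
states = [next(switch) for _ in range(6)]
print(states)  # ['On', 'Off', 'On', 'Off', 'On', 'Off']
```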

### repeat [11:09]

That's going to be the repeat function, and this one is also super simple. It just takes some input and repeats it indefinitely. So for example, if I was to come up here and say `itertools.repeat` and pass in a value of 2, and save that and run it, then we can see that it simply repeats the same value over and over each time we fetch the next value. We can set a limit on this too. If I was to instead say `times=3` and save that and run it, then if I scroll up here a little bit, we can see that it repeats the value the first three times and then it throws a StopIteration exception. If we had run that through a for loop, then it would have just looped through three times and we wouldn't have seen that exception, since the for loop handles StopIteration exceptions for us.

Now, on the surface this might not seem very useful, but it's usually used for passing a stream of constant values to functions like map or zip that also operate on iterables. For example, if you wanted to get the squares of the values 0 through 9, then you could do something like this. I'm just going to comment out these print statements and type out an example, and I actually got this from the Python documentation. I'm going to say `squares` is equal to map, and with map we'll use the pow function, which raises a value to a power, and we will pass in arguments of `range(10)` and also `itertools.repeat(2)`.

Now if you don't know what map does, basically it takes a function, in this case pow, which raises a number to a certain power, and then it takes iterables and uses the values from those as arguments to that function. It will loop through and pass values from those iterables into the function until the shortest iterable has run through all of its values. So this will start off by passing in the next value from our range, which starts at 0, and it will also pass in the first value of our repeat iterable, which is always going to be 2. So the first value it calculates is 0 to the second power, the next time through it'll do 1 to the second power, then 2 to the second power, and so on. If I print this out, it needs to be converted to a list as well, otherwise it's just going to be an iterator waiting to be iterated on. So I'll print out a list of those squares and run that, and now we can see that we got a list of the squares of all the values from 0 to 9. So that's a little more practical example of where you would use the repeat function. Like I said, it's usually used for passing a stream of constant values to a function like map or zip.
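Both uses of repeat from this chapter can be sketched as follows. `squares` follows the walkthrough; `twos` is an illustrative name.

```python
import itertools

# With times set, the iterator is finite; list() consumes it without
# ever surfacing StopIteration, much like a for loop would.
twos = list(itertools.repeat(2, times=3))
print(twos)  # [2, 2, 2]

# Without times, repeat(2) is infinite. map stops as soon as the
# shortest iterable, range(10), is exhausted.
squares = list(map(pow, range(10), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```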

### starmap [14:06]

Since we're already using the map function, this would be a good time to look at another itertools function that modifies the map function a little bit, and this is called starmap. Basically, starmap is very similar to map, but instead of taking arguments from separate iterables like we're doing here, it takes arguments that are already paired together as tuples. So for example, let's use starmap to get these first few powers. Instead of passing in range and this repeat here, I'm just going to pass in a list of tuples that have the arguments already paired together, and I'm just going to do the first few. So I'll do 0 to the second power, then 1 to the second power, and then 2 to the second power, and instead of map we want to use `itertools.starmap`. If I save that and run it, then you can see that it passed each of our tuples as arguments into the pow function, and we got 0, 1, and 4 for the results. So it's pretty similar to map, but it takes its arguments as a list of tuples instead. Depending on the problem that you're trying to solve, that might be more useful for you.

Okay, so far we've gone over itertools functions that produce iterators that can go on forever, but now let's look at some useful functions that will eventually terminate.
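A minimal sketch of the starmap example, using the three argument tuples mentioned above. The name `pairs` is illustrative.

```python
import itertools

# Each tuple is unpacked into the arguments of pow: pow(0, 2), pow(1, 2), ...
pairs = [(0, 2), (1, 2), (2, 2)]
squares = list(itertools.starmap(pow, pairs))
print(squares)  # [0, 1, 4]
```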

### combinations [15:34]

So first let's go over two of the more popular itertools functions, and those are combinations and permutations. These allow us to take an iterable and return all of the combinations or permutations from that iterable. If you don't know the difference, basically combinations are all the different ways that you can group a certain number of items where the order does not matter, and permutations are all the different ways that you can group a certain number of items where the order does matter. Let's look at some examples and this will be more clear. I'm going to use a couple of lists here from our snippets file, and I'll post this snippets file with the code in the description section below if you want to use it as well. I'm just going to grab three code snippets here and paste these up at the top, and to clean up what we have so far, let me go ahead and just remove what we already have here and save that.

Okay, so for this example I'm going to use our letters list here that consists of the values a, b, c, d. Let's say that I want to get all of the possible combinations of two values from those letters. To do that I'll say result is equal to `itertools.combinations`, and I want to get the combinations of the letters, all the different combinations of two values. Now let me do a for loop where I say `for item in result`, print the item. So let me save that and run it. Okay, so we can see that when we loop through those it gives us all the different combinations of two values that we can make from our original list of a, b, c, d. Remember, with combinations the order does not matter, so we can see that we have a combination here of (a, b) but we don't see a combination of (b, a), and that's because they are seen as the same combination. Sometimes this is exactly what we want. For example, if we're simulating a poker hand or something like that, then it wouldn't matter what order your hand is in, so an ace-king would be the same thing as a king-ace, and you wouldn't need to produce both of those combinations.

Now if order does matter, then you'll want to use permutations instead. That will give you all of the different ways that you can group a certain number of items where the order does matter. So I'm going to just replace combinations here with permutations, save that, and run it. Now if I scroll up here, we can see that we got more values, and that's because we're getting all the different ways that we can arrange two values from our original iterable where the order does matter. So we can see that we have a value for (a, b) here at the top, but if we scroll down a little bit, then we also have a permutation for (b, a) as well. You can think of this as trying to simulate all of the different possible results of a race or something like that: if a came first and then b, that would be different than b coming first and then a. So permutations is what you would want to use in a situation like that.

Now, one thing that you'll notice is that combinations and permutations don't repeat values. What I mean by that is, for example, with our permutations here we have all the different ways that we can arrange two items from our original list, but it doesn't repeat any of those values. So it doesn't give us (a, a) as one of the permutations, and that's because there's only one a in our original list. But what if we wanted to allow repeats? For example, let's use our numbers list, so you can see we have a numbers list here of 0, 1, 2, and 3. Let's pretend that we wanted to see all the different ways that we could create a four-digit code using these values, and that would include repeats, so we could have a code that's 0 0 0 0 or 0 0 0 1 or anything like that. Well, for that we couldn't use combinations or permutations, because those will only give us the ways that we can arrange the values in that list.
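The difference between the two functions can be sketched as follows, assuming the `letters` list described above. `combos` and `perms` are illustrative names; the video loops over a variable called `result` instead.

```python
import itertools

letters = ["a", "b", "c", "d"]

# Order does not matter: ('a', 'b') and ('b', 'a') count as one group.
combos = list(itertools.combinations(letters, 2))

# Order matters: both ('a', 'b') and ('b', 'a') are produced.
perms = list(itertools.permutations(letters, 2))

print(len(combos))           # 6
print(len(perms))            # 12
print(("b", "a") in combos)  # False
print(("b", "a") in perms)   # True
print(("a", "a") in perms)   # False, values are never repeated
```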

### product [19:45]

If we wanted the values to be able to repeat, then we could use the product function, which will give you the Cartesian product of the iterables that you pass in. If we only pass in one iterable, then we can tell it how many times we want it to be able to repeat those values. With that said, let's look at an example. I'm going to say `itertools.product`, and we're going to use our numbers here, and let's say that we want to set `repeat=4`. What this is going to do is give us all the different ways that we could arrange these numbers where repeats are allowed. If I save this and run it and scroll up to the top, then we can see that this gives us what we were looking for: all the different ways that we could arrange these numbers in groups of four. This is also a way that you could build some kind of password cracker. For example, you could create a loop where you go through all the different ways to arrange alphanumeric characters in a pattern of six or something like that.

I should also mention that there's a way to get combinations with repeated values as well, and we do that with a function called combinations_with_replacement. So if I just go up here and replace product with `combinations_with_replacement`, being sure to spell this right, then instead of repeat here we can just pass that 4 in as a positional argument. If I save that, run it, and go up to the top, then we can see that now we get all the different combinations of those four numbers, but it allows repeated values as well.

So that's a look at some of the ways to get different groups of arrangements from an iterable. I always find it fascinating to use combinations and permutations and products, and they're definitely useful depending on what you're trying to solve. Okay, so now let's look at some other useful itertools functions that allow us to solve certain types of problems.
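A short sketch of both calls, assuming the `numbers` list from the walkthrough. The names `codes` and `multisets` are illustrative.

```python
import itertools

numbers = [0, 1, 2, 3]

# Cartesian product of numbers with itself four times: every four-digit
# code, repeats allowed and order significant (4 ** 4 results).
codes = list(itertools.product(numbers, repeat=4))
print(len(codes))  # 256
print(codes[0])    # (0, 0, 0, 0)
print(codes[1])    # (0, 0, 0, 1)

# Combinations of length 4 where a value may be reused but order is
# ignored; the length is passed positionally rather than as repeat=.
multisets = list(itertools.combinations_with_replacement(numbers, 4))
print(len(multisets))  # 35
```

The counts differ because product treats (0, 0, 0, 1) and (1, 0, 0, 0) as distinct results, while combinations_with_replacement yields only one of them.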

### chain [21:40]

Now let's look at the chain function. Chain allows us to chain together iterables so that it will go through all the items in the first iterable, and after that has been exhausted, it'll go through all the items in the second iterable, and so on. I have three different lists here that I pulled from my snippets, and let's pretend that we want to loop over all of the values in all of these lists. How would we do this? One way would be to create a new list that combines all three of these lists and then loop over that. So I could just say `combined = letters + numbers + names`. The problem with an approach like this is that it creates a new list with all of those values in memory. The sample lists that we have here are really short, so it wouldn't be a big deal in this situation, but what if these lists contained millions of items? It would be majorly inefficient for us to make copies of all of those and put them into another variable. Or what if those weren't even lists, and they were generators or something like that instead? Then how would we loop over all of those at once?

Well, to do this we can use chain. Instead of doing it the way we have here, let's say combined is equal to `itertools.chain`, and I will just pass all of those in to our chain: letters, numbers, and names. Now if we loop over these, I'll say `for item in combined` and print out the item. If I save that and run it and scroll up here, then we can see that it first looped through our letters, and after those were exhausted it looped through our numbers, and then our names at the end. So it looped over all of the items in all three of those iterables, and that can be extremely efficient depending on the data that you're iterating over.
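A rough sketch of the chain example. The contents of `names` are not given in the transcript, so the values below are hypothetical placeholders; `result` is an illustrative name used to collect the output.

```python
import itertools

letters = ["a", "b", "c", "d"]
numbers = [0, 1, 2, 3]
names = ["Alice", "Bob"]  # hypothetical placeholder values

# chain walks each iterable in turn without building a combined list.
combined = itertools.chain(letters, numbers, names)

# Collected into a list here only so the order can be displayed.
result = list(combined)
print(result)  # ['a', 'b', 'c', 'd', 0, 1, 2, 3, 'Alice', 'Bob']
```

Unlike `letters + numbers + names`, the chain object also accepts generators and other non-list iterables.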

### islice [23:37]

Now let's look at a function that will allow us to get a slice of an iterator. You can think of this like list slicing, where you can specify that you only want the first five items of a list or something like that, but this function will allow us to perform slicing on an iterator, and this function is called islice. To use this, there are a few different arguments that we can pass in. We can pass in a stopping point to go from the beginning of an iterator until it hits that stopping point. For example, let's say that I wanted to slice a range from 0 to 9 and stop at index 5. To do that I could just come up here and say `itertools.islice`, and we will slice a range of 10 values, and I will pass in a stopping value of 5. Instead of combined here I'm just going to change that variable to result so that it makes more sense, and I also have to change it in the for loop as well. If I save that and run it, then we can see that it gives us the first five items of that iterable. Now if you're thinking to yourself, why is this useful, because I can already do that with list slicing, don't worry, we'll look at some useful examples here in just a minute.

Okay, so we can also pass a starting point into islice as well. If there's only one argument like we have here, then it will assume that it's the stopping point, but if we put in two arguments, then it will assume that they are the starting point and stopping point. So if we wanted to skip the first value, then I could come in here and say 1 as our first argument and then 5, and it'll treat 1 as the starting point and 5 as the stopping point. If I save that and run it, we can see that now it started at index 1 and stopped before index 5. And lastly, we can also do steps. If we want the same slice that we have now but only every other value, then we can say that we want to step by 2, so up here I can pass in another argument of 2, and that'll be the step. If I save that and run it, then we can see that we get the values between index 1 and index 5, but it steps by 2.

Okay, so why is this useful? This is useful because, like with our other examples so far, we may have an iterator that is just too large to put into memory, so we don't want to cast it to a list just to get a certain slice of that iterator. For example, let's imagine that we have some log files on our machine that are thousands and thousands of lines, but we only want to grab the top few lines that are the header of the log file. To do this efficiently, we can use islice. I have a sample file here called test.log, and we can see that we have some fake data in here, but the top three lines have the date, the author, and the description. I just want to grab those and ignore everything else.

So let's do this using islice. I'm going to come back to the demo here, and for now I'm just going to comment out our previous example and scroll down a little bit. I'm going to say `with open` to open this file, and that file was called test.log. This is in the same directory as my Python file, so I don't have to pass in a full path, and I want to read that file. Now that we have that file open, files are actually iterators themselves, and whenever you call next on them you get the next line in the file, so we can use them just like any other iterator. In order to just grab the header, I'm going to say that header is equal to `itertools.islice`, and we will pass in the file as the first argument, and let's just pass in a 3. Remember, with one number that's just going to be the stopping point, so it's going to start at the beginning, which is the first line, and just go up to 3. That should grab me the first three lines of the file.

Now I'm going to loop over those, so I'll say `for line in header` and print each of those lines. I will save that and run it, and we can see that it prints those first few lines from the log file. Now, there's a blank line between each of those, but that's just because each line itself ends with a newline character and the print function also adds a newline after each print statement. If we wanted to get rid of that, then we could come up here to our print statement and say `end=''`, and it won't add an empty line between those print statements. If I save that and run it, then we can see that now it just gets those first few lines. So that's useful because if we were looping over tons of large files and getting just these few lines, then doing it this way allows us to get those values without loading the entire contents of each file into memory.

Okay, so now that we've seen that, let's look at a few functions in itertools that allow us to select certain elements from an iterable.
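The slicing forms and the header example can be sketched as follows. To keep the sketch self-contained, `io.StringIO` stands in for the opened test.log file, and its contents are hypothetical placeholders, not the real file from the video. The names `first_five`, `middle`, `stepped`, and `fake_log` are illustrative.

```python
import io
import itertools

# One number is the stop; two are start and stop; three add a step.
# As with list slicing, the stop index itself is excluded.
first_five = list(itertools.islice(range(10), 5))
middle = list(itertools.islice(range(10), 1, 5))
stepped = list(itertools.islice(range(10), 1, 5, 2))
print(first_five)  # [0, 1, 2, 3, 4]
print(middle)      # [1, 2, 3, 4]
print(stepped)     # [1, 3]

# Stand-in for the opened log file (assumption: placeholder contents).
fake_log = io.StringIO(
    "Date: placeholder\n"
    "Author: placeholder\n"
    "Description: placeholder\n"
    "log entry 1\n"
    "log entry 2\n"
)

# File-like objects yield one line per iteration, so only the first
# three lines are consumed here.
header = list(itertools.islice(fake_log, 3))
for line in header:
    print(line, end="")  # each line already ends with a newline
```

With a real file, the same call would look like `itertools.islice(f, 3)` inside a `with open(...) as f:` block.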

### compress [28:50]

compress function so compress is a function that I could see being used in data science style problems where you have some data and some selectors that you can use to filter down that data so let's say that we have a list of true/false values that correspond to my letters list here so first let me just get rid of this file example here and scroll back up and now like I was saying let's say that we have a list of true/false values and they're gonna correspond to my letters list here and I'm going to call these selectors so I'll say selectors is equal to and this will just be a list and I'll just pass in a list of true true false true so we can just pretend that this is another column of data of true/false values and we could pretend that this is anything like if someone is over the age of 21 or if they're married or whatever so we can use the compress function and pass in our iterable with these selectors and it will return a new iterable that only contains the items in our iterable that had a corresponding true value so this will make more sense when we see this example so let me uncomment our code here and instead of islice I'm going to look at the compress function so I'll say compress and let's pass in our letters here and also pass in those selectors now like I said this compress function should return an iterable that has all of our corresponding letters that had true values in our selectors so if I save that and run it then we can see that it gives us the values of a b and d so c wasn't included because its corresponding selector was false now you might notice that this is kind of similar to the built-in filter function the difference is that filter uses a function to determine whether something is true or false but with compress those values are just passed in as an iterable so depending on the problem that you're trying to solve you would use whichever of those is most appropriate so let me show you how filter works really quick just so we can compare the two
so with filter we need to create a function so I'm going up here to the top and I'm going to call this lt_2 for values less than 2 and it will take in a value of n and so I'll just say if n is less than 2 then we want to return true otherwise we'll just return false and now if we run an iterable through that built-in filter function using this function that we just created then it will give us all of the values that are less than 2 so for our result here I'll just say filter and I will use that function that we just created which was lt_2 and I'll pass in our numbers list here so let me save that and run it and we can see that when we run that we get the numbers that are less than 2 now
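
The comparison between compress and filter described above can be sketched roughly as follows. The contents of `letters` and `numbers` are assumptions (this part of the transcript does not show them), chosen so the output matches what is described in the video.

```python
import itertools

# Assumed sample data: the exact lists are not shown in this part of the transcript.
letters = ['a', 'b', 'c', 'd']
numbers = [0, 1, 2, 3]
selectors = [True, True, False, True]


def lt_2(n):
    """Return True when n is less than 2."""
    return n < 2


# compress keeps each item whose matching selector is truthy.
compressed = list(itertools.compress(letters, selectors))

# filter keeps each item for which the function returns a truthy value.
filtered = list(filter(lt_2, numbers))

print(compressed)  # ['a', 'b', 'd']
print(filtered)    # [0, 1]
```

Both calls return lazy iterators, so they are wrapped in `list()` here only to make the results visible.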

### filterfalse [31:49]

there are some functions in itertools that complement these built-in functions so there is one in itertools called filterfalse and it's just like this built-in filter function except it gives you the values that return false instead of true so instead of using this built-in filter function here if I instead said itertools.filterfalse and I save that and run it then we can see that now we get the values that are greater than or equal to 2 because those are what returned false from our function now there are two more
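
A minimal sketch of filterfalse next to the built-in filter, assuming the same `numbers` list and `lt_2` function used earlier in the video (the list contents are an assumption):

```python
import itertools

numbers = [0, 1, 2, 3]  # assumed sample data


def lt_2(n):
    """Return True when n is less than 2."""
    return n < 2


kept_by_filter = list(filter(lt_2, numbers))
kept_by_filterfalse = list(itertools.filterfalse(lt_2, numbers))

print(kept_by_filter)       # [0, 1]
print(kept_by_filterfalse)  # [2, 3]
```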

### dropwhile [32:24]

itertools functions that are similar to filter but these stop filtering once the function returns false now I don't think I've ever found the need to actually use these but I'm sure they could be useful in certain situations so first let's look at dropwhile so the dropwhile function will drop values from an iterable until one of the values returns false and then from that point on it simply returns the rest of the iterable so to show you the difference from filter let me put in some other values at the end of our numbers list here that would have been filtered out so I will say 0 1 2 3 then 2 1 0 so if I save this with those new values and run this then we can see that we still get just the values greater than or equal to 2 but instead let's use dropwhile instead of filterfalse so I will say dropwhile and use the same arguments and if I save that and run that now and scroll up here a little bit then we can see that our first couple of values were less than 2 so it dropped those but once it hit a value that was greater than or equal to 2 then it stopped applying that filter and just returned the rest of the iterable so it doesn't filter out all of those values it just drops the first few that met that criteria so our result down here was equal to 2 3 2 1 0 so everything after those first couple of values now again I don't think I've ever personally used this but I'm sure there are situations where it could be useful
and for the opposite of that we also have the takewhile function so takewhile will instead grab all the values that return true and as soon as it hits a value that doesn't return true then it will just return the values that it has at that point so to see what this looks like instead of dropwhile I will simply replace this with takewhile so let's save that and run it and we can see that what it did there was take the values that returned true from our function until it hit a value that returned false and after it got that then it just returned the values that it had up until that point so 0 was less than 2 so it added that then 1 was less than 2 so it added that then it hit 2 which is greater than or equal to 2 and so it just stopped at that point and returned 0 and 1 ok so we're just about finished up I know that this video is getting a little long but we only have a couple more functions to go over so another function that we're going to
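
The difference between filterfalse, dropwhile and takewhile on the extended numbers list can be sketched like this. The predicate `lt_2` is the one defined earlier in the video; the variable names for the results are only illustrative.

```python
import itertools

numbers = [0, 1, 2, 3, 2, 1, 0]


def lt_2(n):
    """Return True when n is less than 2."""
    return n < 2


# filterfalse tests every item, so the trailing 1 and 0 are removed too.
not_lt_2 = list(itertools.filterfalse(lt_2, numbers))

# dropwhile skips items only while the predicate is true, then yields the rest untouched.
dropped = list(itertools.dropwhile(lt_2, numbers))

# takewhile yields items while the predicate is true and stops at the first false one.
taken = list(itertools.takewhile(lt_2, numbers))

print(not_lt_2)  # [2, 3, 2]
print(dropped)   # [2, 3, 2, 1, 0]
print(taken)     # [0, 1]
```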

### accumulate [34:54]

look at is called accumulate so just like it sounds this takes an iterable and returns accumulated sums of each item that it sees and it uses addition by default but you can use other functions as well such as multiplication or something like that so let's pass our numbers list through this accumulate function and see what this does so let me scroll up here a little bit I'm gonna get rid of our selectors and our function here we're not gonna need those anymore and now I just want to use this accumulate function and I want to pass our numbers into this function okay so if I run this then we can see that each time through the loop it just keeps a running total of the values that it has seen so far so it starts at 0 and then 0 plus 1 is 1 and then 1 plus 2 is equal to 3 and then 3 plus 3 is equal to 6 and it just keeps doing that all the way through our list so when we scroll down here it kept a total sum of 9 at the end and we can pass in different functions as well so if we wanted to multiply these values instead of adding them then we'll have to actually import something here I'm gonna import it here at the top so I'll say import operator because we need to grab this multiply function so down here I can now say for the function that we want to use operator.mul for the multiply function and now if we run that using this data now this isn't very exciting because all of our values are going to be 0 since our first value here was 0 and everything multiplied by 0 is going to be 0 so instead let me just get rid of that first 0 there and rerun this and scroll up and now we can see that it's keeping a running product so 1 times 2 is 2 times 3 is 6 times 2 is 12 and so on okay so that is the accumulate function okay so now let's take a look
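
A short sketch of the running total and running product described above, using the numbers list from the previous chapter (with the leading 0 removed for the product, as in the video):

```python
import itertools
import operator

numbers = [0, 1, 2, 3, 2, 1, 0]

# With no function given, accumulate adds each item to the running total.
running_sums = list(itertools.accumulate(numbers))

# Passing operator.mul turns the running total into a running product.
running_products = list(itertools.accumulate(numbers[1:], operator.mul))

print(running_sums)      # [0, 1, 3, 6, 8, 9, 9]
print(running_products)  # [1, 2, 6, 12, 12, 0]
```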

### groupby [37:04]

at our last major function in itertools and that is going to be called groupby now I saved this one for last since it's a little harder to explain but hopefully after you see the examples it'll be a bit easier so this will go through an iterable and group values based on a certain key and then it will return a stream of tuples now the tuples consist of the key that the items were grouped on and the second value of the tuple is an iterator that contains all of the items that were grouped by that key so I know that sounds confusing but let's take a look at an example and hopefully this will sink in so to show this I'm gonna grab some code from my snippets here so up here in my snippets I'm gonna grab this big list of dictionaries here and I'm going to paste this right here into our code and actually just to make this more clean I'm just going to replace everything except our itertools import so let me paste that in there and like I said I'll have a link to these code snippets in the description section below if you'd like to follow along with this
so what we have here is a list of dictionaries and each dictionary in this list contains some information so we have a person's name and a city and a state so let's say that we wanted to group all of our people by the state that they're from so how would we do that well that's what our groupby function can help us with but first we're going to need a function that tells groupby exactly what we want to group on and that's going to be our key so we need to write a function that tells us what to return from a single item in our iterable so I'm gonna go up here to the top and I'm going to create a function here just called get_state and this is going to take in a person because that's what a single item from our iterable is going to be it's going to be a dictionary that represents a person and now we just want to return what we want our key to be so I'm just going to say return person and we're going to access that state value so that is going to tell groupby that for every item in our iterable we want to group by the state
so now let's actually run this so down here at the bottom and you can see that we have several people here in our list now down here at the bottom I'm gonna say person_group and I'm gonna set this equal to itertools.groupby and we want to group this list and this list is called people so I'll scroll back down here and we want the key to be equal to that get_state function that we created and now let's loop over this so each item in this person_group is going to be a tuple of two things so the first thing is going to be our key and the key that we used was the state and the second is going to be an iterable of all the items in that group so it should be an iterable of each person in that state so let's print both of these out just so we can see what this looks like so I'm gonna say for key and group in our person_group and for now let's just print out both the key and the group so if I save this and run it then we can see let me make this a little larger here we can see that we have four different groups one for each state that was present in our list of people and our group variable here is an iterable that should contain all of the people from that state
so let's also loop over that group and print those out as well so within our for loop here I'm just going to print the key on the first line and now I'm gonna loop over that group so I'm gonna say for person in group and we just want to print out that person and I'm also going to put a print statement here at the end and that's just going to be to spread out our output a little bit so now let me save this and run it and scroll up here to the top and again let me make this just a little larger okay so in our output here we can see that we are first printing out our key and that's what we printed out right here which is the state and then we're looping over the group which are the people that are in that state so we can see that from New York we have John Doe and Jane Doe and then it moves on to Colorado and prints out the people from there then it goes to West Virginia and prints those out and finally North Carolina so our groupby function did a lot of work in the background there for us just to get those results in a nice efficient way
now there are all kinds of interesting problems that we can solve using this groupby function so for example what if instead of printing out all the people from that state we just wanted to print the number of people in our list from there so instead I'm just going to comment out our for loop there and instead I'm just going to print out our key and then just the length of our group so if I save that and run it oh and that's actually an iterator I forgot so we have to cast that to a list before we can get the length so if I cast that to a list and then rerun this then we can see that in our list we had two of those people from New York two from Colorado two from West Virginia and three from North Carolina so you can probably think of all different types of problems that you could solve with this groupby function so for example how many students in your class got A's versus B's versus C's and so on problems like that
now one thing that I should mention is that groupby expects the initial iterable to be sorted so that it can group properly so if we look at the initial list of our values then the people are already sorted by state so we have the people from New York first people from Colorado second and so on now if we were to put someone from New York onto the end of that list then it wouldn't include them in the first group of people from that state so groupby is a little different than the SQL version of group by in the sense that it needs the values to be sorted beforehand and we won't cover sorting in this video but if you want to see how we'd sort a list of dictionaries like this then I do have a separate video on sorting where I covered all of that and I'll be sure to put that in the description section below as well okay so we're basically finished up but I wanted to show you one last thing that will just take about two seconds so
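
The grouping, the counting, and the sorting caveat described in this chapter can be sketched as follows. The `people` list here is a small hypothetical stand-in for the larger list in the video's code snippets (the names other than John Doe and Jane Doe, the cities, and the list size are all made up), and `count_by_state` is a helper introduced only for this illustration.

```python
import itertools

# Hypothetical sample data, already ordered so that equal states are adjacent.
people = [
    {'name': 'John Doe', 'city': 'City A', 'state': 'NY'},
    {'name': 'Jane Doe', 'city': 'City B', 'state': 'NY'},
    {'name': 'Person C', 'city': 'City C', 'state': 'CO'},
    {'name': 'Person D', 'city': 'City D', 'state': 'CO'},
    {'name': 'Person E', 'city': 'City E', 'state': 'NC'},
]


def get_state(person):
    """Key function: group each person by their state."""
    return person['state']


def count_by_state(items):
    """Illustrative helper: return (state, count) pairs for consecutive runs."""
    # Each group is an iterator, so it is cast to a list before taking len().
    return [(key, len(list(group)))
            for key, group in itertools.groupby(items, key=get_state)]


counts = count_by_state(people)
print(counts)  # [('NY', 2), ('CO', 2), ('NC', 1)]

# groupby only groups consecutive items, so a late NY entry starts a new group.
late_entry = {'name': 'Person F', 'city': 'City F', 'state': 'NY'}
unsorted_counts = count_by_state(people + [late_entry])
print(unsorted_counts)  # [('NY', 2), ('CO', 2), ('NC', 1), ('NY', 1)]

# Sorting by the same key first brings all NY entries together.
sorted_counts = count_by_state(sorted(people + [late_entry], key=get_state))
print(sorted_counts)  # [('CO', 2), ('NC', 1), ('NY', 3)]
```

Sorting with the same key function that is passed to groupby is one simple way to make sure equal keys end up next to each other.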

### tee [43:28]

let's say that we wanted to replicate an iterator now this can sometimes be harder to do than we think well the itertools module gives us a nice simple function for replicating iterators easily and we can do this using the tee function so let's say that I wanted to replicate this person_group into two different iterables so to do that I could simply come down here and say copy1 copy2 is equal to itertools.tee and that is t-e-e and then just pass in that person_group and if you were to run that then now your copy1 and copy2 will be their own individual iterables and if we wanted more copies then we could just pass in an argument to this tee function and it will return however many that we'd like and one thing to note is that you should no longer use the original iterator after you copy it so in this example it means that we should only use these copies of copy1 and copy2 and not use this original person_group iterator after we've made the copies or it could have unintended consequences of exhausting the items in the replicates so that's something to keep in mind there now on a side note I actually have no idea what tee stands for or why they named it that rather than something a little more intuitive so if anyone knows what that means then feel free to let me know in the comments I think it comes from a Linux command with the same name but I'm not too familiar with that Linux command either
but anyways with that said I think that is going to do it for this video hopefully now you have some ideas for how you can begin to use the itertools module in your daily coding and possibly write some more efficient code but if you do have any questions about what we covered in this video then feel free to ask in the comment section below and I'll be sure to answer those and if you enjoy these tutorials and would like to support them then there are several ways you can do that the easiest way is to simply like the video and give it a thumbs up and also it's a huge help to share these videos with anyone who you think would find them useful and if you have the means you can contribute through patreon and there's a link to that page in the description section below be sure to subscribe for future videos and thank you all for watching
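
The tee usage described in this chapter can be sketched as below. To keep the example self-contained, a plain generator named `squares` (a hypothetical stand-in) is copied instead of the video's `person_group`; the call pattern is the same.

```python
import itertools

# Hypothetical stand-in for the iterator being replicated.
squares = (n * n for n in range(5))

# tee returns two independent iterators by default.
copy1, copy2 = itertools.tee(squares)

# From here on only the copies are used; advancing `squares` directly
# would consume items that the copies would then never see.
first = list(copy1)
second = list(copy2)
print(first)   # [0, 1, 4, 9, 16]
print(second)  # [0, 1, 4, 9, 16]

# A second argument asks for a different number of copies.
a, b, c = itertools.tee(iter('xyz'), 3)
triples = [list(a), list(b), list(c)]
print(triples)  # [['x', 'y', 'z'], ['x', 'y', 'z'], ['x', 'y', 'z']]
```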
