Python Pandas Tutorial (Part 6): Add/Remove Rows and Columns From DataFrames
16:55

Corey Schafer 01.02.2020 316,305 views 6,801 likes

Video description
In this video, we will be learning how to add and remove our rows and columns. This video is sponsored by Brilliant. Go to https://brilliant.org/cms to sign up for free. Be one of the first 200 people to sign up with this link and get 20% off your premium subscription. In this Python Programming video, we will be learning how to add and remove rows and columns from dataframes using the append and drop methods. We will also see how we can create new columns by combining elements from existing ones. Let's get started... The code for this video can be found at: http://bit.ly/Pandas-06 StackOverflow Survey Download Page - http://bit.ly/SO-Survey-Download ✅ Support My Channel Through Patreon: https://www.patreon.com/coreyms ✅ Become a Channel Member: https://www.youtube.com/channel/UCCezIgC97PvUuR4_gbFUs5g/join ✅ One-Time Contribution Through PayPal: https://goo.gl/649HFY ✅ Cryptocurrency Donations: Bitcoin Wallet - 3MPH8oY2EAgbLVy7RBMinwcBntggi7qeG3 Ethereum Wallet - 0x151649418616068fB46C3598083817101d3bCD33 Litecoin Wallet - MPvEBY5fxGkmPQgocfJbxP6EmTo5UUXMot ✅ Corey's Public Amazon Wishlist http://a.co/inIyro1 ✅ Equipment I Use and Books I Recommend: https://www.amazon.com/shop/coreyschafer ▶️ You Can Find Me On: My Website - http://coreyms.com/ My Second Channel - https://www.youtube.com/c/coreymschafer Facebook - https://www.facebook.com/CoreyMSchafer Twitter - https://twitter.com/CoreyMSchafer Instagram - https://www.instagram.com/coreymschafer/ #Python #Pandas

Contents (16 segments)

Introduction

Hey there, how's it going everybody? In this video, we're going to be learning how to add and remove columns from our DataFrames. We'll also take a look at how we can combine information from multiple columns into one. Now, my last video on updating rows and columns was pretty long, but this one should be a lot shorter. I'd like to mention that we do have a sponsor for this series of videos, and that is Brilliant. I really want to thank Brilliant for sponsoring this series, and it would be great if you all could check them out using the link in the description section below and support the sponsors. I'll talk more about their services in just a bit. So with that said, let's go ahead and get started. Like I said, in the last video we saw how to update information within our rows and columns. Now we're going to see how we can add and remove rows and columns. First, let's look at adding columns. Adding columns is going to be pretty easy for us, because it's basically the same thing that we did when we were updating values: we can simply create a column and pass in a series of values that we want that column to have. I currently have my snippets file open here that we've seen in previous videos, so that we can see what this looks like on a smaller data set. As usual, if you want to follow along, I'll have links to the code, the notebooks, and the data that I'm using in this series in the description section below. So, for example, let's say

Combining Columns

that we wanted to combine our first name and last name columns into a single column and simply call that column full name. First, in order to get a series of the first name and last name combined, we can come down here to the bottom, grab that first column, and then just add these together. I'm putting a space in between, and then I'll add in the last name. If I run this... whoops, I missed my second plus symbol there. Now if I run this, we can see that we get the first name, then a space (that's what this section is doing here), and then the last name. So now we have this series of three values. In order to add a new column with these values, we can simply say df and then, in brackets, the name of what we want our new column to be. I'm going to call this full_name, and then I'm just going to copy the expression that gave us that series and assign this full_name column to that returned series. So if I run that and then we look at our DataFrame, then
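The column combination described in this segment can be sketched roughly as follows. The column names (first, last, email, full_name) follow the narration, but the sample values are assumptions made for illustration and are not confirmed by the transcript.

```python
import pandas as pd

# Hypothetical sample data: the narration only confirms three rows with
# first name, last name, and email columns.
people = {
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
    "email": ["CoreyMSchafer@gmail.com", "JaneDoe@email.com", "JohnDoe@email.com"],
}
df = pd.DataFrame(people)

# Adding string columns concatenates them element by element into a Series.
combined = df["first"] + " " + df["last"]

# Bracket notation creates the new column from that Series.
df["full_name"] = combined

print(df)
```

Building the Series first and assigning it afterwards mirrors the two steps in the narration; the same thing could be written as a single assignment.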

Removing Columns

now we can see that our DataFrame has this new column that is the first and last name combined. Again, I'm using strings here, but you could also create a new column using the apply method that we saw in the last video, for example to hold the result of some mathematical analysis of another column in the DataFrame. Now, I do want to point out that you can't use dot notation when assigning a column like this. We have to use the brackets like we did here in order to make these assignments, because if you use dot notation, then Python is going to think that you're trying to assign an attribute onto the DataFrame object and not a column. Okay, so that's how we'd add a column to our DataFrames. Now let's look at removing columns. Now that we have our full_name column, let's say that we no longer need or want our first and last name columns. To remove these, I can use the drop method on our DataFrame. It's as easy as saying df.drop, and now what do we want to drop? Columns. The columns argument is going to be equal to a list, because we want to delete multiple columns here: I want to delete the first column and the last column. So if I run this then we can
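A minimal sketch of the drop call described in this segment, again using assumed sample values. It also checks that the original DataFrame is left untouched, since drop returns a new object by default.

```python
import pandas as pd

# Hypothetical sample data standing in for the DataFrame used in the video.
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
    "email": ["CoreyMSchafer@gmail.com", "JaneDoe@email.com", "JohnDoe@email.com"],
})
df["full_name"] = df["first"] + " " + df["last"]

# Passing a list to columns= removes several columns in one call.
trimmed = df.drop(columns=["first", "last"])

print(trimmed.columns.tolist())
print(df.columns.tolist())  # the original still has all four columns
```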

String Split

see that it returns a DataFrame without those columns. Like we've seen before, this just gives us a preview of what our DataFrame would look like, but it doesn't actually apply those changes. If we're happy with those changes, then we can set the inplace argument to True so that it changes our DataFrame in place. So I can come over here and say inplace equal to True within our drop method, and if I run that and then look at our DataFrame, we can see that it no longer has the first and last name columns. Now, if we wanted to reverse that process and split that full_name column into two different columns, that's a little more complicated but still pretty simple. We've seen the string split method a few times in the series so far, so let's run that on our full_name column and see what we get. I'm going to say df, access that full_name column, use the string methods (str) on our series, and then do a split, and we'll just split this on a space. Now, split splits on whitespace by default, but I just want to be explicit here. So if we run this then the
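The two steps in this segment, dropping in place and then splitting the combined column, might look like this. The data is assumed for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the DataFrame used in the video.
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
    "email": ["CoreyMSchafer@gmail.com", "JaneDoe@email.com", "JohnDoe@email.com"],
})
df["full_name"] = df["first"] + " " + df["last"]

# inplace=True modifies df itself; the call then returns None.
df.drop(columns=["first", "last"], inplace=True)

# Without expand, str.split yields one list of parts per row.
parts = df["full_name"].str.split(" ")

print(df.columns.tolist())
print(parts)
```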

Expand

result of that split method is that we get the first name and the last name in a list, so the first name is the first value and the last name is the second value. Now, if we want to assign these to two different columns, then we need to expand this list so that the values are actually in two different columns. To do this in pandas, we can use the expand argument. Let's see what this looks like. This is within the split method here: we can just pass in another argument and say expand is equal to True. If I run this, then we can see that the results are pretty similar, but now everything that was in our list is split up into columns, so now we have two columns of those split results. What we need to do now is set two columns in our DataFrame to those two columns that were just returned. So we can say... if we remember from

Multiple Columns

earlier in the series, if we want to access multiple columns, then within the brackets we can pass in a list. So we're going to have two pairs of brackets here, and the inner pair is our list of columns. I want to add a first column and a last column, and we're going to set that equal to what we did here with the split method. If I run this, then our first and last columns should have been assigned the two columns from the split. Let's take a look at our DataFrame and see if that worked. So we can see that
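A sketch of splitting with expand=True and assigning the result to two new columns, as narrated in this segment and the one before it. The sample values are assumptions.

```python
import pandas as pd

# Hypothetical DataFrame after the first and last columns were dropped.
df = pd.DataFrame({
    "email": ["CoreyMSchafer@gmail.com", "JaneDoe@email.com", "JohnDoe@email.com"],
    "full_name": ["Corey Schafer", "Jane Doe", "John Doe"],
})

# expand=True returns a DataFrame with one column per split part.
split_names = df["full_name"].str.split(" ", expand=True)

# A list inside the brackets targets two columns at once; the columns of
# split_names are matched to them by position.
df[["first", "last"]] = split_names

print(df)
```

This assumes every name splits into exactly two parts. Names containing more spaces would produce extra columns and the assignment would fail.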

Add Single Row

now we've added a first and a last column with the values that were returned up here. Okay, so that's how we would add and remove columns. Now let's look at adding and removing rows of data. There are a couple of different ways that we might want to add rows to our DataFrame. First, we might just want to add a single row of new data. Second, maybe we want to combine two DataFrames into a single DataFrame by appending the rows of one to the other. First, let's look at adding a single row of data. We can do this with the append method. If I want to add a single row, then I can just say df.append, and now we can pass in our values. I'm just going to pass in a dictionary here with a first name of Tony. So if I run this then we can see that we

Ignore Index

get an error. This is because the row we're appending currently doesn't have an index. Now, it can sometimes be difficult to read these pandas errors and figure out what the problem is, but in this case it tells us exactly what to do. It says down here at the bottom: can only append a Series if ignore_index=True or if the Series has a name. So let's just ignore the index, and our existing DataFrame will automatically assign this new row an index itself. Up here at the top, we can simply pass in an argument of ignore_index and set that equal to True. Now if I run this, we can see that it worked: we're no longer getting an error, and down here at the bottom we can see that this new name was appended. We only gave this row a first name value, so we can see that first is set to Tony and all of the other column values are set to NaN, which stands for "not a number" and is used for missing values. So you can pass in an entire series or a list of information there in order to add a single row of any data that you want. Now, if we have a DataFrame that we'd like to append to our existing DataFrame, then we can do that as well. So let me create a new DataFrame here from our existing values up here at the top, so
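The video uses DataFrame.append, which was deprecated in pandas 1.4 and removed in pandas 2.0. The sketch below therefore swaps in pd.concat to get the same effect of adding a single, partially filled row. The sample data is assumed for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the DataFrame used in the video.
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
    "email": ["CoreyMSchafer@gmail.com", "JaneDoe@email.com", "JohnDoe@email.com"],
})

# A one-row DataFrame built from a dictionary that only sets "first".
new_row = pd.DataFrame([{"first": "Tony"}])

# ignore_index=True renumbers the rows, so the new row gets the next
# integer label; columns missing from new_row are filled with NaN.
result = pd.concat([df, new_row], ignore_index=True)

print(result)
```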

Create New DataFrame

I'm just going to scroll up here and grab the first dictionary of data that we originally created our DataFrame with, and I'm going to modify this a bit. So I'm going to just

Modify DataFrame

have this be two names here. I'm going to take out these third values, and then we'll go ahead and update these. For the first names I'll do Tony and Steve, for the last names I'll do Stark and Rogers, and for the email addresses I'll do IronMan@avenge.com for the first one and Cap@avenge.com for the second one. Now I'm going to create a new DataFrame from this new dictionary, and I'm going to call it df2. So I can just say pd.DataFrame and pass in that people dictionary there, and now we should have

Append DataFrames

a second DataFrame. Okay, so now let's say that we want to add this to our existing DataFrame. One way we can do this is to simply append one DataFrame to the other. Now, these have conflicting indexes, and they also have columns that are not in the same order, so again we're going to want to ignore the indexes when appending so that the rows are assigned indexes properly. I'm going to say df.append and pass in df2 so that it appends it to our original DataFrame, and then I'm going to say ignore_index is equal to True. And if I run this here

Sort Columns

then we can see down here at the bottom that it added these new rows. Now, if you got a warning here, the reason is that the two DataFrames didn't have all of their columns in the same order when we appended them, so it's warning us that there are different ways it could have sorted the columns. Don't worry too much about that, but in a future version of pandas it's going to set sort to False by default. Actually, pandas version 1 was just released as I was recording this series, so this may have already been done. We can ignore this for now, but if we wanted to, we could pass in sort equal to False and get rid of this warning. So if I go back up here and pass in sort is equal to False, then when I run this it's also no longer going to sort these columns. So if I

Make Changes Permanent

run this, then we can see that we no longer get that warning, and now it's not sorting the columns anymore. Now, unlike the drop method, if we want to make these changes permanent, then we don't have an inplace argument to use. Instead, we have to set the DataFrame to this returned DataFrame. So I'll copy this, and then we can say df is equal to that returned DataFrame. If I run that and then look at our original DataFrame, we can see that those rows were added on there. Now I'll be honest, some of you might
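Combining two DataFrames and keeping the result could be sketched as below. Because DataFrame.append was removed in pandas 2.0, pd.concat is used as a stand-in; it accepts the same ignore_index and sort arguments. The Tony and Steve rows follow the narration, while the original three rows and the column order of df2 are assumptions.

```python
import pandas as pd

# Hypothetical original DataFrame.
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
    "email": ["CoreyMSchafer@gmail.com", "JaneDoe@email.com", "JohnDoe@email.com"],
})

# Second DataFrame with the same columns in a different order.
df2 = pd.DataFrame({
    "email": ["IronMan@avenge.com", "Cap@avenge.com"],
    "first": ["Tony", "Steve"],
    "last": ["Stark", "Rogers"],
})

# There is no inplace option, so the result is assigned back to df.
# sort=False keeps the column order of the first DataFrame.
df = pd.concat([df, df2], ignore_index=True, sort=False)

print(df)
```

Columns are aligned by name rather than by position, which is why the differing column order in df2 does not scramble the values.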

Remove Rows

want to ask in the comment section below why some of these have an inplace argument and others don't, but honestly I'm not really sure. I'm sure there's a reason, but I'd have to do some more digging to find out exactly why. But this append method doesn't have an inplace argument like the drop method has, so we have to do it this way. Okay, so lastly, let's look at removing rows. Let's say that you're an Iron Man fan and you want to remove Steve Rogers from our DataFrame. We can do that in almost the same way that we dropped our columns, but instead of specifying the columns that we want to drop, we can simply pass in the indexes that we want to drop. So I can come down here and say df.drop, and we can see here on the far left (if you've watched my video on indexes) that this row has an index of 4. So let's just say we want to drop an index of 4. So if I run this and we can
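Dropping a single row by its index label, as described in this segment, might look like this. The five-row DataFrame is assumed sample data arranged so that Steve Rogers sits at index 4, matching the narration.

```python
import pandas as pd

# Hypothetical five-row DataFrame with the default 0..4 integer index.
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John", "Tony", "Steve"],
    "last": ["Schafer", "Doe", "Doe", "Stark", "Rogers"],
})

# index= takes a row label (or a list of labels) to remove.
without_steve = df.drop(index=4)

print(without_steve)
```

As with columns, the original df is unchanged unless inplace=True is passed or the result is assigned back.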

Drop Rows

see that we still have Iron Man, Tony Stark, here, but we no longer have Steve Rogers, so the row with index 4 was deleted. Again, if you want to actually apply that to the DataFrame, then you'll need to set the inplace argument to True. Now, you might want to do something a little more complicated and drop rows using a conditional. I'd probably do this using loc, like we saw in a previous video where we were learning about filtering data from our DataFrame, but we can also do this using drop. If I wanted to drop all of the rows where the last name is equal to Doe (we have two of those values here), then I can pass in the indexes of that filter. Let me show you what this means and it won't seem as complicated. I can say df.drop, and now I'm going to say index is equal to, and then put in my conditional. If you remember from the filtering video, we can pass the conditional inside of our brackets, so I can say I want a conditional where the last column equals Doe. Now, the only difference here is that this gives us a filtered DataFrame, but we want the indexes, since we're saying index is equal to. So here at the end I'm going to use the index attribute and just say .index. If I run this, then we can see that it removed those rows with the last name of Doe.

Now, like I said in that filtering video, I don't really like all of this being bunched up together, because I think it's hard to read, and you always want your code to be easy to read by other developers. So I would pull the conditional out into its own variable. I would say filt, for filter, is equal to, and then I'll grab our conditional, cut it out, and paste it here. Now we can say that we want our filter applied to that DataFrame and then grab the index of that. So if I run this then we
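A sketch of the conditional drop narrated in this segment, with the filter pulled out into its own variable. The sample data is assumed, and the loc-based alternative at the end is an illustration of the approach the speaker mentions preferring, not code confirmed by the transcript.

```python
import pandas as pd

# Hypothetical five-row DataFrame with two rows whose last name is Doe.
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John", "Tony", "Steve"],
    "last": ["Schafer", "Doe", "Doe", "Stark", "Rogers"],
})

# Boolean mask: True for the rows we want to remove.
filt = df["last"] == "Doe"

# df[filt] selects the matching rows; .index gives their labels, which
# is what drop(index=...) expects.
dropped = df.drop(index=df[filt].index)

# Alternative: keep the rows where the mask is False.
kept = df.loc[~filt]

print(dropped)
```

Both approaches leave the same rows. The loc version avoids the intermediate index lookup, while the drop version reads more directly as "remove these rows".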

Outro

can see that it gives us the exact same result, but it's a little bit easier to read. Okay, so that's been an overview of adding and removing rows and columns from our DataFrames.

Now, before we end here, I'd like to mention the sponsor of this video, and that sponsor is Brilliant. In this series we've been learning about pandas and how to analyze data in Python, and Brilliant would be an excellent way to supplement what you learn here with their hands-on courses. They have some excellent courses and lessons that do a deep dive on how to think about and analyze data correctly. For data analysis fundamentals, I would really recommend checking out their statistics course, which shows you how to analyze graphs and determine significance in the data. I would also recommend their machine learning course, which takes data analysis to a new level while you learn about the techniques that allow machines to make decisions where there are just too many variables for a human to consider. So to support my channel and learn more about Brilliant, you can go to brilliant.org/cms to sign up for free. Also, the first 200 people that go to that link will get 20% off the annual premium subscription, and you can find that link in the description section below. Again, that's brilliant.org/cms.

Okay, so I think that's going to do it for this pandas video. I hope you feel like you got a good idea of how to add and remove columns and rows from your DataFrame and feel comfortable doing that. In the next video, we'll be learning about different ways to sort our data. If anyone has any questions about what we've covered in this video, feel free to ask in the comment section below and I'll do my best to answer those. If you enjoy these tutorials and would like to support them, there are several ways you can do that. The easiest way is to simply like the video and give it a thumbs up. It's also a huge help to share these videos with anyone who you think would find them useful. And if you have the means, you can contribute through Patreon, and there's a link to that page in the description section below. Be sure to subscribe for future videos, and thank you all for watching.
