Jupyter Notebooks Tutorial | How to use them & tips and tricks!
18:33


AssemblyAI · 03.01.2022 · 19,792 views · 480 likes

Video description
Jupyter notebooks are a must-have for data scientists. In this video, we learn about the commonly used settings, how to use Jupyter notebooks, how to perform the basic actions, what each indicator means, and also some extra tips and tricks for using Jupyter notebooks. Get your free API token for AssemblyAI here👇 https://www.assemblyai.com/?utm_source=youtube&utm_medium=referral&utm_campaign=yt_mis_13 Even if you have been using Jupyter notebooks for a while, there might be some extra gems for you in here!

Table of contents (5 segments)

Intro

Jupyter notebooks are a data scientist's best friend, but if you've never had an introduction to them, it might be a little bit confusing or overwhelming to start working with them at first. So in this video we will learn what Jupyter notebooks are, how we can use them, and also some basic functionality that will be helpful during your data science projects. This video is brought to you by AssemblyAI, a company that is making a state-of-the-art speech-to-text API using deep learning technologies. If you want to go and try it out yourself, you can get a free API token using the link in the description.

Jupyter Notebooks

All right, let's get started. In front of me is what we call a Jupyter notebook. In this notebook there are a bunch of building blocks. I will walk you through all of them, show you how everything works, and also mention some of the critical settings.

The main building block of Jupyter notebooks is the cell. So this is a cell. I can write anything I want in this cell. I can even say, you know, three plus five, and then it will calculate it. In that way it kind of works like a normal Python function, right? Or I can create a variable, I can create another variable, and I can sum these variables up, and then it will tell me what the result is. What I'm doing to run these cells is Shift+Enter: when you press Shift+Enter, it runs that cell.

One thing that you might want to pay attention to here is that every time I run a cell, only the code in that cell is run. You might remember, if you have ever written Python code, that if you write a Python script and you run it, the whole script is run from beginning to end. This is the main difference Jupyter notebooks have, and the main reason that data scientists use them: whatever you write in a cell is the only thing you run. When you run just one cell, the whole notebook is not run. You can also change the sequence in which things run, so you can run the first cell, then go ahead and run the fifth cell, and so on and so forth. This flexibility is really helpful for data scientists when they're doing their work.

As I said, you're basically writing normal Python code. We can do some mathematical calculations, we can create variables and do calculations on them, or we can just use normal Python code. Let's write the classic "hello world", for example, and then I will print it for you. So basically everything you can do in Python you can do here, plus some extra things that I will show you in a second.

Let's walk through the structure; I'll show you what everything means. You might notice that sometimes there is a green highlight on a cell, but sometimes there is a blue one. A blue highlight means that this cell is selected, but you are not yet in editing mode. So if I type something, it starts going into find-and-replace mode, but that's not what I want. If I want to change something, I click it again, and then it goes into the green mode, which is the editing mode. Then you can edit things.

Next to each cell you see some numbers. We have the input, which is the cell, and we have the output of the cell right below it. These numbers tell you the order in which you ran things. So apparently I ran this cell third, and then after that I ran this one, and then I did some other things, and then I ran this one sixth, this one eighth, and this one ninth. This is helpful because, as I said, you can run the fifth cell, then the first, then the third, one after another, and these numbers tell you which cell you ran in which order. So if something in this cell affects the output of the eighth one, then you know: oh, I need to run this again before I run that one.
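To make the point about run order concrete, here is a minimal sketch in plain Python. It is not how Jupyter is implemented: the `cells` dictionary, the `namespace` dictionary, and the `run_cell` helper are hypothetical stand-ins for notebook cells, the kernel's memory, and Shift+Enter.

```python
# Hypothetical stand-in for a notebook: each "cell" is a string of code,
# and all cells share one namespace, like cells sharing one kernel.
cells = {
    "cell_a": "x = 3",
    "cell_b": "y = x + 5",
    "cell_c": "x = 10",
}

namespace = {}        # plays the role of the kernel's memory
execution_count = 0   # plays the role of the number shown next to a cell


def run_cell(name):
    """Run one cell only and return the counter value it was given."""
    global execution_count
    execution_count += 1
    exec(cells[name], namespace)
    return execution_count


run_cell("cell_a")     # x = 3
run_cell("cell_b")     # y = 3 + 5
print(namespace["y"])  # 8

run_cell("cell_c")     # x changes, but y is stale until cell_b is rerun
print(namespace["y"])  # still 8

count = run_cell("cell_b")
print(namespace["y"], count)  # 15 4
```

The counter returned by the last call shows why the numbers next to the cells are useful: they reveal that `cell_b` was rerun after `cell_c`, which is the only reason `y` is up to date.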

Settings

Right, so let's look at some of the settings that we have. The File section of the menu is classic: you can save this file as something else, you can rename it, make a copy, etc. But one important thing is that you can download this notebook in a bunch of different formats. Mainly, you will probably need to download it as a Python file. Maybe you're doing some experimentation in a notebook, or you're building some code there because you're more comfortable with Jupyter Notebook, but your company, or the people that you work for, or you yourself if you're building a project, might need the code at the end as a Python script. This option makes it really easy to export it as a Python script.

Let me show you why this might be needed. This is the same code: one copy is in a Jupyter notebook and one is in a Python file. A Python file looks very straightforward, like we've seen before: you import libraries, you have the code, and it runs from beginning to end. But with a Jupyter notebook file, as you can see, the extension is already different. Let me zoom in, maybe that'll be helpful. For a Python file you have .py, and for a Jupyter notebook file you have .ipynb. A Jupyter notebook file also looks a little bit different, well, actually a lot different, from a Python file. A Python file can be run easily in a terminal, for example, just by calling python and the name of the file, whereas the Jupyter notebook file needs to be run in Jupyter Notebook and nowhere else. It can be read, but as you can see, it looks kind of like a JSON file. That's why occasionally you might need to download it as a Python script. When you do, it will just give you a sequential file like the one I showed you, and it will be easier to run for whoever needs it.

Next we have Edit, which helps us organize the cells a little bit. For example, let's say I select this cell. I can hold Shift and go down with my arrow keys to select these cells. Then I can say, you know what, I don't want them to be there, so I'll say Cut Cells. When I select a cell here, I can say Paste Cells Above, and they will be pasted above the cell that I selected. This can be useful if you want to move chunks of code from one place to another.

If you want to add a new cell, you can either use this plus button here, or, if you're at the end of the notebook, just keep pressing Shift+Enter and new cells will be added for your code. One other trick is that once a cell is selected with a blue highlight, so it's selected but not in editing mode, you can press A to add a cell above it, or B to add a cell below it. This was something I learned a little bit further into my data science journey, and it was actually very helpful. I wish I had known it earlier.

And if you want to delete a cell, whether it has code in it or is just empty, you can select this, or you can click this "cut selected cells" button, and it will cut it for you. Sometimes you just need to clean up: there is code that you don't need, or just empty cells that are crowding the notebook you have.
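The remark that an .ipynb file "looks kind of like a JSON file" can be illustrated with a short sketch. The notebook below is hand-written for this example in the nbformat 4 layout, and `notebook_to_script` is a hypothetical helper, not the exporter behind the download menu; for real notebooks the `jupyter nbconvert --to script` command is the usual tool.

```python
import json

# A tiny hand-written notebook in the nbformat 4 layout. The cell
# contents are made up for illustration.
notebook_json = json.dumps({
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Demo notebook"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "outputs": [], "source": ["a = 3\n", "b = 5"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 2,
         "outputs": [], "source": ["print(a + b)"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
})


def notebook_to_script(raw):
    """Join the source of every code cell, top to bottom, into one script."""
    notebook = json.loads(raw)
    chunks = []
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            # "source" is stored as a list of lines or as a single string.
            source = cell["source"]
            chunks.append(source if isinstance(source, str) else "".join(source))
    return "\n\n".join(chunks) + "\n"


script = notebook_to_script(notebook_json)
print(script)
```

The result is a sequential file: Markdown cells and stored outputs are dropped, and the code cells appear in their top-to-bottom order regardless of the order in which they were run.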

Cells

All right, some other things that are important. In the Cell section of our menu, you can run your cells through this menu too. As I said, you normally run a cell with Shift+Enter, but you can also choose to run it by saying Run Cells. It says "cells" because it runs the cells that are selected, so if I select multiple, it runs all of them. You can choose to run the selected cells and the ones below, or you can run the cells and then select the one below, or you can run the cell and insert a new one below it. And if you have a long notebook with a lot of cells and you don't want to go Shift+Enter, Shift+Enter the whole time, you can just say Run All, and it will run all of it for you. You can also choose to run only the cells that are above the one you selected, or only the ones below.

Another thing that is useful is Cell Type. You can see it here, but you also have an option for it here. As you can see, there is a drop-down list. It says Code; you can also choose Markdown and some other options. I'm going to choose Markdown right now to show you. So I'll just select this cell. Let's say I'm writing a comment for this code, and I will say "from now on I am going into data cleaning". If the type of this cell is Code, then it will be seen as a comment. But you can also make it a Markdown cell, and then it will be formatted as Markdown, and once you press Shift+Enter it becomes a heading. If you know a little bit of Markdown, you might know that with only one hash you get a heading 1, a big heading, and the more hashes you add, the smaller the heading gets.

You can also add a list, for example list item one, list item two, and add some explanations. If you are doing some data cleaning or some data exploration, you can add information such as "there are 540 missing data points", for example, and anything else that you want to share with the person you are presenting this notebook to. I will not go into the details of how Markdown works, but if you want to learn more about it, you can just google "markdown", and there are a lot of nice resources and guides showing you how to use it.

A couple of things I found useful that I want to show you: you can add links to other things. Let's say I want to add a link to photos of cats and dogs. I will google cats, take this link, and paste it here, and now this will be a clickable link once I run the cell. As you can see, very nice. If someone clicks on it, they'll end up at this page. Another nice thing is that you can create sections for your notebook, with links that jump to them within the notebook itself, but I will show you that in the other notebook that I have. So that's coming up.
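The Markdown pieces mentioned in this section (headings, a list, a link) can be written out as plain text. The sketch below builds that text in Python so the syntax is visible; `heading` and `link` are hypothetical helpers, and the example.com address is a placeholder rather than the page opened in the video.

```python
def heading(text, level=1):
    """Markdown heading: one '#' is the biggest, more '#' means smaller."""
    return "#" * level + " " + text


def link(label, url):
    """Markdown link: label in square brackets, URL in parentheses."""
    return f"[{label}]({url})"


# The text you would type into a cell whose type is set to Markdown.
markdown_cell = "\n".join([
    heading("Data cleaning"),
    "",
    "- list item 1",
    "- list item 2",
    "",
    "There are 540 missing data points.",
    "",
    link("cats and dogs", "https://example.com/cats-and-dogs"),
])
print(markdown_cell)
```

Pasting the printed text into a Markdown cell and pressing Shift+Enter should render a large heading, a two-item list, a sentence, and a clickable link.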

Kernel

Next setting: now we have Kernel. The kernel is the thing that your Jupyter notebook is running on. You might have a bunch of different kernels, and each of these kernels can have a different set of libraries already installed and a different set of settings already configured. So based on what you want, you might choose a different kernel. Right now I only have a Python 3 kernel and a virtual environment kernel that I set up. In the virtual environment kernel I have my deep learning libraries installed, whereas in my Python 3 kernel I only have the usual data science libraries installed, like pandas and scikit-learn.

Sometimes you might run into a problem: your cell might be taking a long time to run. What you can do, for example, is say Interrupt. If a cell has been running for a long time and you want to stop it because you found the problem that is causing it, you can say Interrupt. You can also click the button that is here. It will stop the running cell and interrupt the kernel, and you will have to rerun the cell.

What else you can do is restart the kernel. If you restart the kernel, nothing visible happens; it will just restart, so all of the code that you already ran no longer counts as run. If I run this cell again, it will say 1, because it is the first thing that is run, even though the old output is still visible. Another thing you can do is Restart and Clear Output; then you start with a clean slate. Or you can say Restart and Run All, so you restart the kernel and then run everything. Well, I apparently removed the a. All right. You can also shut the kernel down, or you can reconnect. So basically these options give you the chance to restart the kernel that this notebook is running on, in case you run into a problem or something is running for a very long time.

And finally, this section here. I already showed you some of these things, but these are for easily navigating from one place to another. If you want to move one of your cells, you can move it up or down. We talked about adding and cutting. If you want to run a cell, this is one of the options you can use, and if a cell is running for a very long time, you can use interrupt or rerun here. And if you want to save your notebook, which you should do once in a while so you don't lose your changes, you can either click this little button or press Ctrl+S as always, and that will save your latest updates to the notebook.

All right, so let me show you a more established Jupyter notebook, to see how everything works in the real world when you're doing an actual project. This is a notebook that I used before. It is a project where we try to analyze, and build a model on, data that was collected on the taxis of New York City.

The first thing I want to show you here is that, as I mentioned, you can create clickable links that send you to different sections of your notebook. You can see I have a table of contents here, and if I click one of the entries, it sends me to the relevant section immediately. If I click "back to top", it sends me back to the table of contents. How we do that is very similar to how we did the clickable links with URLs. You do the same thing with the brackets that you did with the URL, but instead you put the ID of the section that you want to go to, with a hash at the beginning. And you create the ID of a section by doing this: you basically assign an ID to it. I do that for each of the sections. For the first one I add the ID "imports", so if I run this cell and that cell and then click the import libraries entry, it immediately sends me to this section. And for going back to the top, I link back to the table of contents section.

But apart from all the structuring, Jupyter notebooks are basically like any other Python script. The only difference, as I said, is that you can run your code in separate chunks. So you can import your libraries here, import your data set, and visualize your data set using pandas. It is actually very easy to visualize data sets, especially when you read them into a pandas data frame: you can show them as a table. Even if you have a gigantic table, it is very easy to inspect using the head function of pandas, which shows only the first five rows. This makes notebooks very easy to follow even for someone who doesn't have any experience with code. If you want to present something to your manager or your boss, if you want to present your findings on a piece of data, this is a very nice way of putting it in front of them.

Another nice thing about Jupyter notebooks is having inline visualizations. With a Python script, you will probably have to run the whole thing without any errors to be able to see the visualizations, whereas with Jupyter notebooks you can show the visualizations in between the code. So I have some code up to here: I imported libraries, imported my data set, and showed my data set in a table format. And now I am showing it in a visualization: the histograms of my different columns. If I want to see whether there are any outliers in my data set, I can easily do that here. These little things add a lot of value and a lot of flexibility to Jupyter notebooks for data scientists.

One other nice trick, on top of all of this, is that you can install libraries without having to leave Jupyter Notebook. The way you normally install libraries is pip install. Let's say I want to install pandas: normally you have to run this command in a terminal. But if you want to install it without leaving Jupyter Notebook, all you have to do is add an exclamation point at the beginning of the line and run it. Because I already have pandas on my computer, it tells me the requirement is already satisfied, but if you want to install something, this is the way to do it in Jupyter Notebook.

And that's it about Jupyter notebooks. This is all the basic functionality you need to know to start using them. They are very versatile, very flexible, and very easy to use, even for beginners, and that's why data scientists love them. But what do you think about Jupyter notebooks? We would love to hear that in the comment section below. Thanks for watching this video. I hope you enjoyed it. If you liked it, don't forget to give us a like and maybe even subscribe to be one of the first people to know when we publish a new video. If you have any questions or comments, we would love to hear them in the comment section. And once again, don't forget to go grab your free API token for AssemblyAI using the link in the description. Thanks for watching again, and I'll see you in the next video.
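Going back to the table of contents trick from this segment, the linking scheme can be sketched as text generation. The transcript does not show the exact markup used in the video, so the `<a id="..."></a>` anchor below is an assumption, `toc_entry` and `section_header` are hypothetical helpers, and every section apart from "imports" is invented for illustration.

```python
def toc_entry(title, anchor_id):
    """Table-of-contents line: same brackets as a URL link, but '#id' as target."""
    return f"- [{title}](#{anchor_id})"


def section_header(title, anchor_id, level=2):
    """Markdown cell text for a section heading that carries an HTML anchor."""
    return f'<a id="{anchor_id}"></a>\n' + "#" * level + " " + title


# Only the "imports" ID comes from the video; the rest are made up.
sections = [
    ("Import libraries", "imports"),
    ("Load data", "load-data"),
    ("Histograms", "histograms"),
]

# First Markdown cell: the table of contents, itself anchored as "toc".
toc_cell = "\n".join(
    [section_header("Table of contents", "toc", level=1)]
    + [toc_entry(title, anchor) for title, anchor in sections]
)

# One Markdown cell per section, each ending with a link back to the top.
section_cells = [
    section_header(title, anchor) + "\n\n[Back to top](#toc)"
    for title, anchor in sections
]

print(toc_cell)
print()
print(section_cells[0])
```

Each printed block is meant to be pasted into its own Markdown cell; clicking an entry in the rendered table of contents should then jump to the cell carrying the matching ID.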
