# Partial Dependence Plot in Knime

## Metadata

- **Channel:** Saqib Ali
- **YouTube:** https://www.youtube.com/watch?v=1qt3qPxFddA
- **Date:** 19.12.2019
- **Duration:** 10:52
- **Views:** 441
- **Source:** https://ekstraktznaniy.ru/video/46018

## Description

An overview of the Partial Dependence Plot in Knime. The Partial Dependence Plot helps improve the interpretability of machine learning models.

## Transcript

### <Untitled Chapter 1> []

Hello, this is Saqib, and in this video we will take a look at the partial

### Partial Dependency Plot [0:05]

dependency plot node in KNIME. This node was recently introduced in [unclear] to help with the interpretability of machine learning models. A partial dependency plot shows you how a model behaves when the value of a continuous predictor changes within a predefined domain. This is a really great tool for understanding how your model will behave with new data. As a data scientist, in addition to having a good, accurate model, you should also be able to explain how your model is working, and this is the right tool for that.

Let's take a look at the output of the model. Let's open the interactive view, which will launch a Chrome browser. It takes a little bit of time because, I think, it is rendering SVG. Once that is done you'll see all the plots, and there you go: we have our first plot. Take a look at the two axes: the bottom axis is your continuous predictor, and the y-axis is your class probabilities. In our case we are analyzing a subscriber data set, and we want to predict whether a subscriber is going to churn or not, so we have the class probabilities for churn equals true.

Looking at this plot, one thing really stands out: as the number of day minutes increases, that is, the number of daytime minutes the subscriber consumes per day over the billing period, the class probability for churn equals true goes up. For example, if the subscriber has 300 minutes, the class probability is 0.6. Depending on how you have defined a cut-off, this could possibly be churn equals true. So that's how day minutes show up in the partial dependency plot.

Similarly, you can look at the other continuous variables. Let's take a look at evening minutes and see whether there's a trend there as well. For evening minutes I don't see much of a trend. It does go up; maybe these are just people who talk a lot, whether it's daytime or evening. But maybe the daytime is more important in this model because daytime minutes may not be free whereas evening minutes are free, something like that. It would be good to explore that.

Similarly, let's take a look at account length, which is the number of days the subscriber has been active. Here we see that it's pretty much a flat line. It moves a little over here, but it's a flat line, indicating that this particular model does not behave any differently when the account length changes. The class probabilities remain pretty much the same, around, I would say, 0.03, so there's not much change here. These plots really help you understand how your continuous predictors contribute to how your model behaves. You need to understand that, and these plots are key to it.

That was the output from one of the models, one of the neural network models. Let's take a look at the output from another neural network model. Again we launch the interactive view, which will launch a Chrome browser, and let's see if we have the same trends as we saw in the other model. For day minutes, again we see an upward trend, which plateaus off, but I think overall you see the same pattern: as the number of minutes increases, the class probability of churn equals true goes up. It doesn't go up the way the other one did, but there is definitely a trend, so I think this is something worth exploring.

Similarly, let's see if account length has any bearing on how the model behaves. It does: in this model, the longer you have stayed with the phone company, the lower the class probability of churn equals true. So if you have stayed there for a long time, you might just stay for additional time. These are really good plots to explore. Again, you want to get an understanding of how your model behaves with different data, and these are the plots that help you do that.

Now let's take a look at how to get to these plots. It's really straightforward. The way this works is that you are cleaning your data, creating your training data set and your testing data set, building your models over here, and then using the predictor nodes to predict the values. To build a partial dependency plot, you use the partial dependency plot node and, in addition to that, the partial dependency pre-processing node. This node is available on my hub, so you can download it and use it as is in your workflow. It's a metanode. Basically, instead of sending your testing data directly to the predictor, you pass it through the partial dependency pre-processing node, which creates a sample for your predictor.

If you take a look at the configuration for the partial dependency pre-processing node, you'll notice that these are all the continuous variables that you want to plot in your partial dependency plots. Note that I did not include longitude and latitude, which don't make sense to plot, in this case at least. We have all the other continuous variables in this pre-processing node. One thing to note is that, for this node, all of these have to be double. It will not take integer or long or float; it has to be double. So just make sure that all your continuous variables are double. Once you have that set up, you pass it into the predictor node, and the output of the predictor node goes into the partial

### Partial Dependency Plot Node [8:32]

dependency plot node. Let's take a

### Configuration for the Partial Dependency Plot Node [8:36]

look at the configuration for the partial dependency plot node. Again, in this node we want to include all the continuous variables that we want to plot. Right here at the very bottom is your class probability. In our case we want to see whether it is a churn or not, so we set it to the probability of churn equals true. These probabilities come from your predictor node. If you don't see the class probabilities, you probably don't have the class probability check mark set in your predictor nodes. To fix that, go into the predictor configuration and check the option to append columns with normalized class distributions. Similarly, over here we want to make sure that the class probabilities are included, right here, so that looks good.

Once you have configured your partial dependency plot node, you can just run it and get the output. This is a really good tool for exploring how your model will behave with new data. It also shows you how changing the value of a continuous predictor changes the behavior of the model and, essentially, its prediction. Hopefully you will take a look at this node and find it useful. If you have any comments, just leave them on this video. Thank you, bye-bye.
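
For readers who want to see the idea behind the plots outside of KNIME, here is a minimal Python sketch of how a partial dependence curve can be computed. It is not the implementation used by the KNIME nodes. The model `toy_churn_model`, its coefficients, the three sample rows, and the helper names `partial_dependence` and `linspace` are all hypothetical and exist only for illustration. The sketch forces one continuous predictor to each value in a grid, leaves the other columns as they are, and averages the predicted probability of churn equals true over all rows.

```python
import math


def toy_churn_model(row):
    """Hypothetical stand-in for a trained classifier: returns P(churn = true).

    The coefficients are made up for illustration. Note that this toy model
    deliberately ignores account_length.
    """
    score = -6.0 + 0.02 * row["day_minutes"] + 0.002 * row["eve_minutes"]
    return 1.0 / (1.0 + math.exp(-score))


def partial_dependence(model, rows, feature, grid):
    """Average prediction over all rows when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)              # copy so the original row is untouched
            modified[feature] = float(value)  # override only the predictor of interest
            total += model(modified)
        curve.append((float(value), total / len(rows)))
    return curve


def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


# Hypothetical test rows; all continuous predictors are stored as floats (doubles).
rows = [
    {"day_minutes": 120.0, "eve_minutes": 200.0, "account_length": 80.0},
    {"day_minutes": 250.0, "eve_minutes": 150.0, "account_length": 120.0},
    {"day_minutes": 310.0, "eve_minutes": 220.0, "account_length": 40.0},
]

day_curve = partial_dependence(toy_churn_model, rows, "day_minutes", linspace(0.0, 350.0, 8))
length_curve = partial_dependence(toy_churn_model, rows, "account_length", linspace(0.0, 200.0, 5))

for value, prob in day_curve:
    print(f"day_minutes={value:6.1f}  mean P(churn=true)={prob:.3f}")
```

With this toy model, the curve for `day_minutes` rises as the minutes increase, while the curve for `account_length` is flat because the model never reads that column. That mirrors the two shapes discussed for the first model in the video: an upward trend for day minutes and a flat line for account length.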