# 9.8 Poisson Regression in R: Fitting a Model To Count Data in R

## Metadata

- **Channel:** MarinStatsLectures-R Programming & Statistics
- **YouTube:** https://www.youtube.com/watch?v=S7MkI6M4suc

## Contents

### [0:00](https://www.youtube.com/watch?v=S7MkI6M4suc) Segment 1 (00:00 - 05:00)

In this video we're going to look at fitting a Poisson regression model to the medical expenditures data using R. I've already imported this data and attached it. Recall that this data is measured on the individual level, that is, we've recorded the number of times the event has occurred for each individual, and it's count data, meaning everyone has been followed for exactly one year, so we can look at the number of occurrences rather than having to work with the rate.

For the sake of jumping into things, we're not going to define a research question right now. We're just going to look at how to fit a model in R, how to get the output, and how to interpret it and work with it. We've already talked a lot about model building and variable selection procedures in regression models, and here is no different.

To do so, I'm going to use a Y, or outcome, variable of `ofp`. This is the number of visits to a physician in a year, and recall everyone has been followed for exactly one year. We're going to use X variables of `health`, the self-perceived health status, recorded as average, excellent, or poor, average being the reference category; `gender`, recorded as male or female, females being the reference; and `school`, the number of years of education.

So let's look at fitting a Poisson regression model. I'm going to save it in an object called Poisson model one. To do so, we make use of the `glm` command, as we did with logistic regression, except here we specify that the family is Poisson, and this is the way of letting R know we want to fit a Poisson regression model. You'll recall that there are lots of generalized linear models, logistic being one, Poisson being another. The rest of the syntax for the model is the same as it was for all of the regression models that we've looked at: we're going to estimate `ofp` as a function of `health`, `gender`, and `school`.

So let's submit that here and look at a summary of the model output. We can see, similar to logistic regression, we get the null deviance, which is the same concept as the total sum of squares, and the residual deviance, which is the same concept as the sum of squared errors, or the unexplained deviance. We also get the AIC, and then we can see up here the model coefficients. This is all as it was before: the model's intercept, the coefficient for excellent health, the coefficient for poor health (recall average health is the reference, so that's part of the intercept), the coefficient for male (females are the reference), and the coefficient for the number of years of schooling.

In a separate video we're going to take this model output and work with it. We're going to calculate rate ratios from this model, and we're going to learn how to use this model to make a prediction for what we'd estimate the number of visits to a physician would be in a year, given an individual's health status, gender, and years of school.

Before doing that, I want to look at a few more things in R. First, just to recall, we can exponentiate the coefficients to get rate ratios, and in a separate video I'll go over and explain why these exponentiated coefficients give us rate ratios. It's the exact same reasoning that it was for logistic regression, where we got odds ratios, but we'll go over that explanation again in a separate video.

Now I just want to add a quick note that, say for the `health` variable, the reference category is average health, and I actually think that this is a good category to use as the reference, because we can look at how the number of visits changes if health is excellent or if health is poor, relative to average. But I do want to look at how we can change the reference category, and in my example I'll look at how we can change the reference to be poor, mostly for the sake of demonstrating how to change the reference category.

So what I'm going to do is create a new variable. I'm going to call it `health2`, and I'm going to make poor health be the reference. First let's just take a look at the `health` variable: if we look at a table, we can see that average is the first category, or what's going to be the reference. Now I'm going to create a second variable, `health2`. I'm going to re-level the `health` variable, and I'm going to make the reference be poor. I just want to point out we could re-level the `health` variable itself and not create a new one. The reason I'm creating a new variable is that through the rest of these videos I want to work with the `health` variable as it is, with average being the reference, so I'm going to create a new variable that has poor as the reference. So let's submit that command here. We can then look at a table of `health2`, and we see that for this variable poor comes first, or poor is now the reference, so that looks like what we wanted.

Now I just want to add a quick note, before looking at a model, that there are other ways you can do this that are less proper but still achieve your goal. Before reading the data into R, if you have it saved in, say, an Excel format, you can relabel the categories using a find and replace. For example, we could label the first one as "aa poor", the second one as "bb average", and then "cc excellent", and that's going to have them be ordered poor, then average, then excellent. If you recall, the categories get ordered alphanumerically. So this is sort of an ugly way to do it, but it gets the job done. An important point is that when you're working in R, only you are looking at your output, so if you want to do things in this more hacky way, that's completely fine.

So now let's just look at fitting the model using the `health2` variable.
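
The steps described in this segment can be sketched in R roughly as follows. This is a minimal sketch, not the lecture's script: the data frame `med` is simulated with made-up values so the code runs on its own, the object names `med`, `poisson_model1`, and `rate_ratios` are assumptions, and `data = med` is used here in place of attaching the data as the transcript does. Only the column names (`ofp`, `health`, `gender`, `school`, `health2`) follow the transcript.

```r
# Simulated stand-in for the medical expenditures data (hypothetical values,
# NOT the lecture's data set); only the column names follow the transcript.
set.seed(1)
n <- 500
med <- data.frame(
  health = factor(sample(c("average", "excellent", "poor"), n,
                         replace = TRUE, prob = c(0.7, 0.1, 0.2))),
  gender = factor(sample(c("female", "male"), n, replace = TRUE)),
  school = sample(0:18, n, replace = TRUE)
)

# Assumed log-rate, used only to generate plausible counts for the sketch
log_mu <- 1.5 - 0.4 * (med$health == "excellent") +
  0.3 * (med$health == "poor") -
  0.1 * (med$gender == "male") + 0.02 * med$school
med$ofp <- rpois(n, lambda = exp(log_mu))

# Fit the Poisson regression; family = poisson is what tells glm()
# to fit a Poisson model (with the default log link)
poisson_model1 <- glm(ofp ~ health + gender + school,
                      family = poisson, data = med)

# Null deviance, residual deviance, AIC, and the coefficient table
summary(poisson_model1)

# Exponentiated coefficients give rate ratios
rate_ratios <- exp(coef(poisson_model1))
rate_ratios

# Factor levels are ordered alphanumerically, so "average" comes first
# and is the reference category
table(med$health)

# Create a second variable with "poor" as the reference,
# leaving the original health variable unchanged
med$health2 <- relevel(med$health, ref = "poor")
table(med$health2)
```

With the real data already imported, only the `glm()`, `summary()`, `exp(coef())`, `table()`, and `relevel()` lines would be needed.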

### [5:00](https://www.youtube.com/watch?v=S7MkI6M4suc&t=300s) Segment 2 (05:00 - 05:00)

This is the reparameterized version, so I'm going to call it Poisson model 2. I'm going to fit that here and look at a summary of it, and we can see with the model coefficients we now have a coefficient for how the number of visits changes for average health relative to poor, and for excellent health relative to poor, poor being the reference. Through the rest of these videos we're going to work with average being the reference, as I do think that makes a bit more sense, but I just wanted to show how we could change the reference category if we wanted to.
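
A short sketch of the refit with `health2`, again on simulated data (hypothetical values; the names `med`, `poisson_model1`, and `poisson_model2` are assumptions, not the lecture's script). It also checks that changing the reference category only reparameterizes the model: the fit is the same, and the coefficients of one model can be recovered from the other.

```r
# Same simulated stand-in data as before (hypothetical values)
set.seed(1)
n <- 500
med <- data.frame(
  health = factor(sample(c("average", "excellent", "poor"), n,
                         replace = TRUE, prob = c(0.7, 0.1, 0.2))),
  gender = factor(sample(c("female", "male"), n, replace = TRUE)),
  school = sample(0:18, n, replace = TRUE)
)
log_mu <- 1.5 - 0.4 * (med$health == "excellent") +
  0.3 * (med$health == "poor") -
  0.1 * (med$gender == "male") + 0.02 * med$school
med$ofp <- rpois(n, lambda = exp(log_mu))
med$health2 <- relevel(med$health, ref = "poor")

# Model 1: average health is the reference
poisson_model1 <- glm(ofp ~ health + gender + school,
                      family = poisson, data = med)

# Model 2: poor health is the reference
poisson_model2 <- glm(ofp ~ health2 + gender + school,
                      family = poisson, data = med)
summary(poisson_model2)

# Same fit: the residual deviances agree up to numerical tolerance
same_fit <- isTRUE(all.equal(deviance(poisson_model1),
                             deviance(poisson_model2),
                             tolerance = 1e-6))

# "average vs poor" is the negative of "poor vs average"
b1 <- coef(poisson_model1)
b2 <- coef(poisson_model2)
flipped <- isTRUE(all.equal(unname(b2["health2average"]),
                            unname(-b1["healthpoor"]),
                            tolerance = 1e-6))

# "excellent vs poor" is "excellent vs average" minus "poor vs average"
combined <- isTRUE(all.equal(unname(b2["health2excellent"]),
                             unname(b1["healthexcellent"] - b1["healthpoor"]),
                             tolerance = 1e-6))

c(same_fit = same_fit, flipped = flipped, combined = combined)
```

Because the two models describe the same fit, the choice of reference category only affects which comparisons appear directly in the coefficient table.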

---
*Source: https://ekstraktznaniy.ru/video/44710*