class: center, middle, inverse, title-slide

# Multinomial Regression
### Yue Jiang
### STA 210 / Duke University / Spring 2024

---

### Alligator diets

<img src="img/alligator.jpg" width="80%" style="display: block; margin: auto;" />

---

### Alligator diets

<img src="multinomial_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" />

---

### Multinomial data

Here our outcome is categorical but *not* ordered, and it has more than two 
categories, so we can't directly use an ordinary (binary) logistic regression.

.question[
Suppose you wanted to predict the preferred diet of an alligator based on its
sex and its length. How might you do this?
]

---

### Just a bunch of logistic regressions

Our outcome `\(Y\)` has `\(J\)` total categories. We might intuitively choose 
one of these categories to be the reference category, and then compare each of 
the other categories against it, one pair at a time, with logistic regressions.

Suppose category 1 is the reference category. Then we will fit each of the 
following models for `\(j = 2, \cdots, J\)`:

`\begin{align*}
\log\left(\frac{P(Y_i = j)}{P(Y_i = 1)}\right) = \beta_{0;j} + \beta_{1;j}x_{i1} + \cdots + \beta_{p;j}x_{ip}
\end{align*}`

.question[
Take a look at this model - what does each term represent? Why do we only fit
`\(J - 1\)` separate models?
]

---

### Fitting a multinomial regression model

```r
library(nnet)
m1 <- multinom(food ~ sex + length, data = gators)
```

```
## # weights:  12 (6 variable)
## initial  value 64.818125 
## iter  10 value 48.651915
## final  value 48.651292 
## converged
```

---

### Fitting a multinomial regression model

```r
summary(m1)
```

```
## Call:
## multinom(formula = food ~ sex + length, data = gators)
## 
## Coefficients:
##              (Intercept)       sexM     length
## invertibrate    4.070252  0.2313371 -2.4218022
## other          -1.543849 -0.7175895  0.2512553
## 
## Std. Errors:
##              (Intercept)      sexM    length
## invertibrate    1.476859 0.6746705 0.8283374
## other           1.313245 0.8485217 0.5486893
## 
## Residual Deviance: 97.30258 
## AIC: 109.3026
```

---

### Fitting a multinomial regression model

```r
exp(coef(m1))
```

```
##              (Intercept)     sexM     length
## invertibrate  58.5717389 1.260284 0.08876151
## other          0.2135574 0.487927 1.28563828
```

---

### Fitting a multinomial regression model

```r
head(round(fitted(m1), 3))
```

```
##    fish invertibrate other
## 1 0.238        0.692 0.069
## 2 0.262        0.660 0.078
## 3 0.232        0.735 0.033
## 4 0.240        0.725 0.035
## 5 0.271        0.649 0.081
## 6 0.305        0.602 0.093
```

---

### An important assumption

One important assumption in the multinomial model is that the odds of being in
category A versus category B shouldn't depend on whether category C is 
available as an option (often called *independence of irrelevant alternatives*).

.question[
What does this mean in our alligator diet example?
]
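
---

### An important assumption

To make the assumption concrete, here is a minimal sketch in base R that turns the coefficients printed by `summary(m1)` into predicted probabilities by hand, then checks that the odds of invertebrates versus fish stay the same when `other` is removed as an option. The alligator used here (male, length 2 in the data's units) is a hypothetical example, not a row of `gators`, and the object names `beta`, `x`, `eta`, and `p` are illustrative only.

```r
# Coefficients copied from the summary(m1) output; "fish" is the reference
# category, so its linear predictor is fixed at 0.
beta <- rbind(
  invertibrate = c(4.070252, 0.2313371, -2.4218022),
  other        = c(-1.543849, -0.7175895, 0.2512553)
)
colnames(beta) <- c("(Intercept)", "sexM", "length")

# Hypothetical alligator (an assumption for illustration): male, length 2
x <- c(1, 1, 2)

# Log-odds of each category versus fish
eta <- c(fish = 0, drop(beta %*% x))

# Convert the log-odds into probabilities that sum to 1
p <- exp(eta) / sum(exp(eta))
round(p, 3)

# Odds of invertebrates versus fish with all three categories available
odds_full <- p["invertibrate"] / p["fish"]

# Drop "other" from the choice set and renormalise the two remaining categories
p_two <- p[c("fish", "invertibrate")] / sum(p[c("fish", "invertibrate")])
odds_two <- p_two["invertibrate"] / p_two["fish"]

# Under the model, the two odds agree
all.equal(unname(odds_full), unname(odds_two))
```

Because each log-odds is defined relative to the reference category alone, the ratio `p["invertibrate"] / p["fish"]` equals `exp(eta["invertibrate"])` no matter what other categories are in the model. Whether that is realistic for real alligators is exactly what the question above asks.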