class: center, middle, inverse, title-slide

# Multiple Comparisons
### Dr. Maria Tackett
### 09.18.19

---

class: middle, center

### [Click for PDF of slides](06-mult-comparisons.pdf)

---

### Announcements

- HW 01 **due TODAY at 11:59p**

- Reading 03 for Monday

- HW 02 due Wednesday, 9/25 at 11:59p

---

### Today's Agenda

- Multiple comparisons

- Introducing multiple linear regression

---

## R Packages used in the notes

```r
library(tidyverse)
library(knitr)
library(broom)
```

---

class: middle, center

## Multiple Comparisons

---

### After ANOVA: Individual Group Means

- Suppose you conduct an ANOVA and conclude that at least one group has a different mean response value.

- The next question you want to answer is **which group?**

--

- One way to answer this question is to compare the estimated means for each group, accounting for the random variability we'd naturally expect

--

- Since we've assumed the variance is the same for all groups, we can use a pooled standard error with `\(n-K\)` degrees of freedom to calculate the confidence interval for each group mean

.alert[
`$$\bar{y}_i \pm t^* \times \frac{s_P}{\sqrt{n_i}}$$`

where `\(s_P\)` is the pooled standard deviation
]

---

### After ANOVA: Difference in Means

- We can also estimate the difference in two means, `\(\mu_1-\mu_2\)`, for each pair of groups

.alert[
`$$(\bar{y}_1-\bar{y}_2) \pm t^* \times s_P\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$$`

where `\(s_P\)` is the pooled standard deviation
]

- If we have `\(K\)` groups, we will make `\({K \choose 2} = K(K-1)/2\)` such comparisons
    - Ex: If we have 6 groups, we'll make `\({6 \choose 2} = 6(6-1)/2 = 15\)` comparisons

---

### Multiple Comparisons

- When making multiple comparisons, there is a higher chance that a Type I error will occur, e.g.
concluding that there is a significant difference between two groups even when there is not

- **At a Minimum**: When calculating multiple confidence intervals or conducting multiple hypothesis tests to compare means, you should clearly state how many CIs and/or tests you computed.

- **Good practice**: Account for the number of comparisons being made in the analysis

- We will discuss one method: <font class = "vocab">Bonferroni correction</font>

---

### Confidence levels

- <font class="vocab">Individual confidence level: </font> success rate of a procedure for calculating a <u>single</u> confidence interval

--

- <font class="vocab">Familywise confidence level: </font> success rate of a procedure for calculating a <u>family</u> of confidence intervals
    + "success": all intervals in the family capture their parameters

--

- <font class="vocab">Issue: </font>There is an increased chance of making at least one error when calculating multiple confidence intervals
    + The same is true when conducting multiple hypothesis tests

---

### Bonferroni correction

- <font class="vocab">Goal: </font> Achieve at least `\(100(1-\alpha)\)`% familywise confidence level for `\(\mathcal{C}\)` confidence intervals
    - Where `\(\alpha\)` is the significance level for the corresponding two-sided hypothesis test

--

- Calculate each of the `\(\mathcal{C}\)` confidence intervals at a `\(100(1-\frac{\alpha}{\mathcal{C}})\)`% confidence level
    - When there are `\(K\)` groups, there are `\(\mathcal{C}=\frac{K(K-1)}{2}\)` pairs of means being compared

--

- **Notes**:
    + The exact familywise confidence level is not easily predictable. This partially depends on the level of dependence between the intervals.
    + Bonferroni correction is sometimes too conservative, i.e., you don't reject `\(H_0\)` as often as you should

---

### Population Density in the Midwest

- There are 5 groups (states) in the `midwest` data, so we will do `\({5 \choose 2} = 10\)` comparisons.
- If we want a familywise confidence level of 95%, then we should use a `\((1 - 0.05/10)\times 100 = 99.5\)`% confidence level for each pairwise comparison

---

### Pairwise CI

.small[
```r
library(pairwiseCI)
pairwiseCI(log_popdensity ~ state, data = midwest,
           method = "Param.diff", conf.level = 0.995, var.equal = TRUE)
```

```
## 
## 99.5 %-confidence intervals 
##  Method:  Difference of means assuming Normal distribution and equal variances 
##   
##   
##       estimate   lower   upper
## IN-IL   0.4089  0.0213  0.7966
## MI-IL   0.0315 -0.4564  0.5194
## OH-IL   0.8237  0.4050  1.2424
## WI-IL  -0.1959 -0.6745  0.2827
## MI-IN  -0.3774 -0.8457  0.0909
## OH-IN   0.4148  0.0246  0.8049
## WI-IN  -0.6048 -1.0547 -0.1550
## OH-MI   0.7922  0.2903  1.2940
## WI-MI  -0.2274 -0.7987  0.3438
## WI-OH  -1.0196 -1.5070 -0.5322
##   
## 
```
]

---

### Pairwise CI

| estimate|  lower|  upper|comparison |
|--------:|------:|------:|:----------|
|    0.409|  0.021|  0.797|IN-IL      |
|    0.032| -0.456|  0.519|MI-IL      |
|    0.824|  0.405|  1.242|OH-IL      |
|   -0.196| -0.674|  0.283|WI-IL      |
|   -0.377| -0.846|  0.091|MI-IN      |
|    0.415|  0.025|  0.805|OH-IN      |
|   -0.605| -1.055| -0.155|WI-IN      |
|    0.792|  0.290|  1.294|OH-MI      |
|   -0.227| -0.799|  0.344|WI-MI      |
|   -1.020| -1.507| -0.532|WI-OH      |
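
---

### Pairwise CI: Reading the Intervals

The Bonferroni arithmetic and the reading of the intervals can be sketched in base R. This is only an illustration, not the code used to build these slides: the data frame `ci` and its column `excludes_zero` are names assumed here, and the interval endpoints are retyped from the rounded table above rather than recomputed from `midwest`.

```r
# Sketch only: `ci` and `excludes_zero` are illustrative names, and the
# endpoints are copied from the rounded table, not recomputed from the data.
K <- 5                                    # number of groups (states)
n_comparisons <- choose(K, 2)             # K(K-1)/2 pairwise comparisons
alpha <- 0.05                             # familywise significance level
conf_level <- 1 - alpha / n_comparisons   # Bonferroni level for each interval

ci <- data.frame(
  comparison = c("IN-IL", "MI-IL", "OH-IL", "WI-IL", "MI-IN",
                 "OH-IN", "WI-IN", "OH-MI", "WI-MI", "WI-OH"),
  lower = c(0.021, -0.456, 0.405, -0.674, -0.846,
            0.025, -1.055, 0.290, -0.799, -1.507),
  upper = c(0.797, 0.519, 1.242, 0.283, 0.091,
            0.805, -0.155, 1.294, 0.344, -0.532)
)

# An interval that excludes 0 suggests the two group means differ
ci$excludes_zero <- ci$lower > 0 | ci$upper < 0

n_comparisons
conf_level
ci[ci$excludes_zero, ]
```

With these inputs, `n_comparisons` is 10 and `conf_level` is 0.995, matching the `conf.level` passed to `pairwiseCI()`. The rows printed last are the comparisons whose 99.5% intervals do not contain 0.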