class: center, middle, inverse, title-slide

.title[
# Stratification in the Cox model
]
.author[
### Yue Jiang
]
.date[
### STA 490/690
]

---

### The Rossi et al. trial...

<!-- -->

.question[
What do you notice?
]

---

### Financial aid intervention

``` r
library(survival)   # Surv(), coxph(), cox.zph()
library(survminer)  # ggcoxzph()
m1 <- coxph(Surv(week, arrest) ~ fin + wexp, data = Rossi)
ggcoxzph(cox.zph(m1), var = "wexp")
```

<!-- -->

.question[
What might we do in the presence of non-proportional hazards?
]

---

### Stratification in the Cox model

Previously, we've seen how allowing time-varying coefficients might help address proportional hazards violations. We might also consider simply not requiring proportional hazards for those "difficult" covariates by estimating a different baseline hazard for each stratum:

`\begin{align*}
\lambda_{i, prior \, work=yes}(t) &= \lambda_{0, prior \, work = yes}(t)\exp(\mathbf{x}_i\boldsymbol\beta)\\
\lambda_{i, prior \, work=no}(t) &= \lambda_{0, prior \, work = no}(t)\exp(\mathbf{x}_i\boldsymbol\beta)
\end{align*}`

In this case, we are estimating a separate baseline hazard for each level of prior work experience, while the regression coefficients are shared across strata.

.question[
What might the partial likelihood look like for this *stratified* model? (would've been a good homework question, rats!)
]

---

### Stratification in the Cox model

``` r
m2 <- coxph(Surv(week, arrest) ~ fin + strata(wexp), data = Rossi)
summary(m2)
```

```
## Call:
## coxph(formula = Surv(week, arrest) ~ fin + strata(wexp), data = Rossi)
##
##   n= 432, number of events= 114
##
##           coef exp(coef) se(coef)      z Pr(>|z|)
## finyes -0.3781    0.6852   0.1897 -1.993   0.0463 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##        exp(coef) exp(-coef) lower .95 upper .95
## finyes    0.6852      1.459    0.4724    0.9938
##
## Concordance= 0.547  (se = 0.024 )
## Likelihood ratio test= 4.03  on 1 df,   p=0.04
## Wald test            = 3.97  on 1 df,   p=0.05
## Score (logrank) test = 4.02  on 1 df,   p=0.05
```

---

### Stratification in the Cox model

.question[
What do you notice?
What might be the potential drawbacks of stratification?

There's some evidence of non-proportional hazards for the financial aid treatment. What would happen if we were to stratify by this variable?
]

---

### Additional applications

``` r
bladder2[1:15,]
```

```
##    id rx number size start stop event enum
## 1   1  1      1    3     0    1     0    1
## 2   2  1      2    1     0    4     0    1
## 3   3  1      1    1     0    7     0    1
## 4   4  1      5    1     0   10     0    1
## 5   5  1      4    1     0    6     1    1
## 6   5  1      4    1     6   10     0    2
## 7   6  1      1    1     0   14     0    1
## 8   7  1      1    1     0   18     0    1
## 9   8  1      1    3     0    5     1    1
## 10  8  1      1    3     5   18     0    2
## 11  9  1      1    1     0   12     1    1
## 12  9  1      1    1    12   16     1    2
## 13  9  1      1    1    16   18     0    3
## 14 10  1      3    3     0   23     0    1
## 15 11  1      1    3     0   10     1    1
```

---

### The Andersen-Gill model

``` r
m3 <- coxph(Surv(start, stop, event) ~ rx, data = bladder2)
summary(m3)
```

```
## Call:
## coxph(formula = Surv(start, stop, event) ~ rx, data = bladder2)
##
##   n= 178, number of events= 112
##
##       coef exp(coef) se(coef)      z Pr(>|z|)
## rx -0.3733    0.6885   0.1976 -1.889   0.0589 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##    exp(coef) exp(-coef) lower .95 upper .95
## rx    0.6885      1.452    0.4674     1.014
##
## Concordance= 0.552  (se = 0.03 )
## Likelihood ratio test= 3.68  on 1 df,   p=0.06
## Wald test            = 3.57  on 1 df,   p=0.06
## Score (logrank) test = 3.61  on 1 df,   p=0.06
```

---

### Adjusting for number of prior events?

.question[
What is being implied by this model?
]

``` r
m4 <- coxph(Surv(start, stop, event) ~ rx + cluster(id) + strata(enum),
            data = bladder2)
summary(m4)
```

```
## Call:
## coxph(formula = Surv(start, stop, event) ~ rx + strata(enum),
##     data = bladder2, cluster = id)
##
##   n= 178, number of events= 112
##
##       coef exp(coef) se(coef) robust se      z Pr(>|z|)
## rx -0.2458    0.7821   0.2130    0.2095 -1.173    0.241
##
##    exp(coef) exp(-coef) lower .95 upper .95
## rx    0.7821      1.279    0.5187     1.179
##
## Concordance= 0.541  (se = 0.031 )
## Likelihood ratio test= 1.35  on 1 df,   p=0.2
## Wald test            = 1.38  on 1 df,   p=0.2
## Score (logrank) test = 1.34  on 1 df,   p=0.2,   Robust = 1.51  p=0.2
##
##   (Note: the likelihood ratio and score tests assume independence of
##      observations within a cluster, the Wald and robust score tests do not).
```

---

### Stratifying by event...

.question[
What model is this?
]

``` r
m5 <- coxph(Surv(start, stop, event) ~ rx + cluster(id) + strata(enum),
            data = bladder2)
summary(m5)
```

```
## Call:
## coxph(formula = Surv(start, stop, event) ~ rx + strata(enum),
##     data = bladder2, cluster = id)
##
##   n= 178, number of events= 112
##
##       coef exp(coef) se(coef) robust se      z Pr(>|z|)
## rx -0.2458    0.7821   0.2130    0.2095 -1.173    0.241
##
##    exp(coef) exp(-coef) lower .95 upper .95
## rx    0.7821      1.279    0.5187     1.179
##
## Concordance= 0.541  (se = 0.031 )
## Likelihood ratio test= 1.35  on 1 df,   p=0.2
## Wald test            = 1.38  on 1 df,   p=0.2
## Score (logrank) test = 1.34  on 1 df,   p=0.2,   Robust = 1.51  p=0.2
##
##   (Note: the likelihood ratio and score tests assume independence of
##      observations within a cluster, the Wald and robust score tests do not).
```

---

### Another method...

.question[
What model is this?
]

``` r
m6 <- coxph(Surv(rep(0, 178), stop-start, event) ~ rx + cluster(id) + strata(enum),
            data = bladder2)
summary(m6)
```

```
## Call:
## coxph(formula = Surv(rep(0, 178), stop - start, event) ~ rx +
##     strata(enum), data = bladder2, cluster = id)
##
##   n= 178, number of events= 112
##
##       coef exp(coef) se(coef) robust se      z Pr(>|z|)
## rx -0.1635    0.8492   0.2020    0.2194 -0.745    0.456
##
##    exp(coef) exp(-coef) lower .95 upper .95
## rx    0.8492      1.178    0.5524     1.305
##
## Concordance= 0.521  (se = 0.03 )
## Likelihood ratio test= 0.66  on 1 df,   p=0.4
## Wald test            = 0.56  on 1 df,   p=0.5
## Score (logrank) test = 0.66  on 1 df,   p=0.4,   Robust = 0.59  p=0.4
##
##   (Note: the likelihood ratio and score tests assume independence of
##      observations within a cluster, the Wald and robust score tests do not).
```
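
---

### Another method, sketched

The `m6` fit resets the clock after each event by modelling `stop - start`, with every interval starting at 0. Here is a minimal sketch of what should be an equivalent call that avoids the hard-coded `rep(0, 178)`. It assumes the `survival` package (which ships `bladder2`) is installed; the names `bladder_gap`, `gap`, and `m6_gap` are hypothetical and not part of the original analysis.

``` r
library(survival)

# Hypothetical helper column: time since the previous event
# (or since study entry for a subject's first interval)
bladder_gap <- transform(bladder2, gap = stop - start)

# Two-argument Surv(time, event) treats every interval as starting at 0,
# which is what Surv(rep(0, 178), stop - start, event) spells out by hand
m6_gap <- coxph(Surv(gap, event) ~ rx + cluster(id) + strata(enum),
                data = bladder_gap)

# Coefficient for rx; expected to agree with m6 up to numerical tolerance
coef(m6_gap)
```

Writing the gap as its own column keeps the formula independent of the number of rows, so the same call should still work if the data are subset.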