class: center, middle, inverse, title-slide

.title[
# Time-varying elements
]
.author[
### Yue Jiang
]
.date[
### STA 490/690
]

---

### Extending the Cox model

`\begin{align*}
\lambda_i(t) &= \lambda_0(t)\exp\left(\mathbf{x}_i\boldsymbol\beta \right)\\
\lambda_i(t) &= \lambda_0(t)\exp\left(\mathbf{x}_i(t)\boldsymbol\beta \right)?\\
\lambda_i(t) &= \lambda_0(t)\exp\left(\mathbf{x}_i\boldsymbol\beta(t) \right)?
\end{align*}`

.question[
What are we doing in each of these expressions?
]

---

### Schoenfeld residuals

<!-- plot: Schoenfeld residuals over time -->

.question[
Which of the extensions to the Cox model mentioned on the previous slide do you think is most relevant here? Why?
]

---

### Scaled Schoenfeld residuals

The vector of Schoenfeld residuals for individual `\(i\)` is given by the `\(i\)`th term of the first derivative of the log partial likelihood with respect to `\(\boldsymbol\beta\)`, evaluated at `\(\widehat{\boldsymbol\beta}\)`.

We can scale the vector of Schoenfeld residuals `\(\mathbf{r}_i\)` as

`\begin{align*}
\mathbf{r}^\star_i = D \times Var(\widehat{\boldsymbol\beta})\mathbf{r}_i,
\end{align*}`

where `\(D\)` is the number of observed events. These scaled residuals can be used to define a test of proportional hazards.

---

### The Grambsch-Therneau test

The expected value of the `\(i\)`th scaled Schoenfeld residual for `\(\beta_k\)` is approximately `\(\beta_k(t_i) - \widehat{\beta}_k\)`, where `\(\beta_k(t_i)\)` is the time-varying coefficient of `\(x_k\)` evaluated at the failure time `\(t_i\)` corresponding to individual `\(i\)` (note - these are *specifically* evaluated at failure times; censored observations do not contribute).

If proportional hazards is satisfied, then `\(r^\star_{ik} + \widehat{\beta}_k\)` should be constant over time. That is, we are testing whether there is a linear relationship between `\(E(r^\star_{ik})\)` and time. If `\(E(r^\star_{ik})\)` depends on time (i.e., is not constant), then proportional hazards is violated.

```r
cox.zph(coxm1)
```

```
##         chisq df       p
## TRTMT  39.827  1 2.8e-10
## EJFPER  0.716  1    0.40
## PREVMI  0.652  1    0.42
## GLOBAL 41.187  3 6.0e-09
```

---

### The Grambsch-Therneau test

```r
ggcoxzph(cox.zph(coxm1), var = "EJFPER", ylim = c(-2, 2))
```

<!-- plot: scaled Schoenfeld residuals for EJFPER over time -->

---

### The Grambsch-Therneau test

```r
ggcoxzph(cox.zph(coxm1), var = "TRTMT", ylim = c(-2, 2))
```

<!-- plot: scaled Schoenfeld residuals for TRTMT over time -->

---

### Fitting time-varying coefficients

Ok, so there may be a relationship (let's say a linear one) between `\(\beta_{TRTMT}\)` and time. Consider the following linear predictor incorporating this linear time dependence at time `\(t_j\)`:

`\begin{align*}
\eta_i &= TRT_i\beta_{TRT} + TRT_i\beta^\star t_j + \\
&\mathrel{\phantom{=}} EJF_i \beta_{EJF} + PMI_i \beta_{PMI},
\end{align*}`

where the contribution to the partial likelihood for individual `\(i\)` failing at time `\(t_j\)` is given by

`\begin{align*}
\frac{\exp(\eta_i)}{\sum_{k \in \mathcal{R}(t_j)}\exp(\eta_k)}.
\end{align*}`

---

### Fitting time-varying coefficients

.question[
Can we simply model this as an interaction with time? For instance, what do you think of the model implied by the code below (supposing we had a nice `time` variable)?
]

```r
coxm2 <- coxph(Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI + 
                 TRTMT * time, 
               data = dig)
```

---

### Fitting time-varying coefficients

Note - fitting time-varying coefficients like this is very computationally (and space) intensive if you have a lot of failure times (not worth getting into, but I'm happy to handwave an answer - ask again after we talk about how `R` handles time-varying covariates).
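
To get a feel for where that cost comes from, here is a rough sketch - an illustration of the idea only, not literally what `coxph()` does internally. A coefficient that interacts with time requires expanding the data so that each subject contributes a separate row for every failure time at which they are still at risk; splitting the DIG data at every observed failure time with `survSplit()` shows how quickly the rows pile up.

```r
# Rough sketch: expand the data into counting-process (start, stop] form,
# with one row per subject per failure time at which they remain at risk
library(survival)

event_times <- sort(unique(dig$DWHFDAYS[dig$DWHF == 1]))
dig_split <- survSplit(Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI,
                       data = dig, cut = event_times, id = "id")

nrow(dig)                                     # original number of rows
nrow(dig_split)                               # rows after splitting
format(object.size(dig_split), units = "MB")  # memory footprint
```

On an expanded dataset like this, a time-varying coefficient model is essentially just an ordinary Cox model with a treatment-by-time interaction - hence the size of the object below.

---

### Fitting time-varying coefficients
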
Fitting the code below actually creates an object that is almost half a gigabyte in size!

```r
coxm2 <- coxph(Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI + 
                 tt(TRTMT), 
               tt = function(x, t, ...) { x * t }, 
               data = dig)
summary(coxm2)
```

---

### Fitting time-varying coefficients

```r
coxm2 <- coxph(Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI + 
                 tt(TRTMT), 
               tt = function(x, t, ...) { x * t }, 
               data = dig)
summary(coxm2)
```

```
## Call:
## coxph(formula = Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI + 
##     tt(TRTMT), data = dig, tt = function(x, t, ...) {
##     x * t
## })
## 
##   n= 6799, number of events= 2332 
##    (1 observation deleted due to missingness)
## 
##                 coef exp(coef)  se(coef)       z Pr(>|z|)    
## TRTMT     -0.5722515 0.5642536 0.0652410  -8.771  < 2e-16 ***
## EJFPER    -0.0379380 0.9627726 0.0023806 -15.936  < 2e-16 ***
## PREVMI     0.0497461 1.0510042 0.0436404   1.140    0.254    
## tt(TRTMT)  0.0006029 1.0006031 0.0001060   5.686  1.3e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##           exp(coef) exp(-coef) lower .95 upper .95
## TRTMT        0.5643     1.7723    0.4965    0.6412
## EJFPER       0.9628     1.0387    0.9583    0.9673
## PREVMI       1.0510     0.9515    0.9648    1.1449
## tt(TRTMT)    1.0006     0.9994    1.0004    1.0008
## 
## Concordance= 0.609  (se = 0.006 )
## Likelihood ratio test= 337.2  on 4 df,   p=<2e-16
## Wald test            = 333.7  on 4 df,   p=<2e-16
## Score (logrank) test = 338.6  on 4 df,   p=<2e-16
```

---

### Fitting time-varying coefficients

You can make the time-transform function `tt` follow any functional form you like; in this case, for instance, a log relationship or the inclusion of a quadratic term might make more sense.

With that said, for more complicated forms, numerical optimization becomes really horrible, especially for "non-small" datasets like this one.

---

### Onto time-dependent *covariates*

.question[
Actually, I'll let you handle this one. How might you (reasonably) incorporate time-varying *covariates* into the Cox model? Keep in mind the contribution of the individual `\(i\)` failing at an observed failure time `\(t_j\)` to the partial likelihood:

`\begin{align*}
\frac{\exp(\mathbf{x}_i\boldsymbol\beta)}{\sum_{k \in \mathcal{R}(t_j)}\exp(\mathbf{x}_k\boldsymbol\beta)}
\end{align*}`

and the fact that the partial likelihood is the product of all such contributions across failure times.
]

(we'll spend a lot of time in class on Thursday tackling a real-world dataset - please bring your laptops (or whatever you use for data analysis))
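
---

### Onto time-dependent *covariates*

As a hedged preview (a minimal sketch with made-up toy data, not the in-class solution): one standard way to handle time-varying covariates in `R` is the counting-process, or `(start, stop]`, data layout, where each subject contributes one row per interval over which their covariate values are constant. The variable names below (`id`, `tstart`, `tstop`, `event`, `dose`) are purely illustrative.

```r
# Minimal sketch of the counting-process / (start, stop] data layout:
# one row per subject per interval of constant covariate values
library(survival)

tv_dat <- data.frame(
  id     = c(1, 1, 2, 3, 3),
  tstart = c(0, 30, 0, 0, 45),      # interval start
  tstop  = c(30, 90, 60, 45, 100),  # interval end
  event  = c(0, 1, 1, 0, 0),        # did the event occur at tstop?
  dose   = c(0.125, 0.25, 0.125, 0, 0.125)
)

# At each failure time, the risk set consists of the intervals covering that
# time, so each at-risk subject contributes their *current* covariate value
coxph(Surv(tstart, tstop, event) ~ dose, data = tv_dat)
```

Note that this only changes the bookkeeping: the contribution at each failure time is still the expression above, just with `\(\mathbf{x}_k\)` replaced by `\(\mathbf{x}_k(t_j)\)`.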