class: center, middle, inverse, title-slide .title[ # Parametric PH Models ] .author[ ### Yue Jiang ] .date[ ### STA 490/690 ] --- ### An AFT model ```r library(survival) aft_w <- survreg(Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI, data = dig, dist = "lognormal") summary(aft_w) ``` ``` ## ## Call: ## survreg(formula = Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI, ## data = dig, dist = "lognormal") ## Value Std. Error z p ## (Intercept) 6.06283 0.12297 49.30 < 2e-16 ## TRTMT 0.55329 0.06741 8.21 2.2e-16 ## EJFPER 0.05932 0.00385 15.42 < 2e-16 ## PREVMI -0.11991 0.07059 -1.70 0.089 ## Log(scale) 0.82037 0.01636 50.14 < 2e-16 ## ## Scale= 2.27 ## ## Log Normal distribution ## Loglik(model)= -20343.2 Loglik(intercept only)= -20501 ## Chisq= 315.72 on 3 degrees of freedom, p= 3.9e-68 ## Number of Newton-Raphson Iterations: 3 ## n=6799 (1 observation deleted due to missingness) ``` --- ### Visualizing the AFT model <img src="ph-1_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" /> --- ### A similar model for hazards? <img src="ph-1_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" /> --- ### A proportional hazards model .question[ How might we relate hazards to have a proportional relationship given a linear predictor? What do you think about the model below, where `\(\lambda_0\)` is the .vocab[baseline hazard], the hazard for an individual with `\(\mathbf{x}_i = \mathbf{0}\)`.? `\begin{align*} \lambda(t) &= \lambda_0(t)(\beta_1x_1 + \beta_2x_2 + \cdots + \beta_px_p) \end{align*}` ] -- .question[ What about the following? Any issues below? `\begin{align*} \lambda(t) &= \lambda_0(t)\exp\left(\beta_0 + \beta_1x_1 + \beta_2x_2 + \cdots + \beta_px_p \right) \end{align*}` ] --- ### A proportional hazards model `\begin{align*} \lambda(t) &= \lambda_0(t)\exp\left(\beta_1x_1 + \beta_2x_2 + \cdots + \beta_px_p \right) \end{align*}` A .vocab[proportional hazards] model is a linear model for the log hazard ratio (though *technically* you could technically set up other models that satisfy proportional hazards - what might some other candidate functions look like? Why doesn't anyone do that?). .question[ How might we interpret `\(\beta\)` terms in such a model? ] --- ### A proportional hazards model `\begin{align*} \lambda(t) &= \lambda_0(t)\exp\left(\beta_1x_1 + \beta_2x_2 + \cdots + \beta_px_p \right) \end{align*}` Covariates in a PH model have a multiplicative effect on *hazards*. For instance, if `\(\beta_k = 0.4\)`, then `\(\exp(\beta_k) \approx 1.5\)`. Holding all else equal, an individual with covariate `\(x_k\)` one unit greater than another is expected to have approximately 1.5 times the *hazard* of the event than the other. This multiplicative effect holds *regardless* of the time `\(t\)`. .question[ Would you rather have a higher linear predictor or a lower linear predictor in a PH model? How does this compare to the AFT model? ] --- ### A proportional hazards model <img src="ph-1_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" /> --- ### Likelihood derivations The proportional hazard model for the model above assumes an exponential distribution with parameter `\(\lambda\)` (sorry again about notation). In particular, `\begin{align*} \lambda_i(t) &= \lambda\exp\left(\beta_1x_{1i} + \beta_2x_{2i} + \cdots + \beta_px_{pi} \right) \end{align*}` .question[ What is the distribution function for individual `\(i\)`? How about the survival function? ] --- ### Likelihood derivation Individual `\(i\)` also has an exponentially-distributed failure time (why?); this time, with parameter `\(\lambda\exp(\mathbf{x}_i\boldsymbol\beta)\)`. Thus `\begin{align*} \lambda_i(t) &= \lambda\exp(\mathbf{x}_i\boldsymbol\beta) \end{align*}` where `\(\mathbf{x}_i\boldsymbol\beta = \beta_1x_{1i} + \cdots + \beta_px_{pi}\)`. It's sometimes easier to rewrite and think of the linear predictor using `\(\lambda = \exp(\beta_0)\)` so that `\begin{align*} \lambda_i(t) &= \exp(\mathbf{x}_i\boldsymbol\beta)\\ f_i(t) &= \exp(\mathbf{x}_i\boldsymbol\beta)\exp(-\exp(\mathbf{x}_i\boldsymbol\beta)t)\\ S_i(t) &= \exp(-\exp(\mathbf{x}_i\boldsymbol\beta)t) \end{align*}` where `\(\mathbf{x}_i\boldsymbol\beta = \beta_0 + \beta_1x_{1i} + \cdots + \beta_px_{pi}\)` and `\(\beta_0 = \log(\lambda)\)`. --- ### Likelihood derivation .question[ What does the likelihood function for `\(\boldsymbol\beta = (\beta_0, \beta_1, \cdots, \beta_p)^T\)` look like for independently right-censored data with no truncation? How about the log-likelihood? ] --- ### Likelihood derivation `\begin{align*} \mathcal{L}(\boldsymbol\beta | Y, \mathbf{X}) &= \prod_{i = 1}^n f_i(t_i)^{\delta_i}S_i(t_i)^{1 - \delta_i}\\ &= \prod_{i = 1}^n \lambda_i(t_i)^{\delta_i}S_i(t_i)\\ &= \prod_{i = 1}^n \exp(\mathbf{x}_i\boldsymbol\beta)^{\delta_i}\Big(\exp(-\exp(\mathbf{x}_i\boldsymbol\beta)t_i) \Big)\\ \log\mathcal{L}(\boldsymbol\beta | Y, \mathbf{X}) &= \sum_{i = 1}^n \delta_i(\mathbf{x}_i\boldsymbol\beta) - \exp(\mathbf{x}_i\boldsymbol\beta)t_i \end{align*}` --- ### Fitting a PH model ```r library(eha) ph_exp <- phreg(Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI, data = dig, dist = "weibull", shape = 1) ph_exp ``` ``` ## Call: ## phreg(formula = Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI, ## data = dig, dist = "weibull", shape = 1) ## ## Covariate W.mean Coef Exp(Coef) se(Coef) Wald p ## TRTMT 0.518 -0.299 0.742 0.042 0.000 ## EJFPER 29.455 -0.040 0.960 0.002 0.000 ## PREVMI 0.645 0.050 1.051 0.044 0.254 ## ## log(scale) 6.629 0.075 0.000 ## ## Shape is fixed at 1 ## ## Events 2332 ## Total time at risk 6092212 ## Max. log. likelihood -20512 ## LR test statistic 337.51 ## Degrees of freedom 3 ## Overall p-value 0 ``` .vocab[ How might we interpret the coefficient estimates here? ] --- ### Comparing to the AFT model ```r library(survival) aft_e <- survreg(Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI, data = dig, dist = "exponential") summary(aft_e) ``` ``` ## ## Call: ## survreg(formula = Surv(DWHFDAYS, DWHF) ~ TRTMT + EJFPER + PREVMI, ## data = dig, dist = "exponential") ## Value Std. Error z p ## (Intercept) 6.62850 0.07495 88.43 < 2e-16 ## TRTMT 0.29856 0.04167 7.17 7.7e-13 ## EJFPER 0.04037 0.00239 16.92 < 2e-16 ## PREVMI -0.04978 0.04364 -1.14 0.25 ## ## Scale fixed at 1 ## ## Exponential distribution ## Loglik(model)= -20511.5 Loglik(intercept only)= -20680.3 ## Chisq= 337.51 on 3 degrees of freedom, p= 7.6e-73 ## Number of Newton-Raphson Iterations: 5 ## n=6799 (1 observation deleted due to missingness) ```