Survivorship and hazard

class: center, middle, inverse, title-slide

.title[
# Survivorship and hazard
]
.author[
### Yue Jiang
]
.date[
### STA 490 / STA 690
]

---

### Representing survival data

Underlying data:
- `\(T\)`: Failure time, a non-negative random variable
- `\(C\)`: Censoring time, a non-negative random variable

Observed data for individual `\(i\)`:
- `\(Y_i\)`: `\((T_i \wedge C_i)\)`, the minimum of `\(T_i\)` and `\(C_i\)`
- `\(\delta_i\)`: `\(1_{(T_i \le C_i)}\)`, whether we observe a failure

**Our goal is to make inferential statements about** `\(T\)`.
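
As a minimal sketch of how the observed pair `\((Y_i, \delta_i)\)` arises from the latent pair `\((T_i, C_i)\)`, the snippet below simulates a few individuals. The exponential distributions, their rates, the seed, and the helper name `observe` are all assumptions made for illustration; nothing in these slides fixes them.

```python
import random

def observe(failure_times, censoring_times):
    """Return (Y_i, delta_i) pairs built from latent (T_i, C_i) pairs."""
    observed = []
    for t, c in zip(failure_times, censoring_times):
        y = min(t, c)                # Y_i is the minimum of T_i and C_i
        delta = 1 if t <= c else 0   # delta_i = 1 when the failure is observed
        observed.append((y, delta))
    return observed

rng = random.Random(490)  # arbitrary seed, only for reproducibility
n = 5
T = [rng.expovariate(1.0) for _ in range(n)]  # hypothetical failure times
C = [rng.expovariate(0.5) for _ in range(n)]  # hypothetical censoring times

data = observe(T, C)
for y, d in data:
    print(f"Y = {y:.3f}, delta = {d}")
```

In practice only `data` would be available to the analyst; `T` and `C` are never both seen for the same individual.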

---

### Survival function

The .vocab[survival function] is given by `\(S(t) = P(T > t)\)` and has the following properties:

- `\(S(0) = 1\)`
- `\(\lim_{t \to \infty} S(t) = 0\)`
- It is non-increasing: `\(S(t_2) \le S(t_1)\)` for `\(t_2 \ge t_1\)`

.question[
What do these properties mean in plain English?
]

---

### Survival function

The survival function is simply the complement of the distribution function:

`\begin{align*}
F(t) = P(T \le t) = 1 - S(t)
\end{align*}`

Suppose `\(T\)` is absolutely continuous. The density function `\(f(t)\)` is related to the distribution function `\(F(t)\)` by

`\begin{align*}
f(t) &= \lim_{dt \to 0^+} \frac{P(t\le T < t + dt)}{dt}\\
&= \frac{dF(t)}{dt} 
\end{align*}`

Or equivalently,

`\begin{align*}
F(t) = \int_0^t f(u)du.
\end{align*}`
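
The two relations above can be checked numerically. The sketch below assumes a Weibull distribution with arbitrarily chosen shape and scale (not something these slides specify) and compares a small-`dt` difference quotient of `\(F\)` with `\(f\)`, and a trapezoidal integral of `\(f\)` with `\(F\)`.

```python
import math

SHAPE, SCALE = 1.5, 2.0  # assumed Weibull parameters, for illustration only

def survival(t):
    # S(t) = P(T > t)
    return math.exp(-((t / SCALE) ** SHAPE))

def cdf(t):
    # F(t) = 1 - S(t)
    return 1.0 - survival(t)

def density(t):
    z = t / SCALE
    return (SHAPE / SCALE) * z ** (SHAPE - 1) * math.exp(-(z ** SHAPE))

def difference_quotient(t, dt=1e-6):
    # P(t <= T < t + dt) / dt for a small, positive dt
    return (cdf(t + dt) - cdf(t)) / dt

def integrate_density(t, steps=20000):
    # Trapezoidal approximation of the integral of f from 0 to t
    h = t / steps
    total = 0.5 * (density(0.0) + density(t))
    for i in range(1, steps):
        total += density(i * h)
    return total * h

for t in (0.5, 1.0, 3.0):
    print(t, difference_quotient(t), density(t), integrate_density(t), cdf(t))
```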

---

### Hazard function

The .vocab[hazard function] is given by

`\begin{align*}
\lambda(t) = \lim_{dt \to 0^+} \frac{P(t \le T < t + dt | T \ge t)}{dt}
\end{align*}`

Note that this is **not** a probability (for continuous `\(T\)`); it is a rate, and it can be unbounded.

.question[
- What does the hazard function represent in plain English? 
- Can you give an example of something with increasing / decreasing hazard?
- Why might we want to think in terms of hazards for interpretability reasons?
]

---

### Cumulative hazard

Just as the distribution function accumulates the density,
the .vocab[cumulative hazard] accumulates the hazard:

`\begin{align*}
\Lambda(t) = \int_0^t \lambda(u)du
\end{align*}`
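
As a rough numerical illustration of this integral, the sketch below accumulates a hypothetical, linearly increasing hazard with the trapezoidal rule and compares the result with the antiderivative worked out by hand. The hazard `0.1 + 0.5u` is an assumption chosen only because it is easy to integrate.

```python
def hazard(u):
    # Hypothetical linearly increasing hazard, chosen only for illustration
    return 0.1 + 0.5 * u

def cumulative_hazard(t, steps=1000):
    # Trapezoidal approximation of the integral of the hazard from 0 to t
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * h)
    return total * h

def closed_form(t):
    # Antiderivative of 0.1 + 0.5u, evaluated between 0 and t
    return 0.1 * t + 0.25 * t ** 2

for t in (1.0, 2.0, 3.0):
    print(t, cumulative_hazard(t), closed_form(t))
```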

.question[
- Intuitively, what is `\(\Lambda(0)\)`?
- Must `\(\Lambda(t)\)` be non-decreasing? Explain.
- What is `\(\lim_{t\to \infty} \Lambda(t)\)`? Explain.
]

---

### Hazard and survival

`\begin{align*}
\lambda(t) &= \lim_{dt \to 0^+} \frac{P(t \le T < t + dt | T \ge t)}{dt} \\
&= \lim_{dt \to 0^+} \frac{P(t \le T < t + dt, T \ge t)/P(T \ge t)}{dt} \\
&= \lim_{dt \to 0^+} \frac{P(t \le T < t + dt)/P(T \ge t)}{dt}\\
&= \frac{f(t)}{S(t)}
\end{align*}`
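
The last line of this derivation can be sanity-checked numerically. The sketch below assumes a Weibull distribution with illustrative parameters and compares the conditional-probability definition (with a small `dt`) against the ratio `\(f(t)/S(t)\)`. It relies on `\(P(T \ge t) = S(t)\)`, which holds for continuous `\(T\)`.

```python
import math

SHAPE, SCALE = 2.0, 1.0  # assumed Weibull parameters, for illustration only

def survival(t):
    # S(t) = P(T > t)
    return math.exp(-((t / SCALE) ** SHAPE))

def density(t):
    z = t / SCALE
    return (SHAPE / SCALE) * z ** (SHAPE - 1) * math.exp(-(z ** SHAPE))

def hazard_from_ratio(t):
    # f(t) / S(t)
    return density(t) / survival(t)

def hazard_from_definition(t, dt=1e-6):
    # P(t <= T < t + dt | T >= t) / dt, using P(T >= t) = S(t) for continuous T
    return (survival(t) - survival(t + dt)) / survival(t) / dt

for t in (0.5, 1.0, 1.5):
    print(t, hazard_from_definition(t), hazard_from_ratio(t))
```

With these parameters the value at `\(t = 1.5\)` is above 1, which is consistent with the hazard being a rate rather than a probability.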

.question[
Show that `\(\Lambda(t) = -\log(S(t))\)`.

As a hint, use the chain rule to express `\(\lambda(t)\)` as a function of `\(S(t)\)`, then integrate both sides from `\(0\)` to `\(t\)`.
]

---

### Hazard and survival

.question[
Consider a distribution with *constant* hazard, such that `\(\lambda(t) = c\)` for all times `\(t\)`.

- What is the density function associated with this hazard? 
- Can you think of a real-world example of such a situation?
]

---

### A potential issue

.question[
What is wrong with the general statement `\(f(t) = \frac{dF(t)}{dt}\)` for all distributions? (We also glossed over this when relating `\(\Lambda(t)\)` to `\(S(t)\)`.)
]

---

### Being more correct

Let `\(F(t)\)` be a non-decreasing càdlàg function with countably many jumps at `\(t_1, t_2, \ldots\)`.

---

### Being more correct

Define `\(\Delta F(t_j) = F(t_j) - F(t_j^-) > 0\)`. Then we can "always" write

`\begin{align*}
F(t) = \int_0^t f(u)du + \sum_{j: t_j \le t}\Delta F(t_j),
\end{align*}`

regardless of whether there are discontinuity points in `\((0, t]\)`.
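
One way to make this decomposition concrete is a small mixed distribution. The sketch below is built on assumptions: a continuous part `\(f(u) = 0.7e^{-u}\)` (which integrates to 0.7, not 1) and two hypothetical jumps, of size 0.2 at `\(t_1 = 1\)` and 0.1 at `\(t_2 = 2\)`. It evaluates both pieces of `\(F(t)\)` and recovers each `\(\Delta F(t_j)\)` from an approximate left limit.

```python
import math

CONT_WEIGHT = 0.7                # total mass of the continuous part (assumed)
JUMPS = {1.0: 0.2, 2.0: 0.1}     # hypothetical jump locations t_j and sizes

def continuous_part(t):
    # Integral from 0 to t of 0.7 * exp(-u) du, in closed form
    return CONT_WEIGHT * (1.0 - math.exp(-t))

def jump_part(t):
    # Sum of the jump sizes over all t_j <= t
    return sum(mass for t_j, mass in JUMPS.items() if t_j <= t)

def F(t):
    return continuous_part(t) + jump_part(t)

def jump_size(t, eps=1e-9):
    # Approximates F(t) - F(t^-) using a point just to the left of t
    return F(t) - F(t - eps)

for t in (0.5, 1.0, 1.5, 2.0):
    print(t, continuous_part(t), jump_part(t), F(t), jump_size(t))
```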

.question[
What does the above expression mean in plain English?
]