Likelihoods for arbitrary censorship/truncation

class: center, middle, inverse, title-slide

.title[
# Likelihoods for arbitrary censorship/truncation
]
.author[
### Yue Jiang
]
.date[
### Duke University
]

---

### Estimation for censored data

For independently right-censored data, an individual's contribution to the likelihood is

`\begin{align*}
&\mathrel{\phantom{=}} f(t_i)^{\delta_i}S(t_i)^{1 - \delta_i}\\
&= \lambda(t_i)^{\delta_i}S(t_i)^{\delta_i}S(t_i)^{1 - \delta_i}\\
&= \lambda(t_i)^{\delta_i}S(t_i)
\end{align*}`

And so the full likelihood, assuming *i.i.d.* failure distributions, is

`\begin{align*}
\prod_{i = 1}^n f(t_i)^{\delta_i}S(t_i)^{1 - \delta_i} = \prod_{i = 1}^n \lambda(t_i)^{\delta_i}S(t_i)
\end{align*}`

This likelihood may be maximized either parametrically or non-parametrically.
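
---

### Estimation for censored data

As a minimal parametric sketch, suppose (purely as an illustrative assumption, not something stated on these slides) that failure times are exponential with rate `lam`, so that `\(\lambda(t) = \lambda\)` and `\(S(t) = e^{-\lambda t}\)`. The log of the likelihood above then has a closed-form maximizer. The function names and data below are hypothetical.

```python
import math

def right_censored_loglik(lam, times, deltas):
    """Sum over i of delta_i * log(lambda(t_i)) + log(S(t_i)), exponential model."""
    # Assumed model: constant hazard lam, survival S(t) = exp(-lam * t).
    return sum(d * math.log(lam) - lam * t for t, d in zip(times, deltas))

def right_censored_mle(times, deltas):
    """Closed-form maximizer: observed failures divided by total follow-up time."""
    return sum(deltas) / sum(times)

# Hypothetical data: delta = 1 is an observed failure, delta = 0 is right-censored.
times = [2.0, 3.5, 1.2, 4.0, 0.7]
deltas = [1, 0, 1, 0, 1]
lam_hat = right_censored_mle(times, deltas)  # 3 failures / 11.4 time units
```

Setting the derivative of the log-likelihood to zero gives `\(\hat{\lambda} = \sum_i \delta_i / \sum_i t_i\)`, which is what `right_censored_mle` computes.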

---

### Estimation for censored data

What about left-censored data?

.question[
- What does it mean for an observation to be left-censored?
- What does an *observed* failure time contribute to the likelihood?
- What does a *censored* failure time contribute to the likelihood?
]

<br>

.question[
- What is the full likelihood contribution of an arbitrary individual for i.i.d., independently left-censored data?
]

---

### Estimation for censored data

For independently left-censored data, an individual's contribution to the likelihood is

`\begin{align*}
&\mathrel{\phantom{=}} f(t_i)^{\delta_i}F(t_i)^{1 - \delta_i}
\end{align*}`

And so the full likelihood, assuming *i.i.d.* failure distributions, is

`\begin{align*}
\prod_{i = 1}^n f(t_i)^{\delta_i}F(t_i)^{1 - \delta_i}
\end{align*}`
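
---

### Estimation for censored data

A small sketch of the left-censored likelihood, again assuming an exponential model with rate `lam` (an assumption made only for illustration). Here `\(F(t) = 1 - e^{-\lambda t}\)` has no closed-form maximizer, so the example uses a crude grid search; the function name, data, and grid are all hypothetical.

```python
import math

def left_censored_loglik(lam, times, deltas):
    """Sum over i of delta_i * log f(t_i) + (1 - delta_i) * log F(t_i)."""
    total = 0.0
    for t, d in zip(times, deltas):
        if d == 1:
            total += math.log(lam) - lam * t           # log f(t) for an observed failure
        else:
            total += math.log(-math.expm1(-lam * t))   # log F(t) = log(1 - exp(-lam * t))
    return total

# Hypothetical data: delta = 1 is an observed failure, delta = 0 is left-censored.
times = [2.0, 3.5, 1.2, 4.0, 0.7]
deltas = [1, 0, 1, 0, 1]

# Crude grid search over candidate rates; a real analysis would use a proper optimizer.
grid = [k / 1000 for k in range(1, 5001)]
lam_hat = max(grid, key=lambda lam: left_censored_loglik(lam, times, deltas))
```

The only change from the right-censored case is that censored observations contribute `\(F(t_i)\)` rather than `\(S(t_i)\)`.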

---

### Estimation for interval-censored data

.question[
Using arguments similar to those we've seen before, what about *interval*-censored data, where observations are censored in the interval `\((t_{L_i}, t_{U_i})\)`, under the same assumptions?
]

---

### Estimation for censored data

`\begin{align*}
\prod_{i = 1}^n f(t_i)^{\delta_i}\left(F(t_{U_i}) - F(t_{L_i})\right)^{1 - \delta_i}
\end{align*}`
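
---

### Estimation for censored data

The interval-censored likelihood can be sketched the same way. The example assumes an exponential model with rate `lam` (an illustrative assumption), stores exact failure times and censoring intervals in two separate lists, and uses hypothetical names throughout. It also shows that left and right censoring fall out as special cases of the interval `\((t_{L_i}, t_{U_i})\)`.

```python
import math

def interval_censored_loglik(lam, exact_times, intervals):
    """Observed failures contribute log f(t); censored ones log(F(t_U) - F(t_L))."""
    ll = sum(math.log(lam) - lam * t for t in exact_times)
    for lo, hi in intervals:
        # For the exponential model, F(hi) - F(lo) = exp(-lam * lo) - exp(-lam * hi).
        ll += math.log(math.exp(-lam * lo) - math.exp(-lam * hi))
    return ll

lam = 0.5
# Left censoring at t is the interval (0, t): the contribution is log F(t).
left_case = interval_censored_loglik(lam, [], [(0.0, 2.0)])
# Right censoring at t is the interval (t, infinity): the contribution is log S(t).
right_case = interval_censored_loglik(lam, [], [(2.0, math.inf)])
```

With `hi = math.inf`, the term `math.exp(-lam * hi)` evaluates to `0.0` for any positive `lam`, so the interval form reduces to the survival function.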

---

### Back to truncation

.question[
What is the difference between truncation and censoring?

- What might be a real-world example of left-truncated data?
- What might be a real-world example of right-truncated data?
- What might be a real-world example of *interval*-truncated data?
]

---

### Truncated likelihood contributions

Consider left truncation vs. left censoring. Remember, in left censoring, we know that an individual exists and that their failure occurred at some time `\(T < t\)`. With left-truncated data, however, we never observe such individuals at all.

.question[
Suppose observations are left-truncated at time `\(u\)`. What would be the likelihood contributions of *observed failures* at time `\(t_i\)`?
]

---

### Truncated likelihood contributions

We *know* that the failure time `\(T\)` has to be greater than the truncation time `\(u\)` for left truncated data, and observe a failure at time `\(t_i\)`.

The individual likelihood contribution is:

`\begin{align*}
f(t_i | T > u) = \frac{f(t_i)}{S(u)}
\end{align*}`

---

### Truncated likelihood contributions

Similarly, for right truncation at time `\(w\)`, the likelihood contribution is:

`\begin{align*}
f(t_i | T < w) = \frac{f(t_i)}{F(w)}
\end{align*}`

---

### Truncated likelihood contributions

And finally, for interval truncated data between times `\(u\)` and `\(w\)`, the likelihood contribution is:

`\begin{align*}
f(t_i | u < T < w) = \frac{f(t_i)}{F(w) - F(u)}
\end{align*}`
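
---

### Truncated likelihood contributions

One way to sanity-check these contributions is to confirm numerically that each conditional density integrates to 1 over the truncation window. The sketch below assumes an exponential model with rate `lam` (an illustrative assumption) and uses a simple midpoint rule; `truncated_density` and `midpoint_integral` are hypothetical helper names.

```python
import math

def truncated_density(t, lam, u=0.0, w=math.inf):
    """f(t) / (F(w) - F(u)) for u < t < w, and 0 outside the truncation window."""
    if not (u < t < w):
        return 0.0
    f = lam * math.exp(-lam * t)
    mass = math.exp(-lam * u) - math.exp(-lam * w)  # F(w) - F(u) for the exponential
    return f / mass

def midpoint_integral(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over (a, b)."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

lam, u, w = 0.5, 1.0, 6.0
# The interval-truncated density should integrate to approximately 1 over (u, w).
area = midpoint_integral(lambda t: truncated_density(t, lam, u, w), u, w)
```

The defaults `u=0.0` and `w=math.inf` recover the untruncated density; setting only `u` gives left truncation, `\(f(t_i)/S(u)\)`, and setting only `w` gives right truncation, `\(f(t_i)/F(w)\)`.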

---

### Right censored, left truncated data?

.question[
Suppose you have a population of (potentially) right-censored data with left truncation. What would the individual likelihood contributions look like? How about the full likelihood to maximize, assuming *i.i.d.* failure times, independent censoring, and a common known truncation time?
]

---

### Right censored, left truncated data?

`\begin{align*}
&\mathrel{\phantom{=}}\prod_{i = 1}^n f(t_i | T > u)^{\delta_i}S(t_i | T > u)^{1 - \delta_i} \\ 
&= \prod_{i = 1}^n \left(\frac{f(t_i)}{S(u)}\right)^{\delta_i}\left(\frac{S(t_i)}{S(u)}\right)^{1 - \delta_i}\\
&= \prod_{i \in obs} \frac{f(t_i)}{S(u)}\prod_{j \in cens} \frac{S(t_j)}{S(u)}
\end{align*}`
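
---

### Right censored, left truncated data?

As a worked sketch under an assumed exponential model with rate `lam` (chosen only for illustration), every individual's contribution is divided by `\(S(u) = e^{-\lambda u}\)`, so the log-likelihood becomes `\(\sum_i \delta_i \log\lambda - \lambda \sum_i (t_i - u)\)`. The function names and data below are hypothetical.

```python
import math

def lt_rc_loglik(lam, times, deltas, u):
    """Log-likelihood for right-censored data left-truncated at a common time u."""
    ll = 0.0
    log_S_u = -lam * u  # log S(u) under the assumed exponential model
    for t, d in zip(times, deltas):
        if d == 1:
            ll += (math.log(lam) - lam * t) - log_S_u  # log of f(t_i) / S(u)
        else:
            ll += (-lam * t) - log_S_u                 # log of S(t_i) / S(u)
    return ll

def lt_rc_mle(times, deltas, u):
    """Closed-form maximizer: failures divided by follow-up time accrued after u."""
    return sum(deltas) / sum(t - u for t in times)

# Hypothetical data; every recorded time exceeds the truncation time u.
u = 0.5
times = [2.0, 3.5, 1.2, 4.0, 0.7]
deltas = [1, 0, 1, 0, 1]
lam_hat = lt_rc_mle(times, deltas, u)  # 3 failures / 8.9 time units
```

Under this model, left truncation simply removes the time before `\(u\)` from each individual's follow-up, which reflects the memoryless property of the exponential distribution and need not hold for other models.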

---

### Arbitrary combinations

.question[
What might the full likelihood look like for arbitrary combinations of censoring and truncation of various types?
]

We might think of breaking the population down into the following four groups. Assuming independence, we could simply multiply their likelihood contributions:

- untruncated and uncensored
- untruncated and censored
- truncated and uncensored
- truncated and censored
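
---

### Arbitrary combinations

One possible way to organize such a likelihood in code is sketched below. It is not necessarily the formulation intended here: it assumes an exponential model with rate `lam`, independent individuals, and censoring intervals that lie inside each individual's truncation window. The `Record` class and its field names are hypothetical. Each contribution is a numerator (density, or probability of the censoring interval) divided by the probability of the truncation window.

```python
import math
from dataclasses import dataclass

@dataclass
class Record:
    lo: float                    # exact failure time if lo == hi, else lower censoring bound
    hi: float                    # upper censoring bound (math.inf for right censoring)
    trunc_lo: float = 0.0        # left truncation time (0 means none)
    trunc_hi: float = math.inf   # right truncation time (infinity means none)

def exp_cdf(t, lam):
    """F(t) = 1 - exp(-lam * t) for the assumed exponential model."""
    return -math.expm1(-lam * t)

def loglik(lam, records):
    total = 0.0
    for r in records:
        if r.lo == r.hi:
            num = lam * math.exp(-lam * r.lo)              # uncensored: f(t)
        else:
            num = exp_cdf(r.hi, lam) - exp_cdf(r.lo, lam)  # censored: F(hi) - F(lo)
        den = exp_cdf(r.trunc_hi, lam) - exp_cdf(r.trunc_lo, lam)  # equals 1 if untruncated
        total += math.log(num) - math.log(den)
    return total

records = [
    Record(2.0, 2.0),                      # untruncated and uncensored
    Record(3.0, math.inf),                 # untruncated and right-censored
    Record(2.0, 2.0, trunc_lo=1.0),        # left-truncated and uncensored
    Record(3.0, math.inf, trunc_lo=1.0),   # left-truncated and right-censored
]
value = loglik(0.5, records)
```

The four example records correspond to the four groups listed above; left and interval censoring, and right or interval truncation, are obtained by changing the bounds.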