class: center, middle, inverse, title-slide

.title[
# Standard error estimation (one-sample)
]
.author[
### Yue Jiang
]
.date[
### STA 490 / STA 690
]

---

### Review: Kaplan-Meier estimate

Recall that the .vocab[risk set] `\(Y_j\)` at time `\(t_j\)` is the set of individuals who have not yet failed or been censored by that time. Let `\(N_j\)` be the number of individuals in the risk set (the size of the risk set). If `\(d_j\)` is the number of individuals who fail at that time, then the .vocab[Kaplan-Meier] estimate is given by:

`\begin{align*}
\widehat{S}(t) = \prod_{j: t_j \le t} \left\{1 - \frac{d_j}{N_j} \right\},
\end{align*}`

where `\(t_j\)` are the unique death times `\(\le t\)`.

---

### Review (?)

In general, for a random variable `\(X\)` and some function `\(g(\cdot)\)`, `\(E(g(X)) \neq g(E(X))\)`.

.question[
But when might `\(E(g(X))\)` be *"close"* to `\(g(E(X))\)`, perhaps close enough to "comfortably" say `\(E(g(X)) \approx g(E(X))\)`?

...what do we even mean by "close"?
]

---

### Caveat (only for STA 690)

We're going to be doing some pretty uncomfortable things when it comes to assumptions about "independence." In fact, many quantities in the later slides aren't independent (even though we say they are), but the real reason why what we do is ok is beyond the scope of this course.

---

### A first order approximation

Assume some regularity conditions hold (let's say all moments exist and `\(g\)` is real analytic, though we certainly don't need anything as strong as this).

.question[
What is the first-order Taylor expansion for a (sufficiently nicely behaved) *function* of a random variable, `\(g(X)\)`, around the random variable's expectation `\(E(X)\)`?

**Hint**: remember, `\(E(X)\)` itself is a value, not a random variable; it might be easier to think of it as `\(\mu\)`.
]

--

`\begin{align*}
g(X) = g(E(X)) + g^\prime(E(X))(X - E(X)) + \cdots
\end{align*}`

---

### A first order approximation

`\begin{align*}
g(X) &= g(E(X)) + g^\prime(E(X))(X - E(X)) + \cdots\\
g(X) &\approx g(E(X)) + g^\prime(E(X))(X - E(X))
\end{align*}`

.question[
What is the *expectation* of this quantity?
]

---

### A first order approximation

`\begin{align*}
E\left(g(X)\right) &\approx E\big(g(E(X)) + g^\prime(E(X))(X - E(X))\big)\\
&= g(E(X))
\end{align*}`

.small[(note that `\(g(E(X))\)` and `\(g^\prime(E(X))\)` are constants, and `\(E(X - E(X)) = 0\)`)]

<br>

Ok, so `\(E(g(X)) \approx g(E(X))\)`. .small[(to a first-order approximation, and under some conditions on both the distribution of `\(X\)` and on the function `\(g\)` that we're going to assume are satisfied).]

---

### A first order approximation

.question[
How about the variance? .small[(again, we're hand-waving away the regularity conditions)]
]

---

### A first order approximation

`\begin{align*}
Var\left(g(X)\right) &\approx Var\big(g(E(X)) + g^\prime(E(X))(X - E(X))\big)\\
&= Var\big(g(E(X)) + Xg^\prime(E(X)) - E(X)g^\prime(E(X))\big)\\
&= Var\big(Xg^\prime(E(X))\big)\\
&= \left(g^\prime(E(X))\right)^2Var(X)
\end{align*}`
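
---

### A first order approximation

We can sanity-check both approximations numerically. The sketch below is not part of the derivation; the Gamma distribution and the choice `\(g(x) = \log(x)\)` are arbitrary, picked just to illustrate that when `\(X\)` is tightly concentrated around its mean, the first-order approximation does quite well.

```python
# Numerical sanity check of the two first-order approximations above.
# The distribution and the function g are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(490)

# X concentrated around its mean, so the linearization should do well:
# for Gamma(shape, scale), E(X) = shape * scale and Var(X) = shape * scale^2
shape, scale = 100, 0.5
x = rng.gamma(shape, scale, size=1_000_000)

mu, var = shape * scale, shape * scale**2   # E(X) = 50, Var(X) = 25

def g(t):
    return np.log(t)

def g_prime(t):
    return 1 / t

print("E(g(X)) by simulation:   ", g(x).mean())
print("g(E(X)):                 ", g(mu))
print("Var(g(X)) by simulation: ", g(x).var())
print("(g'(E(X)))^2 Var(X):     ", g_prime(mu) ** 2 * var)
```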

---

### Back to the Kaplan-Meier estimate

Last time, we broke the timeline down into intervals of the form `\([t_j, t_{j+1})\)`, and found that a decent estimator for `\(p_j\)`, the probability of surviving past this interval given that you made it there in the first place, is

`\begin{align*}
\widehat{p}_j = \widehat{P}(T > t_{j+1} \mid T > t_j) = 1 - \frac{d_j}{N_j}.
\end{align*}`

.question[
Suppose we know that we've made it to some time `\(t_j\)` and know the exact risk set at that time, `\(Y_j\)` (and thus `\(N_j\)`). What is the distribution of the observed number of failures (or non-failures) during this interval (and with what parameters)? Did we make any additional assumptions?
]

---

### Back to the Kaplan-Meier estimate

Given `\(N_j\)`, `\(d_j \sim Bin(N_j, p_j^\star)\)` (where `\(p_j^\star = 1 - p_j\)`):

`\begin{align*}
Var\left(1 - \frac{d_j}{N_j}\right) &= \frac{p^\star_j(1-p^\star_j)}{N_j}
\end{align*}`

We can use `\(\widehat{p}_j = 1 - \frac{d_j}{N_j}\)` as a good estimate of `\(1 - p^\star_j\)`, and so

`\begin{align*}
Var\left(1 - \frac{d_j}{N_j}\right) &\approx \frac{(1-\widehat{p}_j)\widehat{p}_j}{N_j}
\end{align*}`

---

### Back to the Kaplan-Meier estimate

Consider the log of the Kaplan-Meier estimate:

`\begin{align*}
\widehat{S}(t) &= \prod_{j: t_j \le t} \left\{1 - \frac{d_j}{N_j} \right\}\\
\log\left(\widehat{S}(t)\right) &= \sum_{j: t_j \le t} \log\left(\widehat{p}_j\right).
\end{align*}`

---

### Back to the Kaplan-Meier estimate

We are interested in

`\begin{align*}
Var\left(\log\big(\widehat{S}(t)\big)\right) = Var\left(\sum_{j: t_j \le t} \log\left(\widehat{p}_j\right)\right).
\end{align*}`

.question[
Is it appropriate for us to say that

`\begin{align*}
Var\left(\sum_{j: t_j \le t} \log\left(\widehat{p}_j\right)\right) = \sum_{j: t_j \le t} Var\left(\log(\widehat{p}_j)\right)?
\end{align*}`
]

.small[this is where we *really* start to get hand-wavy]

---

### Back to the Kaplan-Meier estimate

*Anyway*, applying the first-order approximation with `\(g(x) = \log(x)\)`, and then plugging in `\(\widehat{p}_j = 1 - \frac{d_j}{N_j}\)` for `\(p_j\)`,

`\begin{align*}
Var\left(\log(\widehat{p}_j)\right) &\approx \frac{Var(\widehat{p}_j)}{E(\widehat{p}_j)^2}\\
&= \frac{p_j(1 - p_j)/N_j}{p_j^2}\\
&= \frac{1 - p_j}{p_jN_j}\\
&\approx \frac{d_j}{N_j(N_j - d_j)}\\
Var\left(\log\big(\widehat{S}(t)\big)\right) &\approx \sum_{j: t_j \le t} \frac{d_j}{N_j(N_j - d_j)}
\end{align*}`

---

### Almost there

`\begin{align*}
Var\left(g(X)\right) &\approx \left(g^\prime(E(X))\right)^2Var(X)
\end{align*}`

Consider again, with `\(g(x) = \log(x)\)`,

`\begin{align*}
Var\left(\log\big(\widehat{S}(t)\big)\right) &\approx \frac{1}{E\big(\widehat{S}(t)\big)^2}Var(\widehat{S}(t))\\
\\
&\approx \frac{1}{\big(\widehat{S}(t)\big)^2}Var(\widehat{S}(t))
\end{align*}`

---

### Almost there

`\begin{align*}
Var\left(\log\big(\widehat{S}(t)\big)\right) \approx \sum_{j: t_j \le t} \frac{d_j}{N_j(N_j - d_j)} \approx \frac{1}{\big(\widehat{S}(t)\big)^2}Var(\widehat{S}(t))
\end{align*}`

and so

`\begin{align*}
Var(\widehat{S}(t)) \approx \big(\widehat{S}(t)\big)^2 \sum_{j: t_j \le t} \frac{d_j}{N_j(N_j - d_j)}
\end{align*}`

---

### A point-wise confidence interval

We can now create the Wald-type confidence interval:

`\begin{align*}
\widehat{S}(t) \pm z^\star_{1 - \alpha/2}\, \widehat{S}(t)\sqrt{\sum_{j: t_j \le t} \frac{d_j}{N_j(N_j - d_j)}}
\end{align*}`
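
---

### A point-wise confidence interval

Here's a minimal sketch of the whole pipeline on made-up data (the follow-up times and censoring indicators below are invented purely for illustration): the Kaplan-Meier estimate, the variance approximation we just derived, and the resulting point-wise Wald-type interval.

```python
# Kaplan-Meier estimate with the variance approximation derived above,
# plus the point-wise Wald-type interval. Toy data, made up for illustration:
# `time` is follow-up time; `event` is 1 for an observed failure, 0 if censored.
import numpy as np
from scipy.stats import norm

time = np.array([2, 3, 3, 5, 6, 7, 7, 9, 11, 12])
event = np.array([1, 1, 0, 1, 1, 0, 1, 1, 1, 0])

# Unique failure times t_j, in increasing order
t_j = np.unique(time[event == 1])

# N_j: size of the risk set just before t_j; d_j: failures at t_j
N_j = np.array([(time >= t).sum() for t in t_j])
d_j = np.array([((time == t) & (event == 1)).sum() for t in t_j])

# Kaplan-Meier estimate: running product of (1 - d_j / N_j)
S_hat = np.cumprod(1 - d_j / N_j)

# Var(S_hat(t)) ~= S_hat(t)^2 * sum over t_j <= t of d_j / (N_j (N_j - d_j))
se = S_hat * np.sqrt(np.cumsum(d_j / (N_j * (N_j - d_j))))

# Point-wise 95% Wald-type interval
z = norm.ppf(0.975)
lower, upper = S_hat - z * se, S_hat + z * se

for row in zip(t_j, N_j, d_j, S_hat, se, lower, upper):
    print("t = %2d  N = %2d  d = %d  S = %.3f  se = %.3f  CI = (%.3f, %.3f)" % row)
```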