---
title: "Estimability"
output: html_notebook
---
## Create data
```{r data}
x1 = -4:4
x2 = c(-2, 1, -1, 2, 0, 2, -1, 1, -2)
x3 = 3*x1 -2*x2
x4 = x2 - x1 + 4
Y = 1 + x1 + x2 + x3 + x4 + c(-.5, .5, .5, -.5, 0, .5, -.5, -.5, .5)
dev.set = data.frame(Y, x1, x2, x3, x4)
```
## Fit linear model and extract coefficients.
```{r lm1}
lm1234 = lm(Y ~ x1 + x2 + x3 + x4, data=dev.set)
coefficients(lm1234)
```
Note the NA's corresponding to coefficients that are not estimable.
Now refit the model with predictors in a different order (name reflects order in design matrix
```{r lm2}
lm3412 = lm(Y ~ x3 + x4 + x1 + x2, data=dev.set)
coefficients(lm3412)
```
## Predictions
Look at data and predictions from the two model fits.
```{r}
cbind(dev.set, predict(lm1234), predict(lm3412))
```
Perfect agreement, confirming that the mean's are estimable even if individual parameters are not.
Out of sample prediction with a new dataframe `test.set`
```{r}
test.set = data.frame(
x1 = c(3, 6, 6, 0, 0, 1),
x2 = c(1, 2, 2, 0, 0, 2),
x3 = c(7,14, 14,0, 0, 3),
x4 = c(2, 4, 0, 4, 0, 4))
out = cbind(test.set,
predict(lm1234, new=test.set),
predict(lm3412, new=test.set))
out
```
Note the NA's. Can we determine the points that would not be estimable ahead of the time?
## Estimability
Use function `epredict` from the library `estimability`
```{r}
library("estimability" )
cbind(epredict(lm1234, test.set), epredict(lm3412, test.set))
```