---
title: "Estimability"
output: html_notebook
---

## Create data

```{r data}
x1 = -4:4
x2 = c(-2, 1, -1, 2, 0, 2, -1, 1, -2)
x3 = 3*x1  -2*x2
x4 = x2 - x1 + 4

Y = 1 + x1 + x2 + x3 + x4 + c(-.5, .5, .5, -.5, 0, .5, -.5, -.5, .5)

dev.set = data.frame(Y, x1, x2, x3, x4)
```

## Fit linear model and extract coefficients.
```{r lm1}
lm1234 = lm(Y ~ x1 + x2 + x3 + x4, data=dev.set)
coefficients(lm1234)
```

Note the NA's corresponding to coefficients that are not estimable.

Now refit the model with predictors in a different order (name reflects order in design matrix

```{r lm2}
lm3412 = lm(Y ~ x3 + x4 + x1 + x2, data=dev.set)
coefficients(lm3412)
```

## Predictions
Look at data and predictions from the two model fits.
```{r}
cbind(dev.set, predict(lm1234), predict(lm3412))
```

Perfect agreement, confirming that the mean's are estimable even if individual parameters are not.

Out of sample prediction with a new dataframe `test.set`

```{r} 
test.set = data.frame(
    x1 = c(3, 6, 6, 0, 0, 1),
    x2 = c(1, 2, 2, 0, 0, 2),
    x3 = c(7,14, 14,0, 0, 3),
    x4 = c(2, 4, 0, 4, 0, 4))

out = cbind(test.set,
    predict(lm1234, new=test.set),
    predict(lm3412, new=test.set))
out
```

Note the NA's.  Can we determine the points that would not be estimable ahead of the time?

##  Estimability


Use function `epredict` from the library `estimability`
```{r}
library("estimability" )
cbind(epredict(lm1234, test.set), epredict(lm3412, test.set))
```