---
title: "G-priors, Mixtures and JAGS"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r preliminary}
set.seed(8675309)
# google this number and you will not be able to get rid of it for the rest of the day
```

Generate some data and center it

```{r data-and-ols}
X = rnorm(30, 0, 1)
Xc = X - mean(X)
Y = Xc*3 + rnorm(30, 0, 1)
Yc = Y - mean(Y)
ols = lm(Y ~ Xc)
```

The g-prior can be represented as augmented data with $Y_0 = 0$ (at the mean) and $X_0 = \sqrt{SSX/g}$, so that the estimates can be found using OLS on the augmented data.

```{r bayes}
SSX = sum(Xc^2)
g = 5
Xo = sqrt(SSX/g)
Xo
Xcomp = c(Xc, Xo)
Ycomp = c(Yc, 0)
bols = lm(Ycomp ~ Xcomp - 1)
```

Compare Bayes with OLS

```{r}
plot(Xcomp, Ycomp)
points(sqrt(SSX/g), 0, pch=15, col=2)
abline(0, ols$coef[2])
abline(0, bols$coef, lty=3, lwd=3, col=2)
legend(1, -2, legend=c("OLS", "g-prior"), lty=c(1, 3), col=c(1, 2))
```

The "prior observation" is shown in red and acts like an outlier, pulling the regression line towards it and shrinking the slope towards 0.

Of course $g = 5$ is a completely arbitrary choice! So what should you use? Let's fit the model using a Zellner-Siow Cauchy prior, expressed as a scale mixture of normals. We'll use JAGS (a standalone program that you will need to install separately) and the R package `R2jags`.

```{r libraries}
library("R2jags")
```

First we need to define a function that represents the model

```{r model}
model = function() {
  # sampling model for Y | beta, phi
  for (i in 1:n) {
    Y[i] ~ dnorm(beta0 + (X[i] - Xbar)*beta, phi)
  }
  # Note that the second argument of the normal is a precision in JAGS

  # generate beta as a scale mixture of normals,
  # where tau is the scale parameter that we are mixing over
  beta0 ~ dnorm(0, phi*0.0000001)
  beta ~ dnorm(0, phi*tau*SSX)
  tau ~ dgamma(.5, .5*n)
  # prior for phi to approximate p(phi) = 1/phi
  phi ~ dgamma(.005, .005)
  sigma <- pow(phi, -.5)
}
```

In the two normal distributions, the second argument is the precision, not the standard deviation as in R's `dnorm`. JAGS does not allow improper priors, so the prior $p(\phi) \propto 1/\phi$ has to be approximated by a Gamma distribution with both parameters near zero. Although the sampling model is written with a loop, this is just a way to represent the model; JAGS itself is written in `C++`, so the sampling is still efficient.

To run JAGS we need to provide the data; we will do this as a list.

```{r jags}
data = list(Y=Y, X=X, n=length(Y), SSX=sum(Xc^2), Xbar=mean(X))
ZSout = jags(data, model.file=model, inits=NULL,
             parameters.to.save=c("beta0", "beta", "tau", "sigma"),
             n.iter=10000)
```

I'm letting JAGS generate the initial values (`inits=NULL`) and monitoring `beta0`, `beta`, `tau` and `sigma` for posterior inference. Setting `n.iter=10000` runs the sampler for 10,000 iterations, with half used as burn-in by default.

Typing the name of the output object provides a summary of the parameters and diagnostics.

```{r output}
ZSout
```

If we want to obtain credible intervals, we can use the `HPDinterval` function from the `coda` package.

```{r}
HPDinterval(as.mcmc(ZSout$BUGSoutput$sims.matrix))
```

Let's add the posterior mean to the plot from before.

```{r plots}
plot(X, Y)
points(sqrt(SSX/g), 0, pch=15, col=2)
abline(0, ols$coef[2])
abline(0, bols$coef, lty=3, lwd=3, col=2)
abline(0, ZSout$BUGSoutput$mean$beta, lty=2, lwd=3, col=3)
legend(1, -2, legend=c("OLS", "g-prior", "ZS-prior"), lty=c(1, 3, 2), col=c(1, 2, 3))
```

The estimate under the Zellner-Siow Cauchy prior is very close to the MLE, even though, like the normal g-prior, it is centered at 0. This is due to a property of the Cauchy distribution known as 'bounded influence'.
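
To see how the three slope estimates compare numerically, the chunk below (a quick sketch, assuming the objects `ols`, `bols` and `ZSout` from the earlier chunks are still in the workspace) prints them side by side.

```{r compare-slopes}
# Slope estimates: OLS, the g-prior (g = 5) augmented-data fit,
# and the posterior mean under the Zellner-Siow prior
c(OLS = unname(coef(ols)[2]),
  g_prior = unname(coef(bols)),
  ZS = as.numeric(ZSout$BUGSoutput$mean$beta))
```

The g-prior estimate should be shrunk noticeably towards 0, while the Zellner-Siow estimate stays close to the OLS slope.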