HW10

Date

Mon, Nov 20, 2017

Links

Reading

See Due date in Sakai

Please look at before class Tuesday in case there are questions or clarifications needed (or post on Piazza). Use LaTeX or write by hand (must be legible) and scan to submit via Sakai.

Confirm Corollary 13.5.2 in Christensen by direct mulitplication
Exercise 13.7 in Christensen
Exercise 13.9
Consider the model \[\begin{eqnarray*} Y & = & 1 \alpha + X\beta + \epsilon \\ \epsilon_i \mid \phi, \lambda_i &\sim &N(0, 1/(\phi \lambda_i)) \\ \lambda_i & \sim & G(a/2, a/2) \\ \beta_j \mid \phi, \gamma_j & \sim & N(0, 1/(\phi \gamma_j)) \\ \gamma_j & \sim& G(\delta/2, \delta/2) \\ p(\alpha, \phi) &\propto& 1/\phi \end{eqnarray*}\]
where \(X\) has been centered and standardized so that \(X^TX\) has diagonal elements 1.
1. (optional) Show that marginal distributions \(\epsilon_i \mid \phi\) are iid Student t with \(a\) degrees of freedom. Similarly show that \(\beta_j \mid \phi\) are iid Student t with \(\delta\) degrees of freedom.
2. (optional) derive the full conditional distributions for each of the blocks of parameters \(\beta\), \(\alpha\), \(\lambda\), \(\gamma\), and \(\phi\) supplying the name and all hyper-parameters.
3. Modify the JAGS code for the stackloss data below for the hierarchical model above using your choice of \(\delta \le a\). Provide plots of the posterior distributions for \(\lambda\) (side-by-side boxplots). What does this suggest about “outliers”? Construct credible intervals for the \(\beta\)s (in the original units). How do these compare to the frequentist solutions or the model selection results from MC3.REG or BAS from class? How sensitive are the results to the choice of \(a\) and \(\delta\)?
4. (optional) Modify the JAGS code to use the Horseshoe prior.
Exercise 13.8 Explore methods discussed in class and with JAGS as suggested above and provide a brief write up with your recommendation and conclusions about the effects of the variables on stackloss and issues of outliers/influential points focusing your discussion on your “final” model. Briefly discuss the advantages and disadvantages of the different approaches to handing outliers.

JAGS Code

library(MASS)
library(car)
data(stackloss)


### R interface to JAGS:
suppressMessages(library(R2jags))

# Create a data list with inputs for JAGS

n = nrow(stackloss)
## scale X such that X^TX has ones on the diagonal;
## scale divides by the standard deviation so we need
## to divide by the sqrt(n-1)
scaled.X = scale(as.matrix(stackloss[, -4]))/sqrt(n-1)
# are diagonal elements 1?
# check
t(scaled.X) %*% scaled.X

##             Air.Flow Water.Temp Acid.Conc.
## Air.Flow   1.0000000  0.7818523  0.5001429
## Water.Temp 0.7818523  1.0000000  0.3909395
## Acid.Conc. 0.5001429  0.3909395  1.0000000

data = list(Y = stackloss$stack.loss, 
            X=scaled.X, 
            p=ncol(scaled.X),
            n = n)

#extract the scales from the scaled object and fix
data$scales = attr(scaled.X, "scaled:scale")*sqrt(n-1) # fix scale
data$Xbar = attr(scaled.X, "scaled:center")

Define the jags model with \(t_9\) errors and a Cauchy prior on coefficients.

rr.model = function() {
  df <- 9
  shape <- df/2

  for (i in 1:n) {
    mu[i] <- alpha0 + inprod(X[i,], alpha)
    lambda[i] ~ dgamma(shape, shape)
    prec[i] <- phi*lambda[i]
    Y[i] ~ dnorm(mu[i], prec[i])
  }
  phi ~ dgamma(1.0E-6, 1.0E-6)
  alpha0 ~ dnorm(0, 1.0E-6)

  for (j in 1:p) {
    prec.beta[j] <- lambda.beta[j]*phi
    alpha[j] ~ dnorm(0, prec.beta[j])
    # transform back to original coefficients
    beta[j] <- alpha[j]/scales[j]
    lambda.beta[j] ~ dgamma(.5, .5)
  }

  # transform intercept to usual parameterization 
  beta0 <- alpha0 - inprod(beta[1:p], Xbar)

  sigma <- pow(phi, -.5)
}

# parameters to monitor
parameters = c("beta0", "beta", "sigma","lambda.beta", "lambda")

# run jags from R  (see Resources to install)
stack.sim = jags(data, 
                 inits=NULL, 
                 par=parameters,
                 model=rr.model, 
                 n.iter=50000)

## module glm loaded

## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 21
##    Unobserved stochastic nodes: 29
##    Total graph size: 241
## 
## Initializing model

# create an MCMC object with the results for the MCMC draws
stack.mcmc = as.mcmc(stack.sim$BUGSoutput$sims.matrix)

Summaries of objects

plot(stack.sim)

summary(stack.sim)  # names of objects in bf.sim

##                    Length Class  Mode     
## model               8     jags   list     
## BUGSoutput         24     bugs   list     
## parameters.to.save  6     -none- character
## model.file          1     -none- character
## n.iter              1     -none- numeric  
## DIC                 1     -none- logical

stack.sim  # print gives summary

## Inference for Bugs model at "/var/folders/n4/nj1122xj6bn5_xgbptv7bml40000gp/T//Rtmp93J1Dn/modelccc6db440e4.txt", fit using jags,
##  3 chains, each with 50000 iterations (first 25000 discarded), n.thin = 25
##  n.sims = 3000 iterations saved
##                mu.vect sd.vect    2.5%     25%     50%     75%   97.5%
## beta[1]          0.807   0.174   0.436   0.697   0.815   0.928   1.124
## beta[2]          0.812   0.484  -0.030   0.453   0.806   1.135   1.812
## beta[3]         -0.056   0.111  -0.290  -0.123  -0.049   0.012   0.156
## beta0          -43.472   9.379 -61.861 -49.437 -43.754 -37.635 -24.299
## lambda[1]        0.920   0.443   0.272   0.600   0.844   1.163   1.961
## lambda[2]        1.036   0.468   0.341   0.699   0.964   1.298   2.161
## lambda[3]        0.870   0.406   0.268   0.573   0.807   1.104   1.832
## lambda[4]        0.705   0.352   0.199   0.452   0.644   0.890   1.558
## lambda[5]        1.075   0.485   0.346   0.719   1.005   1.348   2.202
## lambda[6]        1.025   0.465   0.331   0.684   0.955   1.278   2.155
## lambda[7]        1.045   0.471   0.332   0.696   0.991   1.320   2.155
## lambda[8]        1.090   0.491   0.361   0.730   1.012   1.360   2.260
## lambda[9]        1.041   0.478   0.327   0.698   0.974   1.311   2.176
## lambda[10]       1.080   0.485   0.341   0.719   1.012   1.377   2.193
## lambda[11]       1.074   0.484   0.347   0.725   0.994   1.356   2.191
## lambda[12]       1.062   0.488   0.326   0.700   0.986   1.348   2.178
## lambda[13]       1.013   0.461   0.310   0.676   0.944   1.276   2.031
## lambda[14]       1.046   0.462   0.346   0.715   0.978   1.316   2.138
## lambda[15]       1.057   0.484   0.323   0.706   0.983   1.348   2.169
## lambda[16]       1.090   0.492   0.365   0.735   1.015   1.375   2.240
## lambda[17]       1.080   0.488   0.348   0.723   1.006   1.346   2.249
## lambda[18]       1.094   0.485   0.361   0.748   1.026   1.364   2.208
## lambda[19]       1.072   0.473   0.357   0.728   1.005   1.339   2.131
## lambda[20]       1.073   0.481   0.351   0.714   1.002   1.354   2.218
## lambda[21]       0.563   0.318   0.141   0.337   0.496   0.720   1.344
## lambda.beta[1]   0.024   0.056   0.000   0.004   0.011   0.026   0.113
## lambda.beta[2]   0.362   0.823   0.002   0.032   0.095   0.296   2.820
## lambda.beta[3]   1.382   1.533   0.033   0.326   0.854   1.914   5.769
## sigma            3.133   0.667   2.069   2.654   3.047   3.509   4.670
## deviance       107.653   5.557  98.155 103.865 107.312 110.912 119.243
##                 Rhat n.eff
## beta[1]        1.001  3000
## beta[2]        1.001  3000
## beta[3]        1.002  1300
## beta0          1.002  1900
## lambda[1]      1.001  3000
## lambda[2]      1.001  3000
## lambda[3]      1.002  1400
## lambda[4]      1.001  3000
## lambda[5]      1.001  3000
## lambda[6]      1.002  1800
## lambda[7]      1.001  3000
## lambda[8]      1.001  3000
## lambda[9]      1.002  1700
## lambda[10]     1.001  2400
## lambda[11]     1.001  3000
## lambda[12]     1.001  3000
## lambda[13]     1.001  2000
## lambda[14]     1.001  3000
## lambda[15]     1.001  3000
## lambda[16]     1.001  3000
## lambda[17]     1.002  1300
## lambda[18]     1.001  3000
## lambda[19]     1.001  3000
## lambda[20]     1.002  1500
## lambda[21]     1.002  1900
## lambda.beta[1] 1.001  3000
## lambda.beta[2] 1.001  3000
## lambda.beta[3] 1.001  2600
## sigma          1.001  3000
## deviance       1.001  3000
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 15.4 and DIC = 123.1
## DIC is an estimate of expected predictive error (lower deviance is better).

quantile(stack.mcmc[,"beta0"], c(.025, .5, .975))

##      2.5%       50%     97.5% 
## -61.86121 -43.75403 -24.29857

HPDinterval(stack.mcmc)

##                        lower        upper
## beta[1]         4.638428e-01   1.13662441
## beta[2]        -7.162923e-02   1.71499116
## beta[3]        -2.958222e-01   0.15001317
## beta0          -6.195231e+01 -24.36107762
## deviance        9.762219e+01 118.47459055
## lambda[1]       1.989899e-01   1.75176340
## lambda[2]       2.356777e-01   1.90107273
## lambda[3]       2.087026e-01   1.66501876
## lambda[4]       1.531981e-01   1.44023339
## lambda[5]       2.412858e-01   2.06355793
## lambda[6]       2.575407e-01   1.96172962
## lambda[7]       2.426858e-01   1.96816891
## lambda[8]       2.712453e-01   2.09079813
## lambda[9]       2.597283e-01   1.99274763
## lambda[10]      2.540884e-01   2.01898248
## lambda[11]      2.390395e-01   2.02222601
## lambda[12]      2.675751e-01   2.06238960
## lambda[13]      2.781800e-01   1.95738450
## lambda[14]      2.632950e-01   1.97752426
## lambda[15]      2.255139e-01   2.01459100
## lambda[16]      2.933322e-01   2.09676001
## lambda[17]      2.607850e-01   2.04516220
## lambda[18]      2.888112e-01   2.08048098
## lambda[19]      2.747088e-01   2.00283111
## lambda[20]      2.989747e-01   2.06580791
## lambda[21]      1.081938e-01   1.22390285
## lambda.beta[1]  2.978349e-05   0.08428433
## lambda.beta[2]  5.246925e-06   1.70543953
## lambda.beta[3]  9.231511e-05   4.46541268
## sigma           1.970293e+00   4.46706333
## attr(,"Probability")
## [1] 0.95