Instructions

1. Introduction

This is a Matlab implementation of the Bayesian genetic sparse factor model proposed by Runcie and Mukherjee (Genetics 2013). The code uses a Gibbs sampler to draw samples from the posterior distribution of a multivariate linear mixed effect model in which the random effects are generally unobserved genetic values (breeding values) with known covariance (e.g. based on a pedigree). The focus of the model is on estimating the genetic and residual covariance matrices among traits; the genetic covariance matrix is called the G-matrix.

2. A Brief Tutorial

Unzip the downloaded file. To start, make sure the folder "BSF-G/" is on the Matlab search path. The "setup.mat" file should be in the current working directory. This file contains:

Table 1: Contents of setup.mat
Parameter              Dimensions         Description
Y                      $n \times p$       data matrix
X                      $b \times n$       fixed effect design matrix *
Z_1                    $r \times n$       random effect design matrix for factor model
Z_2                    $r2 \times n$      additional random effect design matrix *
A                      $r \times r$       additive genetic relationship matrix
U_act                  $r \times p$       known genetic effect matrix $^+$
E_act                  $r \times p$       known residual matrix $^+$
gen_factor_Lambda      $p \times k_1$     known genetic factor loadings matrix $^+$
error_factor_Lambda    $p \times k$       known latent factor loadings matrix $^+$
G                      $p \times p$       known G-matrix $^+$
R                      $p \times p$       known residual covariance matrix $^+$
h2s                    $p \times 1$       known trait heritabilities $^+$
factor_h2s             $p \times k$       known latent factor heritabilities $^+$

where $n$ is the number of individuals, $p$ the number of traits, $b$ the number of fixed effects, $r$ the number of genetic effects (e.g. lines or individuals), $r2$ the number of second random effects, and $k$ and $k_1$ the numbers of latent and genetic factors. Parameters marked with an * are optional. Those marked with a $^+$ are needed only if the data come from a simulation and you want to compare the estimates to the known values.
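To make the dimension conventions concrete, here is a minimal sketch that simulates a toy dataset and saves the required variables. Everything in it is an assumption made for illustration: the sizes, the identity relationship matrix, the diagonal G and R, and the output file name setup_toy.mat are not part of the package. The transposes in the line that builds Y follow from the dimensions listed in the table ($b \times n$ and $r \times n$ design matrices) and have not been checked against the sampler itself.

```matlab
% Toy dimensions (arbitrary choices for illustration).
n = 20;   % individuals
p = 5;    % traits
b = 1;    % fixed effects (intercept only)
r = 10;   % genetic effects, e.g. lines

% Design matrices, stored with the orientation given in the table.
X = ones(b, n);                           % b x n
line_id = repmat(1:r, 1, n / r);          % each line observed n/r times
Z_1 = zeros(r, n);                        % r x n incidence matrix
Z_1(sub2ind([r, n], line_id, 1:n)) = 1;
A = eye(r);                               % assumption: unrelated lines

% Assumed "true" covariances, used only to simulate data.
G = 0.5 * eye(p);                         % genetic covariance among traits
R = 0.5 * eye(p);                         % residual covariance among traits
h2s = diag(G) ./ (diag(G) + diag(R));     % p x 1 trait heritabilities

% Simulate genetic effects (rows correlated by A, columns by G) and residuals.
B     = zeros(b, p);
U_act = chol(A, 'lower') * randn(r, p) * chol(G);   % r x p
E     = randn(n, p) * chol(R);                      % n x p residuals
Y     = X' * B + Z_1' * U_act + E;                  % n x p data matrix

% Check the dimensions against the table before saving.
assert(isequal(size(Y),   [n, p]));
assert(isequal(size(X),   [b, n]));
assert(isequal(size(Z_1), [r, n]));
assert(isequal(size(A),   [r, r]));

% Hypothetical file name; written to the temp folder so nothing is overwritten.
toy_file = fullfile(tempdir, 'setup_toy.mat');
save(toy_file, 'Y', 'X', 'Z_1', 'A', 'U_act', 'G', 'R', 'h2s');
```

Saving the known values (U_act, G, R, h2s) is only useful for simulated data, where they can be compared with the posterior estimates.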

The main function is fast_BSF_G_sampler(). It reads "setup.mat" and takes as input prior hyperparameters and various control parameters for the Gibbs sampler. The file "model_setup.m" is set up to run the analysis for either of the example datasets. Type "help fast_BSF_G_sampler" for more details.
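Before running an analysis, it can help to confirm that the prerequisites described above are in place. The following is a minimal sketch of such a check. The variable bsfg_dir is a hypothetical location and should point at wherever the download was unzipped. The sampler is deliberately not called here, because its argument list is documented only in its help text.

```matlab
% Hypothetical location of the unzipped code; adjust as needed.
bsfg_dir = fullfile(pwd, 'BSF-G');

% exist(..., 'dir') returns 7 when the name is a folder.
if exist(bsfg_dir, 'dir') == 7
    addpath(bsfg_dir);            % put the sampler functions on the search path
end

% exist(..., 'file') returns 2 when the name resolves to a file.
has_setup   = (exist(fullfile(pwd, 'setup.mat'), 'file') == 2);
has_sampler = (exist('fast_BSF_G_sampler', 'file') == 2);

if has_setup && has_sampler
    help('fast_BSF_G_sampler');   % lists the hyperparameters and control parameters
    % run('model_setup.m');       % uncomment to run one of the example analyses
else
    fprintf('setup.mat found: %d, sampler on path: %d\n', has_setup, has_sampler);
end
```

If either check prints 0, fix the search path or the working directory before running model_setup.m.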