Lecture 39 (lab)—Friday, April 20, 2007
What was covered?
- R topics covered
- the use of the bugs function from the arm package to run WinBUGS models from within R
- extracting information from the object created by the bugs function
- Fitting ordinary regression and multilevel models as Bayesian models using WinBUGS
- Assessing Markov chain convergence to the posterior distribution
- Comparing Bayesian and frequentist results
R functions and commands demonstrated
- bugs (from arm) runs WinBUGS from within R
R function options
- bugs.directory= (argument to bugs) the directory that contains the WinBUGS program
- data= (argument to bugs) name of R object containing the raw data in list format
- debug= (argument to bugs) is assigned the value TRUE or FALSE depending on whether the WinBUGS window should remain open for debugging after fitting a model
- inits= (argument to bugs) a list of initial values or a function that initializes each Markov chain
- model.file= (argument to bugs) name of the file containing the code that specifies the BUGS model
- parameters= (argument to bugs) name of the R object containing the vector of parameter names of the parameters to track
- n.burnin= (argument to bugs) number of samples to be used for the burn-in period that are then discarded. The default value is one half the number of iterations.
- n.chains= (argument to bugs) the number of chains to run simultaneously
- n.iter= (argument to bugs) the total number of iterations for each chain
R packages used
- arm for the bugs function that allows running WinBUGS from within R
- nlme for lme to compare WinBUGS results with frequentist results
Preliminaries
- WinBUGS is a stand-alone package but is not particularly easy to use. Andrew Gelman of Columbia University has written an R package, arm, that among other things permits running WinBUGS models from within R. This simplifies the use of WinBUGS considerably because we can use R to set up the analysis and process the results and use WinBUGS only to fit the model .
- To run WinBUGS in this fashion requires the following.
- Version 2.4 (or later) of R. The arm package is not compatible with earlier versions of R.
- The arm package of R.
- WinBUGS 1.4. The key that removes any restrictions on the use of WinBUGS should be installed.
- WinBUGS can be installed to any location on your machine but you need to know the directory path to the program. The path needs to be specified when you invoke WinBUGS from within R.
- Each WinBUGS model requires a model file that specifies the details of the model. This is a text file that can be created in Notepad or some other text editor. This file needs to saved to the current working directory of R.
- To set the working directory of R go to the File menu and select Change dir... (Fig. 1). In the dialog box that appears set the directory to match the location where you want to save the WinBUGS model file. This directory now becomes the location where R will save graphics and other output.
A WinBUGS example—the complete pooling model
- To illustrate the Bayesian approach to model fitting, we refit a few of the models from last week. These used an invertebrate development data set and the goal was to relate the variables lnPLD (response) to lntemp (predictor) while respecting the basic structure of the data set as defined by the variable species. In the language of multilevel modeling, lnPLD and lntemp are called level-1 variables.
> inverts<- read.table('http://www.unc.edu/courses/2006spring/ecol/145/001/data/lab11/74species.csv', sep=',', header=TRUE)
- As a first example of a Bayesian model we'll ignore the structure inherent in these data and just fit an ordinary regression model. This is also called a complete pooling model.
- WinBUGS accepts only numeric data. Character data and factors are not allowed. Furthermore variables must be entered into WinBUGS in a special way. To accomplish this we need to pull off each variable individually from the data frame inverts. In addition I carry out the log transformation and the centering of the predictor in R and then assign the individual variables to objects.
> x<-log(inverts$temp)-log(15)
> y<-log(inverts$PLD)
- The Bayesian model that we'll construct next uses the number of observations in the data set as a constant. So I calculate this value in R and assign the result to another object.
> n<-length(y)
- Next I write the Bayesian model. I use NotePad for this and save the result as cpmodel.txt to the directory that I've defined as the R working directory. Here's the model followed by an explanation of each line.
#complete pooling
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y)
y.hat[i]<-a+b*x[i]
}
a~dnorm(0,.0001)
b~dnorm(0,.0001)
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
}
- The model begins with the key word model. The actual model code is contained within the pair of curly braces { }.
- A for loop is used to describe the probability model for the data, the likelihood so to speak. Notice that index runs from 1 to n, the variable recording the number of observations, so this loop specifies a probability model for each observation.
- The dnorm function in WinBUGS differs from that of R. It has only two arguments--the first records the mean of the distribution and the second specifies what's called the precision.
- Notice that the value of the mean argument, y.hat[i], includes an index i. This causes each observation to have a potentially different mean.
- The value of the precision argument is given the name tau.y
- The second line of the loop further defines the mean, y.hat. The use of the assignment arrow means that y.hat is a derived quantity. The formula given is just the regression model: a + bx where a is the intercept and b is the slope.
- The definition of y.hat here appears to be redundant. It would seem that we could have made the formula a + b*x[i] the first argument of the dnorm function and skipped the definition of y.hat entirely. You should never do this! WinBUGS is easily confused when complicated expressions are used as arguments in probability functions. Regression formulas and the like should always be specified outside of probability functions just as we're doing here.
- The remainder of the code specifies priors for the model parameters. The next two lines place noninformative priors on the regression parameters a and b. Since a priori these could be anything, we use normal priors. The second argument is the precision and we choose this to be low. The precision is the reciprocal of the variance so we're choosing a normal distribution with a variance of 10,000 as the prior. Since the variables are on a log scale it is clear that this is an extremely wide distribution implying that we have essentially no knowledge about the true values of a and b.
- The next line defines the quantity tau.y, the precision of the normal distribution for the likelihood. The usual approach is to place a prior on the standard deviation rather than the precision, so the line tau.y<-pow(sigma.y,–2) serves to connect the precision to the standard deviation thus making the precision a derived quantity rather than a parameter. Notice the use of the pow (for power) function in lieu of writing this out with the exponentiation operator. This is the preferred way of writing powers in WinBUGS.
- Finally the last line sets a uniform prior on the standard deviation, sigma.y, uniform on the interval (0, 100). For log-transformed variables an interval such as this is more than wide enough to cover any realistic value for sigma.y.
- With the model saved to the working directory, we enter the following three lines back in R.
> invert.data<-list("n","y","x")
> invert.inits<-function() {list(a=rnorm(1),b=rnorm(1),sigma.y=runif(1))}
> invert.parameters<-c("a","b","sigma.y")
- These three lines create objects that will be passed to WinBUGS for use in estimating the model.
- The first object defines all data values and constants used in the WinBUGS model. Notice the unusual syntax with the list function followed by the object names in quotes. This is required. The data object should contain only those quantities actually used in the WinBUGS model and no extra variables.
- The second object provides a way to initialize the Gibbs sampler. The values specified here are not important except that they must be legal values for the parameters. Thus to guarantee that sigma.y is given an initial positive value, a random uniform number is drawn (which by default is from the interval 0 to 1). The inits object is set up as a function because it will be used repeatedly for each different chain we request thus guaranteeing that each chain begins at a different point.
- The last object records the parameter values and derived objects that we wish to track. WinBUGS will return samples from the posterior distribution for all objects listed here.
- We're now ready to run the model. I load the arm package and make the following call using the bugs function from arm.
> library(arm)
> invert.3<-bugs(invert.data, invert.inits, invert.parameters, "cpmodel.txt", n.chains=3, bugs.directory = "D:/Program Files/WinBUGS14/", n.iter=10, debug=T)
- data = invert.data The first argument to bugs is the data object created above.
- inits = invert.inits The second argument is the function that sets the initial values for each chain.
- parameters = invert.parameters The third argument is an object containing the list of parameters we wish to monitor.
- model.file = "cpmodel.txt" The fourth argument is the name of the file that contains the WinBUGS code that defines the model.
- n.chains=3 The fifth argument lists the number of Markov chains to create. Three is generally enough to check proper mixing.
- bugs.directory = The sixth argument specifies the directory that contains the WinBUGS program, WinBUGS14.exe.
- n.iter=10 The seventh argument specifies the number of iterations of Gibbs sampling to carry out. For a first run this should be a small number since more than likely the code will fail to run anyway.
- debug=T The last argument turns the debugger on. With debug=T R will freeze and the WinBUGS window will remain open so that error messages can be examined and interpreted. Closing the WinBUGS window will then unfreeze R and return control to the console.
- Fig. 2 shows partial contents of the log window in WinBUGS from this run. The important thing to observe is that there are no error messages here.
- WinBUGS reports the "model is syntactically correct" telling us there are no coding errors in the model file.
- The line "data loaded" tells us that WinBUGS had no problems with the data values.
- The line "model compiled" tells us the model plus data are ready to run.
- The line "model initialized" tells us that all parameters have been assigned suitable initial values.
- The rest of the log (Fig. 3) displays summary statistics and graphs of individual Markov chains for different parameters and statistics. With only 10 iterations contributing to these reports, there is little to be learned from them. Observe though that by the 8th iteration the chains already appear to be mixing.
- Since there were no errors we can now run the model for real with a reasonable number of iterations. In the run below I specify 500 iterations. By default the bugs function is set up to treat the first half of the iterates as part of the the burn-in period. These iterations are discarded and only the values from the second half of the run are saved. (This can be changed by specifying a value for the n.burnin argument of bugs.) Since I request 3 chains this means BUGS will return 750 (3 times 250) samples, which, if convergence has been attained, should be samples from the posterior distributions of each parameter. It is not necessary to keep the debugger on, but I do so in order to inspect some of plots of the Markov chains that are produced.
> invert.3<-bugs(invert.data, invert.inits, invert.parameters, "cpmodel.txt", n.chains=3, bugs.directory = "D:/Program Files/WinBUGS14/", n.iter=500, debug=T)
- Fig. 4 shows plots of three chains for standard deviation parameter, sigma.y. Observe that trajectories are thoroughly interdigitating indicating that the chains have mixed fairly well.
- To unfreeze the R console, close the WinBUGS window. We can view summary statistics for posterior distribution of each parameter by typing the name of the object to which the bugs output was assigned at the console.
> invert.3
Inference for Bugs model at "cpmodel.txt", fit using winbugs,
3 chains, each with 500 iterations (first 250 discarded)
n.sims = 750 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
a 3.0 0.1 2.8 2.9 3.0 3.0 3.1 1 750
b -0.7 0.1 -0.9 -0.8 -0.7 -0.6 -0.5 1 440
sigma.y 0.9 0.0 0.8 0.8 0.9 0.9 0.9 1 750
deviance 550.0 2.4 547.3 548.3 549.5 551.1 556.4 1 740
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
pD = 2.9 and DIC = 553.0 (using the rule, pD = Dbar-Dhat)
DIC is an estimate of expected predictive error (lower deviance is better).
- The output shows the mean, standard deviation, and various quantiles of the posterior distributions of the three parameters a, b, and sigma.y. To be sure that we are actually sampling from the posterior distribution we should examine the column labeled Rhat. Rhat is the mixing index. Numerically it is the square root of the variance of the mixture of all the chains divided by the average within-chain variance. If the chains are mixed these values should be roughly the same yielding a ratio of approximately 1. Since Rhat is reported to be 1 for all the parameters this is evidence that the chains are well-mixed and that the Gibbs sampler is sampling from the posterior distribution.
- A lot of other useful information is contained in by the object created by the bugs function.
> names(invert.3)
[1] "n.chains" "n.iter" "n.burnin" "n.thin"
[5] "n.keep" "n.sims" "sims.array" "sims.list"
[9] "sims.matrix" "summary" "mean" "sd"
[13] "median" "root.short" "long.short" "dimension.short"
[17] "indexes.short" "last.values" "pD" "DIC"
[21] "DICbyR" "model.file" "is.DIC" "program"
- The list object summary contains the information shown in the above summary table organized in matrix form.
- The three objects mean, sd, and median repeat some of the information contained in summary.
- The three components sims.array, sims.list, sims.matrix each contain the 750 samples from the posterior distribution of each parameter but organized in different ways.
- The sims.array component arranges the information in a three dimensional array, a set of matrices, one for each parameter. The individual matrices contain the samples obtained from each Markov chain as separate columns in the order the sampled observations were obtained. Thus each column records the sampling history for that Markov chain. Below I display the first two samples in each chain from all of the parameters. Each parameter is represented by a 250 x 3 matrix.
> invert.3$sims.array[1:2,,]
, , a
[,1] [,2] [,3]
[1,] 3.1 2.9 3.1
[2,] 3.0 3.0 3.0
, , b
[,1] [,2] [,3]
[1,] -0.80 -0.79 -0.80
[2,] -0.52 -0.91 -0.65
, , sigma.y
[,1] [,2] [,3]
[1,] 0.84 0.84 0.82
[2,] 0.86 0.91 0.81
, , deviance
[,1] [,2] [,3]
[1,] 550 548 557
[2,] 551 552 551
> dim(invert.3$sims.array[,,'a'])
[1] 250 3
- We can use the information contained in the sims.array object to plot our own trace of the Markov chains to evaluate mixing. In Fig. 5 I plot the last 100 realizations from the three chains for the intercept.
> plot(c(151,250), range(invert.3$sims.array[151:250, , 'a']), type='n', ylab='a', xlab='iteration')
> for(i in 1:3) {
lines(151:250, invert.3$sims.array[151:250, i, 'a'], col=i)}
- In the sims.list component, the chains are combined into a single long vector and their order has been randomly permuted (so that the actual sample history of the observations is lost). The individual parameters can be accessed as list objects. Below I select the first five elements of each parameter and in the second call I specify the first five samples from the posterior distribution of the parameter a.
> sapply(invert.3$sims.list,function(x) x[1:5])
a b sigma.y deviance
[1,] 2.8 -0.50 0.83 555
[2,] 2.8 -0.60 0.94 556
[3,] 2.9 -0.48 0.85 551
[4,] 3.0 -0.78 0.83 548
[5,] 2.9 -0.65 0.86 548
> invert.3$sims.list$a[1:5]
[1] 2.8 2.8 2.9 3.0 2.9
- In the sims.matrix component the three chains are combined again but this time the samples from the posterior distributions occupy the columns of a matrix. As was the case with the sims.list component, the row order of the observations in the sims.matrix component has been randomly permuted (so that the actual sample history is lost).
> invert.3$sims.matrix[1:10,]
a b sigma.y deviance
[1,] 2.8 -0.50 0.83 555
[2,] 2.8 -0.60 0.94 556
[3,] 2.9 -0.48 0.85 551
[4,] 3.0 -0.78 0.83 548
[5,] 2.9 -0.65 0.86 548
[6,] 2.9 -0.82 0.82 551
[7,] 2.9 -0.65 0.85 548
[8,] 2.9 -0.74 0.81 549
[9,] 3.0 -0.53 0.83 550
[10,] 3.1 -0.66 0.90 552
> invert.3$sims.matrix[1:5,'a']
[1] 2.8 2.8 2.9 3.0 2.9
- We can use this information to calculate our own summary statistics. Here are the means of the sampled posterior distributions. The values match what's reported in the summary table above.
> apply(invert.3$sims.matrix,2,mean)
a b sigma.y deviance
2.9560 –0.7049 0.8566 550.0477
- It's useful at this point to contrast the estimates obtained from the posterior distributions with what we would have obtained by fitting the model with the lm function.
> lm(y~x)->out
> summary(out)
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-3.057 -0.505 0.172 0.569 2.133
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.9559 0.0588 50.25 < 2e-16 ***
x -0.7045 0.1112 -6.33 1.4e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.852 on 216 degrees of freedom
Multiple R-Squared: 0.157, Adjusted R-squared: 0.153
F-statistic: 40.1 on 1 and 216 DF, p-value: 1.37e-09
- Observe that the estimates for a, b, and sigma.y in the output agree with the Bayesian results to either the second or third decimal places.
Fitting a separate intercept model using WinBUGS
- This model is of no special interest by itself but is transitional to the multilevel models we consider next. This model fits a common slope to all species but separate intercepts. In contrast with the random intercepts model we consider next it does this by actually estimating 74 separate intercepts without assuming a common probability model (informative prior). This corresponds to the frequentist model in which the variable species is added to the common pooling model as a factor with 74 levels.
- To set this up in WinBUGS we first need to convert the species variable to a numeric variable. We also need a constant that records the number of species.
> J<-length(unique(inverts$species))
>
species<-as.numeric(inverts$species)
- To set up the model we replace the constant intercept, a, of the common pooling model with the variable intercept, a[j], of the separate intercepts model. In the table below I've color-coded the corresponding portions of the two models that are different.
| Common pooling model |
Separate intercepts model |
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y)
y.hat[i]<-a+b*x[i]
}
a~dnorm(0,.0001)
b~dnorm(0,.0001)
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
} |
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y)
y.hat[i]<-a[species[i]]+b*x[i]
}
b~dnorm(0,.0001)
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
for(j in 1:J){
a[j]~dnorm(0,.0001)
}
}
|
- Observe that in specifying the model for the response, the fixed a gets replaced by a multi-valued a. Because the model for the response varies by observation we need an index for a that also varies with the observation, hence species[i]. The variable species has 218 values of which only 74 are different. These are the numerical values that correspond to the different species. Thus there are only 74 different intercepts and the code links up the correct intercept with the correct species.
- Each of the separate intercepts also gets its own prior. Hence the single prior for the constant intercept gets replaced by 74 priors, one for each species. Because the intercepts are unmodeled, the prior is noninformative. The assignment of priors is efficiently summarized in the for loop shown at the end of the program. The value of J is 74, the number of species, and will be input as data to WinBUGS.
- I save the model specification to the file sepints.txt in the R working directory. To fit this model in WinBUGS we just add the new data arguments to the data object and change how the initial values for a are specified.
- Two new variables are included in the data list, "species" and "J".
- Notice that the parameter a is assigned a vector of length J of initial values because there are now 74 intercepts to initialize.
- Notice that even though a is a vector it is still entered in the parameter list as "a".
> invert.data<-list("n","J","y","species","x")
> invert.inits<-function() {
list(a=rnorm(J),b=rnorm(1),sigma.y=runif(1))}
> invert.parameters<-c("a", "b", "sigma.y")
> invert.3b<-bugs(invert.data, invert.inits, invert.parameters, "sepints.txt", n.chains=3,
bugs.directory = "D:/Program Files/WinBUGS14/", n.iter=10, debug=T)
- The code above runs properly, so the next step is to increase the number of iterations. Once again 500 turns out to be enough to obtain convergence.
Random intercepts model
- To move from the separate intercepts model to the random intercepts model requires only a minor change in the WinBUGS code. We replace the noninformative prior for the individual intercepts with an informative prior.
| Separate intercepts model |
Random intercepts model |
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y)
y.hat[i]<-a[species[i]]+b*x[i]
}
b~dnorm(0,.0001)
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
for(j in 1:J){
a[j]~dnorm(0,.0001)
}
} |
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y)
y.hat[i]<-a[species[i]]+b*x[i]
}
b~dnorm(0,.0001)
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
for(j in 1:J){
a[j]~dnorm(a.hat[j],tau.a)
a.hat[j]<-mu.a
}
mu.a~dnorm(0,.0001)
tau.a<-pow(sigma.a,-2)
sigma.a~dunif(0,100)
} |
- In the separate intercepts model we have assigned separate noninformative priors to each intercept. In the random intercepts model we have a group-level model (level-2 model) for the intercepts. In the Bayesian approach this is expressed by assigning informative priors to the intercepts (thus making them modeled parameters). The notation differs slightly from what we did previously for multilevel models.
Old notation: 
New notation: 
Actually in the model above we suppress the existence of the terms
and work directly with the distribution of the
.
- The two lines, a[j]~dnorm(a.hat[j],tau.a)and a.hat[j]<-mu.a could have been replaced with the single line a[j]~dnorm(mu.a,tau.a). The two-line approach makes it easy to add level-2 predictors to model the variability in the intercepts. (See below.)
- Finally the parameters specified in the priors for the intercepts are given their own priors, called hyperpriors. In a similar vein the parameters mu.a and sigma.a are sometimes called hyperparameters.
- I save the model specification to the file randomints.txt in the R working directory. To fit this model in WinBUGS we just add additional arguments to the inits and parameter declarations from before.
- The new parameters mu.a and sigma.a get initial values.
- The new parameters also get added to the parameter list.
> invert.data<-list("n","J","y","species","x")
> invert.inits<-function() {list(a=rnorm(J), b=rnorm(1), sigma.y=runif(1), mu.a=rnorm(1), sigma.a=runif(1))}
> invert.parameters<-c("a", "b", "sigma.y", "mu.a", "sigma.a")
> invert.3b<-bugs(invert.data, invert.inits, invert.parameters, "randomints.txt", n.chains=3, bugs.directory = "D:/Program Files/WinBUGS14/", n.iter=10, debug=T)
- The code above runs properly, so the next step is increase the number of iterations. Once again 500 turns out to be enough.
Random slopes and intercepts model (slopes uncorrelated with intercepts)
- This is again a transitional model of no special interest by itself. It's transitional to the random slopes and intercepts model in which the intercept and slope random effects are allowed to be correlated. I include this model to make the additional complexity necessitated by the correlation more understandable.
- The only real change in this model is that now both a and b are modeled parameters. Thus each a equation in the random intercepts model has an additional b equation.
| random intercepts model |
random slopes and intercepts (uncorrelated) |
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y) y.hat[i]<-a[species[i]]+ b*x[i]
}
b~dnorm(0,.0001)
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
for(j in 1:J){
a[j]~dnorm(a.hat[j],tau.a)
a.hat[j]<-mu.a
}
mu.a~dnorm(0,.0001)
tau.a<-pow(sigma.a,-2)
sigma.a~dunif(0,100)
} |
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y)
y.hat[i]<-a[species[i]]+ b[species[i]]*x[i]
}
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
for(j in 1:J){
a[j]~dnorm(a.hat[j],tau.a)
b[j]~dnorm(b.hat[j],tau.b)
a.hat[j]<-mu.a
b.hat[j]<-mu.b
}
mu.a~dnorm(0,.0001)
mu.b~dnorm(0,.0001)
tau.a<-pow(sigma.a,-2)
sigma.a~dunif(0,100)
tau.b<-pow(sigma.b,-2)
sigma.b~dunif(0,100)
}
|
Random slopes and intercepts model
- The major change required in moving to correlated random intercepts and slopes is that the probability model for these parameters changes from independent normals to a multivariate normal distribution. This necessitates making the intercepts and slopes the columns of a matrix. In the code below this matrix is denoted by B.
- The code in red sets up the multivariate normal probability model for the matrix of random effects.
- The code in blue formulates noninformative priors for the variances and covariances of the random effects. Observe that to go from a precision matrix (as required for the multivariate normal distribution in WinBUGS) to a variance matrix we have to calculate a matrix inverse (first blue line). The covariance matrix for the random effects is denoted Sigma.B in the code below. (Note: I use lower case letters to denote entries of this matrix. So sigma.b is a scalar, while Sigma.B is a matrix.)
| random slopes and intercepts (uncorrelated) |
random slopes and intercepts |
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y)
y.hat[i]<-a[species[i]]+ b[species[i]]*x[i]
}
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
for(j in 1:J){
a[j]~dnorm(a.hat[j],tau.a)
b[j]~dnorm(b.hat[j],tau.b)
a.hat[j]<-mu.a
b.hat[j]<-mu.b
}
mu.a~dnorm(0,.0001)
mu.b~dnorm(0,.0001)
tau.a<-pow(sigma.a,-2)
sigma.a~dunif(0,100)
tau.b<-pow(sigma.b,-2)
sigma.b~dunif(0,100)
}
|
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y)
y.hat[i]<-a[species[i]]+ b[species[i]]*x[i]
}
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
for(j in 1:J){
a[j]<-B[j,1]
b[j]<-B[j,2]
B[j,1:2]~dmnorm(B.hat[j,], Tau.B[,])
B.hat[j,1]<-mu.a
B.hat[j,2]<-mu.b
}
mu.a~dnorm(0,.0001)
mu.b~dnorm(0,.0001)
Tau.B[1:2,1:2]<-inverse(Sigma.B[,])
Sigma.B[1,1]<-pow(sigma.a,2)
sigma.a~dunif(0,100)
Sigma.B[2,2]<-pow(sigma.b,2)
sigma.b~dunif(0,100)
#correlation
Sigma.B[1,2]<-rho*sigma.a*sigma.b
Sigma.B[2,1]<-Sigma.B[1,2]
rho~dunif(-1,1)
} |
- Here's how we would run this model.
- The biggest change in addition to the need for initial values for sigma.b and rho is that the initial values for the intercepts and slopes must be organized in a matrix. The declaration B=array(rnorm(2*J),c(J,2)) sets up a matrix with J rows and 2 columns with random normal initial values.
- Additional parameters we track are mu.b, sigma.b, and rho.
> invert.data<-list("n","J","y","species","x")
> invert.inits<-function() {
list(B=array(rnorm(2*J),c(J,2)), sigma.y=runif(1), mu.a=rnorm(1), sigma.a=runif(1), mu.b=rnorm(1), sigma.b=runif(1), rho=runif(1))}
> invert.parameters<-c("a", "b", "sigma.y", "mu.a", "sigma.a", "mu.b", "sigma.b", "rho")
> invert.3c<-bugs(invert.data, invert.inits, invert.parameters, "randomslopeint.txt", n.chains=3, bugs.directory = "D:/Program Files/WinBUGS14/", n.iter=10, debug=T)
Random slopes and intercepts model with a level-2 predictor
- When we fit the random slopes and intercepts model to the invertebrate data in lecture 36 we also considered a level-2 model in which the species intercepts varied by feeding type. As a final model we examine how this model would be fit using WinBUGS.
- To do this we first need a new data vector that records the feeding type for each species. The feeding.type variable that occurs in the data frame has a separate value for each observation. To use it in the level-2 model above we instead need a single value for each species (the level-2 unit) making it a vector of length 74. We can accomplish this with the tapply function applied to feeding.type, stratified by species, using a generic function that just pulls off the first observation for each species.
- As a numeric variable feeding type is coded 1 and 2 so I subtract 1 to give it a dummy coding. Finally I turn the result into a numeric vector.
f<-as.numeric(tapply(inverts$feeding.type, inverts$species, function(x) x[1])-1)
- The changes we need to make to the random slopes and intercepts model are minimal. The feeding type variable gets added to the level-2 equation for the intercept and its coefficient gets assigned a noninformative prior. The changes are shown below.
| random slopes and intercepts |
random slopes and intercepts (with a level-2 predictor) |
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y)
y.hat[i]<-a[species[i]]+ b[species[i]]*x[i]
}
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
for(j in 1:J){
a[j]<-B[j,1]
b[j]<-B[j,2]
B[j,1:2]~dmnorm(B.hat[j,], Tau.B[,])
B.hat[j,1]<-mu.a
B.hat[j,2]<-mu.b
}
mu.a~dnorm(0,.0001)
mu.b~dnorm(0,.0001)
Tau.B[1:2,1:2]<-inverse(Sigma.B[,])
Sigma.B[1,1]<-pow(sigma.a,2)
sigma.a~dunif(0,100)
Sigma.B[2,2]<-pow(sigma.b,2)
sigma.b~dunif(0,100)
#correlation
Sigma.B[1,2]<-rho*sigma.a*sigma.b
Sigma.B[2,1]<-Sigma.B[1,2]
rho~dunif(-1,1)
}
|
model{
for(i in 1:n) {
y[i]~dnorm(y.hat[i],tau.y)
y.hat[i]<-a[species[i]]+ b[species[i]]*x[i]
}
tau.y<-pow(sigma.y,-2)
sigma.y~dunif(0,100)
for(j in 1:J){
a[j]<-B[j,1]
b[j]<-B[j,2]
B[j,1:2]~dmnorm(B.hat[j,], Tau.B[,])
B.hat[j,1]<-mu.a + a.1*f[j]
B.hat[j,2]<-mu.b
}
mu.a~dnorm(0,.0001)
mu.b~dnorm(0,.0001)
a.1~dnorm(0,.0001)
Tau.B[1:2,1:2]<-inverse(Sigma.B[,])
Sigma.B[1,1]<-pow(sigma.a,2)
sigma.a~dunif(0,100)
Sigma.B[2,2]<-pow(sigma.b,2)
sigma.b~dunif(0,100)
#correlation
Sigma.B[1,2]<-rho*sigma.a*sigma.b
Sigma.B[2,1]<-Sigma.B[1,2]
rho~dunif(-1,1)
}
|
- I save this model as "finalmodel.txt" and run it using the code shown below.
> invert.data<-list("n","J","y","species","x", "f")
> invert.inits<-function() {list(B=array(rnorm(2*J), c(J,2)), sigma.y=runif(1), mu.a=rnorm(1), a.1=rnorm(1), sigma.a=runif(1), mu.b=rnorm(1), sigma.b=runif(1), rho=runif(1))}
> invert.parameters<-c("a", "b", "sigma.y", "mu.a", "sigma.a", "mu.b", "sigma.b", "rho", "a.1")
> invert.3d<-bugs(invert.data, invert.inits, invert.parameters, "finalmodel.txt", n.chains=3, bugs.directory = "D:/Program Files/WinBUGS14/", n.iter=10, debug=T)
# repeat with 500 iterations per chain
> invert.3d<-bugs(invert.data, invert.inits, invert.parameters, "finalmodel.txt", n.chains=3, bugs.directory = "D:/Program Files/WinBUGS14/", n.iter=500, debug=T)
Comparing Bayesian and Frequentist Results
- At this point it would be interesting to compare the fit of the Bayesian model to what we obtained with the frequentist approach used last time. Because the frequentist version returns point estimates for the parameters while the Bayesian version returns posterior distributions I take the means of these posterior distributions for the Bayesian point estimates. To fit the model of the previous section, a random slopes and intercepts model with a level-2 predictor, with a frequentist perspective I use the lme function from the nlme package. Additional details on lme model syntax can be found in notes for lecture 36.
> library(nlme)
> lme(log(PLD)~I(log(temp)-log(15))+feeding.type, random=~I(log(temp)-log(15))|species, data=inverts, method='ML')->out
- I first compare the fixed effect parameter estimates.
#frequentist values
> fixef(out)
(Intercept) I(log(temp) - log(15)) feeding.typeP
2.6375 -1.4489 0.6774
#Bayesian means
> invert.3d$mean[c("mu.a","mu.b","a.1")]
$mu.a
[1] 2.631
$mu.b
[1] -1.443
$a.1
[1] 0.6851
- Next I examine the variance components
#frequentist variance components
> VarCorr(out)
species = pdLogChol(I(log(temp) - log(15)))
Variance StdDev Corr
(Intercept) 0.75256 0.8675 (Intr)
I(log(temp) - log(15)) 0.28135 0.5304 -0.479
Residual 0.02303 0.1518
#Bayesian variance components (posterior means)
> invert.3d$mean[c("sigma.a","sigma.b","sigma.y","rho")]
$sigma.a
[1] 0.896
$sigma.b
[1] 0.5422
$sigma.y
[1] 0.1556
$rho
[1] -0.4440
- Next I compare the random slopes. I use the coef function on the lme object for this. This calculates
which is also what is being reported as b in the Bayesian model.
> invert.3e$mean[c("b")]
$b
[1] -1.5131 -1.7768 -1.3042 -1.4879 -1.7267 -2.2108 -1.5651 -1.3281 -1.7297 -1.5817
[11] -1.2255 -1.6738 -1.6516 -1.2867 -1.2916 -1.2602 -1.1192 -0.6581 -1.7243 -1.3585
[21] -1.4672 -1.4815 -1.5592 -1.3679 -1.4006 -1.2816 -1.7486 -0.7824 -1.5324 -1.4147
[31] -1.7609 -1.3958 -1.5897 -1.8182 -1.1379 -1.1969 -0.6043 -3.1154 -1.1434 -1.6132
[41] -1.6632 -1.4185 -1.6377 -1.4604 -1.1114 -1.7542 -1.5703 -1.2947 -0.9866 -1.8999
[51] -1.1519 -1.7177 -1.3753 -1.6963 -2.0192 -1.5475 -0.9388 -1.5048 -0.9855 -1.6285
[61] -1.9230 -1.2068 -1.7785 -0.9031 -2.1305 -0.7184 -1.0747 -0.8558 -0.9051 -1.4644
[71] -1.5930 -0.9411 -1.4179 -1.4266
> coef(out)[,2]
[1] -1.5291 -1.7789 -1.3108 -1.4901 -1.7416 -2.2453 -1.5802 -1.3316 -1.7338 -1.5938
[11] -1.2302 -1.6909 -1.6725 -1.2929 -1.2994 -1.2665 -1.1185 -0.6435 -1.7471 -1.3797
[21] -1.4773 -1.4841 -1.5791 -1.3684 -1.4329 -1.2816 -1.7574 -0.7792 -1.5533 -1.4260
[31] -1.7825 -1.4052 -1.5919 -1.8386 -1.1407 -1.1863 -0.6000 -3.1658 -1.1396 -1.6258
[41] -1.6602 -1.4189 -1.6325 -1.4790 -1.1158 -1.7506 -1.5964 -1.2980 -0.9817 -1.9240
[51] -1.1554 -1.7335 -1.3813 -1.7192 -2.0553 -1.5656 -0.9350 -1.5142 -0.9822 -1.6457
[61] -1.9378 -1.2060 -1.7992 -0.8974 -2.1454 -0.7138 -1.0753 -0.8491 -0.9011 -1.4596
[71] -1.6020 -0.9341 -1.4202 -1.4439
#find maximum difference between the two sets of estimates
> max(abs(invert.3e$mean[c("b")]$b-coef(out)[,2]))
[1] 0.05037
- Finally I examine the random intercepts. The coef function returns
. The estimates returned by the Bayesian model include the feeding.type predictor as part of the intercept,
. To make these sets of estimates comparable I subtract off the feeding type contribution to the Bayesian estimate.
> coef(out)[,1]
[1] 2.5468 3.6734 2.2109 1.7286 1.8461 1.8098 2.8159 2.6335 3.2397 2.8106
[11] 2.8251 3.0223 3.0639 2.6332 2.8451 2.4693 1.7579 0.5197 2.9303 2.5899
[21] 3.2657 2.8229 1.0205 3.4077 3.5302 1.8626 3.4411 2.3351 1.7219 1.9237
[31] 1.6488 2.5302 3.1612 3.1614 2.9885 1.4778 1.6214 5.4582 3.0875 3.2099
[41] 3.2572 2.7917 3.8079 2.3639 3.1863 2.3364 2.6165 3.2145 1.7107 2.9872
[51] 2.9670 3.1763 2.6377 3.3110 2.2412 3.1127 2.3110 3.2095 2.8632 3.5682
[61] 3.6175 2.4185 3.0040 2.8714 3.4201 1.6942 3.1227 1.8006 -0.6312 3.2298
[71] 2.9375 0.9286 2.8944 2.5456
> invert.3d$mean["a"]$a-invert.3d$mean["a.1"]$a.1*f
[1] 2.5405 3.6719 2.1972 1.7191 1.8367 1.7963 2.8089 2.6323 3.2395 2.8068
[11] 2.8189 3.0062 3.0551 2.6167 2.8414 2.4627 1.7597 0.4992 2.9171 2.5809
[21] 3.2629 2.8171 1.0186 3.4024 3.5133 1.8528 3.4330 2.3224 1.7234 1.9236
[31] 1.6482 2.5308 3.1497 3.1527 2.9809 1.4703 1.6128 5.4356 3.0905 3.2001
[41] 3.2440 2.7947 3.7891 2.3726 3.1822 2.3284 2.5920 3.2009 1.7056 2.9818
[51] 2.9585 3.1670 2.6191 3.2823 2.2340 3.1021 2.3154 3.1952 2.8685 3.5501
[61] 3.6060 2.4154 2.9786 2.8617 3.4083 1.6868 3.1095 1.8008 -0.6431 3.2277
[71] 2.9176 0.9252 2.8730 2.5397
> coef(out)[,1]->freq.int
> invert.3d$mean["a"]$a-invert.3d$mean["a.1"]$a.1*f->Bayes.int
#find maximum difference between the two sets of estimates
> max(abs(Bayes.int-freq.int))
[1] 0.02873
- The differences are all very small. It is clear here that we can approximate frequentist estimates by using summary statistics of Bayesian posterior probabilities.
Course Home Page