Lecture 47 (lab 12)—Tuesday, April 11, 2006
What was covered?
- Using lattice graphics to display population-averaged and subject-specific models
- Using residual analysis to assess the structural form of the level-1 model
- Caterpillar plots of random effects for comparing level-2 units
- Identifying influential level-2 units and/or level-2 units not fit well by a model
- Specifying uncorrelated random effects
- Explicitly specifying a correlation structure in a mixed model
R functions and commands demonstrated
- ACF (from nlme) calculates the autocorrelation function for model residuals
- coef (from nlme) extracts the subject-specific parameter estimates from an lme model object
- fixef (from nlme) extracts fixed effect parameter estimates from an lme model object
- ranef (from nlme) extracts random effect predictions from an lme model object
R function options
- alpha= (option to plot when plotting ACF objects) used for specifying the α-level for confidence bands when plotting an ACF object
- corAR1() (option for the correlation argument of lme) specifies an AR(1) correlation structure for the level-1 residuals of a mixed model
- correlation= (argument to lme) for specifying a correlation structure for the level-1 residuals of a linear mixed effects model
- na.action= (argument to lme, lm, and glm ) for changing the default response (which is to abort) when missing values are encountered in a data set
- pdDiag= (argument to lme) can be used in the random argument of lme to specify uncorrelated random effects
R packages used
- limma used in the user-created resids.table function
- lattice defines the functions for lattice graphics—xyplot and various panel functions
- nlme for fitting linear mixed effects models
Using lattice graphics to display the results of a multilevel model
- We continue with the larval pelagic duration time data set we began working with last time.
> #obtain data
> inverts<- read.table('http://www.unc.edu/courses/2006spring/ecol/145/001/data/lab11/74species.csv', sep=',', header=TRUE)
> names(inverts)
[1] "species" "meantemp" "temp" "PLD" "taxon" "climate.3"
[7] "feeding.type" "lntemp" "lnPLD" "n" "PLD.15c" "lnPLD.15c"
[13] "X1000.k" "lograte"
- I refit the random slopes and intercepts model we ended with last time.
- lnPLD is the response
- lntemp is the level-1 predictor centered at log(15)
- species defines the level-2 units in the data set
> library(nlme)
> model2<-lme(lnPLD~I(lntemp-log(15)), random=~I(lntemp-log(15))|species, data=inverts, method='ML')
- Symbolically, this is the following model.

or in composite form
- As was discussed in Lecture 45, a multilevel model returns estimates on a number of different levels.
- It yields what's called the population-averaged model, a model that describes the general level-1 pattern of all the level-2 units. This is a model constructed from the fixed effects alone.
- It also yields the subject-specific model. This is a model that describes the behavior of individual level-2 units. It incorporates both fixed effects and random effects.
- The nlme package has three functions that are useful in extracting the quantities needed to construct these models: fixef, ranef, and coef.
- fixef returns the fixed effect parameter estimates used in constructing the population-averaged model. For model 2 above these would be β0 and β1.
- ranef returns a matrix of random effect predictions for each level 2 unit. These are
and
with each row in the matrix corresponding to the predictions for a different level-2 unit.
- coef returns a matrix of subject-specific parameters. For model 2 above these are
and
, each row in the matrix corresponding to a different level-2 unit.
- To understand the relationship between the population-averaged and subject-specific models it is worthwhile to also consider the individual ordinary least squares (OLS) regression models. These are the models we would obtain if we fit the level-1 model separately to each level-2 unit.
- The function listed below randomly selects a subset of level 2 units and plots the OLS line, the subject-specific line, and the population-averaged line for each selected level 2 unit. The first part of the code assembles the information needed to produce the plot generated by the last half of the code.
plot.lattice<-function(inverts,model2,seed)
{
set.seed(seed)
sub1<-sample(unique(inverts$species),9)
sub2<-inverts[inverts$species %in% sub1,]
#obtain random effects
ranef.list<-ranef(model2)
#obtain fixed effect estimates
fixef.list<-fixef(model2)
#obtain list of species
species.list<-sort(unique(inverts$species))
#first create data frame with same linking field name as in raw data
ran.data<-data.frame(rownames(ranef.list), ranef.list)
colnames(ran.data)<-c('species','int','slope')
#add random effects to data set by species
inverts2<-merge(inverts, ran.data)
#select species that are to be plotted
data.combined<-inverts2[inverts2$species %in% sub1,]
#change background color to white
trellis.par.set(col.whitebg())
#generate graph
xyplot(lnPLD~I(lntemp-log(15))|species,data=data.combined, layout=c(3,3),
xlab=expression(log(temperature/15)),ylab=expression(log(PLD)),
panel=function(x,y,subscripts){
panel.xyplot(x,y,pch=16)
panel.abline(lm(y~x),lty=1,col=4)
panel.abline(c(fixef.list[1],fixef.list[2]),lty=1,lwd=2,col=2)
panel.abline(c(data.combined$int[subscripts][1]+fixef.list[1],
data.combined$slope[subscripts][1]+fixef.list[2]),lty=2)
},
subscripts=T,strip = strip.custom(par.strip.text = list(cex = 0.85)),
par.settings = list(axis.text = list(cex = 0.7)),
key=list(lines=list(lty=c(1,1,2),col=c(4,2,1)),
text=list(c("OLS fitted trajectory",
"population average trajectory",
"model-based empirical Bayes trajectory"),cex=rep(.85,3)),
border=0,space='top'))
}
- To use the function I just specify the name of the data set, the model name, and a seed for the random number generator. The function then randomly selects 9 species to plot based on the seed specified. Fig. 1 shows two runs of the function with different random seeds.
> library(lattice)
> plot.lattice(inverts,model2,10)
> plot.lattice(inverts,model2,17)

Fig. 1 Fitted trajectories for random selections of species
- The empirical Bayes trajectory is the subject-specfic model, i.e., the model that includes both fixed and random effects. The empirical Bayes trajectory is a weighted sum of the population-averaged trajectory and the individual OLS trajectory. The weights are determined by the amount of data available for this subject and the reliability of the OLS model. If there are few data values and/or they are very close together, the empirical Bayes line will more closely resemble the population-averaged line. If there are a lot of data and/or the data values are spread out, then the empirical Bayes estimate will more closely resemble the OLS line.
- This explains why the multilevel model can be estimated even when some subjects have so few observations that an individual OLS model cannot be fit. For these observations the subject-specific model will be very close to the population-averaged model.
- What follows is a description of what the various lines in the last half of the function do. When the random effects were merged with the data in the first part of the function each species ended up with the same random effect repeated for all of its observations. This explains some of the peculiarities in the panel function when we try to reference these random effects.
lnPLD~I(lntemp-log(15))|species |
This defines plotting variables and the grouping variable (species) |
layout=c(3,3) |
Sets up a tableau of 3 rows and 3 columns of plots |
xlab=expression(log(temperature/15)) |
Label for x-axis. Here expression is used to display mathematical notation. Note: |
panel=function(x,y,subscripts){ |
Begins definition of panel function for defining what is included in each plot. The subscripts argument references the elements of each group and is needed only for adding the empirical Bayes trajectories |
panel.xyplot(x,y,pch=16) |
Adds points |
panel.abline(lm(y~x),lty=1,col=4) |
Adds OLS line |
panel.abline(c(fixef.list[1], fixef.list[2]), lty=1, lwd=2, col=2) |
Adds population-averaged line |
panel.abline(c(data.combined$int[subscripts][1] + fixef.list[1], data.combined$slope[subscripts][1] + fixef.list[2]),lty=2) |
Adds subject-specific lines. Subscripts identifies the elements belonging to a group (species) and [1] selects only the first of these elements (needed because the random effects are duplicated due to the merge). The keyword subscripts is needed because we have to calculate a different intercept and slope for each species based on the variables that are referenced here. |
subscripts=T |
Required if subscripts are to be used |
strip = strip.custom(par.strip.text = list(cex = 0.85)) |
Adjusts size of text in the strips above each plot (the species names) |
par.settings = list(axis.text = list(cex = 0.85)) |
Adjusts size of numbers on axis |
key=list(lines=list(lty=c(1,1,2), col=c(4,2,1)), text=list(c("OLS fitted trajectory", "population average trajectory", "model-based empirical Bayes trajectory"), cex=rep(.85,3)), border=0,space='top') |
Creates legend to label lines appearing in plots |
Checking the structural form of the level-1 model
- Level-1 residuals are useful for checking the structural form of the level-1 model. Since we have a single predictor this is done by plotting the level-1 residuals against that predictor. Since residual plots tend to be noisy when there are lots of data values, superimposing a loess curve can be helpful in detecting a systematic trend.
- I plot the Pearson residuals from model 2 against the predictor log(temperature). To better observe the pattern in the loess curve I zoom in on the interval from –1 to 1 in the plot.
> plot(inverts$lntemp,residuals(model2, type='pearson'), xlab='log(temperature)', ylab='Pearson residuals', ylim=c(-1,1))
> lines(lowess(residuals(model2, type='pearson')~inverts$lntemp), col=2, lwd=2)
- The plot shows a distinct quadratic trend. I add a quadratic term and then use a likelihood ratio test to assess its statistical significance.
> model3<-lme(lnPLD~I(lntemp-log(15))+I((lntemp-log(15))^2), random=~I(lntemp-log(15))|species, data=inverts, method='ML')
> anova(model2,model3)
Model df AIC BIC logLik Test L.Ratio p-value
model2 1 6 210.4372 230.7441 -99.21859
model3 2 7 199.2543 222.9457 -92.62713 1 vs 2 13.18290 3e-04
The result is highly significant. Notice also that the AIC has dropped substantially. Thus both criteria argue for including a quadratic term.
- Notice that this has changed the level-1 model so that now there are three parameters at level 1.

Each of these parameters can have a level-2 equation associated with it. In particular we now have the option of allowing the coefficient of the quadratic term to vary randomly across species. I test the need for that below.
> anova(model3,model3.5)
Model df AIC BIC logLik Test L.Ratio p-value
model3 1 7 199.2543 222.9457 -92.62713
model3.5 2 10 196.1694 230.0143 -88.08469 1 vs 2 9.084886 0.0282
So we have weak evidence that the quadratic term should include a random effect.
Using level-2 random effects to address research questions of interest
- The primary research question is to quantify the variability in the larval duration temperature relationship across invertebrate species. The operating hypothesis is that intercepts will be much more variable than slopes. The level 2 random effects can be used to test this hypothesis since they represent the individual variability in slopes and intercepts for the various invertebrate species about the population mean trajectory.
- Random effects are not said to estimated, but rather "predicted". The method used to obtain them is called empirical Bayes estimation so the random effects are sometimes just called empirical Bayes estimates. The random effects can be extracted from an nlme model with the ranef function. Below I extract the random effects from the quadratic model fit in the previous section. Since there are 74 pairs of them, one for each species, I display only the first 10 below.
> ranef(model3)[1:10,]
(Intercept) I(lntemp - log(15))
Amphiprion melanopus 0.1171001 -0.02799795
Armases miersii 0.4770219 -0.09358326
Balanus amphitrite 1 -0.2219984 0.13166601
Balanus eburneus -0.7310860 0.05395505
Balanus trigonus -0.6782908 -0.06610844
Callianassa tyrrhena -0.7457328 -0.38392852
Cancer gracilis 0.3637000 -0.11882529
Cancer irroratus 0.1905045 0.06848089
Cancer magister 0.8615153 -0.16859720
Cancer oregonensis 0.3654544 -0.11613289
- The software package MLwiN provides a useful display of level-2 random effects in something it refers to as a caterpillar plot. Caterpillar plots display each random effect in increasing order along with a 95% confidence interval. The display allows one to immediately assess how different the random effects are for different species. Based on formulas in Goldstein (1995) I've written some R functions to generate these plots for model3 above. The code can be found on the class web page in the data/lab12 folder. There are three functions listed there along with sample runs. The functions are the following.
- resids.table: this takes two arguments, the name of the data set and the model object that is the result of an lme fit. It requires an R package called limma which it loads, so this needs to be installed. This function calculates the standard errors of the random effects and outputs them along with the random effects into a table. I have not tried to make this function totally portable so it currently uses species as the grouping variable to identify the level-2 units throughout the function. There are two places where it assumes that the model includes variables lntemp and lntemp2 as predictors (in creating a matrix called xmat) and that the variable lntemp has been declared as random (to construct an object called Zmat). So to adapt this function to other problems these variables would need to be changed.
- graph.u1: this creates the caterpillar plot for the level-2 slope random effects. It requires as input arguments the object created by the resids.table function and the name of the data set. Hardwired into the code is the assumption that the grouping variable is called species.
- graph.u0: this creates the caterpillar plot for the level-2 intercept random effects. It requires as input arguments the object created by the resids.table function and the name of the data set. Hardwired into the code is the assumption that the grouping variable is called species.
- After pasting the functions into R I produce the graphs as follows.
> out.resids<-resids.table(inverts,model3)
> out.resids[1:10,]
Species.num u0 u1 sd.u0 sd.u1
Amphiprion melanopus 1 0.1171001 -0.02799795 0.2098857 0.2864359
Armases miersii 2 0.4770219 -0.09358326 0.2077348 0.2802973
Balanus amphitrite 1 3 -0.2219984 0.13166601 0.1426457 0.2228906
Balanus eburneus 4 -0.7310860 0.05395505 0.1768714 0.2569641
Balanus trigonus 5 -0.6782908 -0.06610844 0.1552050 0.2482023
Callianassa tyrrhena 6 -0.7457328 -0.38392852 0.1390658 0.2253486
Cancer gracilis 7 0.3637000 -0.11882529 0.1439511 0.2110557
Cancer irroratus 8 0.1905045 0.06848089 0.1293948 0.1845003
Cancer magister 9 0.8615153 -0.16859720 0.1863755 0.2525198
Cancer oregonensis 10 0.3654544 -0.11613289 0.1691172 0.2404039
- The first 10 lines of the out.resids file illustrates its basic structure. I next produce the two caterpillar plots. Both functions are set up so that after the plot is produced you can interactively label points on the graph by clicking near the point. The text is placed roughly where you click. To exit interactive mode right-click on the graph and choose quit (Windows), or press the esc key (Macintosh).
> graph.u1(out.resids,inverts)
> graph.u0(out.resids,inverts)

Fig. 3 Caterpillar plots for level-2 residuals
- Observe how very different the plots are.
- The slope random effects (plot of u1i on the left) are all very similar and except for three species (labeled) their confidence intervals include 0. This suggests that if we were to omit these three species the "need" for slope random effects would disappear (which it does). This directly addresses the research question of interest because it suggests that for 71 of the 74 species, a common slope model is adequate. Interestingly at least two of the aberrant species are indeed unusual. Limulus polyphemus is an ancient lineage (the horseshoe crab) and Laqueus californianus is a brachiopod.
- The intercept random effects are all quite different from each other and from zero. Thus although a common slope model might work for this assemblage, a common intercept model would not.
Identifying level-2 units that are poorly fit by the model
- Snijders and Bosker (1999) have constructed statistics that help identify those level-2 units poorly described by a multilevel model. One statistic reflects a level-2 unit's influence on model fit and a second statistic describes how well that level-2 unit is fit by the model. The fit statistic simultaneously examines all the parameter estimates for that unit in the form of a multivariate residual and identifies if in concert the parameters are in any way atypical. The fit statistics has an asymptotic chi-squared distribution which is used in constructing a significance test.
- Snijders and Bosker (1999) recommend only being concerned about those level-2 units that are both influential on the fit and are themselves fit poorly by the model. I display both of these statistics in a plot that makes identification of unusual points easy. The code for producing this plot can be found in the file lab12 diagnostics.txt The code is set up to examine the level-2 units for the quadratic model, model3, we fit above. The graph is displayed in interactive "identify" mode to allow points to be labeled interactively. Pasting the code into R produces the graph shown in Fig. 4.
- On the influence scale (y-axis) there are five observations that are clearly disjunct from the rest. Of these only 3 are extreme (p < .05, blue line) on the fit scale (x-axis). Snijders and Bosker (1999) suggest using a Bonferroni correction to the α-level (red vertical line) to avoid inadvertently declaring a level-2 unit as unusual by chance. If we were to use this criterion here, none of the observations would be deemed unusual. Since I would rather err on the side of caution I would stick with the usual significance level of α = .05 in which case three observations would be flagged.
Checking the adequacy of the correlation structure induced by a mixed model
- It was noted in Lecture 46 that when multiple random effects are included in a mixed model, the correlation that is induced by the random effects can be quite complicated. While this correlation structure is often robust enough to account for most situations that arise in practice there exist a number of special cases where a different kind of correlation structure might be required. A simple example of this is in longitudinal studies where the level-1 observations are made repeatedly over time on the same level-2 units. Here one might typically expect correlations to decay over time something that is usually not captured by including random effects. Still, the random effects formulation is attractive because it readily deals with observational imbalance (different numbers of observations on the same unit), missing data, and an unequal spacing of observations within units.
- Fortunately, the nlme package allows the specification of an additional correlation structure for the level-1 observations. Since the larval duration data set we've been considering is not a longitudinal study, I use a different data set to illustrate specifying correlation structures in mixed models. This data set describes the growth of apples on trees in an orchard. The data are described in Schabenberger and Pierce (2002), p. 466.
At the Winchester Agricultural Experiment Station of Virginia Polytechnic Institute and State University ten apple trees were randomly selected and twenty five apples were randomly chosen on each tree. We concentrate on the analysis of the apples in the largest size class, those whose initial diameter exceeded 2.75 inches. In total there were eighty apples in that size class. Diameters of the apples were recorded in two-week intervals over a twelve-week period.
- The goal is to develop a model of apple growth. Observe that the data have structure. We have repeated measurements on individual apples (up to six) and the apples themselves came in groups from individual trees. These are hierarchical data with three levels to the hierarchy: individual measurements nested in apples nested in trees. For the moment we'll ignore the last level, the trees, and treat this as a two-level design.
> apples<- read.table('http://www.unc.edu/courses/2006spring/ecol/145/001/data/lab12/AppleData.csv',
sep=',', header=TRUE)
> names(apples)
[1] "TREE" "APPLE" "size" "appleid" "time" "diam"
> apples[1:15,]
TREE APPLE size appleid time diam
1 1 1 7 1 1 2.90
2 1 1 7 1 2 2.90
3 1 1 7 1 3 2.90
4 1 1 7 1 4 2.93
5 1 1 7 1 5 2.94
6 1 1 7 1 6 2.94
7 1 4 7 4 1 2.86
8 1 4 7 4 2 2.90
9 1 4 7 4 3 2.93
10 1 4 7 4 4 2.96
11 1 4 7 4 5 2.99
12 1 4 7 4 6 3.01
13 1 5 7 5 1 2.75
14 1 5 7 5 2 2.78
15 1 5 7 5 3 2.80
- The variable diam is the response and time is the predictor. It indicates the different times that measurements were made on the same apple. To understand the meaning of some of the variables a little bit better, I focus on a section of the data set where there is a transition from one tree to another.
> apples[115:130,]
TREE APPLE size appleid time diam
115 2 25 7 50 1 2.75
116 2 25 7 50 2 2.80
117 2 25 7 50 3 2.83
118 2 25 7 50 4 2.85
119 2 25 7 50 5 2.86
120 2 25 7 50 6 2.88
121 3 1 7 51 1 2.91
122 3 1 7 51 2 3.00
123 3 1 7 51 3 3.02
124 3 1 7 51 4 3.03
125 3 1 7 51 5 NA
126 3 1 7 51 6 NA
127 3 10 7 60 1 2.81
128 3 10 7 60 2 2.89
129 3 10 7 60 3 2.87
130 3 10 7 60 4 2.93
- TREE identifies the tree from which the apple came and APPLE identifies the specific apple from that tree. Notice that APPLE is not a unique identifier since it starts over from 1 once we switch to a new tree. The variable appleid on the other hand uniquely identifies the different apples in the study. So, when we treat this as two-level data set we will need to use appleid as the grouping variable. (If APPLE were used then the data for apples from different trees but with the same value of APPLE would be combined.) Observe also there are some missing data.
- I begin by obtaining a panel display of the data for each apple using lattice graphics. I make use of the fact that the plot function when applied to a groupedData object produces such a plot by default. There's way too much data to look at all at once, so I restrict things to just the first two trees.
> apples.sub<-apples[apples$TREE<=2,]
> grp<-groupedData(diam~time|appleid, data=apples.sub)
> plot(grp)
- Although Fig 5 shows a slight curvilinear trend for some apples, it appears, at least as a first approximation, that a growth model linear in time might be adequate.
- I begin in the usual fashion by first fitting an unconditional means model, a two-level model with no predictors in which only the intercept is random at level 2.
> model0<-lme(diam~1, random=~1|appleid, data=apples, method='ML', na.action=na.omit)
Note the use of the option na.action=na.omit. This instructs lme to drop observations that have missing data for any of the variables currently used in the model (only diam here). The default action with missing data is na.fail which causes the program to abort and not produce any output.
- From this model we can calculate the intra-class correlation coefficient (ICC), which estimates the correlation among observations coming from the same apple.
> VarCorr(model0)
appleid = pdLogChol(1)
Variance StdDev
(Intercept) 0.009545563 0.09770140
Residual 0.003049309 0.05522054
> 0.009545563/(0.009545563+0.003049309)
[1] 0.7578928
- A correlation of 0.76 is pretty strong evidence that we need to account for the data structure when analyzing these data. I next add time as a predictor at level 1.
> model1<-lme(diam~time, random=~1|appleid, data=apples, method='ML', na.action=na.omit)
> VarCorr(model1)
appleid = pdLogChol(1)
Variance StdDev
(Intercept) 0.0105001410 0.10247020
Residual 0.0004234962 0.02057902
- By comparing the residual variance, σ2, of this model (model1) with that of the unconditional means model (model0), we obtain a pseudo-R2 for level 1.
> (0.003049309-0.0004234962)/0.003049309
[1] 0.8611173
So 86% of variability in apple diameter is explained by its linear relationship with time. This is quite high. Notice this truly is a "pseudo" R2 here because the level-2 variance, τ2, actually increased slightly in going from model0 to model1.
- Although the intra-class correlation coefficient provides sufficient motivation to account for the data structure when building a model, additional motivation can be obtained by comparing our structured model to one in which structure is completely ignored, an ordinary linear regression model.
> modelOLS<-lm(diam~time, data=apples, na.action=na.omit)
> sapply(list(modelOLS,model1),AIC)
[1] -780.0338 -1820.8447
So we see that the AIC is substantially lower for the model that includes structure.
- Finally I consider a model that allows both the slopes and intercepts to be random.
> model2<-lme(diam~time, random=~time|appleid, data=apples, method='ML', na.action=na.omit)
> sapply(list(model1,model2), AIC)
[1] -1820.845 -1920.751
- There is a substantial drop in AIC indicating we should prefer a model in which both slopes and intercepts are random. Since the two models are nested we can also carry out a likelihood ratio test. The results concur with the conclusions we derived by comparing the AIC values.
> anova(model1, model2)
Model df AIC BIC logLik Test L.Ratio p-value
model1 1 4 -1820.845 -1804.399 914.4224
model2 2 6 -1920.751 -1896.082 966.3755 1 vs 2 103.9062 <.0001
- I next examine the variance components from the random slopes and intercepts model.
> VarCorr(model2)
appleid = pdLogChol(time)
Variance StdDev Corr
(Intercept) 7.847939e-03 0.088588592 (Intr)
time 5.182587e-05 0.007199019 0.542
Residual 2.614601e-04 0.016169727
- The output from VarCorr indicates that there is a sizable correlation, 0.542, between the random effects. Schabenberger and Pierce (2002) from which these data are taken, elected to fit a model in which the random effects are assumed to be independent of each other. By modifying the random statement in lme we can fit such a model in R too. The specification is random=list(appleid=pdDiag(~time)). The pdDiag function creates a diagonal covariance matrix,
,
instead of the more general covariance matrix we've been using,
.
I fit this model and compare it to the model in which the random effects are allowed to be correlated.
> model3<-lme(diam~time, random=list(appleid=pdDiag(~time)), data=apples, method='ML', na.action=na.omit)
> sapply(list(model2, model3), AIC)
[1] -1920.751 -1906.999
> anova(model2, model3)
Model df AIC BIC logLik Test L.Ratio p-value
model2 1 6 -1920.751 -1896.082 966.3755
model3 2 5 -1906.999 -1886.442 958.4996 1 vs 2 15.75172 1e-04
- Both AIC and the likelihood ratio test agree that contrary to what is done in Schabenberger and Pierce (2002), the random effects should be allowed to be correlated. We stick with model 2.
- Since the observations are measured over time, the question arises as to whether the correlation induced by the random effects is adequate to handle the serial correlation that may exist between observations (observations measured closer in time are likely to more similar than those more separated in time). We can examine this possibility by plotting the autocorrelation function (ACF). The ACF R function calculates the correlation between pairs of observations when separated by increasing time lags. Applying the ACF to the residuals from a model is a way to determine if there is any residual correlation remaining that was unaccounted for by the inclusion of predictors and random effects. The estimated correlation between residuals separated by k time units,
, is given by the following formula.

Here
is the jth level-1 residual for level-2 unit i and
is the residual from the same unit but k time units later. N(k) is the number of terms that go into the numerator sum and N(0) is the total number of different residuals used in the numerator calculation.
- Applying ACF to an lme model yields a list of correlations at various lags. The function assumes that the observations within a group are listed in temporal order with equal time intervals between observations. If this is not the case, the results returned by ACF will not be sensible.
> ACF(model2)
lag ACF
1 0 1.0000000
2 1 -0.1697802
3 2 -0.3497668
4 3 -0.3275452
5 4 -0.1584048
6 5 0.5242356
- To determine if any of these correlations are unusually large we can compare them against approximate two-sided critical bounds at a specified significance level. The recommended bounds are

where z(p) denotes the quantile of a standard normal distribution such that P(Z ≤ z(p)) = p.
- We don't have to worry about constructing these bounds ourselves. The plot function when applied to an ACF object does this automatically when a level is specified for the alpha argument. I plot the ACF with α = .05 and α = .01, the latter being the Bonferroni-adjusted α-level to account for the fact that there are five separate correlations (for five nonzero lags) whose significance we wish to test.
> plot(ACF(model2),alpha=.05)
> plot(ACF(model2),alpha=.01)
Fig. 6 ACF results for α = .05 (left) and α = .01 (right)
Observe that even with the Bonferroni correction there are significant autocorrelations at the first three lags. (The significant autocorrelation at lag 5 can probably be ignored since there isn't much data going into its calculation.) Also notice that contrary to expectation the correlations between nearby observations (in time) are negative!
- Although the graph does not exhibit a gradual decay in correlations with time as would be expected with temporally correlated data, we can still try to add such a correlation structure to these data. A complete list of all the available correlation structures can be found in Pinheiro and Bates (2000), p. 234. The simplest such correlation model is AR(1) in which the correlation at lag 1 is given by φ and correlations at further lags are given by
. A correlation structure is added to a linear mixed effect model in R by specifying a correlation argument in the lme function call. The AR(1) correlation structure is specified by corAR1( ).
> model4<-lme(diam~time, random=~time|appleid, data=apples, method='ML', na.action=na.omit, correlation=corAR1())
> sapply(list(model2,model4),AIC)
[1] -1920.751 -1936.372
- AIC argues that adding the correlation structure has been an improvement. I examine the summary statistics.
> summary(model4)
Linear mixed-effects model fit by maximum likelihood
Data: apples
AIC BIC logLik
-1936.372 -1907.592 975.186
Random effects:
Formula: ~time | appleid
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 0.085754233 (Intr)
time 0.006026921 0.83
Residual 0.021343046
Correlation Structure: AR(1)
Formula: ~1 | appleid
Parameter estimate(s):
Phi
0.4937614
Fixed effects: diam ~ time
Value Std.Error DF t-value p-value
(Intercept) 2.831036 0.009996493 370 283.20296 0
time 0.028944 0.000973891 370 29.71994 0
Correlation:
(Intr)
time 0.388
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.3548201 -0.4006783 0.1000436 0.6102602 2.8733065
Number of Observations: 451
Number of Groups: 80
- Surprisingly, given what we saw in the ACF plot, the estimated value of φ is positive. I plot the ACF for the new model.
> plot(ACF(model4),alpha=.01)
- As Fig. 7 shows the problems haven't gone away. We could try other correlation structures but these would all be ad hoc measures. Unusual correlations such as these indicate model misspecification. We could try to fix this by including additional time-varying predictors in the model or by choosing a different functional form for the model. This could mean fitting a quadratic model or even a nonlinear model. Since there are no additional predictors available for these data and since we'll be considering nonlinear mixed effect models next week in the context of a more interesting data set that I know more about, we'll stop here with this somewhat rather inadequate model.
- Tomorrow we'll examine how the third level in the hierarchy, trees, can be accounted for in the model.
Cited References
- Goldstein, Harvey. 1995. Multilevel Statistical Models. Edward Arnold, London.
- Pinheiro, J. C. and Bates, D. M. 2000. Mixed-Effects Models in S and S-Plus . Springer-Verlag, New York.
- Schabenberger, Oliver and Francis J. Pierce. 2002. Contemporary Statistical Models for the Plant and Soil Sciences. CRC Press, Boca Raton, FL.
- Snijders, Tom and Roel Bosker. 2000. Multilevel Analysis. Sage Publications, London.
Course Home Page