Lecture 43 (lab 11)—Tuesday, April 4 , 2006

What was covered?

R functions and commands demonstrated

R function options

R packages used

Overview of the data

> #obtain data
> inverts<- read.table('http://www.unc.edu/courses/2006spring/ecol/145/001/data/lab11/74species.csv', sep=',', header=TRUE)
> names(inverts)
 [1] "species"      "meantemp" "temp"  "PLD" "taxon"   "climate.3"
 [7] "feeding.type" "lntemp"   "lnPLD" "n"   "PLD.15c" "lnPLD.15c"
[13] "X1000.k"      "lograte"

> inverts[1:10, c("species", "lnPLD", "lntemp", "feeding.type", "climate.3")]
               species    lnPLD   lntemp feeding.type climate.3
1 Amphiprion melanopus 2.509599 3.218876            P  tropical
2 Amphiprion melanopus 2.197225 3.332205            P  tropical
3      Armases miersii 2.960105 3.173878            L  tropical
4      Armases miersii 2.397895 3.377588            L  tropical
5 Balanus amphitrite 1 2.833213 2.708050            P temperate
6 Balanus amphitrite 1 2.639057 2.995732            P temperate
7 Balanus amphitrite 1 2.302585 3.135494            P temperate
8 Balanus amphitrite 1 2.079442 3.218876            P temperate
9 Balanus amphitrite 1 2.197225 3.295837            P temperate
10    Balanus eburneus 2.028148 2.995732            P temperate

> dim(inverts)
[1] 218 14
> length(unique(inverts$species))
[1] 74

> table(tapply(inverts$PLD,inverts$species,length))

 2  3 4 5 6
29 29 9 5 2

Examining the data structure

> set.seed(10)
> sub1<-sample(unique(inverts$species),16)
> sub1

 [1] Limulus polyphemus     Dendraster excentricus  Haliotis rufescens
 [4] Ostrea lurida          Callianassa tyrrhena    Chthamalus stellatus
 [7] Crassostrea virginica  Spirorbis spirorbis     Lytechinus variegatus
[10] Gadus morhua           Mactra solidissima      Hydroides elegans
[13] Cancer irroratus       Laqueus californianus   Cryptasterina pentagona
[16] Elminius modestus
74 Levels: Amphiprion melanopus Armases miersii Balanus amphitrite 1 ... Upeneus tragula

> sub2<-inverts[inverts$species %in% sub1,]
> dim(sub2)

[1] 50 14

Fig. 1  Default lattice graph for grouped data objects

> library(nlme)
> sub3<-groupedData(lnPLD~lntemp|species, data=sub2)

> plot(sub3)

> library(lattice)
> trellis.par.set(col.whitebg())

> xyplot(lnPLD~lntemp|species, data=sub2, panel=function(x,y) {
panel.xyplot(x,y)
panel.abline(lm(y~x), lty=1, col=4)
},
strip=strip.custom(par.strip.text = list(cex=0.75)))

Model building

The unconditional means model

where

To understand how the R specification of this model works, I substitute the level-2 equation into the level-1 equation to obtain the composite model.

model0<-lme(fixed=lnPLD~1,random=~1|species,data=inverts,method="ML")

> summary(model0)
Linear mixed-effects model fit by maximum likelihood
Data: inverts
     AIC      BIC    logLik
451.2372 461.3906 -222.6186

Random effects:
Formula: ~1 | species
        (Intercept)  Residual
StdDev:   0.8198609 0.4529645

Fixed effects: lnPLD ~ 1
               Value Std.Error  DF  t-value p-value
(Intercept) 2.888758 0.1008001 144 28.65829       0

Standardized Within-Group Residuals:
        Min          Q1         Med         Q3        Max
-2.15822703 -0.58084922 -0.05855462 0.55220371 2.38062759

Number of Observations: 218
Number of Groups: 74

> VarCorr(model0)
species = pdLogChol(1)
            Variance  StdDev
(Intercept) 0.6721719 0.8198609
Residual    0.2051768 0.4529645

> VarCorr(model0)[1,1]
[1] "0.6721719"

As a result our calculations will turn out to be rather awkward. I need to convert each character value to a numeric value before I can do arithmetic on it.

> tau.sq<-as.numeric(VarCorr(model0)[1,1])
> sigma.sq<-as.numeric(VarCorr(model0)[2,1] )
> tau.sq/(tau.sq+sigma.sq)

[1] 0.76614

So the correlation between observations coming from the same species is 0.766, a sizeable value and far from 0, the value required for ordinary linear regression to be appropriate here.

> intervals(model0)
Approximate 95% confidence intervals

Fixed effects:
               lower     est.   upper
(Intercept) 2.689977 2.888758 3.08754
attr(,"label")
[1] "Fixed effects:"

Random Effects:
Level: species
                    lower      est.     upper
sd((Intercept)) 0.6844931 0.8198609 0.9819995

Within-group standard error:
    lower      est.     upper
0.4034828 0.4529645 0.5085144

The random intercepts model

For completeness I've included equations at level 2 for all level-1 parameters. Since the slopes are not assumed to vary across species in this model, I set the level-2 equation for the slope for each species equal to a constant.

> model1<-lme(lnPLD~lntemp, random=~1|species, data=inverts,method="ML")

This is identical to the unconditional means model specification except that we now have a predictor in the fixed effects portion of the model.

fixed=lnPLD~1+lntemp

> summary(model1)
Linear mixed-effects model fit by maximum likelihood
Data: inverts
     AIC      BIC    logLik
239.5597 253.0977 -115.7798

Random effects:
Formula: ~1 | species
       (Intercept)  Residual
StdDev:  0.8891414 0.2115642

Fixed effects: lnPLD ~ lntemp
               Value Std.Error   DF   t-value p-value
(Intercept) 6.164731 0.17954444 143  34.33541       0
lntemp     -1.161889 0.05168823 143 -22.47879       0
Correlation:
       (Intr)
lntemp -0.811

Standardized Within-Group Residuals:
        Min          Q1         Med         Q3        Max
-3.08734705 -0.45362977 -0.04267881 0.50275266 3.62335848

Number of Observations: 218
Number of Groups: 74

> model.OLS<-lm(lnPLD~lntemp,data=inverts)

We can test for the importance of structure using a likelihood ratio test. The degrees of freedom for the test is just the difference in the number of estimated parameters. The random intercepts model estimates one additional parameter .

> logLik(model.OLS)
'log Lik.' -273.5342 (df=3)
> logLik(model1)
'log Lik.' -115.7798 (df=4)
> 1-pchisq(2*(logLik(model1)-logLik(model.OLS)),1)
[1] 0

Clearly the difference in the loglikelihoods is gigantic. The AIC tells an identical story. The random intercepts model is a huge improvement over the ordinary least squares model. Clearly it is important that we account for data structure in fitting our models.

> .5*0 + .5*(1-pchisq(2*(logLik(model1)-logLik(model.OLS)),1))
[1] 0

so in this instance our conclusion does not change.

> anova(model1,model0)
       Model df      AIC      BIC    logLik   Test  L.Ratio p-value
model1     1  4 239.5597 253.0977 -115.7798
model0     2  3 451.2372 461.3906 -222.6186 1 vs 2 213.6775  <.0001

The variance components are shown next.

> VarCorr(model1)
species = pdLogChol(1)
            Variance  StdDev
(Intercept) 0.7905725 0.8891414
Residual    0.0447594 0.2115642

> VarCorr(model0)
species = pdLogChol(1)
            Variance  StdDev
(Intercept) 0.6721719 0.8198609
Residual    0.2051768 0.4529645

Pulling out the relevant quantities we have the following.

> (as.numeric(VarCorr(model0)[2,1]) - as.numeric(VarCorr(model1)[2,1]))/ as.numeric(VarCorr(model0)[2,1])
[1] 0.7818496

So 78% of the variability of lnPLD at level 1 (the individual level) is explained by its linear relationship to lntemp.

The random slopes and intercepts model

where

> model2<-lme(lnPLD~1+lntemp, random=~1+lntemp|species, data=inverts, method="ML")

or equivalently

> model2<-lme(lnPLD~lntemp, random=~lntemp|species, data=inverts, method="ML")

> summary(model2)
Linear mixed-effects model fit by maximum likelihood
 Data: inverts
       AIC      BIC    logLik
  210.4372 230.7441 -99.21859

Random effects:
 Formula: ~lntemp | species
 Structure: General positive-definite, Log-Cholesky parameterization
            StdDev    Corr
(Intercept) 2.0544348 (Intr)
lntemp      0.5301851  -0.92
Residual    0.1520061

Fixed effects: lnPLD ~ lntemp
               Value  Std.Error  DF   t-value p-value
(Intercept) 7.088667 0.29554220 143  23.98530       0
lntemp     -1.454417 0.08465556 143 -17.18040       0
Correlation:
       (Intr)
lntemp -0.942

Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.56320100 -0.36679640 -0.01297869 0.41001698 3.64570103

Number of Observations: 218
Number of Groups: 74

> anova(model2,model1)
       Model df      AIC      BIC     logLik   Test  L.Ratio p-value
model2     1  6 210.4372 230.7441  -99.21859
model1     2  4 239.5597 253.0977 -115.77985 1 vs 2 33.12252  <.0001

So the addition of random slopes is a significant improvement.

> .5*(1-pchisq(33.12252,1)) + .5*(1-pchisq(33.12252,2))
[1] 3.642662e-08

> VarCorr(model2)
species = pdLogChol(lntemp)
            Variance   StdDev    Corr
(Intercept) 4.22070229 2.0544348 (Intr)
lntemp      0.28109623 0.5301851 -0.92
Residual    0.02310587 0.1520061

> as.numeric(VarCorr(model2)[2,3]) * as.numeric(VarCorr(model2)[1,2]) * as.numeric(VarCorr(model2)[2,2])
[1] -1.002092

> model2.5<-lme(lnPLD~I(lntemp-mean(lntemp)), random=~I(lntemp-mean(lntemp))|species, data=inverts, method="ML")
> VarCorr(model2.5)

species = pdLogChol(I(lntemp - mean(lntemp)))
                         Variance   StdDev    Corr
(Intercept)              0.81092490 0.9005137 (Intr)
I(lntemp - mean(lntemp)) 0.28109627 0.5301851 -0.444
Residual                 0.02310586 0.1520061

Observe that the correlation has been reduced by half.

Cited References

Course Home Page


Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum in Ecology, Box 3275, University of North Carolina, Chapel Hill, 27516
Copyright © 2006
Last Revised--May 15, 2007
URL: http://www.unc.edu/courses/2006spring/ecol/145/001/docs/lectures/lecture43.htm