Read in and subset the data
# read in data
> srrs2 <- read.table("http://www.unc.edu/courses/2007spring/enst/562/001/assignments/assign10/radon.txt", header=T, sep=",")
# select only Minnesota data
> minn<-srrs2[srrs2$state=="MN",]
# obtain 6-character string to identify counties
> minn$countycode<-substr(as.character(minn$county),1,6)
# log-transform radon while adjusting zero values if necessary
> minn$logradon <- log(ifelse(minn$activity==0, .05, minn$activity))
# set background to white for older versions of R
> trellis.par.set(col.whitebg())
> xyplot(logradon~factor(floor)|countycode,data=minn, layout=c(17,5), strip=strip.custom(par.strip.text=list(cex=0.6)), panel=function(x,y) {
panel.xyplot(x,y)
mymeans<-tapply(y,x,mean)
panel.lines(as.numeric(names(mymeans))+1,mymeans,col=2)
}, xlab='Floor (0=basement, 1=first floor)', ylab='log(radon)')

Fig. 1 Lattice graph of household log radon levels by county for lowest floor in house

where ![]()
The composite equation for this model is obtained by inserting the level-2 equation into the level-1 equation.

> library(nlme)
> lme(fixed=logradon~1,random=~1|county,data=minn,method='ML')->out0
> out0
Linear mixed-effects model fit by maximum likelihood
Data: minn
Log-likelihood: -1139.845
Fixed: logradon ~ 1
(Intercept)
1.308058Random effects:
Formula: ~1 | county
(Intercept) Residual
StdDev: 0.3056031 0.8090579Number of Observations: 919
Number of Groups: 85
> VarCorr(out0)
county = pdLogChol(1)
Variance StdDev
(Intercept) 0.09339322 0.3056031
Residual 0.65457471 0.8090579From the output we learn that σ2 = 0.655 and τ2 = 0.093. Thus there is far more variability of household log radon levels within counties than between counties. This implies that the structure in this data set may not be particularly important. It also tells us that in terms of modeling we should invest most of our effort in trying to "explain" variability within counties rather than between counties, because that's where most of the variability lies. Thus level-1 predictors should prove to be more valuable than level-2 predictors.

and can be calculated in R as follows.
> as.numeric(VarCorr(out0)[1,1])/ (as.numeric(VarCorr(out0)[1,1])+ as.numeric(VarCorr(out0)[2,1]))
[1] 0.1248626
Part 1

with the same probability models for the random effects as before. The corresponding composite equation is the following.
> lme(fixed=logradon~1+floor, random=~1|county, data=minn, method='ML')->out1
> summary(out1)
Linear mixed-effects model fit by maximum likelihood
Data: minn
AIC BIC logLik
2195.602 2214.896 -1093.801Random effects:
Formula: ~1 | county
(Intercept) Residual
StdDev: 0.3230887 0.765721Fixed effects: logradon ~ 1 + floor
Value Std.Error DF t-value p-value
(Intercept) 1.4589422 0.0515057 833 28.325837 0
floor -0.7036136 0.0713706 833 -9.858592 0
Correlation:
(Intr)
floor -0.292Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-4.88589873 -0.60768113 0.01106789 0.62978548 3.44052086Number of Observations: 919
Number of Groups: 85

is highly significant (the reported p-value is rounded to 0) leading us to conclude that there is a statistically significant linear relationship between log radon and floor. In truth because floor is a dichotomous variable coded 0 and 1, we can also conclude that mean log radon levels are different in houses with basements compared to houses without basements.
> sapply(list(out0,out1),AIC)
[1] 2285.690 2195.602
Part 2
> out1a<-lm(logradon~floor,data=minn)
> sapply(list(out1,out1a),AIC)
[1] 2195.602 2273.466The AIC is dramatically lower in the current model suggesting that allowing the intercepts to be random is a major improvement.
Part 3
> VarCorr(out0)
county = pdLogChol(1)
Variance StdDev
(Intercept) 0.09339322 0.3056031
Residual 0.65457471 0.8090579
> VarCorr(out1)
county = pdLogChol(1)
Variance StdDev
(Intercept) 0.1043863 0.3230887
Residual 0.5863287 0.7657210

> (as.numeric(VarCorr(out0)[2,1])- as.numeric(VarCorr(out1)[2,1]))/ as.numeric(VarCorr(out0)[2,1])
[1] 0.1042601

The probability models for the random effects change because we've added an additional random effect.
The corresponding composite equation is the following.
> lme(fixed=logradon~1+floor, random=~1+floor|county, data=minn, method='ML')->out2
> VarCorr(out2)
county = pdLogChol(1 + floor)
Variance StdDev Corr
(Intercept) 0.1162887 0.3410114 (Intr)
floor 0.1120414 0.3347258 -0.319
Residual 0.5726597 0.7567428
> sapply(list(out1,out2),AIC)
[1] 2195.602 2197.084As we can see the AIC of the random slopes and intercepts model actually went up relative to random intercepts model. Therefore we should prefer the simpler model and stick with fixed slopes but random intercepts.
> cty <- read.table ("http://www.unc.edu/courses/2007spring/enst/562/001/assignments/assign10/cty.txt", header=T, sep=",")
> minn2<-cty[cty$st=='MN',]
> merge(minn,minn2,by.x="cntyfips",by.y="ctfips")->merge3

where as before
. The composite equation is the following.
> lme(fixed=logradon~1+floor+log(Uppm), random=~1|county, data=merge3, method='ML')->out3
> summary(out3)
Linear mixed-effects model fit by maximum likelihood
Data: merge3
AIC BIC logLik
2157.913 2182.029 -1073.956Random effects:
Formula: ~1 | county
(Intercept) Residual
StdDev: 0.1457421 0.7689218Fixed effects: logradon ~ 1 + floor + log(Uppm)
Value Std.Error DF t-value p-value
(Intercept) 1.4627406 0.03756813 833 38.93568 0
floor -0.6788426 0.06969627 833 -9.74001 0
log(Uppm) 0.7180305 0.09064038 83 7.92175 0
Correlation:
(Intr) floor
floor -0.362
log(Uppm) 0.156 -0.011Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-5.12860182 -0.60172541 0.03181403 0.64952352 3.32903088Number of Observations: 919
Number of Groups: 85

As we can see from the reported p-value, this test is highly significant. Thus we should retain this variable. We can also reach this conclusion by comparing the AIC of this model to that of a model without log(Uppm).
> sapply(list(out1,out3),AIC)
[1] 2195.602 2157.913The addition of log(Uppm) has led to a dramatic decrease in AIC. Thus we should prefer a model with log(Uppm) as a predictor.

Here the baseline model is the random intercepts model without the log(Uppm) predictor. I extract the components I need from the VarCorr object and carry out the calculations.
> VarCorr(out1)
county = pdLogChol(1)
Variance StdDev
(Intercept) 0.1043863 0.3230887
Residual 0.5863287 0.7657210
> VarCorr(out3)
county = pdLogChol(1)
Variance StdDev
(Intercept) 0.02124076 0.1457421
Residual 0.59124069 0.7689218
> (as.numeric(VarCorr(out1)[1,1])- as.numeric(VarCorr(out3)[1,1]))/ as.numeric(VarCorr(out1)[1,1])
[1] 0.7965177
| Jack Weiss Phone: (919) 962-5930 E-Mail: jack_weiss@unc.edu Address: Curriculum in Ecology, Box 3275, University of North Carolina, Chapel Hill, 27516 Copyright © 2007 Last Revised--May 5, 2007 URL: http://www.unc.edu/courses/2007spring/enst/562/001/docs/solutions/assign10.htm |