Much of this preliminary code is borrowed from Assignment 4. The first part creates the raw counts. In addition I create an indicator variable for each treatment that is just a single number (the treatment number) repeated as many times as there are individual observations for that treatment.
> #Treatment 1
> trt1<-c(19,12,18,18,11,12,7,8,4,4,1,1,1,1,1,1,1)
> c.trt1<-c(0:10,12,13,15,17,19,26)
> trt1.dat<-rep(c.trt1,trt1)
> #Treatment 2
> trt2<-c(24,16,16,18,15,9,6,5,3,4,3,1)
> counts2<-c(0:10,12)
> trt2.dat<-rep(counts2,trt2)
> #Treatment 3
> trt3<-c(43,35,17,11,5,4,1,2,2)
> trt3.dat<-rep(0:8,trt3)
> #Treatment 4
> counts4<-c(0:7,10,11)
> trt4<-c(47,23,27,9,7,3,1,1,1,1)
> trt4.dat<-rep(counts4,trt4)
Finally I assemble the treatment labels into a single vector, do the same for the counts, and declare the treatment variable to be a factor.
> #make treatment categorical variable
> trt<-rep(1:4,c(length(trt1.dat), length(trt2.dat), length(trt3.dat), length(trt4.dat)))
> #assemble counts
> x<-c(trt1.dat, trt2.dat, trt3.dat, trt4.dat)
> #create a factor for treatment
> trt.f<-factor(trt)
Question 1
I fit the requested model.
> library(MASS)
> model1<-glm.nb(x~trt.f)
> summary(model1,corr=FALSE)
Call:
glm.nb(formula = x ~ trt.f, init.theta = 1.47144974025270, link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.9705 -1.4324 -0.3071 0.2768 2.9291
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.39459 0.08792 15.863 < 2e-16 ***
trt.f2 -0.24191 0.12659 -1.911 0.056 .
trt.f3 -1.00030 0.13788 -7.255 4.02e-13 ***
trt.f4 -0.98359 0.13754 -7.151 8.60e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(1.4714) family taken to be 1)
Null deviance: 623.56 on 479 degrees of freedom
Residual deviance: 538.74 on 476 degrees of freedom
AIC: 1958.5
Number of Fisher Scoring iterations: 1
Theta: 1.471
Std. Err.: 0.174
2 x log-likelihood: -1948.453
The individual Wald tests reported in the summary table are not appropriate tests of a treatment effect. Instead they test the specific contrast that is coded by that dummy variable. To test for a treatment effect we need to carry out a likelihood ratio test comparing this model to one in which a treatment effect is not included. Since there are no other predictors in the model, this can be done simply by applying the anova function of R to the model.
> anova(model1)
Analysis of Deviance Table
Model: Negative Binomial(1.4714), link: log
Response: x
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL 479 623.56
trt.f 3 84.82 476 538.74 2.835e-18
Warning message:
tests made without re-estimating 'theta' in: anova.negbin(model1)
The anova function reports the likelihood ratio test result. Since the test result is highly significant, we should conclude that the treatment variable should be retained in the model.
Alternatively, we can compare the AIC values of a model with the treatment variable to that of a model with just an intercept.
> AIC(model1)
[1] 1958.453
> AIC(glm.nb(x~1))
[1] 2028.952
Since the AIC for a model with a treatment effect is far lower than that of the model that includes just an intercept, we should retain the treatment effect.
Question 2
The contrasts we need to answer this question are the following:
(0,1,1,-2)
(0,1,-1,0)
(-3,1,1,1)
First in each contrast the treatments in one group have a sign opposite to the treatments in the second group. Thus for example in the first contrast vector listed above, treatments 2 and 3 have the same sign while treatment 4 has the opposite sign. Second within each group the magnitudes of the coefficients are chosen so that they are the same for each treatment in that group. This common magnitude is chosen in such a way that when it is added over all members of the group the sum is equal in magnitude to the sum of the comparable coefficients in the contrasting group. Applying these principles to the first contrast I choose 1 as the coefficient for both treatments 2 and 3. Having done so this forces the coefficient of treatment 4 in the second group to be –2.
To fit the model I bind the three contrast vectors together as a matrix and declare a new factor with these as the contrasts.
> trt.f2<-factor(trt)
> contrasts(trt.f2)<-cbind(c(0,1,1,-2),c(0,1,-1,0),c(-3,1,1,1))
> model2<-glm.nb(x~trt.f2)
> summary(model2,corr=FALSE)
Call:
glm.nb(formula = x ~ trt.f2, init.theta = 1.47144974025269, link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.9705 -1.4324 -0.3071 0.2768 2.9291
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.83814 0.04905 17.088 < 2e-16 ***
trt.f21 0.12083 0.04227 2.858 0.00426 **
trt.f22 0.37919 0.06996 5.420 5.95e-08 ***
trt.f23 -0.18548 0.02640 -7.027 2.11e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(1.4714) family taken to be 1)
Null deviance: 623.56 on 479 degrees of freedom
Residual deviance: 538.74 on 476 degrees of freedom
AIC: 1958.5
Number of Fisher Scoring iterations: 1
Theta: 1.471
Std. Err.: 0.174
2 x log-likelihood: -1948.453
The rows labeled trt.f21, trt.f22, and trt.f23 in the summary table are separate tests of the coded contrasts. From the displayed p-values we see that all contrasts statistically significant. Based on the sign of the estimate for the contrasts and the way in which they were coded we therefore can conclude that the mean of treatment means 2 and 3 is larger than treatment mean 4 (contrast 1), the treatment 2 mean is larger than the treatment 3 mean, and the treatment 1 mean is larger than the means of treatment 2, 3, and 4 means.
Question 3
It's easy to verify that the algebraic requirements for orthogonal contrasts hold.

Technically there is a third condition, the four treatment groups should have the same number of observations. If this is not the case then the second algebraic condition needs to be modified to reflect the different observation numbers. (See lecture 30.) For the current data set the treatment groups are balanced so this is not an issue.
> table(trt.f2)
trt.f2
1 2 3 4
120 120 120 120
It's worth noting that because a log link is used in the negative binomial regression model the interpretation of the orthogonal contrasts changes. With a log link the contrasts are not a contrast of means, but of log means. Here again are the contrasts defined by R.
> contrasts(trt.f2)
[,1] [,2] [,3]
1 0 0 -3
2 1 1 1
3 1 -1 1
4 -2 0 1
I begin by writing out the fitted equations for the log mean based on the contrast coding scheme.

Consider the first contrast occupying column 1. This contrast is the difference of the mean of the treatment 2 and 3 log means and the treatment 4 log mean. Using the equations listed above we can write this as follows.

So, as advertised, if β1 = 0 then the average of the log means for treatments 2 and 3 is not different from the log mean of treatment 4. Can we make better sense of this statement? If the contrast is not significant then we conclude the following.

You might recognize that the quantity on the left hand side of the last equality is the geometric mean of treatment means 2 and 3. Thus, because of the log link, which turns sums into products, our contrasts are actually statements about geometric means of treatments rather than arithmetic means.
Question 4
For the factor used in question 1 the coding scheme is as follows.
> contrasts(trt.f)
2 3 4
1 0 0 0
2 1 0 0
3 0 1 0
4 0 0 1
Because there is an intercept in the model too the equations for the log means are the following. For each equation I also given the coefficient vector which when multiplied by the coefficient vector gives us the log mean.
Using these equations the log means can be obtained from the regression output as follows.
> coef(model1)
(Intercept) trt.f2 trt.f3 trt.f4
1.3945932 -0.2419137 -1.0003014 -0.9835879
> logmean1<-coef(model1)[1]
> logmean2<-sum(coef(model1)[1:2])
> logmean3<-sum(coef(model1)[c(1,3)])
> logmean4<-sum(coef(model1)[c(1,4)])
> c(logmean1,logmean2,logmean3,logmean4)
(Intercept)
1.3945932 1.1526795 0.3942918 0.4110053
I next extract the variance-covariance matrix.
> summary(model1)$cov.scaled->var.cov
> var.cov
(Intercept) trt.f2 trt.f3 trt.f4
(Intercept) 0.007729465 -0.007729465 -0.007729465 -0.007729465
trt.f2 -0.007729465 0.016024393 0.007729465 0.007729465
trt.f3 -0.007729465 0.007729465 0.019010791 0.007729465
trt.f4 -0.007729465 0.007729465 0.007729465 0.018917675
The variances we need are most efficiently obtained by premultiplying and postmultiplying the variance-covariance matrix by a vector that defines each of the linear combinations of the βs. These vectors are displayed above and are used in the calculations below.
> trt1.var<-c(1,0,0,0)%*%var.cov%*%c(1,0,0,0)
> trt2.var<-c(1,1,0,0)%*%var.cov%*%c(1,1,0,0)
> trt3.var<-c(1,0,1,0)%*%var.cov%*%c(1,0,1,0)
> trt4.var<-c(1,0,0,1)%*%var.cov%*%c(1,0,0,1)
> c(trt1.var,trt2.var,trt3.var,trt4.var)
[1] 0.007729465 0.008294928 0.011281326 0.011188211
Finally I modify the code error bars function from Assignment 4 to use these new estimates. To obtain the confidence limits I carry out the following calculations.
![]()
Observe that we find the endpoints of the confidence interval on the log scale first and then exponentiate the results. (It's on the log scale that we fit the model and hence it's on this scale that the asymptotic normality of the maximum likelihood estimates kicks in.)
> arrow.func<-function(x, mean, bound) arrows(x, exp(mean+bound), x, exp(mean-bound), code=3, angle=90, length=.1)
> #draw graph
> plot(1:4,exp(c(mean1,mean2,mean3,mean4)), ylim=c(1,5), type='n', xlab='treatment', ylab='mean count', axes=FALSE, xlim=c(.8,4.2))
> axis(1,at=1:4)
> axis(2)
> box()
> points(1:4, exp(c(mean1,mean2,mean3,mean4)), pch=16,cex=1.2)
> arrow.func(1,logmean1, qnorm(.975)*sqrt(trt1.var))
> arrow.func(2,logmean2, qnorm(.975)*sqrt(trt2.var))
> arrow.func(3,logmean3, qnorm(.975)*sqrt(trt3.var))
> arrow.func(4,logmean4, qnorm(.975)*sqrt(trt4.var))

Question 5
Except for minor differences in the estimates obtained by the different functions used to obtain the maximum likelihood estimates in the present assignment and Assignment 4, the pictures are identical. Here's the picture from Assignment 4.

| Jack Weiss Phone: (919) 962-5930 E-Mail: jack_weiss@unc.edu Address: Curriculum in Ecology, Box 3275, University of North Carolina, Chapel Hill, 27516 Copyright © 2006 Last Revised--March 28, 2006 URL: http://www.unc.edu/courses/2006spring/ecol/145/001/docs/solutions/assign7.htm |