My negative loglikelihood function for the negative binomial distribution is shown below.
> NBvec2.neg<-function(data,p) -sum(log(dnbinom(data,mu=p[1],size=p[2])))
I enter each set of data and fit a negative binomial model. Each time I use the mean number of counts for that treatment as an initial guess for mu. I use 1 as the initial guess for theta. I save the results in the objects out1, out2, out3, and out4.
> #Treatment 1
> trt1<-c(19,12,18,18,11,12,7,8,4,4,1,1,1,1,1,1,1)
> c.trt1<-c(0:10,12,13,15,17,19,26)
> trt1.dat<-rep(c.trt1,trt1)
> mean(trt1.dat)
[1] 4.033333
> nlm(function(x) NBvec2.neg(trt1.dat,x),c(4,1),hessian=TRUE)->out1
> #Treatment 2
> trt2<-c(24,16,16,18,15,9,6,5,3,4,3,1)
> counts2<-c(0:10,12)
> trt2.dat<-rep(counts2,trt2)
> mean(trt2.dat)
[1] 3.166667
> nlm(function(x) NBvec2.neg(trt2.dat,x),c(3,1),hessian=TRUE)->out2
> #Treatment 3
> trt3<-c(43,35,17,11,5,4,1,2,2)
> trt3.dat<-rep(0:8,trt3)
> mean(trt3.dat)
[1] 1.483333
> trt3.dat<-rep(0:8,trt3)
> nlm(function(x) NBvec2.neg(trt3.dat,x),c(1,1),hessian=TRUE)->out3
>
> #Treatment 4
> counts4<-c(0:7,10,11)
> trt4<-c(47,23,27,9,7,3,1,1,1,1)
> trt4.dat<-rep(counts4,trt4)
> mean(trt4.dat)
[1] 1.508333
> nlm(function(x) NBvec2.neg(trt4.dat,x),c(1.5,1),hessian=TRUE)->out4
Collect results in single table
> cbind(out1$estimate,out2$estimate,out3$estimate,out4$estimate)->out
> rownames(out)<-c('mu','theta')
> colnames(out)<-c('trt1','trt2','trt3','trt4')
> out
trt1 trt2 trt3 trt4
mu 4.033330 3.166665 1.483333 1.508333
theta 1.502877 1.760487 1.333132 1.153521
I carry out a parametric test and a randomization test for each treatment.
> #Treatment 1
Pooled Pearson Chi-squared Test
I first examine at what point to lump the tail probabilities. From the output below we see the largest k for which P(X > k) > 5 is k = 11.
> 120*(1-pnbinom(10:15,mu=out1$estimate[1],size=out1$estimate[2]))
[1] 8.215601 6.194139 4.659713 3.498585 2.622275 1.962447
Next I examine the individual probabilities up to and including k = 11.
> dnbinom(0:11,mu=out1$estimate[1],size=out1$estimate[2])*120
[1] 16.909034 18.513716 16.879303 14.358517 11.775818 9.441964 7.455350 5.821693
[9] 4.507925 3.467693 2.653386 2.021462
There are a couple of choices here. I group the last three and decide that 4.5 is close enough to 5 to be left in its own category.
> Ei<-c(dnbinom(0:8, mu=out1$estimate[1], size=out1$estimate[2]), sum(dnbinom(9:11, mu=out1$estimate[1], size=out1$estimate[2])),1-pnbinom(11, mu=out1$estimate[1], size=out1$estimate[2]))*120
> Ei
[1] 16.909034 18.513716 16.879303 14.358517 11.775818 9.441964 7.455350 5.821693
[9] 4.507925 8.142541 6.194139
> Oi<-c(trt1[1:9],sum(trt1[10:11]),sum(trt1[12:length(trt1)]))
> Oi
[1] 19 12 18 18 11 12 7 8 4 5 6
Note: For k = 11, TRT1 is zero and that value wasn't entered. Hence the second group only combines two nonzero values.
> sum((Oi-Ei)^2/Ei)
[1] 6.411393
> sum((Oi-Ei)^2/Ei)->pear1
> 1-pchisq(pear1,length(Oi)-1-2)->p.1
> p.1
[1] 0.6012518
Randomization Test
For the randomization test I populate the observed values with zeros so that I end up with a vector of length 27.
> Oi.1<-c(trt1[1:11],0,trt1[12:13],0,1,0,1,0,1,0,0,0,0,0,0,1)
> length(Oi.full)
[1] 27
> expprob1<-c(dnbinom(0:25,mu=out1$estimate[1],size=out1$estimate[2]),1-pnbinom(25,mu=out1$estimate[1],size=out1$estimate[2]))
> sum(expprob1)
[1] 1
> chisq.test(Oi.1,p=expprob1,simulate.p.value=TRUE)->random1
> random1
Chi-squared test for given probabilities with simulated p-value (based on 2000 replicates)
data: Oi.full
X-squared = 22.8414, df = NA, p-value = 0.4983
> #Treatment 2
Pooled Pearson Chi-squared Test
I first examine at what point to lump the tail probabilities. From the output below we see the largest k for which P(X > k) > 5 is k = 11.
> 120*(1-pnbinom(8:15,mu=out2$estimate[1],size=out2$estimate[2]))
[1] 7.1999603 4.9277910 3.3564198 2.2766839 1.5387632 1.0367602 0.6965987 0.4668941
> dnbinom(0:9,mu=out2$estimate[1],size=out2$estimate[2])*120
[1] 19.602369 22.179282 19.674754 15.850329 12.123718 8.976999 6.500759 4.631921
[9] 3.259910 2.272169
Keep first 7 categories, merge the next 2, and then all the rest.
> Ei.2<-c(dnbinom(0:6,mu=out2$estimate[1], size=out2$estimate[2]), sum(dnbinom(7:8, mu=out2$estimate[1], size=out2$estimate[2])),1-pnbinom(8, mu=out2$estimate[1], size=out2$estimate[2]))*120
> Ei.2
[1] 19.602369 22.179282 19.674754 15.850329 12.123718 8.976999 6.500759 7.891831
[9] 7.199960
> trt2
[1] 24 16 16 18 15 9 6 5 3 4 3 1
> Oi<-c(trt2[1:7],sum(trt2[8:9]),sum(trt2[10:length(trt2)]))
> Oi
[1] 24 16 16 18 15 9 6 8 8
> length(Oi)
[1] 9
> length(Ei.2)
[1] 9
> pearson2<-sum((Oi-Ei.2)^2/Ei.2)
> pearson2
[1] 4.49745
> 1-pchisq(pearson2,length(Ei.2)-1-2)->p.2
> p.2
[1] 0.6096794
Randomization Test
> Oi.2<-c(trt2[1:11],0,1)
> expprob2 <-c(dnbinom(0:11,mu=out2$estimate[1],size=out2$estimate[2]),1-pnbinom(11,mu=out2$estimate[1],size=out2$estimate[2]))
> length(expprob2)
[1] 13
> length(Oi.2)
[1] 13
> chisq.test(Oi.2,p=expprob2,simulate.p.value=TRUE)->random2
> random2
Chi-squared test for given probabilities with simulated p-value (based on 2000 replicates)
data: Oi.full2
X-squared = 8.8654, df = NA, p-value = 0.7166
> #Treatment 3
Pooled Pearson Chi-squared Test
> 120*(1-pnbinom(5:10,mu=out3$estimate[1],size=out3$estimate[2]))
[1] 4.4118050 2.4207674 1.3222542 0.7196144 0.3904771 0.2113574
> dnbinom(0:5,mu=out3$estimate[1],size=out3$estimate[2])*120
[1] 44.272941 31.084638 19.098068 11.175192 6.375749 3.581607
> Ei.3<-c(dnbinom(0:4,mu=out3$estimate[1],size=out3$estimate[2]),1-pnbinom(4,mu=out3$estimate[1],size=out3$estimate[2]))*120
> Ei.3
[1] 44.272941 31.084638 19.098068 11.175192 6.375749 7.993412
> Oi.3<-c(trt3[1:5],sum(trt3[6:length(trt3)]))
> pearson3<-sum((Oi.3-Ei.3)^2/Ei.3)
> pearson3
[1] 1.186620
> 1-pchisq(pearson3,length(Ei.3)-1-2)->p.3
> p.3
[1] 0.756215
Randomization Test
> expprob3<-c(dnbinom(0:7,mu=out3$estimate[1],size=out3$estimate[2]),1-pnbinom(7,mu=out3$estimate[1],size=out3$estimate[2]))
> length(expprob3)
[1] 9
> chisq.test(trt3,p=expprob3,simulate.p.value=TRUE)->random3
> random3
Chi-squared test for given probabilities with simulated p-value (based on 2000 replicates)
data: trt3
X-squared = 2.6892, df = NA, p-value = 0.962
> #Treatment 4
Pooled Pearson Chi-squared Test
> 120*(1-pnbinom(5:10,mu=out4$estimate[1],size=out4$estimate[2]))
[1] 5.1347980 2.9649476 1.7084416 0.9827824 0.5645753 0.3239613
> dnbinom(0:5,mu=out4$estimate[1],size=out4$estimate[2])*120
[1] 45.737256 29.895689 18.240663 10.864958 6.392893 3.733743
> Ei.4<-c(dnbinom(0:4, mu=out4$estimate[1], size=out4$estimate[2]), 1-pnbinom(4, mu=out4$estimate[1], size=out4$estimate[2]))*120
> Oi.4<-c(trt4[1:5],sum(trt4[6:length(trt4)]))
> pearson4<-sum((Oi.4-Ei.4)^2/Ei.4)
> pearson4
[1] 6.603187
> 1-pchisq(pearson4,length(Oi.4)-1-2)->p.4
> p.4
[1] 0.0856807
Randomization Test
> trt4
[1] 47 23 27 9 7 3 1 1 1 1
> Oifull.4<-c(trt4[1:8],0,0,1,1)
> length(Oifull.4)
[1] 12
> expprob4<-c(dnbinom(0:10,mu=out4$estimate[1],size=out4$estimate[2]),1-pnbinom(10,mu=out4$estimate[1],size=out4$estimate[2]))
> chisq.test(Oifull.4,p=expprob4,simulate.p.value=TRUE)->random4
> random4
Chi-squared test for given probabilities with simulated p-value (based on 2000 replicates)
data: Oifull.4
X-squared = 11.988, df = NA, p-value = 0.3348
> #Summary of Results
> p.out<-rbind(c(p.1,p.2,p.3,p.4), c(random1$p.value,random2$p.value,random3$p.value,random4$p.value))
> rownames(p.out)<-c('parametric','randomization')
> colnames(p.out)<-c('trt1','trt2','trt3','trt4')
> p.out
trt1 trt2 trt3 trt4
parametric 0.6012518 0.6096794 0.756215 0.0856807
randomization 0.5387306 0.7166417 0.962019 0.3348326
I write a function to draw the arrows. It draws an arrow from the mean minus the bound to the mean plus the bound. The bound will be input as the standard error of the estimate times the .975 quantile of a normal distribution.
> arrow.func<-function(x,mean,bound) arrows(x,mean+bound,x,mean-bound,code=3,angle=90,length=.1)
> #draw graph
> plot(1:4,c(out1$estimate[1], out2$estimate[1], out3$estimate[1], out4$estimate[1]),
ylim=c(1,5), type='n', xlab='treatment',ylab='mean count',axes=FALSE,xlim=c(.8,4.2))
> axis(1,at=1:4)
> axis(2)
> box()
> points(1:4, c(out1$estimate[1], out2$estimate[1], out3$estimate[1], out4$estimate[1]), pch=16,cex=1.2)
> arrow.func(1,out1$estimate[1], qnorm(.975)*sqrt(solve(out1$hessian)[1,1]))
> arrow.func(2,out2$estimate[1], qnorm(.975)*sqrt(solve(out2$hessian)[1,1]))
> arrow.func(3,out3$estimate[1], qnorm(.975)*sqrt(solve(out3$hessian)[1,1]))
> arrow.func(4,out4$estimate[1], qnorm(.975)*sqrt(solve(out4$hessian)[1,1]))

I use the plkhci function in the Bhat package to obtain the profile likelihood confidence intervals.
> library(Bhat)
> x.in<-list(label=c('mu','theta'), est=out1$estimate, low=c(.1,.1), high=c(10,10))
> plkhci(x.in,function(p) NBvec2.neg(trt1.dat,p),'mu')->bound1
> bound1
[1] 3.401719 4.804918
> x.in<-list(label=c('mu','theta'), est=out2$estimate, low=c(.1,.1), high=c(10,10))
> plkhci(x.in,function(p) NBvec2.neg(trt2.dat,p),'mu')->bound2
> bound2
[1] 2.675589 3.759237
> x.in<-list(label=c('mu','theta'), est=out3$estimate, low=c(.1,.1), high=c(10,10))
> plkhci(x.in,function(p) NBvec2.neg(trt3.dat,p),'mu')->bound3
> bound3
[1] 1.196123 1.842141
> x.in<-list(label=c('mu', 'theta'), est=out2$estimate, low=c(.1,.1), high=c(10,10))
> plkhci(x.in,function(p) NBvec2.neg(trt4.dat,p),'mu')->bound4
> bound4
[1] 1.207350 1.889753
> #draw graph
> plot(1:4,c(out1$estimate[1], out2$estimate[1], out3$estimate[1], out4$estimate[1]), ylim=c(1,5), type='n', xlab='treatment', ylab='mean count', axes=FALSE, xlim=c(.8,4.2))
> axis(1,at=1:4)
> axis(2)
> box()
> points(1:4, c(out1$estimate[1], out2$estimate[1], out3$estimate[1], out4$estimate[1]), pch=16, cex=1.2)
I write an arrow function that draws an arrow from the first component of the vector I call bound to the second component of bound.
> profilearrow.func<-function(x,bound) arrows(x, bound[1], x, bound[2], code=3, angle=90, length=.1)
> profilearrow.func(1,bound1)
> profilearrow.func(2,bound2)
> profilearrow.func(3,bound3)
> profilearrow.func(4,bound4)
> mtext('95% profile likelihood CI for mean',side=3,line=.5)
Observe that the plot of the profile confidence intervals is virtually identical to the plot of the Wald confidence intervals.

| Jack Weiss Phone: (919) 962-5930 E-Mail: jack_weiss@unc.edu Address: Curriculum in Ecology, Box 3275, University of North Carolina, Chapel Hill, 27516 Copyright © 2006 Last Revised--March 23, 2006 URL: http://www.unc.edu/courses/2006spring/ecol/145/001/docs/solutions/assign4.htm |