Assignment 2

Due Date

Friday, February 3, 2006

Data Source

http://www.unc.edu/courses/2006spring/ecol/145/001/data/assign1/assignment1.csv

Included are six years of temperature data and disease prevalence for coral reefs in the Great Barrier Reef of Australia. The variables of interest are PREV_1 (prevalence of White syndrome recorded as counts, i.e., the number of infected reefs observed) and WSSTA (the number of Weekly Sea Surface Temperature Anomalies that occurred in the previous year). Also of interest are the variables that identify each observation: LAT_DD and LON_DD (the reef's geographic position), DATE (the date on which it was observed), and REEF_NAME (its name). The rest of the variables can be ignored. The data are in a comma-delimited text file whose first row contains the variable names.
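
As a starting point, here is a minimal sketch of reading a file laid out this way. It assumes the file is still reachable at the address above. The names coral.url, keep, and read_reefs are illustrative only and are not part of the assignment.

```r
# Variables named in the description; everything else can be ignored.
keep <- c("REEF_NAME", "DATE", "LAT_DD", "LON_DD", "WSSTA", "PREV_1")

# Assumption: the course file is still available at this address.
coral.url <- "http://www.unc.edu/courses/2006spring/ecol/145/001/data/assign1/assignment1.csv"

# Hypothetical helper: read a comma-delimited file whose first row holds the
# variable names (header = TRUE is the read.csv default) and keep only the
# columns of interest.
read_reefs <- function(path) {
  dat <- read.csv(path, header = TRUE)
  missing.cols <- setdiff(keep, names(dat))
  if (length(missing.cols) > 0) {
    stop("columns not found: ", paste(missing.cols, collapse = ", "))
  }
  dat[, keep]
}
```

Calling read_reefs(coral.url) would then return a data frame with one row per observation and the six columns listed in keep. A path to a local copy of the file works the same way.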

Questions

For all questions, submit the R code you used to produce your answers along with the answers themselves. Electronic submissions are welcome.

  1. Using the full data set for the 48 different reefs, evaluate the mean-variance relationship for the variable PREV_1. Parallel what we did in class, but use more categories than we were able to use there. Deciles might be a good place to start. Produce a plot that compares the Poisson, NB1, and NB2 models for the mean-variance relationship.
    1. Hint 1: In creating the categories for PREV_1, you will only be able to use the quantiles of WSSTA as guidelines for choosing the cutpoints for the cut function. If you try deciles, you'll see that some of the quantiles are repeated. The cut function requires unique cutpoints, so you will need to drop the duplicated values and pass the unique cutpoints to cut explicitly as its breaks argument, using the c function.
    2. Hint 2: The cut function has an argument called right whose default value is right=TRUE. You may want to try right=FALSE, which makes each interval closed on the left and open on the right. This will help in separating the zeros from the ones. You could also choose cutpoints that are not integers.
    3. Hint 3: I would suggest that 8 categories is the absolute minimum here. You may want to try other categorizations to see if the results are robust to how the categories are chosen. The discreteness of WSSTA will eventually limit how many categories you can create and still be able to compute a variance.
  2. Repeat the plot we did in class for the negative binomial distribution, but add a Poisson distribution with λ = 1 and another with λ = 5, and modify the legend accordingly. Compare these to the corresponding negative binomial distributions. Remembering that the Poisson distribution corresponds to a negative binomial with an infinite value of θ, what seems to be the primary effect of decreasing the value of θ for each value of λ? Submit your answer to this question along with your plot.
  3. Produce a single plot that compares four different gamma distributions with different shape and scale parameters. The relevant function in R is dgamma. Examine the help window for dgamma and notice that the shape parameter is identical to the one used in class but that the scale parameter used in dgamma is the reciprocal of ours. To start with, try α = 1, 2, and 5 with β = 1. For your fourth curve, select another value of β that seems to generate an interesting distribution. What effect does the scale parameter seem to have on the distribution? Note: Because the gamma distribution is a continuous distribution, do not plot points; draw lines only. Include an informative legend.
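
The mechanics behind Hints 1 and 2 of question 1 can be sketched on made-up data. The vectors x and y below are invented stand-ins for WSSTA and PREV_1, not the course data, and the sketch uses unique() to drop the repeated quantiles instead of typing the cutpoints into c() by hand. The argument include.lowest=TRUE is an addition not mentioned in the hints; without it, the largest value would fall outside the last interval when right=FALSE.

```r
# Made-up discrete predictor with many ties (a stand-in for WSSTA).
x <- rep(0:5, times = c(70, 55, 40, 20, 10, 5))

# Made-up count response (a stand-in for PREV_1).
set.seed(1)
y <- rpois(length(x), lambda = 1 + x)

# Deciles of a heavily tied variable repeat.
q <- quantile(x, probs = seq(0, 1, by = 0.1))
anyDuplicated(q) > 0

# cut() rejects repeated breaks, so keep only the distinct values.
breaks <- unique(as.numeric(q))

# right = FALSE gives intervals of the form [a, b), so the zeros fall in
# [0, 1) and are kept apart from the ones.
# include.lowest = TRUE closes the final interval so the maximum is retained.
grp <- cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE)
table(grp)

# Mean and variance of the response within each category.
mv <- cbind(mean = tapply(y, grp, mean), var = tapply(y, grp, var))
mv
```

On this toy vector the eleven deciles collapse to only a handful of distinct cutpoints. That is the situation Hint 3 warns about: reaching more categories means choosing additional cutpoints by hand instead of relying on the quantiles alone.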
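
For question 2, the statement that the Poisson is a negative binomial with infinite θ can be checked numerically. This sketch assumes that θ corresponds to the size argument of dnbinom when the mean is supplied through mu. The particular θ values are arbitrary.

```r
k <- 0:20                      # counts at which to evaluate the probabilities
lambda <- 5                    # common mean of both distributions
theta <- c(0.5, 5, 50, 1e6)    # arbitrary values; 1e6 stands in for "infinite"

# Largest absolute difference between the negative binomial and Poisson
# probabilities over k, for each value of theta.
# Assumption: theta is passed as dnbinom's size argument, with mean mu.
gap <- sapply(theta, function(th) {
  max(abs(dnbinom(k, mu = lambda, size = th) - dpois(k, lambda)))
})
names(gap) <- theta
gap
```

The gap shrinks toward zero as θ grows, so a Poisson curve can serve as the limiting case on the same plot as the negative binomial curves.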
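
For question 3, the reciprocal relationship noted in the dgamma help page can be verified before plotting. This sketch assumes that the β used in class plays the role of dgamma's rate argument, which follows if dgamma's scale is its reciprocal. The name beta.class and the values chosen are illustrative.

```r
x <- seq(0.1, 10, by = 0.1)
alpha <- 2          # shape, same meaning as in class
beta.class <- 0.5   # illustrative value, assumed to be the class's beta

# dgamma accepts either rate or scale, where scale = 1 / rate.
d.rate  <- dgamma(x, shape = alpha, rate = beta.class)
d.scale <- dgamma(x, shape = alpha, scale = 1 / beta.class)
same <- isTRUE(all.equal(d.rate, d.scale))
same

# The mean of the distribution is shape / rate, i.e. shape * scale.
m <- integrate(function(t) t * dgamma(t, shape = alpha, rate = beta.class),
               lower = 0, upper = Inf)$value
c(numerical = m, formula = alpha / beta.class)
```

Supplying only one of rate or scale in each call avoids any ambiguity about which parameterization is in use.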


Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum in Ecology, Box 3275, University of North Carolina, Chapel Hill, 27516
Copyright © 2006
Last Revised--Jan 24, 2006
URL: http://www.unc.edu/courses/2006spring/ecol/145/001/docs/assignments/assign.htm