Assignment 1 — Solution

Problem 1

#read in data
corals<-read.table('http://www.unc.edu/courses/2006spring/ecol/145/001/data/assign1/assignment1.csv', header=TRUE, sep=',')

We are told that each reef appears as many times in the data set as it was visited. To get a count for the number of times each reef was visited, use the table function on REEF_NAME.

table(corals$REEF_NAME)

   19131S    19138S    19159S   20104S    21529S    22088S AGINCOURT BORDER IS
        6         6         6        6         6         6         6         6
BROOMFIEL    CARTER   CHICKEN CHINAMAN    DAVIES DECAPOLIS       DIP  EAST CAY
        6         6         6        6         6         5         6         6
FITZROY I GANNET CA GREEN ISL HASTINGS  HAVANNAH HAYMAN IS HORSESHOE      HYDE
        6         6         6        6         6         6         6         6
JOHN BREW LADY MUSG  LANGFORD   LINNET LIZARD IS LOW ISLES MACGILLIV    MACKAY
        6         6         6        6         6         6         6         6
MARTIN (1 MICHAELMA  MYRMIDON  NO NAME NORTH DIR  ONE TREE  OPAL (2) PANDORA (
        6         6         6        6         6         6         6         6
     REBE       RIB ST. CRISP THETFORD TURNER CA WRECK ISL     YONGE
        6         6         6        5         6         6         6

From the list we can see that two reefs, Decapolis and Thetford, were visited only five times. Every other reef was visited six times.

BONUS

An expression such as table(corals$REEF_NAME)<6 is a Boolean expression that evaluates to TRUE or FALSE separately for each reef.

table(corals$REEF_NAME)<6

   19131S    19138S    19159S   20104S    21529S    22088S AGINCOURT BORDER IS
    FALSE     FALSE     FALSE    FALSE     FALSE     FALSE     FALSE     FALSE
BROOMFIEL    CARTER   CHICKEN CHINAMAN    DAVIES DECAPOLIS       DIP  EAST CAY
    FALSE     FALSE     FALSE    FALSE     FALSE      TRUE     FALSE     FALSE
FITZROY I GANNET CA GREEN ISL HASTINGS  HAVANNAH HAYMAN IS HORSESHOE      HYDE
    FALSE     FALSE     FALSE    FALSE     FALSE     FALSE     FALSE     FALSE
JOHN BREW LADY MUSG  LANGFORD   LINNET LIZARD IS LOW ISLES MACGILLIV    MACKAY
    FALSE     FALSE     FALSE    FALSE     FALSE     FALSE     FALSE     FALSE
MARTIN (1 MICHAELMA  MYRMIDON  NO NAME NORTH DIR  ONE TREE  OPAL (2) PANDORA (
    FALSE     FALSE     FALSE    FALSE     FALSE     FALSE     FALSE     FALSE
     REBE       RIB ST. CRISP THETFORD TURNER CA WRECK ISL     YONGE
    FALSE     FALSE     FALSE     TRUE     FALSE     FALSE     FALSE

The expression table(corals$REEF_NAME) returns a table of counts that can be indexed like a vector. To access its elements I can enter something like

table(corals$REEF_NAME)[c(2,4)]

19138S 20104S
     6      6

If instead of a vector of positions I supply a vector of TRUEs and FALSEs that is as long as the vector being indexed, R returns only those elements that correspond to the TRUEs. Thus when I supply the result of table(corals$REEF_NAME)<6, I get back only those reefs for which the statement is TRUE. These are the reefs that were visited fewer than six times.

table(corals$REEF_NAME)[table(corals$REEF_NAME)<6]

DECAPOLIS THETFORD
        5        5
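
The same logical-indexing idea can be tried on a small stand-in example. This is only a sketch: the reef labels and visit counts below are made up for illustration and are not the course data, and the names reef, visits, and short are my own.

```r
# Hypothetical visit records: one entry per visit (not the real data)
reef <- c(rep("A", 6), rep("B", 5), rep("C", 6))

# Store the table once so it is not recomputed inside the brackets
visits <- table(reef)

# One TRUE/FALSE per reef
short <- visits < 6

# Logical indexing keeps only the elements where short is TRUE
visits[short]

# Just the labels of the reefs visited fewer than six times
names(visits)[as.vector(short)]
```

Storing the table in a variable avoids calling table twice and makes the indexing expression easier to read.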

Problem 2

I plot disease prevalence against WSSTA, adding a linear regression and a lowess curve just as we did in class.

plot(corals$WSSTA,corals$PREV_1, xlab='WSSTA',ylab='Disease Prevalence')
abline(lm(corals$PREV_1~corals$WSSTA))
lines(lowess(corals$PREV_1~corals$WSSTA),lwd=2,col=3)

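The default y-axis is stretched by a handful of very large prevalence values (the maximum is 343), so most of the points are squeezed against the bottom of the plot. After some experimentation, I settle on a y-axis that ranges from 0 to 20.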

plot(corals$WSSTA,corals$PREV_1,xlab='WSSTA',ylab='Disease Prevalence',ylim=c(0,20))
lines(lowess(corals$PREV_1~corals$WSSTA),lwd=2,col=3)
abline(lm(corals$PREV_1~corals$WSSTA))

[Figures: plot before rescaling; plot with y-axis rescaled from 0 to 20]

The lowess curve reveals that disease prevalence does not change monotonically with WSSTA. Prevalence increases as WSSTA increases from 0 to 5, but then decreases afterward. This suggests that a quadratic relationship may be more appropriate here.
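
One way to follow up on a hump-shaped trend like this is to add a squared term to the regression. The sketch below uses simulated data rather than the coral data, so the variable names x and y, the seed, and the simulation settings are all assumptions chosen only to illustrate the technique.

```r
# Simulated stand-in data with a hump-shaped trend (not the coral data)
set.seed(1)
x <- runif(100, 0, 30)
y <- 5 + 1.2 * x - 0.04 * x^2 + rnorm(100, sd = 1)

# Linear and quadratic fits; I() protects x^2 inside the formula
fit1 <- lm(y ~ x)
fit2 <- lm(y ~ x + I(x^2))

# Plot the data with the straight line and the fitted quadratic curve
plot(x, y, xlab = 'x', ylab = 'y')
abline(fit1)
xs <- seq(min(x), max(x), length.out = 200)
lines(xs, predict(fit2, newdata = data.frame(x = xs)), lwd = 2, col = 3)

# F-test comparing the two nested models
anova(fit1, fit2)
```

A negative coefficient on the squared term corresponds to a curve that rises and then falls, which is the pattern the lowess curve hints at.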

Problem 3

library(date)
plot(as.date(as.character(corals$DATE)), corals$LAT_DD, xlab='Date', ylab='Latitude')

As the plot shows, the complete confounding of space and time has continued for all six years of the study. This will have a profound effect on the way the data will need to be analyzed.

Problem 4

summary(corals$WSSTA)
 Min. 1st Qu. Median  Mean 3rd Qu.  Max.
0.000   1.000  3.000 6.064  8.000 30.000

The summary function returns basic descriptive statistics for the variable WSSTA. We see the minimum value is 0 and the maximum value is 30. The first, second, and third quartiles are 1, 3, and 8 respectively. These are the values at or below which 25%, 50%, and 75% of the observations fall. The mean is 6.064, roughly twice the median, which suggests that the distribution is skewed to the right.
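
The meaning of the quartiles can be checked directly with the quantile function. This is a minimal sketch on a small made-up vector w (not the WSSTA values), chosen so that its quartiles also come out to 1, 3, and 8.

```r
# Small made-up sample (not the WSSTA values)
w <- c(0, 0, 1, 1, 2, 3, 3, 5, 8, 8, 12, 30)

# The three quartiles; summary(w) reports the same values
q <- quantile(w, probs = c(0.25, 0.50, 0.75))
q

# Proportion of observations at or below each quartile
sapply(q, function(v) mean(w <= v))
```

When the data contain ties, the proportion at or below a quartile can exceed the nominal 25%, 50%, or 75%.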

Problem 5

boxplot(corals$WSSTA)

The boxplot with various features identified is shown to the right.
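
The numbers behind the features of a boxplot can be obtained with boxplot.stats. The sketch below again uses a made-up vector w rather than the WSSTA values, so the results shown by R will differ from those in the figure.

```r
# Small made-up sample (not the WSSTA values)
w <- c(0, 0, 1, 1, 2, 3, 3, 5, 8, 8, 12, 30)

b <- boxplot.stats(w)

# Lower whisker, lower hinge, median, upper hinge, upper whisker
b$stats

# Points lying beyond the whiskers, drawn individually as outliers
b$out
```

By default the whiskers extend to the most extreme observations that lie within 1.5 times the box length from the box; anything farther out is plotted as a separate point.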

Problem 6

Method 1

The table function can be used to obtain a list of counts.

table(corals$PREV_1)

  0   1   2   3   4  5  6  7  8  9 11 12 13 14 15 16 17 18 19  20  21
109  41  27  17  10  4  5  2  5  7  2  3  4  1  1  2  2  1  3   3   1
 22  23  24  26  27 28 30 31 32 34 35 37 49 51 60 75 77 90 92 101 106
  2   1   1   1   2  1  1  1  2  1  1  1  2  1  1  1  1  1  1   1   1
149 221 315 336 343
  1   1   1   1   1

This is a vector of numbers and I can extract the first number, the number of zero counts, using bracket notation as follows.

table(corals$PREV_1)[1]
  0
109

Now if I add up all the numbers produced by table, I obtain the total number of observations made on all the reefs.


sum(table(corals$PREV_1))
[1] 280

All that's left to do is to divide the first number by the second and multiply by 100.


table(corals$PREV_1)[1]/sum(table(corals$PREV_1))*100
0
38.92857

So 38.9% of the prevalence observations were zero.

Method 2

Following the logic used in the bonus part of Problem 1, another way to obtain the number of zero counts is the following.

sum(corals$PREV_1==0)
[1] 109

The sum function coerces Boolean TRUE values to 1 and FALSE values to 0. Therefore the sum is just the number of zeros. The rest of the calculation then proceeds as above.
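
Because TRUE counts as 1, the mean of a logical vector is the proportion of TRUE values, so the percentage can also be computed in one step. The vector prev below is a stand-in built to contain 109 zeros among 280 values, matching the counts reported above; it is not the actual PREV_1 column.

```r
# Stand-in vector: 109 zeros and 171 nonzero values (not the real PREV_1)
prev <- c(rep(0, 109), rep(1, 171))

# Number of zeros
sum(prev == 0)

# Percentage of zeros, computed two equivalent ways
sum(prev == 0) / length(prev) * 100
mean(prev == 0) * 100
```

With the real data the corresponding expression would presumably be mean(corals$PREV_1==0)*100.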



Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum in Ecology, Box 3275, University of North Carolina, Chapel Hill, 27516
Copyright © 2006
Last Revised--Jan 30, 2006
URL: http://www.unc.edu/courses/2006spring/ecol/145/001/docs/solutions/assign1.htm