Lecture 20: Wednesday, February 15, 2006

What was covered?

Terminology Defined

General linear models (GLMs)

The assumptions of GLMs using the language of GLIMs

  1. Random component: The response variables Y1, Y2, … , Yn are assumed to be independent and normally distributed. More formally we assume $Y_i \sim N(\mu_i, \sigma^2)$, independently for $i = 1, \ldots, n$. Observe that the mean is allowed to vary for each Yi but that the variance does not.
  2. Systematic component: The p covariates are combined to form a linear predictor. The linear predictor is represented by the Greek letter η (eta).
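  3. Link function: The systematic component and random component are linked together via a link function g. For GLMs the function g is taken to be the identity link. In other words we assume μ = η or, as is more typically written, $\mu_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}$, where $x_{i1}, \ldots, x_{ip}$ are the values of the p covariates for observation i and the $\beta_j$ are the regression coefficients.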
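
The three components above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficient values, the common standard deviation, the sample size, and the covariates are all made up, and NumPy is used for convenience rather than because the course used it. The sketch builds the linear predictor, applies the identity link, draws normal responses with constant variance, and then recovers the coefficients by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n = 500
# Hypothetical true coefficients (intercept, then p = 2 slopes) and common sigma.
beta_true = np.array([2.0, 0.5, -1.0])
sigma = 0.3

# Design matrix: a column of ones for the intercept plus two simulated covariates.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])

eta = X @ beta_true                  # systematic component: linear predictor
mu = eta                             # identity link: mu = eta
y = rng.normal(loc=mu, scale=sigma)  # random component: Y_i ~ N(mu_i, sigma^2)

# With a normal response and the identity link, the maximum likelihood
# estimates of the coefficients coincide with ordinary least squares.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```

With this many observations and a small sigma, the estimates should land close to the assumed true coefficients.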

The potential limitations of GLMs

  1. Response variables of interest in ecology are often not normally distributed nor do they tend to exhibit constant variance. While transformations can sometimes "fix" one or both of these problems, transformations introduce their own baggage and make the resulting model difficult to interpret. There's also no guarantee that a suitable transformation will be found.
  2. Response variables may have a restricted range. For example, count data are required to be non-negative and proportions are required to be between zero and one. The normal distribution has no such range restriction and hence may not be an appropriate probability model for such data.
  3. It is often observed in ecological data that the variance is a function of the mean. This is especially true of restricted range data for values near the boundary of the range. Such values will necessarily have a skewed distribution because their distribution is truncated by the boundary. This in turn limits the amount of variability they can display, at least in one direction.
  4. The identity link can be problematic. Since the predictors do not have the same range restrictions as does the response variable, the systematic component may yield predicted values outside the legal range of the data.
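
The fourth limitation is easy to reproduce. The sketch below fits a straight line by least squares (a normal model with identity link) to a handful of made-up counts. The data are invented purely for illustration and are not from the lecture.

```python
import numpy as np

# Made-up counts that rise sharply with the covariate x (not real data).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 0.0, 1.0, 3.0, 8.0])

# Ordinary least squares fit of y = b0 + b1 * x (normal model, identity link).
X = np.column_stack([np.ones_like(x), x])
(b0, b1), *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = b0 + b1 * x  # predicted means on the scale of the response
print(b0, b1)
print(fitted)
```

For these values the fitted intercept is negative (about -1.4), so the predicted mean count at x = 0 falls below zero, which is outside the legal range for a count.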

Generalized linear models (GLIMs)

Because we require g to be monotonic, it follows that it is invertible. Thus we can also write the link relation $g(\mu) = \eta$ in terms of the inverse link function, $\mu = g^{-1}(\eta)$.

Written this way we can see that GLIMs have the potential to solve the range restriction problem. Since the linear predictor is connected to the mean through $\mu = g^{-1}(\eta)$, a judicious choice of link function can constrain the predictions to map onto a desired range.
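
As a rough illustration of how the inverse link constrains predictions, the sketch below pushes the same unrestricted values of the linear predictor through three inverse links. The log and logit links are used as familiar examples for counts and proportions; the notes above do not name specific links, so treat that choice as an assumption.

```python
import numpy as np

# Linear predictor values: unrestricted, both negative and positive.
eta = np.linspace(-5.0, 5.0, 11)

mu_identity = eta                      # identity link: mean can be any real number
mu_log = np.exp(eta)                   # log link, g(mu) = log(mu): mean is positive
mu_logit = 1.0 / (1.0 + np.exp(-eta))  # logit link: mean lies strictly in (0, 1)

print(mu_identity.min(), mu_identity.max())
print(mu_log.min(), mu_log.max())
print(mu_logit.min(), mu_logit.max())
```

Only the identity link returns negative means here. The other two keep every prediction inside the range appropriate for counts and proportions respectively, whatever the value of the linear predictor.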

Exponential family of distributions

The Poisson distribution is a member of the exponential family
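
A short derivation shows why this holds. It assumes the exponential family is written in the common form with a natural parameter θ, a dispersion parameter φ, and functions a, b, and c. That notation is an assumption here, since the lecture's own formulas are not reproduced in these notes.

```latex
\begin{align*}
f(y;\theta,\phi) &= \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\}
  && \text{(assumed general form)} \\
P(Y = y) &= \frac{e^{-\mu}\mu^{y}}{y!}
  = \exp\left\{ y\log\mu - \mu - \log y! \right\},
  && y = 0, 1, 2, \ldots
\end{align*}
```

Matching terms gives $\theta = \log\mu$, $b(\theta) = e^{\theta} = \mu$, $a(\phi) = 1$, and $c(y,\phi) = -\log y!$, so the Poisson probability function has the assumed form. Under this parameterization the mean is $b'(\theta) = \mu$ and the variance is $a(\phi)\,b''(\theta) = \mu$, which is the familiar Poisson property that the variance equals the mean.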



Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum in Ecology, Box 3275, University of North Carolina, Chapel Hill, 27516
Copyright © 2006
Last Revised--Feb 17, 2006
URL: http://www.unc.edu/courses/2006spring/ecol/145/001/docs/lectures/lecture20.htm