Lecture 37—Friday, March 24, 2006

What was covered?

Terminology defined

The confusion (classification) matrix

 
                                  Observed
Predicted (using decision rule)   Yi = 1    Yi = 0    Total
Predicted 1                       A         C         A+C
Predicted 0                       B         D         B+D
Total                             A+B       C+D       A+B+C+D
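The cell counts can be tallied directly from observed outcomes and fitted probabilities. The following is a minimal sketch, not code from the lecture. It assumes that A counts observed 1 and predicted 1, B observed 1 and predicted 0, C observed 0 and predicted 1, and D observed 0 and predicted 0, and that the decision rule predicts 1 whenever the fitted probability is at least c. The data vectors and function names are hypothetical.

```python
def confusion_counts(observed, probs, c):
    """Return (A, B, C, D) for the decision rule 'predict 1 when p >= c'.

    Assumed cell labels: A = observed 1, predicted 1; B = observed 1, predicted 0;
    C = observed 0, predicted 1; D = observed 0, predicted 0.
    """
    A = B = C = D = 0
    for y, p in zip(observed, probs):
        predicted = 1 if p >= c else 0
        if y == 1 and predicted == 1:
            A += 1
        elif y == 1 and predicted == 0:
            B += 1
        elif y == 0 and predicted == 1:
            C += 1
        else:
            D += 1
    return A, B, C, D


def sensitivity(A, B, C, D):
    # Proportion of observed 1s that the rule classifies as 1.
    return A / (A + B)


def specificity(A, B, C, D):
    # Proportion of observed 0s that the rule classifies as 0.
    return D / (C + D)


# Hypothetical data: eight observations with fitted probabilities.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
probs = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.3, 0.8]

A, B, C, D = confusion_counts(observed, probs, c=0.5)
print(A, B, C, D)
print(sensitivity(A, B, C, D), specificity(A, B, C, D))
```

Changing c moves observations between the two predicted rows, which is why sensitivity and specificity trade off against each other.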

Fig. 1 Maximizing specificity and sensitivity

Make the choice of c less arbitrary
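One common way to make the choice less arbitrary is to scan candidate cutoffs and keep the one with the largest value of sensitivity plus specificity. The sketch below illustrates that idea under stated assumptions and is not necessarily the procedure used in the lecture: the decision rule is assumed to predict 1 when the fitted probability is at least c, the candidate cutoffs are simply the distinct fitted probabilities, and the data and function names are hypothetical.

```python
def sens_spec(observed, probs, c):
    """Sensitivity and specificity for the rule 'predict 1 when p >= c'."""
    true_pos = sum(1 for y, p in zip(observed, probs) if y == 1 and p >= c)
    true_neg = sum(1 for y, p in zip(observed, probs) if y == 0 and p < c)
    n_pos = sum(observed)
    n_neg = len(observed) - n_pos
    return true_pos / n_pos, true_neg / n_neg


def best_cutoff(observed, probs):
    """Return the cutoff maximizing sensitivity + specificity.

    Ties are resolved in favor of the smallest cutoff (an arbitrary choice).
    """
    best_c, best_score = None, -1.0
    for c in sorted(set(probs)):
        se, sp = sens_spec(observed, probs, c)
        if se + sp > best_score:
            best_c, best_score = c, se + sp
    return best_c, best_score


# Hypothetical data: eight observations with fitted probabilities.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
probs = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.3, 0.8]

c_star, score = best_cutoff(observed, probs)
print(c_star, score)
```

With these data two cutoffs tie for the best score, which shows that the maximizing value of c need not be unique.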

Use all possible values of c simultaneously

Fig. 2 Visualizing habitat suitability models

    

Fig. 3 The effect of changing c on model calibration statistics. The area under each curve is a probability.

Fig. 4 ROC curves for the different models of Fig. 2
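An ROC curve plots sensitivity against 1 minus specificity across all cutoffs c, and the area under it equals the probability that a randomly chosen observed 1 receives a higher fitted probability than a randomly chosen observed 0 (ties counted as one half). The sketch below computes the area both ways so the two can be compared. The data and function names are hypothetical, and the decision rule is assumed to predict 1 when the fitted probability is at least c.

```python
def roc_points(observed, probs):
    """(1 - specificity, sensitivity) pairs, one per cutoff, from strict to lenient."""
    n_pos = sum(observed)
    n_neg = len(observed) - n_pos
    # An infinite cutoff predicts 0 for everyone and gives the point (0, 0).
    cutoffs = [float("inf")] + sorted(set(probs), reverse=True)
    points = []
    for c in cutoffs:
        true_pos = sum(1 for y, p in zip(observed, probs) if y == 1 and p >= c)
        false_pos = sum(1 for y, p in zip(observed, probs) if y == 0 and p >= c)
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def auc_probability(observed, probs):
    """P(fitted probability of a random 1 exceeds that of a random 0), ties = 1/2."""
    pos = [p for y, p in zip(observed, probs) if y == 1]
    neg = [p for y, p in zip(observed, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: eight observations with fitted probabilities.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
probs = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.3, 0.8]

points = roc_points(observed, probs)
print(auc_trapezoid(points), auc_probability(observed, probs))
```

Because the area summarizes every cutoff at once, it lets models be compared without committing to a single value of c.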

Test the model against new data

Latent variable models for binary variables

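Suppose the observed binary response Yi is a dichotomized version of an unobserved continuous variable: Yi = 1 when the continuous variable exceeds a cutoff k and Yi = 0 otherwise, for some k. The unobserved variable is also sometimes called a latent variable.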

The probit model

Here I use c as a subscript or superscript to remind us that this is a model constructed using the underlying continuous variable.

This last expression is just the generalized linear model for a binomial random variable using a probit link.
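The step from a latent regression to a probit link can be written out as a short derivation. The notation here is assumed rather than taken from the lecture: $Y_i^c$ is the latent continuous response, $\beta_0^c$ and $\beta_1^c$ are its regression coefficients, $\sigma$ is the error standard deviation, $k$ is the cutoff, and $\Phi$ is the standard normal distribution function. The rule $Y_i = 1$ when $Y_i^c > k$ is also an assumption about the direction of the threshold.

```latex
\begin{align*}
Y_i^c &= \beta_0^c + \beta_1^c x_i + \varepsilon_i,
  \qquad \varepsilon_i \sim N(0, \sigma^2) \\
P(Y_i = 1) &= P(Y_i^c > k)
  = P\!\left(\frac{\varepsilon_i}{\sigma} > \frac{k - \beta_0^c - \beta_1^c x_i}{\sigma}\right) \\
  &= 1 - \Phi\!\left(\frac{k - \beta_0^c - \beta_1^c x_i}{\sigma}\right)
  = \Phi\!\left(\frac{\beta_0^c - k}{\sigma} + \frac{\beta_1^c}{\sigma}\, x_i\right) \\
\Phi^{-1}\bigl(P(Y_i = 1)\bigr) &= \beta_0 + \beta_1 x_i,
  \qquad \beta_0 = \frac{\beta_0^c - k}{\sigma}, \quad \beta_1 = \frac{\beta_1^c}{\sigma}
\end{align*}
```

Only the ratios $(\beta_0^c - k)/\sigma$ and $\beta_1^c/\sigma$ appear in the final line, so the cutoff and the error variance of the latent model cannot be estimated separately from binary data.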

The logistic model

A logistic distribution looks very much like a normal distribution but with fatter tails. If we assume the errors of our simple linear regression model for the continuous response have a logistic distribution with mean 0 and variance $\sigma^2$, we obtain the following.

where the standardized error $\varepsilon_i/\sigma$ has a logistic distribution with mean 0 and variance 1. If we multiply this quantity by $\pi/\sqrt{3}$, the resulting random variable will have the standard logistic distribution, whose variance is $\pi^2/3$.
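The rescaling rests on the fact that the standard logistic distribution (location 0, scale 1) has variance $\pi^2/3$, so a logistic variable with variance 1 becomes standard logistic when multiplied by $\pi/\sqrt{3}$. The simulation below is a rough numerical check of that fact and not part of the lecture. The seed, sample size, and function names are arbitrary choices, and the draws are produced by inverting the standard logistic distribution function, so the final comparison mainly confirms the scaling arithmetic.

```python
import math
import random


def standard_logistic_cdf(z):
    """Distribution function of the standard logistic: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def draw_unit_variance_logistic(rng):
    """Draw from a logistic distribution with mean 0 and variance 1."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() never returns 1.0
        u = rng.random()
    standard = math.log(u / (1.0 - u))          # standard logistic, variance pi^2 / 3
    return standard * math.sqrt(3.0) / math.pi  # rescaled to variance 1


rng = random.Random(2006)  # arbitrary seed for reproducibility
n = 100_000
draws = [draw_unit_variance_logistic(rng) for _ in range(n)]

mean = sum(draws) / n
variance = sum((d - mean) ** 2 for d in draws) / (n - 1)

# Multiply by pi / sqrt(3) and compare with the standard logistic at z = 1.
rescaled = [d * math.pi / math.sqrt(3.0) for d in draws]
empirical = sum(1 for r in rescaled if r <= 1.0) / n
theoretical = standard_logistic_cdf(1.0)

print(round(variance, 3), round(empirical, 3), round(theoretical, 3))
```

Once the error is on the standard logistic scale, the probability of a 1 is an inverse-logit function of the linear predictor, which is the logit-link analogue of the probit result.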



Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum in Ecology, Box 3275, University of North Carolina, Chapel Hill, 27516
Copyright © 2006
Last Revised--April 3, 2006
URL: http://www.unc.edu/courses/2006spring/ecol/145/001/docs/lectures/lecture37.htm