ENVIRONMENTAL STATISTICS

Instructor: Richard L. Smith

This page was last updated July 6, 2009.

** Final Schedule for Presentations**

The actual presentations have been deleted to save disk space, but if anyone would like to see the originals, please email me.

** Changryong Baek write-up **

** Petro Borysov write-up**

** Chris Cabanski write-up **

** Jessi Cisewski write-up **

** Jennifer Clark write-up **

** Jennifer Clark appendix **

** Emil Cornea write-up **

** Scott Hauswirth write-up**

** Beth Horton write-up**

** Soyoung Jeon write-up **

** Sungkyu Jung write-up**

** Jordan Kern write-up **

** Valmik Khadke write-up **

** Xuan Li write-up **

** Feng Liu write-up **

** Xin Liu write-up **

** Eric Lock write-up **

** Ruiwen Zhang write-up **

** Su Zhang write-up **

** Zhitao Zhang write-up **

** Yingqi Zhao write-up**

** Summary of Classes 1-3 (Jan 13, 15, 20) and part of Class 4 (Jan 27): **
In these classes I talked about a
** forthcoming JASA paper **
on constructing probabilistic climate projections using ensembles
of climate models, and also referred to a
** book chapter **
that described the same and related work in more tutorial fashion.
In particular, Appendix A of the book chapter discussed basic methods
of Bayesian statistics, including the Gibbs sampler and Hastings-Metropolis
sampler. I also cited a
** recent note on climate projections for North Carolina**
as an example of how these techniques could be applied to answer climate
change questions in specific regions, and mentioned the
** PCMDI website **
where a large volume of climate model output data is available for
public download and analysis.
Although I haven't yet mentioned it in class, I should also point towards the
** CRU website **
(Climatic Research Unit, University of East Anglia) which holds many observational
datasets, as well as providing further links with other websites storing
climate data. Two related references are a
** paper by Reinhard Furrer and co-authors**
where they discuss application of some similar ideas in the context of
a fully spatial-temporal dataset, and a
** paper by Tebaldi and Sanso**
on joint modeling of temperature and precipitation.
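
As a rough illustration of the Hastings-Metropolis idea mentioned above, here is a minimal Python sketch of a random-walk Metropolis sampler. It is not taken from the book chapter or the JASA paper: the function names, the standard normal target, and the step size are all assumptions chosen to keep the example short.

```python
import math
import random


def log_target(x):
    """Log density, up to a constant, of a standard normal (a stand-in target)."""
    return -0.5 * x * x


def metropolis(log_density, n_draws, start=0.0, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    Proposes x + N(0, step^2) and accepts with probability
    min(1, target(proposal) / target(x)). The proposal is symmetric,
    so the Hastings correction term cancels.
    """
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    draws = []
    for _ in range(n_draws):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        draws.append(x)  # a rejected proposal repeats the current state
    return draws


def summarize(draws, burn_in):
    """Sample mean and variance after discarding the first burn_in draws."""
    kept = draws[burn_in:]
    mean = sum(kept) / len(kept)
    var = sum((d - mean) ** 2 for d in kept) / len(kept)
    return mean, var
```

In a real application `log_target` would be replaced by the log posterior of the model at hand, and a Gibbs sampler would instead update each parameter in turn from its full conditional distribution.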

In ** class 4 (Jan 27)**, I started to discuss the
** NMMAPS data website **
which is used to examine the relationship between air pollution exposure
and various adverse health outcomes including death.
I am including here a link to the
** 2004 JAMA paper **
that played a major role in EPA's 2008 re-evaluation of the ozone
standard; I also recommend exploring
** Francesca Dominici's website**
(from which the above link to the JAMA paper was taken) for much
more information about NMMAPS and the research it has led to.
Here is a
** Simple program to evaluate health outcomes in one city**
and a
** more complicated program **
that is intended to reproduce the individual-city analysis of the
2004 JAMA paper (based partly on code in the NMMAPS data website linked above).
The output from that program is
** stored here **
(Cols 1-2: parameter estimate and standard error for the "all year"
ozone-mortality relationship;
cols 3-4: parameter estimate and standard error for the "summer only"
ozone-mortality relationship; each computed for all the 108 NMMAPS cities
except where ozone data are not available ("NA")).
Note that both these programs presume that you have pre-loaded the NMMAPS
dataset into R, which can be done either directly from their website, or
within R by clicking on "Packages", then "Install package(s)", and scrolling
down the list of available packages to "NMMAPSlite".
I'm also linking a
** 2007 seminar in Biostatistics**
where I discussed some of my own work on these analyses.
For the next class, I recommend looking at a
** paper by Everson and Morris**
that discusses hierarchical Bayesian analysis for this kind of dataset.

Update, February 3: here are two versions of a
** sample tlnise program **
for computing hierarchical estimates. The second program uses
** this file **
that indicates in which of the 7 NMMAPS regions each city lies.
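
To give a feel for what "hierarchical estimates" means here, the following Python sketch pools per-city estimates and standard errors, such as the columns of the output file described above. It is an illustration only: it uses the DerSimonian-Laird moment estimator with plug-in shrinkage, which is a simpler swapped-in technique and not the TLNise algorithm, and the function name is hypothetical.

```python
def pool_random_effects(estimates, std_errors):
    """Random-effects pooling of per-city estimates (DerSimonian-Laird).

    Returns the pooled mean, its standard error, the estimated
    between-city variance tau2, and the shrunken city-level estimates.
    """
    variances = [s * s for s in std_errors]
    w = [1.0 / v for v in variances]
    sw = sum(w)

    # Fixed-effect (inverse-variance weighted) mean and heterogeneity statistic Q.
    fixed_mean = sum(wi * b for wi, b in zip(w, estimates)) / sw
    q = sum(wi * (b - fixed_mean) ** 2 for wi, b in zip(w, estimates))

    # Moment estimate of the between-city variance, truncated at zero.
    k = len(estimates)
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Re-weight with the total variance (within-city plus between-city).
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * b for wi, b in zip(w_star, estimates)) / sum(w_star)
    pooled_se = (1.0 / sum(w_star)) ** 0.5

    # Each city is pulled toward the pooled mean, more strongly when its
    # own sampling variance is large relative to tau2.
    shrunk = [(tau2 * b + v * pooled) / (tau2 + v)
              for b, v in zip(estimates, variances)]
    return pooled, pooled_se, tau2, shrunk
```

Cities with no ozone data ("NA") would need to be dropped before calling the function. A fully Bayesian treatment, as in Everson and Morris, also accounts for the uncertainty in tau2, which this plug-in version ignores.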

** Link to paper by Crooks et al.**

** Link to notes on Air Pollution Epidemiology**
(updated Feb 5 but still work in progress!)

** Beginning Feb 10**, we started discussing ** Detection and
Attribution in Climatology**.
** Here ** is the current version of my presentation on this
topic (this version Feb 12 but still being updated!)
For the moment, I've left this topic incomplete but will return to it
later.

** Beginning Feb 17**, we started discussing spatial statistics.
The
** first part of the presentation **
covered basic motivation,
background on spatial processes, estimation, kriging, examples
(classes of Feb 17, Feb 18, Feb 24, Feb 26).
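
For readers who want to see the kriging equations in action, here is a small self-contained Python sketch of ordinary kriging at a single target location. It is not taken from the presentation: the exponential covariance with unit sill and range, the absence of a nugget, and all function names are assumptions made for illustration. In practice the covariance parameters would be estimated first, for example with geoR or fields in R.

```python
import math


def exp_cov(h, sill=1.0, range_=1.0):
    """Exponential covariance at distance h (assumed model, no nugget)."""
    return sill * math.exp(-h / range_)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging(sites, values, target, cov=exp_cov):
    """Ordinary kriging predictor and variance at one target point.

    Solves  sum_j w_j C(s_i, s_j) + m = C(s_i, s_0),  sum_j w_j = 1,
    where m is the Lagrange multiplier for the unbiasedness constraint.
    """
    n = len(sites)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    a = [[cov(dist(sites[i], sites[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [cov(dist(s, target)) for s in sites] + [1.0]

    sol = solve(a, b)
    weights, lagrange = sol[:n], sol[n]
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = cov(0.0) - sum(w * c for w, c in zip(weights, b[:n])) - lagrange
    return prediction, variance, weights
```

Because there is no nugget, the predictor interpolates exactly: predicting at an observed site returns the observed value with zero kriging variance.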

We spent the classes of March 3 and March 5 talking mostly about packages and computation (see links below).

The classes of March 17, March 26, March 31, April 2 were spent talking
about nonstationary processes and then spatial monitor design, covered in
** this presentation. **
(We didn't cover the "lattice models" sections.)

All these topics are covered in more detail in my
** full length notes on environmental statistics **

The class of April 7 covered Zhengyuan Zhu's "two-stage" approach to spatial design. I hope to post some notes on that later.

Then, beginning in the last part of the class on April 7 and continuing
through the classes of April 9, April 14 and April 16, we covered
** Extreme Value Theory **. Much of this material is covered in
** RLS review paper (2003) **
while the actual presentations are
** here **
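
One standard calculation in extreme value analysis is the return level of a fitted generalized extreme value (GEV) distribution, with distribution function G(x) = exp{-[1 + xi (x - mu)/sigma]^(-1/xi)}. The Python sketch below is an illustration only and is not taken from the presentations or the review paper; the function names are hypothetical, and the parameters mu, sigma and xi are assumed to have been estimated already (for example from annual maxima).

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function with location mu, scale sigma, shape xi."""
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0.
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        # Outside the support: below the lower end point when xi > 0,
        # above the upper end point when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_return_level(mu, sigma, xi, return_period):
    """Level exceeded with probability 1/return_period in any one block.

    Obtained by solving G(z) = 1 - 1/return_period for z.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if return_period <= 1.0:
        raise ValueError("return_period must exceed 1")
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))
```

With annual maxima, `gev_return_level(mu, sigma, xi, 100)` gives the "100-year" level. Its uncertainty, which is often large, would have to be assessed separately, for example by profile likelihood.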

Notes from Malta short course (thanks to Eric Gilleland; see also
"Extremes Toolkit" below):

** Main notes **

** Practice examples **

** Sample R code **

In the last two classes (April 21, 23), we returned to the theme of
** detection and attribution **. The presentation from those
two classes is
** here **

** Papers referred to in class**

Cressie (1989), * The American Statistician * 43, no. 4, 197-202

Handcock and Stein (1993), * Technometrics * 35, no. 4, 403-410

Holland, de Oliveira, Cox and Smith (2000), * Environmetrics * 11, 373-393

Holland, Caragea and Smith (2004), * Atmospheric Environment * 38, 1673-1684

Smith, Kolenikov and Cox (2003), * J. Geophys. Res. * 108, D24, 9004,
doi:10.1029/2002JD002914, 2003

** Papers for Possible Class Presentation:**

Furrer, Sain, Nychka and Meehl (2007) * Environmental and Ecological Statistics * 14, 249-266
(climate ensembles; spatial-temporal fields)

Tebaldi and Sanso (2009) * JRSSA * 172, 83-106
(climate ensembles; joint modeling of temperature and precipitation)

Fuentes and Raftery (2005), * Biometrics * 61, 36-45
(combining observational data with air quality models)

Fuentes et al. (2006), * Biometrics * 62, 855-863
(spatial models for fine particles and mortality)

Stein, Chi and Welty (2004), * JRSSB * 66, 275-296 (approximate likelihoods for large spatial datasets)

Banerjee, Gelfand, Finley and Sang (2008), * JRSSB * 70, 825-848 (large spatial data sets)

Stein (2005), * JRSSB * 67, 667-687 (Space-time analysis of environmental data)

Stein (2008), * Journal of the Korean Statistical Society * 37, 3-10
(modeling approaches for large spatial datasets)

Shao, Stein and Ching (2007), * Journal of Statistical Planning and Inference * 137, 2277-2293 (air quality models)

Jun and Stein (2004), * Atmospheric Environment * 38, 4427-4436 (air quality models)

Kaufman, Schervish and Nychka (2007), * JASA *, to appear.
(tapering method for large spatial datasets)

Furrer, Genton and Nychka (2006), * Journal of Computational and Graphical Statistics * 15, 502-523.
(tapering method for large spatial datasets)

Jun and Stein (2008), * Annals of Applied Statistics * 2, 1271-1289
(nonstationary covariance models for global data)

Rue, Martino and Chopin (2009), * JRSSB *, to appear.
(approximate Bayesian inference for latent Gaussian process models)

Cressie and Johannesson (2008), * JRSSB * 70, 209-226
("Fixed rank kriging for very large spatial data sets")

Diggle, Tawn and Moyeed (1998), * JRSSC * 47, 299-350
(Original paper on "Model-Based Geostatistics")

Gelfand, Banerjee and Gamerman (2005), * Environmetrics * 16, 465-479.
(Dynamic models for spatial-temporal processes)

Paciorek and Schervish (2006), * Environmetrics * 17, 483-506.
(nonstationary covariance functions)

Paciorek (2007), * Computational Statistics and Data Analysis * 51, 3631-3653
(spatial logistic regression...)

Anderson and Bell (2009), * Epidemiology * 20, 205-213
(effect of weather on mortality)

Bell and Dominici (2008), * American Journal of Epidemiology * 167, 986-997
("effect modifiers" for the ozone-mortality association)

Pope, Ezzati and Dockery (2009),
* N Engl J Med * 360, 376 (January 22)
("Fine-Particulate Air Pollution and Life Expectancy in the United States")

Dominici, Peng, Bell, Pham, McDermott, Zeger, Samet (2006),
* JAMA * 295, 1127-1134.
("Fine Particulate Matter Air Pollution and Hospital Admissions...")

Peng, Chang, Bell, McDermott, Zeger, Samet, Dominici (2008),
* JAMA * 299, 2172-2179.
("Coarse Particulate Matter Air Pollution and Hospital Admissions...")

** Links for Software etc.: **

RLS programs and data

geoR home page

IMAGe software page (includes "fields" )

Statistics of Weather and Climate Extremes
(Rick Katz's page at NCAR - includes Extremes Toolkit)

gstat home page

Pebesma (2004), * Computers and Geosciences * 30, 683-691 (about gstat)

** Revised Class Schedule (3/17/09). **