Tuesdays and Thursdays, 9:30-10:45
First class: Tuesday January 13
Last class: Thursday April 23
No class March 10 or 12 because of Spring Break
No class January 22 or March 19 because the instructor is out of town.
There may be further changes to the schedule; any such changes will be announced.
Richard L. Smith
Course web page: http://www.unc.edu/~rls/s890/s890.html
The environmental sciences, broadly defined, are one of the major contemporary application areas of statistics. The field of climate change, to take just one high-profile example, generates huge datasets, both observational (e.g. records of temperature and precipitation) and model-generated (produced by climate models, which are large-scale computer programs that simulate the earth's climate). Some of the most topical problems involve climate extremes, for example studying whether increases in the frequency of extreme meteorological events can be attributed to anthropogenic increases in greenhouse gases. Environmental pollution is another example. It too gives rise to both observational and model-generated datasets, the latter produced by computer models that forecast the spread of air pollutants (e.g. CMAQ, the EPA's Community Multiscale Air Quality model). A related area of research concerns the human health effects of air pollution. Large datasets have been compiled to study this problem as well, in particular the NMMAPS (National Morbidity, Mortality and Air Pollution Study) dataset compiled at Johns Hopkins (see http://www.ihapss.jhsph.edu).
However, the intent of this course is not only to discuss these and other applications, but also to develop the statistical methodology that lies behind them. The methodologies that will be covered include Bayesian statistics, spatial statistics and extreme value theory. None of these will be covered in the same depth as would be the case in a specialized course, but the intention is to equip the student with the main tools that are used in environmental applications and to discuss how they are actually applied.
Prerequisites, Readings and Assignments
The course is open to graduate students in Statistics, Operations Research and other quantitative disciplines. There are no formal prerequisites, but I shall assume all participants have taken a graduate-level course in linear models and regression (STOR 674 or equivalent). Other courses that are desirable, but not strictly essential, include a graduate-level course in basic statistical theory (such as STOR 665) and some exposure to time series and multivariate analysis (such as are covered in STOR 754, which was taught by the same instructor in Fall 2008).
I expect to make extensive use of my own online course notes (from 2001), which are available at
Some other references that will be useful for the course include:
Banerjee, S., Carlin, B.P. and Gelfand, A.E. (2004), Hierarchical Modeling and Analysis for Spatial Data. Chapman and Hall/CRC, Boca Raton, FL.
Coles, S.G. (2001), An Introduction to Statistical Modeling of Extreme Values. Springer Verlag, New York.
Cressie, N. (1993), Statistics for Spatial Data. Second edition, John Wiley, New York.
Le, N.D. and Zidek, J.V. (2006), Statistical Analysis of Environmental Space-Time Processes. Springer Verlag, New York.
Smith, R.L. (2003), Statistics of extremes, with applications in environment, insurance and finance. Chapter 1 of Extreme Values in Finance, Telecommunications and the Environment, edited by B. Finkenstadt and H. Rootzen, Chapman and Hall/CRC Press, London, pp. 1-78. Preprint version available from:
Stein, M.L. (1999), Interpolation of Spatial Data: Some Theory of Kriging. Springer Verlag, New York.
There will be no homework assignments or exams. Students taking the course for credit will be expected to work on a small independent class project involving reading of the literature, analysis of a real dataset, or a computational project (e.g. a simulation to examine the properties of some environmental statistics procedure) and to present the project either orally or in writing. The exact format will depend on the number of students taking the course, but provisionally the last week of classes has been reserved for student presentations.
Tentative Schedule of Topics (subject to change!)
Class 1, Jan 13: Introduction: Bayesian statistics applied to the problem of combining models for climate change.
Class 2, Jan 15: Bayesian statistics and climate change (contd.)
Class 3, Jan 20: Bayesian statistics in environmental epidemiology: health effects of air pollution. Measurement error problems, combining data across cities, the NMMAPS project.
Class 4, Jan 27: Bayesian statistics in environmental epidemiology (contd.)
Class 5, Jan 29: Bayesian statistics in environmental epidemiology (contd.)
Class 6, Feb 3: Spatial statistics
Class 7, Feb 5: Spatial statistics (contd.)
Class 8, Feb 10: Spatial statistics (contd.)
Class 9, Feb 12: Spatial statistics (contd.)
Class 10, Feb 17: Nonstationary spatial models
Class 11, Feb 19: Nonstationary spatial models (contd.)
Class 12, Feb 24: Detection and attribution in climatology
Class 13, Feb 26: Detection and attribution in climatology (contd.)
Class 14, Mar 3: Lattice models
Class 15, Mar 5: Spatial-temporal models
Class 16, Mar 17: Spatial-temporal models (contd.)
Class 17, Mar 24: Network design
Class 18, Mar 26: Network design (contd.)
Class 19, Mar 31: Trend estimation in climatic time series
Class 20, Apr 2: Trend estimation in climatic time series (contd.)
Class 21, Apr 7: Extreme values
Class 22, Apr 9: Extreme values (contd.)
Class 23, Apr 14: Extreme values (contd.)
Class 24, Apr 16: Extreme values (contd.)
Class 25, Apr 21: Student presentations
Class 26, Apr 23: Student presentations (contd.)