TIME SERIES AND MULTIVARIATE ANALYSIS

Instructor: Richard L. Smith

This page was last updated December 16, 2008.

**Final exam including solutions and comments**

**examdata1.txt (Data for question 1)**

**raleighozone.txt (Data for question 2)**

**NCnormals.txt (Data for question 3)**

**Slides of final class presentation**

Older material:

The final exam will be posted on this webpage at 5:00pm, Tuesday December 2.

It is a take-home exam that is to be handed to me personally at 4:00pm, Wednesday December 3, or may be left before that time in my mailbox on the 3rd floor of Hanes Hall.

The exam will include both theoretical and computational material and you are free to use all course materials and computers in preparing it.

You are reminded that the Honor Code is in force during this exam. You are not allowed to consult with each other, or with any outside person, in any way at all. You are free to ask me questions in person or by phone or email. If you consult references other than the course materials, you should include a full citation in your answer.

We DO have our last regular class in the usual room at 12:30, and this is also an opportunity to ask me additional questions.

PCA for the Crimes dataset in R:

**Worked example (pdf file)**

**R code (text file)**

New R examples (from class of 11/24/08):

**Factor analysis examples**

**Ruspini dataset (example for clustering)**

**Sample code for analysis of Ruspini dataset**

**Astronomy dataset (example for clustering)**

**Sample code for analysis of astronomy dataset**

**Link to Fraley-Raftery tech report on model-based clustering in R**

Older code and examples:

**Iris analysis**

**PCA examples**

**Paleoclimatology analysis**

**HW4 (due 11/18/08)**
**Updated 11/08/08**

Announcements:

There will be no class Thursday, November 6.

The final exam will be a take-home exam (open book, includes computer exercises). The exam will be posted online at 5pm Tuesday, December 2 and will be due to the instructor by 4pm Wednesday, December 3 (exact times to be confirmed nearer the day).

HW3: Problems 10.13, 10.15, 10.21, 10.22 from Brockwell and Davis.
Due date: November 4, 2008.

*Note:* I think there is a misprint in 10.15(d): J200 should be I200.
Also, you can use the result of 10.13 to solve 10.15(d).

**Spectral analysis programs** (also available on the Datasets page)

The midterm exam is in class on Tuesday, October 14. It is a closed-book exam but you may bring in one handwritten sheet (writing allowed on both sides) with your own personal notes and formulas. You should also bring a blue book and a calculator.

**Midterm from 1999 course**

**Solutions**

OFFICE HOURS: I will hold my regular office hour on Wednesday, October 8, and an additional office hour Monday, October 13 (1:00-3:00pm both days).

**Older Links**

I was asked for some references about periodically correlated time series. Here are some:

1. The recent book by Harry Hurd,

*Periodically Correlated Random Sequences: Spectral Theory and Practice*,
Harry L. Hurd, Abolghassem Miamee

Wiley Series in Probability and Statistics.

ISBN: 978-0-471-34771-2

Hardcover, 384 pages, October 2007

2. Some papers by Robert Lund and various co-authors:

Lund, Hurd, Bloomfield and Smith (1997), *Journal of Climate* **11**, 2787-2809.

Lund and Basawa (2000), *Journal of Time Series Analysis* **21**, 75-93.

Basawa and Lund (2001), *Journal of Time Series Analysis* **22**, 651-663.

Basawa, Lund and Shao (2004), *Statistics and Probability Letters* **67**, 299-306.

Lund, Shao and Basawa (2006), *Australian & New Zealand Journal of Statistics* **48**, 33-47.

**Notes on MLE (updated 10/03/08)**

HW2: Problems 4.5, 4.12, 4.16, 5.15, 5.8 from Brockwell and Davis. Due date: October 7, 2008.

Notes:

In 4.12, verify the given Yule-Walker estimates for yourself in R (go here for the data). Also compute the MLE for the AR(2) model. You can decide whether to base the spectral density calculation on the Yule-Walker or the ML estimates. Note that this is a "theoretical" spectral density calculation, obtained from the formula for the fitted AR model; it doesn't require the spectral density estimation techniques within R.
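
As a minimal sketch of what a "theoretical" spectral density calculation involves, the block below evaluates the standard formula f(λ) = σ² / (2π |1 − φ₁e^{−iλ} − … − φ_p e^{−ipλ}|²) for a causal AR(p) model, using the Brockwell and Davis convention with λ in [−π, π]. It is written in Python rather than R purely for illustration. The function name `ar_spectral_density` and the coefficient values are hypothetical placeholders, not the estimates from problem 4.12; substitute your own Yule-Walker or ML estimates.

```python
import cmath
import math


def ar_spectral_density(phi, sigma2, lam):
    """Spectral density of a causal AR(p) process at frequency lam.

    phi    : list of AR coefficients [phi_1, ..., phi_p]
    sigma2 : white-noise variance
    lam    : frequency in radians, in [-pi, pi]
    """
    # Evaluate the AR polynomial 1 - phi_1 z - ... - phi_p z^p at z = exp(-i*lam).
    z = cmath.exp(-1j * lam)
    poly = 1 - sum(coef * z ** (k + 1) for k, coef in enumerate(phi))
    return sigma2 / (2 * math.pi * abs(poly) ** 2)


# Hypothetical AR(2) estimates, for illustration only (not the values from 4.12).
phi_hat = [0.5, -0.3]
sigma2_hat = 1.0

# Evaluate on a coarse grid of frequencies from 0 to pi.
freqs = [math.pi * j / 4 for j in range(5)]
density = [ar_spectral_density(phi_hat, sigma2_hat, lam) for lam in freqs]

for lam, f in zip(freqs, density):
    print(f"lambda = {lam:.4f}, f(lambda) = {f:.6f}")
```

With an empty coefficient list the function reduces to the flat white-noise density σ²/(2π), which gives a quick sanity check before plugging in fitted values.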

In 4.16, feel free to use the plotting routines within R or any other package to display the results, but again the main point of the exercise is algebraic manipulation of the spectral density.

Amend question 5.8 as follows: Use the method of 5.15 to solve part (c) of this problem, and also part (b) based on the forecasts you get in (c). Although this is a numerical calculation, you are expected to do this part essentially by direct calculation, rather than running the whole thing through a package. However, then use the "predict" command within R to solve the same problem, and compare the results. The point of the exercise is that the method of 5.15 does not in fact give the exact finite-series optimal predictor, but it should be close.

HW1: Problems 3.1, 3.2, 3.7, 3.14, 3.15 from Brockwell and Davis. Due date: September 18, 2008.

My office hours are now fixed as WEDNESDAY 1:00-3:00 pm.

Professor Richard Smith

Office: Hanes 303

Phone: 962-2660 office, 408-8126 home

Email: rls *at* email.unc.edu

The course STOR 754 is one of the Statistics program's "B-level" courses. Students in the Statistics program will have already taken the first-year or "A-level" courses. Students from other programs are welcome to take the course provided they have roughly the following background: a graduate-level course in regression and linear models equivalent to STOR 664, and some exposure to graduate-level statistical theory at the level of STOR 654/655. If in doubt, please email the instructor.

**Required** text:
*Time Series: Theory and Methods* (Springer Series in Statistics) by
Peter J. Brockwell, Richard A. Davis

**Recommended** text:
*Multivariate Analysis* (Probability and Mathematical Statistics) (Paperback)
by K.V. Mardia, J.T. Kent and J.M. Bibby.

These will be supplemented by the instructor's own course notes (see links below for the current version of these notes, though they may get updated as the course proceeds).

**Topics, Time series:** Stationary processes; autocovariances, spectral
density, linear process, ARMA. Estimation, time and spectral domain methods.
Prediction.

**Topics, Multivariate analysis:** Basic theory of the multivariate normal
and Wishart distributions. Principal components, canonical correlations, factor
analysis, cluster analysis.

Other topics that may be covered as time permits: Multivariate time series, state space models, long-range dependence, dimension-reduction techniques in multivariate analysis.

**Software:** The Brockwell-Davis book includes an excellent self-contained
(and very user-friendly) time series package called ITSM, and part of the time
we will use that. However, R also includes extensive facilities for both time
series and multivariate analysis, so we will use that as well. No previous knowledge
of ITSM is needed; some previous exposure to R is desirable but not essential.
If you don't know R already you can download it here.

The online notes linked below include numerous references to S-PLUS programs. As you may know, S-PLUS has now largely been supplanted by R, and we won't be using S-PLUS in this course. However, many programs from S-PLUS are easily modified (or may need no modifying at all) to run in R.

*Special Note:* B-level courses are normally taught once every two years,
but this schedule is not followed exactly. The last time STOR 754 was
taught was Spring 2008, when the instructor was Professor P.K. Sen. It is my understanding
that Professor Sen concentrated on the "Multivariate" half of the course. Given this
background, my intention is to spend at least two-thirds of the class time on
time series, though the exact amount may depend on how many students who register
for this class also took STOR 754 in the spring. Direct overlap between the two
courses will be kept to a minimum. *Therefore, it has been agreed that students
who previously took STOR 754 with Professor Sen are allowed (and encouraged) to take
it a second time to learn about time series.*

The following materials were used last time I taught this course, in 1999.

**Time Series Notes**

**Multivariate Analysis Notes**

**Time series datasets**

**Additional datasets used in computing assignment**

**S-PLUS programs for Chapter 6 of time series notes**

**Multivariate datasets**
