## Lecture 9—Wednesday, Sept. 17, 2003

What was covered?

• Overview and review of estimation.
• Introduction to the bootstrap.
• Bootstrap confidence intervals.

### Overview of Estimation

In class and/or in the textbook, we've dealt with four types of estimation methods:

1. Method of moments (also called plug-in) estimators.
• These are easy to calculate but typically have no theory to back them up or to determine their precision, unless they also happen to fall into one of the other classes of estimators.
2. Estimators based on non-statistical theory.
• In this category I would put any estimator developed from biological criteria rather than statistical ones. Many of the diversity estimators and the like are examples. If there is any statistical theory (such as a formula for standard errors) associated with them, it will be on a case-by-case basis.
3. Maximum likelihood estimators (mles)
• There is a full statistical theory associated with them.
• Their main drawback is that they're not always relevant in specific cases. In particular, to construct an mle you first need a probability model for your data. It may not always be obvious what the appropriate probability model might be and in many cases there will be competing candidates.
4. Least squares estimators.
• We have not discussed these, but your text does. They can always be calculated but they may not always be appropriate.
• Least squares estimation yields the value of $\theta$ that minimizes $\sum_{i=1}^{n} \left(y_i - f(x_i, \theta)\right)^2$ with respect to $\theta$ for a set of data values $(x_1, y_1), \ldots, (x_n, y_n)$ and a given function f.
• If you can assume normality, then standard errors of the estimators can be constructed and hypothesis tests conducted. In linear models where normality can be assumed, least squares estimators are optimal. Outside of linear models and normally distributed response variables, least squares is generally inappropriate.
• The popularity of least squares estimators is largely a historical accident and owes primarily to the fact that explicit analytical solutions can be obtained using calculus.
• Any situation where least squares estimation is appropriate can also be formulated as a maximum likelihood estimation problem in which a normal probability model is selected.

Note: the sample mean is a method of moments estimator, a maximum likelihood estimator (under many common probability models), and a least squares estimator. In a least squares sense, it is the value of c that minimizes the expression $\sum_{i=1}^{n} (x_i - c)^2$.
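As a quick numerical check of this last claim, here is a short Python sketch (not part of the lecture; the data values are arbitrary illustrative numbers) verifying that the sample mean minimizes the sum of squared deviations:

```python
# Verify numerically that the sample mean minimizes sum((x_i - c)^2).
data = [4.1, 2.7, 9.3, 5.5, 7.0]  # arbitrary illustrative values

mean = sum(data) / len(data)

def sse(c):
    """Sum of squared deviations of the data from a candidate value c."""
    return sum((x - c) ** 2 for x in data)

# The SSE at the mean is no larger than at any nearby candidate value.
candidates = [mean + d / 10 for d in range(-20, 21)]
assert all(sse(mean) <= sse(c) for c in candidates)
```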

### Bootstrapping

• Bootstrapping is a computer-intensive technique that can be used to obtain standard errors for estimators regardless of their origin. It can also be used to construct interval estimates of parameters.

• Bootstrapping is especially useful in situations where there is no statistical theory to guide us or where standard statistical theory is inappropriate or suspect. Thus it is especially appropriate for categories 1, 2, and even 4 in the above overview of estimation theory.

• The bootstrap was invented by Bradley Efron in the middle 1970s. The name was chosen because in the method one seems to use the data to pull oneself up by one's own bootstraps. Efron remarked in an early paper that he had contemplated calling the method the "shotgun" because it "... can blow the head off any problem if the statistician can stand the resulting mess."
• There are two distinct flavors of the bootstrap: the parametric bootstrap and the nonparametric bootstrap.
• In the parametric bootstrap a probability model is assumed for the underlying population.
• In the nonparametric bootstrap we proceed without a probability model.
• When most statisticians use the word "bootstrap" they are referring to the nonparametric bootstrap.

### Parametric Bootstrap

• Suppose we're studying seed shadows, so we measure the distance from a source plant to a random selection of propagules. Below is a random sample of 10 such distances.
`41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78`
• We assume this is a random sample from an exponential distribution with parameter $\lambda$, a common probability model for seed shadows. We wish to estimate $\lambda$ and obtain some measure of the precision of our estimate.
• Using the method of moments, maximum likelihood theory, divination, or whatever, we decide to estimate $\lambda$ by $\hat\lambda = 1/\bar{x}$. For our data this yields $\hat\lambda = 0.018153$.
• We next use a statistical computer package (R, or whatever) to generate 10 realizations from an exponential distribution with $\lambda = 0.018153$. (In R this could be done with `rexp(10, 0.018153)`.) The resulting sample is called the first bootstrap sample. Using our estimation formula above, we then estimate $\lambda$ from this sample. Denote this estimate by $\hat\lambda_1^*$.
• Next we repeat this step B times. (B is often taken to be 1000 or more.) Obtain bootstrap estimates $\hat\lambda_1^*, \hat\lambda_2^*, \ldots, \hat\lambda_B^*$.
• Calculate $\bar{\lambda}^* = \frac{1}{B}\sum_{i=1}^{B} \hat\lambda_i^*$ and use this to obtain the standard error of $\hat\lambda$ as follows: $\widehat{SE}(\hat\lambda) = \sqrt{\frac{1}{B-1}\sum_{i=1}^{B} \left(\hat\lambda_i^* - \bar{\lambda}^*\right)^2}$.
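The parametric bootstrap steps above can be sketched as follows. This is a Python sketch using only the standard library rather than R; the seed, the choice B = 1000, and the variable names are illustrative assumptions:

```python
import math
import random

random.seed(1)  # for reproducibility of the sketch

data = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
n = len(data)
lam_hat = 1 / (sum(data) / n)  # estimate of lambda: 1 / sample mean, ~0.018153

B = 1000
boot_estimates = []
for _ in range(B):
    # Generate n realizations from the fitted Exponential(rate = lam_hat) model.
    sample = [random.expovariate(lam_hat) for _ in range(n)]
    boot_estimates.append(1 / (sum(sample) / n))  # re-apply the estimator

# Standard error = standard deviation of the bootstrap estimates.
boot_mean = sum(boot_estimates) / B
se = math.sqrt(sum((b - boot_mean) ** 2 for b in boot_estimates) / (B - 1))
```

In R the generation step would instead be repeated calls of `rexp(10, 0.018153)`.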

### Nonparametric Bootstrap

• Consider again the seed shadow example above. This time we don't assume any probability model at all, but suppose there is still a parameter $\theta$ that is of interest to us.
• Estimate $\theta$ in whatever fashion seems reasonable. For simplicity, assume we use the same estimator as above.
• Construct the empirical distribution function of our sample. For a sample of size n, the empirical density function assigns a weight of $\frac{1}{n}$ to each data value.
• The empirical distribution function is then given by $\hat{F}(x) = \frac{1}{n}\sum_{i=1}^{n} I(x_i \le x)$, where $I$ is the indicator function. This is a step function that jumps a height of $\frac{1}{n}$ (or some multiple of $\frac{1}{n}$ if there are repeated observations) at each unique data value. The empirical distribution function is an estimator of the population cumulative distribution function.
• Using the empirical density function as the population model, take a sample of size 10 from this "population". In the example above we would sample from the ten data values assigning equal probability to each.
• Notice that using this protocol we may select the same observation more than once in our sample. This is called sampling with replacement.
• In R (or any other programming language for that matter) we could carry out this procedure by first assigning the data values the numbers 1 through 10. We then select observations based on the return value of `ceiling(10*runif(10))`, where `runif` is the R function that returns a random number chosen uniformly from the interval [0, 1] and `ceiling` is the ceiling function (the function that rounds its argument up to the nearest integer).
• A simpler way in R is to use the sample function as will be demonstrated this week in lab.
• Having obtained the bootstrap sample, calculate $\hat\theta_1^*$ for the sample (just as was done in the parametric bootstrap). The rest of the procedure follows that outlined for the parametric bootstrap except that we use the empirical distribution to generate our samples.
• Why does this make sense?
• Although it seems like there's no probability model guiding the selection of our sample, in reality there is one: the empirical probability model, the probability model suggested by our data. In this model each data point is assigned a probability based on its frequency.
• The empirical probability function is a best estimate of the underlying population probability distribution.
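The whole nonparametric procedure can be sketched in the same way (again a Python standard-library sketch; `random.choices`, which samples with replacement, plays the role of R's `sample` with `replace=TRUE`, and the seed and B are illustrative choices):

```python
import math
import random

random.seed(1)

data = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
n = len(data)
theta_hat = 1 / (sum(data) / n)  # same estimator as in the parametric example

B = 1000
boot_estimates = []
for _ in range(B):
    # Sampling with replacement from the data = sampling from the empirical
    # distribution, which puts weight 1/n on each observation.
    resample = random.choices(data, k=n)
    boot_estimates.append(1 / (sum(resample) / n))

boot_mean = sum(boot_estimates) / B
se = math.sqrt(sum((b - boot_mean) ** 2 for b in boot_estimates) / (B - 1))
```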

### Bootstrap Confidence Intervals

There are five basic bootstrap confidence intervals implemented in the `boot` library of R. I describe each in turn.

#### Normal (Standard) Bootstrap Confidence Interval: `type='norm'`

• A generic template for a confidence interval is the following:

$$\hat\theta \pm (\text{critical value}) \times \widehat{SE}(\hat\theta)$$

• $\hat\theta$ is the estimate of $\theta$ from our sample. For $\widehat{SE}(\hat\theta)$ we use a bootstrap estimate: the standard deviation of the bootstrap estimates of $\theta$.
• To justify the use of this confidence interval we still must meet the usual assumptions: a large sample size or a normal population, etc.
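Under those assumptions, the interval is just the usual formula with the bootstrap standard error plugged in. A Python sketch continuing the seed-shadow example (1.96 is the normal 97.5% quantile; seed and B are illustrative):

```python
import math
import random

random.seed(1)

data = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
n = len(data)
theta_hat = 1 / (sum(data) / n)

# Bootstrap standard error from B nonparametric resamples.
B = 1000
boot = [1 / (sum(random.choices(data, k=n)) / n) for _ in range(B)]
boot_mean = sum(boot) / B
se = math.sqrt(sum((b - boot_mean) ** 2 for b in boot) / (B - 1))

# Normal (standard) bootstrap 95% CI: estimate +/- 1.96 * bootstrap SE.
ci = (theta_hat - 1.96 * se, theta_hat + 1.96 * se)
```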

#### Percentile Bootstrap Confidence Interval: `type='perc'`

• Rather than appeal to a normal or t-distribution to obtain the quantiles for a confidence interval, we use the bootstrap distribution itself.
• To obtain a 95% confidence interval, e.g., we generate the bootstrap distribution and determine the 2.5 percentile and the 97.5 percentile of the bootstrap distribution. These numbers form the boundaries of our confidence interval.
• Percentile bootstrap confidence intervals tend to be asymmetric about the estimate, because they inherit the shape of the bootstrap distribution.
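A sketch of the percentile interval in Python; the crude `int(q * B)` indexing stands in for a proper quantile routine such as R's `quantile`, and the seed and B are illustrative:

```python
import random

random.seed(1)

data = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
n = len(data)

# Sorted bootstrap distribution of the estimator 1/mean.
B = 1000
boot = sorted(1 / (sum(random.choices(data, k=n)) / n) for _ in range(B))

# 95% percentile interval: the 2.5th and 97.5th percentiles of the bootstrap.
ci = (boot[int(0.025 * B)], boot[int(0.975 * B)])
```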

#### Basic Bootstrap Confidence Interval: `type='basic'` (not covered in lecture)

• The algebra is a bit confusing for this one, but in many respects it resembles the percentile confidence interval. The basic idea is that the distribution of the bootstrap values about the sample estimate should resemble the sampling distribution of the sample estimate about the true population parameter. Symbolically, the distribution of $\hat\theta^* - \hat\theta$, which we can observe, should resemble the distribution of $\hat\theta - \theta$, which we can't (unless we take multiple samples).
• Put another way, we want to find L and U so that

$$P(L \le \hat\theta - \theta \le U) = 1 - \alpha$$

but we use

$$P(L \le \hat\theta^* - \hat\theta \le U) = 1 - \alpha$$

to actually find them and hope that they're close to the ones we actually want.

• Rewriting the second probability statement we have

$$P(\hat\theta + L \le \hat\theta^* \le \hat\theta + U) = 1 - \alpha$$

• Next we use the bootstrap distribution to find the quantiles we need. Call them $\hat\theta^*_{\alpha/2}$ and $\hat\theta^*_{1-\alpha/2}$. Setting them equal to the endpoints of the interval defined in the above probability statement and solving for L and U yields

$$L = \hat\theta^*_{\alpha/2} - \hat\theta, \qquad U = \hat\theta^*_{1-\alpha/2} - \hat\theta$$

• Next return to the original probability statement, the one we actually care about, and solve the inequality for $\theta$:

$$\hat\theta - U \le \theta \le \hat\theta - L$$

• Lastly, plug in the values for L and U obtained from the bootstrap distribution:

$$\left(2\hat\theta - \hat\theta^*_{1-\alpha/2},\; 2\hat\theta - \hat\theta^*_{\alpha/2}\right)$$

• These are the limits that are returned by R for the basic bootstrap confidence interval.
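Given the sorted bootstrap distribution, the basic interval differs from the percentile interval only in reflecting the quantiles about the estimate. A Python sketch on the seed-shadow data (simple index-based percentiles; seed and B are illustrative):

```python
import random

random.seed(1)

data = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
n = len(data)
theta_hat = 1 / (sum(data) / n)

B = 1000
boot = sorted(1 / (sum(random.choices(data, k=n)) / n) for _ in range(B))
q_lo, q_hi = boot[int(0.025 * B)], boot[int(0.975 * B)]

# Basic bootstrap 95% CI: (2*theta_hat - q_hi, 2*theta_hat - q_lo).
# Note that the upper bootstrap quantile produces the lower endpoint.
ci = (2 * theta_hat - q_hi, 2 * theta_hat - q_lo)
```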

#### Studentized-t (Percentile-t) Bootstrap Confidence Interval: `type='stud'`

• The studentized-t bootstrap confidence interval takes the same form as the normal confidence interval except that instead of using the quantiles from a t-distribution (or a normal distribution) a bootstrapped t-distribution is constructed from which the quantiles are computed.
• For each bootstrap sample the following quantity is calculated:

$$t_i^* = \frac{\hat\theta_i^* - \hat\theta}{\widehat{SE}(\hat\theta_i^*)}$$
• The complication with this formula is that we need some way of estimating $\widehat{SE}(\hat\theta_i^*)$, the standard error of the bootstrap estimate in each bootstrap sample. In R we actually need to enter a formula for this quantity. So if $\hat\theta$ is a statistic for which there exists a variance formula, then we can proceed. Note: in principle it would be possible to bootstrap the bootstrap sample to get an estimate of the standard error, but this is very resource intensive. So this method works best when a formula for the variance already exists.
• Having obtained the B values of $t_i^*$, obtain the quantiles needed for the confidence interval. To generate a 95% confidence interval, e.g., obtain the 2.5 percentile and the 97.5 percentile of the bootstrap t distribution, $t^*_{0.025}$ and $t^*_{0.975}$. Use these numbers like ordinary t-statistics and construct the confidence interval in the usual way:

$$\left(\hat\theta - t^*_{0.975}\,\widehat{SE}^*(\hat\theta),\; \hat\theta - t^*_{0.025}\,\widehat{SE}^*(\hat\theta)\right)$$

where $\widehat{SE}^*(\hat\theta)$ is the bootstrap estimate of the standard error of $\hat\theta$. Notice that the quantiles of the bootstrapped t are used in what is perhaps the reversed order from what you might expect.
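A Python sketch of the studentized interval for the seed-shadow example. The required variance formula is assumed here to be the delta-method result $\widehat{SE}(\hat\lambda) \approx \hat\lambda/\sqrt{n}$ for the exponential rate; that formula, along with the seed and B, is an assumption of this sketch rather than something given in the lecture:

```python
import math
import random

random.seed(1)

data = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
n = len(data)
theta_hat = 1 / (sum(data) / n)
se_hat = theta_hat / math.sqrt(n)  # assumed delta-method SE for the exponential rate

B = 1000
t_stats = []
for _ in range(B):
    resample = random.choices(data, k=n)
    theta_star = 1 / (sum(resample) / n)
    se_star = theta_star / math.sqrt(n)  # same SE formula inside each bootstrap sample
    t_stats.append((theta_star - theta_hat) / se_star)

t_stats.sort()
t_lo, t_hi = t_stats[int(0.025 * B)], t_stats[int(0.975 * B)]

# Note the reversed order: the upper t quantile gives the lower endpoint.
ci = (theta_hat - t_hi * se_hat, theta_hat - t_lo * se_hat)
```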

#### Bias-corrected and Accelerated Bootstrap Confidence Interval: `type='bca'`

• The bias-corrected and accelerated bootstrap confidence interval attempts to shift and scale the percentile bootstrap confidence interval to compensate for bias. The formulas for doing this are fairly complicated and rather unintuitive.
• There are two parameters involved: one, usually denoted $z_0$, attempts to correct the bias; a second, $a$, is called the acceleration parameter.
• There is no unanimity of opinion on whether BCa yields the best bootstrap confidence intervals. In my limited experience I have found that there are often problems in trying to calculate these intervals, so that warnings are printed suggesting that the calculated estimates may be suspect.
• There is an approximation to the BCa method, called the ABC method (short for approximate bootstrap confidence intervals), that finds the endpoints of the BCa method analytically. I've seen cases where the ABC method seems to work when the BCa method does not, or where the BCa method runs out of memory. In the `boot` library the function `abc.ci` finds ABC endpoints. See the lab 4 notes for more information.

#### Vocabulary

• Least squares estimate
• Bootstrap
• Parametric bootstrap
• Nonparametric bootstrap
• Bias-corrected and accelerated bootstrap
• Normal (standard) bootstrap confidence interval
• Basic bootstrap confidence interval
• Percentile bootstrap confidence interval
• Bootstrap-t confidence interval
• Bias-corrected and accelerated bootstrap confidence interval