Lecture 10: Point and Interval Estimation


Statistical Estimation

Estimator:
a statistic which is used to estimate a parameter.

Estimate:
the numerical value of the estimator that is used.

Note: if we use xbar to estimate MU and xbar = 10, then xbar is the estimator whereas the estimate is 10.

Point Estimate:
a single number.

Interval Estimate:
a range of values within which the parameter is thought to lie.


Point Estimation

When we choose an estimator to be used to make a point estimate of a particular population parameter, all we can do is compare the sampling distributions of the various estimators.

for example: sample mean, sample median, etc.

Such a comparison will show which estimators are likely to depart considerably from the population parameter.

We use 3 criteria:

  1. unbiasedness
  2. efficiency
  3. consistency

Unbiasedness
an unbiased estimator is a statistic, the expected value of which equals the parameter being estimated.

thus, xbar is an unbiased estimator because E(xbar) = MU

Efficiency
if two estimators are unbiased, one is more efficient than the other if its variance is less than the variance of the other.

the variance of the sample mean is SIGMA^2 / n

if the population is normal, the variance of the sampling distribution of the sample median is approximately 1.57 SIGMA^2 / n, so the sample mean is the more efficient of the two estimators
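The comparison of the sample mean and the sample median can be checked by simulation. The sketch below is only an illustration: the population values (MU = 50, SIGMA = 10), the sample size, the number of repetitions and the function name are all assumptions chosen for the example, not part of the lecture.

```python
import random
import statistics


def compare_mean_and_median(mu=50.0, sigma=10.0, n=25, reps=5000, seed=1):
    """Simulate the sampling distributions of the sample mean and median.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        # Draw one sample of size n from a normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return {
        "mean_of_means": statistics.fmean(means),
        "mean_of_medians": statistics.fmean(medians),
        "var_of_means": statistics.pvariance(means),
        "var_of_medians": statistics.pvariance(medians),
    }


result = compare_mean_and_median()
variance_ratio = result["var_of_medians"] / result["var_of_means"]
print(result)
print("variance ratio (median / mean):", round(variance_ratio, 2))
```

Both averages should come out close to 50 (unbiasedness), while the variance ratio should be noticeably above 1 (the mean is more efficient). For a sample this small the ratio is expected to fall somewhat below the large-sample figure of 1.57.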

Consistency
a statistic is a consistent estimator of a parameter if the probability that the statistic's value is very near the parameter's value approaches 1 as the sample size increases.

for example, for xbar --> standard error = SIGMA / sqrt(n) --> 0 as n --> infinity
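A small calculation shows how quickly the standard error of xbar shrinks. The value SIGMA = 10 and the sample sizes are assumptions picked only for illustration.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 10.0  # assumed population standard deviation
for n in (10, 100, 1000, 10000):
    print(n, round(standard_error(sigma, n), 3))
```

Each hundredfold increase in n cuts the standard error by a factor of ten, which is why xbar settles ever closer to MU as the sample grows.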


Confidence Intervals for the Population Mean

Where SIGMA is Known: Large Sample

Note: define P{X > 32} as the probability that X is greater than 32. Now define P{a < X < b} as the probability that X lies between a and b. Thus, the probability that the value of the sample mean lies between

MU - 1.96 SIGMA / sqrt(n) and MU + 1.96 SIGMA / sqrt(n)

is denoted by

P{MU - 1.96 SIGMA / sqrt(n) < xbar < MU + 1.96 SIGMA / sqrt(n)}

Thus, what we know about the sampling distribution of the sample mean implies:

P{MU - 1.96 SIGMA / sqrt(n) < xbar < MU + 1.96 SIGMA / sqrt(n)} = 0.95

because the probability that any normal random variable will lie within 1.96 standard deviations of its mean is 0.95.
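The 1.96 figure can be verified directly from the standard normal distribution. This sketch uses `NormalDist` from Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Probability that a normal variable lies within 1.96 standard
# deviations of its mean.
prob_within = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)
print(round(prob_within, 4))
```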

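To construct an interval estimate for the population mean MU, rearrange the inequalities so that MU is in the middle:

P{xbar - 1.96 SIGMA / sqrt(n) < MU < xbar + 1.96 SIGMA / sqrt(n)} = 0.95

This means that before the sample is drawn, when xbar is unknown, there is a 0.95 probability that the interval will include the population mean MU.

But once the sample has been drawn and xbar has been calculated, this probability statement no longer applies to the resulting interval.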


Example 1

SIGMA = 10
n = 100
xbar = 2.00

it would be incorrect to say

P{2.00 - 1.96 * 10 / sqrt(100) < MU < 2.00 + 1.96 * 10 / sqrt(100)} = 0.95

P{0.04 < MU < 3.96} = 0.95

Since MU is a constant, it is either in this interval or it is not.


All one can say is that if intervals of this sort are calculated repeatedly, they will include the population mean in about 95% of the cases.
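The repeated-sampling interpretation can be illustrated by simulation. In the sketch below the true mean MU = 2, SIGMA = 10, the sample size, the number of repetitions and the function name are assumptions for the example; in practice MU is of course unknown.

```python
import math
import random
import statistics


def coverage_rate(mu=2.0, sigma=10.0, n=25, reps=4000, z=1.96, seed=7):
    """Fraction of simulated 95% intervals that contain the true mean.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        # Each sample gives a different interval; MU itself never moves.
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps


rate = coverage_rate()
print("share of intervals containing MU:", rate)
```

The share should come out near 0.95. Any single interval either contains MU or does not; the 95% describes the procedure, not one particular interval.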

Thus, what we have is a confidence interval with confidence coefficient (1 - ALPHA). In the above example (1 - ALPHA) = 0.95

or

ALPHA = 0.05

Confidence Interval for MU (SIGMA known):
if the confidence coefficient is set equal to (1 - ALPHA), the confidence interval for the population mean is:

xbar ± Z(ALPHA/2) SIGMA / sqrt(n)

where Z(ALPHA/2) is the value of the standard normal variable that is exceeded with a probability of ALPHA / 2.
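The general interval can be written as a small function. This is a sketch only: the function name and argument names are hypothetical, and the Z value is computed with `NormalDist.inv_cdf` from the standard library rather than read from a table.

```python
import math
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Interval xbar +/- Z(alpha/2) * sigma / sqrt(n) for a known sigma."""
    alpha = 1.0 - confidence
    # Value of the standard normal exceeded with probability alpha / 2.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# The figures of Example 1: SIGMA = 10, n = 100, xbar = 2.00.
low, high = z_confidence_interval(xbar=2.00, sigma=10.0, n=100, confidence=0.95)
print(round(low, 2), round(high, 2))
```

With the figures of Example 1 this agrees with the interval 0.04 to 3.96 to two decimal places.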


Example 2

Suppose we wanted a 90% C.I. instead of 95%.

(1 - ALPHA) = 0.90

so

ALPHA / 2 = 0.05

from the Standard Normal Table --> Z0.05 = 1.64

so

2.00 ± 1.64 (10 / sqrt(100)) --> 90% C.I.: 0.36 < MU < 3.64
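The table lookup can be reproduced numerically. The sketch below computes the Z value for a few common confidence coefficients; the function name is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Standard normal value exceeded with probability alpha / 2."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for confidence in (0.90, 0.95, 0.99):
    print(confidence, round(z_critical(confidence), 3))
```

For 90% the computed value is about 1.645; tables that list two decimals give 1.64 or 1.65. Using 1.645 in Example 2 would give roughly 0.355 to 3.645 instead of 0.36 to 3.64.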


Where SIGMA is Unknown: Large Sample

Just use the sample standard deviation, s, as an estimator for SIGMA.

so we get

P{xbar - 1.96 s / sqrt(n) < MU < xbar + 1.96 s / sqrt(n)} = 0.95

or, for a confidence coefficient of (1 - ALPHA):

xbar ± Z(ALPHA/2) s / sqrt(n)


Example 3

n = 90
xbar = 810
s = 85

A 99% confidence interval would be:

Z(ALPHA/2) = Z0.005 = 2.576

810 - 2.576 (85 / sqrt(90)) < MU < 810 + 2.576 (85 / sqrt(90))

810 - 2.576 (8.96) < MU < 810 + 2.576 (8.96)

786.92 < MU < 833.08
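The arithmetic of Example 3 can be checked as follows. The sketch reads 810 as the sample mean xbar and uses s in place of SIGMA; variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 90
xbar = 810.0  # sample mean (810 is treated as xbar here)
s = 85.0      # sample standard deviation, used as an estimate of SIGMA

z = NormalDist().inv_cdf(0.995)     # Z0.005, about 2.576
standard_error = s / math.sqrt(n)   # about 8.96
low = xbar - z * standard_error
high = xbar + z * standard_error
print(round(low, 2), round(high, 2))
```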


Confidence Intervals for the Difference Between Two Means: Independent Samples

If we take two populations, one with mu1, sigma1 and the other with mu2, sigma2, and take a simple random sample of n1 observations from the first and n2 observations from the second; and if these two random samples are entirely independent, then:

If both samples are large, and if the confidence coefficient is set equal to (1 - alpha), the confidence interval for the difference between the population means (mu1 - mu2) is:

(xbar1 - xbar2) ± Z(alpha/2) sqrt(s1^2 / n1 + s2^2 / n2)

where

s1^2 is the variance of the first sample and

s2^2 is the variance of the second sample.


Example 4

A Soup Company has two plants and suspects that the mean drained weight of the contents of a can is higher at plant 1 than plant 2. To test this they draw a random sample of 100 cans from each plant and construct a 90% confidence interval for difference of means.

xbar1 = 23.02    xbar2 = 22.83
s1^2 = 0.64      s2^2 = 0.36
n1 = 100         n2 = 100

(1 - alpha) = 0.90

thus,

alpha = 0.10

and

alpha/2 = 0.05

Z0.05 = 1.64

so

(23.02 - 22.83) ± 1.64 sqrt(0.64 / 100 + 0.36 / 100) = 0.19 ± 1.64 (0.10) = 0.19 ± 0.164

so the 90% confidence interval is 0.026 to 0.354 oz.
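The two-sample interval can be sketched as a function. The function and argument names are hypothetical. The call below takes xbar2 = 22.83, which is an assumption: it is the value consistent with the stated interval of 0.026 to 0.354 oz., whose midpoint is 0.19.

```python
import math
from statistics import NormalDist


def diff_of_means_interval(xbar1, xbar2, var1, var2, n1, n2, confidence):
    """Large-sample interval for mu1 - mu2 from two independent samples."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    diff = xbar1 - xbar2
    return diff - z * standard_error, diff + z * standard_error


# Soup company figures; xbar2 = 22.83 is assumed (see the note above).
low, high = diff_of_means_interval(23.02, 22.83, 0.64, 0.36, 100, 100, 0.90)
print(round(low, 3), round(high, 3))
```

Because the whole interval lies above zero, the data are consistent with the suspicion that the mean drained weight is higher at plant 1.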


The t-Distribution

If the population sampled is normally distributed, the statistic

t = (xbar - mu) / (s / sqrt(n))

has the t-distribution. The t-distribution is symmetrical, bell-shaped and has zero as its mean.

The t-distribution is the sampling distribution of this statistic.

It is a family of distributions, each of which corresponds to a particular number of degrees of freedom. In this context, the number of degrees of freedom equals (n - 1).
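A simulation shows why the normal value 1.96 is not suitable for small samples. The sketch below computes t = (xbar - mu) / (s / sqrt(n)) for many samples of size 5 from a normal population and counts how often |t| exceeds 1.96. The sample size, repetition count and function names are assumptions for illustration.

```python
import math
import random
import statistics


def t_statistic(sample, mu):
    """t = (xbar - mu) / (s / sqrt(n)) for one sample."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return (xbar - mu) / (s / math.sqrt(n))


def tail_fraction(n=5, reps=20000, cutoff=1.96, seed=3):
    """Share of simulated t values with |t| > cutoff (n - 1 degrees of freedom)."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(sample, 0.0)) > cutoff:
            exceed += 1
    return exceed / reps


fraction = tail_fraction()
print("share of |t| > 1.96 with 4 degrees of freedom:", fraction)
```

For a standard normal variable this share would be 0.05. With 4 degrees of freedom it comes out well above that, because the t-distribution has heavier tails, so a larger critical value is needed for the same confidence coefficient.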


Confidence Interval Where sigma is Unknown: Small Sample (n < 30)

If the confidence coefficient is set equal to (1 - alpha), the confidence interval for the population mean is:

xbar ± t(alpha/2) s / sqrt(n)

where t(alpha/2) is the value of a t variable (with n - 1 degrees of freedom) that is exceeded with a probability of alpha / 2.


Example 5

n = 16

xbar = 20

s = 4

a 95% C.I. is given by:

(1 - alpha) = 0.95

thus,

alpha = 0.05

and

alpha/2 = 0.025

from Table A6 with degrees of freedom = n-1 = 16-1 = 15

we get a t-value of 2.131.

so

20 ± 2.131 (4 / sqrt(16)) = 20 ± 2.131

The 95% confidence interval is

17.869 to 22.131
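The small-sample interval can be sketched as a function that takes the t value from the table as an input. The function name is hypothetical, and the t value 2.131 is the table value quoted in Example 5 rather than something the code computes.

```python
import math


def t_confidence_interval(xbar, s, n, t_critical):
    """Interval xbar +/- t(alpha/2) * s / sqrt(n), with t supplied from a table."""
    half_width = t_critical * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Example 5: n = 16, xbar = 20, s = 4, t0.025 with 15 degrees of freedom = 2.131.
low, high = t_confidence_interval(xbar=20.0, s=4.0, n=16, t_critical=2.131)
print(round(low, 3), round(high, 3))
```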


Example 6

What is the mean length of life of a light bulb? A sample of 9 bulbs is tested, with the following lifetimes in hours:

5,000 5,100 5,400
5,200 5,400 5,000
5,300 5,200 5,200

Calculate a 90% confidence interval.

solution

First calculate the sample mean and standard deviation:

xbar = 5,200
s = 150

because n = 9 (a small sample), use

xbar ± t(alpha/2) s / sqrt(n)

(1 - alpha) = 0.90

thus,

alpha = 0.10

and

alpha/2 = 0.05

Look up t0.05 with 8 degrees of freedom from Table A6.

t0.05 = 1.86

so

5,200 ± 1.86 (150 / sqrt(9)) = 5,200 ± 93

or

5,107 to 5,293 hours.
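The whole of Example 6 can be reproduced from the raw data. This sketch uses Python's standard `statistics` module for the sample mean and standard deviation; the t value 1.86 is the table value quoted above, and the variable names are illustrative.

```python
import math
import statistics

# Lifetimes of the 9 bulbs, in hours.
lifetimes = [5000, 5100, 5400, 5200, 5400, 5000, 5300, 5200, 5200]

n = len(lifetimes)
xbar = statistics.mean(lifetimes)   # sample mean
s = statistics.stdev(lifetimes)     # sample standard deviation (n - 1 divisor)

t_critical = 1.86                   # t0.05 with 8 degrees of freedom, from the table
half_width = t_critical * s / math.sqrt(n)
low = xbar - half_width
high = xbar + half_width
print(xbar, s, round(low), round(high))
```

The sample mean is 5,200 hours and the sample standard deviation is 150 hours, which gives the interval 5,107 to 5,293 hours.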