Algebra and Calculus of Normal Distributions
|
Statistical Topic:
Understanding the distributions that describe the population under investigation
is essential to understanding statistics. Two of these distributions, which
we will study in this exercise, are the normal and the standard normal distributions. |
Student Issue:
How do states compare in per capita personal income? What does a population
data set of state income values look like? |
Data Set:
Table 1. Per Capita Personal Income
by State for 1997 contains U.S. Census Bureau estimates of personal income. |
Goals of Data Analysis Lab:
Using personal income by state as a population data set, we can compare
the histogram for these values with the histogram for the z-scores of these
values. We can also compute the parameters for each of these distributions.
Using the function for the normal curve (see your textbook) for this data
set, a smooth curve is generated whose area under the
curve is equal to 1. We can do this for both the state income data and
the z-scores for the state income data. By using the trapezoidal rule to
find areas under the theoretical standard normal curve, we approximate
these areas and compare our approximations with the areas under the curve for
the z-distribution in our textbook. |
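A minimal Python sketch of the computations named in these goals: population mean and standard deviation, z-scores, and the height of the normal curve f(x). The five income values are made-up placeholders, not the Table 1 figures, and the function names are illustrative assumptions rather than anything from the lab materials.

```python
import math

# Hypothetical placeholder incomes in dollars (NOT the Table 1 values).
incomes = [20000.0, 22000.0, 25000.0, 27000.0, 31000.0]

def population_mean(values):
    """Population mean (mu)."""
    return sum(values) / len(values)

def population_sd(values):
    """Population standard deviation (sigma): divide by N, not N - 1."""
    mu = population_mean(values)
    return math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))

def z_scores(values):
    """Convert each value to z = (x - mu) / sigma."""
    mu = population_mean(values)
    sigma = population_sd(values)
    return [(v - mu) / sigma for v in values]

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

mu = population_mean(incomes)
sigma = population_sd(incomes)
z = z_scores(incomes)

print("income mean and sd:", mu, sigma)
print("z-score mean and sd:", population_mean(z), population_sd(z))

# Curve heights for the income scale and for the z-score scale.
income_heights = [normal_pdf(x, mu, sigma) for x in incomes]
z_heights = [normal_pdf(v) for v in z]
```

Whatever the data, the z-scores computed this way have mean 0 and standard deviation 1 (up to rounding), which is what makes the standard normal curve a common reference.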
Statistical Techniques:
-
Mark the income for three states (one is your state) on Figure 1 on Table
1. This graph is a population histogram for our state income
data set. Notice that it is almost normally distributed. Mark
the mean and standard deviation on Figure 1 (use Greek letters).
Compute z-scores for these three states and place these z-scores on
Figure 2. Is the shape of these two distributions
different? Why or why not?
-
Using the personal income for each state and the function for the normal
distribution f(x), we can also draw a scatterplot that forms a smoothed
curve for this population data set. Interpret
this graph, Figure 3 on Table 2, giving approximate
values for the middle and spread and describing the shape and any outliers.
-
Using the z-scores for the personal income for each state and the function
for the standard normal distribution, we can draw another scatterplot, as
above, shown in Figure 4 on Table 2.
How does the shape of this graph compare with the one drawn in Figure
3?
-
Changing from a real data set to the theoretical function for the standard
normal, we can compute areas under the curve, or probabilities that an
observation will fall between a certain pair of z-scores. Use the
formula for the area of a trapezoid (Area = 1/2*width*(height on one side
+ height on other side)) and Table 3
to compute areas under the curve for the theoretical standard normal
distribution. Note that h1 and h2 are the heights of the trapezoid,
which are the values of the function for the z-distribution at the two ends of the interval.
The value of h, the width, will always be .5 since we are working in .5 increments.
Compute and mark on Figure 5 the area under
the curve for every .5 z-score interval from -3 to 3.
-
Using these computed areas for each .5 interval, calculate the areas under
the curve for -1 <= z <= 1, -2 <= z <= 2 and -3 <= z <= 3.
How close do your values come to approximating the areas under the curve
for the z-distribution in your textbook? Since all the areas represent
probabilities, what is the probability that z >= 2?
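One possible way to check the hand computations in the last two steps is sketched below in Python. It applies the trapezoid formula to the standard normal function over every .5 interval from -3 to 3 and then adds the pieces. The function names are illustrative assumptions, and the layout of Table 3 is not reproduced here.

```python
import math

def standard_normal_pdf(z):
    """Height of the theoretical standard normal curve at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def trapezoid_area(h1, h2, width):
    """Area = 1/2 * width * (h1 + h2), where h1 and h2 are the two heights."""
    return 0.5 * width * (h1 + h2)

def interval_areas(start=-3.0, stop=3.0, width=0.5):
    """Return (left, right, area) for each interval of the given width."""
    n = round((stop - start) / width)
    areas = []
    for i in range(n):
        left = start + i * width
        right = left + width
        h1 = standard_normal_pdf(left)
        h2 = standard_normal_pdf(right)
        areas.append((left, right, trapezoid_area(h1, h2, width)))
    return areas

def area_between(lo, hi, areas):
    """Add up the trapezoids that lie entirely inside lo <= z <= hi."""
    return sum(a for left, right, a in areas if left >= lo and right <= hi)

areas = interval_areas()
for left, right, a in areas:
    print(f"{left:5.1f} to {right:5.1f}: {a:.4f}")

for k in (1, 2, 3):
    print(f"-{k} <= z <= {k}: {area_between(-k, k, areas):.4f}")

# By symmetry, P(z >= 2) is half of the area outside -2 <= z <= 2,
# taking the total area under the curve to be 1.
p_upper = (1 - area_between(-2, 2, areas)) / 2
print(f"P(z >= 2) is approximately {p_upper:.4f}")
```

Because the curve bends downward near its center, trapezoids with .5 widths fall slightly below it there, so the sums for the central ranges should come out a little smaller than the textbook table values.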
|