Algebra and Calculus of Normal Distributions 

Statistical Topic:
Understanding the distributions that describe the populations under investigation is essential to understanding statistics.  Two of these distributions, which we will study in this exercise, are the normal distribution and the standard normal distribution.
Student Issue: 
How do states compare in per capita personal income? What does a population data set of state income values look like?
Data Set:
Table 1.  Per Capita Personal Income by State for 1997 contains U.S. Census Bureau estimates of personal income. 
Goals of Data Analysis Lab: 
Using personal income by state as a population data set, we can compare the histogram for these values with the histogram for the z-scores of these values.  We can also compute the parameters for each of these distributions.  Using the function for the normal curve (see your textbook) with the parameters of this data set, we can generate a smooth curve whose total area under the curve equals 1.  We can do this for both the state income data and the z-scores for the state income data.  Finally, by using the trapezoidal rule on the theoretical standard normal curve, we approximate areas under that curve and compare our approximations with the areas for the z-distribution given in our textbook.
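For reference, the functions named above can be written as follows. This is the usual textbook notation, assumed here rather than taken from this lab: x is a state's income, and the Greek letters mu and sigma are the population mean and standard deviation. Your textbook may use slightly different symbols.

```latex
% z-score of an observation x
z = \frac{x - \mu}{\sigma}

% normal curve with mean \mu and standard deviation \sigma
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}

% standard normal curve (\mu = 0, \sigma = 1)
f(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{z^2}{2}}
```

Substituting mu = 0 and sigma = 1 into the second formula gives the third, which is why the z-score curve has the same shape as the income curve on a rescaled axis.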
Statistical Techniques:
  1. Mark the income for three states (one should be your state) on Figure 1 on Table 1.  This graph is a population histogram for our state income data set.  Notice that it is almost normally distributed.  Mark the mean and standard deviation on Figure 1 (use Greek letters).  Compute z-scores for these three states and place these z-scores on Figure 2.  Is the shape of these two distributions different?  Why or why not?
  2. Using the personal income for each state and the function for the normal distribution f(x), we can also draw a scatterplot of f(x) against income, which gives a smoothed curve for this population data set.  Interpret this graph, Figure 3 on Table 2, giving approximate values for the middle and spread and describing the shape and any outliers.
  3. Using the z-scores for the personal income for each state and the function for the standard normal distribution, another scatterplot can be drawn in the same way, as shown in Figure 4 on Table 2.  How does the shape of this graph compare with the one drawn in Figure 3?
  4. Changing from a real data set to the theoretical function for the standard normal, we can compute areas under the curve, which are the probabilities that an observation will fall between a given pair of z-scores.  Use the formula for the area of a trapezoid (Area = 1/2 * width * (height on one side + height on the other side)) and Table 3 to compute areas under the curve for the theoretical standard normal distribution.  Note that h1 and h2 are the heights of the trapezoid, which are the values of the function for the z-distribution at the two ends of the interval.  The width h will always be .5 since we are working in .5 increments.  Compute and mark on Figure 5 the area under the curve for every .5 z-score interval from -3 to 3.
  5. Using these computed areas for each .5 interval, calculate the areas under the curve for -1 <= z <= 1, -2 <= z <= 2 and -3 <= z <= 3.  How close do your values come to the areas under the curve for the z-distribution in your textbook?  Since all the areas represent probabilities, what is the probability that z >= 2?
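
The trapezoid computations in steps 4 and 5 can also be checked with a short program. The following Python sketch is only one possible way to do this and is not part of the lab materials; the function names (standard_normal_pdf, trapezoid_areas, area_between, upper_tail) are made up for illustration, and the tail calculation assumes the curve is symmetric with total area 1.

```python
import math


def standard_normal_pdf(z):
    """Height of the theoretical standard normal curve at z."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def trapezoid_areas(start=-3.0, stop=3.0, width=0.5):
    """Return (left, right, area) for each interval of the given width."""
    n = round((stop - start) / width)
    areas = []
    for i in range(n):
        left = start + i * width
        right = left + width
        h1 = standard_normal_pdf(left)   # height on one side
        h2 = standard_normal_pdf(right)  # height on the other side
        areas.append((left, right, 0.5 * width * (h1 + h2)))
    return areas


def area_between(areas, lo, hi):
    """Add up the trapezoid areas for intervals lying inside [lo, hi]."""
    return sum(a for left, right, a in areas if left >= lo and right <= hi)


def upper_tail(areas, z):
    """Approximate P(Z >= z), assuming symmetry and a total area of 1."""
    return (1.0 - area_between(areas, -z, z)) / 2.0


areas = trapezoid_areas()
for left, right, a in areas:
    print(f"{left:5.1f} to {right:5.1f}: {a:.4f}")
for lo, hi in [(-1, 1), (-2, 2), (-3, 3)]:
    print(f"{lo} <= z <= {hi}: {area_between(areas, lo, hi):.4f}")
print(f"P(z >= 2) is about {upper_tail(areas, 2):.4f}")
```

Because the trapezoids are straight-line approximations with a fairly wide .5 step, the printed areas should be close to, but not exactly equal to, the textbook values; a smaller width would bring them closer.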