Randomness of Ages of U.S. Residents
|
Statistical Topic:
Understanding the concept of randomness is crucial to theories on how
to draw inferences about a population by analyzing only a single, smaller
sample. Random number generators such as dice and
random number tables produce results that appear to be controlled
entirely by chance. Students will learn the difference between theoretical,
empirical and experimental probability and how to present results using
a probability distribution histogram. Differences in randomness
with small numbers and large numbers will also be considered. |
Student Issue:
Understanding the age structure of U.S. residents is key to understanding
the labor market, number of college students, number of children in schools,
consumer markets and almost every aspect of the U.S. economy. The
pattern of violent victimizations (rape, sexual assault, robbery with or
without injury, aggravated assault with injury, and simple assault with minor
injury) across age groups is not random: teenagers ages 12-19 are
victims of crime almost 42 times more often than senior citizens 65 years
and older. |
Data Set:
To estimate the number of U.S. residents in 1997 by age group, consult
Table 1, Resident Population. |
Research Questions:
What are the different types of probability: theoretical, empirical,
and experimental?
What are random numbers and how are they generated?
What happens to probabilities when small numbers are involved? Large
numbers? |
Statistical Techniques:
-
The ages of U.S. residents 5 and over are presented in Table 1 in class
intervals of equal width (15 years), starting with age 5. If the number
of residents were equal in each of the five age intervals, we would
expect the probability for each interval to equal 1/5. Draw
a probability distribution with the age intervals on the x-axis and the
probability for each class on the y-axis. This is a theoretical
probability distribution, and its shape is called uniform or rectangular.
-
Compute the actual percentage, or probability, for each age interval from
the number of U.S. residents reported by the Bureau of the
Census, and enter this value in Table 1. Draw a probability distribution
with the age class intervals on the x-axis and the empirical (based
on real observations or numbers) probabilities on the y-axis.
Compare this probability distribution with the one drawn in question 1.
Can you think of factors that would cause the empirical probabilities
to differ from the theoretical probabilities?
-
Suppose we decide to simulate an experiment in which we repeatedly generate
a person whose age comes from one of the five groups with equal
(theoretical) probability. We could use a random number generator such as
a single die, which has an equal chance of showing a 1, 2, 3, 4, 5, or 6.
Number the age intervals 1, 2, 3, 4, and 5. Next, throw the die until you
have 20 random numbers, ignoring any 6 and throwing again. Use these numbers
to determine the age group for each of the 20 fictitious people
you have created. Compute the experimental probabilities (based on
the randomness of tossing the die) for each age group. Draw a histogram
(probability distribution) for your twenty people. How does this
histogram compare with the distributions drawn in questions 1 and 2? Note:
if you do not have a die, you can use a random number table, found
in the back of most statistics texts. Select any column or row of one-digit
numbers and choose only those numbers ranging from 1 to 5.
-
Have your instructor collect the counts for all age groups from each
student and draw a histogram for all the experimental data collected.
How does this probability distribution compare with those drawn previously,
especially the one in question 3? Why do you think these distributions
are different? If you did this 248,488,000 times (the number of people in the
U.S.), which probability distribution should your results most closely reflect?
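
For classes with access to a computer, the four steps above can also be sketched in Python. This is only an illustrative sketch, not part of the original exercise: the function names are hypothetical, the seed 1997 is an arbitrary choice that makes a run repeatable, and the empirical step assumes you supply the counts from Table 1 yourself. A roll of 6 is thrown again, mirroring the instruction to use only the numbers 1-5.

```python
import random
from collections import Counter
from fractions import Fraction

NUM_GROUPS = 5  # the five age intervals, numbered 1-5 as in the exercise


def theoretical_probabilities(num_groups=NUM_GROUPS):
    """Uniform distribution: every age interval has probability 1/num_groups."""
    return {g: Fraction(1, num_groups) for g in range(1, num_groups + 1)}


def empirical_probabilities(counts):
    """Turn observed counts per age interval (e.g. from Table 1) into proportions."""
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}


def roll_age_group(rng):
    """Throw a six-sided die, throwing again on a 6 so only 1-5 can come up."""
    while True:
        face = rng.randint(1, 6)
        if face <= NUM_GROUPS:
            return face


def experimental_probabilities(num_people, rng):
    """Simulate num_people fictitious people and return the share in each group."""
    tally = Counter(roll_age_group(rng) for _ in range(num_people))
    return {g: tally[g] / num_people for g in range(1, NUM_GROUPS + 1)}


if __name__ == "__main__":
    rng = random.Random(1997)  # arbitrary fixed seed so a run can be repeated
    # Compare one student's 20 throws with much larger pooled samples.
    for n in (20, 2000, 200000):
        probs = experimental_probabilities(n, rng)
        print(n, {g: round(p, 3) for g, p in probs.items()})
```

Running the script prints the experimental probabilities for 20, 2,000, and 200,000 simulated people, so you can see how far each set strays from 1/5. To get the empirical distribution, pass the Table 1 counts to empirical_probabilities as a dictionary keyed by age interval number.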
|
Social Commentary:
-
The next time you visit a mall or restaurant, look around and mentally
estimate the age groups of about 10-20 people. Which of the probability
distributions you have produced in this exercise best fits this group of people?
Why do you think you got the results you did?
|