19 October 2000
Sociology 208
Statistics for Sociologists
Name_________________________________
I pledge I have neither given nor received unauthorized help.
Signature________________________________
MIDTERM
Each part of a problem counts equally. The total number of points possible is 29. Show your equations and calculations: arithmetic errors will not count against you if the procedure is correct. This is an open-book exam, and you may also use one 4" x 6" notecard of notes. Feel free to use a calculator. Be sure to indicate units as appropriate. If you have questions about the wording of a problem, please ask me for clarification. You may want to look through all the questions and do the ones easiest for you first, to save time for the ones that take more thinking or calculation.

1. The university just published new rates for accidental death and dismemberment insurance (University Gazette, Oct. 11, 2000: p. 6). For the purposes of this question, these are the only 3 options possible. In a department with 20 employees (staff and faculty), there is the following distribution: 

payoff if killed in an       employee's cost     number of employees
accident or dismembered      per month           selecting option
$200,000                     $3.60               4
$250,000                     $4.50               7
$300,000                     $5.40               9

 
The administrative manager asks you to analyze what the new rates will mean for the department's employees.

(a) Draw a graph of the cumulative frequency distribution for cost/month for this department.
 
 
 
 

(b) What is the mean cost/month for employees in this department?
 
 
 
 
 
 

(c) What is the standard deviation? 
 
 
 
 
 
 
 
 

(d) (1) If the probability of accidental death or dismemberment is .001 per month, what is the expected payoff from paying for $300,000 worth of insurance? (2) Does this make the insurance a good deal or a bad deal?
 

2. In Soc. 208, Fall 2000, the means and standard deviations for the first two assignments were as follows:

 

                          Assignment 1     Assignment 2
total points possible     61               63
mean                      55.70            58.42
standard deviation        1.93             2.17

The TA is worried that his grading may have been wilder (i.e., more variable) on the first assignment. (a) What would you tell him? (b) How did you determine the relative variability for the two assignments?
 
 
 
 
 
 
 
 
 

 

3. You get the following printout for family incomes of a small town:
family income/month              cumulative percentage
(to nearest 1000 dollars)        of families
$2000                            10%
$3000                            24%
$4000                            56%
$5000                            79%
$6000                            100%

Draw a basic box plot for this town. Label the parts. (Don't worry about special codes for outliers, if there are any. Also, assume this is the whole distribution, that income is not grouped into categories.)
 
 
 
 
 
 
 
 
 
 

4. As a family sociologist/demographer, you are curious about the living arrangements of people in the U.S. In particular, you want to know about gender and how many generations live in the household. A helpful colleague at the U.S. Census Bureau gives you the following bivariate probability distribution:

                              X = gender of householder
Y = # of generations
in household                  x1 = man     x2 = woman     Total
y1 = 1                        .28          .30            .58
y2 = 2                        .15          .24            .39
y3 = 3                        .01          .02            .03
Total                         .44          .56            1.00

(a) Using the marginal probabilities for # of generations, find E{generations}.

(b) What is σ²{generations}?
 
 
 

(c) (1) What is the standardized value for Y=1? (2) What is its probability?
 
 
 
 
 

(d) What is P(Y=2 ∩ X=2)?
 
 
 
 

(e) What is P(Y=2 | X=2)?
 
 
 
 

(f) What is P(Y=2 ∪ X=2)?
 
 
 
 
 

(g) (1) Are X and Y independent? Show why or why not. (2) Briefly, what does this mean?
 
 

5. You are given the following probability density function for earthquake intensity. (We don't have noticeable earthquakes that often in N.C. Hurricanes, tornadoes, and floods are a different matter.)

[Figure: probability density function for earthquake intensity, X]

What is P(2<X<4)?
 
 
 
 

6. For three N.C. counties, there is the following information for number of voters in a presidential election:

County     E{# voters}     σ{# voters}
1          100             20
2          200             10
3          150             7

(a) What is the expected value of total number of voters among the three counties?
 
 
 
 
 

(b) What is the variance of the total number of voters among the three counties?
 
 
7. You are helping a friend in Journalism and Mass Communication with a project about TV viewing. Unfortunately, you have a data set that does not include hours of TV watched. However, you know from media theory that the relationship between years of education and hours of TV watched is as follows:

TVHOURS = 5.7 - .2(EDUCATION).

The data with which you are working do include years of education, and you get
E{EDUCATION}=13.2 years
σ²{EDUCATION}=8.6 years squared.

Based on what you know, what would be the (a) expected value and (b) variance for TVHOURS?
 
 
 
 
 
 
 
 
 
 

8. Attached is a Stata printout for the variable SEI. Socioeconomic status is a favorite variable among stratification researchers, ranging from 0 (very low status) to 100 (high status). From this printout, draw what you think the distribution looks like. Remember to consider skew and peakedness measures.

 

9. In a Soc. 208 class, there is the following age distribution.

ID   Age
1     54
2     51
3     48
4     47
5     29
6     28
7     28
8     25
9     24
10   24
11   24
12   22
13   21

(a) Draw a stem-and-leaf plot for this distribution. (b) What does this show about the age distribution?


 
 
 
 

________________________________________________________________________________________________________
Active log Wed Oct 18 12:23:38 2000 Page 1
 

                                                       STATA  tm
                                                      Statistics/Data Analysis
set more 1

1.  use "Z:\Sociology\faculty\rachelr\soc208\gss96.dta", clear
    (1972-1996 General Social Survey Cumulative File)

2.  summ sei, detail

                 RESPONDENT SOCIOECONOMIC INDEX
 --------------------------------------------------------------------------
 
       Percentiles      Smallest
  1%       18.5           17.1
  5%       26.4           17.1
 10%       28.4           17.1        Obs                 2781
 25%       32.3           17.1        Sum of Wgt.         2781

 50%       38.9                       Mean            47.85451
                        Largest       Std. Dev.       18.99283
 75%       63.5           97.2
 90%       76.4           97.2        Variance        360.7276
 95%       82.7           97.2        Skewness        .5971111
 99%       92.3           97.2        Kurtosis          2.2015
 
