How to Do an Election Survey
The following statement is true, even though almost everyone involved in election polling denies it.
Editors, pollsters, and pundits will curse, evade, or ignore the truth of that statement, sometimes with great heat. But it is still true. And if you are going to do election surveys, you might as well get used to this simple fact: your success or failure will be judged by how well your poll predicts the outcome. It is a reasonable test, and a fair one.
In polling, there are not many opportunities to test the validity of a poll against the real-world values that the poll is supposed to approximate. The reason is as obvious as it is simple: if we could measure the real-world values, we wouldn't need a poll in the first place.
There are two exceptions to this general rule. One is the United States Census, which does a valiant job of attempting to reach everybody and collect some basic information about each person. One way to check on the validity of a poll is to compare the demographics of the poll's sample with the census demographics of the population from which the sample is drawn. This method works best in years close to the census.
The other opportunity for such an external test of validity comes with every election. If 56 percent of a sample says it is going to vote for Crockett for mayor, and if the sampling error is four percentage points, and if Crockett gets between 52 and 60 percent of the vote, the poll has passed the test. Such a test separates the wimps from the stout-hearted in a way that a comparison with the census does not. If the census says that a population is 22 percent black and the poll only finds 15 percent, the pollster will merely shrug and crank in a statistical weight bringing blacks up to their quota. (An ethical pollster will report that procedure.) But the pressures of election reporting compel the poll to be published before the validity check, which is poetic justice indeed. There is no way to hide from it or cover it up with cosmetic weighting procedures.
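The Crockett test can be written down as a one-line check. This is only a sketch: the function name is made up for illustration, and it treats the sampling error as a simple plus-or-minus band around the poll figure.

```python
def within_sampling_error(poll_pct, error_pts, actual_pct):
    """Return True if the election result falls inside poll_pct +/- error_pts."""
    return (poll_pct - error_pts) <= actual_pct <= (poll_pct + error_pts)


# Crockett polled at 56 percent with a four-point sampling error.
print(within_sampling_error(56, 4, 53))  # inside the 52-to-60 band
print(within_sampling_error(56, 4, 61))  # outside the band
```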
Watch the newspapers in the last two weeks of any national election. You will find that the weaker polls suddenly become silent. Ask why, and you get a variety of reasons: the polling budget ran out, polling distracts attention from the issues, it might affect the outcome of the election, etc. Don't believe any of them. The real reason is that the pollster doesn't want the poll to be compared to the election. By stopping two weeks in advance, he or she can claim that the poll was accurate as of the time it was taken, no matter how the election turns out. If the election results differ from the poll results, the voters must have changed their minds. As, indeed, voters sometimes do.
An election poll is, of course, good for other things than predicting the outcome of elections. It can show what issues are motivating the voters. It can measure familiarity with the issues and the candidates. It can show what coalitions are being formed or renewed. It can provide insights into candidate strategy that are in turn derived from the candidate's own polls.
But to do any of these good things, the poll must be a valid representation of the participating electorate. And the election is the check on whether it succeeds at that. This chapter is about the things you can do to make certain that your own poll matches the election outcome.
Very few organizations are content with single preelection polls. If one poll is newsworthy, a lot of polls are a lot more newsworthy. To pay for more polls, there is a strong temptation to cut costs in procedure, and sampling is a frequent target for cutting. Because personal, in-home interviews are too slow for the competitive needs of modern media, virtually all election polls are done now by telephone.
Drawing a representative sample of telephone households is easy using the random-digit methods discussed in chapter 6. The issue is how far down the respondent selection process randomness needs to be preserved. In the 1980s, many polls used quota selection to choose a respondent within the random household. This practice had theoretical justification in the 1970s because members of the same household tended to have similar political views and to vote alike. Ronald Reagan changed that. Two trends, the growth of the women's movement and the rightward movement of the Republican Party, combined to make voting behavior correlate with gender in a way that it had not done before. Journalists called the phenomenon the "gender gap" when women were suddenly more likely than men to be Democrats.
Suddenly a quota sample did not look as good. Although the gender difference could be controlled to some extent by setting quotas for males and females, the breakup of homogeneous households raises the possibility that household members might vary in other ways not controlled by the quotas. Probability sampling down to the individual level is therefore the safest procedure, but it comes at a cost. It requires taking the time to list the household members, to choose one at random, and to call back later if that person is not at home. Irving Crespi, in his excellent roundup of current wisdom on election polls, reports that failure to conduct callbacks leads to underrepresentation of Democrats -- Republicans evidently being more likely to be found at home on a typical evening.1
Polls using callbacks generally set a fixed number of attempts, usually one, two, or three, and dedicate the last night or two of a poll to finishing up the callbacks. That procedure necessarily extends the polling period, which is another cost when you are under deadline pressure.
The nerd boxes -- those little clumps of agate type that newspapers use to disclose the methods and limitations of their polls -- rarely talk about refusals. But refusals are one of the main sources of error in all polls. Their effect is far more serious than sampling error because you can't estimate their effect the way you can specify the probability of different ranges of sampling error. In fact, the maddening thing about this source of error is that it is often totally unpredictable. Often it will cause no visible error at all. This is true because of the following general rule:
To make the harmless bias rule intuitively clear, think about a barrel of apples. You want to estimate the ratio of green apples to ripe apples in the barrel. The apples were thoroughly mixed before they were packed. Therefore, you can draw your sample from the apples in the top of the barrel, which are the easiest to get, and make a good estimate. Position in the barrel does not correlate with greenness.
Now suppose that the barrel was packed by some sinister force. It did not want you to know about the green apples, and so it put them all in the bottom of the barrel. Take a sample from the top, generalize to the barrel as a whole, and you will be wrong. This time, the bias in your selection is correlated with the thing you are measuring.
The harmless bias rule sounds like a gift to pollsters. It is not. The only way to be sure that a bias is harmless is to compare the biased sample with an unbiased one -- or, as in the case of an election or a census, with the total population. A bias that is harmless in one poll can give you an unjustified sense of confidence and set you up for a disaster in the next poll. That happened to the Literary Digest in 1936. Its sample was based on telephone and motor vehicle registration lists, both indicators of relative affluence. Affluence was not correlated with voter choice in 1932, and its poll was right on the mark. Then President Roosevelt built his New Deal coalition of farmers, workers, and minorities and caused a historic party realignment that left voter choice very much correlated with economic status. The Literary Digest confidently repeated its biased sampling in 1936, was horribly wrong, and did not live to poll again. George Gallup used an unbiased sample, got it right, and became the dominant pollster of his time.
This history is worth remembering when we think about refusals. About a third of the people contacted in telephone surveys refuse to be interviewed. Refusal to participate has been shown to correlate with age (older people are less likely to consent to an interview), with education (the less educated are the most reluctant), and with urban residence (city dwellers are less likely to cooperate). Place of residence is more of a problem with face-to-face interviews than with telephone interviews.2
These factors give a Republican bias to telephone interviews. As luck would have it, they also provide some correction for another problem faced by election polls: weeding the nonvoters out of the sample. The people who don't like to take part in telephone interviews tend to be the same people who don't like to vote. This is good luck for pollsters, but it is the same kind of luck that the Literary Digest enjoyed in 1932. A benign correlation pattern could turn nasty in some future election campaign. And there is not much a pollster can do about it except watch out for issues that involve old folks, the less educated, and city dwellers. Some day they will do important things that the conventional polls will not detect. Think about that if you are a young person hoping to make a name in this business. If you can anticipate that inevitable day and avoid the bias that sends the conventional pollsters running for the hills, you could be the Gallup of the twenty-first century.
The low election participation rates in the United States make life hard for pollsters. In the 1988 presidential election, only 50 percent of the voting age population showed up at the polls. The low turnout creates two obvious problems:
1. You need a bigger sample. To get the 3 percent error margin provided by a sample of 1,000, you have to interview 2,000 people in order to end up with 1,000 voters.
2. You have to figure out which of the people in your oversized sample will belong to the voting 50 percent.
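The arithmetic behind the first problem can be sketched as follows. The error formula is the textbook one for a simple random sample at the 95 percent confidence level with a 50-50 split, which is an assumption on my part about where the "3 percent" comes from; the function names are illustrative.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Sampling error (as a proportion) for a simple random sample of size n.

    Assumes 95 percent confidence (z = 1.96) and the worst case p = 0.5.
    """
    return z * math.sqrt(p * (1 - p) / n)


def interviews_needed(voters_wanted, turnout_rate):
    """Interviews required to end up with voters_wanted actual voters."""
    return math.ceil(voters_wanted / turnout_rate)


print(round(100 * margin_of_error(1000), 1))  # about 3.1 points
print(interviews_needed(1000, 0.5))           # 2000
```

Because the error shrinks only with the square root of the sample size, losing half of your interviews to nonvoters costs you more than it might first appear.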
The second problem is by far the most difficult. Of course, you can just ask people if they plan to vote or not. The trouble with that tactic is that voting is perceived as a socially useful activity, and so respondents do not like to admit not participating. About 80 percent say they are registered, but only about 65 percent actually are. And those who are registered greatly overestimate their likelihood of voting.
Over the years, pollsters have developed a number of stratagems for weeding out the nonvoters. At one point, the Gallup Poll used nine questions to predict turnout likelihood, built them into a scale, and then went back to the public records after the election to see who really did vote. From that information, a statistical weighting procedure was devised based on the predictive power of each of the nine items. Some of the nine items were direct and some were oblique. Samples:
The best predictors were a registration question, a direct question about the respondent's plans to vote in the upcoming election, the ladder scale, and the question on frequency of voting.4 All nine questions were used to form a cumulative scale. At the top of the scale, where people gave all the pro-voting answers, turnout averaged 87 percent. At the bottom of the scale, it was only 8 percent.
The straightforward way to apply this scale, and the method used for many years by the Gallup Poll, was to array all of the respondents along the scale according to the likelihood of their voting and then estimate the turnout. If a turnout of 50 percent was expected, then the people who formed the bottom half of the sample in voting likelihood would be dropped and the prediction would be based on those who were left. The Gallup folks, leaving as little as possible to chance, also used their turnout scale to estimate the turnout itself.
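The cutoff procedure just described can be sketched in a few lines. The data layout (a turnout-scale score paired with a candidate choice) and the function name are hypothetical, and ties at the cutoff point are broken simply by the order in which the interviews happen to be listed.

```python
def cutoff_estimate(respondents, expected_turnout):
    """Keep the most likely voters and percentage their choices.

    respondents: list of (turnout_score, candidate) pairs, higher score
                 meaning more likely to vote.
    expected_turnout: expected share of the sample that will vote (0 to 1).
    """
    ranked = sorted(respondents, key=lambda r: r[0], reverse=True)
    keep = round(len(ranked) * expected_turnout)
    if keep == 0:
        raise ValueError("cutoff leaves no respondents")
    counts = {}
    for _, choice in ranked[:keep]:
        counts[choice] = counts.get(choice, 0) + 1
    return {choice: 100 * n / keep for choice, n in counts.items()}


# Made-up interviews: (score on the turnout scale, candidate choice)
sample = [(9, "A"), (8, "B"), (7, "A"), (6, "A"),
          (3, "B"), (2, "B"), (1, "B"), (0, "A")]
print(cutoff_estimate(sample, 0.5))
```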
This bootstrapping is effective, but a hardworking pollster who goes to all of the trouble to interview hundreds or even thousands of people hates to throw half of those interviews away. The Election and Survey Unit at CBS News found a way to avoid that. Using the same kind of public-record data to find which survey respondents really vote, they constructed a voting-rate scale using the same principles as Gallup's but with fewer questions. Then they used these probabilities as weights across the entire sample. A person with an 89 percent probability of voting gets a statistical weight of .89. The respondent with only a 3 percent chance is weighted at .03. Every interview is used, and the probability model should, in theory, project to the election with at least as much accuracy as the model that drops the least likely voters. It worked well for the CBS News/New York Times Poll in 1988. However, Crespi reports that polling organizations that use such weighting schemes are less likely to be successful predictors of election outcomes than those that kick the low-likelihood voters all the way out of the sample. Paul Perry, the quiet statistician who guided the Gallup Poll in its years of methodological improvement after 1948, tried using a weighting model with his data and found that the results were almost identical. So he stuck to the cut-off model because of its simplicity -- a virtue when you are under deadline pressure.5
With these facts in hand, it is possible to screen out respondents who could not legally be registered. Most states purge their rolls of voters who have not voted after a certain time period. Before each election, CBS News checks the laws of each state to see what that time period is. If the person last voted six years ago, and the time period for his or her state is four years, voting in the current election is probably not possible. And those who have moved in the past two years but have not reregistered are also considered ineligible. Both groups get a voting probability of zero.
The remaining voters are divided into twelve categories, depending on their past voting frequency and their interest in the campaign. A different probability is assigned to each category. The range is 89 percent to 3 percent.
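The probability-weighting idea can be sketched the same way. This is not the CBS code, only an illustration of the principle: each interview counts in proportion to its estimated chance of voting, and ineligible respondents get a weight of zero. The probabilities in the example are made up, apart from the .89 and .03 endpoints mentioned above.

```python
def weighted_estimate(respondents):
    """Percentage each candidate's support, weighting by voting probability.

    respondents: list of (vote_probability, candidate) pairs.
    """
    totals = {}
    for prob, choice in respondents:
        totals[choice] = totals.get(choice, 0.0) + prob
    weight_sum = sum(totals.values())
    if weight_sum == 0:
        raise ValueError("all respondents have zero voting probability")
    return {choice: 100 * w / weight_sum for choice, w in totals.items()}


# Hypothetical interviews; the last respondent is ineligible (weight zero).
sample = [(0.89, "A"), (0.89, "B"), (0.50, "A"), (0.03, "B"), (0.0, "B")]
print(weighted_estimate(sample))
```

Note that nobody is thrown away here except the ineligible. The respondent at .03 still contributes, just very little.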
The CBS method is effective but complicated. Simpler methods are cheaper. Many election pollsters try to identify the likely nonvoters at the very front of the interview, so that they can get rid of them immediately. Why spend 20 minutes on the telephone with someone whose interview you are going to throw out at the analysis stage? If a person's responses indicate that he or she will not vote, the interviewer can politely end the conversation and go on to the next number.
But watch out! If the procedure calls for the interviewer to hang up without collecting the basic demographic data, the means for checking the sample against the known census characteristics is lost. So some low-budget pollsters design the survey to collect at least that much information from the likely nonvoters so that they will have a sample of the entire population to compare to census information. Weighting can be performed on the total sample to make it conform to the census, and then the nonvoters can be dropped out.
Here's another way. The need to collect demographic data from folks who are not likely to vote can be avoided without forgoing demographic weighting altogether. The Bureau of the Census asks about voting participation in its Current Population Survey. From that survey, one can derive estimates of the demographic characteristics of the voting population. Weight your sample of likely voters to conform to those demographic estimates, and you have bypassed the need to deal with nonvoters at all.
But there are a couple of problems. One is that the census survey is notoriously inaccurate. Far more people claim to have voted than actually turned out. Another potential problem is that the turnout patterns of past elections may not be the same as the pattern for the current election. And the census survey is national in scope. The turnout demographics in the city or state you are surveying might be quite different.
The old Gallup method of throwing away the unlikely voters still looks pretty good. That system was based on personal, in-home interviews. Telephone interviewing adds a complication, because the telephone itself is a kind of nonvoter screen due to its bias against the less educated. This bias, you will recall, stems from higher refusal rates among the less educated as well as the unevenness of telephone penetration. Therefore, if you expect a turnout of 50 percent and drop 50 percent from your sample, you will have dropped too many. If one-third of those called refused the interview, and if they are all nonvoters, you have already eliminated a substantial portion of the nonvoting problem.
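The arithmetic behind that warning can be made explicit. The sketch below assumes you are willing to guess what share of the refusers are nonvoters, and it treats the expected turnout as a share of everyone you tried to contact; both are simplifications, and the function name is illustrative.

```python
def share_of_sample_to_keep(expected_turnout, refusal_rate,
                            refusers_nonvoting=1.0):
    """Fraction of completed interviews expected to come from actual voters.

    expected_turnout: voters as a share of everyone contacted (e.g. 0.50).
    refusal_rate: share of contacts who refuse the interview (e.g. 1/3).
    refusers_nonvoting: assumed share of refusers who will not vote.
    """
    voters_among_refusers = refusal_rate * (1.0 - refusers_nonvoting)
    voters_interviewed = expected_turnout - voters_among_refusers
    completed = 1.0 - refusal_rate
    share = voters_interviewed / completed
    return min(1.0, max(0.0, share))


# The extreme case in the text: one-third refuse and all of them are nonvoters.
print(share_of_sample_to_keep(0.5, 1 / 3))
```

In that extreme case about three-fourths of the completed interviews come from voters, so the cutoff should drop roughly 25 percent of the sample, not 50. If the refusers vote at the same rate as everybody else, the function returns the original 50 percent.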
It is somewhere around here that polling ceases to be a science and becomes an art. You need to build some experience in your state or city to find out what kind of nonvoter correction works best. Some pollsters get by with a short, sweet method: ask one screening question about registration and one about likelihood of voting, and count only those who say they are registered and certain to vote. This method generally eliminates far fewer unlikely voters than are theoretically necessary, but the remainder tend to eliminate themselves by not having telephones or by refusing the interview. Such a shortcut is quick, it is easy, and it is cheap. Crespi, however, reports that polls that use this method do not enjoy the accuracy of those that use a tighter screen, i.e., with more questions. It did not appear to make much difference in the polls that Crespi looked at whether the questions were used as up-front screeners or applied after the fact to dump the unlikely voters.6 The after-the-fact method is more theoretically sound, however, because it gives you a full-population sample to evaluate against the census.
Most polls stick pretty close to the Gallup language with only minor variations:
Giving the party identification is essential, because party label remains one of the basic tools used by voters in arriving at a decision. The order in which the candidates are named can also make a difference. Unfortunately, there is a big gap in the literature about the nature of that difference. In an experiment with the Carolina Poll in the 1988 presidential election in North Carolina, we found a recency effect. We programmed our CATI system so that the interviewers named the Democratic candidates first for a random half of the sample and the GOP contenders were first in the other half. Bush and Quayle did better when they were named last. As it happened, they were listed last on the North Carolina ballot, and that version proved to be the most accurate in predicting the election. This outcome is at least consistent with theory. The form of the survey question should be as much like the form of the election question as possible, down to the relative positioning of the candidates. Since the vice-presidential candidate is listed on the ballot along with the presidential candidate, both names should be in the survey question.
One reason the internal order effect has never received much attention may be that it is fairly well randomized nationally. However, when the Gallup Poll used a paper ballot in personal interviews, care was taken to match the candidate order with the actual ballot in each state. That order varies because the party in power in each state generally tries to rig the ballot to its own benefit. Political folklore expects a primacy effect, meaning that the candidates mentioned first should be favored. But if the single experiment in North Carolina is indicative, they have it all wrong and are actually hurting their candidates by putting them first on the ballot.
Folklore is often well grounded, and it should not be lightly dismissed on the basis of a single experiment. Perhaps primacy effect was dominant in an earlier period, when people had more time to look at things and ponder them. Recency effect, on the other hand, could be the product of the information overload of our age. With so much data streaming through our heads, there may be less chance of any given particle staying put there for very long. If so, the most recently processed quanta could have the best chance of remaining salient. Or perhaps primacy effect dominates in written communication and recency effect in oral communication. If that is true, matching the ballot order is not important, and you should probably neutralize the effect by rotating candidate order.
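A split-half rotation of the kind used in the Carolina Poll experiment is easy to set up. The sketch below is not our CATI program; it only shows one way to give a random half of the sample each candidate order. The labels and the function name are invented for illustration.

```python
import random


def assign_question_order(respondent_ids, seed=None):
    """Randomly assign half the sample to each candidate order.

    Returns a dict mapping respondent id to "democrat_first" or
    "republican_first". With an odd sample, the extra case goes to
    the second group.
    """
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {rid: ("democrat_first" if i < half else "republican_first")
            for i, rid in enumerate(ids)}


orders = assign_question_order(range(10), seed=1)
print(orders)
```

Recording the assigned order with each interview lets you compare the two halves afterward, which is how an order effect shows up in the first place.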
One difference between an election and a poll is that in the privacy of the voting booth, all but a very few can make up their minds. In a poll a significant minority are still undecided. If the election is to be the standard by which the poll is judged, some procedure needs to be found to deal with the undecided.
As the election gets closer, the number of undecided begins to drop, but levels of 10 to 15 percent in the final days are not uncommon. Some media pollsters are perfectly happy with that. The cushion of undecided makes a nice explainer if the poll turns sour. If a poll shows a 40-40 tie with 20 percent undecided, the election results can be as lopsided as 60-40 in either direction, and a pollster can claim that his or her results were right on the money and those pesky undecided voters all went the same way. (Pollsters do make outlandish excuses. In 1989, one pollster caught with a bad prediction claimed that it was the respondents' fault for lying to his interviewers.)
Paul Perry of the Gallup Organization struggled for years with the problem of allocating the undecided and concluded that reducing their number is a lot easier than figuring out what they are going to do. The "leaning" question, the second part of the two-part Gallup question cited above, will cut the undecided by about half. Using it is simple. Just add the leaners to the decided group and report both totals, but make it clear that you are basing your election prediction on the sums with the leaners included. The procedure clearly improves predictive power.7
In personal interviews, the undecided can be reduced still more by using a secret ballot. Instead of asking the respondent how he or she plans to vote, the interviewer offers a printed ballot and displays a box with a padlock and a big label, "SECRET BALLOT BOX." The paper ballot also makes it possible to replicate the physical layout of the real ballot so that any positional biases in the actual election are built into the poll as well. This can be an important advantage in a state that allows straight-ticket voting. It is easier to simulate the straight-ticket option on paper than it is in a telephone interview. The straight-ticket choice can be offered in a telephone survey, however, and you should consider it if you are polling in a state that uses straight-ticket voting and if there are state and local candidates who might receive some coattail effect from better-known national candidates.
Even after the leaning question, a hard core of undecided voters remains. What to do about them? Paul Perry found that many of them are unlikely to vote at all. In the 1976 Gallup Poll, for example, the leaning question reduced the undecided to 5 percent. Kicking out the unlikely voters dropped it to 3.7 percent.8
Lots of different ways have been tried for getting rid of that last remnant of undecided respondents. There are complicated ways and simple ways. An example of a complicated way would be to use a powerful statistical model such as Logit or Probit to predict candidate choice from demographic and issue preference variables. These are regressionlike models that deal with categorical variables. Some pollsters have tried splitting the undecided by party preference. Nick Panagakis of Market Shares Corporation believes that in state and local elections, the undecided tend to go against the incumbent. My own work with the Carolina Poll suggests that it is hard to beat the procedure of dropping the undecided from the percentage base in making the final prediction. It has the virtue of simplicity. And it is based on a reasonable theory: that many of the hard-core undecided will not vote, and those who do will tend to divide in the same way as those who have decided. Crespi, after looking at the records of a number of pollsters, endorses this practice, although he also believes there may be something to be said for basing the division on party affiliation.9
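The two simplest steps, adding the leaners and then dropping the remaining undecided from the percentage base, can be sketched as follows. The candidate names and numbers are hypothetical, and the functions assume every figure is a percentage of the full sample with the undecided stored under the key "undecided".

```python
def add_leaners(decided, leaners):
    """Move leaners out of the undecided column and into their candidate's."""
    combined = dict(decided)
    for candidate, pct in leaners.items():
        combined[candidate] = combined.get(candidate, 0) + pct
        combined["undecided"] = combined.get("undecided", 0) - pct
    return combined


def drop_undecided(percentages):
    """Repercentage on the decided respondents only, so the total is 100."""
    base = sum(p for name, p in percentages.items() if name != "undecided")
    return {name: 100 * p / base
            for name, p in percentages.items() if name != "undecided"}


straight = {"Crockett": 44, "Opponent": 42, "undecided": 14}
with_leaners = add_leaners(straight, {"Crockett": 4, "Opponent": 5})
final = drop_undecided(with_leaners)
print(straight, with_leaners, final, sep="\n")
```

Dropping the undecided from the base is arithmetically the same as splitting them in proportion to the decided vote, which is the theory the procedure rests on.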
In considering whether to weight your data, it is important to keep this distinction in mind: Some weights are design weights built into the sampling procedure. Others are corrective weights employed after the fact to fix (or cover up) flaws in the sample. Use of the latter is controversial. Employment of the former is not.
The obvious example of a design weight in a telephone survey is household size. Most phone surveys are designed to obtain only one interview from a household that was selected with a probability of being chosen equal to that of all other households. That means that people from large households get less of a chance to be interviewed than people from small households. The inequity is easily corrected by weighting for household size. A person in a household with four adults gets counted with four times the weight of a person who lives alone.
In the old days before computers, weighting was done by replicating cards. An interview from a four-person household would have its card copied three times. SAS and SPSS make weighting easier. It is all done mathematically inside the computer. To keep the weighting from inflating sample size and throwing off your statistical tests, you can divide each person's household size by the average household size to arrive at the final weight. For example, if the average household size is two, the four-person household will have a weight of two, and the one-person household will have a weight of one-half.
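Outside of SAS or SPSS, the same normalization takes only a couple of lines. This sketch assumes you have recorded the number of adults in each responding household; the function name is illustrative.

```python
def household_weights(adults_per_household):
    """Weight each respondent by household size divided by the mean size.

    Dividing by the mean keeps the weighted sample size equal to the
    actual number of interviews.
    """
    mean_size = sum(adults_per_household) / len(adults_per_household)
    return [size / mean_size for size in adults_per_household]


weights = household_weights([4, 1, 2, 1])  # average household size is two
print(weights)       # [2.0, 0.5, 1.0, 0.5]
print(sum(weights))  # 4.0, the same as the number of interviews
```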
Corrective weights are another matter. The theory behind them is less sound. In the Carolina Poll we typically get 15 percent black respondents, even though the true proportion of blacks in the voting-age population is closer to 21 percent. Weighting the 15 percent up to 21 is based on a theoretical assumption that we know is wrong: that the blacks who cannot be reached by telephone are just like those whom we do reach. In fact, they are poorer, younger, more alienated, more likely to be male. But we do the weighting anyway. Why? It helps the election prediction. Unrepresentative blacks are better than no blacks at all.
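The corrective weight itself is just the population share divided by the sample share for each group. Here is a sketch using the 15 and 21 percent figures; the two-group breakdown and the function name are simplifications for illustration.

```python
def corrective_weights(sample_shares, population_shares):
    """Weight for each group: population share divided by sample share."""
    return {group: population_shares[group] / sample_shares[group]
            for group in sample_shares}


weights = corrective_weights(
    sample_shares={"black": 15, "other": 85},
    population_shares={"black": 21, "other": 79},
)
print(weights)
```

Each black respondent counts for about 1.4 interviews and everyone else for a little less than one, so the weighted sample stays the same size as the real one.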
The only other weight we use in the Carolina Poll is for age category. In some elections it makes a difference, as when social security is an issue. Our poll usually needs adjustment of no more than a point or two in each category.
The source for the weighting data is a biennial publication of the Bureau of the Census. Early in each election year, it produces a report for each state on the characteristics of the voting-age population. The report is based on estimates and projections from the most recent census, but it is the best estimate of what's really out there that you can get.10
Some pollsters go much more heavily into weighting. There are basically three ways to do it.
1. Sequential weighting. Put the weights in one at a time. After each weighting, run a new frequency count and check the next variable that you intend to weight, so that you can calculate how much adjustment it needs. Proceed in this manner until all of the weights are in.
Problem: one weight affects another. For example, if your first weight is to bring up the proportion of blacks, you will throw off the sex balance because black respondents are disproportionately female. Fix the gender balance, and race goes back out of line.
2. Cell weighting. Find out from census data what proportion of the population is in every possible combination of your weighting categories. If you are weighting by age, sex, race, and education, find out what the population has in each combination, cell by cell.
Problem: there can be a lot of cells. Figure four categories of age, two of sex, three of race, and five of education. That's 120 cells. Some will be very small. A few might even be empty. A few wildly deviant cases might get blown up to create some large and strange effect.
3. Rim weighting. A computer program applies weights in sequential fashion, varying the order, and checking the marginals to find the optimum solution. It stops when it finds a simple combination of weights that yields good approximations for the marginals (total frequencies) of the weight variables.
Problem: it takes a special computer program. When I last looked, neither SAS nor SPSS offered rim weighting, although they well might by the time you read this.
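If no packaged routine is at hand, the core of rim weighting can be sketched in plain Python. What follows is the common iterative form, often called raking: adjust to one variable's marginals, then the next, and repeat until the marginals stop moving. It is an assumption that this matches what any particular commercial program does, and the sample data, target shares, and function name are all made up.

```python
def rake(rows, targets, max_iter=50, tol=1e-6):
    """Iteratively adjust weights until weighted marginals match targets.

    rows: list of dicts, one per respondent, e.g. {"race": "black", "sex": "f"}.
    targets: {variable: {category: population share}}, shares summing to 1.
    Returns one weight per respondent, in the same order as rows.
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_gap = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted share of each category of this variable.
            current = {cat: 0.0 for cat in shares}
            for row, w in zip(rows, weights):
                current[row[var]] += w / total
            for cat in shares:
                max_gap = max(max_gap, abs(current[cat] - shares[cat]))
            # Scale every respondent so this variable hits its target.
            weights = [w * shares[row[var]] / current[row[var]]
                       for row, w in zip(rows, weights)]
        if max_gap < tol:
            break
    return weights


# Hypothetical sample of 20: 15 percent black, 50 percent female.
rows = ([{"race": "black", "sex": "f"}] * 2 + [{"race": "black", "sex": "m"}]
        + [{"race": "other", "sex": "f"}] * 8 + [{"race": "other", "sex": "m"}] * 9)
targets = {"race": {"black": 0.21, "other": 0.79},
           "sex": {"f": 0.52, "m": 0.48}}
weights = rake(rows, targets)
```

Because each pass corrects one variable at a time, this is really sequential weighting repeated until the adjustments settle down, which is why it avoids the back-and-forth problem described under the first method.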
The experience of the Carolina Poll suggests that it is good to be conservative about weighting. In North Carolina, at least, the telephone itself is a crude nonvoter screen. When we try to weight for education, we destroy that feature by boosting the influence of low-educated unlikely voters. Our best election predictions are produced when we weight by race and age and, when necessary, gender. Crespi reports that those are the most popular weighting variables among the polls that he examined, and that those who weight get better predictions than those who do not.11
It used to be so simple. Pollsters took their time, collected results from interviewers by mail, and reserved the more complicated procedures, including adjustments for undecided and nonvoters, for the last poll before the election. That tended to give the national polls a Democratic bias in the early going. Unlikely voters are disproportionately Democrats, and so dropping them out gave the Republicans an apparent last-minute boost.
Competition has made the national polls more honest. The nonvoter screens now kick in much earlier. CBS starts limiting its reports to what it calls "the probable electorate" with its first survey in the September before a national election. When national polls give sharply different results in late summer, the differences can sometimes be traced to differences in timing for the nonvoter screen.
Older polls differed from today's in another important way. They assumed a voter decision model that was relatively static. It assumed that once you had the voters pegged, they would not change very much. That assumption was one of the causes of the great debacle of 1948, when all the major polls predicted that Thomas E. Dewey would defeat President Harry S Truman. The pollsters quit too soon, and a lot of disaffected Democrats who had earlier vowed not to vote for Truman came back to the fold near the end of the campaign.
Today's media-driven campaigns and the related decline of party loyalties produce elections that are even more volatile. Polling data suggest that Jimmy Carter lost the 1980 election in the last 48 hours of the campaign.
Such instability makes election predictions more difficult, but the track record of the serious pollsters is still quite good. Timing is now considered critical. All of the careful methodological details discussed above take time. When do you stop and make your prediction? Can you base it on a poll taken the night before the election?
One-night polls are notoriously inaccurate. They don't allow time for the callbacks needed to get the folks who are not always at home. But you can plan a poll for three or four nights that will end in time for publication the Sunday or even the Monday before the election. By tracking the results of each night, you might pick up on gross last-minute shifts, although you can't be sure because of sampling variability and unequal completion rates for each day. Finishing up on a weekend is convenient because you can interview in the daytime to call back people who could not be reached at night. A popular schedule in 1988 was to finish callbacks on Sunday afternoon, use the early evening hours to do the final weighting and screening, and produce the final prediction in time for Monday morning newspapers.
A good reporter will give the numbers for each stage of the process: the response to the straight voting intention question, the response with leaners added, and the response with leaners added and the undecided allocated. But which should be emphasized?
The honest thing to do is to emphasize the numbers that can be compared most directly to the election outcome, i.e., the ones with leaners added and undecided allocated, the ones that add to 100 percent. It is the only way to ensure full credit for being accurate or full blame for getting it wrong. The Gallup Organization found that out years ago. The Harris Poll does it that way. So does Gordon Black, the pollster for USA Today. Some others don't. The next chapter will explore their reasons and the consequences for precision journalism.
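The three reporting stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the candidate name Bowie, and the counts are hypothetical, and the undecided are allocated in proportion to each candidate's support with leaners included, which is one possible allocation rule and not necessarily the one any particular pollster uses.

```python
def three_stage_report(firm, leaners, undecided):
    """Return candidate percentages for each reporting stage.

    firm      -- dict: candidate -> respondents naming the candidate outright
    leaners   -- dict: candidate -> respondents who lean toward the candidate
    undecided -- respondents still undecided after the leaner probe
    """
    total = sum(firm.values()) + sum(leaners.values()) + undecided
    decided = total - undecided
    if decided <= 0:
        raise ValueError("need at least one decided respondent")

    # Stage 1: straight voting intention, undecided and leaners left out.
    straight = {c: 100.0 * firm[c] / total for c in firm}

    # Stage 2: leaners added to the candidate they lean toward.
    support = {c: firm[c] + leaners.get(c, 0) for c in firm}
    with_leaners = {c: 100.0 * support[c] / total for c in firm}

    # Stage 3 (ASSUMPTION: proportional allocation). Splitting the undecided
    # in proportion to current support is the same as dividing by the
    # decided respondents only, so the shares add to 100 percent.
    allocated = {c: 100.0 * support[c] / decided for c in firm}
    return straight, with_leaners, allocated


# Hypothetical counts from a 1,000-interview poll.
straight, with_leaners, allocated = three_stage_report(
    firm={"Crockett": 400, "Bowie": 350},
    leaners={"Crockett": 50, "Bowie": 50},
    undecided=150,
)
```

Only the third set of numbers adds to 100 percent, so it is the only one that can be laid directly beside the official returns.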
Election day opens a whole new world of data collection possibilities. Newspapers and broadcasters alike have an interest in detecting the outcome ahead of the official vote count. The television networks try the hardest to do the most in this area, but newspapers with early deadlines also have an interest in early detection. There are two basic methods:
1. Projection from the early-reporting precincts.
2. Exit polls.
The first attempts of the networks to do early projections on election night were primitive. In 1960, CBS used a model based on the timing of the returns. At given intervals on election night, the computer looked at the number of Republican and Democratic votes for president and compared them to the votes at the same time in 1956. On that basis, an early prediction was produced. It said Richard M. Nixon would defeat John F. Kennedy.
What the computer couldn't know was where those votes were coming from. Kansas had introduced a faster method of vote counting between 1956 and 1960, and so the votes from Republican Kansas hit the wires at about the same time that Connecticut had come in four years before. The error was corrected when more votes came in.12
A better way is to base the model on geography. Pick either a random sample of precincts, which is the safest way, or pick a sample that you know will report early. If the sample is random, you can simply generalize from it to the whole electorate. If it is purposeful, you will have to assume that it will deviate from the whole in the same way that it did in the previous election.
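The two approaches can be sketched as follows. This is a minimal illustration, not a description of any network's model: the function names and figures are hypothetical, each precinct is represented as a (candidate votes, total votes) pair, and the purposeful-sample correction assumes the sample's deviation from the whole is an additive difference in vote share that carries over unchanged from the previous election.

```python
def share(precincts):
    """Candidate's share of the vote over a list of (candidate_votes, total_votes)."""
    candidate = sum(c for c, _ in precincts)
    total = sum(t for _, t in precincts)
    if total == 0:
        raise ValueError("no votes reported yet")
    return candidate / total


def project_from_random_sample(sample_now):
    """Random sample of precincts: generalize directly to the whole electorate."""
    return share(sample_now)


def project_from_purposeful_sample(sample_now, sample_prev, overall_prev_share):
    """Early-reporting sample: correct by last election's deviation.

    ASSUMPTION: the sample leans toward the candidate's party by the same
    number of percentage points as it did in the previous election.
    """
    previous_deviation = share(sample_prev) - overall_prev_share
    return share(sample_now) - previous_deviation


# Hypothetical returns from two early-reporting precincts.
tonight = [(300, 500), (200, 500)]        # 50 percent in the sample
last_time = [(330, 500), (220, 500)]      # 55 percent in the same precincts
projection = project_from_purposeful_sample(tonight, last_time, 0.50)
```

In this made-up case the sample ran five points ahead of the whole electorate last time, so a 50 percent showing tonight projects to about 45 percent overall.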
The method works. On election night in 1976, I hung out at the ABC studio in New York City, looking over the shoulder of David Neft, an associate of Louis Harris. Neft had prepared a model based on early-reporting precincts. When the first returns came in, I noticed that the turnout was less than in 1972 across a great variety of locations. Because 1972 had had the lowest voting participation since 1948, I filed a story that said turnout in this election was "the lowest in a generation."
My bureau chief was upset. All the wires were writing stories about what they perceived to be a high turnout. Some even talked about a record turnout. This is a common election-day phenomenon. After covering the campaign for months, wire service reporters have nothing to write about for the p.m. cycle on election day, so they call up precinct officials and ask about the turnout. The officials look out their windows, notice a long line, and say something like, "Wow, we've got a really great turnout." What they forget to allow for is that the previous election, with which they are making the mental comparison, was not a presidential election. Also, the population has grown, so it takes more bodies to equal the same percentage turnout. And, perhaps most important, a high turnout makes their jobs seem more significant. I explained all this, and my story was put on the wire. Most gratifying of all, the next issue of Columbia Journalism Review reproduced two sets of headlines side by side. One set was from some Knight-Ridder papers with my low turnout story. The other was from some bigger papers that went with the wire accounts of the imagined heavy turnout. The final figure was 53.5 percent, compared to 55.2 percent in 1972 and 60.9 percent in 1968. Not since the Truman-Dewey contest of 1948 had it been so low.
Newspapers sometimes need the benefit of election-night projections if they are in slow-reporting areas and have early deadlines. Election night was a constant source of frustration for the Detroit Free Press until Mike Maidenberg introduced an early-projection system in 1969. Maidenberg, who later became a publisher, used a probability sample of 20 out of Detroit's 1,111 precincts. That is not very many, but he reduced the risk by stratifying on two dimensions.
Race was a factor in that election. Richard Austin, a black candidate, was running against Sheriff Roman Gribbs. Detroit had never had a black mayor before. To get good representation of racial attitudes, Maidenberg drew the sample from a list that had been divided into two groups, according to percentage of vote cast for George Wallace in the 1968 presidential election. One group was above average in Wallace support and the other was below.
Within each group, precincts were ordered according to their historic Democratic strength. Maidenberg then used a random start and a constant skip interval to draw the 20 precincts. As a check, he compared the 20 precincts to the total for the 1968 election. They were close, but not perfect. Abandoning science for art, he added two black precincts to improve the model and used it in the 1969 primary election for a toe-in-the-water, carefully hedged projection for use in early editions. The highest error in any race was less than 2 percent.
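The drawing procedure, two Wallace-support groups, ordering by Democratic strength within each group, then a random start and a constant skip interval, can be sketched like this. The field names and the function are hypothetical, "above average" is taken here to mean above the mean of the precinct-level Wallace shares, and a fractional skip interval is assumed because 1,111 does not divide evenly by 20. It does not attempt the later hand adjustment of adding precincts.

```python
import random


def draw_precincts(precincts, n, rng=random):
    """Stratified systematic sample of n precincts.

    Each precinct is a dict with hypothetical keys:
      "wallace" -- share of the 1968 vote cast for Wallace
      "dem"     -- historic Democratic strength
    """
    if not 0 < n <= len(precincts):
        raise ValueError("sample size must be between 1 and the number of precincts")

    mean_wallace = sum(p["wallace"] for p in precincts) / len(precincts)

    # Below-average Wallace precincts first, then above-average ones;
    # within each group, order by Democratic strength.
    ordered = sorted(precincts, key=lambda p: (p["wallace"] > mean_wallace, p["dem"]))

    skip = len(ordered) / n            # constant (fractional) skip interval
    start = rng.random() * skip        # random start within the first interval
    last = len(ordered) - 1
    return [ordered[min(int(start + i * skip), last)] for i in range(n)]
```

Because the list is sorted before the skip interval is applied, the sample is spread across both Wallace groups and across the range of Democratic strength, which is what reduces the risk of drawing only 20 precincts.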
Emboldened, Maidenberg revised the model for the general election, using the primary as the benchmark. He added three more precincts for a total of 25. It proved to be accurate within a percentage point. Some clerical errors made the lead of the winner, Sheriff Roman Gribbs, seem larger than it was, so the Free Press was bolder than it should have been that night. But the system quickly became routinized. When the first black mayor was finally elected in Detroit -- it was Coleman Young -- the fact that black precincts were the slowest to report made the evening especially suspenseful. Kurt Luedtke was the executive editor and Neal Shine was managing editor. The projections indicated Young would win. Here is Shine's version of what happened next:
Mike Maidenberg was not drinking champagne that night. He took copies of the Free Press down to city hall and passed them around. The election returns were being posted on a blackboard. They showed Young losing. The members of the city hall crowd looked at the blackboard, looked at the Free Press headline, and scratched their heads. Then, as they held those papers in their hands, the numbers on the board began to change. The black precincts came in, Coleman won, and it almost looked like the Free Press had caused it. "It was like magic," Maidenberg recalled.
When Maidenberg left the Free Press, he was succeeded in his election night role by Tim Kiska. Kiska developed the habit of buying a new suit to wear on each election day. It made him seem more authoritative, he explained, when he had to convince editors and writers of the right way to interpret his numbers. And if his projection should turn out wrong, he would have a new suit to wear when he went out hunting for another job.
The networks now use a combination of sample precincts and exit polls to do their early calls. Exit polls have a nobler purpose, of course. They are excellent tools for assessing the hidden forces in an election. By comparing issue preferences with candidate choice, you can find out what campaign themes made the most difference. If new coalitions are forming, you can identify them. And, since you already know who won by the time you finish the analysis, you are not distracted by the horse-race aspect.
The methodology of an exit poll is straightforward. Draw a random sample of precincts. Interview every Nth voter coming out of those precincts, using a one-page (both sides), self-administered questionnaire. Make sure the questionnaire has the words "SECRET BALLOT" in 24-point type or larger at the top. Have each interviewer carry a box with a slot in the top to hold the ballots. This kind of survey has the following obvious advantages:
1. You know that your respondents are voters, because you intercept them at the polling place.
2. Nobody is undecided at that time.
3. Sampling does not depend on households or telephones. Your sample will therefore be the best evidence of the demographic characteristics of the voters.
And you have instant reassurance that your sample is representative, because you can compare its election decision to the election itself. In all of survey research, there is no better external test of validity.
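The comparison can be made explicit with a short sketch. It assumes the textbook 95 percent margin of error for a simple random sample; an exit poll is really a sample of precincts, so the true error is somewhat larger, and the design_effect parameter (a hypothetical knob, default 1.0) is there only to mark that gap. The function names and the sample size of 600 are illustrative.

```python
import math


def margin_of_error(p, n, z=1.96, design_effect=1.0):
    """Approximate 95 percent sampling error for a proportion p from n interviews.

    ASSUMPTION: simple random sampling unless design_effect is raised above 1.
    """
    if not 0.0 <= p <= 1.0 or n <= 0:
        raise ValueError("p must be in [0, 1] and n must be positive")
    return z * math.sqrt(design_effect * p * (1.0 - p) / n)


def passes_election_test(poll_share, n, actual_share, **kwargs):
    """True if the official result falls within sampling error of the poll."""
    return abs(poll_share - actual_share) <= margin_of_error(poll_share, n, **kwargs)


# A sample of 600 showing 56 percent has a margin of about four points.
error = margin_of_error(0.56, 600)
```

With these figures, an official result of 53 percent passes the test and a result of 50 percent does not.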
There is, however, still the problem of refusals. Exit polls, like telephone interviews, are refused by about a third of those approached. Those who refuse generally do so because they are in a hurry. Evidently, being in a hurry does not correlate with voter choice, because well-administered election polls are quite accurate.
Of course, like anything else, they can be administered in a sloppy manner. Some exit pollsters sample day parts as well as precincts. For example, Precinct A might be sampled from 8:00 to 10:00 and the interviewers then moved to Precinct B for sampling from 11:00 to 1:00. That increases the number of precincts in the sample, but it adds some administrative confusion. My own preference is to go with as few as twenty-five precincts but keep the interviewers there all day to ensure a good sample from each one.
Be sure to establish a procedure that prevents the interviewers from selecting the respondents. Station at least two interviewers at each place so that one can count off the voters while another solicits the interviews. That way, the every Nth rule can be enforced. If you let the interviewers choose the respondents, they will pick people who look nice or friendly or sexy. Looking grubby or menacing could well correlate with voter preference. That might explain why the New York Times' earliest attempts at exit polling in the 1972 presidential primaries turned up undercounts of voters for George Wallace, who was then running on a segregation platform that appealed to poor whites.
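The counting interviewer's job amounts to a tally that flags every Nth voter, whatever that voter looks like. A minimal sketch follows; the class name is hypothetical, and the random starting point is an assumption added so that the first person flagged is not always the first one out the door.

```python
import random


class NthVoterCounter:
    """Tally departing voters and flag every Nth one for an interview."""

    def __init__(self, interval, start=None, rng=random):
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.interval = interval
        # ASSUMPTION: begin at a random point within the first interval
        # unless a start position is given explicitly.
        self.offset = rng.randrange(interval) if start is None else start
        self.seen = 0

    def next_voter(self):
        """Call once per departing voter; returns True if this voter is selected."""
        selected = self.seen % self.interval == self.offset
        self.seen += 1
        return selected


# Every fifth voter, starting with the third one to leave.
counter = NthVoterCounter(interval=5, start=2)
picks = [i for i in range(20) if counter.next_voter()]
```

The selection depends only on the count, so the interviewer who solicits the questionnaire never gets to decide whom to approach.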
Short and simple questions work best in the exit polls. An agree-disagree list can help sort respondents into categories on a number of key issues in a hurry. Very simple multiple-choice questions work well. A question on when the respondent made up his or her mind on the main contest is a staple of most exit polls and helps explain the dynamics of the campaign.
Some state legislatures have tried to stop exit polls by banning interviews within some minimum distance from the polling place. Such laws were found to be a violation of the First Amendment. The attempt represents the surfacing of an uneasiness with precision journalism and its fruits that will likely be seen again. The sources of that uneasiness will be explored in the next chapter.
2. Robert M. Groves and Lars E. Lyberg, "An Overview of Non-response Issues in Telephone Surveys," in Robert Groves et al., eds., Telephone Survey Methodology (New York: John Wiley & Sons, 1988), pp. 203-205.
5. Crespi, Pre-Election Polling, p. 93.
9. Crespi, Pre-Election Polling, p. 116.
11. Crespi, Pre-Election Polling, pp. 39-40.