Abstract
To date, field experiments on campaign tactics have focused overwhelmingly on mobilization and voter turnout, with far more limited attention to persuasion and vote choice. In this paper, we analyze a field experiment with 56,000 Wisconsin voters designed to measure the persuasive effects of canvassing, phone calls, and mailings during the 2008 presidential election. Focusing on the canvassing treatment, we find that persuasive appeals had two unintended consequences. First, they reduced responsiveness to a follow-up survey among infrequent voters, a substantively meaningful behavioral response that has the potential to induce bias in estimates of persuasion effects as well. Second, the persuasive appeals possibly reduced candidate support and almost certainly did not increase it. This counterintuitive finding is reinforced by multiple statistical methods and suggests that contact by a political campaign may engender a backlash.
Notes
 1.
The data set and replication code are posted online at https://dataverse.harvard.edu/dataverse/DJHopkins. Due to their proprietary nature, two variables employed in our analyses are omitted from the data set: the Democratic performance in a precinct and each respondent’s probability of voting for the Democratic candidate.
 2.
Strategies to study persuasion include natural experiments based on the uneven mapping of television markets to swing states (Simon and Stern 1955; Huber and Arceneaux 2007) or the timing of campaign events (Ladd and Lenz 2009). Other studies use precinct-level randomization (e.g., Arceneaux 2005; Panagopoulos and Green 2008; Rogers and Middleton 2015) or discontinuities in campaigns’ targeting formulae (e.g., Gerber et al. 2011).
 3.
In a related vein, Shi (2015) finds that postcards exposing voters to a dissonant argument on same-sex marriage reduce subsequent voter turnout.
 4.
Experimental studies also rely on self-reported vote choice, not the actual vote cast. This is less of a concern, as pre-election public opinion surveys like this one typically provide accurate measures of vote choice (Hopkins 2009).
 5.
Such support scores are commonly employed by campaigns. To generate them, data vendors fit a model to data where candidate support is observed, typically survey data. They then use the model, alongside known demographic and geographic characteristics, to estimate each voter’s probability of supporting a given candidate in a much broader sample. The specific model employed is proprietary and unknown to the researchers. The Pearson’s correlation with a separate measure of precinct-level prior Democratic support is 0.47, indicating the importance of precinct-level measures in its calculation in this data set. For more on the use of such data and scores within political science, see Ansolabehere et al. (2011), Ansolabehere et al. (2012), Rogers and Aida (2014) and Hersh (2015).
 6.
This age skew reduces one empirical concern, which is that voters under the age of 26 have truncated vote histories. Only 2.1% of targeted voters were under 26 in 2008, and thus under 18 in 2000.
 7.
Specifically, voters were coded as “strong Obama,” “lean Obama,” “undecided,” “lean McCain,” and “strong McCain.”
 8.
We can do additional analyses to approximate the effect of the treatment on people who actually spoke to the canvassers (the so-called Complier Average Causal Effect; see Angrist et al. 1996), and report the results in the Conclusion.
 9.
 10.
Similar results for the phone and mail treatments show no significant differences across groups.
 11.
 12.
Voters under the age of 26 would not have been eligible to vote in some of the prior elections, and might be disproportionately represented among the low-turnout groups. We have age data only for 39,187 individuals in the sample. The negative effects of canvassing in the zero-turnout group persist (with a larger confidence interval) when the data set is restricted to citizens known to be older than 26.
 13.
 14.
For example, Enos et al. (2014) find that direct mail, phone calls, and canvassing had small effects on turnout for voters with low probabilities of voting, large effects for voters with middle-to-high probabilities of voting, and smaller but still positive effects for those with the highest probabilities of voting.
 15.
Results using logistic regression are highly similar.
 16.
In separate, ongoing research, we use the turnout results described above as a benchmark with which to evaluate each of these methods.
 17.
As Little et al. (2012) explain, “weighted estimating equations and multiple-imputation models have an advantage in that they can be used to incorporate auxiliary information about the missing data into the final analysis, and they give standard errors and p values that incorporate missing-data uncertainty” (p. 1359).
 18.
 19.
To examine the performance of our model for multiple imputation, we performed tests in which we deliberately deleted 500 known survey responses from the fully observed data set (n = 12,442) and then assessed the performance of our imputation model for those 500 cases where we know the correct answer. In each case, we used the full multiple imputation model to generate five imputed data sets for each new data set, and then calculated the share of deleted responses which we correctly imputed. The median out-of-sample accuracy across the resulting data sets was 74.9%, with a minimum of 73.3% and a maximum of 76.0%. This performance is certainly better than chance alone.
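This holdout procedure can be sketched as follows; the sketch uses simulated data and a single deterministic logistic imputer as an illustrative stand-in for the paper's full multiple-imputation model, so the variable names and accuracy level are not the study's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for the fully observed survey data: a binary response
# related to a single covariate (the real model and data are much richer).
n = 12_442
x = rng.normal(size=n)
y = (x + rng.logistic(size=n) > 0).astype(int)

# Deliberately delete 500 known responses, then impute and score accuracy.
holdout = rng.choice(n, size=500, replace=False)
train = np.setdiff1d(np.arange(n), holdout)

# Fit a one-covariate logistic model by Newton-Raphson (a stand-in for the
# paper's imputation model).
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X[train] @ beta))
    grad = X[train].T @ (y[train] - p)
    hess = X[train].T @ (X[train] * (p * (1 - p))[:, None])
    beta += np.linalg.solve(hess, grad)

# "Impute" the held-out responses and compute out-of-sample accuracy.
p_hold = 1 / (1 + np.exp(-X[holdout] @ beta))
imputed = (p_hold > 0.5).astype(int)
accuracy = (imputed == y[holdout]).mean()
print(f"out-of-sample accuracy: {accuracy:.3f}")
```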
 20.
In fact, the associated p value is less than 0.002, meaning that the finding would remain significant even after a Bonferroni correction for multiple comparisons to account for the analyses of the phone and mail treatments (with three treatments, the adjusted significance threshold is 0.05/3 ≈ 0.017).
 21.
The associated 95% confidence interval spans from −3.03 to −0.60.
 22.
We could add covariates that affect only this equation without affecting the discussion below. The existence of such variables is often needed in practice for the empirical estimation of selection models, although it is not strictly required, as these models can be identified solely through parametric assumptions about the error terms.
 23.
Throughout these analyses, we drop our measure of respondents’ age, which is the only independent variable with significant missingness.
 24.
Here, \(\delta\) is set to 0.0001.
 25.
 26.
Still, even in light of this potential to underestimate variance, Demirtas et al. (2007) demonstrate that the small-sample properties of the original ABB are superior when compared to would-be corrections.
 27.
IPW requires data that are fully observed with the exception of the missing outcome. We thus set aside 20 respondents who were missing data for covariates other than age or Obama support.
References
Adams, W. C., & Smith, D. J. (1980). Effects of telephone canvassing on turnout and preferences: A field experiment. Public Opinion Quarterly, 44(3), 389–395.
Albertson, B., & Busby, J. W. (2015). Hearts or minds? Identifying persuasive messages on climate change. Research & Politics.
Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of causal effects using instrumental variables (with discussion). Journal of the American Statistical Association, 91, 444–455.
Ansolabehere, S., & Hersh, E. (2011). Who really votes? In P. M. Sniderman & B. Highton (Eds.), Facing the challenge of democracy: Explorations in the analysis of public opinion and political participation. Princeton University Press.
Ansolabehere, S., & Hersh, E. (2012). Validation: What big data reveal about survey misreporting and the real electorate. Political Analysis, 20(4), 437–459.
Arceneaux, K. (2005). Using cluster randomized field experiments to study voting behavior. The Annals of the American Academy of Political and Social Science, 601(1), 169–179.
Arceneaux, K. (2007). I’m asking for your support: The effects of personally delivered campaign messages on voting decisions and opinion formation. Quarterly Journal of Political Science, 2(1), 43–65.
Arceneaux, K., & Kolodny, R. (2009). Educating the least informed: Group endorsements in a grassroots campaign. American Journal of Political Science, 53(4), 755–770.
Arceneaux, K., & Nickerson, D. W. (2009). Who is mobilized to vote? A reanalysis of 11 field experiments. American Journal of Political Science, 53(1), 1–16.
Bechtel, M. M., Hainmueller, J., Hangartner, D., & Helbling, M. (2014). Reality bites: The limits of framing effects for salient and contested policy issues. Political Science Research and Methods (forthcoming).
Broockman, D. E., & Green, D. P. (2014). Do online advertisements increase political candidates’ name recognition or favorability? evidence from randomized field experiments. Political Behavior, 36, 263–289.
Cardy, E. A. (2005). An experimental field study of the GOTV and persuasion effects of partisan direct mail and phone calls. The Annals of the American Academy of Political and Social Science, 601(1), 28–40.
Cranmer, S. J., & Gill, J. (2013). We have to be discrete about this: A nonparametric imputation technique for missing categorical data. British Journal of Political Science, 43(2), 425–449.
Das, M., Newey, W. K., & Vella, F. (2003). Nonparametric estimation of sample selection models. The Review of Economic Studies, 70(1), 33–58.
Demirtas, H., Arguelles, L. M., Chung, H., & Hedeker, D. (2007). On the performance of bias-reduction techniques for variance estimation in approximate Bayesian bootstrap imputation. Computational Statistics & Data Analysis, 51(8), 4064–4068.
Enos, R. D., Fowler, A., & Vavreck, L. (2014). Increasing inequality: The effect of GOTV mobilization on the composition of the electorate. The Journal of Politics, 76(1), 273–288.
Enos, R. D., & Hersh, E. D. (2015). Party activists as campaign advertisers: The ground campaign as a principal-agent problem. American Political Science Review, 109(2), 252–278.
Gerber, A., Karlan, D., & Bergan, D. (2009). Does the media matter? A field experiment measuring the effect of newspapers on voting behavior and political opinions. American Economic Journal: Applied Economics, 1(2), 35–52.
Gerber, A., & Green, D. (2000). The effects of canvassing, telephone calls, and direct mail on voter turnout: A field experiment. American Political Science Review, 94(3), 653–663.
Gerber, A. S., Kessler, D. P., & Meredith, M. (2011). The persuasive effects of direct mail: A regression discontinuity based approach. Journal of Politics, 73(1), 140–155.
Gerber, A. S., & Green, D. P. (2012). Field experiments: Design, analysis, and interpretation. New York, NY: W.W. Norton and Company.
Gerber, A. S., Huber, G. A., Doherty, D., Dowling, C. M., & Hill, S. J. (2013). Who wants to discuss vote choices with others? Polarization in preferences for deliberation. Public Opinion Quarterly, 77(2), 474–496.
Gerber, A. S., Huber, G. A., & Washington, E. (2010). Party affiliation, partisanship, and political beliefs: A field experiment. American Political Science Review, 104(4), 720–744.
Gerber, A. S., Gimpel, J. G., Green, D. P., & Shaw, D. R. (2011). How large and long-lasting are the persuasive effects of televised campaign ads? Results from a randomized field experiment. American Political Science Review, 105(1), 135–150.
Glynn, A. N., & Quinn, K. M. (2010). An introduction to the augmented inverse propensity weighted estimator. Political Analysis, 18(1), 36–56.
Green, D. P., & Gerber, A. S. (2008). Get out the vote: How to increase voter turnout. Washington, DC: Brookings Institution Press.
Heckman, J. (1976). The common structure of statistical models of truncation, sample selection, and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement, 5, 475–492.
Hersh, E. D. (2015). Hacking the electorate: How campaigns perceive voters. New York, NY: Cambridge University Press.
Hersh, E. D., & Schaffner, B. F. (2013). Targeted campaign appeals and the value of ambiguity. The Journal of Politics, 75(2), 520–534.
Hopkins, D. J. (2009). No more wilder effect, never a Whitman effect: When and why polls mislead about black and female candidates. The Journal of Politics, 71(3), 769–781.
Huber, G. A., & Arceneaux, K. (2007). Identifying the persuasive effects of presidential advertising. American Journal of Political Science, 51(4), 957–977.
Imai, K., King, G., & Stuart, E. A. (2008). Misunderstandings between experimentalists and observationalists about causal inference. Journal of the Royal Statistical Society: Series A, 171(2), 481–502.
Issenberg, S. (2012). Obama Does It Better. Slate.
King, G., Honaker, J., Joseph, A., & Scheve, K. (2001). Analyzing incomplete political science data: An alternative algorithm for multiple imputation. American Political Science Review, 95(1), 49–69.
Ladd, J. M., & Lenz, G. S. (2009). Exploiting a rare communication shift to document the persuasive power of the news media. American Journal of Political Science, 53(2), 394–410.
Little, R. J., D’Agostino, R., Cohen, M. L., Dickersin, K., Emerson, S. S., Farrar, J. T., et al. (2012). The prevention and treatment of missing data in clinical trials. New England Journal of Medicine, 367(14), 1355–1360.
Little, R. J. A., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). New York, NY: Wiley.
Matland, R. E., & Murray, G. R. (2013). An experimental test for backlash against social pressure techniques used to mobilize voters. American Politics Research, 41(3), 359–386.
Michelson, M. R. (2014). Memory and voter mobilization. Polity, 46, 591–610.
Moore, R. T. (2012). Multivariate continuous blocking to improve political science experiments. Political Analysis, 20(4), 460–479.
Nicholson, S. P. (2012). Polarizing cues. American Journal of Political Science, 56(1), 52–66.
Nickerson, D. W. (2005a). Partisan mobilization using volunteer phone banks and door hangers. The Annals of the American Academy of Political and Social Science, 601(1), 10–27.
Nickerson, D. W. (2005b). Scalable protocols offer efficient design for field experiments. Political Analysis, 13, 233–252.
Nickerson, D. W. (2008). Is voting contagious? Evidence from two field experiments. American Political Science Review, 102(1), 49.
Nickerson, D. W., & Rogers, T. (2010). Do you have a voting plan? Implementation intentions, voter turnout, and organic plan making. Psychological Science, 21(2), 194–199.
Panagopoulos, C., & Green, D. P. (2008). Field experiments testing the impact of radio advertisements on electoral competition. American Journal of Political Science, 52(1), 156–168.
Rogers, T., & Nickerson, D. (2013). Can inaccurate beliefs about incumbents be changed? And can reframing change votes? HKS Faculty Research Working Paper Series RWP13-018.
Rogers, T., & Middleton, J. A. (2015). Are ballot initiative outcomes influenced by the campaigns of independent groups? A precinctrandomized field experiment showing that they are. Political Behavior, 37, 567–593.
Rogers, T., & Aida, M. (2014). Vote self-prediction hardly predicts who will vote, and is (misleadingly) unbiased. American Politics Research, 42(3), 503–528.
Rubin, D. B. (2008). For objective causal inference, design trumps analysis. The Annals of Applied Statistics, 2, 808–840.
Rubin, D. B., & Schenker, N. (1991). Multiple imputation in healthcare databases: An overview and some applications. Statistics in Medicine, 10(4), 585–598.
Rubin, D., & Schenker, N. (1986). Multiple imputation for interval estimation for simple random samples with ignorable nonresponse. Journal of the American Statistical Association, 81(394), 366–374.
Schafer, J. L. (1997). Analysis of incomplete multivariate data. London: Chapman & Hall.
Shi, Y. (2015). Cross-cutting messages and voter turnout: Evidence from a same-sex marriage amendment. Political Communication (forthcoming).
Siddique, J., & Belin, T. R. (2008a). Multiple imputation using an iterative hot-deck with distance-based donor selection. Statistics in Medicine, 27(1), 83–102.
Siddique, J., & Belin, T. R. (2008b). Using an approximate Bayesian bootstrap to multiply impute nonignorable missing data. Computational Statistics & Data Analysis, 53(2), 405–415.
Simon, H. A., & Stern, F. (1955). The effect of television upon voting behavior in Iowa in the 1952 presidential election. American Political Science Review, 49(2), 470–477.
Sinclair, B. (2012). The social citizen. Chicago, IL: University of Chicago Press.
Sinclair, B., McConnell, M., & Green, D. P. (2012). Detecting spillover effects: Design and analysis of multilevel experiments. American Journal of Political Science, 56(4), 1055–1069.
Taber, C. S., & Lodge, M. (2006). Motivated skepticism in the evaluation of political beliefs. American Journal of Political Science, 50(3), 755–769.
Van Buuren, S., Brand, J. P. L., Groothuis-Oudshoorn, C. G. M., & Rubin, D. B. (2006). Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76(12), 1049–1064.
Vavreck, L. (2007). The exaggerated effects of advertising on turnout: The dangers of self-reports. Quarterly Journal of Political Science, 2(4), 325–343.
Westfall, P. H., & Young, S. S. (1993). Resampling-based multiple testing: Examples and methods for p-value adjustment. New York, NY: Wiley.
Zaller, J. R. (1992). The nature and origins of mass opinion. New York, NY: Cambridge University Press.
Acknowledgments
This paper has benefitted from comments by David Broockman, Kevin Collins, Eitan Hersh, Seth Hill, Michael Kellermann, Gary King, Marc Meredith, David Nickerson, Maya Sen, and Elizabeth Stuart. For research assistance, the authors gratefully acknowledge Julia Christensen, Zoe Dobkin, Katherine Foley, Andrew Schilling, and Amelia Whitehead. David Dutwin, Alexander Horowitz, and John Ternovski provided helpful replies to various queries. Earlier versions of this manuscript were presented at the 30th Annual Summer Meeting of the Society for Political Methodology at the University of Virginia, July 18th, 2013 and at Vanderbilt University’s Center for the Study of Democratic Institutions, October 18th, 2013.
Appendices
Appendix
Persuasion Script
Good Afternoon—my name is [INSERT NAME], I’m with [ORGANIZATION NAME]. Today, we’re talking to voters about important issues in our community. I’m not asking for money, and only need a minute of your time.
As you are thinking about the upcoming election, what issue is most important to you and your family? [LEAVE OPEN ENDED—DO NOT READ LIST]
If not sure, offer the following suggestions:

Iraq War

Economy/ Jobs

Health Care

Taxes

Education

Gas Prices/Energy

Social Security

Other Issue
Yeah, I agree that issue is really important and that our economy is hurting many families in Wisconsin. Do you know anyone who has lost a job or their health care coverage in this economy?
I understand that a lot of families are struggling to make ends meet these days.
When you think about how that’s affecting your life, and the people running for president this year, have you decided between John McCain and Barack Obama, or, like a lot of voters, are you undecided? [IF UNDECIDED] Are you leaning toward either candidate right now?

Strong Obama

Lean Obama

Undecided

Lean McCain

Strong McCain
[If strong McCain supporter, end with:] Ok, thanks for your time this evening. [If strong Obama supporter, end with:] Great, I support Obama as well, I know he will bring our country the change we need. Thanks for your time this evening.
[ONLY MOVE TO THIS SECTION WITH LEANING OR UNDECIDED VOTERS] With our economy in crisis and job and health care losses at an all-time high, our country is in need of a change. But as companies are laying off workers and sending our jobs overseas, John McCain says that our economy is “fundamentally strong”—he just doesn’t understand the problems our country faces. McCain voted against the minimum wage 19 times. His tax plan offers 200 billion dollars in tax cuts for oil companies and big corporations, but not a dime of tax relief for more than a hundred million middle-class families. During this time of families losing their homes, McCain voted against measures to discourage predatory lenders. John McCain has never supported working families in the Senate, and there is no reason to believe he will as President.
On the other hand, Barack Obama will do more to strengthen our economy. Obama will cut taxes for the middle class and help working families achieve a decent standard of living. Obama’s tax cuts will put more money back in the pockets of working families. He’ll stand up to the banks and oil companies that have ripped off the American people and invest in alternative energy. Obama will control the rising cost of healthcare and reward companies that create jobs in the U.S.
After hearing that, how are you feeling about our presidential candidates? What are your thoughts on this?
Obama will reward companies that keep jobs in the U.S., and make sure tax breaks go to working families who need them. Barack Obama offers new ideas and a fresh approach to the challenges facing Wisconsin families. Instead of just talking about change, he has specific plans to finally fix health care and give tax breaks to middleclass families instead of companies that send jobs overseas. Obama will bring real change that will finally make a lasting improvement in the lives of all Wisconsin families.
Now that we’ve had a chance to talk, who do you think you’ll vote for in November? John McCain or Barack Obama, or are you undecided? [IF UNDECIDED] Are you leaning toward either candidate at this point?

Strong Obama

Lean Obama

Undecided

Lean McCain

Strong McCain
Thanks again for your time, [INSERT VOTER’S NAME], we appreciate your time and consideration.
Survey Questions
“Hi, I’m calling with [survey firm redacted] with a brief, oneminute, opinion survey. We are not selling anything and your responses will be completely confidential.
Now first, thinking about the election for President this November, will you be voting for Senator Barack Obama, the Democratic candidate, or Senator John McCain, the Republican candidate?

1.
Obama: Thank you. [GO TO Q2]

2.
McCain: Thank you. [GO TO Q2]

3.
VOLUNTEER ONLY Undecided/Don’t Know/Other: Thank You. [GO TO Q1]

4.
VOLUNTEER ONLY REFUSED TO ANSWER [GO TO Q1]
If the election were held today and you had to decide right now, toward which candidate would you lean?

1.
Obama

2.
McCain

3.
VOLUNTEER ONLY Completely Undecided

4.
VOLUNTEER ONLY REFUSED TO ANSWER
Finally, for demographic purposes only, in what year were you born?” [Collect four-digit year]
Additional Tables
A Formal Statement of Selection Bias
Here, we formalize the problem of sample selection. Doing so enables us to group estimators based on their underlying assumptions about how fully the observed covariates can account for the patterns of missing data.
The dependent variable of interest is \(Y_i^*\), support for Barack Obama. It is a function of the treatment (denoted \(X_{1i}\)) and a vector of covariates (denoted \(X_{2i}\)) that may or may not be observed: \(Y_i^* = \beta_1 X_{1i} + \beta_2 X_{2i} + \epsilon_i\). The treatment is randomized and is therefore uncorrelated with \(X_{2i}\) and with the error terms in both equations, assuming a sufficient sample size.
We observe \(Y_i^*\) only for those voters who respond to the survey, as indicated by the indicator variable \(d_i\). This indicator is a function of the same covariates that affect \(Y_i^*\): \(d_i^* = \gamma_1 X_{1i} + \gamma_2 X_{2i} + \eta_i\), with \(d_i = 1\) when \(d_i^* \ge 0\) and \(d_i = 0\) otherwise.
We assume the \(\epsilon\) and \(\eta\) terms are random variables uncorrelated with each other and any of the independent variables.^{Footnote 22} (Particular \(\beta\) or \(\gamma\) coefficients may be zero for variables that affect only selection or the outcome.)
We can rewrite the equation for the observed data as \(Y_i = \beta_1 X_{1i} + \beta_2 X_{2i} + \epsilon_i\), observed only when \(d_i = 1\).
The various statistical approaches for dealing with sample selection diverge regarding their assumptions about \(X_{2i}\). One common approach is to assume that \(X_{2i}\) is fully specified and observed. In such cases, we can predict the missing values for which \(d_i^* < 0\) using the observed data. Statisticians refer to this assumption as “missing at random” (Schafer 1997; King et al. 2001; Little and Rubin 2002). Under this assumption, we might then apply some form of multiple imputation, which leverages the observed covariances among the variables to impute potential values for the missing data. Given that \(X_{2i}\) is fully specified, multiple imputation can be employed to estimate missingness in an outcome variable, an independent variable, or both.
Other approaches to sample selection are unwilling to assume that \(X_{2i}\) is fully observed—in such cases, the data are instead assumed to have nonignorable missingness. These approaches turn to other assumptions, typically concerning the process that generates the missing data. If \(X_{2i}\) is unobserved, \(\beta _2X_{2i}\) will become part of the error term in the \(Y_i\) equation and \(\gamma _2X_{2i}\) will become part of the error term in the \(d_i\) equation. While \(X_{1i}\) (the randomized treatment) and \(X_{2i}\) are uncorrelated in the whole population, they are not necessarily uncorrelated in the sampled population. To see this, note that \(E[X_{2i} \mid X_{1i}, d_i = 1] = E[X_{2i} \mid \gamma_2 X_{2i} + \eta_i \ge -\gamma_1 X_{1i}]\), which depends on \(X_{1i}\) whenever \(\gamma_1 \ne 0\) and \(\gamma_2 \ne 0\).
The turnout case provides an example of how this bias can manifest itself. Suppose that the unobserved variable (\(X_{2i}\)) is unmeasured civic-mindedness, and it has a positive effect on whether someone responds to a pollster (implying \(\gamma _2>0\)) as well as a positive effect on Obama support (implying \(\beta _2>0\)). This would mean that in the observed data, the treated respondents would be more civically minded on average. Naturally, this could induce bias, as the treated, observed respondents are disproportionately high in civic-mindedness compared to observed respondents in the control group. This can explain the spurious finding in the surveyed-only column of Table 3. We know from the full data set that the treatment had no overall effect on turnout, but in the subsample of those who answered the follow-up survey, the canvass treatment is spuriously associated with a statistically significant positive effect.
Assuming \(X_{2i}\) is unobserved, two conditions must be met for sample selection to cause bias in randomized persuasion experiments with subsequent surveys:

1.
\(\gamma _1 \ne 0\). This is necessary to induce a correlation between randomized treatment and some unobserved variable in the observed sample. This can be tested and, for our data, we found \(\gamma _1 < 0\) for lowturnout types and \(\gamma _1 >0\) for middleturnout types.

2.
\(\gamma _2 \ne 0\) and \(\beta _2 \ne 0\). In other words, given our characterization of the datagenerating process, the error terms in the two equations are correlated.
If \(X_{2i}\) is not fully observed, the errors in the selection and outcome equations may be correlated. Heckman (1976) models such correlated errors by assuming that the errors in the two equations are distributed as bivariate normal random variables. This allows us to derive the value of the error term in the outcome equation conditional on being observed. Nonparametric selection models such as Das et al. (2003) approximate the conditional value of the error term with a polynomial function of the covariates. In practice, this involves fitting a firststage model that produces a propensity of being observed. Powers of this fitted propensity are then included in the outcome equation.
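A small simulation illustrates how conditions 1 and 2 above generate the problem; the coefficients are invented for illustration. When treatment depresses survey response (\(\gamma_1 < 0\)) and an unobserved trait raises both response and the outcome, treatment and the trait become correlated among responders even though they are independent in the full sample:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Randomized treatment and an unobserved trait ("civic-mindedness").
x1 = rng.integers(0, 2, size=n)   # treatment, assigned independently of x2
x2 = rng.normal(size=n)

# Survey response: treatment depresses response (gamma_1 = -0.5 < 0) while
# civic-mindedness raises it (gamma_2 = 1 > 0). Coefficients are invented.
d = (-0.5 * x1 + x2 + rng.normal(size=n)) > 0

# In the full sample, treatment and x2 are (essentially) uncorrelated...
full_gap = x2[x1 == 1].mean() - x2[x1 == 0].mean()
# ...but among responders, treated cases are more civic-minded on average.
obs_gap = x2[(x1 == 1) & d].mean() - x2[(x1 == 0) & d].mean()
print(f"full-sample gap in x2: {full_gap:+.3f}")
print(f"responder gap in x2:   {obs_gap:+.3f}")
```

Any such gap in an unobserved determinant of Obama support then contaminates the treatment–control comparison among survey respondents.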
Additional Estimation Strategies
Approximate Bayesian Bootstrap
Since nonrandom attrition threatens to bias listwise deletion models, we consider another imputation model that accounts for this possibility. In particular, we use hot deck imputation, which can be useful under three conditions satisfied by this experiment: when the missingness of interest is present primarily in a single variable, when the data contain many variables that are not continuous (Cranmer and Gill 2013), and when there are many available donor observations (Siddique and Belin 2008b). Here, we employ the particular variant of hot deck imputation outlined in Siddique and Belin (2008b): an Approximate Bayesian Bootstrap (ABB) (see also Rubin and Schenker 1986; Rubin and Schenker 1991; Demirtas et al. 2007; Siddique and Belin 2008a). That approach has the added advantage that it can relax the assumption of ignorability in a straightforward manner by incorporating an informative prior about the unobserved outcomes.^{Footnote 23} These analyses focus on the 45,875 respondents who had Catalist phone match scores, although the results are similar when instead analyzing the full data set of 56,000 respondents.
Specifically, each iteration of the ABB begins by drawing a sample from the fully observed “donor” observations, which in our example number 12,439. This step allows the ABB to more accurately reflect variability from the imputation. One can draw the donor observations with equal probability in each iteration, which effectively assumes that the missingness is ignorable conditional on the observed covariates. But importantly, researchers can also take weighted draws from the donor pool, which is the equivalent of placing an informative prior on the missing outcome data (Siddique and Belin 2008b). This allows researchers to relax the ignorability assumption, and to build in additional information about the direction and size of any bias.
Irrespective of the prior, we then build a model of the outcome using the covariates for the respondents with no missing outcome data, being sure to weight the donor observations by the number of times they were drawn in each iteration of the bootstrap. The subsequent step is to predict \(\hat{Y}\) for all observations—both donor and donee—by applying that model to the covariates X. For each observation with a missing outcome—there are 33,025 in this example—we next need to draw a “donor” observation that provides an outcome. Following Siddique and Belin (2008b), we do so by estimating a distance metric for each observation i as follows: \(D_i = (|\hat{y}_0 - \hat{y}_i| + \delta )^k\), where \(\delta\) is a positive number which avoids distances of zero.^{Footnote 24} For each missing observation, an outcome is imputed from a donor chosen with a probability inversely proportional to the distance \(D_i\). Note that as k grows large, the algorithm chooses the most similar observation in the donor pool with high probability, while a k of zero is equivalent to drawing any observation with equal probability.^{Footnote 25}
Unlike a singleshot hot deck imputation, this approach does account for imputation uncertainty—and here, we fit our standard logistic regression model to 5 separately imputed data sets and then combine the answers using the appropriate rules (Rubin and Schenker 1986; King et al. 2001). Yet there is an important potential limitation to this technique. While running the algorithm multiple times will address the uncertainty stemming from the imputation of missing observations, it will not address the uncertainty stemming from small donor pools—and the reweighting in the nonignorable ABB has the potential to exacerbate this concern (Cranmer et al. 2013).^{Footnote 26}
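One ignorable (equal-probability) iteration of this procedure can be sketched as follows. Simulated toy data and a simple linear score stand in for the study's data and outcome model, so the sizes and names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: donors with observed binary outcomes, donees with missing ones.
n_donor, n_donee = 1_000, 200
x_donor = rng.normal(size=n_donor)
y_donor = (x_donor + rng.logistic(size=n_donor) > 0).astype(int)
x_donee = rng.normal(size=n_donee)

delta, k = 1e-4, 3  # delta avoids zero distances; larger k sharpens matching

# Step 1 (ABB): bootstrap the donor pool to reflect imputation variability.
boot = rng.choice(n_donor, size=n_donor, replace=True)

# Step 2: model the outcome on the bootstrapped donors (a linear score as a
# stand-in for the paper's model), then predict y-hat for everyone.
b = np.polyfit(x_donor[boot], y_donor[boot], 1)
yhat_donor = np.polyval(b, x_donor[boot])
yhat_donee = np.polyval(b, x_donee)

# Step 3: for each donee, draw a donor with probability inversely
# proportional to the distance D_i = (|yhat_donee - yhat_donor| + delta)^k.
imputed = np.empty(n_donee, dtype=int)
for i, y0 in enumerate(yhat_donee):
    dist = (np.abs(y0 - yhat_donor) + delta) ** k
    prob = (1 / dist) / (1 / dist).sum()
    imputed[i] = y_donor[boot][rng.choice(n_donor, p=prob)]

print("imputed share of 1s:", imputed.mean())
```

Running many such iterations and combining the results with the usual multiple-imputation rules yields the estimates discussed below; a non-ignorable version would replace the equal-probability bootstrap in step 1 with weighted draws.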
We first run the Approximate Bayesian Bootstrap assuming ignorability (which means the prior is zero) and setting \(k=3\). Table 11 shows that, as we reported in the manuscript, such a model estimates the average treatment effect of canvassing to be −1.65 percentage points, with a corresponding 95% confidence interval from −3.29 to −0.01. That estimate is similar to those recovered using listwise deletion. We also report additional results after adding an informative prior which reduces the share of respondents who back Obama from 57.5% in the observed group to 54.0% in the unobserved group. We chose the magnitude of the decline (3.5 percentage points) to approximate the largest decline in survey response observed across any of the turnout groups. In other words, in light of the differential attrition identified above, 3.5 percentage points is a large but still plausible difference between the observed and unobserved populations conditional on observed covariates. Here, the estimated treatment effect becomes −1.73 percentage points, with a 95% confidence interval from −3.34 to −0.05. This result is essentially unchanged from the result with no prior. The table then presents various combinations of the prior and the k parameter, with little difference across the specifications except that reducing k below two (which means we are reducing the penalty for matching less similar observations) appears to increase the uncertainty regarding the estimated treatment effect. We also report results using all observations with, again, similar results.
Inverse Propensity Weighting
Inverse propensity weighting (IPW) is an alternative approach to dealing with attrition that uses some of the same building blocks as multiple imputation: it leverages information in the relationships among observed covariates to reweight the observed data such that they approximate the full data set (Glynn and Quinn 2010).
Specifically, we first use logistic regression on the full sample^{Footnote 27} to estimate a model of survey response. We employ the same model specification as above, with the exception that we drop our measure of age because it has substantial missingness. From the model, we generate a predicted probability of survey response for each respondent, estimates that vary from 0.13 to 0.35. For the 12,439 fully observed respondents, we then calculate the average treatment effect of canvassing, weighted by the inverse predicted probability of responding to the survey. The resulting estimate of the treatment effect of canvassing is −1.79 percentage points, with a 95% confidence interval from −3.52 to −0.05 percentage points.
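The weighting step can be sketched as follows on simulated data with invented coefficients; for brevity the response probability is taken as known rather than estimated by logistic regression as in the text:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

# Simulated data: randomized treatment and a covariate ("civic") that drives
# both survey response and candidate support. All coefficients are invented.
treat = rng.integers(0, 2, size=n)
civic = rng.normal(size=n)
support = (0.5 * civic - 0.02 * treat + rng.logistic(size=n) > 0).astype(int)

# True response probability; the paper instead estimates this with a
# logistic regression of response on covariates.
p_respond = 1 / (1 + np.exp(-(-1.2 + civic)))
respond = rng.random(n) < p_respond

# Among responders, weight each case by 1 / Pr(respond) and take the
# weighted difference in mean support between treatment arms.
w = 1 / p_respond[respond]
y, t = support[respond], treat[respond]
effect = (np.average(y[t == 1], weights=w[t == 1])
          - np.average(y[t == 0], weights=w[t == 0]))
print(f"IPW estimate of the treatment effect: {effect:+.4f}")
```

Note that this only demonstrates the mechanics: IPW corrects for attrition only when response is ignorable given the observed covariates, which is precisely the assumption at issue in this application.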
Heckman Selection
Heckman selection models assume that the errors in the selection equation and outcome equation are distributed bivariate normally. With this assumption, the expected value of the error in the outcome equation conditional on selection can be represented with an inverse Mills’ ratio. There is considerable disagreement in the literature about the appropriateness of this assumption. Some find it implausible, given that the key assumption is about the joint distribution of unobserved quantities. Others find the approach more plausible than assuming away the correlation of errors across selection and outcome equations as is done in other selection models.
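The two-step logic can be sketched on simulated data; all coefficients are invented, and for brevity the first-stage selection index is taken as known rather than estimated by probit:

```python
import math
import numpy as np

rng = np.random.default_rng(4)
n = 20_000

def Phi(z):  # standard normal CDF
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def phi(z):  # standard normal PDF
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)

# Simulated data with correlated errors across the selection and outcome
# equations (rho = 0.5); z is an exclusion-style covariate that shifts
# selection but not the outcome.
x = rng.normal(size=n)
z = rng.normal(size=n)
e_sel = rng.normal(size=n)
e_out = 0.5 * e_sel + math.sqrt(0.75) * rng.normal(size=n)

index = 0.5 + x + z              # selection index (known here; in practice
selected = index + e_sel > 0     # it would come from a first-stage probit)
y = 1.0 + 2.0 * x + e_out        # outcome, observed only when selected

# Step 2: regress y on x and the inverse Mills ratio phi/Phi on the selected
# sample; the Mills-ratio coefficient estimates rho * sigma.
mills = phi(index) / Phi(index)
X = np.column_stack([np.ones(selected.sum()), x[selected], mills[selected]])
coef, *_ = np.linalg.lstsq(X, y[selected], rcond=None)
print(f"slope on x: {coef[1]:.3f} (true value 2.0)")
print(f"Mills-ratio coefficient: {coef[2]:.3f} (rho * sigma = 0.5)")
```

A Mills-ratio coefficient near zero would indicate uncorrelated errors, which is the diagnostic role the \(\rho\) parameter plays in Table 12.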
Table 12 shows results from several specifications of a Heckman selection model. In the first column, no additional controls are included. In the second column, the controls listed at the bottom of the table are included. In the third column, the sample is limited to those who voted in two or fewer previous elections in the data set. The results are qualitatively similar to the nonparametric selection model. The \(\rho\) parameter, which is significant or nearly significant across specifications, indicates a modest correlation between the errors in the two equations, a necessary but not sufficient condition for selection bias. Because the estimates are similar to those from methods that assume no correlation of errors, there does not appear to be meaningful selection bias in this case.
Cite this article
Bailey, M.A., Hopkins, D.J. & Rogers, T. Unresponsive and Unpersuaded: The Unintended Consequences of a Voter Persuasion Effort. Polit Behav 38, 713–746 (2016). https://doi.org/10.1007/s11109-016-9338-8
Keywords
 Field experiment
 Political campaigns
 Political persuasion
 Nonrandom attrition
 Survey response