Hookworm Eradication as a Natural Experiment for Schooling and Voting in the American South

  • Original Paper
Political Behavior

Abstract

Educational attainment is robustly associated with greater political participation, yet the causal nature of this finding remains contested. To assess this relationship, I leverage a natural experiment in the Rockefeller Sanitary Commission’s (RSC) anti-hookworm campaign, which exogenously expanded primary and secondary education in the early-twentieth century American South. I evaluate two RSC hookworm interventions: exposure to the campaign and proportion treated. I use genetic matching to control for observable factors that influenced the haphazard dispensing of treatment, and implement new matching methods for continuous campaign interventions. I also use a variety of methods to assess the robustness of the results to a number of alternative accounts. Throughout, I find a consistent positive effect of education on participation, suggesting additional evidence for a causal interpretation of the ‘education effect’.

Notes

  1. Data and code to reproduce analysis are available on the Political Behavior Dataverse at: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YQNLWF.

  2. These are: Alabama, Arkansas, Georgia, Kentucky, Louisiana, Mississippi, North Carolina, South Carolina, Tennessee, Texas, and Virginia.

  3. Following Bleakley (2007), I use pre-RSC hookworm incidence as an instrument and recover similar, though weaker, results. See Sect. IV.1, p. 8 in the Online Appendix.

  4. The result for college attendance in the Youth-Parent Socialization Panel Study was shown in follow-up studies by Henderson and Chatfield (2011) and Mayer (2011) to be sensitive to choices about which covariates to control for, and how to conduct matching.

  5. Hookworm rates come from RSC archival reports (Rockefeller Sanitary Commission 1909, 1915), and data collected independently by Thoman (2009). Hookworm incidence data are collected for 740 counties, but are missing (and imputed) for 555 counties. See below and Sect. IV.2, p. 11 of the Online Appendix for more details.

  6. Genetic matching uses an evolutionary algorithm to maximize similarities across treatment and control groups on covariates over multiple matching iterations. The method has been used widely to non-parametrically condition on variables prior to estimation (Diamond and Sekhon 2014).

  7. Standard (bipartite) genetic matching is used for the dichotomous RSC Campaign instrument. Here matching is 1-to-1, without calipers, and with replacement to estimate the (local) average treatment effect for the treated (LATT) counties. I allow ties in the matches, so multiple equally ‘good’ controls can be matched to each treated unit, with weights to partition these uniformly. Ultimately, 385 unique control counties are matched to 633 treated counties, for an effective sample size of 1018. Given this weighting, I use weighted two-stage-least-squares (2SLS) to estimate IV effects for the RSC Campaign instrument. Continuous NBP genetic matching is used for the RSC % Treated instrument. NBP matching is optimal and 1-to-1, with only one odd unit discarded. No calipers are used on d{X}. The estimand derived from optimal matching on a discretized continuous intervention is proportional to the (local) average treatment effect (LATE). See Henderson (2015a, b) for more details.

  8. Henderson (2015a) finds this estimate is proportional to an average discretizing shift E{z_1 − z_0}.

  9. See Henderson (2015b), and Sect. III, p. 2 in the Online Appendix for details on this genetic matching approach and its implementation. Also, see Henderson (2015a) for a discussion of the error-in-variables structure of matching with continuous interventions.

  10. As additional robustness checks, I estimate bootstrap standard errors, and produce Hodges–Lehmann (HL) point estimates, which are robust to possibly weak instruments (Imbens and Rosenbaum 2005). See Sect. V.3, p. 21 in the Online Appendix for more details.

  11. Many of the measures of hookworm incidence and the RSC treatments were collected by Eric Thoman, who graciously agreed to share his data.

  12. In additional robustness checks, I analyze schooling measures from the 5% census samples for the same years housed by the Integrated Public Use Microdata Series (IPUMS) (Ruggles et al. 2010). Age groups for these data correspond nearly perfectly with the census data. Age groups in the IPUMS are 6–17, 6–14, and 15–17.

  13. See Clubb et al. (2006) for how to measure eligible voters in the South.

  14. Only those Southerners born between 1894 and 1909 (i.e., 6–17 in 1911–1915) could have been in school during the RSC campaign, and thus able to receive hookworm treatment. Assuming these populations are equally distributed across age groups, this would mean that about 0/8th of this population would have been eligible in 1912, 1/8th eligible in 1916, 3/8th eligible in 1920, 7/8th eligible in 1928, and 8/8th eligible in 1932. Including years 1912 and 1936 does not change the results, nor does breaking this into individual-year comparisons, e.g., 1928 and 1912. I present additional results in Sect. VI.2, p. 47 of the Online Appendix to assess the robustness of the findings to choosing earlier elections as baselines. I find that after matching, the results are robust to choosing any baseline election years before 1928. I also assess how long the treatment effects last, finding that the education effect appears to persist until the mid-to-late 1940s. These are presented in Sect. VI.3, p. 48 of the Online Appendix.

  15. Women are enfranchised by 1920. Also, 1932 was a big Democratic swing year. Results are robust to using just elections in 1920 and 1928.

  16. A benefit of studying turnout in the Democratic South is that the lack of two-party competition helps alleviate confounding issues related to partisan voting. For instance, national election swings largely have a uniform effect across the South, and do not depend very much on local factors that might bias county comparisons over time.

  17. Table I, p. 7 in the Online Appendix includes the list of covariates, with descriptive statistics.

  18. See Sect. IV.2, p. 11 of the Online Appendix for details and robustness checks on the imputation approach. I validate the imputations by randomly imputing a subset of units with non-missing data and measuring the mean squared prediction error against the real data.

  19. In a related study, Bleakley (2007) looks at hookworm incidence rates to assess the RSC’s impact on education. His motivation is that places with greater infection should experience larger reductions in hookworm, and thus also greater expansions in educational attainment after the campaign. This is an imperfect instrument since other factors (like infrastructural investments or better economic growth) occurring over the period could also reduce hookworm rates, but have little to do with the campaign. Nevertheless, given this logic, I also include hookworm incidence as an instrument as a robustness check, and recover similar results. These are presented in Sect. IV.1, p. 8 in the Online Appendix.

  20. Two additional findings support instrument exogeneity. Predominantly black counties (>60%) receiving the intervention experienced remarkably similar rates of educational expansion, but no appreciable increase in turnout, as expected under Jim Crow disenfranchisement. Rosenbaum (2002) sensitivity tests also affirm that estimates are robust to possible unobserved confounders that triple the odds of treatment for the treated counties. See Sects. V.4 (p. 27) and V.2 (p. 19) in the Online Appendix for more details.

  21. The vast majority of infections (about 85%) in 1910 involved children 18 years or younger, with most of these aged 12–16. Education was also far from universal in this period. Both foster conditions for hookworm eradication to have big impacts on Southern educational attainment. See Sect. II, p. 2 in the Online Appendix for evidence on the effectiveness of the RSC in reducing hookworm infection.

  22. F-statistics for the campaign are all greater than 20 after matching, and F-statistics for proportion treated are all greater than 14 except for high school aged children.

  23. In an additional step, I further assess the robustness of IV estimates to this strong instruments assumption through both parametric (2SLS) and non-parametric (HL) estimation approaches. HL estimates are used in IV analyses to provide an additional test of the weak instruments assumption, since these provide correct coverage in hypothesis testing, regardless of the strength of the first-stage association (Rosenbaum 2002). HL confidence intervals contain as much or as little information as is available in the instrument. HL results are more conservative than 2SLS in incorporating the information about first stage effects in the resulting IV standard errors. See Table VII in Sect. V.3, p. 23 of the Online Appendix for the results of this analysis.

  24. The HL results are similar to the 2SLS estimates, though standard errors are typically larger due to the conservative feature of the HL rank estimator. See Table VII (p. 23) in the Online Appendix for the HL results for children aged 6–17.

  25. To assess whether these effects generalize to other similar interventions, I replicate the estimation strategy exploiting a series of New Deal anti-malaria efforts in the 1930s and 1940s, including the Agricultural Adjustment Act (1933) and the National Malaria Eradication Program (1947). These programs had the effect of reducing malaria infection in the South, targeting areas with the greatest rates of infection. The interventions appear to have expanded both childhood education (0.023; p = 0.031) and adult political participation (0.018; p = 0.038) the most where pre-New Deal malaria infection was the heaviest. These results should be interpreted with much caution, since both adults and children benefited directly from these public health policies, each surely had spillover economic or sociodemographic effects, and mandatory public education was in any case expanding dramatically during this period. Nevertheless, these results suggest that the educational and participatory effects of the hookworm campaign are not idiosyncratic and may extend to other similar public health investments. See Sect. VI.1, p. 45 in the Online Appendix.

  26. There is heteroskedasticity in these bivariate associations, even after conditioning on covariates, due in part to many counties having few or no children treated. A nice property of discretizing this instrument during genetic matching is that such heteroskedastic variance is absorbed appropriately into the standard errors of the (now-binary) treatment estimates.

  27. The average county population in 1920 is 22,563. An effect size of 0.017 impacting 30% of the population yields 22,563 × 0.017 × 0.3 ≈ 115.

  28. For turnout this is 22,563 × 0.0139 × 0.26 = 82. Setting the vote-eligible proportion at 0.44 inflates the effect size to 1.2, which could imply a potential violation of the exclusion restriction. However, census schooling rates are likely to be measured with some error (Ruggles et al. 2010). Such error could attenuate estimates of the RSC effect on schooling, implying an overly large turnout effect. Further, while turnout counts are likely well-measured, the vote-eligible population is not. This could also yield an attenuation effect in turnout differences across treated counties that inflates the apparent effect of the RSC on overall voting. See sensitivity analysis in Sect. 6.2 for more details.

  29. Stratification can also be motivated since hookworm does not uniformly afflict all school age children. Stratifying on ages likely to be similarly affected can reduce additional variance in schooling not expected to be influenced by the RSC campaign. Stratifying in this way also clarifies whether or not the RSC is impacting schooling only for some subset of the population, especially older school-age children in certain counties.

  30. Similar results are recovered for the HL estimates stratifying by age groups. See Table VIII (p. 24) of the Online Appendix for details.

  31. See Figures II–IV in Sect. V.1, p. 14 in the Online Appendix for balance plots showing differences on 19 covariates across the RSC instruments before and after matching.

  32. Differences emerge from the RSC decision to target counties with greater rates of hookworm. Given these differences, simple comparisons of education and turnout across targeted counties are unlikely to recover unbiased estimates of the causal effect of education on turnout, without adjusting for factors driving the RSC's targeting decisions.

  33. Figures II(b)–IV(b), p. 16 in the Online Appendix show balance plots for the instruments after matching.

  34. QQ plots report the quantiles of two distributions, showing differences in the distributions when the points fall off the 45-degree line. This test is not available for the proportion treated instrument since its denominator is a measure of hookworm incidence.

  35. In this placebo, prior schooling is from 1910 and prior turnout is from the 1912 presidential election. Neither of these is used during matching, though counties are matched on schooling in 1900 and turnout in 1908.

  36. IV analyses also assume that the assignment of the instrument for county i does not influence assignment for j, usually called the Stable Unit Treatment Value Assumption (SUTVA), as well as complier monotonicity, that is, that there are no counties assigned to Z that always defy assignment by taking 1 − Z.

  37. Both the intention-to-treat, E[Y|Z], and first stage, E[D|Z], effects are well-estimated using either aggregate or individual-level data, assuming RSC treatment does not have an effect on the composition of the counties, or movement between them. Otherwise, aggregating data prior to estimation could be conditioning on a post-treatment variable that induces a ‘reversal’ in the sign of individual-level treatment estimates (Pearl 2014; Spenkuch 2017). I find empirically that the RSC campaign had no discernible effect on changes in county population or mobility. Thus, any concerns about interpreting aggregate effects here mainly involve IV excludability, and not more classical ecological inference problems. See Sect. V.4, p. 33 in the Online Appendix.

  38. Relatedly, the relevant SUTVA assumption here operates at the county level, not the individual level. In other words, if the dispensaries had “spillover” effects amongst individuals but within counties, this would be included in the total county-level effect of the RSC campaign. This would not constitute a SUTVA violation, nor would it confound estimation, though it could alter how we interpret the (individual-level) effective impact size of treatment. On the other hand, the RSC campaign was targeted at the level of counties and not individuals, though of course it was administered to individual children. This is analogous to studying interventions randomized at the school- or classroom-level, where estimating student-level effects generally could be biased. Hence the advantage of estimating aggregate county-level effects is that this follows directly from the particular assignment process used by the RSC campaign.

  39. Such concerns rest on difficult-to-assess counterfactuals. Some indirect effects of the RSC campaign could spill over into increased turnout, though many such paths seem unlikely. For instance, women predominantly provided health care for children in the early twentieth century American South, though they voted at much lower rates (typically 16–20 percentage points less) than men. In contrast, fathers provided relatively little childcare investment (Bleakley 2007). Thus, freeing up parents from caring for no-longer-sick children would likely have a minor impact on increasing adult voter turnout. To the degree this concern persists, sensitivity analysis on the exclusion restriction can assess how much impact this mechanism would need to have to preclude interpreting education's effect on turnout as causal.

  40. Notably, γZ can represent the indirect influence of all possible (linear) alternatives Q through which Z can impact Y. See Sect. V.4, p. 27 in the Online Appendix for more details.

  41. See Sect. V.4, p. 27 in the Online Appendix for details, including additional exploratory analysis of interpreting the excludability assumption generally, and using aggregate data.

  42. Indeed, explicit to IV estimation is that inferences are valid only for those who would have complied with treatment, that is, took up more education as a result of being assigned a specific form of encouragement (e.g., proximity to a college, born before a school-grade cutoff, assigned a Vietnam draft lottery number). Some of the strongest empirical evidence about the education effect has been drawn from the use of plausibly exogenous instruments (e.g., Berinsky and Lenz 2011; Dee 2004; Sondheimer and Green 2010). Each finding individually speaks to a local (and possibly idiosyncratic) effect, but in sum joins a wider mapping of measurements taken in different samples, at multiple times and places.

  43. From a comparative perspective, for example, conditions in the early twentieth century American South rather closely resemble some parts of the contemporary developing world, combining relative poverty in income, health, and education, with a semi-authoritarian political order. Finding in this setting that expanded education yielded greater turnout in the 1930s may shed new light on theoretical questions or historical patterns of interest to scholars of American political development, or to those studying current comparative settings.

  44. The ‘local compliers’ problem is magnified when people receive all sorts of non-random encouragements to obtain more schooling. For instance, regression analyses of the education effect using a high-quality representative survey are only able to make inferences about the particular subsample of people who, in receiving education, happened to comply with a highly complex array of non-random interventions.
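
The cohort arithmetic in Note 14 and the back-of-envelope magnitudes in Notes 27 and 28 can be checked with a short script. This is a minimal sketch, not the author's replication code. It assumes a voting age of 21 (not stated in the notes) and equally sized single-year birth cohorts (as Note 14 assumes), and it reads the "effect size of 1.2" in Note 28 as the ratio of implied additional voters to implied additional students, which is an interpretation. The function names `share_eligible` and `implied_count` are hypothetical.

```python
from fractions import Fraction

# Assumptions: voting age of 21 (not stated in the notes) and equally
# sized single-year birth cohorts (the simplification used in Note 14).
VOTING_AGE = 21
COHORTS = range(1894, 1910)  # birth years 1894..1909 inclusive


def share_eligible(election_year):
    """Share of RSC-era schoolchildren old enough to vote in an election."""
    old_enough = sum(1 for born in COHORTS
                     if election_year - born >= VOTING_AGE)
    return Fraction(old_enough, len(COHORTS))


def implied_count(population, effect, share):
    """People implied by an effect size applied to a population share."""
    return population * effect * share


# Presidential election years 1912 through 1936.
shares = {year: share_eligible(year) for year in range(1912, 1937, 4)}

AVG_COUNTY_POP_1920 = 22_563
students = implied_count(AVG_COUNTY_POP_1920, 0.017, 0.30)      # Note 27
voters = implied_count(AVG_COUNTY_POP_1920, 0.0139, 0.26)       # Note 28
voters_high = implied_count(AVG_COUNTY_POP_1920, 0.0139, 0.44)  # Note 28
ratio = voters_high / students  # assumed reading of "effect size"

for year, share in shares.items():
    print(year, share)
print(round(students), round(voters), round(ratio, 1))
```

Under these assumptions the script reproduces the shares listed in Note 14 (and gives 5/8 for 1924, which the note does not list), along with the rounded figures 115, 82, and 1.2 from Notes 27 and 28.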

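Notes 7 and 37 describe IV estimation from an intention-to-treat effect, E[Y|Z], and a first stage, E[D|Z], using matching weights, and Notes 39 and 40 describe a sensitivity analysis in which the instrument may affect turnout directly through a term γZ. The sketch below illustrates that logic for a binary instrument with no covariates, where weighted 2SLS reduces to a ratio of weighted mean differences (the Wald estimator). It is an illustration under stated assumptions, not the author's implementation: the function names are hypothetical, γ is treated as a known constant, and the data are made-up toy values rather than the study's data.

```python
# Minimal sketch: Wald / 2SLS estimate with matching weights, plus a
# sensitivity sweep over an assumed direct effect (gamma) of Z on Y.
# All data below are made-up toy values, not the study's data.


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def mean_difference(x, z, w):
    """Weighted mean of x where z == 1 minus weighted mean where z == 0."""
    treated = [(xi, wi) for xi, zi, wi in zip(x, z, w) if zi == 1]
    control = [(xi, wi) for xi, zi, wi in zip(x, z, w) if zi == 0]
    return weighted_mean(*zip(*treated)) - weighted_mean(*zip(*control))


def wald_iv(y, d, z, w, gamma=0.0):
    """IV estimate of D on Y, net of an assumed direct effect gamma of Z.

    With gamma = 0 this is the usual Wald ratio. Under the model
    Y = beta * D + gamma * Z + error, the intention-to-treat effect is
    beta * first_stage + gamma, so beta = (itt - gamma) / first_stage.
    """
    itt = mean_difference(y, z, w)          # E[Y|Z=1] - E[Y|Z=0]
    first_stage = mean_difference(d, z, w)  # E[D|Z=1] - E[D|Z=0]
    return (itt - gamma) / first_stage


# Toy counties: three treated, four controls. Two tied controls share
# one match, so each carries half weight.
z = [1, 1, 1, 0, 0, 0, 0]
w = [1.0, 1.0, 1.0, 1.0, 0.5, 0.5, 1.0]
d = [0.62, 0.66, 0.64, 0.60, 0.58, 0.62, 0.60]  # schooling rate
y = [0.30, 0.34, 0.32, 0.28, 0.26, 0.30, 0.28]  # turnout rate

for gamma in (0.0, 0.01, 0.02, 0.04):
    print(f"gamma={gamma:.2f}  beta={wald_iv(y, d, z, w, gamma):.3f}")
```

Sweeping γ shows how large a direct effect of the instrument on turnout would have to be before the implied education effect shrinks toward zero. In the toy data the estimate reaches zero when γ equals the full intention-to-treat difference.
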
References

  • Berinsky, A., & Lenz, G. (2011). Education and political participation: Exploring the causal link. Political Behavior, 33(3), 357–373.

  • Bleakley, H. (2007). Disease and development: Evidence from hookworm eradication in the American South. Quarterly Journal of Economics, 122(1), 73–117.

  • Brady, H., Verba, S., & Schlozman, K. (1995). Beyond SES: A resource model of political participation. American Political Science Review, 89(2), 271–294.

  • Campbell, D. E. (2006). Why we vote: How schools and communities shape our civic life. Princeton, NJ: Princeton University Press.

  • Clubb, J. M., Flanigan, W. H., & Zingale, N. H. (2006). Electoral data for counties in the United States: Presidential and congressional races, 1840–1972. ICPSR08611-v1. Ann Arbor, MI: ICPSR.

  • Conley, T. G., Hansen, C. B., & Rossi, P. E. (2012). Plausibly exogenous. Review of Economics and Statistics, 94(1), 260–272.

  • Dee, T. S. (2004). Are there civic returns to education? Journal of Public Economics, 88(3), 1697–1720.

  • Dewey, J. (1916). Democracy and education. New York: The Free Press.

  • Diamond, A., & Sekhon, J. S. (2014). Genetic matching for estimating causal effects: A general multivariate matching method for achieving balance in observational studies. Review of Economics and Statistics, 95(3), 932–945.

  • Downs, A. (1957). An economic theory of democracy. New York: Harper & Row.

  • Grusky, D. (Ed.). (2001). Social stratification: Class, race, and gender in sociological perspective. Boulder, CO: Westview Press.

  • Henderson, J. A. (2015a). Estimating causal effects using coarsened treatments as instruments. SSRN. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2685357. Accessed 1 May 2017.

  • Henderson, J. A. (2015b). A genetic matching approach to estimating treatment effects using non-binary interventions. SSRN. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2701448. Accessed 1 May 2017.

  • Henderson, J., & Chatfield, S. (2011). Who matches? Propensity scores and bias in the causal effects of education on participation. Journal of Politics, 73(3), 646–658.

  • Imbens, G. W., & Rosenbaum, P. R. (2005). Robust, accurate confidence intervals with a weak instrument: Quarter of birth and education. Journal of The Royal Statistical Society, Series A, 168(1), 109–125.

  • Jackson, R. (1996). A reassessment of voter mobilization. Political Research Quarterly, 49, 331–349.

  • Kam, C., & Palmer, C. (2008). Reconsidering the effects of education on political participation. Journal of Politics, 70, 612–631.

  • Kam, C. D., & Palmer, C. L. (2011). Rejoinder: Reinvestigating the causal relationship between higher education and political participation. Journal of Politics, 73(3), 659–663.

  • Keele, L., & Morgan, J. W. (2017). How strong is strong enough? Strengthening instruments through matching and weak instrument tests. Annals of Applied Statistics, 10(2), 1086–1106.

  • Keyssar, A. (2009). The right to vote: The contested history of democracy in the United States. New York: Basic Books.

  • Lu, B., Zanutto, E., Hornik, R. C., & Rosenbaum, P. R. (2011). Optimal nonbipartite matching and its statistical applications. The American Statistician, 65(1), 21–30.

  • Lu, B., Zanutto, E., Hornik, R., & Rosenbaum, P. R. (2001). Matching with doses in an observational study of a media campaign against drug abuse. Journal of the American Statistical Association, 96(456), 1245–1253.

  • Luster, T., & McAdoo, H. (1996). Family and child influences on educational attainment: A secondary analysis of the high/scope Perry preschool data. Developmental Psychology, 32(1), 26–39.

  • Mayer, A. K. (2011). Does education increase political participation? Journal of Politics, 73(3), 633–645.

  • Mayhew, D. (1986). Placing parties in American politics. Princeton, NJ: Princeton University Press.

  • Miguel, E., & Kremer, M. (2004). Worms: Identifying impacts on education and health in the presence of treatment externalities. Econometrica, 72(1), 159–217.

  • Milligan, K., Moretti, E., & Oreopoulos, P. (2004). Does education improve citizenship? Evidence from the United States and the United Kingdom. Journal of Public Economics, 88(3), 1667–1695.

  • Pearl, J. (2014). Comment: Understanding Simpson’s paradox. The American Statistician, 68(1), 8–13.

  • Rockefeller Sanitary Commission. (1909). The Rockefeller commission by-laws. Rockefeller Archive Center, RG III 2, Series O, Box 52, File 544.

  • Rockefeller Sanitary Commission. (1915). Fifth annual report of the Rockefeller Sanitary Commission Hookworm eradication campaign. New York: Rockefeller Archive Center. https://web.archive.org/web/20150523214411/https://www.rockefellerfoundation.org/app/uploads/RF-Annual-Report-1915.pdf. Accessed 1 May 2017.

  • Rosenbaum, P. R. (2002). Observational studies (2nd ed.). New York: Springer.

  • Rosenstone, S., & Hansen, J. (1993). Mobilization, participation, and democracy in America. New York: Longman Publishing.

  • Ruggles, S., Trent Alexander, J., Genadek, K., Goeken, R., Schroeder, M. B., & Sobek, M. (2010). Integrated public use microdata series: Version 5.0 [Machine-readable database]. Minneapolis: University of Minnesota.

  • Saunders, P. (Ed.). (1990). Social class and stratification. New York: Routledge.

  • Schlozman, K. (2002). Citizen participation in America. In I. Katznelson & H. Milner (Eds.), Political science: State of the discipline. New York: W.W. Norton.

  • Sekhon, J. S. (2011). Matching: Multivariate and propensity score matching with automated balance search. Journal of Statistical Software, 42(7), 1–52. Computer program available at http://sekhon.berkeley.edu/matching/. Accessed 1 May 2017.

  • Sekhon, J. S., & Mebane, W. R., Jr. (1998). Genetic optimization using derivatives: Theory and application to nonlinear models. Political Analysis, 7, 189–203.

  • Sondheimer, R. M., & Green, D. P. (2010). Using experiments to estimate the effects of education on voter turnout. American Journal of Political Science, 54(1), 174–189.

  • Spenkuch, J. (2017). Ecological inference with instrumental variables, working paper. http://www.kellogg.northwestern.edu/faculty/spenkuch/research/ei-iv.pdf. Accessed 1 May 2017.

  • Staiger, D., & Stock, J. H. (1997). Instrumental variables regression with weak instruments. Econometrica, 65(3), 557–586.

  • Tenn, S. (2007). The effect of education on voter turnout. Political Analysis, 15(4), 446–464.

  • Thoman, E. B. (2009). Historic Hookworm prevalence rates and distribution in the southeastern United States. Rockefeller Archive Center: http://www.rockarch.org/publications/resrep/thoman.pdf. Accessed 1 May 2017.

  • van Buuren, S., & Groothuis-Oudshoorn, K. (n.d.). MICE: Multivariate imputation by chained equations in R. Journal of Statistical Software. Forthcoming.

  • Wolfinger, R., & Rosenstone, S. (1980). Who votes? New Haven: Yale University Press.

  • Woodward, C. V. (1951). Origins of the New South: 1877–1913. Baton Rouge, LA: Louisiana State University Press.

Acknowledgements

For valuable comments on this project I thank Deborah Beim, Henry Brady, David Campbell, Devin Caughey, Sara Chatfield, Jacob Hacker, Eitan Hersh, Danny Hidalgo, John Holbein, Greg Huber, Cindy Kam, David Nickerson, Mikael Persson, Eric Schickler, Jasjeet Sekhon, Alex Theodoridis, Rocio Titiunik and Kimhouy Tong, and three anonymous reviewers. I thank Eric Thoman for generously sharing data. All errors are the author’s responsibility.

Author information

Corresponding author

Correspondence to John A. Henderson.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 1437 kb)

About this article

Cite this article

Henderson, J.A. Hookworm Eradication as a Natural Experiment for Schooling and Voting in the American South. Polit Behav 40, 467–494 (2018). https://doi.org/10.1007/s11109-017-9408-6

  • DOI: https://doi.org/10.1007/s11109-017-9408-6
