Abstract
The experimental approach has begun to permeate political science research, increasingly so in the last decade. Laboratory researchers face at least two challenges: determining who to study and how to lure them into the lab. Most experimental studies rely on student samples, yet skeptics often dismiss student samples for lack of external validity. In this article, we propose another convenience sample for laboratory research: campus staff. We report on a randomized experiment to investigate the characteristics of samples drawn from a general local population and from campus staff. We report that campus staff evidence significantly higher response rates, and we find few discernible differences between the two samples. We also investigate the second challenge facing researchers: how to lure subjects into the lab. We use evidence from three focus groups to identify ways of luring this alternative convenience sample into the lab. We analyze the impact of self-interest, social-utility, and neutral appeals on encouraging study participation, and we find that campus staff respond better to a no-nonsense approach compared to a hard-sell that promises potential policy benefits to the community or, and especially, to the self. We conclude that researchers should craft appeals with caution as they capitalize on this heretofore largely untapped reservoir for experimental research: campus employees.
Notes
By “stimulus” we refer to the experimental manipulation(s) typically introduced by random assignment to experimental subjects. We expect a response to that stimulus, and thus, the stimulus is typically seen as the “cause” and the response seen as the “effect.”
The common distinction in locale is between laboratory and field studies. While traditionally most experiments have taken place in the laboratory, more are moving to the field, where “randomization is performed within a naturalistic setting” (Green and Gerber 2002, 812).
For example, researchers interested in collecting millisecond response latency data or in administering sophisticated audio-visual treatments may prefer to conduct studies in the laboratory rather than the field.
The following subject pools were identified: undergraduates from a psychology class, undergraduates from a political science class, undergraduates from other or unreported classes, other students, adults sampled from campus, adults sampled from the local community, or adults sampled at the national level.
Experiments conducted in either the field or the laboratory are eligible for inclusion under this category. “Natural experiments” and “quasi-experiments” are not included (see King et al. 1994, p. 7, note 1, for a critique of the latter term). Observational studies do not count as experiments (e.g., standard public opinion surveys without embedded experiments, personality assessments, and participant observation).
Our finding of an increased number of articles featuring experiments is consistent with other content analyses. McDermott (2002b) reports an upward trend based on her review of a number of political science journals from 1926 to 2000. Morton and Williams (Forthcoming) restrict their review to the American Political Science Review (APSR), American Journal of Political Science, and Journal of Politics between the years 1950 and 2005; they find that “Experimentation is increasing dramatically in political science” (2). Finally, Druckman et al. (2006) document a similar positive trend in their analysis of the APSR and its contents from 1906 to 2004. Moreover, the authors examine the citation rate (via the Web of Science Social Sciences Citation Index) for the experimental articles they find in comparison to a sample of nonexperimental articles. They find that the studies with experiments are cited more frequently. Thus, not only are experiments on the rise but their impact is presumably increasing in step with that rise.
This aspect of validity is also referred to as “construct validity,” per Cook and Campbell (1979).
No experimental study can claim to generalize to all populations across space and time and so researchers must carefully identify the populations to which generalizations are both relevant and appropriate.
In fact, Morton and Williams (Forthcoming: 9–10) gently admonish scholars who “rarely worry about the external validity issue with observational data but do worry about it with experimental data.”
The three journals were Journal of Personality and Social Psychology, Personality and Social Psychology Bulletin, and Journal of Experimental Social Psychology. In the 1985 analysis the personality section of the first-listed journal was excluded from the analysis.
Typically in such cases scholars argue that certain characteristics of their student sample make the use of that sample a more stringent test of their hypothesis(es) (see, for example, Sigelman et al. 1991: 135; Kahn and Geer 1994: 100; Funk 1997: 690–691). The expectation is that the effect of the treatment will be lowest among this sub-group; thus, if results are obtained one can assume they would obtain, and likely at a greater level, in a more general population. Alternatively, at least one set of scholars (Druckman and Nelson 2003) argues that the effect of their treatment is likely to be greater among their student sample and, because the research is testing the hypothesis that there will be no effect, the student sample provides a more difficult test. While these arguments can be reasonably persuasive, others (e.g., McDermott 2002b: 40) remind us that “external validity is only fully established through replication.” The same, of course, goes for studies based on observational data.
Researchers can often recruit the desired number of undergraduate participants for a specific study by offering non-monetary incentives, namely course credit (either by requiring participation or offering extra-credit for participation; in either case, typically student subjects are offered a choice between the study itself or an alternate learning activity, in order to avoid coercing participation).
Recall, however, that students who volunteer for studies may not necessarily be representative of students at large; researchers must take care to think through this possibility before automatically claiming that student subject pools generalize to students at large.
One of the few studies that have directly examined the issue comes from the field of business organization/behavior. Gordon et al. (1986) analyze 32 studies that contain analyses of both student and nonstudent adult samples and find significant differences in treatment effects across a majority of those studies. In almost all the cases, the adult nonstudent sample is a subset of the larger population (e.g., psychology faculty, managers, judges). See also critiques by Greenberg (1987) and Dobbins et al. (1988) and responses by Gordon et al. (1987) and Slade and Gordon (1988). Within the field of consumer research, Peterson (2001) surveys studies reporting analyses of both student and nonstudent adult (often housewives or women in general) samples. He finds that effect sizes for college students were generally larger on average, and in a number of cases he finds that the direction of the relationship differed across the two sample types. Others have noted a paucity of empirical studies within political science that examine differences in treatment effects across student and nonstudent subject pools (e.g., Mintz et al. 2006; Peterson 2001). We expect there may be a bias in what is submitted and accepted for publication. Presumably, scholars typically carry out an experiment on more than one sample type in order to replicate the study for the purpose of increasing external validity. If so, findings of different effects across samples may be less likely to be submitted and/or accepted for publication. Or, it is possible that student samples provide pilot data that are then replicated using nonstudents, and only the latter are reported.
For example, Henrich (2000) compares results of the ultimatum game played by nonstudent adult members of the Machiguenga community in the Peruvian Amazon to results from the same game played under similar circumstances by graduate students at UCLA. He finds clear differences in behavior across the two groups, which he attributes to differences in culture. Henrich et al. (2004) compare a more extensive series of studies in numerous small-scale societies to experimental research conducted using student subjects. The authors argue that, whereas there is little variation across the behavior manifested by students in such games across country settings, there are significant differences across the communities they study.
Further, these concerns are expressed more often in some research areas than others. See McGraw and Hoekstra (1994) for a discussion of how the use of student subjects varies across types of research. For example, experiments in electoral behavior seem to be more vulnerable to concerns raised about generalizability of student samples than studies based on experimental economics.
For researchers whose study design enables them to forgo the tight control of a lab and venture into naturalistic settings, one option is to target confined subjects. Thus, experimentalists have recruited subjects on trains, in farmers’ markets and state fairs, in court houses while waiting for jury duty, and in retirement homes. Such convenience samples may offer quite restricted demographic profiles (e.g., only the elderly residing in retirement homes); the study may endure significant distraction (e.g., noise); and, in the post-9/11 era, the researcher may find such samples less and less available due to tighter security. For various reasons, such as difficulty accessing confined populations, concern about distractions, issues with the technology by which the treatment is delivered, and so on, researchers may find it more advantageous to carry out their studies in a lab.
The offer of extra or class credit is typically sufficient to lure an ample number of subjects into the lab. This is not to say that all students volunteer for such opportunities. In our experience, we find that participation from classes in which students were offered extra credit averages around 50% (though this varies significantly depending on the amount of extra credit offered); actual class credit typically yields a participation rate of approximately 90%. These numbers are significantly higher than those we report in this paper for the local and campus populations who were offered a fairly substantial monetary incentive. We should note that some experimentalists compensate their student subjects with money and we assume this increases participation rates.
One limit to generalizability would occur if this interest in academic research per se were systematically correlated with a key covariate that conditioned treatment effects. Generally, we do not expect this to be the case.
The marketing list was confined to individuals with an estimated age between 24 and 80 years old, and a gender tag was included in the list.
The sampling criteria omitted the following: individuals located off campus (e.g., the Medical Center personnel), deans (and above), professors (any rank), lecturers, and Fellows; Executive Administrative Assistants, Directors, and Associate Directors; and individuals with “Research” in their title.
See Appendix B for the letters; below, we discuss the substantive content of the recruitment letters.
Some local mailings and campus mailings were returned as undeliverable. We kept track of invalid addresses and returned envelopes. This leaves us with a reduction in sample size from 2,250 to 2,139. All percentages reflect this adjusted sample size.
We show that local and campus subjects are similar on a variety of individual-level characteristics, including demographics and some politically relevant attitudes. We have no reason to expect that subjects’ responses to the demographic information should be systematically contaminated by any of the other questions that appeared on the study. Because all subjects were randomly assigned to receive various stimuli in the study, independent of which appeal they received in the recruitment letter and of which sample they represented, their overall responses should not be systematically related to their responses on basic political/attitudinal questions (partisanship, ideology, level of participation, or social trust). Further, because subjects were randomly assigned to stimuli independent of which appeal they received in the recruitment letter, then even if there were contamination based on particular conditions in the experiment, these should be unrelated to the measured differences across campus employees and local residents.
The number of respondents for whom we have individual-level data deviates from the number of targeted individuals who contacted us to participate, as some respondents contacted us but did not end up scheduling an appointment, could not be accommodated in the remaining slots, failed to show up for their appointments, or did not provide us with usable data.
This ongoing program is called The Omnibus Project (TOP). More information on TOP can be found at: http://ps.ucdavis.edu/top.
A stronger claim could be made if we had administered the same experimental stimulus to nonstudent and student samples and if we could then identify whether, indeed, nonstudents and students responded differently or similarly. We expect that similarity and difference would depend upon the experimental stimulus itself: it would depend upon whether the effect of the experimental stimulus was conditional upon some key covariate upon which the students and nonstudents diverged. Unfortunately, such data are not available to us.
Minor post-assignment adjustments (7 subjects out of 1500) were made to ensure that the average age was comparable across the three conditions.
Unlike for the local sample, we did not have a gender tag for the campus sample at the start of the study. An undergraduate research assistant, blind to the intentions of our research, attached a gender tag to each respondent. Although the gender tag was not added until after the study was implemented, the composition of the campus groups was quite comparable across the three conditions: 62.4% of targets in the social utility condition were female; 62.3% of targets in the self-interest condition were female; 65.0% of targets in the control condition were female.
The three sessions had seven, five, and four participants, respectively. Overall, the average age of the focus group participants was 44; the modal level of education (7 participants) was a college degree; and 14 participants self-identified as Caucasian, one as Asian/Pacific Islander, and one as Hispanic. Each session was moderated by one of the authors. It is worth noting that, despite our best efforts, 15 of the 16 focus group participants were female. This figure is higher than the overall female percentage among our respondents, and it is possible that the resistance to self-interest appeals and the touting of helping behavior were affected by the disproportionate gender composition of the focus groups. However, in our analysis of response rates, we did not find any statistically significant differences by sex in response rates to any of the appeals.
Note that the letters were not identified as appealing to self-interest or social utility or neither. They were printed on different color paper, and focus group participants referred to them as such (e.g., “the blue one”).
We use pseudonyms in place of participants’ real names.
One answer was ambiguous. The person appeared to reference a comment about compensation and said “that” was a reason for her participation; however, we cannot be certain of the meaning of “that” in her statement.
The aversion to the social utility and overtly self-interested letters might also be explained by Shafir et al.’s (1993) discussion of how positive, but non-valued features of choice can actually detract from subjects’ willingness to select an option. Weakly positive features “apparently provide a reason against choosing the option” (32).
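The note above on undeliverable mailings describes an adjusted denominator: the sample shrinks from 2,250 to 2,139, which implies 111 invalid addresses, and all percentages are computed on the adjusted figure. A minimal sketch of that arithmetic follows. The function names and the responder count in the usage lines are hypothetical, introduced only for illustration; they are not figures or code from the study.

```python
def adjusted_sample_size(mailed: int, undeliverable: int) -> int:
    """Return the number of letters presumed delivered."""
    if mailed < 0 or undeliverable < 0 or undeliverable > mailed:
        raise ValueError("counts must satisfy 0 <= undeliverable <= mailed")
    return mailed - undeliverable


def response_rate(responders: int, mailed: int, undeliverable: int) -> float:
    """Return responders as a percentage of the adjusted sample size."""
    denominator = adjusted_sample_size(mailed, undeliverable)
    if denominator == 0:
        raise ValueError("no deliverable letters")
    return 100.0 * responders / denominator


MAILED = 2250                      # letters sent, per the note
ADJUSTED = 2139                    # adjusted sample size, per the note
UNDELIVERABLE = MAILED - ADJUSTED  # 111, implied by the two figures above

# Hypothetical responder count, used only to show the call.
example_rate = response_rate(100, MAILED, UNDELIVERABLE)
print(f"Illustrative response rate: {example_rate:.1f}%")
```

Dividing by the adjusted rather than the original sample size avoids counting people who never received a letter as nonrespondents.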
References
Achen, Christopher H. (1992). Social psychology, demographic variables, and linear regression: Breaking the iron triangle in voting research. Political Behavior, 14, 195–211.
Aronson, Elliot, Wilson, Timothy D., & Brewer, Marilynn B. (1998). Experimentation in social psychology. In Daniel T. Gilbert, Susan T. Fiske, & Gardner Lindzey (Eds.), The handbook of social psychology (pp. 99–142). Boston: McGraw-Hill.
Brooks, Deborah Jordan, & Geer, John G. (2007). Beyond negativity: The effects of incivility on the electorate. American Journal of Political Science, 51, 1–16.
Campbell, Donald, & Stanley, Julian C. (1963). Experimental and quasi-experimental designs for research. Boston: Houghton Mifflin.
Childers, Terry L., Pride, William M., & Ferrell, O. C. (1980). A reassessment of the effects of appeals on response to mail surveys. Journal of Marketing Research, 17, 365–370.
Cook, Thomas D., & Campbell, Donald T. (1979). Quasi-experimentation: Design and analysis for field settings. Boston: Houghton Mifflin.
Delli Carpini, Michael X., & Keeter, Scott (1993). Measuring political knowledge: Putting first things first. American Journal of Political Science, 37, 1179–1206.
Dillman, Don A. (1991). The design and administration of mail surveys. Annual Review of Sociology, 17, 225–249.
Dillman, Don A. (2007). Mail and internet surveys: The tailored design method (2nd ed.). Hoboken, NJ: John Wiley & Sons.
Dillman, Don A., Singer, Eleanor, Clark, Jon R., & Treat, James B. (1996). Effects of benefits appeals, mandatory appeals, and variations in statements of confidentiality on completion rates for Census questionnaires. Public Opinion Quarterly, 60, 376–389.
Dobbins, Gregory H., Lane, Irving M., & Steiner, Dirk D. (1988). A note on the role of laboratory methodologies in applied behavioural research: Don’t throw out the baby with the bath water. Journal of Organizational Behavior, 9, 281–286.
Druckman, James N. (2004). Political preference formation: Competition, deliberation, and the (ir)relevance of framing effects. American Political Science Review, 98, 671–686.
Druckman, James N., Green, Donald P., Kuklinski, James H., & Lupia, Arthur (2006). The growth and development of experimental research in political science. American Political Science Review, 100, 627–635.
Druckman, James N., & Nelson, Kjersten R. (2003). Framing and deliberation: How citizens’ conversations limit elite influence. American Journal of Political Science, 47, 729–745.
Fowler, James H., & Kam, Cindy D. (2006). Patience as a political virtue: Delayed gratification and turnout. Political Behavior, 28, 113–128.
Funk, Carolyn L. (1997). Implications of political expertise in candidate trait evaluations. Political Research Quarterly, 50, 675–697.
Gerber, Alan S., & Green, Donald P. (2000a). The effect of a nonpartisan get-out-the-vote drive: An experimental study of leafletting. Journal of Politics, 62, 846–857.
Gerber, Alan S., & Green, Donald P. (2000b). The effects of canvassing, telephone calls, and direct mail on voter turnout: A field experiment. American Political Science Review, 94, 653–663.
Gordon, Michael E., Slade, L. A., & Schmitt, Neal (1986). The ‘science’ of the sophomore revisited: From conjecture to empiricism. The Academy of Management Review, 11, 191–207.
Gordon, Michael E., Slade, L. A., & Schmitt, Neal (1987). Student guinea pigs: Porcine predictors and particularistic phenomena. The Academy of Management Review, 12, 160–163.
Green, Donald, & Gerber, Alan (2002). Reclaiming the experimental tradition in political science. In Ira Katznelson & Helen V. Milner (Eds.), Political science: State of the discipline (pp. 803–832). New York: Norton.
Greenberg, Jerald (1987). The college sophomore as guinea pig: Setting the record straight. The Academy of Management Review, 12, 157–159.
Groves, Robert M., Cialdini, Robert B., & Couper, Mick P. (1992). Understanding the decision to participate in a survey. Public Opinion Quarterly, 56, 475–495.
Henrich, Joseph (2000). Does culture matter in economic behavior? Ultimatum game bargaining among the Machiguenga of the Peruvian Amazon. American Economic Review, 90, 973–979.
Henrich, Joseph, Boyd, Robert, Bowles, Samuel, & Camerer, Colin et al. (2004). In search of homo economicus: Behavioral experiments in 15 small-scale societies. American Economic Review, 91, 73–78.
Houston, Michael J., & Nevin, John R. (1977). The effects of source and appeal on mail survey response patterns. Journal of Marketing Research, 14, 374–378.
Ishiyama, John T., & Hartlaub, Stephen (2002). Does the wording of syllabi affect student course assessment in introductory political science classes? PS: Political Science and Politics, 35, 567–570.
Kahn, Kim Fridkin, & Geer, John G. (1994). Creating impressions: An experimental investigation of political advertising on television. Political Behavior, 16, 93–116.
Kam, Cindy D. (2005). Who toes the party line? Cues, values, and individual differences. Political Behavior, 27, 163–182.
King, Gary, Keohane, Robert O., & Verba, Sidney (1994). Designing social inquiry. Princeton, NJ: Princeton University Press.
Kropf, Martha E., & Blair, Johnny (2005). Eliciting survey cooperation: Incentives, self-interest, and norms of cooperation. Evaluation Review, 29, 559–575.
Lupia, Arthur (2002). New ideas in experimental political science. Political Analysis, 10, 319–324.
McDermott, Rose (2002a). Experimental methodology in political science. Political Analysis, 10, 325–342.
McDermott, Rose (2002b). Experimental methods in political science. Annual Review of Political Science, 5, 31–61.
McGraw, Kathleen M., & Hoekstra, Valerie (1994). Experimentation in political science: Historical trends and future directions. In Michael X. Delli Carpini, Leonie Huddy, & Robert Y. Shapiro (Eds.), Research in micropolitics, Vol. 4: New directions in political psychology. Greenwich, CT: JAI Press.
Merolla, Jennifer, Stephenson, Laura, & Zechmeister, Elizabeth J. (2007). Applying experimental methods to the study of information shortcuts in Mexico. Política y Gobierno, 14, 117–142.
Miller, Joanne A., & Krosnick, Jon A. (2000). News media impact on the ingredients of presidential evaluations: Politically knowledgeable citizens are guided by a trusted source. American Journal of Political Science, 44, 295–309.
Mintz, Alex, Redd, Steven B., & Vedlitz, Arnold (2006). Can we generalize from student experiments to the real world in political science, military affairs, and international relations? Journal of Conflict Resolution, 50, 757–776.
Mintz, Alex, & Geva, Nehemia (1993). Why don’t democracies fight each other? An experimental study. Journal of Conflict Resolution, 37, 484–503.
Morton, Rebecca B., & Williams, Kenneth C. (Forthcoming). Experimentation in political science. In Janet Box-Steffensmeier, David Collier, & Henry Brady (Eds.), Oxford Handbook of Political Methodology.
Peterson, Robert A. (2001). On the use of college students in social science research: Insights from a second-order meta-analysis. Journal of Consumer Research, 28, 450–461.
Porter, Stephen R. (2004). Raising responses: What works? New Directions for Institutional Research Spring: 5–21.
Scott, John T., Matland, Richard E., Michelbach, Philip A., & Bornstein, Brian H. (2001). Just deserts: An experimental study of distributive justice norms. American Journal of Political Science, 45, 749–767.
Sears, David O. (1986). College sophomores in the laboratory: Influences of a narrow data base on social psychology’s view of human nature. Journal of Personality and Social Psychology, 51, 515–530.
Shafir, Eldar, Simonson, Itamar, & Tversky, Amos (1993). Reason-based choice. Cognition, 49, 11–36.
Sigelman, Lee, Sigelman, Carol K., & Bullock, David (1991). Reconsidering pocketbook voting: An experimental approach. Political Behavior, 13, 129–149.
Slade, L. Allen, & Gordon, Michael E. (1988). On the virtues of laboratory babies and student bath water: A reply to Dobbins, Lane, and Steiner. Journal of Organizational Behavior, 9, 373–376.
Stenner, Karen (2005). The authoritarian dynamic. Cambridge: Cambridge University Press.
Transue, John E. (2007). Identity salience, identity acceptance, and racial policy attitudes: American national identity as a uniting force. American Journal of Political Science, 51, 78–91.
Webster, Cynthia (1997). Effects of researcher presence and appeal on response quality in hand-delivered, self-administered surveys. Journal of Business Research, 38, 105–114.
Acknowledgements
We thank Andrea Morrison, Emerald Nguyen, Carl Palmer, Jeremy Poryes, Jennifer Ramos, Nathalie Trepo, Derek Tripp, and Whitney Wilking for research assistance. We gratefully acknowledge financial support from the UC Davis Department of Political Science Faculty-Student Collaborative Fellowship, the UC Davis Institute for Governmental Affairs, the UC Davis Senate Committee on Research, and the Claremont Graduate University Fletcher Jones Small Grant. Authors’ names are listed alphabetically.
Appendices
Appendix A: Coding
Variables featured in Table 1:
Female is a dummy: 0 for male; 1 for female.
Age is measured in years.
Education ranges from 1 (8th grade or less) to 7 (advanced degree).
Income ranges from 1 (<$15,000) to 6 (>$75,000).
Political Information is the number of correct responses to four information questions regarding political institutions: “Do you happen to know whose responsibility it is to determine if a law is constitutional or not?”; “Do you happen to know whose responsibility it is to nominate judges to the Federal Courts?”; and “Do you happen to know which party currently has the most members in the [U.S. Senate] [U.S. House of Representatives]?”, asked once for each chamber. The Cronbach’s α for this scale is 0.63.
Party Identification is coded in three categories: 0 = Republican; .5 = Independent/Other; 1 = Democrat.
Ideology ranges from 1 = Extremely Liberal to 7 = Extremely Conservative, with “Haven’t Thought About It” placed at the midpoint.
Social Trust is an additive scale of three items, “Generally speaking, would you say that most people can be trusted, or that you can’t be too careful in dealing with people?”; “Do you think most people would try to take advantage of you if they got the chance or would they try to be fair?”; “Would you say that most of the time people try to be helpful, or that they are just looking out for themselves?”. The Cronbach’s α for this scale is 0.74.
Political Participation is an additive scale of seven acts: voted in 2004; performed campaign work; contributed to a campaign; worked on community issue; contacted a public official; attended a community meeting; and participated in a protest. The Cronbach’s α for this scale is 0.70.
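Several of the variables above are additive scales reported with a Cronbach’s α. As a minimal sketch of how such a scale and its reliability coefficient can be computed, the following uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the summed scale), with sample variances. The function names and the response matrix are hypothetical and purely illustrative; they are not the study’s data or the authors’ code.

```python
from statistics import variance


def additive_scale(items):
    """Sum each respondent's item scores into a single scale score."""
    return [sum(row) for row in items]


def cronbach_alpha(items):
    """Compute Cronbach's alpha for a respondents-by-items score matrix.

    `items` is a list of rows, one per respondent, each row holding that
    respondent's scores on the k items of the scale.
    """
    k = len(items[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    columns = list(zip(*items))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance(additive_scale(items))
    if total_variance == 0:
        raise ValueError("scale scores have zero variance")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses to three dichotomous items (1 = trusting answer),
# loosely patterned on a three-item scale such as Social Trust.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
]

print("Scale scores:", additive_scale(responses))
print("Cronbach's alpha:", round(cronbach_alpha(responses), 2))
```

Higher values of α indicate that the items covary strongly relative to their individual variances, which is the usual justification for summing them into one scale.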
Appendix B: Recruitment Letters
Cite this article
Kam, C.D., Wilking, J.R. & Zechmeister, E.J. Beyond the “Narrow Data Base”: Another Convenience Sample for Experimental Research. Polit Behav 29, 415–440 (2007). https://doi.org/10.1007/s11109-007-9037-6
DOI: https://doi.org/10.1007/s11109-007-9037-6