Volume 45, Issue 2, pp 295–316

Does the delivery matter? Examining randomization at the item level

  • Erin M. Buchanan
  • Riley E. Foreman
  • Becca N. Johnson
  • Jeffrey M. Pavlacic
  • Rachel L. Swadley
  • Stefan E. Schulenberg
Original Paper


Scales that are psychometrically sound, meaning those that meet established standards regarding reliability and validity when measuring one or more constructs of interest, are customarily evaluated based on a set modality (i.e., computer or paper) and administration (fixed-item order). Deviating from an established administration profile could result in non-equivalent response patterns, indicating the possible evaluation of a dissimilar construct. Randomizing item administration may alter or eliminate these effects. Therefore, we examined the differences in scale relationships for randomized and nonrandomized computer delivery for two scales measuring meaning/purpose in life. These scales have questions about suicidality, depression, and life goals that may cause item reactivity (i.e., a changed response to a second item based on the answer to the first item). Results indicated that item randomization does not alter scale psychometrics for meaning in life scales, which implies that results are comparable even if researchers implement different delivery modalities.


Keywords: Scales, Randomization, Item analysis


Compliance with ethical standards

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.



Copyright information

© The Behaviormetric Society 2018

Authors and Affiliations

  • Erin M. Buchanan (1)
  • Riley E. Foreman (1)
  • Becca N. Johnson (1)
  • Jeffrey M. Pavlacic (2)
  • Rachel L. Swadley (1)
  • Stefan E. Schulenberg (2)

  1. Missouri State University, Springfield, USA
  2. University of Mississippi, Oxford, USA
