Abstract
Surveys often include reverse-coded questions to detect insufficient effort responses (IERs) but often wrongly assume that all respondents answer every question with full effort. By contrast, this study extends the mixture model for IERs and uses a simulation in LatentGOLD to show the harmful consequences of ignoring IERs to positively and negatively worded questions: lower test reliability and biased, less accurate slope and intercept parameter estimates. We demonstrate the model's practical application with two public data sets: Machiavellianism (five-point scale) and self-reported depression (four-point scale).
Data Availability
Example 1 is available from https://openpsychometrics.org/_rawdata/. Example 2 is available from https://doi.org/10.5334/jopd.35. The LatentGOLD syntax for Example 1 is available from https://osf.io/rjc5p/.
Notes
Other models allow nonuniform distributions of random responses (Meade & Craig, 2012).
When simulees were generated with no IERs (i.e., πu = πv = [1, 0, 0]), EMMIER and GPCM are essentially equivalent.
Sign-reversed δ-parameter estimates in LatentGOLD can be comparable to their true values.
The data set is available from https://openpsychometrics.org/_rawdata/.
Only 19 questions were included in the analyses, because the slope parameter for Q17 (“P.T. Barnum was wrong when he said that there's a sucker born every minute”) from any analytical model was very close to zero.
References
Arias, V. B., Garrido, L. E., Jenaro, C., Martinez-Molina, A., & Arias, B. (2020). A little garbage in, lots of garbage out: Assessing the impact of careless responding in personality survey data. Behavior Research Methods, 52(6), 2489–2505. https://doi.org/10.3758/s13428-020-01401-8
Baumgartner, H., & Steenkamp, J.-B.E.M. (2001). Response styles in marketing research: A cross-national investigation. Journal of Marketing Research, 38(2), 143–156. https://doi.org/10.1509/jmkr.38.2.143.18840
Bijlsma, H. J. E., Glas, C. A. W., & Visscher, A. J. (2022). Factors related to differences in digitally measured student perceptions of teaching quality. School Effectiveness and School Improvement, 33(3), 360–380. https://doi.org/10.1080/09243453.2021.2023584
Böckenholt, U. (2012). Modeling multiple response processes in judgment and choice. Psychological Methods, 17(4), 665–678. https://doi.org/10.1037/a0028111
Bolt, D., Wang, Y. C., Meyer, R. H., & Pier, L. (2020). An IRT mixture model for rating scale confusion associated with negatively worded items in measures of social-emotional learning. Applied Measurement in Education, 33(4), 331–348. https://doi.org/10.1080/08957347.2020.1789140
Bowling, N. A., Huang, J. L., Bragg, C. B., Khazon, S., Liu, M., & Blackmore, C. E. (2016). Who cares and who is careless? Insufficient effort responding as a reflection of respondent personality. Journal of Personality and Social Psychology, 111(2), 218–229. https://doi.org/10.1037/pspp0000085
Bowling, N. A., Gibson, A. M., Houpt, J. W., & Brower, C. K. (2021). Will the questions ever end? Person-level increases in careless responding during questionnaire completion. Organizational Research Methods, 24(4), 718–738. https://doi.org/10.1177/1094428120947794
Bowling, N. A., Huang, J. L., Brower, C. K., & Bragg, C. B. (2023). The quick and the careless: The construct validity of page time as a measure of insufficient effort responding to surveys. Organizational Research Methods, 26(2), 323–352. https://doi.org/10.1177/10944281211056520
Chen, H.-F., & Jin, K.-Y. (2022). The impact of item feature and response preference in mixed-format design. Multivariate Behavioral Research, 57(2–3), 208–222. https://doi.org/10.1080/00273171.2020.1820308
Christie, R., & Geis, F. (1970). Studies in Machiavellianism. Academic Press.
Cole, K. L., Turner, R. C., & Gitchel, W. D. (2019). A study of polytomous IRT methods and item wording directionality effects on perceived stress items. Personality and Individual Differences, 147, 63–72. https://doi.org/10.1016/j.paid.2019.03.046
Conijn, J. M., Emons, W. H. M., & Sijtsma, K. (2014). Statistic lz-based person-fit methods for noncognitive multiscale measures. Applied Psychological Measurement, 38(2), 122–136. https://doi.org/10.1177/0146621613497568
DeSimone, J. A., Davison, H. K., Schoen, J. L., & Bing, M. N. (2020). Insufficient effort responding as a partial function of implicit aggression. Organizational Research Methods, 23(1), 154–180. https://doi.org/10.1177/1094428118799486
Drasgow, F., Levine, M. V., & Williams, E. A. (1985). Appropriateness measurement with polytomous item response models and standardized indices. British Journal of Mathematical and Statistical Psychology, 38(1), 67–86. https://doi.org/10.1111/j.2044-8317.1985.tb00817.x
Emons, W. H. M. (2008). Nonparametric person-fit analysis of polytomous item scores. Applied Psychological Measurement, 32(3), 224–247. https://doi.org/10.1177/0146621607302479
Ferrando, P. J., & Lorenzo-Seva, U. (2010). Acquiescence as a source of bias and model and person misfit: A theoretical and empirical analysis. British Journal of Mathematical and Statistical Psychology, 63(2), 427–448. https://doi.org/10.1348/000711009X470740
Gibson, A. M., & Bowling, N. A. (2020). The effects of questionnaire length and behavioral consequences on careless responding. European Journal of Psychological Assessment, 36(2), 410–420. https://doi.org/10.1027/1015-5759/a000526
Grau, I., Ebbeler, C., & Banse, R. (2019). Cultural differences in careless responding. Journal of Cross-Cultural Psychology, 50(3), 336–357. https://doi.org/10.1177/0022022119827379
Hong, M., Steedle, J. T., & Cheng, Y. (2020). Methods of detecting insufficient effort responding: Comparisons and practical recommendations. Educational and Psychological Measurement, 80(2), 312–345. https://doi.org/10.1177/0013164419865316
Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and deterring insufficient effort responding to surveys. Journal of Business and Psychology, 27(1), 99–114. https://doi.org/10.1007/s10869-011-9231-8
Jin, K.-Y., Chen, H.-F., & Wang, W.-C. (2018). Mixture item response models for inattentive responding behavior. Organizational Research Methods, 21(1), 197–225. https://doi.org/10.1177/1094428117725792
Jin, K.-Y., Wu, Y.-J., & Chen, H.-F. (2022). A new multi-process IRT model with ideal points for Likert-type items. Journal of Educational and Behavioral Statistics, 47(3), 297–321. https://doi.org/10.3102/10769986211057160
Kam, C. C. S., & Meyer, J. P. (2015). How careless responding and acquiescence response bias can influence construct dimensionality: The case of job satisfaction. Organizational Research Methods, 18(3), 512–541. https://doi.org/10.1177/1094428115571894
Koutsogiorgi, C. C., & Michaelides, M. P. (2022). Response tendencies due to item wording using eye-tracking methodology accounting for individual differences and item characteristics. Behavior Research Methods, 54(5), 2252–2270. https://doi.org/10.3758/s13428-021-01719-x
Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological Methods, 17(3), 437–455. https://doi.org/10.1037/a0028085
Mokken, R. J. (1971). A theory and procedure of scale analysis. De Gruyter. https://doi.org/10.1515/9783110813203
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16(2), 159–176. https://doi.org/10.1177/014662169201600206
Niessen, A. S. M., Meijer, R. R., & Tendeiro, J. N. (2016). Detecting careless respondents in web-based questionnaires: Which method to use? Journal of Research in Personality, 63(1), 1–11. https://doi.org/10.1016/j.jrp.2016.04.010
Ou, X. (2022). Multidimensional structure or wording effect? Reexamination of the factor structure of the Chinese general self-efficacy scale. Journal of Personality Assessment, 104(1), 64–73. https://doi.org/10.1080/00223891.2021.1912059
Radloff, L. S. (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1(3), 385–401. https://doi.org/10.1177/014662167700100306
Schroeders, U., Schmidt, C., & Gnambs, T. (2022). Detecting careless responding in survey data using stochastic gradient boosting. Educational and Psychological Measurement, 82(1), 29–56. https://doi.org/10.1177/00131644211004708
Spiegelhalter, D. J., Thomas, A., Best, N., & Lunn, D. (2007). WinBUGS (Version 1.4.3) [Computer software]. MRC Biostatistics Unit, Institute of Public Health. https://www.mrc-bsu.cam.ac.uk/wp-content/uploads/manual14.pdf
Steinmann, I., Sánchez, D., van Laar, S., & Braeken, J. (2022). The impact of inconsistent responders to mixed-worded scales on inferences in international large-scale assessments. Assessment in Education: Principles, Policy & Practice, 29(1), 5–26. https://doi.org/10.1080/0969594X.2021.2005302
Sun, T., Zhang, B., Cao, M., & Drasgow, F. (2022). Faking detection improved: Adopting a Likert item response process tree model. Organizational Research Methods, 25(3), 490–512. https://doi.org/10.1177/10944281211002904
van Laar, S., & Braeken, J. (2022). Random responders in the TIMSS 2015 student questionnaire: A threat to validity? Journal of Educational Measurement, 59(4), 470–501. https://doi.org/10.1111/jedm.12317
Vermunt, J. K., & Magidson, J. (2016). Technical guide to Latent Gold 5.1: Basic, advanced, and syntax. Statistical Innovations.
Wang, C., & Xu, G. (2015). A mixture hierarchical model for response times and response accuracy. British Journal of Mathematical and Statistical Psychology, 68(3), 456–477. https://doi.org/10.1111/bmsp.12054
Wang, W.-C., Chen, H.-F., & Jin, K.-Y. (2015). Item response theory models for wording effects in mixed-format scales. Educational and Psychological Measurement, 75(1), 157–178. https://doi.org/10.1177/0013164414528209
Ward, M. K., Meade, A. W., Allred, C. M., Pappalardo, G., & Stoughton, J. W. (2017). Careless response and attrition as sources of bias in online survey assessments of personality traits and performance. Computers in Human Behavior, 76, 417–430. https://doi.org/10.1016/j.chb.2017.06.032
Wetzel, E., & Carstensen, C. H. (2014). Reversed thresholds in partial credit models: A reason for collapsing categories? Assessment, 21(6), 765–774. https://doi.org/10.1177/1073191114530775
Wind, S. A., & Wang, Y. (2022). Using Mokken scaling techniques to explore carelessness in survey research. Behavior Research Methods. Advance online publication. https://doi.org/10.3758/s13428-022-01960-y
Wise, S. L., & DeMars, C. E. (2006). An application of item response time: The effort-moderated IRT model. Journal of Educational Measurement, 43(1), 19–38. https://doi.org/10.1111/j.1745-3984.2006.00002.x
Woodworth, R. J., O’Brien-Malone, A., Diamond, M. R., & Schüz, B. (2018). Data from “Web-based positive psychology interventions: A reexamination of effectiveness.” Journal of Open Psychology Data, 6(1), 1. https://doi.org/10.5334/jopd.35
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jin, K.-Y., & Chiu, M. M. (2024). Modeling insufficient effort responses in mixed-worded scales. Behavior Research Methods, 56, 2260–2272. https://doi.org/10.3758/s13428-023-02146-w
DOI: https://doi.org/10.3758/s13428-023-02146-w