Abstract
How feelings change over time is a central topic in emotion research. To study these affective fluctuations, researchers often ask participants to repeatedly indicate how they feel on a self-report rating scale. Despite widespread recognition that this kind of data is subject to measurement error, the extent of this error remains an open question. Complementing the many daily-life studies on this topic, this study investigated the question in an experimental setting. In such a setting, trials follow each other at a fast pace, forcing experimenters to use a limited number of questions to measure affect during each trial. A total of 1398 participants completed a probabilistic reward task in which, unknown to them, they were presented with the same string of outcomes multiple times throughout the study. This allowed us to assess the test–retest consistency of their affective responses on the rating scales under investigation. We then compared these consistencies across different types of rating scales to determine whether a given type of scale led to greater consistency of affective measurements. Overall, we found moderate to good consistency of the affective measurements. Surprisingly, however, we found no differences in consistency across rating scales, which suggests that the specific rating scale used does not influence measurement consistency.
Data availability
The interested reader can find the data and experimental code in the GitLab repository of this study: https://gitlab.kuleuven.be/ppw-okpiv/researchers/u0123135/affective-consistency.
Notes
The absence of these data was double-checked. On the Prolific page of our study, we found that a total of 1416 individuals completed the experiment. In the database, however, we received data for only 1412 of these, four of which were only partial.
This assignment was performed in the following way: We generated a number from the exact time at which the participant clicked our study link, where time was put in the format HH:MM:SS.MS. Then we computed the number’s remainder after dividing by six. The result of this procedure was one of six possible numbers, which then determined the condition to which the participant was assigned.
Note that in the following set of equations, we use commas in the subscripts to distinguish denotations (e.g., parameter numbers and names for the variance components) from running indices (e.g., person, sequence, and time).
Acknowledgements
We would like to thank Peter Kuppens for his help at the early stages of this study. We would furthermore like to thank Nena Lathouwers, who helped us with executing a pilot study and analyzed the data that came out of it. We would also like to thank Kenny Yu for giving his opinion on earlier drafts of this paper. The analyses in this work were performed using resources and services of the VSC (Flemish Supercomputer Center), funded by the Research Foundation – Flanders (FWO) and the Flemish Government.
Open practice statement
In line with suggestions of the Open Science movement, we preregistered this study. This preregistration can be found on the Open Science Framework: https://osf.io/sytrn. We furthermore preregistered code, which can be found under the preregistered tag on the GitLab page of this study. As stated earlier, data and materials can also be found on this same page.
Code availability
Readers can find the code for the analyses on the same GitLab page, repeated here: https://gitlab.kuleuven.be/ppw-okpiv/researchers/u0123135/affective-consistency.
Author information
Contributions
NV and FT conceptualized the study together. NV conceptualized and performed the analyses with valuable help from SV, AM, WV, and FT. NV, SV, AM, WV, and FT all wrote and reviewed the article.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Funding
This work was supported by the Research Fund of the KU Leuven (Grant C14/19/054) and by the Fonds Wetenschappelijk Onderzoek (FWO; Grant G074219N). The funders had no role in study design, data collection, analyses, decision to publish, or preparation of the manuscript.
Ethical approval
As stated in the article, this study was approved by the local ethics committee at the Psychological department of the KU Leuven (the Social and Societal Ethics Committee) under case number G-2021-3228. The study was performed in accordance with the ethical standards as laid out in the 1964 Declaration of Helsinki.
Consent to participate
As stated in the article, participants signed an informed consent form before participating in our study.
Consent to publish
Within the informed consent, participants were informed of our intention to publish the results of the study. Participants consented to the submission of this study’s results for publication.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Vanhasbroeck, N., Vanbelle, S., Moors, A. et al. Chasing consistency: On the measurement error in self-reported affect in experiments. Behav Res (2023). https://doi.org/10.3758/s13428-023-02290-3
DOI: https://doi.org/10.3758/s13428-023-02290-3