A contextualized assessment of reliability and validity of student-initiated momentary self-reports during lectures

  • Research Article
  • Published in Educational Technology Research and Development

Abstract

The use of Experience Sampling Methods (ESM) to assess students’ experiences, motivation, and emotions by sending signals to students at random or fixed time points has grown due to recent technological advances. Such methods offer several advantages, such as capturing the construct in the moment (i.e., when the events are fresh in respondents’ minds) and providing a better understanding of the temporal and dynamic nature of the construct, and they are often considered to be more valid than retrospective self-reports. This article investigates the validity and reliability of a variant of the ESM, the DEBE feedback (an acronym for difficult, easy, boring, and engaging, pronounced ‘Debbie’), which captures student-driven (as and when the student wants to report) momentary self-reports of cognitive-affective states during a lecture. The DEBE feedback is collected through four buttons on the mobile phones or laptops used by students. We collected DEBE feedback on several video lectures (N = 722, 8 lectures) in different courses and examined the threats to validity and reliability. Our analysis revealed that variables such as student motivation, learning strategies, academic performance, and prior knowledge did not affect feedback-giving behavior. Monte Carlo simulations showed that, for a class size of 50 to 120, on average 30 students can provide representative and actionable feedback, and that the feedback tolerated up to 20% of students giving erroneous or biased feedback. The article discusses in detail these and other validity and reliability threats that need to be considered when working with such data. These findings, although specific to the DEBE feedback, are intended to supplement the momentary self-report literature, and the study is expected to provide a roadmap for establishing the validity and reliability of such novel data types.
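
The appendix figures below summarise the class response as “net engagement” and “net difficulty” curves derived from the four DEBE buttons. The following minimal sketch shows one way such curves could be computed from timestamped clicks; the one-minute binning and the “engaging minus boring” / “difficult minus easy” definitions are assumptions made for illustration, not the paper’s documented procedure.

```python
# Illustrative sketch only: the one-minute binning and the net-count definitions
# below are assumptions for exposition, not necessarily the procedure used in
# the paper.
from collections import Counter
from typing import Dict, List, Tuple

def net_curves(clicks: Dict[str, List[float]], lecture_minutes: int) -> Tuple[List[int], List[int]]:
    """Compute per-minute net engagement and net difficulty curves.

    `clicks` maps each state ('difficult', 'easy', 'boring', 'engaging') to the
    pooled list of click timestamps, in minutes from the start of the lecture.
    """
    def binned(state: str) -> List[int]:
        counts = Counter(int(t) for t in clicks.get(state, []) if 0 <= t < lecture_minutes)
        return [counts.get(m, 0) for m in range(lecture_minutes)]

    engaging, boring = binned("engaging"), binned("boring")
    difficult, easy = binned("difficult"), binned("easy")
    net_engagement = [e - b for e, b in zip(engaging, boring)]
    net_difficulty = [d - e for d, e in zip(difficult, easy)]
    return net_engagement, net_difficulty

# Example: a 3-minute clip with a burst of 'engaging' clicks in minute 1
example = {"engaging": [1.2, 1.6, 1.9], "boring": [0.4], "difficult": [2.1], "easy": []}
print(net_curves(example, lecture_minutes=3))  # ([-1, 3, 0], [0, 0, 1])
```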


Data Availability

Data will be made available on request.

Acknowledgements

The authors would like to thank students from IDP in Educational Technology, specifically Vishwas Badhe, Amit Paikrao, Raj Gubrele, Spruha Satavlekar, and Sumitra Sadhukhan, for their support in data collection. This work was supported by an IIT Bombay internal research grant, RD/0517-IRCCSH0-001, to Ritayan Mitra.

Author information

Corresponding author

Correspondence to Ritayan Mitra.

Ethics declarations

Conflict of interest

There are no potential conflicts of interest arising from the research.

Ethical approval

The research described in this paper was given ethical clearance by the IITB Institute Review Board (IITB-IRB-2021-037).

Informed consent

Informed consent was obtained from students.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

See Figs. 11, 12, 13, 14 and 15.

Fig. 11

a At n = 19, the ensemble of curves captured most of the features of the original feedback, and b at n = 29, the ensemble accurately reproduced the original feedback. Hence, in this case, 19 to 29 students can provide feedback representative of the class. (Lecture: Demonstration of Java Programs, Course: Java Programming)

Fig. 12

a At n = 23, the ensemble of curves captured most of the features of the original feedback, and b at n = 33, the ensemble accurately reproduced the original feedback. Hence, in this case, 23 to 33 students can provide feedback representative of the class. (Lecture: Java Programming Steps, Course: Java Programming)

Fig. 13

a At n = 27, the ensemble of curves captured most of the features of the original feedback, and b at n = 37, the ensemble accurately reproduced the original feedback. Hence, in this case, 27 to 37 students can provide feedback representative of the class. (Lecture: Insertion Sort, Course: Analysis of Algorithms-A)

Fig. 14

a At n = 23, the ensemble of curves captured most of the features of the original feedback, and b at n = 33, the ensemble accurately reproduced the original feedback. Hence, in this case, 23 to 33 students can provide feedback representative of the class. (Lecture: Graph Coloring Problem, Course: Analysis of Algorithms-B)

Fig. 15

a At n = 26, the ensemble of curves captured most of the features of the original feedback, and b at n = 36, the ensemble accurately reproduced the original feedback. Hence, in this case, 26 to 36 students can provide feedback representative of the class. (Lecture: Segmentation and Paging, Course: Operating System)
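
Figures 11–15 were generated by repeatedly subsampling the class and comparing the ensemble of subsample curves with the full-class feedback. The sketch below shows one way such a Monte Carlo subsampling experiment could be set up; the per-student click format, the 100 draws per subsample size, and the use of Pearson correlation as the similarity criterion are illustrative assumptions rather than the authors’ exact procedure.

```python
# Illustrative Monte Carlo subsampling sketch in the spirit of Appendix 1.
# The similarity criterion (Pearson correlation), the number of draws, and the
# net-curve definition are assumptions made here for exposition.
import random
import statistics
from typing import Dict, List

Student = Dict[str, List[float]]  # state -> click timestamps (minutes)

def net_curve(students: List[Student], lecture_minutes: int,
              positive: str = "engaging", negative: str = "boring") -> List[int]:
    """Per-minute net count (positive minus negative clicks), summed over students."""
    curve = [0] * lecture_minutes
    for clicks in students:
        for t in clicks.get(positive, []):
            if 0 <= t < lecture_minutes:
                curve[int(t)] += 1
        for t in clicks.get(negative, []):
            if 0 <= t < lecture_minutes:
                curve[int(t)] -= 1
    return curve

def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def mean_subsample_similarity(students: List[Student], n: int,
                              lecture_minutes: int, draws: int = 100) -> float:
    """Average similarity between size-n subsample curves and the full-class curve."""
    full = net_curve(students, lecture_minutes)
    sims = [pearson(net_curve(random.sample(students, n), lecture_minutes), full)
            for _ in range(draws)]
    return statistics.mean(sims)
```

Scanning n upward until the mean similarity stops improving noticeably mirrors the visual judgment used to read Figs. 11–15.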

Appendix 2

See Figs. 16, 17, 18 and 19.

Fig. 16

Lecture: Demonstration of Java Programs. a The original net engagement curve (blue) remains well bounded by the ensemble of curves even after adding 5 dummy students, indicating that the ensemble follows a structure similar to the original curve. b With the addition of 10 dummy students, the ensemble of curves between 5 and 10 min shifted slightly downwards (red box), but there was no sign of a significant drop or shift in the simulated curves. c With 15 dummy students, however, the ensemble of curves between 5 and 10 min moved further downward, and the original peak clearly fell out of the ensemble (red box) and appears flattened; the peak to its left (see * and the red line) showed a slight temporal shift towards the right. d The original peak between 5 and 10 min (red box) has clearly fallen out of the ensemble, whose structure indicates that the peak has disappeared, and the peak on the left (see * and the red line) showed a clear temporal shift towards the right. Hence, the breakdown occurred at around 15 dummy students; a more conservative estimate would place it at 10. (Point of breakdown: 10) (Color figure online)

Fig. 17

Lecture: Java Programming Steps, Course: Java Programming. The original net difficulty curve began to fall out of the ensemble of curves (highlighted with a blue box) with the addition of 15 dummy students. With the addition of 20 dummy students, the net difficulty curve clearly fell out of the ensemble and also shifted slightly towards the left. In the case of net engagement, small new peaks began forming between 20 and 25 min (red box), which became evident with the addition of 20 dummy students. Moreover, a shift in the ensemble of curves for the net engagement peak between 13 and 18 min became clear with the addition of 20 dummy students. (Point of breakdown: 15) (Color figure online)

Fig. 18

Lecture: Insertion Sort, Course: Analysis of Algorithms-A. With the addition of 15 dummy students, the ensemble of curves started to become somewhat dispersed (i.e., clear features of the original curve, such as its peaks and trends, were no longer visible in the ensemble). With 20 dummy students, the ensemble became too dispersed and new features formed, such as a small new peak towards the end of the net difficulty curve. Slight shifts were also observed in some peaks, such as the first peak in net difficulty and the peaks towards the end. (Point of breakdown: 15)

Fig. 19

Lecture: Segmentation and Paging, Course: Operating Systems. The ensemble of net engagement curves started to drop significantly (i.e., the two original net engagement peaks started to fall out of the ensemble) with the addition of 20 dummy students. With the addition of 25 dummy students, the two peaks had completely fallen out of the ensemble. (Point of breakdown: 20)
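
Figures 16–19 add progressively more “dummy” (randomly clicking) students to the real data and check when features of the original curve fall out of the simulated ensemble. The sketch below illustrates one way such a robustness check could be simulated; how a dummy student is generated, the number of runs, and the min–max envelope test are assumptions made for illustration only.

```python
# Illustrative dummy-student robustness check in the spirit of Appendix 2.
# How a "dummy" student is generated, the number of simulation runs, and the
# min-max envelope test are assumptions made here for exposition.
import random
from typing import Dict, List, Tuple

STATES = ["difficult", "easy", "boring", "engaging"]
Student = Dict[str, List[float]]  # state -> click timestamps (minutes)

def net_curve(students: List[Student], lecture_minutes: int,
              positive: str = "engaging", negative: str = "boring") -> List[int]:
    """Per-minute net count (same helper as in the Appendix 1 sketch)."""
    curve = [0] * lecture_minutes
    for clicks in students:
        for t in clicks.get(positive, []):
            if 0 <= t < lecture_minutes:
                curve[int(t)] += 1
        for t in clicks.get(negative, []):
            if 0 <= t < lecture_minutes:
                curve[int(t)] -= 1
    return curve

def make_dummy_student(lecture_minutes: int, n_clicks: int = 10) -> Student:
    """A student who clicks uniformly at random, both in time and across states."""
    clicks: Student = {s: [] for s in STATES}
    for _ in range(n_clicks):
        clicks[random.choice(STATES)].append(random.random() * lecture_minutes)
    return clicks

def dummy_envelope(real_students: List[Student], n_dummies: int,
                   lecture_minutes: int, runs: int = 50) -> Tuple[List[int], List[int]]:
    """Min-max envelope of net engagement curves after adding n_dummies random students."""
    ensemble = []
    for _ in range(runs):
        dummies = [make_dummy_student(lecture_minutes) for _ in range(n_dummies)]
        ensemble.append(net_curve(real_students + dummies, lecture_minutes))
    lower = [min(c[m] for c in ensemble) for m in range(lecture_minutes)]
    upper = [max(c[m] for c in ensemble) for m in range(lecture_minutes)]
    return lower, upper

def original_still_bounded(real_students: List[Student], lecture_minutes: int,
                           n_dummies: int) -> bool:
    """True if the full-class curve stays inside the simulated envelope (no breakdown yet)."""
    original = net_curve(real_students, lecture_minutes)
    lower, upper = dummy_envelope(real_students, n_dummies, lecture_minutes)
    return all(lo <= o <= hi for o, lo, hi in zip(original, lower, upper))
```

Increasing n_dummies until original_still_bounded first returns False gives a rough, automated analogue of the visual breakdown points reported in Figs. 16–19.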

Appendix 3

Instructions for students on providing DEBE feedback:

  • The feedback is entirely voluntary and anonymous. You can click whenever you feel any of the four states (i.e., Difficult, Easy, Boring, and Engaging) and as often as you like.

  • Give feedback only when you feel that one of the four buttons describes how you feel. It is okay to click a few seconds late, but do not click randomly; random clicks will not help you or your peers.

  • Do not give feedback for the topic or the entire lecture at the end. For example, if you find the topic or lecture engaging in general, do not click the engaging button at the end to reflect your overall feeling. Instead, try to click engaging only at times you feel engaged.

  • It is okay to report two cognitive-affective states if they are experienced simultaneously (e.g., if you experience difficulty and engagement at the same time or with the same concept, you can report both states).
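
For context, each button press is an anonymous, timestamped event tied to a point in the lecture. The sketch below shows how such a click might be logged; the field names, the CSV file, and the session identifier are hypothetical illustrations, not the actual DEBE/Tcherly implementation.

```python
# Hypothetical sketch of logging one DEBE button press. The field names, the
# CSV storage, and the anonymous session identifier are illustrative
# assumptions, not the actual DEBE/Tcherly implementation.
import csv
import time
import uuid
from pathlib import Path

VALID_STATES = {"difficult", "easy", "boring", "engaging"}
LOG_FILE = Path("debe_clicks.csv")  # assumed local log for this sketch

def log_click(session_id: str, lecture_id: str, state: str, video_time_s: float) -> None:
    """Append one anonymous, timestamped button press to the log."""
    if state not in VALID_STATES:
        raise ValueError(f"Unknown DEBE state: {state}")
    new_file = not LOG_FILE.exists()
    with LOG_FILE.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["session_id", "lecture_id", "state", "video_time_s", "wall_clock"])
        writer.writerow([session_id, lecture_id, state, round(video_time_s, 1), int(time.time())])

# Example: an anonymous student presses 'engaging' 312 s into a lecture video
log_click(uuid.uuid4().hex, "java-programming-steps", "engaging", 312.0)
```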

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chavan, P., Mitra, R., Sarkar, A. et al. A contextualized assessment of reliability and validity of student-initiated momentary self-reports during lectures. Education Tech Research Dev 72, 503–539 (2024). https://doi.org/10.1007/s11423-023-10304-2
