Using Lexical Properties of Handwritten Equations to Estimate the Correctness of Students’ Solutions to Engineering Problems

A Correction to this article was published on 13 January 2020

This article has been updated


We present a technique that examines handwritten equations from a student’s solution to an engineering problem and from this estimates the correctness of the work. More specifically, we demonstrate that lexical properties of the equations correlate with the grade a human grader would assign. We characterize these properties with a set of features that include the number of occurrences of various classes of symbols and binary and tripartite sequences of them. Support vector machine (SVM) regression models trained with these features achieved a correlation of r = .433 (p < .001) on a combined set of six exam problems. Prior work suggests that the number of long pauses in the writing that occur as a student solves a problem correlates with correctness. We found that combining this pause feature with our lexical features produced more accurate predictions than using either type of feature alone. SVM regression models trained using an optimized subset of three lexical features and the pause feature achieved an average correlation with grade across the six problems of r = .503 (p < .001). These techniques are an important step toward creating systems that can automatically assess handwritten coursework.
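As a rough illustration of the kind of features described above, the following minimal sketch counts single symbol classes and two- and three-element sequences of them, and counts long pauses between pen strokes. It is not the authors' implementation: the class inventory in `classify`, the reading of "sequences" as sequences of symbol classes, the function names, and the 10-second pause threshold are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def classify(symbol: str) -> str:
    """Map a recognized symbol to a coarse class (hypothetical class set)."""
    if symbol.isdigit():
        return "digit"
    if symbol.isalpha():
        return "letter"
    if symbol in "+-*/":
        return "operator"
    if symbol == "=":
        return "equals"
    return "other"


def ngram_counts(classes: Sequence[str], n: int) -> Counter:
    """Count every contiguous run of n symbol classes."""
    return Counter(tuple(classes[i:i + n]) for i in range(len(classes) - n + 1))


def lexical_features(symbols: Sequence[str]) -> Dict[Tuple[str, ...], int]:
    """Counts of single classes plus two- and three-element class sequences."""
    classes = [classify(s) for s in symbols]
    features: Counter = Counter()
    for n in (1, 2, 3):
        features.update(ngram_counts(classes, n))
    return dict(features)


def count_long_pauses(timestamps: Sequence[float], threshold_s: float = 10.0) -> int:
    """Count gaps between consecutive stroke times that exceed threshold_s.

    The threshold is an assumed value, not one taken from the article.
    """
    return sum(1 for a, b in zip(timestamps, timestamps[1:]) if b - a > threshold_s)


if __name__ == "__main__":
    feats = lexical_features(list("F=m*a"))
    for key, count in sorted(feats.items()):
        print(key, count)
    print("long pauses:", count_long_pauses([0.0, 2.0, 15.0, 16.0, 40.0]))
```

A feature vector built this way, with the pause count appended, could then be passed to any SVM regression implementation and the predictions compared with human-assigned grades using a correlation coefficient.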

Change history

  • 13 January 2020

    In this issue, the citation information on the opening page of each article HTML was updated to read “International Journal of Artificial Intelligence in Education December 2019…,” not “International Journal of Artificial Intelligence in Education December 2000...”



This material is based upon work supported by the National Science Foundation under Award Numbers 0935239, 1432820, and 1612511. Livescribe, Inc. provided some materials used in the project. We thank Daniel Jeske for his suggestions about some of the statistical analysis.

Author information



Corresponding author

Correspondence to Thomas F. Stahovich.

About this article

Cite this article

Stahovich, T.F., Lin, H. & Gyllen, J. Using Lexical Properties of Handwritten Equations to Estimate the Correctness of Students’ Solutions to Engineering Problems. Int J Artif Intell Educ 29, 459–483 (2019).


  • Educational data mining
  • Digital ink
  • Problem solving
  • Handwritten equations
  • Smartpen