Using Lexical Properties of Handwritten Equations to Estimate the Correctness of Students’ Solutions to Engineering Problems

  • Thomas F. Stahovich
  • Hanlung Lin
  • Justin Gyllen


We present a technique that examines the handwritten equations in a student’s solution to an engineering problem and uses them to estimate the correctness of the work. More specifically, we demonstrate that lexical properties of the equations correlate with the grade a human grader would assign. We characterize these properties with a set of features that include the number of occurrences of various classes of symbols and of binary and tripartite sequences of those classes. Support vector machine (SVM) regression models trained with these features achieved a correlation of r = .433 (p < .001) on a combined set of six exam problems. Prior work suggests that the number of long pauses in the writing that occur as a student solves a problem correlates with correctness. We found that combining this pause feature with our lexical features produced more accurate predictions than using either type of feature alone. SVM regression models trained using an optimized subset of three lexical features and the pause feature achieved an average correlation with grade across the six problems of r = .503 (p < .001). These techniques are an important step toward creating systems that can automatically assess handwritten coursework.
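As a rough illustration of the kind of lexical features described above, the following sketch counts symbol classes and their two- and three-element sequences, and computes the Pearson correlation used to compare predictions with grades. It is not the authors' implementation: the class labels (`var`, `eq`), the function names, and the reading of "binary and tripartite sequences" as adjacent pairs and triples are all assumptions made for illustration. The SVM regression step itself is omitted.

```python
from collections import Counter
from math import sqrt


def lexical_features(symbol_classes):
    """Count single symbol classes, adjacent pairs, and adjacent triples.

    `symbol_classes` is a list of class labels, one per recognized symbol.
    The labels themselves are hypothetical; the abstract does not list them.
    """
    features = Counter()
    n = len(symbol_classes)
    for size in (1, 2, 3):
        for i in range(n - size + 1):
            features[tuple(symbol_classes[i:i + size])] += 1
    return features


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical class sequence for an equation such as "F = m a".
example = ["var", "eq", "var", "var"]
feats = lexical_features(example)
```

In a full pipeline, count vectors like `feats` would be assembled into a fixed-length feature matrix and passed to a regression model, and `pearson_r` would compare the model's predicted scores with the grades assigned by a human grader.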


Keywords: Educational data mining, Digital ink, Problem solving, Handwritten equations, Smartpen



This material is based upon work supported by the National Science Foundation under Award Numbers 0935239, 1432820, and 1612511. Livescribe, Inc. provided some materials used in the project. We thank Daniel Jeske for his suggestions about some of the statistical analysis.



Copyright information

© International Artificial Intelligence in Education Society 2019

Authors and Affiliations

  1. Department of Mechanical Engineering, University of California, Riverside, USA
  2. Department of Computer Science and Engineering, University of California, Riverside, USA
  3. Amazon, Seattle, USA
  4. Department of Mechanical Engineering, University of California, Riverside, USA
