Advancing the state of the art for handwritten math recognition: the CROHME competitions, 2011–2014

  • Harold Mouchère
  • Richard Zanibbi
  • Utpal Garain
  • Christian Viard-Gaudin
Original Paper

Abstract

The CROHME competitions have helped organize the field of handwritten mathematical expression recognition. This paper presents the evolution of the competition over its first four years and its contributions to handwritten math recognition and, more generally, to structural pattern recognition research. The competition protocol, evaluation metrics, and datasets are presented in detail. Participating systems are analyzed and compared in terms of the central mathematical expression recognition tasks: (1) symbol segmentation, (2) classification of individual symbols, (3) symbol relationships, and (4) structural analysis (parsing). The competition led to the development of label graphs, which allow recognition results with conflicting segmentations to be directly compared and quantified using Hamming distances. We introduce structure confusion histograms, which provide frequencies for incorrect subgraphs corresponding to ground-truth label subgraphs of a given size, and present these histograms for symbol bigrams (two symbols with a relationship) for CROHME 2014 systems. We provide a novel analysis combining results from competing systems at the level of individual strokes and stroke pairs; this virtual merging of system outputs allows us to examine more closely the limitations of current state-of-the-art systems. Datasets, along with evaluation and visualization tools produced for the competition, are publicly available.
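
To illustrate how a label graph makes results with conflicting segmentations comparable, the following is a minimal sketch under simplifying assumptions, not the competition's evaluation tools. It assumes each stroke carries one symbol label, each ordered stroke pair carries one edge label, "_" stands for "no relationship", and "*" marks two strokes of the same symbol. The function name, the label vocabulary, and the toy data are hypothetical.

```python
from itertools import permutations

NO_REL = "_"  # assumed placeholder label for "no relationship"


def label_graph_hamming(strokes, nodes_a, edges_a, nodes_b, edges_b):
    """Count label disagreements between two label graphs over the same strokes.

    nodes_*: dict mapping stroke id -> symbol label
    edges_*: dict mapping (stroke id, stroke id) -> edge label;
             missing pairs are treated as NO_REL
    Returns (node disagreements, edge disagreements, total).
    """
    node_errors = sum(1 for s in strokes if nodes_a.get(s) != nodes_b.get(s))
    edge_errors = sum(
        1
        for pair in permutations(strokes, 2)
        if edges_a.get(pair, NO_REL) != edges_b.get(pair, NO_REL)
    )
    return node_errors, edge_errors, node_errors + edge_errors


# Toy ground truth: strokes s1 and s2 form one "x", stroke s3 is a superscript "2".
strokes = ["s1", "s2", "s3"]
truth_nodes = {"s1": "x", "s2": "x", "s3": "2"}
truth_edges = {
    ("s1", "s2"): "*", ("s2", "s1"): "*",      # same-symbol (segmentation) edges
    ("s1", "s3"): "Sup", ("s2", "s3"): "Sup",  # every stroke of "x" points to "2"
}

# Toy system output: "x" is split into ")" and "(", a conflicting segmentation.
output_nodes = {"s1": ")", "s2": "(", "s3": "2"}
output_edges = {("s1", "s2"): "Right", ("s2", "s3"): "Sup"}

result = label_graph_hamming(strokes, truth_nodes, truth_edges,
                             output_nodes, output_edges)
print(result)  # (2, 3, 5)
```

Because both graphs are defined over the same strokes, the comparison needs no alignment between symbols: the split "x" simply shows up as two node disagreements and three edge disagreements.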

Keywords

Handwriting recognition; Mathematical expression recognition; Competitions; Performance evaluation

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Harold Mouchère (1)
  • Richard Zanibbi (2)
  • Utpal Garain (3)
  • Christian Viard-Gaudin (1)

  1. LUNAM/University of Nantes/IRCCyN, Nantes, France
  2. Department of Computer Science, Rochester Institute of Technology, Rochester, USA
  3. Computer Vision and Pattern Recognition Unit (CVPRU), Indian Statistical Institute, Kolkata, India
