EMERS: a tree matching–based performance evaluation of mathematical expression recognition systems
This paper addresses the performance evaluation of mathematical expression recognition systems. The proposed method assumes that expressions (both the input and the recognition output) are coded in MathML or in TeX/LaTeX, which is converted into MathML. Since any MathML representation has a tree structure, performance evaluation is modeled as a tree-matching problem. The tree for the expression generated by the recognizer is compared with the ground-truth tree by comparing their corresponding Euler strings. The changes required to convert the recognizer's tree into the ground-truth tree are recorded, and the number of such changes is taken as the distance between the two trees. This distance serves as the performance measure for the system under test. The proposed algorithm also pinpoints the positions of the changes in the output MathML file. The evaluation method is tested on a set of ground-truth expressions and the corresponding results produced by an expression recognition system.
Keywords: OCR, Mathematical expressions, Performance evaluation, Tree matching, Euler strings
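The idea summarized in the abstract, comparing two MathML trees through their Euler strings, can be illustrated with a minimal sketch. This is not the paper's algorithm: the token scheme (one "down" and one "up" token per node, with leaf text folded into the label), the unit edit costs, and the helper names `euler_string`, `edit_distance`, and `expression_distance` are all assumptions made here for illustration.

```python
import xml.etree.ElementTree as ET


def euler_string(node):
    """Return an Euler-tour token sequence for an XML element tree.

    Assumption: each node contributes a ("down", label) token when it is
    entered and an ("up", label) token when it is left; the label is the
    tag, plus the text content for nodes that carry text.
    """
    text = (node.text or "").strip()
    label = f"{node.tag}:{text}" if text else node.tag
    tokens = [("down", label)]
    for child in node:
        tokens.extend(euler_string(child))
    tokens.append(("up", label))
    return tokens


def edit_distance(a, b):
    """Unit-cost Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def expression_distance(truth_mathml, output_mathml):
    """Distance between recognizer output and ground truth (hypothetical helper)."""
    truth = euler_string(ET.fromstring(truth_mathml))
    output = euler_string(ET.fromstring(output_mathml))
    return edit_distance(output, truth)


if __name__ == "__main__":
    truth = "<math><msup><mi>x</mi><mn>2</mn></msup></math>"
    output = "<math><msup><mi>x</mi><mi>z</mi></msup></math>"
    print(expression_distance(truth, output))
```

Under this token scheme a single mislabeled node changes both its "down" and its "up" token, so a distance computed this way counts token edits rather than node edits; a real evaluation would need to define how string edits map back to tree changes and their positions.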