DAS 2006: Document Analysis Systems VII pp 153-163

# Structural Analysis of Mathematical Formulae with Verification Based on Formula Description Grammar

• Seiichi Toyota
• Seiichi Uchida
• Masakazu Suzuki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3872)

## Abstract

In this paper, a reliable and efficient structural analysis method for mathematical formulae is proposed for practical mathematical OCR. The proposed method consists of three steps. In the first step, a fast structural analysis algorithm is applied to each mathematical formula to obtain a tree representation of the formula. This step generally provides a correct tree representation but sometimes produces an erroneous one. Therefore, the tree representation is verified by the following two steps. In the second step, the result of the analysis step (i.e., a tree representation) is converted into a one-dimensional representation. The third step is a verification step in which the one-dimensional representation is parsed with a formula description grammar, which is a context-free grammar specialized for mathematical formulae. If the one-dimensional representation is not accepted by the grammar, the result of the analysis step is flagged as erroneous and reported to the OCR user. This three-step organization achieves reliable and efficient structural analysis without any two-dimensional grammars.
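The second and third steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the `Node` fields (`sup`, `sub`, `next`), the TeX-like token notation produced by `linearize`, and the toy grammar in `Recognizer` are hypothetical stand-ins, since the abstract does not specify the actual tree format or the formula description grammar.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One symbol of a formula tree (hypothetical field names)."""
    symbol: str
    sup: Optional["Node"] = None   # superscript subtree
    sub: Optional["Node"] = None   # subscript subtree
    next: Optional["Node"] = None  # next symbol on the same baseline


def linearize(node: Optional[Node]) -> List[str]:
    """Step 2 (sketch): flatten the tree into a one-dimensional token list."""
    tokens: List[str] = []
    while node is not None:
        tokens.append(node.symbol)
        for mark, child in (("_", node.sub), ("^", node.sup)):
            if child is not None:
                tokens += [mark, "{"] + linearize(child) + ["}"]
        node = node.next
    return tokens


OPERATORS = {"+", "-", "="}


class Recognizer:
    """Step 3 (sketch): recursive-descent recognizer for a toy grammar.

    expr   -> term (OPERATOR term)*
    term   -> factor factor*              (juxtaposition)
    factor -> atom (("^" | "_") "{" expr "}")*
    atom   -> alphanumeric symbol | "(" expr ")"
    """

    def __init__(self, tokens: List[str]):
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def eat(self, expected: Optional[str] = None) -> None:
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"unexpected {tok!r} at position {self.pos}")
        self.pos += 1

    def starts_atom(self) -> bool:
        tok = self.peek()
        return tok is not None and (tok == "(" or tok.isalnum())

    def expr(self) -> None:
        self.term()
        while self.peek() in OPERATORS:
            self.eat()
            self.term()

    def term(self) -> None:
        self.factor()
        while self.starts_atom():
            self.factor()

    def factor(self) -> None:
        self.atom()
        while self.peek() in ("^", "_"):
            self.eat()
            self.eat("{")
            self.expr()
            self.eat("}")

    def atom(self) -> None:
        if self.peek() == "(":
            self.eat("(")
            self.expr()
            self.eat(")")
        elif self.starts_atom():
            self.eat()
        else:
            raise SyntaxError(f"expected an atom at position {self.pos}")


def accepts(tokens: List[str]) -> bool:
    """Return True only if the grammar accepts the whole token list."""
    parser = Recognizer(tokens)
    try:
        parser.expr()
    except SyntaxError:
        return False
    return parser.pos == len(tokens)


# A plausible analysis result: x^2 + y_i
good = Node("x", sup=Node("2"), next=Node("+", next=Node("y", sub=Node("i"))))
# A plausible analysis error: the "2" was attached to "+" instead of "x"
bad = Node("x", next=Node("+", sup=Node("2"), next=Node("y")))

for tree in (good, bad):
    tokens = linearize(tree)
    print(" ".join(tokens), "->", "accepted" if accepts(tokens) else "flagged")
```

In this sketch the misattached superscript yields a token sequence in which an operator carries a script, which the toy grammar rejects, so the analysis result is flagged. Because verification runs on a flat token list, an ordinary one-dimensional context-free parser suffices.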

## Keywords

False Alarm, Analysis Step, Mathematical Formula, Tree Representation, Mathematical Symbol
These keywords were added by machine and not by the authors.

## Authors and Affiliations

• Seiichi Toyota (1)
• Seiichi Uchida (2)
• Masakazu Suzuki (3)

1. Graduate School of Mathematics, Kyushu University
2. Faculty of Information Science and Electrical Engineering, Kyushu University
3. Faculty of Mathematics, Kyushu University