Towards a Parser for Mathematical Formula Recognition

  • Amar Raja
  • Matthew Rayner
  • Alan Sexton
  • Volker Sorge
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4108)


Transferring mathematical knowledge from paper to electronic form requires reliable automatic analysis and understanding of mathematical texts. A robust system for this task needs to combine low-level character recognition with higher-level structural analysis of mathematical formulas. We present progress towards this goal by extending a database-driven optical character recognition system for mathematics with two high-level analysis features. The first extends and enhances the traditional approach of projection profile cutting. The second aims to integrate the recognition process with graph grammar rewriting by supporting the interactive construction and validation of grammar rules. Both approaches can be successfully employed to enhance the capabilities of our system to recognise and reconstruct compound mathematical expressions.
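
As background to the first feature, the traditional form of projection profile cutting can be sketched as follows. This is a minimal illustration of the textbook technique only, not the extended method presented in the paper: a binary image is split at blank gaps in its vertical projection, each piece is then split at gaps in its horizontal projection, and so on until no further cut is possible. The names `profile`, `runs` and `cut`, the list-of-lists image format, and the `(top, left, bottom, right)` box convention are all assumptions made for this example.

```python
from typing import List, Tuple

Image = List[List[int]]          # assumed format: 1 = ink pixel, 0 = background
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def profile(img: Image, box: Box, vertical: bool) -> List[int]:
    """Count ink pixels per column (vertical=True) or per row inside box."""
    top, left, bottom, right = box
    if vertical:
        return [sum(img[r][c] for r in range(top, bottom)) for c in range(left, right)]
    return [sum(img[r][c] for c in range(left, right)) for r in range(top, bottom)]


def runs(prof: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of maximal non-zero runs in a profile."""
    out, start = [], None
    for i, value in enumerate(prof):
        if value and start is None:
            start = i
        elif not value and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(prof)))
    return out


def cut(img: Image, box: Box, vertical: bool = True, failed: bool = False) -> List[Box]:
    """Recursively split box at blank gaps, alternating the cut direction.

    `failed` records that the previous direction found no gap, so that the
    recursion stops once neither direction can split the region further.
    """
    top, left, bottom, right = box
    segments = runs(profile(img, box, vertical))
    if not segments:
        return []  # region contains no ink at all
    if vertical:
        parts = [(top, left + s, bottom, left + e) for s, e in segments]
    else:
        parts = [(top + s, left, top + e, right) for s, e in segments]
    if len(parts) == 1:
        if failed:
            return [parts[0]]  # no gap in either direction: a leaf region
        return cut(img, parts[0], not vertical, True)
    result: List[Box] = []
    for part in parts:
        result.extend(cut(img, part, not vertical, False))
    return result


# A toy "fraction": two marks above a blank row, one bar below it.
fraction = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
]
boxes = cut(fraction, (0, 0, 3, 3))
```

On the toy image the vertical projection has no gap, so the first successful cut is horizontal (separating the top row from the bar), after which the top row is cut vertically into its two marks. Plain profile cutting of this kind cannot separate symbols whose projections overlap in both directions, such as the contents of a square root from the radical sign.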


Optical Character Recognition, High Level Feature, Vertical Projection, Graph Grammar, Horizontal Projection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Amar Raja (1)
  • Matthew Rayner (1)
  • Alan Sexton (1)
  • Volker Sorge (1)
  1. School of Computer Science, University of Birmingham, UK
