Database-Driven Mathematical Character Recognition

  • Alan Sexton
  • Volker Sorge
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3926)


We present an approach for recognising mathematical texts using an extensive LaTeX symbol database and a novel recognition algorithm. The process consists essentially of three steps: recognising the individual characters in a mathematical text by relating them to glyphs in the database of symbols, analysing the recognised glyphs to determine the closest corresponding LaTeX symbol, and reassembling the text by putting the appropriate LaTeX commands at their corresponding positions of the original text inside a LaTeX picture environment. The recogniser itself is based on a novel variation on the application of geometric moment invariants. The working system is implemented in Java.
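
As an illustration of what geometric moment invariants are, the following is a minimal Java sketch that computes the first two classical Hu invariants of a binary glyph image. It is not the authors' recogniser: the abstract does not specify their variation, so the class name `MomentSketch`, the helper `rectangle`, and the choice of these two invariants as the feature vector are all assumptions made for illustration only.

```java
// Illustrative sketch only: classical Hu moment invariants on a binary image.
// The class and method names are hypothetical, not taken from the paper.
final class MomentSketch {

    /** Raw geometric moment m_pq: sum of x^p * y^q over foreground pixels. */
    static double rawMoment(boolean[][] img, int p, int q) {
        double sum = 0.0;
        for (int y = 0; y < img.length; y++) {
            for (int x = 0; x < img[y].length; x++) {
                if (img[y][x]) {
                    sum += Math.pow(x, p) * Math.pow(y, q);
                }
            }
        }
        return sum;
    }

    /**
     * Central moment mu_pq, taken about the centroid, which makes it
     * independent of where the glyph sits in the image.
     * Assumes the image has at least one foreground pixel.
     */
    static double centralMoment(boolean[][] img, int p, int q) {
        double m00 = rawMoment(img, 0, 0);
        double cx = rawMoment(img, 1, 0) / m00;
        double cy = rawMoment(img, 0, 1) / m00;
        double sum = 0.0;
        for (int y = 0; y < img.length; y++) {
            for (int x = 0; x < img[y].length; x++) {
                if (img[y][x]) {
                    sum += Math.pow(x - cx, p) * Math.pow(y - cy, q);
                }
            }
        }
        return sum;
    }

    /** Normalised central moment eta_pq = mu_pq / mu_00^(1 + (p + q) / 2). */
    static double eta(boolean[][] img, int p, int q) {
        double mu00 = centralMoment(img, 0, 0);
        return centralMoment(img, p, q) / Math.pow(mu00, 1.0 + (p + q) / 2.0);
    }

    /** Feature vector made of the first two Hu invariants. */
    static double[] features(boolean[][] img) {
        double n20 = eta(img, 2, 0);
        double n02 = eta(img, 0, 2);
        double n11 = eta(img, 1, 1);
        double phi1 = n20 + n02;
        double phi2 = (n20 - n02) * (n20 - n02) + 4.0 * n11 * n11;
        return new double[] {phi1, phi2};
    }

    /** Hypothetical test helper: a filled h-by-w rectangle at (top, left). */
    static boolean[][] rectangle(int height, int width, int top, int left, int h, int w) {
        boolean[][] img = new boolean[height][width];
        for (int y = top; y < top + h; y++) {
            for (int x = left; x < left + w; x++) {
                img[y][x] = true;
            }
        }
        return img;
    }

    public static void main(String[] args) {
        double[] a = features(rectangle(20, 20, 1, 1, 3, 5));
        double[] b = features(rectangle(20, 20, 10, 8, 3, 5));
        // The same shape at two positions should give (nearly) equal features.
        System.out.println(Math.abs(a[0] - b[0]) + " " + Math.abs(a[1] - b[1]));
    }
}
```

In a database-driven setting, a vector like this could serve as a key for a nearest-neighbour search over stored glyphs. How the actual system builds and compares its feature vectors is described in the paper itself, not in this sketch.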


Feature Vector, Character Recognition, Optical Character Recognition, Font Size, Mathematical Text
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Alan Sexton (1)
  • Volker Sorge (1)
  1. School of Computer Science, University of Birmingham, UK
