Recognition and retrieval of mathematical expressions

  • Original Paper
  • Published:
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Document recognition and retrieval technologies complement one another, providing improved access to increasingly large document collections. While recognition and retrieval of textual information are fairly mature, with widespread availability of optical character recognition and text-based search engines, recognition and retrieval of graphics such as images, figures, tables, diagrams, and mathematical expressions are in comparatively early stages of research. This paper surveys the state of the art in recognition and retrieval of mathematical expressions, organized around four key problems in math retrieval (query construction, normalization, indexing, and relevance feedback), and four key problems in math recognition (detecting expressions, detecting and classifying symbols, analyzing symbol layout, and constructing a representation of meaning). Of special interest are the machine learning problem of jointly optimizing the component algorithms in a math recognition system, and the development of effective indexing, retrieval, and relevance feedback algorithms for math retrieval. Another important open problem is developing user interfaces that seamlessly integrate recognition and retrieval. Activity in these important research areas is increasing, in part because math notation provides an excellent domain for studying problems common to many document and graphics recognition and retrieval applications, and also because mature applications will likely provide substantial benefits for education, research, and mathematical literacy.

References

  1. Adeel M., Cheung H.S., Khiyal H.S.: Math go! prototype of a content based mathematical formula search engine. J. Theor. Appl. Inf. Technol. 4(10), 1002–1012 (2008)

    Google Scholar 

  2. Aho A.V., Kernighan B.W., Weinberger P.J.: The AWK Programming Language. Addison-Wesley, New York (1988)

    MATH  Google Scholar 

  3. Altamimi, M., Youssef, A.S.: An extensive math query language. In: ISCA International Conference on Software Engineering and Data Engineering, pp. 57–63. Las Vegas, USA (2007)

  4. Aly W., Uchida S., Suzuki M.: Identifying subscripts and superscripts in mathematical documents. Math. Comput. Sci. 2(2), 195–209 (2008)

    Article  MATH  Google Scholar 

  5. Anderson, R.H.: Syntax-Directed Recognition of Hand-Printed Two-Dimensional Equations. PhD thesis, Harvard University, Cambridge (1968)

  6. Anderson R.H.: Two-dimensional mathematical notation. In: Fu, K.S. (ed.) Syntactic Pattern Recognition, Applications, pp. 174–177. Springer, New York (1977)

    Google Scholar 

  7. Anthony, L., Yang, J., Koedinger, K.R.: Adapting handwriting recognition for applications in algebra learning. In: Proceedings of the ACM Workshop on Educational Multimedia and Multimedia Education, pp. 47–56. Augsburg (2007)

  8. Ashida, K., Okamoto, M., Imai, H., Nakatsuka, T.: Performance evaluation of a mathematical formula recognition system with a large scale of printed formula images. In: Proceedings of the International Conference on Document Image Analysis for Libraries, pp. 320–331. Lyon (2006)

  9. Asperti, A., Guidi, F., Coen, C., Tassi, E., Zacchiroli, S.: A content based mathematical search engine: whelp. In: Proceedings of Types for Proofs and Programs 2004, vol. 3839 of LNCS, pp. 17–32. Springer (2006)

  10. Ausbrooks, R., Buswell, S., Carlisle, D., Chavchanidze, G., Dalmas, S., Devitt, S., Diaz, A., Dooley, S., Hunter, R., Ion, P., Kohlhase, M., Lazrek, A., Libbrecht, P., Miller, B., Miner, R., Rowley, C., Saregent, M., Smith, B., Soiffer, N., Sutor, R., Watt, S. (2010) Mathematical markup language (MathML) version 3.0, W3C recommendation (http://www.w3.org/math/)

  11. Awal, A.M., Mouchère, H., Viard-Gaudin, C.: Towards handwritten mathematical expression recognition. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 1046–1050. Barcelona (2009)

  12. Awal, A.M., Mouchére, H., Viard-Gaudin, C.: The problem of handwritten mathematical expression recognition evaluation. In: Proceedings of the International Conference on Frontiers in Handwriting Recognition, pp. 646–651. Montréal (2010)

  13. Baker, J.B., Sexton, A.P., Sorge, V.: A linear grammar approach to mathematical formula recognition from PDF. In: Proceedings of Mathematical Knowledge Management, vol. 5625 of LNAI, pp. 201–216. Springer (2009)

  14. Baker, J.B., Sexton, A.P., Sorge, V.: Faithful mathematical formula recognition from PDF documents. In: Proceedings of the International Workshop on Document Analysis Systems, pp. 485–492, Boston (2010)

  15. Belongie S., Malik J., Puzicha J.: Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24, 509–522 (2002)

    Article  Google Scholar 

  16. Berman, B.P., Fateman, R.J.: Optical character recognition for typeset mathematics. In: Proceedings of the International Syposium on Symbolic and Algebraic Computation, pp. 348–353. Oxford (1994)

  17. Bernstein, M.I.: Computer input/output of two-dimensional notations. In: Proceedings of the Syposium on Symbolic and Algebraic Manipulation, pp. 102–103 (1971)

  18. Blostein, D., Cordy, J., Zanibbi, R.: Applying compiler techniques to diagram recognition. In: Proceedings of the International Conference on Pattern Recognition, vol. 3, pp. 123–126. Québec (2002)

  19. Blostein, D., Grbavec, A.: Recognition of mathematical notation. In: Handbook of Character Recognition and Document Image Analysis, pp. 557–582. World Scientific (1997)

  20. Blostein, D., Lank, E., Rose, A., Zanibbi, R.: User interfaces for on-line diagram recognition. In: Selected Papers from the Fourth International Workshop on Graphics Recognition Algorithms and Applications, vol. 2390 of LNCS, pp. 92–103. Springer (2002)

  21. Blostein, D., Lank, E., Zanibbi, R.: Treatment of diagrams in document image analysis. In: Proceedings of the International Conference on Theory and Application of Diagrams, pp. 330–344. Springer, London (2000)

  22. Borlund, P.: User-centered evaluation of information retrieval systems. In: Information Retrieval: searching in the 21st Century, pp. 21–37. Wiley (2009)

  23. Bunt, A., Terry, M., Lank, E.: Friend or foe? Examining CAS use in mathematics research. In: Proceedings of the International Conference on Human Factors in Computing Systems, pp. 229–238. New York (2009)

  24. Cajori F.: A History of Mathematical Notations. 2 vols. Open Court Publishing Company, Chicago (1929)

    Google Scholar 

  25. Carette, J., Farmer, W.M.: A review of mathematical knowledge management. In: Proceedings of Mathematical Knowledge Management, vol. 5625 of LNAI, pp. 233–246. Springer (2009)

  26. Case D.O.: Looking for Information: a Survey of Research on Information Seeking, Needs, and Behavior. Academic Press, San Diego (2002)

    Google Scholar 

  27. Casey R.G., Lecolinet E.: A survey of methods and strategies in character segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 18(7), 690–706 (1996)

    Article  Google Scholar 

  28. Chan K.-F., Yeung D.-Y.: Mathematical expression recognition: a survey. Int. J. Doc. Anal. Recognit. 3, 3–15 (2000)

    Article  Google Scholar 

  29. Chan K.-F., Yeung D.-Y.: Error detection, error correction and performance evaluation in on-line mathematical expression recognition. Pattern Recognit. 34(8), 1671–1684 (2001a)

    Article  MATH  Google Scholar 

  30. Chan, K.-F., Yeung, D.-Y.: Pencalc: a novel application of on-line mathematical expression recognition technology. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 774–778. Seattle (2001b)

  31. Chang S.-K.: A method for the structural analysis of two-dimensional mathematical expressions. Inf. Sci. 2, 253–272 (1970)

    Article  MATH  Google Scholar 

  32. Char, B.W., Watt, S.M.: Representing and characterizing handwritten mathematical symbols through succinct functional approximation. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 1198–1202. Curitiba (2007)

  33. Chaundy T.W., Barrett P.R., Batey Charles: The Printing of Mathematics. Oxford University Press, London (1957)

    Google Scholar 

  34. Chou, P.A.: Recognition of equations using a two-dimensional stochastic context-free grammar. In: Proceedings of Visual Communications and Image Processing IV, vol. 1199 of Proceeding SPIE, pp. 852–863 (1989)

  35. Datta R., Joshi D., Li J., Wang J.Z.: Image retrieval: ideas, influences, and trends of the new age. ACM Comput. Surv. 40(2), 1–60 (2008)

    Article  Google Scholar 

  36. Davenport, J.H., Kohlhase, M.: Unifying math ontologies: a tale of two standards. In: Intelligent Computer Mathematics, vol. 5625 of LNAI, pp. 263–278. Springer (2009)

  37. Dewar M.: Openmath: an overview. ACM SIGSAM Bull. 34, 2–5 (2000)

    Article  Google Scholar 

  38. Doermann D.: The indexing and retrieval of document images: a survey. J. Comput. Vis. Imag. Underst. 70, 287–298 (1998)

    Article  Google Scholar 

  39. Drake, D.M., Baird, H.S.: Distinguishing mathematics notation from english text using computational geometry. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 1270–1274. Seoul (2005)

  40. Duda R.O., Hart P.E., Stork D.G.: Pattern Classification. 2nd edn. Springer, New York (2001)

    MATH  Google Scholar 

  41. Einwohner, T.H., Fateman, R.J.: Searching techniques for integral tables. In: Proceedings of the International Symposium on Symbolic and Algebraic Computation, pp. 133–139. Montréal (1995)

  42. Eto, Y., Suzuki, M.: Mathematical formula recognition using virtual link network. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 430–437. Seattle (2001)

  43. Fateman, R.J., Tokuyasu, T.: Progress in recognizing typeset mathematics. In: Proceedings of the SPIE, vol. 2660, pp. 37–50 (1996)

  44. Fitzgerald, J.A., Geiselbrechtinger, F., Kechadi, T.: Mathpad: a fuzzy logic-based recognition system for handwritten mathematics. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 694–698. Curitiba (2007)

  45. Freund Y., Schapire R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55(1), 119–139 (1995)

    Article  MathSciNet  Google Scholar 

  46. Fujimoto, M., Kanahori, T., Suzuki, M.: Infty editor—a mathematics typesetting tool with a handwriting interface and a graphical front-end to OpenXM servers. In: Computer Algebra— Algorithms, Implementations and Applications, vol. 1335 of RIMS Kokyuroku, pp. 217–226 (2003)

  47. Fujiyoshi, A., Suzuki, M., Uchida, S.: Verification of mathematical formulae based on a combination of context-free grammar and tree grammar. In: Proceedings of the International Conference on Mathematical Knowledge Management, vol. 5144 of LNCS, pp. 415–429. Springer (2008)

  48. Fujiyoshi, A., Suzuki, M., Uchida, S.: Syntactic detection and correction of misrecognitions in mathematical OCR. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 1360–1364. Barcelona (2009)

  49. Garain, U.: Identification of mathematical expressions in document images. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 1340–1344. Barcelona (2009)

  50. Garain U., Chaudhuri B.B.: Recognition of online handwritten mathematical expressions. IEEE Trans. Syst. Man Cybern. Part B 34(6), 2366–2376 (2004)

    Article  Google Scholar 

  51. Garain U., Chaudhuri B.B.: A corpus for OCR research on mathematical expressions. Int. J. Doc. Anal. Recognit. 7(4), 241–259 (2005)

    Article  Google Scholar 

  52. Garain, U., Chaudhuri, B.B.: OCR of printed mathematical expressions. In: Digital Document Processing, pp. 235–259. Springer (2007)

  53. Genoe, R., Fitzgerald, J.A., Kechadi, T.: An online fuzzy approach to the structural analysis of handwritten mathematical expressions. In: Proceedings of the International Conference on Fuzzy Systems, pp. 242–250. Vancouver (2006)

  54. Golubitsky, O., Watt, S.M.: Online computation of similarity between handwritten characters. In: Proceedings of Document Recognition and Retrieval, vol. 7247 of Proceedings of the SPIE, pp. C1–C10, San Jose (2009)

  55. Golubitsky O., Watt S.M.: Distance-based classification of handwritten symbols. Int. J. Doc. Anal. Recognit. 13(2), 133–146 (2010)

    Article  Google Scholar 

  56. Golubitsky, O., Watt, S.M.: Improved classification through runoff elections. In: Proceeding of the Workshop on Document Analysis Systems, pp. 59–64. Boston (2010)

  57. Graf, P.: Substitution tree indexing. In: Proceedings of the International Conference on Rewriting Techniques and Applications, pp. 117–131. London (1995)

  58. Grbavec, A., Blostein, D.: Mathematics recognition using graph rewriting. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 417–421. Montréal (1995)

  59. Hashimoto, H., Hijikata, Y., Nishida, S.: Incorporating breadth first search for indexing MathML objects. In: Proceedings of the International Conference on Systems, Man and Cybernetics, pp. 3519–3523, Singapore (2008)

  60. Hatcher E., Gospodnetić O.: Lucene in Action. 2nd edn. Manning, USA (2010)

    Google Scholar 

  61. Hearst M.A.: Search User Interfaces. 1st edn. Cambridge University Press, New York (2009)

    Google Scholar 

  62. Hiemstra, D.: Information retrieval models. In: Information Retrieval: searching in the 21st Century, pp. 1–17. Wiley (2009)

  63. Higham N.J.: Handbook of Writing for the Mathematical Sciences. Society for Industrial and Applied Mathematics, Philadelphia (1993)

    MATH  Google Scholar 

  64. Hu J., Kashi R.S., Lopresti D., Wilfong G.T.: Evaluating the performance of table processing algorithms. Int. J. Doc. Anal. Recognit. 4(3), 140–153 (2002)

    Article  Google Scholar 

  65. Hull, J.F.: Recognition of Mathematics Using a Two-Dimensional Trainable Context-Free Frammar. Master’s thesis, MIT, Cambridge (1996)

  66. Kacem A., Belaid A., Ben Ahmed M.: Automatic extraction of printed mathematical formulas using fuzzy logic and propagation of context. Int. J. Doc. Anal. Recognit. 4, 97–108 (2001)

    Article  Google Scholar 

  67. Kamali, S., Tompa, F.: Improving mathematics retrieval. In: Proceedings of Digital Mathematics Libraries, pp. 37–48, Grand Bend (2009)

  68. Kanahori, T., Sexton, A.P., Sorge, V., Suzuki, M.: Capturing abstract matrices from paper. In: Borwein, J.M., Farmer, W.M. (eds.) Proceedings of Mathematical Knowledge Management, vol. 4108 of LNAI, pp. 124–138. Springer (2006)

  69. Kanahori, T., Suzuki, M.: A recognition method of matrices by using variable block pattern elements generating rectangular areas. In: Graphics Recognition—Algorithms and Applications, vol. 2390 of LNCS, pp. 320–329. Springer (2002)

  70. Kanahori, T., Suzuki, M.: Detection of matrices and segmentation of matrix elements in scanned images of scientific documents. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 433–437. Edinburgh (2003)

  71. Kanahori, T., Suzuki, M.: Refinement of digitized documents through recognition of mathematical formulae. In: Proceedings of the International Workshop on Document Image Analysis for Libraries, pp. 27–28. Lyon (2006)

  72. Kanungo T., Haralick R.M., Baird H.S., Stuetzle W., Madigan D.: A statistical, nonparametric methodology for document degradation model validation. IEEE Trans. Pattern Anal. Mach. Intell. 22(11), 1209–1223 (2000)

    Article  Google Scholar 

  73. Kim, K., Rhee, T.-H., Lee, J.S., Kim, J.H.: Utilizing consistency context for handwritten mathematical expression recognition. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 1051–1055. Barcelona (2009)

  74. Knuth, Donald E.: Tex and METAFONT—New Directions in Typesetting. Digital Press, Bedford (1979)

    MATH  Google Scholar 

  75. Kohlhase, A., Kohlhase, M.: Re-examining the MKM value proposition: from math web search to math web research. In: Proceedings of the Symposium on Towards Mechanized Mathematical Assistants, vol. 4573 of LNCS, pp. 313–326. Springer (2007)

  76. Kohlhase, M.: OMDoc: an Open Markup Format for Mathematical Documents, vol. 4180 of LNAI. Springer (2006)

  77. Kohlhase, M., Anca, S., Jucovschi, C., Palomo, A.G., Sucan, I.: MathWebSearch 0.4: a semantic search engine for mathematics. (unpublished manuscript. http://kwarc/info/kohlhase/publications.html) (2008)

  78. Kohlhase, M., Sucan, I.: A search engine for mathematical formulae. In: Proc. Artificial Intelligence and Symbolic Computation, volume 4120 of LNAI, pp. 241–253. Springer (2006)

  79. Kopec G.E., Chou P.A.: Document image decoding using Markov source models. IEEE Trans. Pattern Anal. Mach. Intell. 16(6), 602–617 (1994)

    Article  Google Scholar 

  80. Kosmala, A., Rigoll, G.: On-line handwritten formula recognition using statistical methods. In: Proceedings of the International Conference on Pattern Recognition, pp. 1306–1308. Brisbane (1998)

  81. Labahn, G., Lank, E., MacLean, S., Marzouk, M., Tausky, D.: Mathbrush: a system for doing math on pen-based devices. In: Proceedings of the Workshop on Document Analysis Systems, pp. 599–606. Japan (2008)

  82. Labahn, G., Lank, E., Marzouk, M., Bunt, A., MacLean, S., Tausky, D.: Mathbrush: a case study for pen-based interactive mathematics. In: Proceedings of Eurographics Workshop Sketch-Based Interfaces and Modeling, Annecy. France (2008)

  83. Lapointe, A., Blostein, D.: Issues in performance evaluation: A case study of math recognition. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 1355–1359. Barcelona (2009)

  84. LaViola, J.J., Leal, A., Miller, T.S., Zeleznik, R.C.: Evaluation of techniques for visualizing mathematical expression recognition results. In: Proceedings of Graphics Interface, pp. 131–138. Windsor, Canada (2008)

  85. La Viola J.J., Zeleznik R.C.: Mathpad2: a system for the creation and exploration of mathematical sketches. ACM Trans. Graph. 23(3), 432–440 (2004)

    Article  Google Scholar 

  86. La Viola J.J., Zeleznik R.C.: A practical approach to writer- dependent symbol recognition using a writer-independent recognizer. IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 1917–1926 (2007)

    Article  Google Scholar 

  87. Lavirotte, S., Pottier, L.: Optical formula recognition. In: Proceedings of the International Conference on Document Analysis and Recognition, vol. 1, pp. 357–361. Ulm (1997)

  88. Lee, H.-J., Wang, J.-S.: Design of a mathematical expression understanding system. Pattern Recognit. Lett. 18(3), 289–298 (1997)

    Google Scholar 

  89. Li, C., Zeleznik, R.C., Miller, T., La Viola, J.J.: Online recognition of handwritten mathematical expressions with support for matrices. In: Proceedings of the International Conference on Pattern Recognition, pp. 1–4. Tampa (2008)

  90. Liang, P., Narasimhan, M., Shilman, M., Viola, P.A.: Efficient geometric algorithms for parsing in two dimensions. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 1172–1177. Seoul (2005)

  91. Libbrecht, P., Melis, E.: Methods for access and retrieval of mathematical content in ActiveMath. In: Proceedings of the International Congress on Mathematical Software, vol. 4151 of LNCS, pp. 331–342. Springer (2006)

  92. Lopresti, D., Wilfong, G.: Evaluating document analysis results via graph probing. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 116–120. Seattle (2001)

  93. MacLean S., Labahn G., Lank E., Marzouk M., Tausky D.: Grammar-based techniques for creating ground-truthed sketch corpora. Int. J. Doc. Anal. Recognit. 14(1), 65–74 (2011)

    Article  Google Scholar 

  94. Malon C.D., Uchida S., Suzuki M.: Mathematical symbol recognition with support vector machines. Pattern Recognit. Lett. 29, 1326–1332 (2008)

    Article  Google Scholar 

  95. Manning C.D., Raghavan P., Schütze H.: Introduction to Information Retrieval. Cambridge University Press, New York (2008)

    Book  MATH  Google Scholar 

  96. Marinai, S., Miotti, B., Soda, G.: Mathematical symbol indexing using topologically ordered clusters of shape contexts. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 1041–1045. Barcelona (2009)

  97. Marriott, K., Meyer, B., Wittenburg, K.D.: A survey of visual language specification and recognition. In: Visual Language Theory, pp. 5–85. Springer (1998)

  98. Martin, W.A.: Computer input/output of mathematical expressions. In: Proceedings of the Symposium on Symbolic and Algebraic Manipulation, pp. 78–89. Los Angeles (1971)

  99. Mathematical Reviews and Zentralblatt für Mathematik. Mathematics subject classification. http://www.ams.org/mathscinet/msc/msc2010.html (2010)

  100. Matsakis, N.: Recognition of Handwritten Mathematical Expressions. Master’s thesis, MIT, Cambridge (1999)

  101. Michler G.O.: Report on the retrodigitization project “Archiv der Mathematik. Archiv der Mathematik 77, 116–128 (2001)

    MathSciNet  MATH  Google Scholar 

  102. Michler G.O.: How to build a prototype for a distributed digital mathematics archive library. Ann. Math. Artif. Intell. 38, 137–164 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  103. Miller B.R., Youssef A.S.: Technical aspects of the digital library of mathematical functions. Ann. Math. Artif. Intell. 38, 121–136 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  104. Miller, E.G., Viola, P.A.: Ambiguity and constraint in mathematical expression recognition. In: Proceedings of the 15th National Conference on Artificial Intelligence, pp. 784–791. Madison (1998)

  105. Miner, R., Munavalli, R.: An approach to mathematical search through query formulation and data normalization. In: Towards Mechanized Mathematical Assistants, vol. 4573 of LNAI, pp. 342–355. Springer (2007)

  106. Miyazaki, Y., Iguchi, Y.: Development of information-retrieval tool for MathML-based math expressions. In: Proceedings of the International Conference on Computers in Education, pp. 419–426. Tapei (2008)

  107. Munavalli, R., Miner, R.: Mathfind: a math-aware search engine. In: Proceedings of the International Conference on Information Retrieval, pp. 735–735. New York (2006)

  108. Nagy G.: Twenty years of document image analysis in PAMI. IEEE Trans on Pattern Anal. Mach. Intell. 22(1), 38–62 (2000)

    Article  Google Scholar 

  109. Nagy, G., Seth, S.: Hierarchical representation of optically scanned documents. In: Proceedings of International Conference on Pattern Recognition, pp. 347–349. Montréal (1984)

  110. Normann, I., Kohlhase, M.: Extended formula normalization for \({\epsilon}\) -retrieval and sharing of mathematical knowledge. In: Proceedings of Towards Mechanized Mathematical Assistants, vol. 4573 of LNAI, pp. 356–370. Springer (2007)

  111. Okamoto, M., Imait, K.T.: Performance evaluation of a robust method for mathematical expression recognition. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 121–128. Seattle, (2001)

  112. Okamoto, M., Miao, B.: Recognition of mathematical expressions by using the layout structures of symbols. In: Proceedings of the International Conference on Document Analysis and Recognition, vol. 1, pp. 242–250. Saint-Malo (1991)

  113. Okamoto, M., Miyazawa, A.: An experimental implementation of a document recognition system for papers containing mathematical expressions. In: Structured Document Image Analysis, pp. 36–53. Springer (1992)

  114. Panic, M.: Math handwriting recognition in Windows 7 and its benefits. In: Intelligent Computer Mathematics, vol. 5625 of LNCS, pp. 29–30. Springer (2009)

  115. Phillips, I.: Methodologies for using UW databases for OCR and image understanding systems. In: Proceedings of Document Recognition V, vol. 3305 of SPIE Proceedings, pp. 112–127. San Jose (1998)

  116. Plamandon R., Srihari S.N.: On-line and off-line handwriting recognition: a comprehensive survey. IEEE Trans. Pattern Anal. Mach. Intell. 22(1), 63–84 (2000)

    Article  Google Scholar 

  117. Pollanen, M., Wisniewski, T., Yu, X.: Xpress: a novice interface for the real-time communication of mathematical expressions. In: Proceedings of the Workshop on Mathematical User-Interfaces. Linz (2007)

  118. Rabiner L.R.: A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77(2), 257–286 (1989)

    Article  Google Scholar 

  119. Raja, A., Rayner, M., Sexton, A.P., Sorge, V.: Towards a parser for mathematical formula recognition. In: Mathematical Knowledge Management, vol. 4108 of LNAI, pp. 139–151. Springer (2006)

  120. Rath, T.M., Manmatha, R.: Word image matching using dynamic time warping. In: Proceedings of Computer Vision and Pattern Recognition, pp. 521–527. Madison, (2003)

  121. Rath T.M., Manmatha R.: Word spotting for historical documents. Int. J. Doc. Anal. Recognit. 9, 139–152 (2007)

    Article  Google Scholar 

  122. Rhee T.H., Kim J.H.: Efficient search strategy in structural analysis for handwritten mathematical expression recognition. Pattern Recognit. 42(12), 3192–3201 (2009)

    Article  MATH  Google Scholar 

  123. Rüger, S.: Multimedia resource discovery. In: Information Retrieval: Searching in the 21st Century, pp. 39–62. Wiley (2009)

  124. Sain K., Dasgupta A., Garain U.: EMERS: A tree matching-based performance evaluation of mathematical expression recognition systems. Int. J. Doc. Anal. Recognit. 14(1), 75–85 (2011)

    Article  Google Scholar 

  125. Salton G., McGill M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York (1983)

    MATH  Google Scholar 

  126. Samet H.: Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, New York (2006)

    MATH  Google Scholar 

  127. Sexton, A.P., Sorge, V.: Abstract matrices in symbolic computation. In: Proceedings of the International Symposium on Symbolic and Algebraic Computation, pp. 318–325. Genoa (2006)

  128. Shafait F., Keysers D., Breuel T.M.: Performance evaluation and benchmarking of six page segmentation algorithms. IEEE Trans. Pattern Anal. Mach. Intell. 30(6), 941–954 (2008)

    Article  Google Scholar 

  129. Shatnawi, M., Youssef, A.S.: Equivalence detection using parse-tree normalization for math search. In: Proceedings of the International Conference on Digital Information Management, vol. 2, pp. 643–648. Lyon (2007)

  130. Shi, Y., Li, H.Y., Soong, F.K.: A unified framework for symbol segmentation and recognition of handwritten mathematical expressions. In: Proceedings of the International Conference on Document Analysis and Recognition, vol. 2, pp. 854–858. Curitiba (2007)

  131. Shi, Y., Soong, F.K.: Symbol graph based discriminative training and rescoring for improved math symbol recognition. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 1953–1956. Las Vegas (2008)

  132. Smeulders A.W.M., Worring M., Santini S., Gupta A., Jain R.: Content-based image retrieval at the end of the early years. IEEE Trans. Pattern Anal. Mach. Intell. 22(12), 1349–1380 (2000)

    Article  Google Scholar 

  133. Smirnova, E., Watt, S.M.: Communicating mathematics via pen-based computer interfaces. In: Proceedings of the International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, pp. 9–18. Timişoara (2008)

  134. Smirnova, E., Watt, S.M.: Context-sensitive mathematical character recognition. In: Proceedings of the International Conference on Frontiers in Handwriting Recognition, pp. 604–610. Montréal (2008)

  135. Smithies, S.: Freehand Formula Entry System. Master’s thesis, University of Otago, Dunedin (1999)

  136. Smithies S.: Equation entry and editing via handwriting and gesture recognition. Beh. Inf. Technol. 20(1), 53–67 (2001)

    Article  Google Scholar 

  137. Smithies, S., Novins, K., Arvo, J.: A handwriting-based equation editor. In: Proceedings of Graphics Interface, pp. 84–91. Kingston (1999)

  138. So, C.M., Watt, S.M.: Determining empirical charateristivs of mathematical expression use. In: Proceedings of Mathematical Knowledge Management, vol. 3863 of LNCS, pp. 361–375. Springer (2005)

  139. So, C.M., Watt, S.M.: On the conversion between content MathML and OpenMath. In: Proceedings of the Conference on Communicating Mathematics in the Digital Era, pp. 169–182. Aveiro (2006)

  140. Suzuki, M., Kanahori, T., Ohtake, N., Yamaguchi, K.: An integrated OCR software for mathematical documents and its output with accessibility. In: Proceedings of the International Conference on Computers Helping People with Special Needs, vol. 3119 of LNCS, pp. 648–655. Springer (2004)

  141. Suzuki, M., Tamari, F., Fukuda, R., Uchida, S., Kanahori, T.: INFTY: An integrated OCR system for mathematical documents. In: Proceedings of Document Engineering, pp. 95–104. Grenoble (2003)

  142. Suzuki, M., Uchida, S., Nomura, A.: A ground-truthed mathematical character and symbol image database. In: Proceedings of the International Conference on Document Analysis and Recognition, vol. 2, pp. 675–679. Seoul (2005)

  143. Takiguchi, Y., Okada, M., Miyake, Y.: A fundamental study of output translation from layout recognition and semantic understanding system for mathematical formulae. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 745–749. Seoul (2005)

  144. Tapia, E., Rojas, R.: Recognition of on-line handwritten mathematical formulas in the e-chalk system. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 980–984. Edinburgh (2003)

  145. Tapia, E., Rojas, R.: Recognition of on-line handwritten mathematical expressions using a minimum spanning tree construction and symbol dominance. In: Graphics Recognition: Recent Advances and Persepectives, vol. 3088 of LNCS, pp. 329–340. Springer (2004)

  146. Tapia, E., Rojas, R.: A survey on recognition of on-line handwritten mathematical notation. Technical Report B-07-01, Free University of Berlin (2007)

  147. Tausky, D., Labahn, G., Lank, E., Marzouk, M.: Managing ambiguity in mathematical matrices. In: Proceedings of the Eurographics Workshop on Sketch-Based Interfaces and Modeling, pp. 115–122. Riverside (2007)

  148. The OpenMath Society. http://www.openmath.org/

  149. Toussaint G.T.: The use of context in pattern recognition. Pattern Recognit. 10, 189–204 (1978)

    Article  MathSciNet  MATH  Google Scholar 

  150. Toyozumi, K., Suzuki, T., Mori, J., Suenaga, Y.: A system for real-time recognition of handwritten mathematical formulas. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 1059–1063. Seattle (2001)

  151. Toyozumi K., Takahiro S., Mori K., Suenaga Y.: An on-line handwritten mathemical equation recognition system that can process matrix expressions by referring to the relative positions of matrix elements. Syst. Comput. Jpn. 37(14), 87–96 (2006)

    Article  Google Scholar 

  152. Toyozumi, K., Yamada, N., Mase, K., Kitasaka, T., Mori, K., Suenaga, Y., Takahashi, T.: A study of symbol segmentation method for handwritten mathematical formula recognition using mathematical structure information. In: Proceedings of the International Conference on Pattern Recognition, vol. 2, pp. 630–633. Cambridge (2004)

  153. Twaakyondo, H.M., Okamoto, M.: Structure analysis and recognition of mathematical expressions. In: Proceedings of the International Conference on Document Analysis and Recognition, vol. 1, pp. 430–437. Montréal (1995)

  154. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995)

  155. Watt, S.M.: An empirical measure on the set of symbols occurring in engineering mathematics texts. In: Proceedings of the International Workshop on Document Analysis Systems, pp. 557–564. Japan (2008)

  156. Westman, S.: Image users’ needs and searching behaviour. In: Information Retrieval: Searching in the 21st Century, pp. 63–83. Wiley (2009)

  157. Wick, K.: Rules for Typesetting Mathematics. Czechoslovak Academy of Sciences, The Hague (1965). Translated by V. Boublik and M. Hejlova

  158. Winkler, H.-J.: HMM-based handwritten symbol recognition using on-line and off-line features. In: Proceedings of the IEEE International Conference on Acoustics Speech and Signal Processing, pp. 3438–3441. Atlanta (1996)

  159. Yamamoto, R., Sako, S., Nishimoto, T., Sagayama, S.: On-line recognition of handwritten mathematical expressions based on stroke-based stochastic context-free grammar. In: Proceedings of the International Workshop on Frontiers in Handwriting Recognition, pp. 249–254. La Baule (2006)

  160. Yokoi, K., Aizawa, A.: An approach to similarity search for mathematical expressions using MathML. In: Proceedings of Digital Mathematics Libraries, pp. 27–35. Grand Bend (2009)

  161. Yu, L.: Image-Based Math Retrieval Using Handwritten Queries. Master’s thesis, Rochester Institute of Technology, Rochester (2010)

  162. Zanibbi, R., Blostein, D., Cordy, J.R.: Baseline structure analysis of handwritten mathematics notation. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 768–773. Seattle (2001)

  163. Zanibbi, R., Blostein, D., Cordy, J.R.: Recognizing mathematical expressions using tree transformation. IEEE Trans. Pattern Anal. Mach. Intell. 24(11), 1455–1467 (2002)

  164. Zanibbi, R., Blostein, D., Cordy, J.R.: Recognition tasks are imitation games. In: LNCS, vol. 3686, pp. 209–218 (2005)

  165. Zanibbi, R., Novins, K., Arvo, J., Zanibbi, K.: Aiding manipulation of handwritten mathematical expressions through style-preserving morphs. In: Proceedings of Graphics Interface, pp. 127–134. Ottawa (2001)

  166. Zanibbi, R., Pillay, A., Mouchère, H., Viard-Gaudin, C., Blostein, D.: Stroke-based performance metrics for handwritten mathematical expressions. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 334–338. Beijing (2011)

  167. Zanibbi, R., Yu, L.: Math spotting: Retrieving math in technical documents using handwritten query images. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 446–451. Beijing (2011)

  168. Zanibbi, R., Yuan, B.: Keyword and image-based retrieval for mathematical expressions. In: Proceedings of Document Recognition and Retrieval XVIII, vol. 7874 of SPIE Proceedings, paper OI, pp. 1–9. San Francisco (2011)

  169. Zeleznik, R.C., Miller, T., Li, C., LaViola, J.J.: MathPaper: mathematical sketching with fluid support for interactive computation. In: International Symposium on Smart Graphics, vol. 5166 of LNCS, pp. 20–32. Springer (2008)

  170. Zhang, L., Blostein, D., Zanibbi, R.: Using fuzzy logic to analyze superscript and subscript relations in handwritten mathematical expressions. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 972–976. Seoul, Korea (2005)

  171. Zhao, J., Kan, M.-Y., Theng, Y.L.: Math information retrieval: user requirements and prototype implementation. In: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, pp. 187–196. New York (2008)

  172. Zhou, X.S., Huang, T.S.: Relevance feedback in image retrieval: a comprehensive review. Multimed. Syst. 8, 536–544 (2003)

  173. Zipf, G.K.: Human Behavior and the Principle of Least-Effort. Addison-Wesley, Reading (1949)

  174. Zwillinger, D.: CRC Standard Mathematical Tables and Formulae. CRC Press, Boca Raton (1996)

Author information

Correspondence to Richard Zanibbi.

About this article

Cite this article

Zanibbi, R., Blostein, D. Recognition and retrieval of mathematical expressions. IJDAR 15, 331–357 (2012). https://doi.org/10.1007/s10032-011-0174-4

  • DOI: https://doi.org/10.1007/s10032-011-0174-4
