Distance-based classification of handwritten symbols

Original Paper

Abstract

We study online classification of isolated handwritten symbols using distance measures on spaces of curves. We compare three distance measures on a vector space representation of curves with elastic matching and with ensembles of SVMs: the Euclidean distance, the Manhattan distance, and the distance to the convex hull of nearest neighbors. Our experiments show that, among all these methods, the distance to the convex hull of nearest neighbors yields the best classification accuracy, about 97.5%. Any of the three distance measures can be used to find the nearest neighbors and prune clearly irrelevant classes, but the Manhattan distance is preferable for this purpose because it admits a very efficient implementation. We represent the symbol curves in a finite-dimensional vector space by the first few Legendre-Sobolev coefficients of their coordinate functions, and we choose the dimension and the number of bits per coefficient by cross-validation. We discuss an implementation of the proposed classification scheme that will allow classification of a sample among hundreds of classes in a setting with strict time and storage limitations.
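The two-stage idea in the abstract (cheap Manhattan-distance pruning, then the distance to the convex hull of nearest neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the parameters `k` and `n_candidates`, and the use of a Frank-Wolfe iteration to solve the hull-projection problem are all assumptions made for illustration, and plain feature vectors stand in for the Legendre-Sobolev coefficient vectors.

```python
import numpy as np


def hull_distance(x, points, iters=200, tol=1e-12):
    """Euclidean distance from x to the convex hull of the rows of `points`.

    Minimizes ||points.T @ w - x||^2 over the probability simplex with a
    Frank-Wolfe iteration and exact line search (an illustrative choice).
    """
    points = np.asarray(points, dtype=float)
    x = np.asarray(x, dtype=float)
    w = np.full(len(points), 1.0 / len(points))   # start at the centroid
    for _ in range(iters):
        r = points.T @ w - x                      # residual: hull point minus x
        j = int(np.argmin(points @ r))            # vertex that best reduces the residual
        step = points[j] - (r + x)                # direction from current hull point to vertex j
        gap = -(r @ step)                         # Frank-Wolfe duality gap (up to a factor of 2)
        denom = step @ step
        if gap <= tol or denom == 0.0:
            break
        gamma = min(1.0, gap / denom)             # exact line search, clipped to [0, 1]
        w *= 1.0 - gamma
        w[j] += gamma
    return float(np.linalg.norm(points.T @ w - x))


def classify(x, train_x, train_y, k=3, n_candidates=20):
    """Prune classes by Manhattan distance, then pick the class whose
    k nearest samples span the convex hull closest to x."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    x = np.asarray(x, dtype=float)
    l1 = np.abs(train_x - x).sum(axis=1)          # Manhattan distance to every sample
    order = np.argsort(l1)
    # Keep only classes that appear among the nearest samples overall.
    candidates = sorted(set(train_y[order[:n_candidates]].tolist()))
    best_label, best_dist = None, np.inf
    for label in candidates:
        idx = order[train_y[order] == label][:k]  # k nearest samples of this class
        d = hull_distance(x, train_x[idx])
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy data: two well-separated classes in the plane (hypothetical values).
toy_x = [[0, 0], [2, 0], [0, 2], [5, 5], [7, 5], [5, 7]]
toy_y = ["a", "a", "a", "b", "b", "b"]
print(classify([0.5, 0.5], toy_x, toy_y))
print(classify([6.0, 6.0], toy_x, toy_y))
```

In this sketch the Manhattan distance is used only for ranking and pruning, since it needs nothing more than additions and absolute values, while the more expensive hull computation runs on a handful of samples per surviving class.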

Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  1. University of Western Ontario, London, Canada
