Abstract
This paper proposes a method for improving off-line character classifiers learned from examples by using virtual examples synthesized from an on-line character database. Good classifiers usually require a large database that covers enough variations of handwritten characters. In practice, however, collecting that much data is time-consuming and costly. We therefore propose a method for training an SVM for off-line character recognition on examples that are artificially augmented from on-line characters.
In our method, virtual examples are synthesized from on-line characters in the following two steps: (1) applying an affine transformation to each stroke of “real” characters, and (2) applying an affine transformation to each stroke of artificial characters, which are synthesized on the basis of PCA. SVM classifiers are then trained on a training set that contains both the artificially generated patterns and the real characters. We examine the effectiveness of the proposed method in terms of recognition rate and the number of support vectors of the SVM, through experiments on handwritten Japanese Hiragana character classification.
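The two synthesis steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the stroke representation, the transformation ranges, or how PCA coefficients are sampled. The sketch assumes that each stroke is an array of (x, y) points, that each character is resampled to a fixed number of points so it can be flattened into one vector, and that PCA coefficients are drawn from a Gaussian scaled by the per-component standard deviation. The function names `affine_perturb_stroke` and `pca_synthesize` and all default parameter values are hypothetical.

```python
import numpy as np


def affine_perturb_stroke(stroke, rng, max_rot=0.1, max_scale=0.1, max_shift=0.02):
    """Apply a small random affine transformation to one stroke.

    stroke: (N, 2) array of pen coordinates.
    The perturbation ranges are illustrative assumptions, not values from the paper.
    """
    theta = rng.uniform(-max_rot, max_rot)                  # rotation angle in radians
    sx, sy = 1.0 + rng.uniform(-max_scale, max_scale, size=2)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    linear = rot @ np.diag([sx, sy])                        # rotation combined with scaling
    shift = rng.uniform(-max_shift, max_shift, size=2)
    center = stroke.mean(axis=0)                            # transform about the stroke centroid
    return (stroke - center) @ linear.T + center + shift


def pca_synthesize(characters, n_components, n_samples, rng):
    """Draw artificial characters from a PCA model of real ones.

    characters: (M, D) array, one flattened fixed-length character per row.
    Gaussian sampling of the coefficients is an assumption of this sketch.
    """
    mean = characters.mean(axis=0)
    centered = characters - mean
    # Rows of vt are the principal directions; s holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, vt.shape[0])
    std = s[:k] / np.sqrt(max(len(characters) - 1, 1))      # std. dev. along each component
    coeffs = rng.normal(size=(n_samples, k)) * std
    return mean + coeffs @ vt[:k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: 20 "real" characters, each with 2 strokes of 8 points.
    real = rng.normal(size=(20, 2 * 8 * 2))
    artificial = pca_synthesize(real, n_components=5, n_samples=3, rng=rng)
    # Step (2): perturb every stroke of every artificial character.
    virtual = [
        [affine_perturb_stroke(s, rng) for s in char.reshape(2, 8, 2)]
        for char in artificial
    ]
    print(len(virtual), len(virtual[0]), virtual[0][0].shape)
```

In a full pipeline, the perturbed strokes would still need to be rendered as off-line images and converted to feature vectors before SVM training; those stages are not described in the abstract and are left out here.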
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Miyao, H., Maruyama, M. (2006). Virtual Example Synthesis Based on PCA for Off-Line Handwritten Character Recognition. In: Bunke, H., Spitz, A.L. (eds) Document Analysis Systems VII. DAS 2006. Lecture Notes in Computer Science, vol 3872. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11669487_9
DOI: https://doi.org/10.1007/11669487_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32140-8
Online ISBN: 978-3-540-32157-6
eBook Packages: Computer Science, Computer Science (R0)