The Segmentation and Identification of Handwriting in Noisy Document Images

Zheng, Yefeng; Li, Huiping; Doermann, David

doi:10.1007/3-540-45869-7_12


Conference paper
First Online: 01 January 2002


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2423)

Abstract

In this paper we present an approach to the problem of segmenting and identifying handwritten annotations in noisy document images. In many types of documents, such as correspondence, it is not uncommon for handwritten annotations to be added as part of a note, correction, clarification, or instruction, or for a signature to appear as an authentication mark. It is important to be able to segment and identify such handwriting so that we can 1) locate, interpret, and retrieve it efficiently in large document databases, and 2) use different algorithms for printed/handwritten text recognition and signature verification. Our approach consists of two processes: 1) a segmentation process, which divides the text into regions at an appropriate level (character, word, or zone), and 2) a classification process, which identifies the segmented regions as handwritten. To determine the approximate region size at which classification can be performed reliably, we conducted experiments at the character, word, and zone levels. We found that reliable results can be achieved at the word level, with a classification accuracy of 97.3%. The identified handwritten text is further grouped into zones and verified to reduce false alarms. Experiments show that our approach is promising and robust.
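
The final step described above, grouping identified handwritten words into zones and verifying the zones to reduce false alarms, can be illustrated with a minimal Python sketch. The abstract does not state the grouping rule or the verification criterion, so everything here is an assumption made for illustration: the `Box` type, the proximity threshold `max_gap`, and the minimum word count `min_words` are hypothetical, and the word boxes are assumed to have already been labelled as handwritten by an upstream classifier.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one word (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int


def gap(a: Box, b: Box) -> int:
    """Larger of the horizontal and vertical gaps; 0 if the boxes overlap."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0)
    return max(dx, dy)


def group_into_zones(boxes: List[Box], max_gap: int) -> List[List[Box]]:
    """Group boxes into zones: boxes linked by a chain of gaps <= max_gap
    end up in the same zone (connected components by proximity)."""
    zones = []
    unvisited = list(boxes)
    while unvisited:
        seed = unvisited.pop()
        zone = [seed]
        frontier = [seed]
        while frontier:
            current = frontier.pop()
            near = [b for b in unvisited if gap(current, b) <= max_gap]
            for b in near:
                unvisited.remove(b)
            zone.extend(near)
            frontier.extend(near)
        zones.append(zone)
    return zones


def verify_zones(zones: List[List[Box]], min_words: int) -> List[List[Box]]:
    """Assumed verification rule: keep only zones with at least min_words
    words, treating smaller zones as likely false alarms."""
    return [zone for zone in zones if len(zone) >= min_words]


# Three neighbouring words on one line plus one isolated detection.
handwritten_words = [
    Box(0, 0, 10, 5),
    Box(12, 0, 20, 5),
    Box(22, 0, 30, 5),
    Box(200, 200, 210, 205),
]
zones = group_into_zones(handwritten_words, max_gap=5)
kept = verify_zones(zones, min_words=2)
print(len(zones), "zones found,", len(kept), "kept after verification")
```

A size-based filter is only one plausible stand-in for verification; the paper may rely on other evidence, which the abstract does not describe.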



Author information

Authors and Affiliations

Laboratory for Language and Media Processing, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742
Yefeng Zheng, Huiping Li & David Doermann


Editor information

Editors and Affiliations

Bell Labs, Lucent Technologies, 600 Mountain Avenue, Murray Hill, NJ 07974, USA
Daniel Lopresti
Avaya Labs Research, 233 Mount Airy Road, Basking Ridge, NJ 07920, USA
Jianying Hu & Ramanujan Kashi


About this paper

Cite this paper

Zheng, Y., Li, H., Doermann, D. (2002). The Segmentation and Identification of Handwriting in Noisy Document Images. In: Lopresti, D., Hu, J., Kashi, R. (eds) Document Analysis Systems V. DAS 2002. Lecture Notes in Computer Science, vol 2423. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45869-7_12


DOI: https://doi.org/10.1007/3-540-45869-7_12
Published: 09 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44068-0
Online ISBN: 978-3-540-45869-2
eBook Packages: Springer Book Archive
