Document image analysis: A primer

Kasturi, Rangachar; O’Gorman, Lawrence; Govindaraju, Venu

doi:10.1007/BF02703309

Published: February 2002

Sadhana, Volume 27, pages 3–22 (2002)

Abstract

Document image analysis refers to algorithms and techniques that are applied to images of documents to obtain a computer-readable description from pixel data. A well-known document image analysis product is the Optical Character Recognition (OCR) software that recognizes characters in a scanned document. OCR makes it possible for the user to edit or search the document’s contents. In this paper we briefly describe various components of a document analysis system. Many of these basic building blocks are found in most document analysis systems, irrespective of the particular domain or language to which they are applied. We hope that this paper will help the reader by providing the background necessary to understand the detailed descriptions of specific techniques presented in other papers in this issue.

Author information

Authors and Affiliations

Department of Computer Science & Engineering, The Pennsylvania State University, University Park, PA 16802, USA
Rangachar Kasturi
Avaya Labs, Room 1B04, 233 Mt. Airy Road, Basking Ridge, NJ 07920, USA
Lawrence O’Gorman
CEDAR, State University of New York at Buffalo, Amherst, NY 14228, USA
Venu Govindaraju

About this article

Cite this article

Kasturi, R., O’Gorman, L. & Govindaraju, V. Document image analysis: A primer. Sadhana 27, 3–22 (2002). https://doi.org/10.1007/BF02703309

Issue Date: February 2002
DOI: https://doi.org/10.1007/BF02703309
