Watershed Based Document Image Analysis

Shadkami, Pasha; Bonnier, Nicolas

doi:10.1007/978-3-642-17688-3_12

Pasha Shadkami &
Nicolas Bonnier

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6474)

Included in the following conference series:

International Conference on Advanced Concepts for Intelligent Vision Systems

Abstract

Document image analysis segments a document image and classifies its regions into categories such as text, graphics, and background. In this paper we first review existing document image analysis approaches and discuss their limitations. We then adapt the well-known watershed segmentation to obtain a fast and efficient classification. Finally, we compare our algorithm with three others by running all of them on a set of document images and comparing their results with a hand-made ground-truth segmentation.

Results show that the proposed algorithm is the fastest and obtains the best quality scores.
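
The abstract does not spell out how the authors adapt the watershed transform, so the following is only a minimal sketch of the generic idea: a marker-based flooding watershed on a toy grid, followed by a simple pixel-agreement score against a hand-made ground truth. The function names `watershed` and `pixel_accuracy`, the toy `height` and `markers` grids, and the choice of pixel accuracy as the score are all assumptions made for illustration and are not taken from the chapter.

```python
import heapq


def watershed(height, markers):
    """Marker-based flooding on a 2D grid (illustrative, not the paper's variant).

    height:  list of rows of numbers, e.g. a gradient-magnitude image.
    markers: same shape; 0 means unlabeled, positive integers are seed labels.
    Returns a new grid in which every reachable pixel carries a seed label.
    """
    rows, cols = len(height), len(height[0])
    labels = [row[:] for row in markers]
    heap = []
    counter = 0  # tie-breaker so pixels at equal level pop in insertion order

    # Start the flood from every seed pixel.
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != 0:
                heapq.heappush(heap, (height[r][c], counter, r, c))
                counter += 1

    # Always grow from the lowest pixel flooded so far.
    while heap:
        level, _, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]
                # The water level can never drop below the level it came from.
                new_level = max(level, height[nr][nc])
                heapq.heappush(heap, (new_level, counter, nr, nc))
                counter += 1
    return labels


def pixel_accuracy(predicted, truth):
    """Fraction of pixels whose label matches the ground truth."""
    total = sum(len(row) for row in truth)
    same = sum(
        p == t
        for p_row, t_row in zip(predicted, truth)
        for p, t in zip(p_row, t_row)
    )
    return same / total


# Toy "image": two flat basins separated by a ridge in the middle column.
height = [
    [0, 1, 5, 1, 0],
    [0, 1, 5, 1, 0],
    [0, 1, 5, 1, 0],
]
# One seed per basin (label 1 on the left, label 2 on the right).
markers = [
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 2],
    [0, 0, 0, 0, 0],
]
labels = watershed(height, markers)
score = pixel_accuracy(labels, labels)  # comparing a result with itself
```

In a document setting the labels would stand for region classes such as text, graphics, and background, and the seeds would come from some prior detection step. How the chapter chooses seeds and which quality scores it reports are not stated on this page.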

Author information

Authors and Affiliations

Oce Print Logic Technologies, 1 rue Jean Lemoine, 94015 Creteil cedex, France
Pasha Shadkami & Nicolas Bonnier

Editor information

Editors and Affiliations

DGA/D4S/MRIS, CEP/GIP, 16 bis avenue Prieur de la côte d’or, 94114, Arcueil, France
Jacques Blanc-Talon
Canon Information Systems Research Australia, Sydney, Australia
Don Bone
Telecommunications and Information processing (TELIN), Ghent University, St.-Pietersnieuwstraat 41, B9000, Ghent, Belgium
Wilfried Philips
CSIRO ICT Centre, PO Box 76, Epping, Sydney, NSW 1710, Australia
Dan Popescu
Physics, University of Antwerp, Universiteitsplein 1, Building N, 2610 Wilrijk, Belgium
Paul Scheunders

About this paper

Cite this paper

Shadkami, P., Bonnier, N. (2010). Watershed Based Document Image Analysis. In: Blanc-Talon, J., Bone, D., Philips, W., Popescu, D., Scheunders, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2010. Lecture Notes in Computer Science, vol 6474. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17688-3_12

DOI: https://doi.org/10.1007/978-3-642-17688-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17687-6
Online ISBN: 978-3-642-17688-3
eBook Packages: Computer Science, Computer Science (R0)
