Watershed Based Document Image Analysis

  • Conference paper
Advanced Concepts for Intelligent Vision Systems (ACIVS 2010)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6474)

Abstract

Document image analysis segments a document image and classifies its regions into categories such as text, graphics, and background. In this paper, we first review existing document image analysis approaches and discuss their limitations. We then adapt the well-known watershed segmentation to obtain a very fast and efficient classification. Finally, we compare our algorithm with three others by running all four on a set of document images and comparing their results against a hand-made ground-truth segmentation.

Results show that the proposed algorithm is the fastest and obtains the best quality scores.
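The abstract does not say how the quality scores are computed. As a minimal sketch of what comparing a result against a hand-made ground truth can look like, the Python below measures pixel-wise label agreement, overall and per class. The function name `pixel_agreement`, the label strings, and the choice of plain agreement as the score are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def pixel_agreement(predicted, ground_truth):
    """Compare two label maps given as equally sized lists of rows.

    Returns (overall, per_class), where overall is the fraction of pixels
    whose predicted label equals the ground-truth label, and per_class maps
    each ground-truth label to the fraction of its pixels that were
    predicted correctly. This is an illustrative score, not the paper's.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("label maps must have the same number of rows")

    total = Counter()    # pixels per ground-truth label
    correct = Counter()  # correctly predicted pixels per ground-truth label

    for pred_row, truth_row in zip(predicted, ground_truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("label maps must have rows of equal length")
        for pred_label, truth_label in zip(pred_row, truth_row):
            total[truth_label] += 1
            if pred_label == truth_label:
                correct[truth_label] += 1

    pixel_count = sum(total.values())
    overall = sum(correct.values()) / pixel_count if pixel_count else 0.0
    per_class = {label: correct[label] / total[label] for label in total}
    return overall, per_class


# Hypothetical 2x3 label maps using the three categories named in the abstract.
truth = [
    ["text", "text", "background"],
    ["graphic", "graphic", "background"],
]
result = [
    ["text", "background", "background"],
    ["graphic", "graphic", "graphic"],
]
overall, per_class = pixel_agreement(result, truth)
```

Running each competing algorithm through the same function on the same image set would give comparable numbers, though a region-based layout evaluation could rank the algorithms differently from this pixel-level score.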

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shadkami, P., Bonnier, N. (2010). Watershed Based Document Image Analysis. In: Blanc-Talon, J., Bone, D., Philips, W., Popescu, D., Scheunders, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2010. Lecture Notes in Computer Science, vol 6474. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17688-3_12

  • DOI: https://doi.org/10.1007/978-3-642-17688-3_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17687-6

  • Online ISBN: 978-3-642-17688-3

  • eBook Packages: Computer Science, Computer Science (R0)
