Word Grouping in Document Images Based on Voronoi Tessellation

  • Yue Lu
  • Zhe Wang
  • Chew Lim Tan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3163)


Voronoi tessellation of image elements provides an intuitive and appealing definition of proximity, and it has been suggested as an effective tool for describing relations among neighboring objects in a digital image. This paper presents a Voronoi tessellation based method for word grouping in document images. Voronoi neighborhoods are generated from the tessellation, together with information about the relations and distances between neighboring connected components, and word grouping is carried out on that basis. The proposed method has been evaluated on a variety of document images. The experimental results show that it achieves promising results with high accuracy and is robust to various font types, styles, sizes, and skew angles, as well as to different text orientations.
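The general idea of grouping connected components through their Voronoi neighborhoods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the grouping criteria, so the sketch assumes that each connected component is already available as an axis-aligned bounding box, approximates the tessellation by assigning every pixel to its nearest box, and merges neighboring components whose gap is at most a fixed threshold. The names `voronoi_neighbors`, `group_words`, and `max_gap` are hypothetical.

```python
from itertools import product


def point_box_dist2(px, py, box):
    """Squared distance from pixel (px, py) to box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    dx = max(x0 - px, 0, px - x1)
    dy = max(y0 - py, 0, py - y1)
    return dx * dx + dy * dy


def box_gap(a, b):
    """Euclidean gap between two boxes (0 if they touch or overlap)."""
    dx = max(b[0] - a[2], a[0] - b[2], 0)
    dy = max(b[1] - a[3], a[1] - b[3], 0)
    return (dx * dx + dy * dy) ** 0.5


def voronoi_neighbors(boxes, width, height):
    """Pairs of components whose discrete Voronoi cells share a border."""
    # Label every pixel with the index of its nearest component
    # (ties go to the lower index).
    label = [
        [min(range(len(boxes)),
             key=lambda i: point_box_dist2(x, y, boxes[i]))
         for x in range(width)]
        for y in range(height)
    ]
    pairs = set()
    for y, x in product(range(height), range(width)):
        # Compare each pixel with its right and lower neighbor.
        for nx, ny in ((x + 1, y), (x, y + 1)):
            if nx < width and ny < height:
                a, b = label[y][x], label[ny][nx]
                if a != b:
                    pairs.add((min(a, b), max(a, b)))
    return pairs


def group_words(boxes, width, height, max_gap):
    """Merge Voronoi neighbors closer than max_gap (assumed fixed threshold)."""
    parent = list(range(len(boxes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in voronoi_neighbors(boxes, width, height):
        if box_gap(boxes[a], boxes[b]) <= max_gap:
            parent[find(a)] = find(b)

    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Five character boxes on one text line: three close together, then two.
boxes = [(2, 5, 5, 12), (7, 5, 10, 12), (12, 5, 15, 12),
         (22, 5, 25, 12), (27, 5, 30, 12)]
print(group_words(boxes, width=35, height=18, max_gap=3))
# → [[0, 1, 2], [3, 4]]
```

The brute-force pixel labeling is only practical for small images. A fixed `max_gap` is also a simplification, since a threshold that adapts to font size would be needed to handle the range of sizes and orientations the abstract mentions.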


Keywords: Voronoi Diagram, Document Image, Text Line, Image Element, Voronoi Tessellation



Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Yue Lu (1, 2)
  • Zhe Wang (2)
  • Chew Lim Tan (2)
  1. Department of Computer Science and Technology, East China Normal University, Shanghai, China
  2. Department of Computer Science, School of Computing, National University of Singapore, Kent Ridge, Singapore
