Learning Segmentation of Documents with Complex Scripts

  • K. S. Sesh Kumar
  • Anoop M. Namboodiri
  • C. V. Jawahar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4338)


Most state-of-the-art segmentation algorithms are designed to handle complex document layouts and backgrounds while assuming a simple script structure, such as that of Roman script. They perform poorly on documents in Indian languages, where the components are not strictly collinear. In this paper, we propose a document segmentation algorithm that can handle the complexity of Indian scripts in large document image collections. Segmentation is posed as a graph cut problem that incorporates a priori information about script structure into the objective function of the cut. We show that this information can be learned automatically and adapted both within a collection of documents (a book) and across collections to achieve accurate segmentation. We report results on Indian language documents in Telugu script. The approach is also applicable to other languages with complex scripts, such as Bangla, Kannada, Malayalam, and Urdu.
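
As a rough illustration of how a script prior can enter a graph-based segmentation, the sketch below treats each connected component as a graph node, scores every pair of components with an affinity, cuts the edges whose affinity falls below a threshold, and reads the remaining connected groups as text lines. This is a much-simplified stand-in for the idea and not the objective function or cut algorithm of the paper. The names `affinity`, `segment_lines`, `sigma_x`, `sigma_y`, and `cut_below` are hypothetical, and the parameter values are made up; `sigma_y` plays the role of the script-dependent tolerance for vertical offsets that would, in the paper's setting, be learned from data.

```python
import math
from itertools import combinations


def affinity(a, b, sigma_x, sigma_y):
    """Affinity between two component centroids given as (x, y).

    sigma_y is a hypothetical script-dependent tolerance: a larger value
    accepts components that sit above or below the baseline.
    """
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return math.exp(-((dx / sigma_x) ** 2) - ((dy / sigma_y) ** 2))


def segment_lines(centroids, sigma_x=30.0, sigma_y=8.0, cut_below=0.1):
    """Group components into lines by cutting low-affinity edges.

    Simplification (assumption): a fixed threshold replaces the
    minimisation of a cut objective.
    """
    parent = list(range(len(centroids)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(centroids)), 2):
        if affinity(centroids[i], centroids[j], sigma_x, sigma_y) >= cut_below:
            parent[find(i)] = find(j)  # keep the edge: merge the two groups

    groups = {}
    for i in range(len(centroids)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up centroids: two text lines; component 2 sits above its line,
# as a vowel modifier might in a complex script.
components = [(10, 100), (40, 104), (42, 94), (70, 100),
              (10, 140), (40, 143), (70, 140)]
print(segment_lines(components))
```

With the looser vertical tolerance, the raised component stays attached to its line. Passing a tighter value such as `sigma_y=3.0`, closer to what a strictly collinear script would suggest, isolates that component, which mirrors the failure mode the abstract describes.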


Keywords: Digital Library; Segmentation Algorithm; Document Image; Indian Language; Segmentation Quality





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • K. S. Sesh Kumar (1)
  • Anoop M. Namboodiri (1)
  • C. V. Jawahar (1)
  1. Centre for Visual Information Technology, International Institute of Information Technology, Hyderabad, India
