Region-Based Caption Text Extraction

  • Miriam Leon
  • Veronica Vilaplana
  • Antoni Gasull
  • Ferran Marques
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 158)


This chapter presents a method for caption text detection. The method is intended to be part of a generic indexing system in which other semantic concepts are also detected automatically. To keep the detection system coherent, the various object detection algorithms share a common image description: a hierarchical region-based image model. The proposed method combines texture and geometric features to detect caption text. Texture features are estimated using wavelet analysis and are mainly used to spot text candidates. Text characteristics are then verified using geometric features, which are estimated from the region-based image model. Analysis of the region hierarchy provides the final caption text objects. The final step, a consistency analysis of the output, is performed by a binarization algorithm that robustly estimates its thresholds on the caption text area of support.
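The texture-based candidate spotting step can be illustrated with a minimal sketch. It is not the chapter's implementation: the abstract does not name the wavelet, the number of decomposition levels, or the decision rule, so the one-level Haar transform, the block-energy measure, the threshold parameter, and the function names `haar_detail_energy` and `spot_text_candidates` are all assumptions made for illustration.

```python
import numpy as np


def haar_detail_energy(image):
    """Energy of the one-level Haar detail sub-bands for each 2x2 block.

    Assumption: a single-level Haar transform stands in for the
    unspecified wavelet analysis mentioned in the abstract.
    """
    img = np.asarray(image, dtype=float)
    # Crop to even dimensions so the image splits cleanly into 2x2 blocks.
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]

    a = img[0::2, 0::2]  # top-left pixel of each block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right

    # Orthonormal Haar detail coefficients.
    lh = (a + b - c - d) / 2.0  # row differences (responds to horizontal edges)
    hl = (a - b + c - d) / 2.0  # column differences (responds to vertical edges)
    hh = (a - b - c + d) / 2.0  # diagonal differences

    return lh**2 + hl**2 + hh**2


def spot_text_candidates(image, threshold):
    """Boolean mask (half resolution) of blocks with high texture energy.

    Assumption: a fixed, hand-chosen threshold; the chapter may use a
    different, possibly adaptive, rule.
    """
    return haar_detail_energy(image) > threshold
```

On a flat region the detail energy is zero, while a region with dense stroke-like transitions yields high energy, which is why a measure of this kind is plausible for spotting candidates. In the method described above, such candidates would still have to be verified with geometric features on the region hierarchy.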


Text detection and localization; Binary Partition Tree



This work was partially funded by the Catalan Broadcasting Corporation (CCMA) and Mediapro through the Spanish project CENIT-2007-1012 i3media and TEC2007-66858/TCM PROVEC of the Spanish Government.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Miriam Leon (1)
  • Veronica Vilaplana (1)
  • Antoni Gasull (1)
  • Ferran Marques (1)
  1. Technical University of Catalonia, Barcelona, Spain
