Text Localization and Recognition in Images and Video

  • Seiichi Uchida, Department of Advanced Information Technology, Kyushu University


This chapter reviews techniques for text localization and recognition in scene images captured by a camera. Because scene text differs from text in scanned documents in many respects, specific techniques are needed to localize and recognize it. Localization of scene text is both important and difficult: in general, there is no prior information on the location, layout, direction, size, typeface, or color of text in a scene image, and many textures and patterns resemble characters. Recognition of scene text is also difficult because many characters are degraded by blurring, perspective distortion, nonuniform lighting, and low resolution. Decoration of characters makes recognition far more difficult. As reviewed in this chapter, these tasks have been tackled not only with modified versions of conventional OCR techniques but also with state-of-the-art computer vision and pattern recognition methodologies.


Scene text detection, Scene text recognition, Text image acquisition, Text localization, Text/non-text discrimination, Video caption detection, Video caption recognition