Scene Text Detection and Tracking for Wearable Text-to-Speech Translation Camera

Goto, Hideaki; Liu, Kunqi

doi:10.1007/978-3-319-41267-2_4


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9759)

Included in the following conference series:

International Conference on Computers Helping People with Special Needs


Abstract

Camera-based character recognition applications equipped with a voice synthesizer help blind users read text in their surroundings. Applications currently on the market, and similar research prototypes, require the user to actively perform reading actions, which interferes with other activities. We presented a different approach at ICCHP 2014: the user can remain passive while the device actively finds useful text in the scene. A text tracking feature was introduced to avoid reading the same text more than once. This report presents an improved system with two key components, scene text detection and tracking, that can handle text in various languages, including Japanese and Chinese, and resolves some scene analysis problems such as the merging of text lines. We employ the MSER (Maximally Stable Extremal Regions) algorithm to obtain better text images and have developed a new text validation filter. Some technical challenges for future device design are also presented.
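
The idea of using tracking to avoid duplicate reading can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not describe how their tracker works, so the sketch assumes a simple overlap (intersection-over-union) match between text boxes in consecutive frames. The names `TextTracker`, `iou`, and `iou_threshold` are hypothetical, and the boxes are assumed to come from some upstream text detector.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# A text box as (x, y, width, height) in pixels; assumed detector output format.
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


@dataclass
class TextTracker:
    """Hypothetical tracker: remembers the boxes seen in the previous frame."""
    iou_threshold: float = 0.5  # assumed value, not taken from the paper
    tracks: List[Box] = field(default_factory=list)

    def update(self, detections: List[Box]) -> List[Box]:
        """Return only the boxes that match no box from the previous frame.

        These are the candidates to send to recognition and speech output;
        boxes that overlap an existing track are treated as already read.
        """
        new_boxes = [
            box for box in detections
            if not any(iou(box, t) >= self.iou_threshold for t in self.tracks)
        ]
        # Keep the current positions so slowly moving text stays matched.
        self.tracks = list(detections)
        return new_boxes


tracker = TextTracker()
frame1 = tracker.update([(10, 10, 100, 20)])
frame2 = tracker.update([(12, 11, 100, 20), (200, 50, 80, 20)])
```

In this sketch the sign detected in the first frame is reported once, and in the second frame only the newly appearing box is reported. Text that leaves the view and returns would be read again, since tracks are kept for one frame only; a practical system would need a longer memory and a more robust matching criterion.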


References

KNFB Reader. http://www.knfbreader.com/
Goto, H., Hoda, T.: Real-time text tracking for text-to-speech translation camera for the blind. In: Miesenberger, K., Fels, D., Archambault, D., Peňáz, P., Zagler, W. (eds.) ICCHP 2014, Part I. LNCS, vol. 8547, pp. 658–661. Springer, Heidelberg (2014)
Matas, J., Chum, O., Urban, M., Pajdla, T.: Robust wide-baseline stereo from maximally stable extremal regions. Image Vis. Comput. 22(10), 761–767 (2004)
Bay, H., Ess, A., Tuytelaars, T., van Gool, L.: SURF: speeded up robust features. Comput. Vis. Image Underst. 110(3), 346–359 (2008)

Author information

Authors and Affiliations

Cyberscience Center, Tohoku University, Sendai, Japan
Hideaki Goto
Graduate School of Information Sciences, Tohoku University, Sendai, Japan
Kunqi Liu


Corresponding author

Correspondence to Hideaki Goto.

Editor information

Editors and Affiliations

Institute Integriert Studieren, Universität Linz, Linz, Austria
Klaus Miesenberger
Rehabilitationswissenschaften, Technische Universität Dortmund, Dortmund, Germany
Christian Bühler
Masaryk University, Brno, Czech Republic
Petr Penaz


About this paper

Cite this paper

Goto, H., Liu, K. (2016). Scene Text Detection and Tracking for Wearable Text-to-Speech Translation Camera. In: Miesenberger, K., Bühler, C., Penaz, P. (eds) Computers Helping People with Special Needs. ICCHP 2016. Lecture Notes in Computer Science, vol 9759. Springer, Cham. https://doi.org/10.1007/978-3-319-41267-2_4


DOI: https://doi.org/10.1007/978-3-319-41267-2_4
Published: 06 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41266-5
Online ISBN: 978-3-319-41267-2
eBook Packages: Computer Science, Computer Science (R0)
