Word Spotting Application in Historical Mongolian Document Images

Wei, Hongxi; Gao, Guanglai

doi:10.1007/978-3-642-39479-9_32


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7995)

Included in the following conference series:

International Conference on Intelligent Computing


Abstract

This paper proposes a word-spotting framework for indexing and retrieving historical Mongolian document images. Scanned document images are first segmented into word images by preprocessing steps such as binarization and connected component analysis. Each word image then passes through three steps: removal of inflectional suffixes, feature extraction, and conversion to a fixed-length representation. The resulting fixed-length feature vector serves as the indexing term for that word image. At the retrieval stage, the query keyword image is synthesized from a sequence of glyphs according to the spelling rules of the Mongolian language, and is converted into a fixed-length feature vector by the same procedure. For word matching, a ranked list of candidate word images is returned in descending order of similarity to the query keyword image. Experimental results on the data set demonstrate the feasibility and effectiveness of the proposed framework.
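
The indexing and matching steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the features or the similarity measure, so the sketch assumes a projection profile along the writing direction as the feature, linear resampling as the fixed-length representation, and cosine similarity for ranking. The function names `fixed_length_vector` and `rank_candidates` and the vector length of 32 are hypothetical. Suffix removal and glyph-based query synthesis are omitted.

```python
import numpy as np


def fixed_length_vector(word_img: np.ndarray, length: int = 32) -> np.ndarray:
    """Map a binary word image (1 = ink) of any size to a unit-norm vector.

    Assumption: the feature is an ink-count profile; the abstract does not
    say which features the paper extracts.
    """
    # Traditional Mongolian is written top to bottom, so profile the rows.
    profile = word_img.sum(axis=1).astype(float)
    # Resample the variable-length profile to exactly `length` points.
    src = np.linspace(0.0, 1.0, num=len(profile))
    dst = np.linspace(0.0, 1.0, num=length)
    vec = np.interp(dst, src, profile)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def rank_candidates(query_img, candidate_imgs, length=32):
    """Return (candidate index, similarity) pairs, most similar first."""
    q = fixed_length_vector(query_img, length)
    index = np.stack([fixed_length_vector(c, length) for c in candidate_imgs])
    # Vectors are unit-norm, so the dot product is the cosine similarity.
    sims = index @ q
    order = np.argsort(-sims, kind="stable")
    return [(int(i), float(sims[i])) for i in order]
```

Because every word image, including the query, is reduced to a vector of the same length, the index can be stored as a single matrix and a query is answered with one matrix-vector product. Any feature extractor that yields a fixed-length vector could replace the profile used here.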



Author information

Authors and Affiliations

School of Computer Science, Inner Mongolia University, Hohhot, China
Hongxi Wei & Guanglai Gao


Editor information

Editors and Affiliations

Machine Learning and Systems Biology Laboratory, Tongji University, 4800 Caoan Road, 201804, Shanghai, China
De-Shuang Huang
Electrical and Electronics Department, Polytechnic of Bari, Via Orabona 4, 70125, Bari, Italy
Vitoantonio Bevilacqua
Faculty of Engineering, District University Francisco José de Caldas, Cra. 7a No. 40-53, Fifth Floor, Bogotá, Colombia
Juan Carlos Figueroa
School of Electrical, Computer and Telecommunications Engineering, The University of Wollongong, 2522, North Wollongong, NSW, Australia
Prashan Premaratne


About this paper

Cite this paper

Wei, H., Gao, G. (2013). Word Spotting Application in Historical Mongolian Document Images. In: Huang, DS., Bevilacqua, V., Figueroa, J.C., Premaratne, P. (eds) Intelligent Computing Theories. ICIC 2013. Lecture Notes in Computer Science, vol 7995. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39479-9_32


DOI: https://doi.org/10.1007/978-3-642-39479-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39478-2
Online ISBN: 978-3-642-39479-9
eBook Packages: Computer Science, Computer Science (R0)
