LDA-Based Word Image Representation for Keyword Spotting on Historical Mongolian Documents

Wei, Hongxi; Gao, Guanglai; Su, Xiangdong

doi:10.1007/978-3-319-46681-1_52

Hongxi Wei¹⁹,
Guanglai Gao¹⁹ &
Xiangdong Su¹⁹

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9950))

Included in the following conference series:

International Conference on Neural Information Processing

2548 Accesses
9 Citations

Abstract

The original Bag-of-Visual-Words approach discards the spatial relations of the visual words. In this paper, a LDA-based topic model is adopted to obtain the semantic relations of visual words for each word image. Because the LDA-based topic model usually hurts retrieval performance when directly employs itself. Therefore, the LDA-based topic model is linearly combined with a visual language model for each word image in this study. After that, the basic query likelihood model is used for realizing the procedure of retrieval. The experimental results on our dataset show that the proposed LDA-based representation approach can efficiently and accurately attain to the aim of keyword spotting on a collection of historical Mongolian documents. Meanwhile, the proposed approach improves the performance significantly than the original BoVW approach.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Integrating Visual Word Embeddings into Translation Language Model for Keyword Spotting on Historical Mongolian Document Images

Word Spotting Application in Historical Mongolian Document Images

Latent Dirichlet Allocation Based Image Retrieval

References

Manmatha, R., Han, C., Riseman, E.M., Croft, W.B.: Indexing handwriting using word matching. In: Proceedings of ICDL 1996, pp. 151–159. ACM Press, New York (1996)
Google Scholar
Rath, T.M., Manmatha, R.: Features for word spotting in historical manuscripts. In: Proceedings of ICDAR 2003, pp. 218–222. IEEE Press, New York (2003)
Google Scholar
Rath, T.M., Manmatha, R.: Word image matching using dynamic time warping. In: Proceedings of CVPR 2003, pp. 521–527. IEEE Press, New York (2003)
Google Scholar
Chen, X., Hu, X., Shen, X.: Spatial weighting for bag-of-visual-words and its application in content-based image retrieval. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, T.-B. (eds.) PAKDD 2009. LNCS, vol. 5476, pp. 867–874. Springer, Heidelberg (2009)
Chapter Google Scholar
Tirilly, P., Claveau, V., Gros, P.: Distance and weighting schemes for bag of visual words image retrieval. In: Proceedings of MIR 2010, pp. 323–332. ACM Press, New York (2010)
Google Scholar
Zhu, L., Jin, H., Zheng, R., Feng, X.: Weighting scheme for image retrieval based on bag-of-visual-words. IET Image Process 8(9), 509–518 (2014)
Article Google Scholar
Manning, C.D., Raghavan, P., Schutze, H.: Introduction to Information Retrieval. Cambridge University Press, Cambridge (2008). PP. 120–126
Book MATH Google Scholar
Lopes-Monroy, A.P., Montes-Y-Gomez, M., Escalante, H.J., Cruz-Roa, A., Gonzalez, F.A.: Improving the BoVW via discriminative visual n-grams and MKL strategies. Neurocomputing 175, 768–781 (2016)
Article Google Scholar
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y.: Locality-constrained linear coding for image classification. In: Proceedings of CVPR 2010, pp. 3360–3367. IEEE Press, New York (2010)
Google Scholar
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: Proceedings of CVPR 2006, pp. 2169–2178. IEEE Press, New York (2006)
Google Scholar
Ponte, J.M., Croft, W.B.: A language modeling approach to information retrieval. In: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 1998), pp. 275–281. ACM Press, New York (1998)
Google Scholar
Blei, D.M., Ng, A.Y., Jordan, M.J.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
MATH Google Scholar
Wei, X., Croft, W.B.: LDA-based document models for ad-hoc retrieval. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2006), pp. 178–185. ACM Press, New York (2006)
Google Scholar
Wu, L., Li, M., Li, Z., Ma, W., Yu, N.: Visual language modeling for image classification. In: Proceedings of MIR 2007, pp. 115–124. ACM Press, New York (2007)
Google Scholar
Wei, H., Gao, G., Bao, Y., Wang, Y.: An efficient binarization method for ancient Mongolian document images. In: Proceedings of the 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE 2010), pp. 43–46. IEEE Press, New York (2010)
Google Scholar
Wei, H., Gao, G.: A keyword retrieval system for historical Mongolian document images. Int. J. Doc. Anal. Recogn. (IJDAR) 17(1), 33–45 (2014)
Article Google Scholar
Zhai, C., Lafferty, J.: A study of smoothing methods for language models applied to ad hoc information retrieval. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2001), pp. 334–342. ACM Press, New York (2001)
Google Scholar

Download references

Acknowledgements

The paper is supported by the National Natural Science Foundation of China under Grant 61463038 and the Research Project of Higher Education School of Inner Mongolia Autonomous Region under Grant NJZY14007.

Author information

Authors and Affiliations

School of Computer Science, Inner Mongolia University, Hohhot, China
Hongxi Wei, Guanglai Gao & Xiangdong Su

Authors

Hongxi Wei
View author publications
You can also search for this author in PubMed Google Scholar
Guanglai Gao
View author publications
You can also search for this author in PubMed Google Scholar
Xiangdong Su
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Hongxi Wei .

Editor information

Editors and Affiliations

The University of Tokyo , Tokyo, Japan
Akira Hirose
Kobe University , Kobe, Japan
Seiichi Ozawa
Okinawa Institute of Science and Technology Graduate University, Onna, Japan
Kenji Doya
Nara Institute of Science and Technology , Ikoma, Japan
Kazushi Ikeda
Kyungpook National University , Daegu, Korea (Republic of)
Minho Lee
Chinese Academy of Sciences , Beijing, China
Derong Liu

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wei, H., Gao, G., Su, X. (2016). LDA-Based Word Image Representation for Keyword Spotting on Historical Mongolian Documents. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science(), vol 9950. Springer, Cham. https://doi.org/10.1007/978-3-319-46681-1_52

Download citation

DOI: https://doi.org/10.1007/978-3-319-46681-1_52
Published: 30 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46680-4
Online ISBN: 978-3-319-46681-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

LDA-Based Word Image Representation for Keyword Spotting on Historical Mongolian Documents

Abstract

Access this chapter

Similar content being viewed by others

Integrating Visual Word Embeddings into Translation Language Model for Keyword Spotting on Historical Mongolian Document Images

Word Spotting Application in Historical Mongolian Document Images

Latent Dirichlet Allocation Based Image Retrieval

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

LDA-Based Word Image Representation for Keyword Spotting on Historical Mongolian Documents

Abstract

Access this chapter

Similar content being viewed by others

Integrating Visual Word Embeddings into Translation Language Model for Keyword Spotting on Historical Mongolian Document Images

Word Spotting Application in Historical Mongolian Document Images

Latent Dirichlet Allocation Based Image Retrieval

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation