LDA-Based Word Image Representation for Keyword Spotting on Historical Mongolian Documents

  • Conference paper
Neural Information Processing (ICONIP 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9950)

Abstract

The original Bag-of-Visual-Words (BoVW) approach discards the spatial relations of the visual words. In this paper, an LDA-based topic model is adopted to capture the semantic relations among the visual words of each word image. Because an LDA-based topic model usually hurts retrieval performance when it is used on its own, it is linearly combined with a visual language model for each word image in this study. Retrieval is then carried out with the basic query likelihood model. The experimental results on our dataset show that the proposed LDA-based representation achieves efficient and accurate keyword spotting on a collection of historical Mongolian documents, and that it significantly outperforms the original BoVW approach.
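
The abstract gives no formulas, so the following is only a minimal sketch of how a linear combination of an LDA-based model and a visual language model could feed a query likelihood ranking. The Dirichlet-smoothed unigram estimate standing in for the visual language model, the parameter names `lam` and `mu`, the function names, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import math

def lda_word_probs(theta_d, phi):
    """P_lda(w|d) = sum_k theta_d[k] * phi[k][w] for every visual word w."""
    vocab_size = len(phi[0])
    return [sum(t * topic[w] for t, topic in zip(theta_d, phi))
            for w in range(vocab_size)]

def smoothed_unigram_probs(counts_d, collection_probs, mu):
    """Dirichlet-smoothed unigram estimate (assumed stand-in for the
    visual language model; the paper may define it differently)."""
    total = sum(counts_d)
    return [(c + mu * p) / (total + mu)
            for c, p in zip(counts_d, collection_probs)]

def combined_probs(counts_d, theta_d, phi, collection_probs, lam, mu):
    """Linear combination: lam * P_vlm(w|d) + (1 - lam) * P_lda(w|d)."""
    p_vlm = smoothed_unigram_probs(counts_d, collection_probs, mu)
    p_lda = lda_word_probs(theta_d, phi)
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_vlm, p_lda)]

def query_log_likelihood(query_counts, doc_probs):
    """log P(Q|d) = sum_w count(w, Q) * log P(w|d)."""
    return sum(q * math.log(p)
               for q, p in zip(query_counts, doc_probs) if q > 0)

def rank_word_images(query_counts, docs_counts, thetas, phi, lam=0.7, mu=10.0):
    """Return word-image indices sorted by descending query likelihood."""
    # Collection-level visual word probabilities, used for smoothing.
    vocab_size = len(phi[0])
    totals = [sum(d[w] for d in docs_counts) for w in range(vocab_size)]
    grand_total = sum(totals)
    collection_probs = [t / grand_total for t in totals]
    scores = []
    for counts_d, theta_d in zip(docs_counts, thetas):
        probs = combined_probs(counts_d, theta_d, phi, collection_probs, lam, mu)
        scores.append(query_log_likelihood(query_counts, probs))
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Hypothetical toy data: 4 visual words, 2 topics, 3 word images.
# phi is strictly positive here so that log() never sees a zero.
phi = [[0.4, 0.4, 0.1, 0.1],
       [0.1, 0.1, 0.4, 0.4]]
thetas = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
docs_counts = [[3, 2, 0, 0], [0, 0, 2, 3], [1, 1, 1, 1]]
query_counts = [2, 1, 0, 0]

ranking = rank_word_images(query_counts, docs_counts, thetas, phi)
print(ranking)
```

In this sketch `lam` controls how much weight the count-based model receives relative to the topic model; in practice `phi` and `thetas` would come from a trained LDA model over the visual words rather than being written by hand.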

Acknowledgements

The paper is supported by the National Natural Science Foundation of China under Grant 61463038 and the Research Project of Higher Education School of Inner Mongolia Autonomous Region under Grant NJZY14007.

Author information

Corresponding author

Correspondence to Hongxi Wei.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Wei, H., Gao, G., Su, X. (2016). LDA-Based Word Image Representation for Keyword Spotting on Historical Mongolian Documents. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9950. Springer, Cham. https://doi.org/10.1007/978-3-319-46681-1_52

  • DOI: https://doi.org/10.1007/978-3-319-46681-1_52

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46680-4

  • Online ISBN: 978-3-319-46681-1

  • eBook Packages: Computer Science, Computer Science (R0)
