Content-Based Similar Document Image Retrieval Using Fusion of CNN Features

  • Conference paper
Internet Multimedia Computing and Service (ICIMCS 2017)

Abstract

The rapid increase in digitized document images has created high demand for document image retrieval. While conventional document image retrieval approaches depend on complex OCR-based text recognition and text similarity detection, this paper proposes a new content-based approach that focuses on feature extraction and feature fusion. In the proposed approach, multiple features of each document image are extracted by different CNN models. The extracted CNN features are then reduced and fused into a weighted average feature. Finally, the document images are ranked by the similarity of their features to those of the query image. Experiments are performed on a set of document images converted from academic papers, containing both English and Chinese documents. The results show that the proposed approach retrieves document images with similar text content well, and that fusing CNN features can improve retrieval accuracy.
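
The pipeline summarized in the abstract (extract, reduce, fuse, rank) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the reduction method, the similarity measure, or the weights, so PCA, cosine similarity, and the weights 0.6/0.4 are placeholders, and random matrices stand in for features that would come from real CNN models. All function names are hypothetical.

```python
import numpy as np


def pca_reduce(features, k):
    """Project rows of `features` onto their top-k principal components (assumed reduction step)."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so features from different models are comparable."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def fuse_features(feature_sets, weights, k):
    """Reduce each model's features to k dimensions, then take their weighted average."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # make the weights sum to 1
    reduced = [l2_normalize(pca_reduce(f, k)) for f in feature_sets]
    fused = sum(w * r for w, r in zip(weights, reduced))
    return l2_normalize(fused)


def rank_by_similarity(fused, query_index):
    """Return document indices sorted by decreasing cosine similarity to the query."""
    # Rows are unit length, so the dot product equals cosine similarity.
    scores = fused @ fused[query_index]
    return np.argsort(-scores), scores


# Stand-in data: 20 documents described by two hypothetical CNN models
# with 64- and 128-dimensional outputs (random values, not real features).
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(20, 64))
feats_b = rng.normal(size=(20, 128))

fused = fuse_features([feats_a, feats_b], weights=[0.6, 0.4], k=8)
order, scores = rank_by_similarity(fused, query_index=3)
print(order[:5])  # the query itself, followed by its nearest neighbours
```

Normalizing each reduced feature before averaging keeps one model from dominating the fused vector purely because of its numeric range; whether the paper does this is not stated in the abstract.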

Acknowledgments

This work was partially supported by the National Research Foundation of China (61402391).

Corresponding author

Correspondence to Mao Tan.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Tan, M., Yuan, S., Su, Y. (2018). Content-Based Similar Document Image Retrieval Using Fusion of CNN Features. In: Huet, B., Nie, L., Hong, R. (eds) Internet Multimedia Computing and Service. ICIMCS 2017. Communications in Computer and Information Science, vol 819. Springer, Singapore. https://doi.org/10.1007/978-981-10-8530-7_25

  • DOI: https://doi.org/10.1007/978-981-10-8530-7_25

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8529-1

  • Online ISBN: 978-981-10-8530-7

  • eBook Packages: Computer Science, Computer Science (R0)
