Abstract
The rapid growth of digitized document images has created strong demand for document image retrieval. Whereas conventional document image retrieval approaches depend on complex OCR-based text recognition and text similarity detection, this paper proposes a content-based approach that concentrates on feature extraction and feature fusion. In the proposed approach, multiple features are extracted from each document image by different CNN models. The extracted CNN features are then reduced and fused into a weighted average feature. Finally, the document images are ranked by the similarity of their features to the feature of the query image. Experiments are performed on a set of document images converted from academic papers, containing both English and Chinese documents. The results show that the proposed approach retrieves document images with similar text content well, and that fusing CNN features improves retrieval accuracy.
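The fusion and ranking steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors are toy stand-ins for already-reduced CNN outputs, the weights 0.6 and 0.4 are arbitrary, and L2 normalization and cosine similarity are assumptions, since the abstract does not name the normalization or the similarity measure. The function names `l2_normalize`, `fuse`, `cosine`, and `rank` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(features, weights):
    """Weighted average of same-length feature vectors, each L2-normalized first."""
    if len(features) != len(weights):
        raise ValueError("one weight per feature vector is required")
    total = sum(weights)
    normalized = [l2_normalize(f) for f in features]
    dim = len(normalized[0])
    return [
        sum(w * f[i] for w, f in zip(weights, normalized)) / total
        for i in range(dim)
    ]


def cosine(a, b):
    """Cosine similarity (assumed measure; the abstract only says 'similarity')."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def rank(query, database):
    """Return database keys sorted by decreasing similarity to the query."""
    return sorted(database, key=lambda k: cosine(query, database[k]), reverse=True)


# Toy stand-ins for reduced features from two hypothetical CNN models.
weights = [0.6, 0.4]
docs = {
    "doc_a": fuse([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]], weights),
    "doc_b": fuse([[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]], weights),
    "doc_c": fuse([[0.0, 0.0, 1.0], [0.0, 0.2, 0.8]], weights),
}
query = fuse([[0.9, 0.1, 0.0], [1.0, 0.0, 0.0]], weights)
ranking = rank(query, docs)
print(ranking)
```

In this sketch every model's feature is assumed to have the same length after reduction, which is what makes an element-wise weighted average possible; features of different lengths would need a separate reduction step per model first.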
Acknowledgments
This work was partially supported by the National Research Foundation of China (61402391).
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Tan, M., Yuan, S., Su, Y. (2018). Content-Based Similar Document Image Retrieval Using Fusion of CNN Features. In: Huet, B., Nie, L., Hong, R. (eds) Internet Multimedia Computing and Service. ICIMCS 2017. Communications in Computer and Information Science, vol 819. Springer, Singapore. https://doi.org/10.1007/978-981-10-8530-7_25
DOI: https://doi.org/10.1007/978-981-10-8530-7_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8529-1
Online ISBN: 978-981-10-8530-7
eBook Packages: Computer Science, Computer Science (R0)