
Adaptive Image Representation Using Information Gain and Saliency: Application to Cultural Heritage Datasets

  • Conference paper
  • First Online:
MultiMedia Modeling (MMM 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10704)


Abstract

Recently, deep neural networks have shown great performance on supervised image analysis tasks. However, expert image datasets with little information or prior knowledge still need indexing tools that best reflect the experts' needs. Our work fits this very specific application context, in which only a few expert users can appropriately label the images. In this paper, we therefore consider small expert collections with no relevant label set or structured knowledge attached. In this context, we propose an automatic and adaptive framework, based on the well-known bag of visual words and phrases models, that selects relevant visual descriptors for each keypoint to build a more discriminative image representation. Within this framework, we combine an information gain model with visual saliency to enhance the image representation. Experimental results demonstrate the adaptiveness and performance of our unsupervised framework on well-known “generic” datasets as well as on a cultural heritage expert dataset.
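The abstract does not spell out the weighting scheme, but the general idea of a saliency- and information-gain-weighted bag of visual words can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the per-keypoint saliency vote, and the idf-style stand-in for the information gain term are all assumptions made for the example.

```python
import numpy as np

def bovw_histogram(descriptors, vocabulary, saliency, idf):
    """Quantize local descriptors against a visual vocabulary and build a
    histogram where each keypoint's vote is weighted by its saliency and
    each word is reweighted by an idf-style information-gain term.
    (Hypothetical sketch; not the paper's actual pipeline.)"""
    # Assign each descriptor (N x D) to its nearest visual word (K x D).
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=-1)
    words = d2.argmin(axis=1)
    # Saliency-weighted votes: salient keypoints contribute more.
    hist = np.zeros(len(vocabulary))
    np.add.at(hist, words, saliency)
    # Word-level information-gain reweighting (idf used as a placeholder).
    hist *= idf
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# Toy usage with random data standing in for real keypoint descriptors.
rng = np.random.default_rng(0)
desc = rng.normal(size=(50, 8))    # 50 local descriptors, 8-dim
vocab = rng.normal(size=(10, 8))   # 10-word visual vocabulary
sal = rng.uniform(size=50)         # per-keypoint saliency in [0, 1]
idf = np.ones(10)                  # uniform weights as placeholder
h = bovw_histogram(desc, vocab, sal, idf)
```

In practice the vocabulary would be learned (e.g. by k-means over training descriptors) and the saliency map would come from a model such as graph-based visual saliency, with `idf` replaced by the paper's information gain measure.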



Author information

Correspondence to Dorian Michaud.


Copyright information

© 2018 Springer International Publishing AG

About this paper


Cite this paper

Michaud, D., Urruty, T., Lecellier, F., Carré, P. (2018). Adaptive Image Representation Using Information Gain and Saliency: Application to Cultural Heritage Datasets. In: Schoeffmann, K., et al. (eds.) MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10704. Springer, Cham. https://doi.org/10.1007/978-3-319-73603-7_5


  • DOI: https://doi.org/10.1007/978-3-319-73603-7_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73602-0

  • Online ISBN: 978-3-319-73603-7

  • eBook Packages: Computer Science (R0)
