Abstract
Recently, deep neural networks have shown great performance on supervised image analysis tasks. However, expert image collections with little associated information or prior knowledge still need indexing tools that best capture the experts' needs. Our work fits this very specific application context, where only a few expert users can appropriately label the images. In this paper, we therefore consider small expert collections with no associated relevant label set and no structured knowledge. In this context, we propose an automatic and adaptive framework, based on the well-known bag-of-visual-words and bag-of-visual-phrases models, that selects the most relevant visual descriptors for each keypoint in order to build a more discriminative image representation. Within this framework, we combine an information gain model with visual saliency information to enhance the representation. Experimental results show the adaptiveness and performance of our unsupervised framework on well-known “generic” datasets as well as on a cultural heritage expert dataset.
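The core idea of weighting a bag-of-visual-words representation by per-keypoint saliency can be sketched as follows. This is a minimal illustration, not the paper's implementation: the codebook, descriptors, and `saliency_weighted_bovw` function are hypothetical stand-ins, and the information gain component of the framework is omitted.

```python
import numpy as np

def saliency_weighted_bovw(descriptors, saliency, codebook):
    """Build a saliency-weighted bag-of-visual-words histogram.

    descriptors: (n, d) array of local descriptors for one image
    saliency:    (n,) saliency weight of each keypoint, e.g. in [0, 1]
    codebook:    (k, d) visual-word centroids (e.g. from k-means)
    Returns an L1-normalised histogram of length k.
    """
    # Assign each descriptor to its nearest visual word (Euclidean distance).
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    # Accumulate saliency weights instead of raw occurrence counts,
    # so salient keypoints contribute more to the image signature.
    hist = np.zeros(len(codebook))
    np.add.at(hist, words, saliency)
    total = hist.sum()
    return hist / total if total > 0 else hist
```

With a plain (unweighted) histogram, every keypoint counts equally; here a keypoint in a salient region with weight 0.9 contributes nine times as much as one with weight 0.1, which is one simple way to bias the representation toward visually important regions.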
Copyright information
© 2018 Springer International Publishing AG
Cite this paper
Michaud, D., Urruty, T., Lecellier, F., Carré, P. (2018). Adaptive Image Representation Using Information Gain and Saliency: Application to Cultural Heritage Datasets. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science(), vol 10704. Springer, Cham. https://doi.org/10.1007/978-3-319-73603-7_5
DOI: https://doi.org/10.1007/978-3-319-73603-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73602-0
Online ISBN: 978-3-319-73603-7
eBook Packages: Computer Science (R0)