Abstract
The availability of large annotated visual resources, such as ImageNet, has recently led to important advances in image mining tasks. However, manually annotating such resources is cumbersome. Exploiting Web datasets as a substitute or complement is an interesting but challenging alternative. The main problems to solve are the choice of the initial dataset and the noisy character of Web text-image associations. This article presents an approach that first leverages Flickr groups to automatically build a comprehensive visual resource and then exploits it for image retrieval. Flickr groups are an interesting candidate dataset because they cover a wide range of user interests. To reduce the initial noise, we introduce innovative and scalable image reranking methods. We then learn individual visual models for 38,500 groups using a low-level image representation, relying on off-the-shelf linear models to ensure the scalability of the learning and prediction steps. Finally, Semfeat image descriptions are obtained by concatenating the prediction scores of the individual models and retaining only the most salient responses. To provide a comparison with a manually created resource, a similar pipeline is applied to ImageNet. Experimental validation on the ImageCLEF Wikipedia Retrieval 2010 benchmark shows competitive results, which supports the validity of our approach.
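The last step of the pipeline, concatenating the scores of the individual linear models and keeping only the most salient responses, can be illustrated with a minimal sketch. The abstract does not say how salience is decided, so the top-k rule used here is an assumption, and the function names, toy weights, and the value of k are all hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def linear_scores(features: Sequence[float],
                  weights: Sequence[Sequence[float]],
                  biases: Sequence[float]) -> List[float]:
    """Score one image against every concept model (one linear model per row)."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(w, features)) + b
        for w, b in zip(weights, biases)
    ]


def keep_most_salient(scores: Sequence[float], k: int) -> List[float]:
    """Keep the k largest scores and zero out the rest (assumed top-k rule)."""
    if k <= 0:
        return [0.0] * len(scores)
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    keep = set(top)
    return [s if i in keep else 0.0 for i, s in enumerate(scores)]


def sparse_descriptor(features: Sequence[float],
                      weights: Sequence[Sequence[float]],
                      biases: Sequence[float],
                      k: int) -> List[float]:
    """Concatenate per-model scores, then sparsify them."""
    return keep_most_salient(linear_scores(features, weights, biases), k)


# Toy example: a 3-dimensional low-level feature and 4 hypothetical models.
features = [1.0, 0.0, 2.0]
weights = [
    [0.5, 0.0, 0.5],
    [-1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.25, 0.0, 0.0],
]
biases = [0.0, 0.0, 0.0, 0.0]

descriptor = sparse_descriptor(features, weights, biases, k=2)
print(descriptor)
```

In a setting like the one described, the descriptor would have one dimension per learned model (38,500 for the Flickr groups), so a sparse representation that stores only the retained indices and scores would likely be preferable to the dense list used in this sketch.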
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ginsca, A.L., Popescu, A., Le Borgne, H., Ballas, N., Vo, P., Kanellos, I. (2015). Large-Scale Image Mining with Flickr Groups. In: He, X., Luo, S., Tao, D., Xu, C., Yang, J., Hasan, M.A. (eds) MultiMedia Modeling. MMM 2015. Lecture Notes in Computer Science, vol 8935. Springer, Cham. https://doi.org/10.1007/978-3-319-14445-0_28
DOI: https://doi.org/10.1007/978-3-319-14445-0_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14444-3
Online ISBN: 978-3-319-14445-0
eBook Packages: Computer Science, Computer Science (R0)