
Robust visual object clustering and its application to sightseeing spot assessment

Multimedia Tools and Applications

Abstract

In this paper, we propose a robust visual object clustering approach based on bounding box ranking. It discovers the characteristics of objects in real-world datasets that contain a large number of noisy images, and we apply it to sightseeing spot assessment. The aim is to uncover diverse sightseeing resources from images available on social network services (SNS). Objects that appear frequently in images captured in a given city may represent one of its characteristics (local culture, architecture, and so on). Such knowledge can be used to discover sightseeing resources from the perspective of the user rather than that of the provider (e.g., a travel agency). However, because the quality of images on SNS varies widely, conventional object discovery methods struggle to identify objects common to several images; the proposed approach targets this setting. Extensive experiments on standard and extended benchmarks verified its effectiveness. We also tested the proposed method in an application in which the characteristics of a city (i.e., cultural elements) were discovered from a set of its images. Moreover, using the objects discovered from images on SNS, we propose an object-level assessment framework that ranks sightseeing spots by assigning scores, and we verify its performance.
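The intuition that frequently occurring objects may characterize a city, and that spots can then be ranked by object-level scores, can be illustrated with a toy sketch. This is not the method of the paper: the abstract does not specify the bounding box ranking, the clustering, or the scoring function. The sketch assumes that each image has already been mapped to a set of object cluster ids, and the function names, the `min_fraction` threshold, and the mean-weight score are all hypothetical choices made for illustration.

```python
from collections import Counter


def characteristic_clusters(image_clusters, min_fraction=0.5):
    """Return {cluster_id: fraction of images containing it} for frequent clusters.

    image_clusters: dict mapping image id -> set of object cluster ids
    (assumed to come from some upstream object clustering step).
    min_fraction: hypothetical frequency threshold, not taken from the paper.
    """
    n_images = len(image_clusters)
    counts = Counter()
    for clusters in image_clusters.values():
        counts.update(set(clusters))  # count each cluster once per image
    return {
        c: counts[c] / n_images
        for c in counts
        if counts[c] / n_images >= min_fraction
    }


def rank_spots(spot_images, image_clusters, weights):
    """Rank spots by the mean summed weight of characteristic objects per image.

    The averaging rule is an assumption made for this sketch only.
    """
    scores = {}
    for spot, images in spot_images.items():
        total = 0.0
        for image_id in images:
            clusters = image_clusters.get(image_id, ())
            total += sum(weights.get(c, 0.0) for c in clusters)
        scores[spot] = total / len(images) if images else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: four images of one city, taken at two spots.
image_clusters = {"a": {0, 1}, "b": {0}, "c": {2}, "d": {0, 2}}
spot_images = {"temple": ["a", "b"], "market": ["c", "d"]}

weights = characteristic_clusters(image_clusters, min_fraction=0.5)
ranking = rank_spots(spot_images, image_clusters, weights)
```

In this toy run, cluster 1 appears in only one of four images and is dropped, so only clusters 0 and 2 contribute to the spot scores. A real pipeline would replace the precomputed cluster ids with the output of an object discovery step that is robust to noisy images, which is the problem the paper addresses.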


Notes

  1. Because the discovered objects represent a city’s characteristics, we call them characteristic objects.

  2. The original parameters in [25] are used.

  3. The original parameters in [16] are used.

  4. The original parameters in [8] are used.

  5. https://www.flickr.com/

  6. https://www.tripadvisor.com/

References

  1. Alexe B, Deselaers T, Ferrari V (2012) Measuring the objectness of image windows. TPAMI 34(11):2189–2202

  2. Cho M, Kwak S, Laptev I, Schmid C, Ponce J (2015) Unsupervised object discovery and localization in images and videos. In: URAI, pp 292–293

  3. Cho M, Kwak S, Schmid C, Ponce J (2015) Unsupervised object discovery and localization in the wild: part-based matching with bottom-up region proposals. In: CVPR, pp 1201–1210

  4. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: CVPR, pp 886–893

  5. Dhillon IS (2001) Co-clustering documents and words using bipartite spectral graph partitioning. In: SIGKDD, pp 269–274

  6. Doersch C, Singh S, Gupta A, Sivic J, Efros A (2012) What makes Paris look like Paris? In: TOG, vol 31, issue 4

  7. Everingham M, Zisserman A, Williams CKI, Van Gool L, Allan M, Bishop CM, Chapelle O, Dalal N, Deselaers T, Dorkó G et al (2007) The PASCAL visual object classes challenge 2007 (VOC2007) results

  8. Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D (2010) Object detection with discriminatively trained part-based models. In: PAMI, pp 1627–1645

  9. Girshick R (2015) Fast r-cnn. In: ICCV, pp 1440–1448

  10. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR, pp 580–587

  11. Harel J, Koch C, Perona P et al (2006) Graph-based visual saliency. In: NIPS, pp 545–552

  12. He K, Zhang X, Ren S, Sun J (2014) Spatial pyramid pooling in deep convolutional networks for visual recognition. In: ECCV, pp 346–361

  13. Hochman N, Schwartz R (2012) Visualizing instagram: tracing cultural visual rhythms. In: ICWSM12, pp 6–9

  14. Jeong J-W, Hong H-K, Heu J-U, Qasim I, Lee D-H (2012) Visual summarization of the social image collection using image attractiveness learned from social behaviors. In: ICME, pp 538–543

  15. Kwak S, Cho M, Laptev I, Ponce J, Schmid C (2015) Unsupervised object discovery and tracking in video collections. In: ICCV, pp 3173–3181

  16. Lowe DG (1999) Object recognition from local scale-invariant features. In: ICCV, pp 1150–1157

  17. Manen S, Guillaumin M, Van Gool L (2013) Prime object proposals with randomized prim’s algorithm. In: ICCV, pp 2536–2543

  18. Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: towards real-time object detection with region proposal networks. In: NIPS, pp 91–99

  19. Rosenberg A, Hirschberg J (2007) V-Measure: a conditional entropy-based external cluster evaluation measure. In: EMNLP-CoNLL, pp 410–420

  20. Rubinstein M, Joulin A, Kopf J, Liu C (2013) Unsupervised joint object discovery and segmentation in internet images. In: CVPR, pp 1939–1946

  21. San Pedro J, Siersdorfer S (2009) Ranking and classifying attractiveness of photos in folksonomies. In: WWW, pp 771–780

  22. Shen Y, Ge M, Zhuang C, Ma Q (2016) Sightseeing value estimation by analyzing geosocial images. In: BigMM, pp 117–124

  23. Shen Y, Ge M, Zhuang C, Ma Q (2018) Sightseeing value estimation by analysing geosocial images. IJBDI 5(1/2):31–48

  24. Shen Y, Zhuang C, Ma Q (2017) Element-oriented method of landscape assessment of sightseeing spots by using social images. In: APWeb-WAIM, pp 66–73

  25. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. In: arXiv:1409.1556

  26. Singh S, Gupta A, Efros AA (2012) Unsupervised discovery of mid-level discriminative patches. In: ECCV, pp 73–86

  27. Tang K, Joulin A, Li L-J, Fei-Fei L (2014) Co-localization in real-world images. In: CVPR, pp 1464–1471

  28. Torralba A, Fergus R, Weiss Y (2008) Small codes and large image databases for recognition. In: CVPR, pp 1–8

  29. Zhuang C, Ma Q, Liang X, Yoshikawa M (2014) Anaba: an obscure sightseeing spots discovering system. In: ICME, pp 1–6

  30. Zhuang C, Ma Q, Liang X, Yoshikawa M (2015) Discovering obscure sightseeing spots by analysis of geo-tagged social images. In: ASONAM, pp 590–595

Author information

Corresponding author

Correspondence to Qiang Ma.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was partly supported by JSPS KAKENHI (16K12532) and MIC SCOPE (172307001).

About this article

Cite this article

Ge, M., Zhuang, C. & Ma, Q. Robust visual object clustering and its application to sightseeing spot assessment. Multimed Tools Appl 78, 17135–17164 (2019). https://doi.org/10.1007/s11042-018-7066-2

  • DOI: https://doi.org/10.1007/s11042-018-7066-2
