The Benchmark as a Research Catalyst: Charting the Progress of Geo-prediction for Social Multimedia

Larson, Martha; Kelm, Pascal; Rae, Adam; Hauff, Claudia; Thomee, Bart; Trevisiol, Michele; Choi, Jaeyoung; Van Laere, Olivier; Schockaert, Steven; Jones, Gareth J.F.; Serdyukov, Pavel; Murdock, Vanessa; Friedland, Gerald

doi:10.1007/978-3-319-09861-6_2

Abstract

Benchmarks have the power to bring research communities together to focus on specific research challenges. They drive research forward by making it easier to systematically compare and contrast new solutions and to evaluate their performance against the existing state of the art. In this chapter, we present a retrospective on the Placing Task, a yearly challenge offered by the MediaEval Multimedia Benchmark. The Placing Task, launched in 2010, is a benchmarking task that requires participants to develop algorithms that automatically predict the geolocation of social multimedia (videos and images). This chapter covers the editions of the Placing Task offered in 2010–2013 and also presents an outlook on 2014. We present the formulation of the task and the task dataset for each year, tracing the design decisions made by the organizers and how each year built on the previous one. Finally, we summarize future directions and challenges for multimodal geolocation, and offer concluding remarks on how benchmarking has catalyzed progress in geolocation prediction for social multimedia.

Author information

Authors and Affiliations

Delft University of Technology, Delft, The Netherlands
Martha Larson & Claudia Hauff
Technische Universität, Berlin, Germany
Pascal Kelm
Future Cities Catapult, London, UK
Adam Rae
Yahoo Labs, San Francisco, CA, USA
Bart Thomee
Pompeu Fabra University, Barcelona, Spain
Michele Trevisiol
Yahoo Labs, Barcelona, Spain
Olivier Van Laere
ICSI, Berkeley, CA, USA
Jaeyoung Choi & Gerald Friedland
Cardiff University, Cardiff, UK
Steven Schockaert
Dublin City University, Dublin, Ireland
Gareth J.F. Jones
Yandex, Moscow, Russia
Pavel Serdyukov
Microsoft, Bellevue, WA, USA
Vanessa Murdock

Corresponding author

Correspondence to Martha Larson.

Editor information

Editors and Affiliations

International Computer Science Institute, Berkeley, California, USA
Jaeyoung Choi
International Computer Science Institute, Berkeley, California, USA
Gerald Friedland

About this chapter

Cite this chapter

Larson, M. et al. (2015). The Benchmark as a Research Catalyst: Charting the Progress of Geo-prediction for Social Multimedia. In: Choi, J., Friedland, G. (eds) Multimodal Location Estimation of Videos and Images. Springer, Cham. https://doi.org/10.1007/978-3-319-09861-6_2

DOI: https://doi.org/10.1007/978-3-319-09861-6_2
Published: 05 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09860-9
Online ISBN: 978-3-319-09861-6
eBook Packages: Engineering, Engineering (R0)
