Collaborative event annotation in tagged photo collections

Multimedia Tools and Applications


Events constitute a significant means of organizing and sharing multimedia content. Despite recent interest in detecting events and annotating media content in an event-centric way, there is currently insufficient support for managing events in large-scale content collections, and the event annotation process is not well understood. To this end, this paper presents CrEve, a collaborative event annotation framework that uses content found on social media sites, with the primary objective of facilitating the annotation of large media corpora with event information. Given the proliferation of event-related user-contributed content, the proposed annotation framework could significantly benefit social media research. We demonstrate that, compared to a standard “browse-and-annotate” interface, CrEve leads to a 19% increase in the coverage of the generated ground truth in a large-scale annotation experiment. Furthermore, the paper discusses the results of a user study that quantifies the performance of CrEve and the contribution of different event dimensions to the event annotation process. The study confirms the prevalence of spatio-temporal queries as the primary means of discovering event-related content in a large collection. Textual queries and social cues (the content contributor) were also found to be significant event search dimensions. Finally, it demonstrates the potential of automatic photo clustering methods for facilitating event annotation.




  1. A demo version of CrEve can be found here,

  2. LODE: an Ontology for Linking Open Descriptions of Events

  3. Ontology Event Model F, formal model of events




Christos Zigkolis’s work has been co-financed by the European Union (European Social Fund—ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF)—Research Funding Program: Heracleitus II. Investing in knowledge society through the European Social Fund. The work of Symeon Papadopoulos was supported by the GLOCAL and SocialSensor projects, partially funded by the European Commission, under contract numbers FP7-248984 and FP7-287975.


Corresponding author

Correspondence to Christos Zigkolis.


Zigkolis, C., Papadopoulos, S., Filippou, G. et al. Collaborative event annotation in tagged photo collections. Multimed Tools Appl 70, 89–118 (2014).
