Multimedia Tools and Applications, Volume 78, Issue 10, pp 13841–13875

Automatic and semi-automatic annotation of people in photography using shared events

  • Anderson Almeida Firmino
  • Cláudio de Souza Baptista
  • Hugo Feitosa de Figueirêdo
  • Eanes Torres Pereira
  • Brunna de Sousa Pereira Amorim


This article proposes a technique for the automatic and semi-automatic annotation of people in photos based on the shared event concept: a shared event is a set of photos, captured by different devices, of people who attended the same event. The technique first groups photos into personal events and then checks which of these events are shared. Automatic annotation relies on face detection and recognition, while semi-automatic annotation uses a weighted sum of estimators based on contextual information and image content. Experiments showed that using the shared event concept increases the hit rate of both automatic and semi-automatic annotation of people in the photo collection used.
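The weighted sum of estimators used for semi-automatic annotation can be illustrated with a minimal sketch. The abstract does not name the estimators or their weights, so the estimator names (`face_similarity`, `shared_event_attendance`), the weights, and the function `rank_candidates` below are all hypothetical; this is one plausible way to combine scores, not the authors' implementation.

```python
from typing import Dict, List, Tuple


def rank_candidates(
    estimator_scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank candidate people by a weighted sum of estimator scores.

    estimator_scores maps an estimator name to {person: score in [0, 1]}.
    weights maps an estimator name to a non-negative weight.
    Both the estimators and the weights are illustrative assumptions.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    combined: Dict[str, float] = {}
    for name, scores in estimator_scores.items():
        # Normalise so the combined score stays in [0, 1];
        # estimators without a weight contribute nothing.
        w = weights.get(name, 0.0) / total_weight
        for person, score in scores.items():
            combined[person] = combined.get(person, 0.0) + w * score

    # Highest combined score first; ties broken alphabetically.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores for one detected face.
scores = {
    "face_similarity": {"ana": 0.8, "bruno": 0.4},
    "shared_event_attendance": {"ana": 0.5, "bruno": 1.0},
}
weights = {"face_similarity": 3.0, "shared_event_attendance": 1.0}

suggestions = rank_candidates(scores, weights)
for person, score in suggestions:
    print(f"{person}: {score:.3f}")
```

In a semi-automatic setting, the ranked list would be shown to the user as suggestions, and the user confirms or corrects the top candidate. Content-based and context-based evidence can then be traded off by adjusting the weights.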


Keywords: People annotation, Face recognition, Event annotation, Context aware multimedia, Personal photo collection




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Computer Science Department, Federal University of Campina Grande, Paraiba, Brazil
  2. Federal Institute of Education, Science and Technology of Paraiba, Esperanca, Brazil
