Citizen Tagger: Exploring Social Tagging of Conversational Audio

  • Delvin Varghese
  • Patrick Olivier
  • Madeline Balaam
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10516)


This paper discusses Citizen Tagger (CT), a mobile application for tagging audio-based chat-show content with audio and text tags (annotations). CT was developed through an iterative design process and deployed with 16 members of a faith-based community, who tagged a panel discussion on ‘faith and vocation’. Usage statistics, analysis of the created tags, and other qualitative data were used to assess the user experience of tag creation, to investigate how tagging-related parameters should be configured, and to explore users’ diverse motivations for creating tags. Tagging proved to be a subjective experience: participants expressed a desire to customise their tagging setup, and, although instructed to tag for content organisation and retrieval, they also used tagging as a tool for self-reflection.


Keywords: Social tagging · Multimodal interaction · Audio annotations · Assisted note-taking · Speech modality



Copyright information

© IFIP International Federation for Information Processing 2017

Authors and Affiliations

  • Delvin Varghese (1)
  • Patrick Olivier (1)
  • Madeline Balaam (1)

  1. Open Lab, Newcastle University, Newcastle upon Tyne, UK
