
Language Resources and Evaluation, Volume 52, Issue 2, pp 401–432

A longitudinal database of Irish political speech with annotations of speaker ability

  • Ailbhe Cullen
  • Naomi Harte
Original Paper

Abstract

This paper presents the Irish Political Speech Database, an English-language database collected from Irish political recordings. The database is collected with automated indexing and content retrieval in mind, and is thus gathered from real-world recordings (such as television interviews and election rallies) that reflect the nature and quality of audio encountered in practical applications. The database is labelled for six speaker attributes: boring, charismatic, enthusiastic, inspiring, likeable, and persuasive. Each of these traits is linked to the perceived ability or appeal of the speaker, and is therefore relevant to a range of content retrieval and speech analysis tasks. The six base attributes are combined to form a metric of Overall Speaker Appeal. A set of baseline experiments is presented that demonstrates the potential of this database for affective computing studies. Classification accuracies of up to 76% are achieved with little feature or system optimisation.
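
The abstract does not specify how the six trait ratings are combined into Overall Speaker Appeal, nor which classifier produced the 76% figure. The following is a minimal sketch, assuming the traits are averaged (with "boring" inverted as the only negatively oriented trait) and a standard SVM baseline over per-clip acoustic features; all column names, the rating scale, and the classifier choice are illustrative assumptions, not the authors' exact method.

```python
# Sketch of the kind of pipeline the abstract describes: combining six
# per-clip trait ratings into an "Overall Speaker Appeal" score and
# training a simple classification baseline on acoustic features.
# The averaging scheme, 1-5 rating scale, and SVM baseline are assumptions.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

TRAITS = ["boring", "charismatic", "enthusiastic",
          "inspiring", "likeable", "persuasive"]

def overall_appeal(ratings):
    """Combine six trait ratings (assumed 1-5 scale) into one score.
    'Boring' is inverted because it is the only negative trait."""
    r = dict(ratings)
    r["boring"] = 6 - r["boring"]          # invert the negative trait
    return float(np.mean([r[t] for t in TRAITS]))

# Toy data standing in for per-clip acoustic feature vectors and for
# appeal labels binarised at the median score (hypothetical setup).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))             # 200 clips, 40 features each
scores = rng.uniform(1, 5, size=200)       # stand-in appeal scores
y = (scores > np.median(scores)).astype(int)

# A plain linear SVM with standardised features as the baseline classifier.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
print("CV accuracy: %.2f" % cross_val_score(clf, X, y, cv=5).mean())
```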

Keywords

Computational paralinguistics · Affective computing · Political speech · Machine learning · Charisma · Speaker ability

Notes

Acknowledgements

This work was supported by the Irish Research Council (IRC) under the Embark Initiative, and was partly funded by the ADAPT Centre for Digital Content Technology, which is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.


Copyright information

© Springer Science+Business Media B.V. 2017

Authors and Affiliations

  1. Sigmedia, ADAPT Centre, School of Engineering, Trinity College Dublin, Ireland
