Detecting Speech Interruptions for Automatic Conflict Detection

Conflict and Multimodal Communication

Part of the book series: Computational Social Sciences (CSS)

Abstract

This contribution addresses the automatic detection of conflict in group discussions from voice analysis. A reliable conflict detector would be useful in many applications, such as security in public places, quality monitoring of customer services, and the deployment of intelligent agents. Experiments were conducted on the SSPNet Conflict Corpus as part of the Interspeech 2013 Conflict Challenge. The audio clips, extracted from political debates, are classified into two classes of conflict level (low or high). In this study, we use turn-taking characteristics, such as interruptions, to improve conflict detection. In a group discussion, overlapping speech (overlap) corresponds to interruption. Two overlap detectors were developed using SVM classifiers and audio features. The first detects whether interruptions occur in a speech segment. The second detects when interruptions occur in a speech segment and whether they are related to low- or high-level conflict. A multi-expert architecture incorporates the knowledge produced by the interruption detectors. The two-class conflict detector (low or high conflict) is an SVM classifier whose input is a composite feature set: the concatenation of selected audio features and overlap-detector-based features. The system achieves an unweighted average recall (UAR) of 85.3 % on the Test set, an improvement of 4.5 percentage points over the official baseline system. In conclusion, interruptions in speech can be detected automatically, and they can significantly improve automatic conflict detection.
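
As an illustration of the evaluation metric reported above, the following is a minimal sketch, not the authors' implementation. It assumes UAR is computed in the usual way, as the mean of per-class recalls, and it also shows the plain concatenation that forms a composite feature vector. The function names (`unweighted_average_recall`, `composite_features`) and the toy labels are hypothetical; they do not come from the chapter or from the SSPNet Conflict Corpus.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls: each class counts equally, whatever its size."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled example is required")
    total = defaultdict(int)    # reference examples per class
    correct = defaultdict(int)  # correctly predicted examples per class
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


def composite_features(audio_features, overlap_features):
    """Concatenate two feature vectors into one classifier input (hypothetical helper)."""
    return list(audio_features) + list(overlap_features)


# Toy, invented labels: 8 low-conflict clips and 2 high-conflict clips.
y_true = ["low"] * 8 + ["high"] * 2
y_pred = ["low"] * 8 + ["high", "low"]  # one high-conflict clip is missed

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
# prints "UAR = 0.750, accuracy = 0.900"
```

On this imbalanced toy set, plain accuracy is 0.9 while UAR is 0.75, because the missed high-conflict clip halves the recall of the smaller class. This is why a class-balanced measure such as UAR is a natural choice when the two conflict levels are not equally represented.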

Acknowledgments

Many thanks to Björn Schuller (TUM, Germany), Stefan Steidl (FAU Erlangen-Nuremberg, Germany), and Anton Batliner (TUM, Germany) for organizing the Interspeech 2013 Conflict Challenge, and special thanks to Alessandro Vinciarelli (University of Glasgow, UK) for the SSPNet Conflict Corpus.

Author information

Corresponding author

Correspondence to Marie-José Caraty.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Caraty, MJ., Montacié, C. (2015). Detecting Speech Interruptions for Automatic Conflict Detection. In: D'Errico, F., Poggi, I., Vinciarelli, A., Vincze, L. (eds) Conflict and Multimodal Communication. Computational Social Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-14081-0_18

  • DOI: https://doi.org/10.1007/978-3-319-14081-0_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14080-3

  • Online ISBN: 978-3-319-14081-0

  • eBook Packages: Computer Science, Computer Science (R0)
