Abstract
Recently, researchers in computer science and engineering have begun to explore the possibility of finding speech-based correlates of various medical conditions using automatic, computational methods. If such language cues can be identified and quantified automatically, this information can be used to support the diagnosis and treatment of medical conditions in clinical settings and to further fundamental research in understanding cognition. This chapter reviews computational approaches that explore the communicative patterns of patients with medical conditions such as depression, autism spectrum disorders, schizophrenia, and cancer. We discuss two main lines of research: work on features extracted from the acoustic signal and work on lexical and semantic features. We also present applied research that uses computational methods to develop assistive technologies. The final sections discuss open issues in, and the future of, this emerging field of research.
© 2010 Springer Science+Business Media, LLC
Hirschberg, J., Hjalmarsson, A., Elhadad, N. (2010). “You’re as Sick as You Sound”: Using Computational Approaches for Modeling Speaker State to Gauge Illness and Recovery. In: Neustein, A. (eds) Advances in Speech Recognition. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-5951-5_13
Print ISBN: 978-1-4419-5950-8
Online ISBN: 978-1-4419-5951-5