Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Schuller, B., Ablaßmeier, M., Müller, R., Reifinger, S., Poitschke, T., Rigoll, G. (2006). Speech Communication and Multimodal Interfaces. In: Kraiss, K.-F. (ed.) Advanced Man-Machine Interaction. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-30619-6_4
DOI: https://doi.org/10.1007/3-540-30619-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30618-4
Online ISBN: 978-3-540-30619-1
eBook Packages: Engineering, Engineering (R0)