Abstract
In this chapter we discuss the background and the related work of ASDM within the context of IEs. In Sect. 2.1 we explain the general functioning of an SDS. The underlying idea of an IE is explained in Sect. 2.2. The IE approaches realised within the ATRACO Project serve as examples. In Sect. 2.3 we describe prior work in the field of (spoken and multimodal) interaction within IEs. Section 2.4 focuses on a specific part of an SDS: the Spoken Dialogue Manager (SDM). Several approaches toward developing this component have been implemented in the past. We give an overview on all directions in general and illustrate each with an example. Section 2.4 is divided into three parts: the first part is dedicated to state-machine-based approaches, the second part to stochastic methodologies, and the third part to plan- and Information State-based systems. In Sect. 2.5 we present several approaches to enhancing the performance of the SDM. Furthermore, we discuss how these approaches influenced our work. By introducing our own approach we conclude this chapter in Sect. 2.6.
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
SOFIA is funded by the European Artemis programme, 2009–2011, http://www.sofia-project.eu
References
Abowd, G., Atkeson, C., & Essa, I. (1998). Ubiquitous smart spaces. Technical report, DARPA.
Axelsson, J., Cross, C., Lie, H. W., McCobb, G., Raman, T. V., & Wilson, L. (2001). Xhtml+voice profile 1.0. Technical report, W3C.
Bachmann, P. (1894). Die analytische Zahlentheorie, vol. 2. Leipzig: Teubner.
Baum, L. E., Petrie, T., Soules, G., & Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of markov chains. The Annals of Mathematical Statistics, 41(1), 164–171.
Bechhofer, S., Volz, R., & Lord, P. (2003). Cooking the semantic web with the owl api. In The Semantic Web – ISWC 2003, (pp. 659–675). Springer.
Bellik, Y., Pruvost, G., Martin, J.-C., Tan, N., Minker, W., & Heinroth, T. (2010). D16 – user interaction adaptation component. Confidential deliverable, The ATRACO Project (FP7/2007–2013 grant agreement no:216837).
Berton, A., Bühler, D., & Minker, W. (2006). SmartKom-Mobile Car: User Interaction with Mobile Services in a Car Environment (SmartKom: Foundations of Multi-Modal Dialogue Systems ed.)., (pp. 523–541). Cognitive Technologies. Heidelberg: Springer.
Beslay, L., & Hakala, H. (2007). Digital territory: Bubbles. In P. T. Kidd (Ed.), European visions for the knowledge age: a quest for new horizons in the information society. Cheshire Henbury.
Bezold, M. (2011). Adapting Multimodal Dialogue Systems to User Behaviour. PhD thesis, Ulm University.
Bidot, J., Goumopoulos, C., & Calemis, I. (2011). Using ai planning and late binding for managing service workflows in intelligent environments. In Proc. of the International Conference on Pervasive Computing and Communications (PerCom), (pp. 156–163). IEEE.
Black, A. W., Burger, S., Conkie, A., Hastie, H. W., Keizer, S., Lemon, O., Merigaud, N., Parent, G., Schubiner, G., Thomson, B., Williams, J. D., Yu, K., Young, S., & Eskenazi, M. (2011). Spoken dialog challenge 2010: Comparison of live and control test results. In SIGDIAL Conference, (pp. 2–7).
Bohlin, P., Bos, J., Larsson, S., Lewin, I., Matheson, C., & Milward, D. (1999). Survey of existing interactive systems – trindi deliverable d1.3. Technical report, Gothenburg University.
Bohus, D., Raux, A., Harris, T. K., Eskenazi, M., & Rudnicky, E. I. (2007). Olympus: an open-source framework for conversational spoken language interface research. In HLT-NAACL 2007 workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technology.
Bohus, D., & Rudnicky, A. (2002). Integrating multiple knowledge sources for utterance-level confidence annotation in the cmu communicator spoken dialog system. Technical report, Roots in the Town. In 2nd International Workshop on Community Networking. 1995. Princeton, NJ: IEEE Communications Society
Bohus, D., & Rudnicky, A. (2005). Sorry i didn’t catch that: An investigation of non-understanding errors and recovery strategies. In Proceedings of SIGdial-2005, Lisbon, Portugal.
Bohus, D., & Rudnicky, A. I. (2009). The ravenclaw dialog management framework: Architecture and systems. Computer Speech & Language, 23, 332–361.
Brown, M., Burnett, D., Candell, E., Carter, J., Dahl, D., Ghosh, D., Hunt, A., Krause, S., Lerner, S., Lucas, B., Marschner, J., McGlashan, S., Normandin, Y., Porter, B., Raggett, D., Ramsthaler, D., Tichelen, L. V., Wang, K., & Werner, L. (2004). Speech recognition grammar specification version 1.0. Technical report, W3C.
Bühler, D. (2009). Towards Domain-driven Dialogue - Application Control and Problem Solving. PhD thesis, Ulm University.
Burkhardt, F., Huber, R., & Batliner, A. (2007). Application of speaker classification in human machine dialog systems. In Speaker Classification I: Fundamentals, Features, and Methods, (pp. 174–179). Berlin, Heidelberg: Springer.
Burkhardt, F., Metze, F., & Stegmann, J. (2008). Speaker classification for next-generation voice-dialog systems, (pp. 497–528). Wiley.
Cáceres, M. (2011). Widget packaging and configuration (working draft). Technical report, W3C.
Chin, J., Diehl, V., & Norman, K. (1988). Development of an instrument measuring user satisfaction of the human–computer interface. In Proceedings of ACM CHI 88 Conference on Human Factors in Computing, (pp. 213–218).
Chomsky, N. (1956). Three models for the description of language. IRE Transactions on Information Theory, 2, 113–124.
Chung, G., Seneff, S., Wang, C., & Hetherington, L. (2004). A dynamic vocabulary spoken dialogue interface. In Proc. ICSLP, (pp. 1457–1460).
Clark, H. H., & Schaefer, E. F. (1989). Contributing to discourse. Cognitive Science, 13(2), 259–294.
Colmerauer, A., & Roussel, P. (1996). The birth of prolog. In T. J. Bergin, Jr., & R. G. Gibson, Jr. (Eds.), History of programming languages—II (pp. 331–367). New York, NY, USA: ACM.
Cook, D., Youngblood, M., & Das, S. (2006). A multi-agent approach to controlling a smart environment. In J. Augusto and C. Nugent (Eds.), Designing Smart Homes, vol. 4008 of Lecture Notes in Computer Science (pp. 165–182). Heidelberg: Springer.
Cornelius, R. (1996). The science of emotion : research and tradition in the psychology of emotions. Upper Saddle River, NJ, USA: Prentice Hall.
Coutaz, J., Crowley, J., Dobson, S., & Garlan, D. (2005). Context is key. Communications of the ACM, 48(3), 49–53.
Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., & Taylor, J. G. (2001). Emotion recognition in human–computer interaction. Signal Processing Magazine, 18(1), 32–80.
Daniels, J. (2000). Integrating a spoken language system with agents for operational information access. In AAAI, (pp. 1002–1007).
Dervin, B., Foreman-Wernet, L., & Lauterbach, E. (2003). Sense-making methodology reader: Selected writings of Brenda Dervin. Hampton Press Inc.
Dretske, F. (1991). Explaining behavior: Reasons in a world of causes. Cambridge, MA, USA: MIT.
Duong, T., Bui, H., Phung, D., & Venkatesh, S. (2005). Activity recognition and abnormality detection with the switching hidden semi-markov model. In Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 1, (pp. 838–845). IEEE.
Fahrmeir, L., Hamerle, A., & Tutz, G. (1984). Multivariate statistische Verfahren. New York: Walter de Gruyter.
Ferguson, G., Allen, J., Blaylock, N., Byron, D., Chambers, N., Dzikovska, M., Galescu, L., Shen, X., Swier, R., & Swift, M. (2002). The Medication Advisor Project: Preliminary report. Technical Report TR776, University of Rochester Computer Science Department.
Fowler, M. (2006). Passive view.
Franke, J., Daniels, J., & McFarlane, D. (2002). Recovering context after interruption. In CogSci’02, (pp. 310–315).
Garrett, J. J. (2005). Ajax: A new approach to web applications. http://adaptivepath.com/ideas/essays/archives/000385.php.
Gervasio, M., & Murdock, J. (2009). What were you thinking?: filling in missing dataflow through inference in learning from demonstration. In Proceedings of the 14th international conference on Intelligent user interfaces, (pp. 157–166). ACM.
Gil, Y., & Ratnakar, V. (2008). Towards intelligent assistance for to-do lists. In Proceedings of the 13th international conference on Intelligent user interfaces, (pp. 329–332). ACM.
Ginzburg, J., & Cooper, R. (2004). Clarification, ellipsis, and the nature of contextual updates in dialogue. Linguistics and Philosophy, 27(3), 297–365.
Gnjatović, M., & Rösner, D. (2008). Adaptive dialogue management in the nimitek prototype system. In Proceedings of the 4th IEEE PIT workshop, (pp. 14–25). Berlin, Heidelberg: Springer.
Goumopoulos, C., & Kameas, A. (2009). Ambient ecologies in smart homes. The Computer Journal, 52(8), 922–937.
Habibi, M., Rahbar, S., & Sameti, H. (2010). Divided pomdp method for complex menu problems in spoken dialogue systems. In Spoken Language Technology Workshop (SLT), 2010 IEEE, (pp. 484–489). IEEE.
Hamp, B., & Feldweg, H. (1997). Germanet – a lexical-semantic net for german. In Proceedings of ACL workshop Automatic Information Extraction and Building of Lexical Semantic Resources for NLP Applications, (pp. 9–15). Citeseer.
Heinroth, T., & Denich, D. (2011). Spoken Interaction within the Computed World: Evaluation of a Multitasking Adaptive Spoken Dialogue System. In 35th Annual IEEE International Computer Software and Applications Conference (COMPSAC 2011). IEEE.
Heinroth, T., Denich, D., & Schmitt, A. (2010). Owlspeak - adaptive spoken dialogue within intelligent environments. In 8th IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), (pp. 666 – 671). Mannheim, Germany.
Heinroth, T., Grotz, M., Nothdurft, F., & Minker, W. (2012). Adaptive speech recognition for intuitive model-based spoken dialogues. In Proceedings of the Eighth Conference on International Language Resources and Evaluation (LREC’12). European Language Resources Association (ELRA).
Heinroth, T., Koleva, S., & Minker, W. (2011). Topic switching strategies for spoken dialogue systems. In Proc. of the 12th Annual Conference of the International Speech Communication Association.
Heinroth, T., & Minker, W. (Eds.). (2011). Next Generation Intelligent Environments: Ambient Adaptive Systems. Boston, USA: Springer.
Herm, O., Schmitt, A., & Liscombe, J. (2008). When calls go wrong: How to detect problematic calls based on log-files and emotions? In Proc. of the International Conference on Speech and Language Processing (ICSLP).
Hildebrand, A., & Sá, V. (2000). Embassi: electronic multimedia and service assistance. In oceedings of the Internet Measurement Conference (IMC), (pp. 50–59).
Hone, K. S., & Graham, R. (2000). Towards a tool for the subjective assessment of speech system interfaces (sassi). Natural Language Engineering, 6, 287–305.
Horridge, M., Bechhofer, S., & Noppens, O. (2007). Igniting the owl 1.1 touch paper: The owl api. In Proc. OWL-ED, vol. 258.
Huerta, J. M. (2000). Robust Speech Recognition in GSM Mobile Environments. PhD thesis, Carnegie Mellon University.
Hunt, A. (2000). Jspeech grammar format. W3C Note http://www.w3.org/TR/jsgf/.
Intille, S. S., Larson, K., Beaudin, J. S., Tapia, M., Kaushik, P., Nawyn, J., and Mcleish, T. J. (2005). The placelab: a live-in laboratory for pervasive computing research (video. In Proceedings of Pervasive 2005 Video Program.
ISO (2008). Iso/iec 29341–2:2008 information technology – upnp device architecture – part 2: Basic device control protocol - basic device. Technical report, INTERNATIONAL ORGANIZATION FOR STANDARDIZATION.
ITU (2005). Parameters describing the interaction with spoken dialogue systems. ITU-T Recommendation Supplement 24 to P-Series, International Telecommunication Union, Geneva, Switzerland. Based on ITU-T Contr. COM 12–17 (2009).
Jiang, H. (2005). Confidence measures for speech recognition: A survey. Speech Communication, 45(4), 455–470.
Johnston, M., Baggia, P., Burnett, D., Carter, J., Dahl, D., & McCobb, G. (2009). Emma: Extensible multimodal annotation markup language; World Wide Web Consortium Recommendation REC-emma-2009021. Technical report, W3C.
Jokinen, K., Kerminen, A., Kaipainen, M., Jauhiainen, T., Wilcock, G., Turunen, M., Hakulinen, J., Kuusisto, J., & Lagus, K. (2002). Adaptive dialogue systems-interaction with interact. In Proceedings of the 3rd SIGdial workshop on Discourse and dialogue-Volume 2, (pp. 64–73). ACL.
Jurafsky, D., & Martin, J. H. (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition (Prentice Hall Series in Artificial Intelligence) (1st ed.). Prentice Hall.
Kaelbling, L., Littman, M., & Cassandra, A. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1–2), 99–134.
Kientz, J. A., Patel, S. N., Jones, B., Price, E., Mynatt, E. D., & Abowd, G. D. (2008). The georgia tech aware home. In CHI ’08 extended abstracts on Human factors in computing systems, CHI EA ’08, (pp. 3675–3680). New York, NY, USA: ACM.
Kleene, S. (1988). Introduction to metamathematics. Wolters-Noordhoff.
Knuth, D. E. (1964). Backus normal form vs. Backus Naur form. Communications of the ACM, 7(12), 735–736.
Konings, B., & Schaub, F. (2011). Territorial privacy in ubiquitous computing. In Wireless On-Demand Network Systems and Services (WONS), 2011 Eighth International Conference on, (pp. 104–108). IEEE.
Könings, B., Wiedersheim, B., & Weber, M. (2011). Privacy & trust in ambient intelligence environments. In W. Minker and T. Heinroth (Eds.), Next Generation Intelligent Environments (pp. 227–252). New York: Springer.
Krasner, G., & Pope, S. (1998). A cookbook for using the model-view-controller user interface paradigm in smalltalk-80. Journal of Object-Oriented Programming, 1(3), 26–49.
Kruskal, W., & Wallis, W. (1952). Use of ranks in one-criterion variance analysis. Journal of the American statistical Association, 47(260), 583–621.
Larsson, S. (2002). Issue-based Dialogue Management. PhD thesis, Göteborg University, Sweden.
Larsson, S., & Traum, D. (2000). Information state and dialogue management in the trindi dialogue move engine. Natural Language Engineering Special Issue, 6, 323–340.
Levenshtein, V. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10(8), 707–710.
Lewis, J. R. (1995). Ibm computer usability satisfaction questionnaires: Psychometric evaluation and instructions for use. International Journal of Human–Computer Interaction, 7(1), 57–78.
Limbourg, Q., Vanderdonckt, J., Michotte, B., Bouillon, L., & López-Jaquero, V. (2005). Usixml: A language supporting multi-path development of user interfaces. In 9th IFIP Working Conference on Engineering for Human–Computer Interaction, (pp. 134–135). Springer.
Litman, D., & Pan, S. (2002). Designing and evaluating an adaptive spoken dialogue system. User Modeling and User-Adapted Interaction, 12(2), 111–137.
Lockwood, S., & Cook, D. (2008). Computer, light on! In The 4th IET International Conference on Intelligent Environments, Seattle, USA.
López-Cózar, R., & Callejas, Z. (2006). Two-level speech recognition to enhance the performance of spoken dialogue systems. Knowledge-Based Systems, 19(3), 153–163.
López-Cózar, R., & Callejas, Z. (2008). Asr post-correction for spoken dialogue systems based on semantic, syntactic, lexical and contextual information. Speech Communication, 50(8–9), 745–766.
López-Cózar, R., & Callejas, Z. (2010). Multimodal dialogue for ambient intelligence and smart environments, chapter 21, (pp. 559–579). Springer.
Mankiewicz, R. (2000). The story of mathematics. Princeton: Princeton University Press.
Mann, H., & Whitney, D. (1947). On a test of whether one of two random variables is stochastically larger than the other. The annals of mathematical statistics, 18(1), 50–60.
McFarlane, D. (2002). Comparison of four primary methods for coordinating the interruption of people in human–computer interaction. Human–Computer Interaction, 17, 63–139.
McGuinness, D. L., & van Harmelen, F. (2004). Owl web ontology language. Technical report, W3C.
McTear, M. (2004). Spoken Dialogue Technology: Toward the Conversational User Interface. London: Springer.
McTear, M., O’Neill, I., Hanna, P., Liu, X., McTear, M., O’Neill, I., Hanna, P., & Liu, X. (2005). Handling errors and determining confirmation strategies–an object-based approach. Speech Communication, 45(3), 249–269. Special Issue on Error Handling in Spoken Dialogue Systems.
Metze, F., Englert, R., Bub, U., Burkhardt, F., & Stegmann, J. (2008). Getting closer: tailored human–computer speech dialog. Universal Access in the Information Society, 8, 97–108.
Miller, G. (1956). The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological review, 63(2), 81–97.
Miller, G. (1995). Wordnet: a lexical database for english. Communications of the ACM, 38(11), 39–41.
Minker, W., López-Cózar, R., & McTear, M. (2009). The role of spoken language dialogue interaction in intelligent environments. Journal of Ambient Intelligence and Smart Environments, 1(1), 31–36.
Montoro, G., Alamán, X., & Haya, P. A. (2004). Spoken interaction in intelligent environments: A working system. In Advances in Pervasive Computing.
Mozer, M. C. (2005). Lessons from an Adaptive Home, (pp. 271–294). Wiley.
Nakano, M., Miyazaki, N., Hirasawa, J.-i., Dohsaka, K., & Kawabata, T. (1999). Understanding unsegmented user utterances in real-time spoken dialogue systems. In Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics, ACL ’99, (pp. 200–207). Stroudsburg, PA, USA: ACL.
Nevin, B., & Johnson, S. (2002). The legacy of Zellig Harris: language and information into the 21st century. John Benjamins Publishing Company.
Niezen, G., van der Vlist, B., Hu, J., & Feijs, L. (2010). From events to goals: Supporting semantic interaction in smart environments. In 2010 IEEE Symposium on Computers and Communications (ISCC), (pp. 1029–1034). IEEE.
Nuance (2008). Nuance speech recognition system version 8.5 grammar developer’s guide. Technical report, Nuance Communications. visited 05.09.2010.
Oh, A. H., & Rudnicky, A. I. (2000). Stochastic language generation for spoken dialogue systems. In Proceedings of the 2000 ANLP/NAACL Workshop on Conversational systems - Volume 3, ANLP/NAACL-ConvSyst ’00, (pp. 27–32). Stroudsburg, PA, USA: ACL.
Oshry, M., Auburn, R., Baggia, P., Bodell, M., Burke, D., Burnett, D. C., Candell, E., Carter, J., McGlashan, S., Lee, A., Porter, B., & Rehor, K. (2007). Voice extensible markup language (voicexml) 2.1. Technical report, W3C.
Paternò, F., Mancini, C., & Meniconi, S. (1997). Concurtasktrees: A diagrammatic notation for specifying task models. In Proceedings of the IFIP TC13 Interantional Conference on Human–Computer Interaction, (pp. 362–369).
Pittermann, J. (2008). Speech-Emotion Recognition in Adaptive Dialogue Systems. PhD thesis, Ulm University.
Pittermann, J., Pittermann, A., & Minker, W. (2009). Handling Emotions in Human–Computer Dialogues. Dordrecht, The Netherlands: Springer.
Plutchik, R. (1980). Emotion: A Psychoevolutionary Synthesis. New York, USA: Harper & Row.
Potel, M. (1996). MVP: Model-View-Presenter The Taligent Programming Model for C + + and Java. Technical report, Taligent Inc.
Pruvost, G., Heinroth, T., Bellik, Y., & Minker, W. (2011). Next Generation Intelligent Environments: Ambient Adaptive Systems, chapter 5, (pp. 151–192). Springer.
Puerta, A., & Eisenstein, J. (2002). Ximl: a common representation for interaction data. In Proceedings of the 7th International Conference on Intelligent User Interfaces, (pp. 214–215). ACM.
Qu, Y. (2001). A Constraint-Based Model of Mixed-Initiative Dialogue in Information-Seeking Interactions. PhD thesis, School of Computer Science, Carnegie Mellon University.
Qu, Y. (2002). A constraint-based approach for cooperative information-seeking dialog. In Proc. INLG.
Quesada, J. F., Garcia, F., Sena, E., Bernal, J. A., & Amores, G. (2001). Dialogue management in a home machine environment: Linguistic components over an agent architecture. Procesamiento del Lenguaje Natural, 27, 89–96.
Raux, A., & Eskenazi, M. (2007). A multi-layer architecture for semi-synchronous event-driven dialogue management. In ASRU. IEEE Workshop on Automatic Speech Recognition Understanding, (pp. 514–519).
Reenskaug, T. (1979). Models - views - controllers. Technical report, Xerox PARC.
rí Adámek, J. (2008). Theoretische Informatik (lecture notes). Technische Universität Braunschweig.
Rohlicek, J., Russell, W., Roukos, S., & Gish, H. (1989). Continuous hidden Markov modeling for speaker-independent word spotting. In ICASSP’89, (pp. 627–630). IEEE.
Román, M., Hess, C., Cerqueira, R., Campbell, R. H., & Nahrstedt, K. (2002). Gaia: A middleware infrastructure to enable active spaces. IEEE Pervasive Computing, 1, 74–83.
Ruser, H., Borodulkin, L., & Leisner, D. (2003). Multi-modal ‘smart home’ user interface. In Signals Systems Decision and Information Technology (SSD).
Schattenberg, B., Balzer, S., & Biundo, S. (2006). Knowledge-based Middleware as an Architecture for Planning and Scheduling Systems. In Proc. of the 16th International Conference on Automated Planning and Scheduling (ICAPS-06), Ambleside, The English Lake District, UK.
Schmitt, A., Heinroth, T., & Bertrand, G. (2009). Towards emotion, age- and gender-aware voicexml applications. In 5th International Conference on Intelligent Environments (IE’09), vol. 2 of Ambient Intelligence and Smart Environments, (pp. 34–41). IOS Press.
Schmitt, A., & Liscombe, J. (2008). Detecting Problematic Calls With Automated Agents. In 4th IEEE Tutorial and Research Workshop Perception and Interactive Technologies for Speech-Based Systems, Irsee, Germany.
Schmitt, A., Schatz, B., & Minker, W. (2011). Modeling and predicting quality in spoken human–computer interaction. In Proceedings of the SIGDIAL 2011 Conference, (pp. 173–184). Portland, Oregon, USA: ACL.
Schnelle-Walka, D., & Feldes, S. (2009). Towards mixed-initiative concepts in smart environments. In Proceedings of Workshop Interacting with Smart Objects.
Seneff, S., Hurley, E., Lau, R., Pao, C., Schmid, P., & Zue, V. (1998). Galaxy-ii: A reference architecture for conversational system development. In Proceedings of the international conference on spoken language processing, (pp. 931–934).
Shanmugham, S., Monaco, P., & Eberman, B. (2006). A media resource control protocol (mrcp). RFC 4463 http://tools.ietf.org/html/rfc4463.
Shannon, C. (1948). A mathematical theory of communication. Bell Systems Technical Journal, 27, 623–656.
Skantze, G. (2003). Exploring human error handling strategies: Implications for spoken dialogue systems. In Proceedings of the ISCA Workshop on Error Handling in Spoken Dialogue Systems, (pp. 71–76). Citeseer.
Sonntag, D., Engel, R., Herzog, G., Pfalzgraf, A., Pfleger, N., Romanelli, M., & Reithinger, N. (2007). SmartWeb Handheld – Multimodal Interaction with Ontological Knowledge Bases and Semantic Web Services, vol. 4451 of Lecture Notes in Computer Science, (pp. 272–295). Berlin/Heidelberg: Springer.
Stoline, M. (1981). The status of multiple comparisons: simultaneous estimation of all pairwise comparisons in one-way anova designs. American Statistician, 35(3), 134–141.
Swerts, M., Litman, D., & Hirschberg, J. (2000). Corrections in spoken dialogue systems. In Proceedings of the International Conference on Spoken Language Processing, vol. 2, (pp. 615–618). Citeseer.
Traum, D., & Larsson, S. (2003). The information state approach to dialogue management, chapter 15, (pp. 325–353). Kluwer.
Turing, A. (1937). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 2(1), 230.
Ubisense (2011). Ubisense series 7000 ip sensors. http://www.ubisense.net/en/media/pdfs/factsheets_pdf/88679_series_7000_ip_sensors_combined.pdf.
van Helvert, J., Hagras, H., & Kameas, A. (2009). D27 - prototype testing and validation (year 2). Restricted deliverable, The ATRACO Project (FP7/2007–2013 grant agreement n 216837).
van Helvert, J., Hagras, H., Wagner, C., Dooley, J., Bacon, R., & Bilgin, A. (2011). D27 - prototype testing and validation (year 3). Restricted deliverable, The ATRACO Project (FP7/2007–2013 grant agreement n 216837).
Van Welie, M., Van der Veer, G., & Eliëns, A. (1998). An ontology for task world models. In Proceedings of DSV-IS98, (pp. 3–5). Abingdon, UK: Springer.
Vipperla, R., Wolters, M., Georgila, K., & Renals, S. (2009). Speech input from older users in smart environments: Challenges and perspectives. In Proceedings HCI International: Universal Access in Human–Computer Interaction. Intelligent and Ubiquitous Interaction Environments, number 5615 in Lecture Notes in Computer Science (pp. 117–126). Springer.
Voxeo (2011). Voxeo prophecy. http://www.voxeo.com/products/.
Wagner, C., & Hagras, H. (2010). D14 – artefact operation adaptation component. Confidential deliverable, The ATRACO Project (FP7/2007–2013 grant agreement n 216837).
Walker, M., Rudnicky, A., Prasad, R., Aberdeen, J., Bratt, E., Garofolo, J., Hastie, H., Le, A., Pellom, B., Potamianos, A., et al. (2002). Darpa communicator: Cross-system results for the 2001 evaluation. In Proc. of ICSLP. Citeseer.
Walker, M. A., Litman, D. J., Kamm, C. A., & Abella, A. (1997). Paradise: a framework for evaluating spoken dialogue agents. In Proceedings of the eighth conference on European chapter of the Association for Computational Linguistics.
Wang, K. (2002). Salt: an xml application for web-based multimodal dialog management. In Proceedings of the 2nd workshop on NLP and XML - Volume 17, (pp. 1–8).
Ward, W., & Issar, S. (1994). Recent improvements in the cmu spoken language understanding system. In Proceedings of the workshop on Human Language Technology, HLT ’94, (pp. 213–216). Stroudsburg, PA, USA: ACL.
Warren, W. (2006). The dynamics of perception and action. Psychological review, 113(2), 358.
Wechsung, I., & Naumann, A. B. (2008). Evaluation methods for multimodal systems: A comparison of standardized usability questionnaires. Lecture Notes in Computer Science, 5078, 276–284.
Williams, J., & Young, S. (2007). Scaling pomdps for spoken dialog management. IEEE Transactions on Audio, Speech, and Language Processing, 15(7), 2116–2129.
Yang, F., Heeman, P., & Kun, A. (2008). Switching to real-time tasks in multi-tasking dialogue. In COLING’08, (pp. 1025–1032). ACL.
Yang, F., Heeman, P. A., & Kun, A. L. (2011). An investigation of interruptions and resumptions in multi-tasking dialogues. Computational Linguistics, 37(1), 75–104.
Young, S. (2007). Using POMDPs for dialog management. In Spoken Language Technology Workshop, 2006. IEEE, (pp. 8–13). IEEE.
Young, S., Gasic, M., Keizer, S., Mairesse, F., Schatzmann, J., Thomson, B., & Yu, K. (2010). The hidden information state model: A practical framework for pomdp-based spoken dialogue management. Computer Speech & Language, 24(2), 150–174.
Young, S., Williams, J., Schatzmann, J., Stuttle, M., & Weilhammer, K. (2006). D4.3: Bayes net prototype - the hidden information state dialogue manager. Technical report, TALK - Talk and Look: Tools for Ambient Linguistic Knowledge, IST-507802, 6th FP.
Zgorzelski, A., Schmitt, A., Heinroth, T., & Minker, W. (2010). Repair strategies on trial: which error recovery do users like best? In Proc. of the International Conference on Speech and Language Processing (ICSLP).
Author information
Authors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Heinroth, T., Minker, W. (2013). Background. In: Introducing Spoken Dialogue Systems into Intelligent Environments. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5383-3_2
Download citation
DOI: https://doi.org/10.1007/978-1-4614-5383-3_2
Published:
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-5382-6
Online ISBN: 978-1-4614-5383-3
eBook Packages: EngineeringEngineering (R0)