HALEF: An Open-Source Standard-Compliant Telephony-Based Modular Spoken Dialog System: A Review and An Outlook

Suendermann-Oeft, David; Ramanarayanan, Vikram; Teckenbrock, Moritz; Neutatz, Felix; Schmidt, Dennis

doi:10.1007/978-3-319-19291-8_5

Abstract

We describe completed and ongoing research on HALEF, a telephony-based open-source spoken dialog system that can be used with different plug-and-play back-end modules. We present two examples of such modules: one that classifies whether the person calling into the system is intoxicated, and one that implements a question-answering application. The system is compliant with World Wide Web Consortium and related industry standards while maintaining an open codebase, both to encourage progressive development and to provide a common, standard testbed for spoken dialog system development and benchmarking. The system can be deployed for a wide range of potential applications, including intelligent tutoring, language learning and assessment.

Notes

1. Popular grammar formats include JSGF (Java Speech Grammar Format), SRGS (Speech Recognition Grammar Specification) and ARPA (Advanced Research Projects Agency) formats.
2. Since the data collected during different ALC experiments are not balanced in terms of class and gender, we removed all speakers that were recorded in only one of the classification states. We then discarded as many male speakers (selected at random) as necessary to achieve gender balance.
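
The balancing procedure described in note 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the `(speaker_id, gender, state)` record layout, the `"m"`/`"f"` gender labels, the function name `balance_speakers` and the fixed random seed are all hypothetical and do not come from the chapter or from the ALC corpus format.

```python
import random
from collections import defaultdict


def balance_speakers(recordings, seed=0):
    """Return the set of speaker IDs kept after the two filtering steps.

    `recordings` is a list of (speaker_id, gender, state) tuples.
    This schema is an assumption made for illustration only.
    """
    states_by_speaker = defaultdict(set)
    gender_of = {}
    for speaker_id, gender, state in recordings:
        states_by_speaker[speaker_id].add(state)
        gender_of[speaker_id] = gender

    # Step 1: remove speakers recorded in only one classification state.
    kept = sorted(s for s, states in states_by_speaker.items() if len(states) > 1)

    males = [s for s in kept if gender_of[s] == "m"]
    females = [s for s in kept if gender_of[s] == "f"]

    # Step 2: discard randomly selected male speakers until the counts match.
    # If there are already fewer males than females, nothing is discarded,
    # since the note only mentions removing male speakers.
    surplus = len(males) - len(females)
    if surplus > 0:
        rng = random.Random(seed)  # fixed seed so the selection is repeatable
        dropped = set(rng.sample(males, surplus))
        males = [s for s in males if s not in dropped]

    return set(males) | set(females)
```

Filtering at the speaker level rather than the recording level keeps every remaining speaker represented in more than one state, which is what the note describes.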

Author information

Authors and Affiliations

Educational Testing Service (ETS) Research, San Francisco, CA, USA
David Suendermann-Oeft & Vikram Ramanarayanan
DHBW, Stuttgart, Germany
Moritz Teckenbrock, Felix Neutatz & Dennis Schmidt

Corresponding author

Correspondence to Vikram Ramanarayanan.

Editor information

Editors and Affiliations

Department of Computer Science and Engin, Pohang University of Science & Tech, Namgu, Pohang, Korea (Republic of)
G.G. Lee
School of Information and Communications, Gwangju Institute of Science and Tech, Buk-gu, Gwangju, Korea (Republic of)
H.K. Kim
Microsoft Corporation, Redmond, Washington, USA
M. Jeong
Dept of Computer Science and Engineering, Sogang University, Mapo-gu, Seoul, Korea (Republic of)
J.-H. Kim

About this chapter

Cite this chapter

Suendermann-Oeft, D., Ramanarayanan, V., Teckenbrock, M., Neutatz, F., Schmidt, D. (2015). HALEF: An Open-Source Standard-Compliant Telephony-Based Modular Spoken Dialog System: A Review and An Outlook. In: Lee, G., Kim, H., Jeong, M., Kim, JH. (eds) Natural Language Dialog Systems and Intelligent Assistants. Springer, Cham. https://doi.org/10.1007/978-3-319-19291-8_5

DOI: https://doi.org/10.1007/978-3-319-19291-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19290-1
Online ISBN: 978-3-319-19291-8
eBook Packages: Computer Science, Computer Science (R0)
