HALEF: An Open-Source Standard-Compliant Telephony-Based Modular Spoken Dialog System: A Review and An Outlook

  • David Suendermann-Oeft
  • Vikram Ramanarayanan
  • Moritz Teckenbrock
  • Felix Neutatz
  • Dennis Schmidt


We describe completed and ongoing research on HALEF, a telephony-based open-source spoken dialog system that can be used with different plug-and-play back-end modules. We present two examples of such modules: one classifies whether the person calling into the system is intoxicated, and the other is a question-answering application. The system complies with World Wide Web Consortium (W3C) and related industry standards while maintaining an open codebase, both to encourage progressive development and to provide a common, standard testbed for spoken dialog system development and benchmarking. The system can be deployed for a wide range of potential applications, including intelligent tutoring, language learning and assessment.


Keywords: Spoken dialog systems, VoiceXML, Alcoholic language classification



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • David Suendermann-Oeft (1)
  • Vikram Ramanarayanan (1)
  • Moritz Teckenbrock (2)
  • Felix Neutatz (2)
  • Dennis Schmidt (2)

  1. Educational Testing Service (ETS) Research, San Francisco, USA
  2. DHBW, Stuttgart, Germany
