Abstract
This paper describes the integration of an on-line Kaldi speech recogniser into the Alex Dialogue Systems Framework (ADSF). As the Kaldi OnlineLatgenRecogniser is written in C++, we first developed a Python wrapper for the recogniser so that the ADSF, written in Python, could interface with it. Training scripts for acoustic and language modelling were developed and integrated into the ADSF, and acoustic and language models were built. Finally, optimal recogniser parameters were determined and evaluated. The Alex dialogue system with the new speech recogniser is evaluated on the Public Transport Information (PTI) domain.
Keywords
- automatic speech recognition
- Kaldi
- Alex
- dialogue systems
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Plátek, O., Jurčíček, F. (2014). Integration of an On-line Kaldi Speech Recogniser to the Alex Dialogue Systems Framework. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science, vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_73
DOI: https://doi.org/10.1007/978-3-319-10816-2_73
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10815-5
Online ISBN: 978-3-319-10816-2
eBook Packages: Computer Science, Computer Science (R0)