Towards a French Smart-Home Voice Command Corpus: Design and NLU Experiments

  • Thierry Desot
  • Stefania Raimondo
  • Anastasia Mishakova
  • François Portet
  • Michel Vacher
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11107)


Despite growing interest in smart-homes, large semantically annotated voice command corpora for Natural Language Understanding (NLU) development are scarce, especially for languages other than English. In this paper, we present an approach to generate customizable synthetic corpora of semantically-annotated French commands for a smart-home. The resulting corpus was used to train three NLU models – a triangular CRF, an attention-based RNN and the Rasa framework – which were evaluated using a small corpus of real users interacting with a smart home. While the attention model performs best on another large French dataset, on the small smart home corpus the models vary in performance across intent, slot and slot value classification. To the best of our knowledge, no other French corpus of semantically annotated voice commands is currently publicly available.


Natural Language Understanding · Corpora and language resources · Ambient intelligence · Voice-user interface



This work is part of the VocADom project funded by the French National Research Agency (Agence Nationale de la Recherche)/ANR-16-CE33-0006.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Thierry Desot (1)
  • Stefania Raimondo (1, 2)
  • Anastasia Mishakova (1)
  • François Portet (1)
  • Michel Vacher (1)

  1. Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG, Grenoble, France
  2. University of Toronto, Toronto, Canada
