Multilingual Low-Resourced Prototype System for Voice-Controlled Intelligent Building Applications

  • Alexandru Caranica
  • Lucian Georgescu
  • Alexandru Vulpe
  • Horia Cucu
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 747)

Abstract

With speech recognition databases now covering most of the widely used languages around the globe, there is a strong incentive to build voice-driven applications in different languages and in diverse acoustic conditions. Although state-of-the-art speech processing achieves high performance for the most widely used languages, little effort has been made for under-resourced languages such as Romanian. Moreover, most of these systems do not focus on supporting specific voice recognition scenarios, such as assistive applications for elderly or disabled people, or they assume a triggered, close-talking voice interaction. This paper focuses on building a prototype system for the Romanian language, to be used in distant speech recognition scenarios for voice-driven applications in intelligent homes or buildings. The system uses Romanian speech databases previously acquired by our research group and recorded in real-life conditions. For a baseline comparison, an English recognition engine is also implemented.

Keywords

Voice-driven applications, Automatic speech recognition, Distant speech recognition, Home automation, Multilingual recognition

Notes

Acknowledgements

This work was supported by the Ministry of Innovation and Research, UEFISCDI, partly through project number 5 Sol/2017 within PNCDI III and partly through the programme “Partnerships in priority areas”, “Collaborative Applied Research Projects”, project ID: PN-II-PT-PCCA-2013-4-0789, contract number 32/2014.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Alexandru Caranica (1)
  • Lucian Georgescu (1)
  • Alexandru Vulpe (2)
  • Horia Cucu (1)

  1. Speech and Dialogue Research Laboratory, University Politehnica of Bucharest, Bucharest, Romania
  2. Telecommunications Department, University Politehnica of Bucharest, Bucharest, Romania
