Detecting Persian speaker-independent voice commands based on LSTM and ontology in communicating with the smart home appliances

Artificial Intelligence Review

Abstract

Various interfaces are now used to control smart home appliances. Interaction between people and smart home appliances may rely on input devices such as a mouse, keyboard, microphone, or webcam. With a microphone as the input device, the interaction can take place through speech, which is a more natural way of communicating than other types of interface. Existing speech-based interfaces in the smart home domain suffer from several problems: they limit users to a fixed set of pre-defined commands, do not support indirect commands, require a large training set, or depend on specific speakers. To address these challenges, we propose several approaches in this paper. We use an ontology as a knowledge base to support indirect commands and to free users from having to express a specific set of commands. Moreover, Long Short-Term Memory (LSTM) is used to detect spoken commands more accurately. Additionally, because of the lack of Persian voice commands for interacting with smart home appliances, a dataset of speaker-independent Persian voice commands for communicating with a TV, media player, and lighting system has been designed, recorded, and evaluated in this research. The experimental results show that the LSTM-based voice command detection system was almost 1.5% and 13% more accurate than the Hidden Markov Model (HMM)-based one in the scenarios with and without ontology, respectively. Furthermore, using the ontology in the LSTM-based method improved system performance by about 40%.
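As a rough illustration of how a knowledge base can support indirect commands, the following minimal Python sketch resolves recognized words into a device action. It is not the authors' ontology or implementation: the device names, operation words, concept table, and the `resolve` function are all hypothetical, English glosses stand in for Persian words, and a real ontology would be far richer than a lookup table.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical vocabulary; the paper's actual command set is not reproduced here.
DEVICES = {"tv", "light", "player"}
OPERATIONS = {"on", "off"}

# Hypothetical stand-in for an ontology: maps a concept mentioned in an
# indirect command to the (device, operation) it implies.
INDIRECT_CONCEPTS = {
    "dark": ("light", "on"),
    "bright": ("light", "off"),
}


def resolve(tokens: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Map recognized words to a (device, operation) pair, or None."""
    words = [t.lower() for t in tokens]

    # Direct command: the utterance names both a device and an operation.
    device = next((w for w in words if w in DEVICES), None)
    operation = next((w for w in words if w in OPERATIONS), None)
    if device and operation:
        return (device, operation)

    # Indirect command: fall back to the concept table.
    for w in words:
        if w in INDIRECT_CONCEPTS:
            return INDIRECT_CONCEPTS[w]

    return None


print(resolve("turn the light on".split()))
print(resolve("it is dark in here".split()))
```

In this sketch the word sequence is assumed to come from a speech recognizer such as the LSTM-based detector; the lookup only shows why an utterance that names no device can still be mapped to an action.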

Data availability

The datasets generated during the current study are not publicly available. However, further information about the data and conditions for access are available from the corresponding author on reasonable request.

Author information

Corresponding author

Correspondence to Shima Tabibian.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

About this article

Cite this article

Kalkhoran, L.S., Tabibian, S. & Homayounvala, E. Detecting Persian speaker-independent voice commands based on LSTM and ontology in communicating with the smart home appliances. Artif Intell Rev 56, 6039–6060 (2023). https://doi.org/10.1007/s10462-022-10326-x

  • DOI: https://doi.org/10.1007/s10462-022-10326-x
