Environmental Sound Recognition in Social Robotics

Marques-Villarroya, Sara; Sosa-Aleman, Aythami; Castillo, Jose Carlos; Maroto-Gómez, Marcos; Salichs, Miguel Angel

doi:10.1007/978-3-031-43461-7_22

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 770)

Included in the following conference series:

International Symposium on Ambient Intelligence

Abstract

An important task for a social robot is to understand the world around it. Sensory information is key to this: it allows the robot to identify objects and people and to listen to users and to relevant sounds in the environment. This work focuses on the latter: the detection of everyday environmental sounds. We present a system able to recognise common sounds (e.g., air conditioning, car horn, water dripping) and show how its integration in the social robot Mini allows for enhanced interaction with users. We propose using deep learning techniques for sound identification, with the Mel-frequency spectrogram representing each sound as an image, which allows Convolutional Neural Networks to distinguish between sounds. The development was integrated into a real social robot to ensure the system’s proper operation. The resulting functional system depends on specific settings implemented in the robot to distinguish between actual sounds and background noise.
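As a rough illustration of the representation the abstract describes, the sketch below turns a raw audio signal into a log-scaled Mel-frequency spectrogram, the image-like array a convolutional network could take as input. It is not the authors' implementation: the frame size, hop length, number of Mel bands, sample rate, and all function names are assumptions chosen for the example, and a naive DFT is used only to keep the code free of dependencies (an audio library such as librosa would normally handle this step).

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 DFT bins."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    edges = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            # Triangle: zero outside [lo, hi], one at the centre frequency.
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_mel_spectrogram(signal, sample_rate, n_fft=128, hop=64, n_mels=16):
    """Return a list of frames, each holding n_mels band energies in dB."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small offset avoids log10(0) for silent bands.
        frames.append([10.0 * math.log10(e + 1e-10) for e in energies])
    return frames


# Hypothetical input: a synthetic 1 kHz tone standing in for a recorded sound.
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * t / sample_rate) for t in range(1024)]
spec = log_mel_spectrogram(tone, sample_rate)
```

Here `spec` is a time-by-frequency grid (frames by Mel bands) that can be treated as a single-channel image. A classifier would then be trained on such grids computed from labelled recordings; the network architecture and training data are outside the scope of this sketch.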

Notes

1. Working example video: https://youtu.be/2mO5241CKMU

Acknowledgements

The research leading to these results has received funding from the following projects: Robots sociales para mitigar la soledad y el aislamiento en mayores (SOROLI), PID2021-123941OA-I00, funded by Agencia Estatal de Investigación (AEI), Spanish Ministerio de Ciencia e Innovación; and Robots sociales para reducir la brecha digital de las personas mayores (SoRoGap), TED2021-132079B-I00, funded by Agencia Estatal de Investigación (AEI), Spanish Ministerio de Ciencia e Innovación. This publication is part of the R&D&I project PDC2022-133518-I00, funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR.

Author information

Authors and Affiliations

University Carlos III, Madrid, Spain
Sara Marques-Villarroya, Aythami Sosa-Aleman, Jose Carlos Castillo, Marcos Maroto-Gómez & Miguel Angel Salichs

Corresponding author

Correspondence to Jose Carlos Castillo.

Editor information

Editors and Affiliations

University of Minho, Braga, Portugal
Paulo Novais
Universitat Politècnica de València, Valencia, Valencia, Spain
Vicente Julián Inglada
University of Granada, Granada, Spain
Miguel J. Hornos
National Institute of Informatics, Chiyoda, Japan
Ichiro Satoh
CIICESI, ESTG, Politécnico do Porto, Felgueiras, Portugal
Davide Carneiro
ISEP/GECAD, Porto, Portugal
João Carneiro
Deep tech lab, AIR Institute, Valladolid, Spain
Ricardo S. Alonso

About this paper

Cite this paper

Marques-Villarroya, S., Sosa-Aleman, A., Castillo, J.C., Maroto-Gómez, M., Salichs, M.A. (2023). Environmental Sound Recognition in Social Robotics. In: Novais, P., et al. Ambient Intelligence – Software and Applications – 14th International Symposium on Ambient Intelligence. ISAmI 2023. Lecture Notes in Networks and Systems, vol 770. Springer, Cham. https://doi.org/10.1007/978-3-031-43461-7_22

DOI: https://doi.org/10.1007/978-3-031-43461-7_22
Published: 26 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43460-0
Online ISBN: 978-3-031-43461-7
eBook Packages: Intelligent Technologies and Robotics (R0)
