Abstract
A soundscape is the acoustic description of a specific environment. Soundscapes are therefore closely tied to a visual component, since they may capture the sounds of a city, the countryside, or a domestic space. In this work, we present a system that generates soundscapes from images. First, the system recognizes objects in the image. Second, it searches for the most suitable sounds for the entities identified in the picture. Finally, it synthesizes a soundscape by combining the short sound files it has found. The results of a subjective evaluation are promising and encourage further research on soundscape generation.
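The three steps described in the abstract (object recognition, sound search, synthesis by combination) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: `detect_objects` is a hypothetical stub that returns fixed labels instead of running a detector, `SOUND_LIBRARY` is a hypothetical table of sine tones standing in for short sound files, and `mix` simply overlays the clips and normalizes the peak.

```python
import math
from typing import Dict, List

SAMPLE_RATE = 8000  # Hz; a low rate keeps this toy example small


def tone(freq_hz: float, seconds: float) -> List[float]:
    """Stand-in for a short sound file: a sine tone with samples in [-1, 1]."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


# Hypothetical label -> clip table; a real system would query a sound library.
SOUND_LIBRARY: Dict[str, List[float]] = {
    "car": tone(110.0, 1.0),
    "bird": tone(880.0, 0.5),
    "dog": tone(330.0, 0.25),
}


def detect_objects(image_path: str) -> List[str]:
    """Step 1 (stub): return object labels for an image.

    Assumption: fixed labels are returned here; the image is never read.
    """
    return ["car", "bird", "bicycle"]


def find_sounds(labels: List[str]) -> List[List[float]]:
    """Step 2: keep the clips whose label exists in the library."""
    return [SOUND_LIBRARY[label] for label in labels if label in SOUND_LIBRARY]


def mix(clips: List[List[float]]) -> List[float]:
    """Step 3: overlay clips sample by sample, then normalize the peak to 1."""
    if not clips:
        return []
    mixed = [0.0] * max(len(clip) for clip in clips)
    for clip in clips:
        for i, sample in enumerate(clip):
            mixed[i] += sample
    peak = max(abs(sample) for sample in mixed)
    if peak > 1.0:
        mixed = [sample / peak for sample in mixed]
    return mixed


def generate_soundscape(image_path: str) -> List[float]:
    """Run the three steps in order and return the mixed samples."""
    return mix(find_sounds(detect_objects(image_path)))


soundscape = generate_soundscape("street.jpg")  # hypothetical file name
```

Labels without a matching clip (here `"bicycle"`) are skipped, so the result is as long as the longest matched clip. Overlaying with peak normalization is only one simple way to combine clips; sequencing, looping, or per-clip gain would be equally plausible design choices.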
Acknowledgments
The research of André Filipe Sales Mendes has been co-financed by the European Social Fund and Junta de Castilla y León (Operational Programme 2014–2020 for Castilla y León, EDU/556/2019 BOCYL) and partially supported by the project “FolkAI: Disseminate Folk European Music through Artificial Intelligence” (EIN2020-112348) under the program Research Europe 2020 financed by the Economy Ministry (Spanish Government).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Navarro-Cáceres, J.J., Mendes, A.S., Blas, H.S.S., González, G.V., Navarro-Cáceres, M. (2023). MusicFactory: Application of a Convolutional Neural Network for the Generation of Soundscapes from Images. In: de la Iglesia, D.H., de Paz Santana, J.F., López Rivero, A.J. (eds) New Trends in Disruptive Technologies, Tech Ethics and Artificial Intelligence. DiTTEt 2022. Advances in Intelligent Systems and Computing, vol 1430. Springer, Cham. https://doi.org/10.1007/978-3-031-14859-0_14
DOI: https://doi.org/10.1007/978-3-031-14859-0_14
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14858-3
Online ISBN: 978-3-031-14859-0
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)