MusicFactory: Application of a Convolutional Neural Network for the Generation of Soundscapes from Images

  • Conference paper
New Trends in Disruptive Technologies, Tech Ethics and Artificial Intelligence (DiTTEt 2022)

Abstract

A soundscape is a sound description of a concrete environment. Soundscapes are therefore always connected to a visual component, as they may capture the sounds of a city, a countryside, or a domestic setting. In this work, we present a system that generates soundscapes from images. First, the system recognizes objects in the image. Second, it searches for the most suitable sounds for the entities identified in the picture. Finally, it synthesizes a soundscape by combining the short sound files found. The results of the subjective evaluation are promising and encourage further research on soundscape generation.
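The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the detector output is replaced by a hard-coded list of (label, confidence) pairs, the sound library is a hypothetical in-memory dictionary, and the names `select_labels`, `find_clips`, and `mix`, as well as the 0.5 confidence threshold, are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Clip = List[float]  # a mono clip as a list of samples in [-1.0, 1.0]


def select_labels(detections: List[Tuple[str, float]],
                  min_confidence: float = 0.5) -> List[str]:
    """Step 1 (stand-in): keep unique labels whose confidence passes the threshold."""
    labels: List[str] = []
    for label, confidence in detections:
        if confidence >= min_confidence and label not in labels:
            labels.append(label)
    return labels


def find_clips(labels: List[str], library: Dict[str, Clip]) -> List[Clip]:
    """Step 2 (stand-in): look up one clip per label, skipping labels without a sound."""
    return [library[label] for label in labels if label in library]


def mix(clips: List[Clip]) -> Clip:
    """Step 3 (stand-in): sum clips sample by sample and rescale if the peak exceeds 1.0."""
    if not clips:
        return []
    mixed = [0.0] * max(len(clip) for clip in clips)
    for clip in clips:
        for i, sample in enumerate(clip):
            mixed[i] += sample  # shorter clips are implicitly padded with silence
    peak = max(abs(sample) for sample in mixed)
    if peak > 1.0:
        mixed = [sample / peak for sample in mixed]
    return mixed


# Hypothetical detector output and sound library, for illustration only.
detections = [("dog", 0.9), ("car", 0.3), ("bird", 0.8), ("dog", 0.7)]
library = {"dog": [0.5, 0.5, 0.5], "bird": [0.75, -0.25]}

labels = select_labels(detections)
soundscape = mix(find_clips(labels, library))
print(labels, soundscape)
```

In a real system the hard-coded detections would come from an image recognition model and the dictionary from a sound database; the mixing step here is a plain sum with peak normalization, chosen only because it is easy to follow.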

Notes

  1. https://usalinvestigacion.eu.qualtrics.com/jfe/form/SV_6LuVypXB6UbLM4S

Acknowledgments

The research of André Filipe Sales Mendes has been co-financed by the European Social Fund and Junta de Castilla y León (Operational Programme 2014–2020 for Castilla y León, EDU/556/2019 BOCYL) and partially supported by the project “FolkAI: Disseminate Folk European Music through Artificial Intelligence” (EIN2020-112348) under the program Research Europe 2020 financed by the Economy Ministry (Spanish Government).

Author information

Corresponding author

Correspondence to María Navarro-Cáceres.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Navarro-Cáceres, J.J., Mendes, A.S., Blas, H.S.S., González, G.V., Navarro-Cáceres, M. (2023). MusicFactory: Application of a Convolutional Neural Network for the Generation of Soundscapes from Images. In: de la Iglesia, D.H., de Paz Santana, J.F., López Rivero, A.J. (eds) New Trends in Disruptive Technologies, Tech Ethics and Artificial Intelligence. DiTTEt 2022. Advances in Intelligent Systems and Computing, vol 1430. Springer, Cham. https://doi.org/10.1007/978-3-031-14859-0_14
