Using deep learning and genetic algorithms for melody generation and optimization in music

  • Focus
Soft Computing

Abstract

Music expresses thoughts and emotions in artistic form and comprises several components, including harmony, rhythm, and melody. During songwriting, these elements are combined to produce harmonious melodies. Melody is the element that most strongly evokes feeling in listeners and captures their interest: in music appreciation it governs the emotional changes of a piece, and it is the most readily perceived part of a song and sets its tone. In recent years, Sichuan unvoiced music has developed rapidly and attracted much attention. This paper takes Sichuan unvoiced music as its research subject and constructs a melody generation algorithm using deep learning (DL) and evolutionary algorithms (EAs), specifically a recurrent neural network with long short-term memory (RNN-LSTM) and a genetic algorithm (GA). First, the paper briefly describes DL algorithms, the deep generative model, and the sequence-to-sequence model, which form the technological foundation of this research. Second, it proposes a melody generation algorithm that uses the RNN-LSTM for melody generation and the GA for melody optimization. More specifically, the melody is generated by preprocessing the data and then building and training the RNN-LSTM model. For the GA, a melodic fitness function is determined for eight songs, since the fitness function directly affects the chosen termination condition. The fitness function can be either a human evaluator or an evolutionary rule. Finally, the average score of these songs before and after evolution is calculated; the results indicate that the analysis and rotation creation methods are more precise and that the songs' average melody score is higher.
The proposed method was compared thoroughly with approaches from earlier studies and was found to be more effective in terms of accuracy and average melody score.
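To make the GA optimization step more concrete, the following is a minimal sketch of evolving a population of melodies under a rule-based fitness function. It is not the paper's implementation. The encoding (lists of MIDI pitch numbers), the pitch set `SCALE`, the interval-based `fitness` rule, the elitist selection scheme, and the fixed generation count used as the termination condition are all assumptions made for illustration. In the approach described above, the starting melodies would come from the trained RNN-LSTM rather than from `random_population`.

```python
import random

# Assumed pitch set (C major pentatonic as MIDI numbers); not from the paper.
SCALE = [60, 62, 64, 67, 69, 72]


def fitness(melody):
    """Toy evolutionary rule (an assumption): reward small melodic steps
    and penalize leaps larger than a perfect fifth (7 semitones)."""
    score = 0
    for a, b in zip(melody, melody[1:]):
        interval = abs(a - b)
        if interval <= 4:
            score += 2
        elif interval > 7:
            score -= 1
    return score


def average_score(population):
    """Mean fitness of a population, for a before/after comparison."""
    return sum(fitness(m) for m in population) / len(population)


def random_population(size=8, length=16, seed=1):
    """Stand-in for model-generated melodies; sizes are arbitrary."""
    rng = random.Random(seed)
    return [[rng.choice(SCALE) for _ in range(length)] for _ in range(size)]


def crossover(p1, p2, rng):
    """Single-point crossover of two equal-length melodies."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]


def mutate(melody, rng, rate=0.1):
    """Replace each note with a random scale note with probability `rate`."""
    return [rng.choice(SCALE) if rng.random() < rate else n for n in melody]


def evolve(population, generations=50, seed=0):
    """Run a fixed number of generations (the assumed termination
    condition) and return the final population."""
    rng = random.Random(seed)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[: len(ranked) // 2]  # keep the better half unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(len(population) - len(elite))
        ]
        population = elite + children
    return population


if __name__ == "__main__":
    before = random_population()
    after = evolve(before)
    print("average score before:", average_score(before))
    print("average score after: ", average_score(after))
```

Because the better half of each generation is carried over unchanged, the best score in the population can never decrease. The average score is not guaranteed to rise, which is one reason to report scores before and after evolution as the abstract describes. If the fitness function is a person rather than a rule, `fitness` would be replaced by a lookup of listener ratings.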


Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.


Funding

This study was funded by the project "Research on the Influencing Factors and Optimization Strategies of Cooperative Protection and Development of Bayu National Folk Music" (BYMY22B03). This paper does not receive any funding.

Author information

Corresponding author

Correspondence to Ling Dong.

Ethics declarations

Conflict of interest

The authors have not disclosed any competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dong, L. Using deep learning and genetic algorithms for melody generation and optimization in music. Soft Comput 27, 17419–17433 (2023). https://doi.org/10.1007/s00500-023-09135-3

  • DOI: https://doi.org/10.1007/s00500-023-09135-3
