A melody stuck in your head, also known as an “earworm”, is hard to get rid of unless you listen to the song again or sing it out loud. But what if you cannot find the name of that song? The feeling can be intolerable. Recognizing a song's name from a hummed tune is not an easy task for a human being and is better done by machines. However, no research paper has been published on hummed-tune recognition. This work is adapted from the Hum2Song task of the Zalo AI Challenge 2021, a competition on retrieving the name of a song from a user's hummed tune, similar to Google's Hum to Search. This paper details how the data are pre-processed from the original format (mp3) into a form usable for training and inference. To train an embedding model for the feature extraction phase, we ran experiments with several state-of-the-art architectures, such as ResNet, VGG, AlexNet, and MobileNetV2. For the inference phase, we use the Faiss module to efficiently search for the song that matches the sequence of the hummed sound. The result is nearly 94% on the MRR@10 metric on the public test set, along with the top-1 position on the public leaderboard.
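The retrieval and scoring steps summarized above can be illustrated with a small sketch. It is not the authors' implementation: the paper uses Faiss for the search, whereas this example substitutes a brute-force inner-product ranking in plain Python so that it runs without extra dependencies. The function names (`top_k_by_inner_product`, `mrr_at_k`), the two-dimensional embeddings, and the song and query identifiers are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def top_k_by_inner_product(
    query: Sequence[float],
    song_embeddings: Dict[str, Sequence[float]],
    k: int = 10,
) -> List[str]:
    """Rank songs by inner product with the query embedding (brute force).

    Stand-in for a vector index; a real system would delegate this to Faiss.
    """
    scores = {
        song_id: sum(q * s for q, s in zip(query, emb))
        for song_id, emb in song_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


def mrr_at_k(
    ranked: Dict[str, Sequence[str]],
    truth: Dict[str, str],
    k: int = 10,
) -> float:
    """Mean reciprocal rank, counting only the top-k candidates per query."""
    if not truth:
        return 0.0
    total = 0.0
    for query_id, true_song in truth.items():
        candidates = list(ranked.get(query_id, []))[:k]
        if true_song in candidates:
            # Reciprocal of the 1-based rank of the correct song.
            total += 1.0 / (candidates.index(true_song) + 1)
    return total / len(truth)


# Hypothetical toy data: three song embeddings and two hummed queries.
songs = {"song_a": [1.0, 0.0], "song_b": [0.0, 1.0], "song_c": [0.6, 0.8]}
hums = {"hum_1": [0.9, 0.1], "hum_2": [0.5, 0.7]}
truth = {"hum_1": "song_a", "hum_2": "song_b"}

ranked = {qid: top_k_by_inner_product(vec, songs, k=10) for qid, vec in hums.items()}
score = mrr_at_k(ranked, truth, k=10)
print(ranked)
print(score)
```

In this toy case the correct song is ranked first for `hum_1` and second for `hum_2`, so the reciprocal ranks are 1 and 1/2 and MRR@10 is their mean. A score near 94%, as reported above, would mean the correct song appears at or very near the top of the list for most queries.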
- Humming sound recognition
- Deep learning
- Faiss module
- Sound preprocessing
H. H. Luong, T. P. Tran, H. P. Ngo, H. V. Nguyen, T. Nguyen—These authors contributed equally to this work.
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Pham, B.L., Luong, H.H., Tran, T.P., Ngo, H.P., Nguyen, H.V., Nguyen, T. (2022). An Approach to Hummed-tune and Song Sequences Matching. In: Dang, T.K., Küng, J., Chung, T.M. (eds) Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2022. Communications in Computer and Information Science, vol 1688. Springer, Singapore. https://doi.org/10.1007/978-981-19-8069-5_49
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8068-8
Online ISBN: 978-981-19-8069-5