Abstract
This paper develops an approach to music clustering that uses only lyrics features to identify groups of songs with similar feelings, content, or emotions. The study uses a collection of 30,000 Spanish lyrics. Songs were represented in a vector space model (Bag of Words, BOW), and Part-of-Speech (POS) techniques were applied during preprocessing. Partitional and hierarchical methods were used for clustering, together with an estimate of the appropriate number of clusters (k). The clustering results were evaluated with internal measures, including the Davies-Bouldin Index (DBI) and intra-cluster and inter-cluster similarity. Finally, the resulting clusters were tagged using top words and association rules. The experiments show that music can be organized into related groups and tagged with unsupervised techniques such as clustering, using only lyrics information.
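The evaluation and tagging steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy lyrics, the fixed vocabulary, the hand-made partition, the use of raw term counts, and the choice of Euclidean distance are all assumptions made for illustration, and the helper names (`bag_of_words`, `davies_bouldin`, `top_words`) are hypothetical. The sketch builds BOW vectors, computes the Davies-Bouldin Index for a given partition, and tags each cluster with its highest-weighted terms.

```python
from collections import Counter
from math import sqrt


def bag_of_words(lyrics, vocabulary):
    """Map one lyric (a string) to a term-count vector over a fixed vocabulary."""
    counts = Counter(lyrics.lower().split())
    return [float(counts[term]) for term in vocabulary]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance (an assumed choice; the abstract names no metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (each a list of vectors).

    Lower values indicate compact clusters with well-separated centroids.
    Assumes at least two clusters whose centroids are distinct.
    """
    centroids = [centroid(c) for c in clusters]
    # Scatter: mean distance from each member to its own centroid.
    scatter = [
        sum(euclidean(v, m) for v in c) / len(c)
        for c, m in zip(clusters, centroids)
    ]
    k = len(clusters)
    worst = []
    for i in range(k):
        # For cluster i, keep the worst (largest) similarity ratio to any other cluster.
        ratios = [
            (scatter[i] + scatter[j]) / euclidean(centroids[i], centroids[j])
            for j in range(k) if j != i
        ]
        worst.append(max(ratios))
    return sum(worst) / k


def top_words(cluster, vocabulary, n=2):
    """Tag a cluster with the n vocabulary terms of highest centroid weight."""
    weights = centroid(cluster)
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(zip(vocabulary, weights), key=lambda p: (-p[1], p[0]))
    return [term for term, _ in ranked[:n]]


# Toy data invented for illustration; not taken from the paper's dataset.
vocabulary = ["amor", "corazon", "fiesta", "noche"]
songs = [
    "amor amor corazon",
    "corazon amor",
    "fiesta noche fiesta",
    "noche fiesta",
]
vectors = [bag_of_words(s, vocabulary) for s in songs]
clusters = [vectors[:2], vectors[2:]]  # a hand-made partition with k = 2

dbi = davies_bouldin(clusters)
tags = [top_words(c, vocabulary) for c in clusters]
```

In practice the partition would come from a partitional or hierarchical clustering run, and the index could be recomputed for several candidate values of k, with lower values suggesting a better-separated partition.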
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Parra, F.L., León, E. (2013). Unsupervised Tagging of Spanish Lyrics Dataset Using Clustering. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2013. Lecture Notes in Computer Science, vol 7988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39712-7_10
DOI: https://doi.org/10.1007/978-3-642-39712-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39711-0
Online ISBN: 978-3-642-39712-7
eBook Packages: Computer Science (R0)