Abstract
Concert recordings of Carnatic music are often continuous and unsegmented. At present, these recordings are segmented manually into items for making CDs. The objective of this paper is to develop algorithms that segment continuous concert recordings into items using applause as a cue. Owing to the ‘here and now’ nature of applause, the number of applauses exceeds the number of items in a concert, so a concert is fragmented into many segments. In the first part of the paper, applause locations are identified using time- and spectral-domain features, namely, short-time energy, zero-crossing rate, spectral flux and spectral entropy. In the second part, inter-applause segments are merged if they belong to the same item. The main component of every item in a concert is a composition, which is characterised by an ensemble of vocal (or main instrument), violin (optional) and percussion. Inter-applause segments are classified into four classes, namely, vocal solo, violin solo, composition and thaniavarthanam, using tonic-normalised cent filter-bank cepstral coefficients. Adjacent composition segments are merged into a single item if they belong to the same melody. Meta-data corresponding to the concert in terms of items, available from listeners, are matched to the segmented audio. The applauses are further classified by strength using the cumulative sum (CUSUM) method, and the locations of the top three highlights of every concert are documented. The performance of the proposed approaches to applause identification, inter-applause classification and mapping of items is evaluated on 50 live recordings of Carnatic music concerts. The applause identification accuracy is 99%, the inter- and intra-item classification accuracy is 93%, and the mapping accuracy is 95%.
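The four applause cues named in the abstract can be sketched as a minimal per-frame feature extractor. This is an illustrative sketch only: the frame length, hop size and Hann window below are assumptions, not the paper's settings, and the paper's actual detector thresholds these features rather than merely computing them.

```python
import numpy as np

def frame_features(x, frame_len=1024, hop=512):
    """Per-frame short-time energy, zero-crossing rate, spectral entropy
    and spectral flux -- the four cues used for applause detection.
    Frame/hop sizes are illustrative assumptions."""
    n_frames = 1 + (len(x) - frame_len) // hop
    feats = []
    prev_mag = None
    window = np.hanning(frame_len)
    for i in range(n_frames):
        f = x[i * hop: i * hop + frame_len]
        energy = np.sum(f ** 2) / frame_len                 # short-time energy
        zcr = np.mean(np.abs(np.diff(np.sign(f)))) / 2.0    # zero-crossing rate
        mag = np.abs(np.fft.rfft(f * window))
        p = mag / (np.sum(mag) + 1e-12)                     # normalised spectrum
        entropy = -np.sum(p * np.log2(p + 1e-12))           # spectral entropy
        flux = 0.0 if prev_mag is None else np.sum((mag - prev_mag) ** 2)
        prev_mag = mag
        feats.append((energy, zcr, entropy, flux))
    return np.array(feats)
```

Applause is broadband and noise-like, so its frames show markedly higher zero-crossing rate and spectral entropy than pitched music frames; thresholding these features is what separates applause regions from inter-applause music.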
Notes
In this paper we refer to the kīrtana as a composition.
Hereafter we refer to this as a (the main) song.
An instrument that maintains the tonic throughout the concert.
This table is courtesy of M V N Murthy, Professor, IMSc, Chennai, India.
These live recordings were obtained from personal collections of listeners and musicians. Recordings were made available for research purposes only.
http://www.charsur.org. Live concerts have been licensed from Charsur for research purposes.
About 40 concerts are available at the time of this writing.
Key corresponds to tonic in the context of Indian classical music.
DKP – D K Pattamal and MSS – M S Subbalakshmi are female singers and ALB – Alathur Brothers, GNB – G N Balasubramaniam, KVN – K V Narayanaswamy, SS – Sanjay Subramaniam and TMK – T M Krishna are male singers.
Labelling was done by the first author and verified by a professional musician.
The threshold was determined by performing a line search.
http://www.iitm.ac.in/donlab/music/mapping_songslist/index.php. This concert is by Srividya Janakiraman accompanied by Gyandev on violin and Sriram on mridangam. This concert was performed at the ARKAY Convention Center, Mylapore, Chennai.
References
Murthy M V N 2012 Applause and aesthetic experience. http://compmusic.upf.edu/zh-hans/node/151
Krishna T M 2013 A southern music: the Karnatik story. India: Harpercollins India
Jarina R and Olajec J 2007 Discriminative feature selection for applause sounds detection. In: Proceedings of the Eighth International Workshop on Image Analysis for Multimedia Interactive Services, WIAMIS’07, IEEE, p. 13
Olajec J, Jarina R and Kuba M 2006 GA-based feature extraction for clapping sound detection. In: Proceedings of the Eighth Seminar on Neural Network Applications in Electrical Engineering, NEUREL 2006, IEEE, pp. 21–25
Carey M J, Parris E S and Lloyd-Thomas H 1999 A comparison of features for speech, music discrimination. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE, vol. 1, pp. 149–152
Shi Z, Han J and Zheng T 2011 Heterogeneous mixture models using sparse representation features for applause and laugh detection. In: Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), September, pp. 1–5
Li Y X, He Q H, Kwong S, Li T and Yang J C 2009 Characteristics-based effective applause detection for meeting speech. Signal Process. 89(8): 1625–1633
Li Y X, He Q H, Li W and Wang Z F 2010 Two-level approach for detecting non-lexical audio events in spontaneous speech. In: Proceedings of the International Conference on Audio Language and Image Processing (ICALIP), IEEE, pp. 771–777
Manoj C, Magesh S, Sankaran M S and Manikandan M S 2011 A novel approach for detecting applause in continuous meeting. In: Proceedings of the IEEE International Conference on Electronics and Computer Technology, India, April, pp. 182–186
Koduri G K, Ishwar V, Serrà J and Serra X 2014 Intonation analysis of rāgas in Carnatic music. J. New Music Res. (Special Issue on Computational Approaches to the Art Music Traditions of India and Turkey) 43: 72–93
Krishna T M and Ishwar V 2012 Svaras, gamaka, motif and raga identity. In: Proceedings of the Workshop on Computer Music, July, pp. 12–18
Pesch L 2009 The Oxford illustrated companion to south Indian classical music. Oxford: Oxford University Press
Serra J, Koduri G K, Miron M and Serra X 2011 Assessing the tuning of sung Indian classical music. In: Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011), pp. 157–162
Dutta S 2016 Analysis of motifs in Carnatic music: a computational perspective. Master’s Thesis, Indian Institute of Technology Madras
Bellur A and Murthy H A 2013 A novel application of group delay function for identifying tonic in Carnatic music. In: Proceedings of the 21st European Signal Processing Conference (EUSIPCO), IEEE, pp. 1–5
Bellur A, Ishwar V, Serra X and Murthy H A 2012 A knowledge based signal processing approach to tonic identification in Indian classical music. In: Serra X, Rao P, Murthy H and Bozkurt B (Eds.) Proceedings of the 2nd CompMusic Workshop, July 12–13, Istanbul, Turkey. Barcelona: Universitat Pompeu Fabra, pp. 113–118
Sarala P, Ishwar V, Bellur A and Murthy H A 2012 Applause identification and its relevance to archival of Carnatic music. In: Serra X, Rao P, Murthy H and Bozkurt B (Eds.) Proceedings of the 2nd CompMusic Workshop, July 12–13, Istanbul, Turkey. Barcelona: Universitat Pompeu Fabra
Rabiner L R and Schafer R W 2011 Theory and applications of digital speech processing. Upper Saddle River, NJ: Pearson International
Cannam C, Landone C, Sandler M B and Bello J P 2006 The sonic visualiser: a visualisation platform for semantic descriptors from musical signals. In: Proceedings of the ISMIR Conference, pp. 324–327
Chang C C and Lin C J 2011 LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3): 27
Müller M, Kurth F and Clausen M 2005 Audio matching via chroma-based statistical features. In: Proceedings of the ISMIR Conference, vol. 2005, p. 6
Ellis D 2007 Chroma feature analysis and synthesis. http://www.ee.columbia.edu/~dpwe/resources/Matlab/chroma-ansyn
Salamon J, Gulati S and Serra X 2012 A two-stage approach for tonic identification in Indian art music. In: Proceedings of the Workshop on Computer Music, July, pp. 119–127
De Cheveigné A and Kawahara H 2002 YIN, a fundamental frequency estimator for speech and music. J. Acoust. Soc. Am. 111(4): 1917–1930
Brown J C and Puckette M S 1992 An efficient algorithm for the calculation of a constant-Q transform. J. Acoust. Soc. Am. 92(5): 2698–2701
Chordia P and Rae A 2007 Raag recognition using pitch-class and pitch-class dyad distributions. In: Proceedings of the ISMIR Conference, pp. 431–436
Brodsky E and Darkhovsky B S 1993 Nonparametric methods in change point problems. New York: Springer Science & Business Media
Wang H, Zhang D and Shin K G 2002 SYN-dog: sniffing SYN flooding sources. In: Proceedings of the ICDCS, July, pp. 421–428
Liu H and Kim M S 2010 Real-time detection of stealthy DDoS attacks using time-series decomposition. In: Proceedings of the IEEE International Conference on Communications (ICC), IEEE, pp. 1–6
Acknowledgements
This research was partly funded by the European Research Council under the European Union’s Seventh Framework Program, as part of the CompMusic project (ERC grant agreement 267583). We would like to thank Mr R K Ramakrishnan for arranging the concerts by Srividya Janakiraman and for seeking permission from the artists for the CompMusic project. We also thank Vidwan T M Krishna for giving permission to use his 26 personal recordings for applause analysis.
Additional information
The work presented in this paper is an extension of earlier work and includes extensive analysis of Carnatic music.
Cite this article
PADI, S., MURTHY, H.A. Segmentation of continuous audio recordings of Carnatic music concerts into items for archival. Sādhanā 43, 154 (2018). https://doi.org/10.1007/s12046-018-0922-y