Abstract
This paper presents a computer-based technique for bird species identification at large scale. It automatically identifies multiple species simultaneously in a large number of audio recordings and provides the basis for the best-scoring submission to the LifeCLEF 2014 Bird Identification Task. The method achieves a Mean Average Precision of 51.1% on the test set and 53.9% on the training set, with an Area Under the Curve of 91.5% during cross-validation. Besides a general description of the underlying classification approach, a number of additional research questions are addressed regarding the choice of features, the selection of classifier hyperparameters and the method of classification.
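For readers unfamiliar with the headline metric, the following is a minimal sketch of how a Mean Average Precision could be computed when each recording receives one score per species. It is an illustration only, not the official LifeCLEF evaluation code: the function names, the score layout (one row per recording, one column per species) and the toy numbers are all assumptions made here.

```python
from typing import List, Sequence


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Average precision for one ranked list of 0/1 relevance labels."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    scores: List[List[float]], truths: List[List[int]]
) -> float:
    """Mean of per-recording average precision.

    scores[i][j]: predicted score for species j in recording i (assumed layout).
    truths[i][j]: 1 if species j is present in recording i, else 0.
    """
    aps = []
    for score_row, truth_row in zip(scores, truths):
        # Rank species by descending score for this recording.
        order = sorted(
            range(len(score_row)), key=lambda j: score_row[j], reverse=True
        )
        aps.append(average_precision([truth_row[j] for j in order]))
    return sum(aps) / len(aps) if aps else 0.0


# Toy data (hypothetical): two recordings, three candidate species.
example_scores = [[0.9, 0.2, 0.4], [0.1, 0.8, 0.7]]
example_truths = [[1, 0, 0], [1, 0, 1]]
example_map = mean_average_precision(example_scores, example_truths)
```

In the toy data, the first recording ranks its only true species first, while the second ranks one of its two true species last, so the overall value falls below 1.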
Keywords
- Bird Identification
- Information retrieval
- Biodiversity
- Spectrogram segmentation
- Median Clipping
- Template matching
- Decision trees
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Lasseck, M. (2015). Towards Automatic Large-Scale Identification of Birds in Audio Recordings. In: , et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2015. Lecture Notes in Computer Science, vol 9283. Springer, Cham. https://doi.org/10.1007/978-3-319-24027-5_39
DOI: https://doi.org/10.1007/978-3-319-24027-5_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24026-8
Online ISBN: 978-3-319-24027-5
eBook Packages: Computer Science (R0)