Abstract
This paper presents a computer-based technique for bird species identification at large scale. It automatically identifies multiple species simultaneously in a large number of audio recordings and provides the basis for the best-scoring submission to the LifeCLEF 2014 Bird Identification Task. The method achieves a Mean Average Precision of 51.1% on the test set and 53.9% on the training set, with an Area Under the Curve of 91.5% during cross-validation. Besides a general description of the underlying classification approach, a number of additional research questions are addressed regarding the choice of features, the selection of classifier hyperparameters, and the method of classification.
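For readers unfamiliar with the headline metric, the following is a minimal sketch of how Mean Average Precision is commonly computed for a ranked retrieval task. It assumes one ranked species list per recording and a set of true species per recording; the function names and the toy data are illustrative and are not taken from the paper or from the LifeCLEF evaluation scripts.

```python
from typing import Sequence, Set


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked species list against the true species set."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, species in enumerate(ranked, start=1):
        if species in relevant:
            hits += 1
            # Precision at this rank, counted only where a true species occurs.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Sequence[Sequence[str]], ground_truth: Sequence[Set[str]]
) -> float:
    """Mean of the per-recording average precisions."""
    scores = [average_precision(r, g) for r, g in zip(rankings, ground_truth)]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two recordings, placeholder species labels.
rankings = [["a", "b", "c"], ["c", "a", "b"]]
ground_truth = [{"a", "c"}, {"b"}]
map_score = mean_average_precision(rankings, ground_truth)
```

The sketch assumes each ranked list contains no duplicate species. Under this definition, a score of 51.1% means that, averaged over recordings, the true species tend to appear near the top of the ranked predictions but not consistently in first place.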
Keywords
- Bird Identification
- Information retrieval
- Biodiversity
- Spectrogram segmentation
- Median Clipping
- Template matching
- Decision trees
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Lasseck, M. (2015). Towards Automatic Large-Scale Identification of Birds in Audio Recordings. In: Mothe, J., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2015. Lecture Notes in Computer Science, vol 9283. Springer, Cham. https://doi.org/10.1007/978-3-319-24027-5_39
DOI: https://doi.org/10.1007/978-3-319-24027-5_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24026-8
Online ISBN: 978-3-319-24027-5
eBook Packages: Computer Science, Computer Science (R0)