Encyclopedia of Database Systems

Living Edition
Editors: Ling Liu, M. Tamer Özsu

Audio Content Analysis

  • Lie Lu
  • Alan Hanjalic
Living reference work entry
DOI: https://doi.org/10.1007/978-1-4899-7993-3_1528-2

Definition

An audio signal is a signal that carries information in the audible frequency range. Audio content analysis refers to the set of theories, algorithms and systems that aim to extract descriptors or metadata related to audio content, and thereby to enable search, retrieval and other user actions on audio signals.
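
To make the notion of a descriptor concrete, the following is a minimal sketch of frame-level descriptor extraction. It is not taken from this entry: the two descriptors (short-time energy and zero-crossing rate), the function names `frame_descriptors` and `sine`, and the frame length, hop size and sample rate are all assumptions chosen for illustration, and the test tones are synthetic.

```python
import math


def frame_descriptors(samples, frame_len=400, hop=200):
    """Return a list of (short-time energy, zero-crossing rate) per frame.

    frame_len and hop are illustrative values, not prescribed by the entry.
    """
    descriptors = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        descriptors.append((energy, zcr))
    return descriptors


def sine(freq_hz, sample_rate=8000, n=1600):
    """Synthetic test tone (hypothetical input, stands in for real audio)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


low = frame_descriptors(sine(100))    # low-pitched tone
high = frame_descriptors(sine(1000))  # high-pitched tone
print(low[0], high[0])
```

Both tones have similar energy, but the higher-frequency tone yields a higher zero-crossing rate, so even these two simple descriptors separate the signals. Descriptors of this kind could then be stored as metadata and used for search or retrieval.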

Historical Background

Multimedia content analysis has been one of the fastest-growing research directions in recent years. With the objective of providing fast, natural, intuitive and personalized content-based access to vast multimedia data collections, and building on the synergy of many scientific disciplines, such as signal processing, pattern recognition, machine learning, information retrieval, information theory, natural language processing and psychology, the research initiative born around the end of the 1980s has succeeded in inspiring and mobilizing an enormous number of researchers worldwide....

Recommended Reading

  1. Cai R, Lu L, Hanjalic A. Unsupervised content discovery in composite audio. In: Proceedings of the IEEE International Conference on Multimedia and Expo; 2005. p. 628–37.
  2. Cai R, Lu L, Hanjalic A, Zhang H-J, Cai L-H. A flexible framework for key audio effects detection and auditory context inference. IEEE Trans Audio Speech Lang Process. 2006;14(3):1026–39.
  3. Casey M, et al. Content-based music information retrieval: current directions and future challenges. In: Proceedings of the IEEE, Special Issue on Advances in Multimedia Information Retrieval. 2008;96(4):668–96.
  4. Cheng W-H, Chu W-T, Wu J-L. Semantic context detection based on hierarchical audio models. In: Proceedings of the 5th ACM SIGMM International Workshop on Multimedia Information Retrieval; 2003. p. 109–15.
  5. Hanjalic A. Content-based analysis of digital video. Norwell: Kluwer; 2004.
  6. Huang X, Acero A, Hon HW. Spoken language processing: a guide to theory, algorithm, and system development. Upper Saddle River: Prentice; 2001.
  7. Lu L, Cai R, Hanjalic A. Audio elements based auditory scene segmentation. Proc IEEE Int Conf Acoust Speech Signal Process. 2006;5:17–20.
  8. Lu L, Zhang H-J, Jiang H. Content analysis for audio classification and segmentation. IEEE Trans Speech Audio Process. 2002;10(7):504–16.
  9. Radhakrishnan R, Divakaran A, Xiong Z. A time series clustering based framework for multimedia mining and summarization using audio features. In: Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval; 2004. p. 157–64.

Copyright information

© Springer Science+Business Media LLC 2016

Authors and Affiliations

  1. Microsoft Research Asia, Beijing, China
  2. Delft University of Technology, Delft, The Netherlands

Section editors and affiliations

  • Vincent Oria (1)
  • Shin'ichi Satoh (2)
  1. Dept. of Computer Science, New Jersey Inst. of Technology, Newark, USA
  2. Digital Content and Media Sciences Resea, Multimedia Information Research Division, National Institute of Informatics, Tokyo, Japan