Audio Representation

  • Living reference work entry

Synonyms

Audio characterization; Audio feature extraction

Definition

An audio signal is a signal that contains information in the audible frequency range. Audio representation refers to the extraction of audio signal properties, or features, that are representative of the signal's composition (in both the temporal and spectral domains) and of its behavior over time. Feature extraction is typically combined with feature selection, which identifies the best set of features for the intended operation on the audio signal.
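As a rough illustration of the definition above, the sketch below computes two temporal features (short-time energy and zero-crossing rate) and one spectral feature (spectral centroid) for each frame of a signal. The choice of these three features, the function names, the frame length, and the hop size are all assumptions made for this example and are not taken from the entry. The DFT is a naive implementation written for readability rather than speed.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Temporal feature: mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Temporal feature: fraction of adjacent sample pairs that change sign."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (fine for short frames)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectral_centroid(frame, sample_rate):
    """Spectral feature: magnitude-weighted mean frequency in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    bin_hz = sample_rate / len(frame)
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


def extract_features(samples, sample_rate, frame_len=256, hop=128):
    """Return one (energy, zcr, centroid) tuple per frame."""
    return [(short_time_energy(f), zero_crossing_rate(f),
             spectral_centroid(f, sample_rate))
            for f in frame_signal(samples, frame_len, hop)]


# Illustrative input (an assumption): a quarter second of a 1 kHz sine at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(2000)]
features = extract_features(tone, SAMPLE_RATE)
```

Each frame of the signal is reduced to three numbers, which shows the kind of compaction that feature extraction provides. A feature selection step would then decide which of these values, if any, are useful for the task at hand.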

Historical Background

Audio feature extraction typically leads to a strongly reduced representation of the audio signal. Obtaining such a representation can improve the efficiency of audio processing and benefit many applications based on such processing. For example, a compact representation of an audio signal in the form of a fingerprint can enable extremely fast search for a match between this signal and a large-scale audio database for the purpose of audio signal...

Correspondence to Lie Lu.

Copyright information

© 2016 Springer Science+Business Media LLC

About this entry

Cite this entry

Lu, L., Hanjalic, A. (2016). Audio Representation. In: Liu, L., Özsu, M. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7993-3_1442-2

  • DOI: https://doi.org/10.1007/978-1-4899-7993-3_1442-2

  • Publisher Name: Springer, New York, NY

  • Online ISBN: 978-1-4899-7993-3

  • eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
