Definition
An audio signal is a signal that contains information in the audible frequency range. Audio representation refers to the extraction of audio signal properties, or features, that are representative of the composition of the audio signal (in both the temporal and the spectral domain) and of its behavior over time. Feature extraction is typically combined with feature selection, which determines the best set of features for the intended operation on the audio signal.
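As an illustration of what temporal and spectral features can look like in practice, the following minimal sketch computes three commonly used frame-level features: short-time energy, zero-crossing rate, and spectral centroid. The choice of these three features, the function names, the frame size of 64 samples, and the hop of 32 samples are assumptions made for this example and are not prescribed by the entry. The spectrum is computed with a naive DFT so that the sketch needs only the standard library.

```python
import cmath
import math


def frames(signal, size, hop):
    """Split a list of samples into overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def short_time_energy(frame):
    """Temporal feature: mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Temporal feature: fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Spectral feature: magnitude-weighted mean frequency in Hz (naive DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        mags.append(abs(s))
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, report 0 by convention
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def extract_features(signal, sample_rate, size=64, hop=32):
    """Return one (energy, zcr, centroid) tuple per frame."""
    return [
        (short_time_energy(f), zero_crossing_rate(f), spectral_centroid(f, sample_rate))
        for f in frames(signal, size, hop)
    ]


# Hypothetical test input: a 1 kHz sine sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
features = extract_features(tone, sample_rate)
```

Each frame of the signal is reduced to three numbers, and the sequence of these tuples describes how the signal behaves over time. A feature selection step would then keep only the features that are useful for the task at hand.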
Historical Background
Audio feature extraction typically leads to a strongly reduced representation of the audio signal. Obtaining such a representation can improve the efficiency of audio processing and benefit many applications based on such processing. For example, a compact representation of an audio signal in the form of a fingerprint can enable extremely fast search for a match between this signal and a large-scale audio database for the purpose of audio signal...
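The fingerprint idea can be sketched with a toy example. The code below is not a published fingerprinting scheme. It is an assumed, deliberately simple construction in which each pair of adjacent frames contributes one bit, set when the frame energy rises, and the resulting bit tuple is used as a key in an in-memory dictionary. The names `energy_fingerprint`, `index_signal`, `lookup`, and `database` are hypothetical.

```python
def energy_fingerprint(signal, size=64):
    """Toy fingerprint: one bit per pair of adjacent non-overlapping frames,
    set to 1 when the frame energy rises and 0 otherwise."""
    energies = [
        sum(x * x for x in signal[i:i + size])
        for i in range(0, len(signal) - size + 1, size)
    ]
    return tuple(1 if cur > prev else 0 for prev, cur in zip(energies, energies[1:]))


database = {}  # maps fingerprint -> label of the indexed signal


def index_signal(label, signal):
    """Store the signal's fingerprint so it can be found later."""
    database[energy_fingerprint(signal)] = label


def lookup(signal):
    """Exact-match lookup: a single dictionary access, independent of database size."""
    return database.get(energy_fingerprint(signal))


# Hypothetical signals: constant amplitude per frame, rising or falling.
rising = [a for a in (1.0, 2.0, 3.0, 4.0) for _ in range(64)]
falling = [a for a in (4.0, 3.0, 2.0, 1.0) for _ in range(64)]
index_signal("rising", rising)
index_signal("falling", falling)

# A quieter copy of the rising signal yields the same fingerprint,
# because only the direction of energy change is kept.
quiet_copy = [0.5 * x for x in rising]
match = lookup(quiet_copy)
```

The sketch shows why a compact representation helps: the query is reduced to a few bits, and matching becomes a key lookup rather than a sample-by-sample comparison. A practical system would need to tolerate bit errors and partial queries, which exact dictionary matching does not.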
Copyright information
© 2016 Springer Science+Business Media LLC
About this entry
Cite this entry
Lu, L., Hanjalic, A. (2016). Audio Representation. In: Liu, L., Özsu, M. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7993-3_1442-2
DOI: https://doi.org/10.1007/978-1-4899-7993-3_1442-2
Publisher Name: Springer, New York, NY
Online ISBN: 978-1-4899-7993-3
eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering