Summary
Since the invention of fuzzy sets and the maturing of fuzzy logic theory, fuzzy logic systems have been widely applied in fields such as fuzzy control and data mining. New application areas for fuzzy logic are also being explored as other technologies emerge. One booming technology today is the Internet, owing to its fast-growing number of users and rich content. With large data storage and fast networks becoming available, multimedia content such as images, video, and audio is increasing rapidly. To search and index these media effectively, various content-based multimedia retrieval systems have been studied.
In this chapter, we introduce a fuzzy logic approach to hierarchical content-based audio classification and boolean retrieval. The approach is intuitive because human perception of audio is fuzzy, especially for audio clips of mixed types. The fuzzy nature of audio search lies in two facts: (1) both the query and the target are approximations of the user’s memory and desire, and (2) exact matching is sometimes impossible or impractical. Fuzzy logic systems are therefore a natural choice for audio classification and retrieval.
The fuzzy tree classifier is the core of the hierarchical content-based audio classification. First, audio features are extracted from the audio samples in the database. Suitable features are then selected and used as input to a constructed fuzzy inference system (FIS). The FIS outputs two types of hierarchical audio classes. Its membership functions and rules are derived from the distributions of the audio features. At the first level of the hierarchy, the FIS discriminates speech and music from other sounds. At the second level, music and speech are separated. One particular sound, the telephone ring, is also recognized at this level. In the prototype system, classification down to a fourth level has been explored. Multiple FISs can therefore be combined to form a ‘fuzzy tree’ for retrieving different types of audio clips. Compared with existing approaches, this approach classifies and retrieves generic audio using fewer features and less computation time.
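The tree-of-classifiers idea can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the chapter: the feature names `feature_a` and `feature_b`, the linear ramp membership functions, the breakpoints, and the reading that the first level separates speech and music from other sounds. The chapter derives its membership functions and rules from the feature distributions, which this sketch does not attempt.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]


def ramp_up(x: float, lo: float, hi: float) -> float:
    """Membership that rises linearly from 0 at `lo` to 1 at `hi`."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def level1(f: Features) -> Dict[str, float]:
    # Hypothetical rule: a high `feature_a` suggests "other sounds".
    other = ramp_up(f["feature_a"], 0.3, 0.7)
    return {"speech_or_music": 1.0 - other, "other": other}


def level2(f: Features) -> Dict[str, float]:
    # Hypothetical rule: a high `feature_b` suggests speech rather than music.
    speech = ramp_up(f["feature_b"], 0.4, 0.6)
    return {"speech": speech, "music": 1.0 - speech}


def classify(f: Features) -> List[Tuple[str, float]]:
    """Walk the tree, recording (label, membership) at each level visited."""
    m1 = level1(f)
    label = max(m1, key=m1.get)
    path = [(label, m1[label])]
    if label == "speech_or_music":
        # Only clips that pass the first level reach the second classifier.
        m2 = level2(f)
        label = max(m2, key=m2.get)
        path.append((label, m2[label]))
    return path
```

Calling `classify({"feature_a": 0.2, "feature_b": 0.9})` walks both levels and ends at the `speech` leaf, while a clip with a high `feature_a` stops at `other` after one level. Keeping the membership value at each step preserves how confident each decision was, which is useful for clips of mixed types.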
As for retrieval, existing content-based audio retrieval systems usually adopt a query-by-example mechanism to search for desired audio files. However, a single audio sample often cannot express the user’s needs adequately. To overcome this problem, additional audio files can be used as queries, either provided by the user or obtained through feedback during searching. Accordingly, we present a scheme for content-based audio retrieval with multiple queries. The queries are linked by boolean operators, so retrieval can be treated as a boolean search problem. We build a framework that implements the three basic boolean operators, AND, OR, and NOT, using concepts adopted from fuzzy logic. Experiments have shown that boolean search can be helpful in audio retrieval.
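One common way to realize boolean operators over graded similarities is with the standard Zadeh operators (minimum for AND, maximum for OR, complement for NOT). The summary does not state which operators the chapter uses, so the sketch below is an assumption, as are the clip identifiers, the query names `q1` and `q2`, and the similarity values in [0, 1].

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]  # query name -> similarity of one clip to that query


def fuzzy_and(a: float, b: float) -> float:
    return min(a, b)


def fuzzy_or(a: float, b: float) -> float:
    return max(a, b)


def fuzzy_not(a: float) -> float:
    return 1.0 - a


def rank(db: Dict[str, Scores],
         combine: Callable[[Scores], float]) -> List[Tuple[str, float]]:
    """Score every clip with `combine` and sort from best to worst match."""
    combined = {clip: combine(s) for clip, s in db.items()}
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical similarities of three clips to two example queries.
db = {
    "clip1": {"q1": 0.9, "q2": 0.2},
    "clip2": {"q1": 0.6, "q2": 0.7},
    "clip3": {"q1": 0.1, "q2": 0.8},
}

both = rank(db, lambda s: fuzzy_and(s["q1"], s["q2"]))
either = rank(db, lambda s: fuzzy_or(s["q1"], s["q2"]))
q1_not_q2 = rank(db, lambda s: fuzzy_and(s["q1"], fuzzy_not(s["q2"])))
```

With these values, `q1 AND q2` ranks `clip2` first because it is the only clip reasonably close to both examples, whereas `q1 AND NOT q2` ranks `clip1` first because it resembles the first example and not the second.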
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Liu, M., Wan, C., Wang, L. (2004). A Fuzzy Logic Approach for Content-Based Audio Classification and Boolean Retrieval. In: Loia, V., Nikravesh, M., Zadeh, L.A. (eds) Fuzzy Logic and the Internet. Studies in Fuzziness and Soft Computing, vol 137. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39988-9_7
DOI: https://doi.org/10.1007/978-3-540-39988-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05770-0
Online ISBN: 978-3-540-39988-9
eBook Packages: Springer Book Archive