A Fuzzy Logic Approach for Content-Based Audio Classification and Boolean Retrieval

Liu, Mingchun; Wan, Chunru; Wang, Lipo

doi:10.1007/978-3-540-39988-9_7

Part of the book series: Studies in Fuzziness and Soft Computing ((STUDFUZZ,volume 137))

Summary

Since the invention of fuzzy sets and the maturing of fuzzy logic theory, fuzzy logic systems have been widely applied in fields such as fuzzy control and data mining. New application areas for fuzzy logic are also being explored as other technologies emerge. One booming technology today is the Internet, with its fast-growing number of users and rich content. As large data storage and fast networks become available, multimedia content such as images, video, and audio is increasing rapidly. To search and index these media effectively, various content-based multimedia retrieval systems have been studied.

In this chapter, we introduce a fuzzy logic approach for hierarchical content-based audio classification and boolean retrieval. The approach is intuitive because human perception of audio is fuzzy in nature, especially for audio clips of mixed types. The fuzzy nature of audio search lies in the facts that (1) both the query and the target are approximations of the user’s memory and desire and (2) exact matching is sometimes impossible or impractical. Therefore, fuzzy logic systems are a natural choice for audio classification and retrieval.

The fuzzy tree classifier is the core of the hierarchical content-based audio classification. First, audio features are extracted from the audio samples in the database. Suitable features are then selected and used as inputs to a constructed fuzzy inference system (FIS). The outputs of the FIS are two types of hierarchical audio classes. The membership functions and rules are derived from the distributions of the audio features. At the first level of the hierarchy, the FIS discriminates speech and music from other sounds. At the second level, music and speech are separated. One particular sound, the telephone ring, is also recognized at this level. In the prototype system, classification down to the fourth level has been explored. Hence, multiple FISs can be combined to form a ‘fuzzy tree’ for retrieving different types of audio clips. With this approach, we can classify and retrieve generic audio using fewer features and less computation time than existing approaches.
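
The tree of FISs described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not the chapter's implementation: the feature names (`harmonicity`, `zcr_variance`), the linear ramp membership functions, their breakpoints, and the class labels are all hypothetical, and the first level is assumed to separate speech and music from other sounds. Each node fires its rules with min over antecedents and max over rules that share a label, then passes the clip to the child FIS of the winning class.

```python
from typing import Callable, Dict, List, Optional, Tuple

Membership = Callable[[float], float]
# A rule pairs a list of (feature name, membership function) with a class label.
Rule = Tuple[List[Tuple[str, Membership]], str]


def ramp_up(lo: float, hi: float) -> Membership:
    """Membership rising linearly from 0 at `lo` to 1 at `hi` (hypothetical shape)."""
    def mf(x: float) -> float:
        return min(1.0, max(0.0, (x - lo) / (hi - lo)))
    return mf


def ramp_down(lo: float, hi: float) -> Membership:
    """Complement of ramp_up: 1 below `lo`, falling to 0 at `hi`."""
    up = ramp_up(lo, hi)
    return lambda x: 1.0 - up(x)


class FuzzyNode:
    """One FIS in the tree; `children` maps a class label to the next-level FIS."""

    def __init__(self, rules: List[Rule],
                 children: Optional[Dict[str, "FuzzyNode"]] = None) -> None:
        self.rules = rules
        self.children = children or {}

    def scores(self, features: Dict[str, float]) -> Dict[str, float]:
        out: Dict[str, float] = {}
        for antecedents, label in self.rules:
            # Rule strength: min over antecedent memberships (fuzzy AND).
            strength = min(mf(features[name]) for name, mf in antecedents)
            # Rules with the same label are combined with max (fuzzy OR).
            out[label] = max(out.get(label, 0.0), strength)
        return out

    def classify(self, features: Dict[str, float]) -> List[Tuple[str, float]]:
        """Return the (label, score) path from this node down the tree."""
        scores = self.scores(features)
        label = max(scores, key=scores.get)
        path = [(label, scores[label])]
        child = self.children.get(label)
        if child is not None:
            path += child.classify(features)
        return path


# Hypothetical second level: speech vs. music.
level2 = FuzzyNode(rules=[
    ([("zcr_variance", ramp_up(0.2, 0.6))], "speech"),
    ([("zcr_variance", ramp_down(0.2, 0.6))], "music"),
])
# Hypothetical first level: speech/music vs. other sounds.
level1 = FuzzyNode(rules=[
    ([("harmonicity", ramp_up(0.3, 0.7))], "speech_or_music"),
    ([("harmonicity", ramp_down(0.3, 0.7))], "other"),
], children={"speech_or_music": level2})

path = level1.classify({"harmonicity": 0.9, "zcr_variance": 0.8})
other_path = level1.classify({"harmonicity": 0.1, "zcr_variance": 0.5})
```

In a real system the membership functions would be fitted to the observed feature distributions, as the summary describes, rather than set by hand as they are here.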

As for retrieval, existing content-based audio retrieval systems usually adopt the query-by-example mechanism to search for desired audio files. However, a single audio sample often cannot express the user’s needs adequately. To overcome this problem, additional audio files can be used as queries, either provided by the user or chosen through feedback during searching. Accordingly, we present a scheme to handle content-based audio retrieval with multiple queries. The queries are linked by boolean operators, so retrieval can be treated as a boolean search problem. We build a framework that implements the three basic boolean operators, AND, OR, and NOT, using concepts adopted from fuzzy logic. Experiments have shown that boolean search can be helpful in audio retrieval.
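
One common way to realize AND, OR, and NOT with fuzzy concepts is the standard Zadeh operators (min, max, and complement) applied to per-query similarity scores. The sketch below assumes that choice; the summary does not state which operators the chapter actually uses. The distance-to-similarity mapping, the two-dimensional feature vectors, and the clip names are likewise hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = Tuple[float, float]


def similarity(a: Vector, b: Vector) -> float:
    """Map Euclidean distance to a similarity in (0, 1]; an assumed mapping."""
    return 1.0 / (1.0 + math.dist(a, b))


def fuzzy_and(*scores: float) -> float:
    return min(scores)       # Zadeh intersection


def fuzzy_or(*scores: float) -> float:
    return max(scores)       # Zadeh union


def fuzzy_not(score: float) -> float:
    return 1.0 - score       # Zadeh complement


# Hypothetical database of clips, each reduced to a 2-D feature vector.
database: Dict[str, Vector] = {
    "clip_a": (0.1, 0.2),
    "clip_b": (0.9, 0.8),
    "clip_c": (0.5, 0.5),
}
q1: Vector = (0.0, 0.0)  # first example query
q2: Vector = (1.0, 1.0)  # second example query


def rank(score_fn: Callable[[Vector], float]) -> List[str]:
    """Order clip names by descending combined score."""
    scored = {name: score_fn(vec) for name, vec in database.items()}
    return sorted(scored, key=scored.get, reverse=True)


# "q1 AND q2": clips similar to both queries rank first.
near_both = rank(lambda v: fuzzy_and(similarity(v, q1), similarity(v, q2)))
# "q1 OR q2": clips similar to either query rank first.
near_either = rank(lambda v: fuzzy_or(similarity(v, q1), similarity(v, q2)))
# "q1 AND NOT q2": clips similar to q1 but dissimilar to q2 rank first.
q1_not_q2 = rank(
    lambda v: fuzzy_and(similarity(v, q1), fuzzy_not(similarity(v, q2)))
)
```

Because every combined score stays in [0, 1], compound expressions can be nested and the final ranking is still a plain sort on one number per clip.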

Author information

Authors and Affiliations

School of Electrical and Electronic Engineering, Nanyang Technological University, Block S2, 50 Nanyang Avenue, 639798, Singapore
Mingchun Liu, Chunru Wan & Lipo Wang

Editor information

Editors and Affiliations

Dipto. Matematica e Informatica, Universita di Salerno, Via S. Allende, 84081, Baronissi, Italy
Vincenzo Loia
Dept. Electrical Engineering and Computer Science — EECS, University of California, 94720, Berkeley, CA, USA
Masoud Nikravesh & Lotfi A. Zadeh

About this chapter

Cite this chapter

Liu, M., Wan, C., Wang, L. (2004). A Fuzzy Logic Approach for Content-Based Audio Classification and Boolean Retrieval. In: Loia, V., Nikravesh, M., Zadeh, L.A. (eds) Fuzzy Logic and the Internet. Studies in Fuzziness and Soft Computing, vol 137. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39988-9_7

DOI: https://doi.org/10.1007/978-3-540-39988-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05770-0
Online ISBN: 978-3-540-39988-9
eBook Packages: Springer Book Archive
