Audio segmentation via the similarity measure of audio feature vectors

Gang, Chen; Hui, Tan; Xin-meng, Chen

doi:10.1007/BF02832422

Published: September 2005

Volume 10, pages 833–837 (2005)

Wuhan University Journal of Natural Sciences

Chen Gang¹,
Tan Hui¹ &
Chen Xin-meng¹

Abstract

A formula for computing the similarity between two audio feature vectors is proposed; it maps any pair of vectors of equal dimension to a value in [0, 1). To segment an audio clip, a self-similarity matrix is computed to reveal the clip's inner structure. Because the final result must be consistent with subjective evaluation and adaptable to particular applications, a set of weights is adopted, which can be modified through relevance feedback techniques. Experiments show that the proposed algorithm achieves satisfactory results.
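The abstract does not give the similarity formula itself, so the following is only a minimal sketch of the general idea, under stated assumptions. As a stand-in measure it uses a weighted Euclidean distance d squashed to d / (1 + d), which lies in [0, 1) and is 0 for identical vectors; this is not the paper's formula. The function names, the sample frames, the weights, and the threshold are all hypothetical, and the boundary rule (compare each frame with its predecessor) is a simplification chosen for illustration.

```python
import math


def squashed_distance(u, v, weights):
    """Stand-in measure (an assumption, not the paper's formula):
    weighted Euclidean distance d mapped to d / (1 + d), in [0, 1)."""
    if not (len(u) == len(v) == len(weights)):
        raise ValueError("vectors and weights must have the same dimension")
    d = math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v)))
    return d / (1.0 + d)


def self_similarity_matrix(frames, weights):
    """Compare every frame's feature vector with every other frame's."""
    n = len(frames)
    return [
        [squashed_distance(frames[i], frames[j], weights) for j in range(n)]
        for i in range(n)
    ]


def boundaries(matrix, threshold):
    """Indices i where frame i differs from frame i - 1 by more than threshold."""
    return [i for i in range(1, len(matrix)) if matrix[i][i - 1] > threshold]


# Hypothetical two-dimensional feature vectors for four consecutive frames.
frames = [[0.1, 0.2], [0.1, 0.25], [0.9, 0.8], [0.95, 0.8]]
# Per-feature weights; in the paper these would be tuned by relevance feedback.
weights = [1.0, 1.0]

S = self_similarity_matrix(frames, weights)
cuts = boundaries(S, threshold=0.3)
print(cuts)
```

With these sample frames the only large jump is between the second and third frame, so a single boundary is reported there. Raising a feature's weight makes changes in that feature count more toward a boundary, which is the kind of knob a relevance feedback loop could adjust.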

Author information

Authors and Affiliations

School of Computer, Wuhan University, 430072, Wuhan, Hubei, China
Chen Gang, Tan Hui & Chen Xin-meng

Corresponding authors

Correspondence to Chen Gang or Chen Xin-meng.

Additional information

Foundation item: Supported by the National Natural Science Foundation of China (10371033)

Biography: CHEN Gang (1970-), male, Ph.D. candidate, research interests: audio analysis and pattern recognition.

About this article

Cite this article

Gang, C., Hui, T. & Xin-meng, C. Audio segmentation via the similarity measure of audio feature vectors. Wuhan Univ. J. Nat. Sci. 10, 833–837 (2005). https://doi.org/10.1007/BF02832422

Received: 20 October 2004
Issue Date: September 2005
DOI: https://doi.org/10.1007/BF02832422

Key words

CLC number

TP 391.42
