Audio Segmentation for Speech Recognition Using Segment Features

  • Gayatri M. Bhandari
  • Rameshwar S. Kawitkar
  • Madhuri P. Borawake
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 249)


The amount of audio available in databases on the Internet today is immense. Even search systems that handle multimedia content, such as AltaVista and Lycos, only allow queries based on the multimedia filename, nearby text on the web page containing the file, and metadata embedded in the file, such as title and author. Such queries yield useful results only if the metadata provided by the distributor is extensive. Producing this metadata is a tedious manual task, so automatic means of creating it are needed. This paper proposes an algorithm that segments the given audio and extracts features such as MFCC, SF, SNR, and ZCR, and presents experimental results for the algorithm.
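As a rough illustration of one of the features named above, the sketch below computes the zero-crossing rate (ZCR) per frame and marks the largest frame-to-frame jump as a candidate segment boundary. It is not the algorithm proposed in the paper, which the abstract does not describe; the helper names (`frame_signal`, `zero_crossing_rate`), the frame length, the sampling rate, and the synthetic two-tone test signal are all assumptions made for this example.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a sequence of samples into fixed-length frames (hypothetical helper)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Assumed test signal: 0.1 s of a 100 Hz tone followed by 0.1 s of a 2000 Hz tone.
sr = 8000
low = [math.sin(2 * math.pi * 100 * n / sr) for n in range(sr // 10)]
high = [math.sin(2 * math.pi * 2000 * n / sr) for n in range(sr // 10)]
signal = low + high

# Non-overlapping 25 ms frames (200 samples at 8 kHz).
frames = frame_signal(signal, frame_len=200, hop=200)
zcr = [zero_crossing_rate(f) for f in frames]

# Candidate boundary: the frame index with the largest change in ZCR
# relative to the previous frame.
boundary = max(range(1, len(zcr)), key=lambda i: abs(zcr[i] - zcr[i - 1]))
```

On this synthetic input the ZCR stays low over the 100 Hz frames and rises sharply over the 2000 Hz frames, so the largest jump lands on the frame where the tone changes. Real recordings would typically need several features combined and some smoothing before boundaries become this clear.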


Keywords: Audio segmentation, Feature extraction, MFCC, LPC, SNR, ZCR





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Gayatri M. Bhandari
    • 1
  • Rameshwar S. Kawitkar
    • 2
  • Madhuri P. Borawake
    • 3
  1. JSPM’s Bhivarabai Sawant Institute of Tech. & Research (W), J.J.T. University, Pune, India
  2. Sinhgad Institute of Technology, Pune, India
  3. College of Engg., J.J.T. University and PDEA, Pune, India
