Repetition Detection in Stuttered Speech

  • Pravin B. Ramteke
  • Shashidhar G. Koolagudi
  • Fathima Afroz
Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 43)

Abstract

This paper focuses on the detection of repetitions in stuttered speech. The stuttered speech signal is divided into isolated units based on energy. Mel-frequency cepstral coefficients (MFCCs), formants and shimmer are extracted from each isolated unit and used as features for repetition recognition. Using Dynamic Time Warping (DTW), the features of each isolated unit are compared with those of the subsequent units within a one-second interval of speech. A threshold is set based on analysis of the DTW scores; if the score for a pair of units is below this threshold, the units are identified as a repeated event. The speech data used in this work is 27 seconds long and contains 50 repetition events. The results show that the combination of MFCCs, formants and shimmer can be used to recognise repetitions in stuttered speech: out of 50 repetitions, 47 are correctly identified.
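The comparison step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dtw_distance` and `find_repetitions`, the Euclidean frame distance, the normalisation by sequence length, and the reading of "within a one-second interval" as a limit on the difference between unit start times are all assumptions made for illustration. Feature extraction (MFCCs, formants, shimmer) and energy-based segmentation are assumed to have been done already, so each unit arrives as a start time plus a frames-by-dimensions feature matrix.

```python
import numpy as np


def dtw_distance(a, b):
    """DTW cost between feature sequences a (n x d) and b (m x d).

    The cost is divided by n + m (an assumed normalisation) so that
    units of different lengths give comparable scores.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between one frame of each unit (assumed metric)
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return cost[n, m] / (n + m)


def find_repetitions(units, threshold, window=1.0):
    """Flag pairs of units whose DTW score falls below `threshold`.

    units: list of (start_time_seconds, features), sorted by start time.
    window: maximum start-time gap in seconds (assumed interpretation of
            the one-second interval in the abstract).
    Returns a list of index pairs (i, j) with i < j.
    """
    pairs = []
    for i, (t_i, f_i) in enumerate(units):
        for j in range(i + 1, len(units)):
            t_j, f_j = units[j]
            if t_j - t_i > window:
                break  # later units start even further away
            if dtw_distance(f_i, f_j) < threshold:
                pairs.append((i, j))
    return pairs
```

In practice the threshold would be chosen from the observed distribution of DTW scores, as the abstract describes; any specific value passed to `find_repetitions` here is hypothetical.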

Keywords

MFCCs, Formants, Shimmer, Jitter, Dynamic time warping

Copyright information

© Springer India 2016

Authors and Affiliations

  • Pravin B. Ramteke (1)
  • Shashidhar G. Koolagudi (1)
  • Fathima Afroz (1)
  1. National Institute of Technology Karnataka, Surathkal, India
