Filled Pause Detection in Indonesian Spontaneous Speech

  • Conference paper
Computational Linguistics (PACLING 2015)

Abstract

Detecting filled pauses is important for spontaneous speech recognition, since most speech is spontaneous and filled pauses are the most frequent phenomenon in Indonesian spontaneous speech. This paper discusses the detection of filled pauses in Indonesian spontaneous speech using acoustic features of the speech signal. Detection was performed with three statistical classifiers: Naïve Bayes, Classification Tree, and Multilayer Perceptron. To build the model, speech data were collected from an entertainment program. Word parts in the data were labeled and their features were extracted: formant and pitch stability, energy drop, and duration. Half an hour of sentences containing 295 filled-pause and 2082 non-filled-pause words was used as training data. With 25 sentences as testing data, Naïve Bayes gave the best detection correctness: 74.35% on a closed data set and 71.43% on an open data set.
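As a rough illustration of the classification step described in the abstract, the following minimal sketch trains a Gaussian Naïve Bayes model on four acoustic features per word part. It is not the authors' implementation: the abstract does not say how the features are computed or which Naïve Bayes variant was used, so the feature order, the labels "FP" and "NFP", the helper names `fit_gaussian_nb` and `predict`, and all numeric values are assumptions made up for this example.

```python
import math
from collections import defaultdict

# Assumed feature order (hypothetical, not taken from the paper):
# [formant_stability, pitch_stability, energy_drop, duration_s]

def fit_gaussian_nb(X, y):
    """Estimate a prior, per-feature means, and per-feature variances for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log posterior under the Gaussian model."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data, invented for illustration only.
X_train = [
    [0.90, 0.85, 0.10, 0.45],  # filled pause: stable formants and pitch, longer
    [0.88, 0.90, 0.15, 0.50],
    [0.92, 0.80, 0.12, 0.40],
    [0.40, 0.35, 0.60, 0.20],  # non-filled pause
    [0.35, 0.45, 0.55, 0.25],
    [0.45, 0.30, 0.65, 0.18],
]
y_train = ["FP", "FP", "FP", "NFP", "NFP", "NFP"]

model = fit_gaussian_nb(X_train, y_train)
print(predict(model, [0.87, 0.82, 0.14, 0.42]))
print(predict(model, [0.38, 0.40, 0.58, 0.22]))
```

With real data the class priors would be far from balanced: the training set in the abstract contains 295 filled-pause words against 2082 non-filled-pause words, so the prior term in `predict` would weigh noticeably against the filled-pause class.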

Notes

  1. http://www.youtube.com/user/TEDxTalks.

Author information

Corresponding author

Correspondence to Auliya Sani.

Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Sani, A., Lestari, D.P., Purwarianti, A. (2016). Filled Pause Detection in Indonesian Spontaneous Speech. In: Hasida, K., Purwarianti, A. (eds) Computational Linguistics. PACLING 2015. Communications in Computer and Information Science, vol 593. Springer, Singapore. https://doi.org/10.1007/978-981-10-0515-2_4

  • DOI: https://doi.org/10.1007/978-981-10-0515-2_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-0514-5

  • Online ISBN: 978-981-10-0515-2

  • eBook Packages: Computer Science, Computer Science (R0)
