A method for constructing Korean spontaneous spoken language corpus based on an imitation of abbreviated and transformed particles

International Journal of Speech Technology

Abstract

In this paper, we propose a method of constructing a language corpus by imitating the abbreviated and transformed particles that are a distinctive feature of Korean spontaneous spoken language. Since it is not practical to train a spoken-style model on large numbers of spoken transcripts, the proposed approach generates spoken-style text from written-style text such as newspapers, based on how the pronunciation of typical particles varies with spoken style. The method rests on a statistical analysis of particles that play the same function in both written and spoken language. We analyze the grammatical functions and pronunciation features of the particles that distinguish written from spoken language, and select for imitation the abbreviated and transformed particles whose pronunciation features are typical of spoken language. We then replace particles in written-style text with these abbreviated and transformed particles according to the correspondence between written and spoken particles, which yields spoken-style text. A language model trained on the resulting spoken-style text significantly improved the word error rate (WER) on spontaneous speech.
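
The replacement step described in the abstract can be illustrated with a minimal sketch. It assumes the written-style text is already split into (surface, tag) pairs, with a hypothetical tag "P" marking particles, and that each written particle maps to spoken variants with a relative frequency. The table PARTICLE_MAP, its probabilities, and the function name to_spoken_style are illustrative assumptions and are not taken from the paper. Only whole-particle substitutions are shown; abbreviated forms that merge into the preceding syllable would need syllable-level handling, which is left out here.

```python
import random

# Hypothetical correspondence table: written-style particle ->
# list of (spoken-style variant, probability). The entries and the
# probabilities are illustrative only, not values from the paper.
PARTICLE_MAP = {
    "에게": [("한테", 0.7), ("에게", 0.3)],
    "와": [("하고", 0.5), ("랑", 0.3), ("와", 0.2)],
    "과": [("하고", 0.6), ("이랑", 0.2), ("과", 0.2)],
}


def to_spoken_style(tokens, rng):
    """Replace written-style particles with sampled spoken-style variants.

    tokens: list of (surface, tag) pairs; the tag "P" (an assumed label)
            marks a particle.
    rng:    a random.Random instance, so that runs are reproducible.
    """
    out = []
    for surface, tag in tokens:
        if tag == "P" and surface in PARTICLE_MAP:
            variants, weights = zip(*PARTICLE_MAP[surface])
            # Sample one variant according to its assumed frequency.
            surface = rng.choices(variants, weights=weights, k=1)[0]
        out.append((surface, tag))
    return out


if __name__ == "__main__":
    # "친구에게 책을 주었다" pre-segmented by hand for the example.
    sentence = [("친구", "N"), ("에게", "P"), ("책", "N"),
                ("을", "P"), ("주었다", "V")]
    spoken = to_spoken_style(sentence, random.Random(0))
    print("".join(surface for surface, _ in spoken))
```

Sampling rather than always substituting keeps some written forms in the output, which is one plausible way to reflect a statistical correspondence; a deterministic mapping would be the simpler alternative.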

Acknowledgements

We appreciate the helpful discussions with Dr. Kim and Prof. Ri, and we thank the anonymous reviewers and editors for their many invaluable comments and suggestions for improving this paper.

Author information

Corresponding author

Correspondence to Hyok-Chol Ri.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ri, HC., Kim, C. & Jo, MR. A method for constructing Korean spontaneous spoken language corpus based on an imitation of abbreviated and transformed particles. Int J Speech Technol 25, 205–210 (2022). https://doi.org/10.1007/s10772-021-09937-6

  • DOI: https://doi.org/10.1007/s10772-021-09937-6
