Chinese Utterance Segmentation in Spoken Language Translation

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2588)

Abstract

This paper presents an approach to segmenting Chinese utterances for a spoken language translation (SLT) system in which Chinese speech is the source input. We propose this approach as a supplement to the function of sentence boundary detection in speech recognition, in order to identify the boundaries of simple sentences and fixed expressions within the speech recognition results of a Chinese input utterance. In this approach, the plausible boundaries of split units are determined using several methods, including keyword detection, pattern matching, and syntactic analysis. Preliminary experimental results have shown that this approach is helpful in improving the performance of SLT systems.
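
As an illustration only, the keyword-detection and pattern-matching steps named in the abstract might be sketched as follows. This is not the authors' implementation: the cue lists, the function names, and the assumption that the recognizer output arrives as a list of already word-segmented tokens are all hypothetical, and the syntactic-analysis step is omitted entirely.

```python
from typing import List

# Hypothetical cue lists, chosen only for illustration (not from the paper).
FIXED_EXPRESSIONS = [["你", "好"], ["谢谢"], ["对不起"], ["没", "关系"]]
BOUNDARY_KEYWORDS = {"但是", "所以", "然后", "请问"}


def find_boundaries(tokens: List[str]) -> List[int]:
    """Return sorted token indices at which a new split unit starts."""
    cuts = set()
    i = 0
    while i < len(tokens):
        matched = False
        # Pattern matching: a fixed expression is treated as its own unit,
        # so a cut is placed both before and after it.
        for expr in FIXED_EXPRESSIONS:
            if tokens[i:i + len(expr)] == expr:
                cuts.add(i)
                cuts.add(i + len(expr))
                i += len(expr)
                matched = True
                break
        if not matched:
            # Keyword detection: a cue word opens a new unit.
            if tokens[i] in BOUNDARY_KEYWORDS:
                cuts.add(i)
            i += 1
    # Cuts at the very start or end of the utterance carry no information.
    return sorted(c for c in cuts if 0 < c < len(tokens))


def segment(tokens: List[str]) -> List[List[str]]:
    """Split a token sequence into units at the detected boundaries."""
    if not tokens:
        return []
    bounds = [0] + find_boundaries(tokens) + [len(tokens)]
    return [tokens[a:b] for a, b in zip(bounds, bounds[1:])]


utterance = ["你", "好", "我", "想", "订", "房间",
             "但是", "我", "没有", "预订", "谢谢"]
units = segment(utterance)
```

In this sketch, pattern matching takes priority over keyword detection at each position. A fuller system along the lines the abstract describes would presumably also use syntactic analysis to confirm or reject candidate boundaries.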

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zong, C., Ren, F. (2003). Chinese Utterance Segmentation in Spoken Language Translation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2003. Lecture Notes in Computer Science, vol 2588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36456-0_55

  • DOI: https://doi.org/10.1007/3-540-36456-0_55

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00532-2

  • Online ISBN: 978-3-540-36456-6

  • eBook Packages: Springer Book Archive
