Chinese Utterance Segmentation in Spoken Language Translation

Zong, Chengqing; Ren, Fuji

doi:10.1007/3-540-36456-0_55

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2588)

Included in the following conference series: International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

This paper presents an approach to segmenting Chinese utterances for a spoken language translation (SLT) system in which Chinese speech is the source input. We propose this approach as a supplement to the function of sentence boundary detection in speech recognition, in order to identify the boundaries of simple sentences and fixed expressions within the speech recognition results of a Chinese input utterance. In this approach, the plausible boundaries of split units are determined using several methods, including keyword detection, pattern matching, and syntactic analysis. Preliminary experimental results have shown that this approach is helpful in improving the performance of SLT systems.
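
As a rough illustration of the keyword-detection step mentioned in the abstract, the following minimal sketch splits a word-segmented recognition result into candidate units. It is not the authors' implementation: the function name `split_utterance`, the two lexicons, and the rule that a keyword opens a new unit are all assumptions made for this example, and the pattern-matching and syntactic-analysis stages are omitted.

```python
from typing import List

# Hypothetical lexicons for illustration only; the paper's actual
# keyword and fixed-expression lists are not given on this page.
FIXED_EXPRESSIONS = {"你好", "谢谢", "再见"}  # treated as stand-alone units
BOUNDARY_KEYWORDS = {"但是", "所以", "然后"}  # assumed to open a new unit


def split_utterance(tokens: List[str]) -> List[List[str]]:
    """Split a word-segmented utterance into candidate translation units."""
    units: List[List[str]] = []
    current: List[str] = []
    for tok in tokens:
        if tok in FIXED_EXPRESSIONS:
            # Close the unit in progress, then emit the fixed expression alone.
            if current:
                units.append(current)
                current = []
            units.append([tok])
        elif tok in BOUNDARY_KEYWORDS:
            # Close the unit in progress; the keyword starts the next one.
            if current:
                units.append(current)
            current = [tok]
        else:
            current.append(tok)
    if current:
        units.append(current)
    return units


if __name__ == "__main__":
    demo = ["你好", "我", "想", "订", "一个", "房间", "但是", "我", "没有", "预订"]
    for unit in split_utterance(demo):
        print(" ".join(unit))
```

With the demo input, the greeting becomes its own unit and the remainder is split before the conjunction. A real system would presumably need to confirm such candidate boundaries with further evidence, since a keyword match alone can be ambiguous.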

Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China
Chengqing Zong
Department of Information Science and Intelligent Systems, Faculty of Engineering, The University of Tokushima, 2-1, Minamijosanjima, 770-8506, Tokushima, Japan
Fuji Ren

Editor information

Editors and Affiliations

Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Col. Zacatenco, CP 07738, Mexico D.F., Mexico
Alexander Gelbukh

About this paper

Cite this paper

Zong, C., Ren, F. (2003). Chinese Utterance Segmentation in Spoken Language Translation. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2003. Lecture Notes in Computer Science, vol 2588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36456-0_55

DOI: https://doi.org/10.1007/3-540-36456-0_55
Published: 30 April 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00532-2
Online ISBN: 978-3-540-36456-6
eBook Packages: Springer Book Archive
