Improving Translation Quality by Manipulating Sentence Length

  • Laurie Gerber
  • Eduard Hovy
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1529)


Translation systems tend to have more trouble with long sentences than with short ones, for a variety of reasons. When the source and target languages differ markedly, as Japanese and English do, this problem is reflected in lower-quality output. To improve readability, we experimented with automatically splitting long sentences into shorter ones. This paper outlines the problem, describes the sentence-splitting procedure and rules, and provides an evaluation of the results.





Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Laurie Gerber (1)
  • Eduard Hovy (2)
  1. SYSTRAN Systems Inc, La Jolla
  2. Information Sciences Institute, Marina del Rey
