A Multilingual Procedure for Dictionary-Based Sentence Alignment

  • Conference paper
Machine Translation and the Information Soup (AMTA 1998)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1529)

Abstract

This paper describes a sentence alignment technique based on a machine-readable dictionary. Alignment takes place in a single pass through the text, based on the scores of matches between pairs of source and target sentences. Pairings consisting of sets of matches are evaluated using a version of the Gale-Shapley solution to the stable marriage problem. An algorithm is described which can handle N-to-1 (or 1-to-N) matches, for N ≥ 0, i.e., deletions, 1-to-1 matches (including scrambling), and 1-to-many matches. A simple frequency-based method for acquiring supplemental dictionary entries is also discussed. We achieve high-quality alignments using available bilingual dictionaries, both for closely related language pairs (Spanish/English) and more distantly related pairs (Japanese/English).
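
As a rough illustration of the stable marriage idea mentioned in the abstract, the sketch below applies the textbook Gale-Shapley procedure to a matrix of sentence match scores. It is not the authors' algorithm: the function name `stable_pairing`, the score matrix, and the restriction to a square matrix with 1-to-1 pairings are all assumptions made for this example, and the paper's version additionally handles deletions and 1-to-many matches.

```python
from typing import Dict, List, Sequence


def stable_pairing(scores: Sequence[Sequence[float]]) -> Dict[int, int]:
    """Return a stable 1-to-1 pairing {source_index: target_index}.

    scores[i][j] is a hypothetical match score between source sentence i
    and target sentence j (higher is better). Assumes a square matrix.
    """
    n = len(scores)
    # Each source sentence ranks the target sentences from best to worst score.
    prefs: List[List[int]] = [
        sorted(range(n), key=lambda j, i=i: -scores[i][j]) for i in range(n)
    ]
    next_choice = [0] * n        # position in prefs[i] of i's next proposal
    holder: Dict[int, int] = {}  # target index -> source currently paired with it
    free = list(range(n))        # source sentences still without a partner

    while free:
        i = free.pop()
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        if j not in holder:
            # Target j is unpaired, so it accepts source i.
            holder[j] = i
        elif scores[i][j] > scores[holder[j]][j]:
            # Target j prefers i; its previous partner becomes free again.
            free.append(holder[j])
            holder[j] = i
        else:
            # Target j keeps its current partner; i will try its next choice.
            free.append(i)
    return {i: j for j, i in holder.items()}
```

For example, with scores `[[0.9, 0.1, 0.2], [0.2, 0.3, 0.8], [0.1, 0.7, 0.3]]` the second and third source sentences are paired with the third and second target sentences respectively, which is the kind of scrambled 1-to-1 match the abstract refers to.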




© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Meyers, A., Kosaka, M., Grishman, R. (1998). A Multilingual Procedure for Dictionary-Based Sentence Alignment. In: Farwell, D., Gerber, L., Hovy, E. (eds) Machine Translation and the Information Soup. AMTA 1998. Lecture Notes in Computer Science, vol 1529. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49478-2_18

  • DOI: https://doi.org/10.1007/3-540-49478-2_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65259-5

  • Online ISBN: 978-3-540-49478-2

  • eBook Packages: Springer Book Archive
