Abstract
This paper describes a sentence alignment technique based on a machine-readable dictionary. Alignment takes place in a single pass through the text, based on the scores of matches between pairs of source and target sentences. Pairings consisting of sets of matches are evaluated using a version of the Gale-Shapley solution to the stable marriage problem. An algorithm is described which can handle N-to-1 (or 1-to-N) matches for N ≥ 0, i.e., deletions (N = 0), 1-to-1 matches (including scrambling), and 1-to-many matches. A simple frequency-based method for acquiring supplemental dictionary entries is also discussed. We achieve high-quality alignments using available bilingual dictionaries, both for closely related language pairs (Spanish/English) and more distantly related pairs (Japanese/English).
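The pairing step can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring function `match_score`, the toy dictionary, and the restriction to 1-to-1 pairs are all assumptions made for illustration, and the abstract does not specify how match scores are computed. The sketch only shows how textbook Gale-Shapley can turn a matrix of sentence match scores into a stable set of pairs, including crossed (scrambled) ones.

```python
from typing import Dict, List, Optional, Set


def match_score(src: List[str], tgt: List[str],
                dictionary: Dict[str, Set[str]]) -> float:
    """Hypothetical score: fraction of source words that have a
    dictionary translation present in the target sentence."""
    if not src:
        return 0.0
    tgt_words = set(tgt)
    hits = sum(1 for w in src if dictionary.get(w, set()) & tgt_words)
    return hits / len(src)


def stable_pairs(scores: List[List[float]]) -> Dict[int, int]:
    """Textbook Gale-Shapley over a score matrix (1-to-1 only).
    Source sentences propose to targets in decreasing score order;
    each target keeps the best-scoring proposer seen so far.
    Returns a {source_index: target_index} mapping."""
    n_src = len(scores)
    n_tgt = len(scores[0]) if scores else 0
    # Preference list of each source: targets by descending score.
    prefs = [sorted(range(n_tgt), key=lambda j, i=i: -scores[i][j])
             for i in range(n_src)]
    next_choice = [0] * n_src                      # next target to try
    holder: List[Optional[int]] = [None] * n_tgt   # source held by target
    free = list(range(n_src))
    while free:
        i = free.pop()
        if next_choice[i] >= n_tgt:
            continue          # no targets left: source stays unmatched
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        current = holder[j]
        if current is None:
            holder[j] = i
        elif scores[i][j] > scores[current][j]:
            holder[j] = i
            free.append(current)   # displaced source proposes again
        else:
            free.append(i)         # rejected: try the next target
    return {i: j for j, i in enumerate(holder) if i is not None}


# Toy Spanish/English data (assumed, not from the paper).
toy_dictionary = {"casa": {"house"}, "perro": {"dog"}, "rojo": {"red"}}
source = [["el", "perro", "rojo"], ["la", "casa"]]
target = [["the", "house"], ["the", "red", "dog"]]
scores = [[match_score(s, t, toy_dictionary) for t in target]
          for s in source]
pairs = stable_pairs(scores)
```

In this toy run the two sentences appear in opposite order in the two texts, so the stable pairing is crossed. A source sentence that exhausts its preference list stays unmatched, which loosely corresponds to a deletion; handling 1-to-many matches would need an extension that this sketch does not attempt.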
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meyers, A., Kosaka, M., Grishman, R. (1998). A Multilingual Procedure for Dictionary-Based Sentence Alignment. In: Farwell, D., Gerber, L., Hovy, E. (eds) Machine Translation and the Information Soup. AMTA 1998. Lecture Notes in Computer Science, vol 1529. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49478-2_18
DOI: https://doi.org/10.1007/3-540-49478-2_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65259-5
Online ISBN: 978-3-540-49478-2
eBook Packages: Springer Book Archive