Chinese Spoken Language Processing
Volume 4274 of the series Lecture Notes in Computer Science, p. 15
Challenges in Machine Translation
- Franz Josef Och, affiliated with Google Research
Abstract
In recent years there has been an enormous boom in MT research. Not only have the number of research groups in the field and the amount of funding increased, but there is now also optimism about the future of the field and about achieving even better quality. The major reason for this change has been a paradigm shift away from linguistic/rule-based methods towards empirical/data-driven methods in MT, made possible by the availability of large amounts of training data and large computational resources. This shift has fundamentally changed the way MT research is done, and the field faces new challenges. To achieve optimal MT quality, we want to train models on as much data as possible, ideally language models trained on hundreds of billions of words and translation models trained on hundreds of millions of words. Doing that requires very large computational resources, a corresponding software infrastructure, and a focus on systems building and engineering. In addition to discussing those challenges in MT research, the talk will also give specific examples of how some of the data challenges are being dealt with at Google Research.
- Title: Challenges in Machine Translation
- Book Title: Chinese Spoken Language Processing
- Book Subtitle: 5th International Symposium, ISCSLP 2006, Singapore, December 13-16, 2006. Proceedings
- Pages: 15
- Copyright: 2006
- DOI: 10.1007/11939993_3
- Print ISBN: 978-3-540-49665-6
- Online ISBN: 978-3-540-49666-3
- Series Title: Lecture Notes in Computer Science
- Series Volume: 4274
- Series ISSN: 0302-9743
- Publisher: Springer Berlin Heidelberg
- Copyright Holder: Springer-Verlag Berlin Heidelberg
- Editors
- Qiang Huo (18)
- Bin Ma (19)
- Eng-Siong Chng (20)
- Haizhou Li (21)
- Editor Affiliations
- 18. Department of Computer Science, The University of Hong Kong
- 19. Human Language Technology Department, Institute for Infocomm Research (I2R)
- 20. School of Computer Engineering, Nanyang Technological University (NTU)
- 21. Institute for Infocomm Research
- Authors
- Franz Josef Och (22)
- Author Affiliations
- 22. Google Research