Abstract
This paper presents a classifier based on a discriminative latent variable model (DPLVM) to improve translation error detection for statistical machine translation (SMT). The model uses latent variables to carry additional information that the original labels may not express, and to capture more complex dependencies between translation errors and their corresponding features, thereby improving classification performance. Specifically, we first detail the mathematical representation of the proposed DPLVM method and then introduce the features used, namely word posterior probabilities (WPP), linguistic features, and syntactic features. Finally, we compare the proposed method with MaxEnt and SVM classifiers to verify its effectiveness. Experimental results show that the proposed DPLVM-based classifier reduces the classification error rate (CER) by a relative 1.75%, 1.69%, and 2.61% compared to the MaxEnt classifier, and by a relative 0.17%, 0.91%, and 2.12% compared to the SVM classifier, over three different feature combinations.
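To make the reported metric concrete, the following is a minimal sketch of how a word-level classification error rate and a relative reduction between two classifiers could be computed. It assumes CER is the fraction of words whose predicted label differs from the reference label, which the abstract does not spell out. The function names, the "c"/"i" (correct/incorrect) label scheme, and the toy label sequences are illustrative inventions, not data or code from the paper.

```python
from typing import Sequence


def classification_error_rate(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of words whose predicted label differs from the gold label.

    Assumed definition: CER = (# wrongly labelled words) / (# words).
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if len(gold) == 0:
        raise ValueError("cannot compute CER on an empty sequence")
    wrong = sum(p != g for p, g in zip(predicted, gold))
    return wrong / len(gold)


def relative_reduction(baseline_cer: float, system_cer: float) -> float:
    """Relative CER reduction of a system over a baseline, in percent."""
    if baseline_cer <= 0:
        raise ValueError("baseline CER must be positive")
    return 100.0 * (baseline_cer - system_cer) / baseline_cer


# Toy word-level labels, invented for illustration only:
# "c" = word judged correct, "i" = word judged incorrect.
gold = ["c", "c", "i", "c", "i", "c", "c", "i", "c", "c"]
baseline_pred = ["c", "i", "i", "c", "c", "c", "c", "c", "i", "c"]
system_pred = ["c", "c", "i", "c", "c", "c", "c", "c", "i", "c"]

baseline_cer = classification_error_rate(baseline_pred, gold)
system_cer = classification_error_rate(system_pred, gold)
print(f"baseline CER: {baseline_cer:.2f}")
print(f"system CER:   {system_cer:.2f}")
print(f"relative reduction: {relative_reduction(baseline_cer, system_cer):.2f}%")
```

Under this reading, a "relative" reduction is measured against the baseline's own error rate rather than in absolute percentage points, so the percentages in the abstract say how much of the MaxEnt or SVM error the DPLVM classifier removes.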
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Du, J., Guo, J., Zhao, F. (2013). Discriminative Latent Variable Based Classifier for Translation Error Detection. In: Zhou, G., Li, J., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2013. Communications in Computer and Information Science, vol 400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41644-6_13
DOI: https://doi.org/10.1007/978-3-642-41644-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41643-9
Online ISBN: 978-3-642-41644-6
eBook Packages: Computer Science (R0)