Abstract
N-gram-based metrics are widely used in the automatic evaluation of machine translation. However, most of them are weakened by their strict policy for matching n-grams. In particular, exact matching treats synonyms as entirely different words and therefore yields unreasonable estimates. This paper introduces fuzzy matching for n-grams, which relies on a semantic similarity function based on WordNet. When incorporated into BLEU, the representative n-gram-based evaluation metric, fuzzy matching is used to find the match with the highest similarity. Because WordNet contributes more to high-order n-grams and fuzzy matching performs well even with fewer references, experiments on MTC Part 2 (LDC2003T17) show that the proposed method greatly improves the correlation between BLEU and human evaluation at both the segment level and the document level. The improvement is more significant at the document level.
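The idea of replacing exact n-gram matching with a highest-similarity match can be illustrated with a minimal sketch. It is not the authors' implementation: the toy table `TOY_SIM` stands in for a WordNet-based similarity function, and the names `word_sim`, `ngram_sim`, and `fuzzy_precision`, the averaging of word similarities within an n-gram, and the greedy one-to-one matching (used here in place of BLEU's count clipping) are all assumptions made for illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

# Hypothetical toy similarity table; a real system would query a
# WordNet-based similarity measure instead.
TOY_SIM: Dict[FrozenSet[str], float] = {
    frozenset({"car", "automobile"}): 1.0,
    frozenset({"quick", "fast"}): 0.8,
}


def word_sim(a: str, b: str) -> float:
    """Similarity in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset({a, b}), 0.0)


def exact_sim(a: str, b: str) -> float:
    """Exact matching: synonyms count as entirely different words."""
    return 1.0 if a == b else 0.0


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_sim(g: Tuple[str, ...], h: Tuple[str, ...],
              sim: Callable[[str, str], float]) -> float:
    """Assumed n-gram similarity: mean of position-wise word similarities."""
    return sum(sim(a, b) for a, b in zip(g, h)) / len(g)


def fuzzy_precision(candidate: Sequence[str], reference: Sequence[str],
                    n: int,
                    sim: Callable[[str, str], float] = word_sim) -> float:
    """Fuzzy n-gram precision against a single reference.

    Each candidate n-gram is paired with the most similar reference n-gram
    that has not been used yet, and the similarity is added as partial credit.
    """
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    if not cand or not ref:
        return 0.0
    used = [False] * len(ref)
    total = 0.0
    for g in cand:
        best_j, best = -1, 0.0
        for j, h in enumerate(ref):
            if used[j]:
                continue
            s = ngram_sim(g, h, sim)
            if s > best:
                best_j, best = j, s
        if best_j >= 0:
            used[best_j] = True  # each reference n-gram is matched at most once
            total += best
    return total / len(cand)


candidate = "the fast car".split()
reference = "the quick automobile".split()
print(fuzzy_precision(candidate, reference, 1))             # fuzzy unigram precision
print(fuzzy_precision(candidate, reference, 1, exact_sim))  # exact unigram precision
```

With this toy table, the candidate "the fast car" scores about 0.93 on unigrams against the reference "the quick automobile" under fuzzy matching, compared with about 0.33 under exact matching, because only "the" matches exactly.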
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, L., Gong, Z. (2013). Fuzzy Matching for N-Gram-Based MT Evaluation. In: Ji, D., Xiao, G. (eds) Chinese Lexical Semantics. CLSW 2012. Lecture Notes in Computer Science, vol 7717. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36337-5_5
DOI: https://doi.org/10.1007/978-3-642-36337-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36336-8
Online ISBN: 978-3-642-36337-5
eBook Packages: Computer Science, Computer Science (R0)