KSEM 2015: Knowledge Science, Engineering and Management, pp. 394–406
Improving Transfer Learning in Cross Lingual Opinion Analysis Through Negative Transfer Detection
Abstract
Transfer learning has been used as a machine learning method to make good use of available language resources for other resource-scarce languages. However, class noise accumulates over the iterations of transfer learning and can lead to negative transfer, which degrades performance as more training data is used. In this paper, we propose a novel transfer learning method that can detect negative transfer. After a certain number of iterations, the approach detects high-quality samples and uses them to identify class noise in newly transferred training samples, which are then removed to reduce misclassifications. Because it can detect and remove bad training samples, our method can make full use of the large amount of unlabeled data available in the target language. The most important contribution of this paper is the theory of class noise detection: our new class noise detection method overcomes the theoretical flaw of a previous method based on the Gaussian distribution. We apply this transfer learning method with negative transfer detection to cross-lingual opinion analysis. Evaluation on the NLP&CC 2013 cross-lingual opinion analysis dataset shows that the proposed approach outperforms state-of-the-art systems.
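The iterative loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the statistical class noise test is not given on this page, so the sketch substitutes a simple nearest-neighbour agreement check, and it uses a toy one-dimensional nearest-centroid classifier with two classes. All function names, parameters, and data below are hypothetical.

```python
from statistics import mean


def train_centroids(samples):
    """Return {label: centroid} for a list of (x, label) pairs."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def predict(centroids, x):
    """Return (label, margin). Assumes at least two classes.

    The margin is the distance gap between the two nearest centroids;
    a larger margin is treated as a more confident prediction.
    """
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin


def is_class_noise(sample, trusted, k=3):
    """Stand-in noise test (an assumption, not the paper's test).

    Flags a sample whose label disagrees with the majority of its
    k nearest neighbours among the trusted, high-quality samples.
    """
    x, y = sample
    nearest = sorted(trusted, key=lambda s: abs(s[0] - x))[:k]
    agree = sum(1 for _, label in nearest if label == y)
    return agree * 2 < len(nearest)


def transfer_with_noise_filter(labelled, unlabelled, per_round=2, check_every=2):
    """Self-training loop that periodically drops suspected class noise."""
    trusted = list(labelled)   # high-quality labelled samples
    transferred = []           # pseudo-labelled target samples
    pool = list(unlabelled)
    rounds = 0
    while pool:
        centroids = train_centroids(trusted + transferred)
        # Move the most confidently classified samples out of the pool.
        pool.sort(key=lambda x: predict(centroids, x)[1], reverse=True)
        batch, pool = pool[:per_round], pool[per_round:]
        transferred += [(x, predict(centroids, x)[0]) for x in batch]
        rounds += 1
        # After every `check_every` rounds, remove suspected class noise.
        if rounds % check_every == 0:
            transferred = [s for s in transferred
                           if not is_class_noise(s, trusted)]
    return transferred
```

Calling `transfer_with_noise_filter(labelled, unlabelled)` with a list of `(value, label)` pairs and a list of unlabelled values returns the pseudo-labelled samples that survived filtering. The periodic check mirrors the idea of cleaning transferred samples after a fixed number of iterations so that mislabelled samples do not accumulate.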
Keywords
Negative transfer; Transfer learning; Class noise detection