World Wide Web, Volume 18, Issue 2, pp 299–316

Cross lingual opinion holder extraction based on multi-kernel SVMs and transfer learning

  • Ruifeng Xu
  • Lin Gui
  • Jun Xu
  • Qin Lu
  • Kam-Fai Wong


Fine-grained opinion analysis places much higher demands on annotated corpora, which makes high-quality analysis difficult when resources are insufficient. In this paper we explore the use of cross-lingual resources for opinion mining in resource-poor languages. We present a novel approach to cross-lingual opinion holder extraction that selectively leverages a finely annotated opinion corpus in a source language as supplementary training samples for the target language. First, the source-language opinion corpus with fine-grained annotations is translated and projected onto the target language to generate training samples. Then, a classifier based on multi-kernel Support Vector Machines (SVMs) is developed to identify opinion holders in the target language; it uses a tree kernel based on syntactic features and a polynomial kernel based on semantic features. The two kernels are further improved by incorporating a pivot function based on word pair similarity. To reduce the noise introduced by low-quality translated samples, a transfer learning algorithm iteratively selects high-quality translated samples for training the multi-kernel classifiers on the target language. Evaluations on transferring MPQA, an English opinion corpus (the source language), to Chinese opinion analysis (the target language) show that opinion holder extraction performance on the NTCIR-7 MOAT dataset improves, exceeding that of a Conditional Random Fields (CRFs) based approach and of most systems reported in the NTCIR-7 MOAT evaluation.
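As an illustration of the multi-kernel idea described above, the following is a minimal sketch of how a tree kernel over parse trees and a polynomial kernel over feature vectors might be combined into one kernel value. It rests on several assumptions not stated in the abstract: the two kernels are combined by a weighted sum with a hypothetical weight `alpha`, the tree kernel is a Collins-Duffy style subtree-counting kernel with decay `lam`, and trees are encoded as nested tuples. The pivot function and the transfer learning sample selection are not modeled here.

```python
from itertools import product


def nodes(tree):
    """Yield every internal node of a tree written as (label, child, ...); leaves are strings."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)


def production(node):
    """Return the production at a node: its label plus its children's labels (words are tagged)."""
    return (node[0],) + tuple(
        c[0] if isinstance(c, tuple) else ("word", c) for c in node[1:]
    )


def common_subtrees(n1, n2, lam):
    """Decayed count of tree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple):  # matching productions imply c2 is a tuple too
            score *= 1.0 + common_subtrees(c1, c2, lam)
    return score


def tree_kernel(t1, t2, lam=0.5):
    """Sum the shared-fragment counts over all pairs of internal nodes."""
    return sum(common_subtrees(a, b, lam) for a, b in product(nodes(t1), nodes(t2)))


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree on plain Python sequences."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def composite_kernel(s1, s2, alpha=0.5):
    """Weighted sum of the two kernels; each sample is a (tree, feature_vector) pair.

    The weighted sum and the weight alpha are assumptions made for illustration.
    """
    (t1, x1), (t2, x2) = s1, s2
    return alpha * tree_kernel(t1, t2) + (1 - alpha) * poly_kernel(x1, x2)


# Toy samples: a tiny parse tree plus a two-dimensional feature vector.
sample_a = (("S", ("NP", "he"), ("VP", "said")), [1, 0])
sample_b = (("S", ("NP", "she"), ("VP", "said")), [1, 1])
print(composite_kernel(sample_a, sample_b))
```

A weighted sum of two valid kernels with non-negative weights is itself a valid kernel, so a matrix of such values could in principle be passed to any SVM implementation that accepts a precomputed kernel.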


Keywords: Opinion holder extraction · Cross lingual · Multi-kernel SVMs · Transfer learning





Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Ruifeng Xu (1)
  • Lin Gui (1)
  • Jun Xu (1, corresponding author)
  • Qin Lu (2)
  • Kam-Fai Wong (3)

  1. Key Laboratory of Network Oriented Intelligent Computation, Shenzhen Graduate School, Harbin Institute of Technology, Shenzhen, China
  2. Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong
  3. Department of System Engineering and Engineering Management, The Chinese University of Hong Kong, New Territories, Hong Kong
