Abstract
State-of-the-art studies on sentiment classification are typically domain-dependent and domain-restricted. In this paper, we aim to reduce domain dependency and improve overall performance simultaneously by proposing an efficient multi-domain sentiment classification algorithm. Our method combines multiple classifiers: we first train a single-domain classifier on each domain's own data, and then combine the classifiers' outputs to make the final decision. Our experiments show that this approach performs much better than both the single-domain classification approach (training on each domain's data individually) and the mixed-domain classification approach (simply pooling all the training data). In particular, classifier combination with the weighted sum rule obtains an average error reduction of 27.6% over single-domain classification.
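As an illustration only, the weighted sum rule mentioned in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes each single-domain classifier outputs class posteriors in the order [P(negative), P(positive)], and the function names, the three example domains, the posterior values, and the hand-set weights are all hypothetical. How the weights are obtained is not described in the abstract.

```python
from typing import List, Sequence


def weighted_sum_combine(
    posteriors: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Return the weighted sum of class posteriors across classifiers.

    posteriors[k][c] is classifier k's probability for class c.
    weights[k] is the (assumed, hand-set) weight of classifier k.
    """
    if len(posteriors) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    n_classes = len(posteriors[0])
    combined = [0.0] * n_classes
    for probs, w in zip(posteriors, weights):
        if len(probs) != n_classes:
            raise ValueError("all classifiers must score the same classes")
        for c, p in enumerate(probs):
            combined[c] += w * p
    return combined


def predict(
    posteriors: Sequence[Sequence[float]],
    weights: Sequence[float],
    labels: Sequence[str] = ("negative", "positive"),
) -> str:
    """Pick the class with the highest combined score."""
    scores = weighted_sum_combine(posteriors, weights)
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


# Hypothetical posteriors [P(negative), P(positive)] for one review,
# produced by three classifiers trained on three different domains.
posteriors = [
    [0.30, 0.70],  # classifier trained on domain A
    [0.60, 0.40],  # classifier trained on domain B
    [0.20, 0.80],  # classifier trained on domain C
]
weights = [0.5, 0.2, 0.3]  # assumed values, for illustration only

scores = weighted_sum_combine(posteriors, weights)
label = predict(posteriors, weights)
print(scores, label)
```

Setting all weights equal reduces this to the plain (unweighted) sum rule, so the weights are what let the combiner trust some single-domain classifiers more than others.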
Additional information
Supported by the National Natural Science Foundation of China under Grant No. 61003155 and Start-Up Grant for Newly Appointed Professors under Grant No. 1-BBZM in The Hong Kong Polytechnic University.
About this article
Cite this article
Li, SS., Huang, CR. & Zong, CQ. Multi-Domain Sentiment Classification with Classifier Combination. J. Comput. Sci. Technol. 26, 25–33 (2011). https://doi.org/10.1007/s11390-011-9412-y
DOI: https://doi.org/10.1007/s11390-011-9412-y