Abstract
In this paper, we focus on classifying documents according to the opinions and value judgments they contain. The main originality of our approach is that it combines linguistic pre-processing, classification, and a voting system that draws on several classification methods. In this context, a relevant representation of the documents makes it possible to determine the features for storing textual data in data warehouses. Experiments conducted on very large corpora from a French text-mining challenge (DEFT) show the efficiency of our approach.
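To make the idea of voting across several classification methods concrete, here is a minimal sketch of a plain majority vote. The abstract does not say which voting rule or which classifiers the authors use, so the function names, the tie-breaking rule, and the sample labels below are all illustrative assumptions, not the paper's method.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most classifiers for one document.

    `predictions` holds one label per classifier. Ties go to the tied label
    that appears first in the list (an arbitrary, assumed rule).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def vote_per_document(per_classifier_labels):
    """Combine the label lists of several classifiers, document by document."""
    return [majority_vote(list(labels)) for labels in zip(*per_classifier_labels)]


# Hypothetical outputs of three classifiers on the same three documents.
classifier_a = ["positive", "negative", "neutral"]
classifier_b = ["positive", "positive", "neutral"]
classifier_c = ["negative", "negative", "positive"]

combined = vote_per_document([classifier_a, classifier_b, classifier_c])
print(combined)
```

In this sketch every classifier has equal weight; a weighted variant would replace the counts with per-classifier scores.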
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Plantié, M., Roche, M., Dray, G., Poncelet, P. (2008). Is a Voting Approach Accurate for Opinion Mining?. In: Song, IY., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2008. Lecture Notes in Computer Science, vol 5182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85836-2_39
DOI: https://doi.org/10.1007/978-3-540-85836-2_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85835-5
Online ISBN: 978-3-540-85836-2
eBook Packages: Computer Science (R0)