Opinion Classification Techniques Applied to a Spanish Corpus

  • Eugenio Martínez-Cámara
  • M. Teresa Martín-Valdivia
  • L. Alfonso Ureña-López
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6716)

Abstract

Sentiment analysis is a new and challenging task in Text Mining and Natural Language Processing. Although some work exists, most of it focuses only on English texts. The volume of web pages, information and opinions on the Internet grows every day, and English is not the only language in which they are written. Other languages such as Spanish are increasingly present, so we have carried out experiments on a corpus of Spanish film reviews. In this paper we present several experiments using five classification algorithms (SVM, Naïve Bayes, BBR, KNN, C4.5). The results are very promising and encourage us to continue this line of research.
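
As a rough illustration of the kind of polarity classification the abstract describes, the following is a minimal sketch of one of the five listed algorithm families, Naïve Bayes, written in plain Python. It is not the authors' implementation: the function names (`tokenize`, `train_naive_bayes`, `classify`), the whitespace tokenizer, the add-one smoothing, and the four toy reviews are all assumptions made for this example and do not come from the corpus used in the paper.

```python
import math
from collections import Counter

PUNCTUATION = ".,;:!?¡¿()\"'"

def tokenize(text):
    # Lowercase, split on whitespace, and strip surrounding punctuation.
    words = (w.strip(PUNCTUATION) for w in text.lower().split())
    return [w for w in words if w]

def train_naive_bayes(labelled_docs):
    """Collect class counts and per-class word counts from (text, label) pairs."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in labelled_docs:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def classify(text, model):
    """Return the label with the highest log-posterior under multinomial Naive Bayes."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothed log likelihood of the word.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy reviews, invented for this sketch only.
TOY_REVIEWS = [
    ("una película excelente y emocionante", "pos"),
    ("gran guion y actores excelentes", "pos"),
    ("una película aburrida y lenta", "neg"),
    ("guion pobre y actores aburridos", "neg"),
]

model = train_naive_bayes(TOY_REVIEWS)
print(classify("una historia excelente", model))
print(classify("muy aburrida y lenta", model))
```

A real experiment would replace the toy list with the labelled film reviews and evaluate on held-out data; the other algorithms named in the abstract would slot in behind the same train/classify interface.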

Keywords

Opinion mining, sentiment polarity classification, subjective corpora, machine learning algorithms

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Eugenio Martínez-Cámara (1)
  • M. Teresa Martín-Valdivia (1)
  • L. Alfonso Ureña-López (1)
  1. Department of Computer Science, SINAI - Sistemas Inteligentes de Acceso a la Información, University of Jaén, Spain
