Interactions Between Term Weighting and Feature Selection Methods on the Sentiment Analysis of Turkish Reviews

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2016)

Abstract

Term weighting methods assign weights to the terms in a document so that more important terms contribute more to the text representation. In this study, we consider four term weighting methods and three feature selection methods, and investigate how the term weighting methods respond to the reduced text representation. We conduct experiments on five Turkish review datasets to establish baselines and compare the performance of the term weighting methods. We also test the methods on English reviews to identify how the results differ from those for the Turkish reviews. We show that the tf and tp weighting methods perform best for the Turkish reviews, while tp performs best for the English reviews. When feature selection is applied, the tf * idf method with DFD and χ2 achieves the highest accuracies for the Turkish reviews, while the tf * idf and tp methods with χ2 perform best for the English reviews.
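As a rough illustration of the kind of pipeline the abstract describes, the following sketch computes three of the named weightings (tf, tp, tf * idf) and ranks terms by a χ2 score. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy reviews, the function names, the log(N / df) form of idf, and the 2x2 contingency form of χ2 are all assumptions made for illustration, and DFD is left out because the abstract does not define it.

```python
import math
from collections import Counter

# Toy labelled reviews (hypothetical data, not taken from the paper's datasets).
docs = [
    ("good good film".split(), "pos"),
    ("good acting".split(), "pos"),
    ("bad film".split(), "neg"),
    ("bad bad plot".split(), "neg"),
]

def tf(tokens):
    """Term frequency: how often each term occurs in the document."""
    return dict(Counter(tokens))

def tp(tokens):
    """Term presence: 1 for every term that occurs in the document."""
    return {t: 1 for t in set(tokens)}

def idf(corpus):
    """Inverse document frequency, log(N / df); one common variant (assumed)."""
    n = len(corpus)
    df = Counter(t for tokens, _ in corpus for t in set(tokens))
    return {t: math.log(n / d) for t, d in df.items()}

def tf_idf(tokens, idf_table):
    """tf * idf weight of each term in the document."""
    return {t: f * idf_table[t] for t, f in tf(tokens).items()}

def chi2(corpus, term, label="pos"):
    """Chi-square statistic of the 2x2 term/class contingency table."""
    a = sum(1 for toks, y in corpus if term in toks and y == label)
    b = sum(1 for toks, y in corpus if term in toks and y != label)
    c = sum(1 for toks, y in corpus if term not in toks and y == label)
    d = sum(1 for toks, y in corpus if term not in toks and y != label)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def select_top_k(corpus, k):
    """Keep the k terms with the highest chi-square score."""
    vocab = sorted({t for toks, _ in corpus for t in toks})
    return sorted(vocab, key=lambda t: chi2(corpus, t), reverse=True)[:k]

if __name__ == "__main__":
    first = docs[0][0]
    print("tf:", tf(first))
    print("tp:", tp(first))
    print("tf*idf:", tf_idf(first, idf(docs)))
    print("top-2 by chi2:", select_top_k(docs, 2))
```

In this toy corpus the terms that occur in only one class ("good", "bad") receive the highest χ2 scores, while "film", which appears equally in both classes, scores zero. Reducing the vocabulary to the selected terms before weighting is what the abstract calls the reduced text representation.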

Notes

  1. http://nltk.org

  2. http://www.cs.waikato.ac.nz/ml/weka

  3. http://www.win.tue.nl/~mpechen/projects/smm/#Datasets

  4. http://www.cs.cornell.edu/people/pabo/movie-review-data/

  5. http://www.cs.jhu.edu/~mdredze/datasets/sentiment/

Acknowledgements

This study was supported by the Çukurova University Academic Research Project Unit under grant no. FDK-2015-3833 and by the Scientific and Technological Research Council of Turkey (TÜBİTAK) under scholarship TUBITAK-2214-A.

Author information

Corresponding author

Correspondence to Tuba Parlar.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Parlar, T., Özel, S.A., Song, F. (2018). Interactions Between Term Weighting and Feature Selection Methods on the Sentiment Analysis of Turkish Reviews. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9624. Springer, Cham. https://doi.org/10.1007/978-3-319-75487-1_26

  • DOI: https://doi.org/10.1007/978-3-319-75487-1_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75486-4

  • Online ISBN: 978-3-319-75487-1

  • eBook Packages: Computer Science, Computer Science (R0)
