Abstract
Term weighting methods assign weights to the terms in a document so that more important terms receive higher weights in the text representation. In this study, we consider four term weighting methods and three feature selection methods and investigate how the term weighting methods respond to the reduced text representation. We conduct experiments on five Turkish review datasets to establish baselines and compare the performance of the term weighting methods. We also test the methods on English reviews to identify how the results differ from those for the Turkish reviews. We show that the tf and tp weighting methods perform best for the Turkish reviews, while tp performs best for the English reviews. When feature selection is applied, the tf * idf method combined with DFD and χ2 achieves the highest accuracies for the Turkish reviews, while the tf * idf and tp methods combined with χ2 perform best for the English reviews.
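As a rough illustration of the quantities named in the abstract, the sketch below computes three of the weightings and a χ2 feature score on a toy corpus. It is not the authors' implementation. It assumes that tf is the raw term count, tp is binary term presence, idf is log(N / df), and χ2 is computed from a 2x2 term/class contingency table; the abstract does not state which variants the study uses. The function names and the toy reviews are hypothetical.

```python
import math
from collections import Counter


def term_weights(docs):
    """Return, per document, dicts of tf, tp, and tf*idf weights.

    Assumed variants (not confirmed by the source):
    tf = raw count, tp = 1 if the term occurs, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weighted = []
    for tokens in docs:
        counts = Counter(tokens)
        tf = dict(counts)
        tp = {term: 1 for term in counts}
        tfidf = {term: c * math.log(n_docs / df[term]) for term, c in counts.items()}
        weighted.append({"tf": tf, "tp": tp, "tfidf": tfidf})
    return weighted


def chi_square(docs, labels, term):
    """Chi-square score of a term for a two-class problem (labels 1 and 0)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        present = term in tokens
        if present and label == 1:
            a += 1  # term present, positive class
        elif present:
            b += 1  # term present, negative class
        elif label == 1:
            c += 1  # term absent, positive class
        else:
            d += 1  # term absent, negative class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # term or class is degenerate; no evidence either way
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical toy reviews: 1 = positive, 0 = negative.
docs = [
    ["good", "good", "film"],
    ["bad", "film"],
    ["good", "plot"],
    ["bad", "bad", "plot"],
]
labels = [1, 0, 1, 0]

weights = term_weights(docs)
scores = {term: chi_square(docs, labels, term) for term in ["good", "bad", "film", "plot"]}
# Feature selection keeps the highest-scoring terms, here the top two.
selected = sorted(scores, key=scores.get, reverse=True)[:2]
```

On this toy corpus, "good" and "bad" separate the classes perfectly and score highest, while "film" and "plot" occur equally in both classes and score zero, so a reduced representation would keep only the first two terms before weighting.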
Acknowledgements
This study was supported by the Çukurova University Academic Research Project Unit under grant no. FDK-2015-3833 and by the Scientific and Technological Research Council of Turkey (TÜBİTAK) scholarship TUBITAK-2214-A.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Parlar, T., Özel, S.A., Song, F. (2018). Interactions Between Term Weighting and Feature Selection Methods on the Sentiment Analysis of Turkish Reviews. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9624. Springer, Cham. https://doi.org/10.1007/978-3-319-75487-1_26
DOI: https://doi.org/10.1007/978-3-319-75487-1_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75486-4
Online ISBN: 978-3-319-75487-1
eBook Packages: Computer Science, Computer Science (R0)