Abstract
In this paper, an Arabic opinion analysis system is proposed. Applications of this kind produce data with a large number of features while the number of samples is limited. A large number of features relative to the number of samples causes overtraining unless proper measures are taken. To overcome this problem, we introduce a new approach based on the Random Subspace (RSS) algorithm, with Support Vector Machine (SVM) learners as the individual classifiers, to offer an operational system able to identify the opinions expressed in readers' comments on Arabic newspaper blogs. The study proceeds in three main steps: corpus construction, statistical feature extraction, and opinion classification by the hybrid RSS-SVM approach. Experimental results on 800 comments collected from Algerian newspapers are very encouraging; however, automatic natural language processing still needs to be added to enhance the feature vector.
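The general idea of a random subspace ensemble over SVM learners can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic Gaussian features rather than the 800-comment corpus, the base learner is a simple Pegasos-style linear SVM without an intercept standing in for a full SVM solver, and all function names and parameter values (`train_linear_svm`, `fit_random_subspace_svm`, `subspace_size`, and so on) are hypothetical.

```python
import random


def train_linear_svm(X, y, epochs=30, lam=0.01, seed=0):
    """Pegasos-style subgradient descent on the hinge loss.

    Assumption: labels are in {-1, +1} and no intercept is learned,
    which is adequate for the centered toy data used below.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # L2 shrinkage
            if margin < 1.0:  # sample violates the margin
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def fit_random_subspace_svm(X, y, n_estimators=11, subspace_size=20, seed=0):
    """Train each SVM on its own random subset of the feature indices."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for m in range(n_estimators):
        idx = sorted(rng.sample(range(n_features), subspace_size))
        X_sub = [[row[j] for j in idx] for row in X]
        ensemble.append((idx, train_linear_svm(X_sub, y, seed=seed + m)))
    return ensemble


def predict(ensemble, x):
    """Majority vote over the members (n_estimators is odd, so no ties)."""
    votes = 0
    for idx, w in ensemble:
        score = sum(wj * x[j] for wj, j in zip(w, idx))
        votes += 1 if score >= 0 else -1
    return 1 if votes >= 0 else -1


def make_toy_data(n_samples, n_features, seed):
    """Synthetic stand-in data: each feature is drawn around the class label."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = 1 if i % 2 == 0 else -1
        X.append([rng.gauss(label, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


# More features (60) than training samples (40), as in the setting described.
X_train, y_train = make_toy_data(40, 60, seed=1)
X_test, y_test = make_toy_data(40, 60, seed=2)

ensemble = fit_random_subspace_svm(X_train, y_train)
correct = sum(predict(ensemble, x) == t for x, t in zip(X_test, y_test))
accuracy = correct / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Each member sees only 20 of the 60 features, so no single classifier has to fit the full high-dimensional space from 40 samples; the majority vote then combines the members' decisions. With real text features, the subspace size and number of members would need tuning.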
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ziani, A., Azizi, N., Guiyassa, Y.T. (2015). Combining Random Sub Space Algorithm and Support Vector Machines Classifier for Arabic Opinions Analysis. In: Le Thi, H., Nguyen, N., Do, T. (eds) Advanced Computational Methods for Knowledge Engineering. Advances in Intelligent Systems and Computing, vol 358. Springer, Cham. https://doi.org/10.1007/978-3-319-17996-4_16
DOI: https://doi.org/10.1007/978-3-319-17996-4_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17995-7
Online ISBN: 978-3-319-17996-4
eBook Packages: Engineering, Engineering (R0)