PerSent: A Freely Available Persian Sentiment Lexicon

  • Kia Dashtipour
  • Amir Hussain
  • Qiang Zhou
  • Alexander Gelbukh
  • Ahmad Y. A. Hawalah
  • Erik Cambria
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10023)

Abstract

People rely on other people's opinions to make well-informed decisions about buying products or services. Companies and organizations, in turn, need to understand people's attitudes towards their products and services and to use customer feedback to improve them. Sentiment analysis techniques address these needs. Although the majority of Internet users are not English speakers, most research in sentiment analysis focuses on English, and resources for other languages are scarce. In this paper, we introduce a Persian sentiment lexicon consisting of 1500 words along with their part-of-speech tags and polarity scores. We used two machine-learning algorithms to evaluate the performance of this resource on a sentiment analysis task. The lexicon is freely available and can be downloaded from our website.
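
To make the shape of such a resource concrete, the following is a minimal sketch of how a lexicon of words with part-of-speech tags and polarity scores might be represented and used to derive simple features from tokenized text. Everything in it is an assumption for illustration: the `LexiconEntry` type, the four toy entries, their scores, the [-1, +1] score range, and the sum-based decision rule are hypothetical and are not taken from the PerSent distribution. The paper evaluates the lexicon with two machine-learning algorithms; the sum rule here is only a simple stand-in for such a classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    """One lexicon row: a word, its part-of-speech tag, and a polarity score."""
    word: str
    pos: str
    polarity: float  # assumed range: -1.0 (negative) to +1.0 (positive)


# Hypothetical toy entries; the scores are invented for illustration only.
TOY_LEXICON = {
    entry.word: entry
    for entry in [
        LexiconEntry("خوب", "ADJ", 0.8),   # "good"
        LexiconEntry("عالی", "ADJ", 1.0),  # "excellent"
        LexiconEntry("بد", "ADJ", -0.7),   # "bad"
        LexiconEntry("زشت", "ADJ", -0.9),  # "ugly"
    ]
}


def polarity_features(tokens, lexicon=TOY_LEXICON):
    """Sum positive and negative scores of the tokens found in the lexicon."""
    scores = [lexicon[t].polarity for t in tokens if t in lexicon]
    return {
        "positive": sum(s for s in scores if s > 0),
        "negative": sum(s for s in scores if s < 0),
        "matched": len(scores),
    }


def classify(tokens, lexicon=TOY_LEXICON):
    """Label text by the sign of its summed polarity (a stand-in rule)."""
    features = polarity_features(tokens, lexicon)
    total = features["positive"] + features["negative"]
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Whitespace tokenization is a simplification; real Persian text
    # would normally need normalization and stemming first.
    print(classify("این فیلم خوب بود".split()))
    print(classify("غذا بد و زشت بود".split()))
```

In a supervised setting, the dictionary returned by `polarity_features` could serve as an input feature vector for a trained classifier rather than being thresholded directly.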

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Kia Dashtipour (1)
  • Amir Hussain (1)
  • Qiang Zhou (2)
  • Alexander Gelbukh (3)
  • Ahmad Y. A. Hawalah (4)
  • Erik Cambria (5)

  1. Department of Computing Science and Mathematics, University of Stirling, Stirling, Scotland
  2. Tsinghua University, Beijing, China
  3. CIC, Instituto Politécnico Nacional, Mexico City, Mexico
  4. Taibah University, Madina, Saudi Arabia
  5. School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
