Abstract
Sarcasm is a pervasive linguistic phenomenon in online documents that express subjective and deeply felt opinions. Detecting sarcasm is important for many NLP applications, such as sentiment analysis, opinion mining, and advertising. Current studies treat automatic sarcasm detection as a simple text classification problem: they do not use explicit features to detect sarcasm, and they ignore the imbalance between sarcastic and non-sarcastic samples in real applications. In this paper, we first explore the characteristics of both English and Chinese sarcastic sentences and introduce a set of features designed specifically for detecting sarcasm in social media. We then propose a novel multi-strategy ensemble learning approach (MSELA) to handle the imbalance problem. We evaluate the proposed model on English and Chinese data sets. Experimental results show that our ensemble approach outperforms state-of-the-art sarcasm detection approaches and popular imbalanced classification methods.
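The abstract does not specify how MSELA works, so the following is only a minimal sketch of one common way an ensemble can handle class imbalance, not the authors' method. Each ensemble member is trained on all minority (sarcastic) samples plus an equally sized random subset of the majority (non-sarcastic) samples, and predictions are combined by majority vote. The nearest-centroid base learner, the function names, and the two-dimensional toy feature vectors are all assumptions made for illustration.

```python
import random
from collections import Counter

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_balanced_ensemble(majority, minority, n_models=5, seed=0):
    """Train n_models nearest-centroid members on balanced subsets.

    Hypothetical stand-in for an imbalance-aware ensemble: every member
    sees all minority samples and a random majority subset of equal size.
    """
    rng = random.Random(seed)
    k = min(len(minority), len(majority))
    members = []
    for _ in range(n_models):
        subset = rng.sample(majority, k)  # under-sample the majority class
        members.append((centroid(subset), centroid(minority)))
    return members

def predict(members, x):
    """Majority vote over members; returns 1 (sarcastic) or 0 (non-sarcastic)."""
    votes = Counter(
        1 if sq_dist(x, c_min) < sq_dist(x, c_maj) else 0
        for c_maj, c_min in members
    )
    return votes.most_common(1)[0][0]

# Toy, made-up feature vectors: 8 non-sarcastic samples vs. 2 sarcastic ones.
non_sarcastic = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
                 [0.0, 0.0], [0.1, 0.1], [0.2, 0.3], [0.3, 0.0]]
sarcastic = [[1.0, 1.0], [0.9, 1.1]]

ensemble = train_balanced_ensemble(non_sarcastic, sarcastic)
print(predict(ensemble, [1.0, 0.9]))
print(predict(ensemble, [0.1, 0.1]))
```

Using an odd number of members avoids tied votes. A real system would replace the toy vectors with sarcasm-specific features and the centroid rule with stronger base classifiers.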
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Liu, P., Chen, W., Ou, G., Wang, T., Yang, D., Lei, K. (2014). Sarcasm Detection in Social Media Based on Imbalanced Classification. In: Li, F., Li, G., Hwang, Sw., Yao, B., Zhang, Z. (eds) Web-Age Information Management. WAIM 2014. Lecture Notes in Computer Science, vol 8485. Springer, Cham. https://doi.org/10.1007/978-3-319-08010-9_49
DOI: https://doi.org/10.1007/978-3-319-08010-9_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08009-3
Online ISBN: 978-3-319-08010-9
eBook Packages: Computer Science, Computer Science (R0)