From classification to quantification in tweet sentiment analysis

  • Original Article
Social Network Analysis and Mining

Abstract

Sentiment classification has become a ubiquitous enabling technology in the Twittersphere, since classifying tweets according to the sentiment they convey towards a given entity (be it a product, a person, a political party, or a policy) has many applications in political science, social science, market research, and many other fields. In this paper, we contend that most previous studies dealing with tweet sentiment classification (TSC) use a suboptimal approach. The reason is that the final goal of most such studies is not estimating the class label (e.g., Positive, Negative, or Neutral) of individual tweets, but estimating the relative frequency (a.k.a. “prevalence”) of the different classes in the dataset. The latter task is called quantification, and recent research has convincingly shown that it should be tackled as a task of its own, using learning algorithms and evaluation measures different from those used for classification. We show (by carrying out experiments using two learners, seven quantification-specific algorithms, and 11 TSC datasets) that using quantification-specific algorithms produces substantially better class frequency estimates than a state-of-the-art classification-oriented algorithm routinely used in TSC. We thus argue that researchers interested in tweet sentiment prevalence should switch to quantification-specific (instead of classification-specific) learning algorithms and evaluation measures.
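
To illustrate the distinction the abstract draws, the following minimal Python sketch contrasts classification accuracy with the quality of class prevalence estimates. It is an illustration under stated assumptions, not the authors' code: the toy labels, the function names (`prevalence`, `accuracy`, `kld`), and the smoothing constant `eps` are all hypothetical. Prevalences are estimated by simply counting predicted labels, and the estimates are scored with the Kullback-Leibler divergence (KLD), one measure used for evaluating quantification.

```python
from collections import Counter
from math import log

CLASSES = ("Positive", "Negative", "Neutral")


def prevalence(labels, classes=CLASSES):
    """Relative frequency of each class in a list of labels."""
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in classes}


def accuracy(true_labels, predicted_labels):
    """Fraction of items whose predicted label matches the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def kld(true_prev, est_prev, eps=1e-6):
    """KL divergence between true and estimated prevalences.

    Additive smoothing with `eps` (an assumed value) avoids division
    by zero when a class receives an estimated prevalence of 0.
    """
    n = len(true_prev)
    total = 0.0
    for c in true_prev:
        p = (true_prev[c] + eps) / (1.0 + eps * n)
        q = (est_prev[c] + eps) / (1.0 + eps * n)
        total += p * log(p / q)
    return total


# Hypothetical toy data: ten tweets with known labels.
true_labels = ["Positive"] * 5 + ["Negative"] * 3 + ["Neutral"] * 2

# Predictor A makes two errors that cancel each other out.
pred_a = (["Negative"] + ["Positive"] * 4      # one Positive missed
          + ["Positive"] + ["Negative"] * 2    # one Negative missed
          + ["Neutral"] * 2)

# Predictor B makes a single error, so it is the better classifier.
pred_b = (["Negative"] + ["Positive"] * 4
          + ["Negative"] * 3
          + ["Neutral"] * 2)

true_prev = prevalence(true_labels)
for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, accuracy(true_labels, pred), kld(true_prev, prevalence(pred)))
```

In this toy example the predictor with the lower accuracy recovers the exact class distribution because its two errors cancel out, while the more accurate predictor does not. This is why prevalence estimation calls for its own evaluation measures.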

Notes

  1. Consistent with most mathematical literature, we use the caret symbol (\(\wedge\)) to indicate estimation.

  2. Since the standard logistic function \(\frac{e^{x}}{e^{x}+1}\) takes values in [\(\frac{1}{2}\),1) on the domain \([0,+\infty )\) we are interested in, we multiply it by 2 so that it ranges over [1,2), and then subtract 1 so that it ranges over [0,1), as desired.

  3. http://www.ark.cs.cmu.edu/TweetNLP/.

  4. In Joachims (2005), SVM-perf is actually called SVM-multi, but the author has released its implementation under the name SVM-perf; we will thus use this latter name.

  5. SVM-perf is available from http://svmlight.joachims.org/svm_struct.html, while the module that customizes it to \({{\mathrm{KLD}}}\) is available from http://hlt.isti.cnr.it/quantification/. The code for all the other methods discussed in this section is available from http://alt.qcri.org/~wgao/codes/tweet_sentiment_quantification.zip.

  6. This means that we avoid TSC datasets in which the labels are automatically derived from, say, the emoticons present in the tweets.

  7. In order to enhance the reproducibility of our experimental results, we make available (at http://alt.qcri.org/~wgao/data/SNAM/tweet_sentiment_quantification.zip) the vectorial representations we have generated for all the datasets (split into training / validation / test sets) used in this paper.

  8. The SVM-based implementation of CC is called SVM(HL) in Gao and Sebastiani (2015). LIBSVM is available from http://www.csie.ntu.edu.tw/~cjlin/libsvm/.

  9. At the time of writing this paper, the test set of the SemEval2016 collection has not yet been made available. However, the data made available by the organizers were already pre-split into three subsets, called “train”, “dev”, and “devtest”; we have thus used these subsets as the training set, held-out set, and test set, respectively.

  10. http://www.csie.ntu.edu.tw/~cjlin/liblinear/.
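
The rescaling described in Note 2 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the input is treated as a generic non-negative number.

```python
import math


def standard_logistic(x):
    """Standard logistic function e^x / (e^x + 1).

    Written as 1 / (1 + e^(-x)), which is algebraically identical
    and does not overflow for large non-negative x.
    """
    return 1.0 / (1.0 + math.exp(-x))


def normalized_logistic(x):
    """Rescale the logistic so that inputs in [0, +inf) map towards [0, 1).

    Multiplying by 2 moves the range from [1/2, 1) to [1, 2);
    subtracting 1 moves it to [0, 1). The result is algebraically
    equal to (e^x - 1) / (e^x + 1).
    """
    if x < 0:
        raise ValueError("this sketch assumes a non-negative input")
    return 2.0 * standard_logistic(x) - 1.0


for value in (0.0, 0.5, 2.0, 10.0):
    print(value, normalized_logistic(value))
```

An input of 0 maps to 0, and the output grows monotonically towards 1 as the input increases; in floating-point arithmetic very large inputs round to exactly 1.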

References

  • Alaíz-Rodríguez R, Guerrero-Curieses A, Cid-Sueiro J (2011) Class and subclass probability re-estimation to adapt a classifier in the presence of concept drift. Neurocomputing 74(16):2614–2623

  • Asur S, Huberman BA (2010) Predicting the future with social media. In: Proceedings of the 10th IEEE/WIC/ACM international conference on web intelligence (WI 2010), pp 492–499, Toronto, CA

  • Balikas G, Partalas I, Gaussier E, Babbar R, Amini M-R (2015) Efficient model selection for regularized classification by exploiting unlabeled data. In: Proceedings of the 14th international symposium on intelligent data analysis (IDA 2015), pp 25–36, Saint Etienne, FR

  • Barranquero J, González P, Díez J, del Coz JJ (2013) On the study of nearest neighbor algorithms for prevalence estimation in binary problems. Pattern Recognit 46(2):472–482

  • Barranquero J, Díez J, del Coz JJ (2015) Quantification-oriented learning based on reliable classifiers. Pattern Recognit 48(2):591–604

  • Beijbom O, Hoffman J, Yao E, Darrell T, Rodriguez-Ramirez A, Gonzalez-Rivero M, Hoegh-Guldberg O (2015) Quantification in-the-wild: Data-sets and baselines. Presented at the NIPS 2015 Workshop on Transfer and Multi-Task Learning. Montreal, CA

  • Bella A, Ferri C, Hernández-Orallo J, Ramírez-Quintana MJ (2010) Quantification via probability estimators. In: Proceedings of the 11th IEEE international conference on data mining (ICDM 2010), pp 737–742, Sydney, AU

  • Berardi G, Esuli A, Sebastiani F (2015) Utility-theoretic ranking for semi-automated text classification. ACM Trans Knowl Discov Data 10(1). Article 6

  • Bollen J, Mao H, Zeng X-J (2011) Twitter mood predicts the stock market. J Comput Sci 2(1):1–8

  • Borge-Holthoefer J, Magdy W, Darwish K, Weber I (2015) Content and network dynamics behind Egyptian political polarization on Twitter. In: Proceedings of the 18th ACM conference on computer supported cooperative work and social computing (CSCW 2015), pp 700–711, Vancouver, CA

  • Burton S, Soboleva A (2011) Interactive or reactive? Marketing with Twitter. J Consumer Mark 28(7):491–499

  • Chan YS, Ng HT (2006) Estimating class priors in domain adaptation for word sense disambiguation. In: Proceedings of the 44th annual meeting of the Association for Computational Linguistics (ACL 2006), pp 89–96, Sydney, AU

  • Chang C-C, Lin C-J (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3). Article 27

  • Conroy BR, Sajda P (2012) Fast, exact model selection and permutation testing for L2-regularized logistic regression. In: Proceedings of the 15th international conference on artificial intelligence and statistics (AISTATS 2012), pp 246–254, La Palma, ES

  • Cover TM, Thomas JA (1991) Elements of information theory. Wiley, New York

  • Csiszár I, Shields PC (2004) Information theory and statistics: a tutorial. Found Trends Commun Inf Theory 1(4):417–528

  • Da San Martino G, Gao W, Sebastiani F (2016) QCRI at SemEval-2016 Task 4: probabilistic methods for binary and ordinal quantification. In: Proceedings of the 10th international workshop on semantic evaluation (SemEval 2016), San Diego, US (forthcoming)

  • Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B 39(1):1–38

  • Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30

  • Dodds PS, Harris KD, Kloumann IM, Bliss CA, Danforth CM (2011) Temporal patterns of happiness and information in a global social network: Hedonometrics and Twitter. PLoS One 6(12):e26752

  • Esuli A (2016) ISTI-CNR at SemEval-2016 Task 4: quantification on an ordinal scale. In: Proceedings of the 10th international workshop on semantic evaluation (SemEval 2016), San Diego, US

  • Esuli A, Sebastiani F (2010) Sentiment quantification. IEEE Intell Syst 25(4):72–75

  • Esuli A, Sebastiani F (2014) Explicit loss minimization in quantification applications (preliminary draft). In: Presented at the 8th international workshop on information filtering and retrieval (DART 2014), Pisa, IT

  • Esuli A, Sebastiani F (2015) Optimizing text quantifiers for multivariate loss functions. ACM Trans Knowl Discov Data 9(4). Article 27

  • Fan R-E, Chang K-W, Hsieh C-J, Wang X-R, Lin C-J (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9:1871–1874

  • Forman G (2005) Counting positives accurately despite inaccurate classification. In: Proceedings of the 16th European Conference on machine learning (ECML 2005), pp 564–575, Porto, PT

  • Forman G (2008) Quantifying counts and costs via classification. Data Min Knowl Discov 17(2):164–206

  • Gao W, Sebastiani F (2015) Tweet sentiment: from classification to quantification. In: Proceedings of the 7th international conference on advances in social network analysis and mining (ASONAM 2015), pp 97–104, Paris, FR

  • González-Castro V, Alaiz-Rodríguez R, Alegre E (2013) Class distribution estimation based on the Hellinger distance. Inf Sci 218:146–164

  • Herfort B, Schelhorn S-J, de Albuquerque JP, Zipf A (2014) Does the spatiotemporal distribution of tweets match the spatiotemporal distribution of flood phenomena? A study about the river Elbe flood in June 2013. In: Proceedings of the 11th international conference on information systems for crisis response and management (ISCRAM 2014), pp 747–751, Philadelphia, US

  • Hopkins DJ, King G (2010) A method of automated nonparametric content analysis for social science. Am J Political Sci 54(1):229–247

  • Joachims T (2005) A support vector method for multivariate performance measures. In: Proceedings of the 22nd international conference on machine learning (ICML 2005), pp 377–384, Bonn, DE

  • Joachims T, Hofmann T, Yue Y, Yu C-N (2009) Predicting structured objects with support vector machines. Commun ACM 52(11):97–104

  • Kaya M, Fidan G, Toroslu IH (2013) Transfer learning using Twitter data for improving sentiment classification of Turkish political news. In: Proceedings of the 28th international symposium on computer and information sciences (ISCIS 2013), pp 139–148, Paris, FR

  • King G, Lu Y (2008) Verbal autopsy methods with multiple causes of death. Stat Sci 23(1):78–91

  • Kiritchenko S, Zhu X, Mohammad SM (2014) Sentiment analysis of short informal texts. J Artif Intell Res 50:723–762

  • Latinne P, Saerens M, Decaestecker C (2001) Adjusting the outputs of a classifier to new a priori probabilities may significantly improve classification accuracy: evidence from a multi-class problem in remote sensing. In: Proceedings of the 18th international conference on machine learning (ICML 2001), pp 298–305

  • Lewis DD (1995) Evaluating and optimizing autonomous text classification systems. In: Proceedings of the 18th ACM international conference on research and development in information retrieval (SIGIR 1995), pp 246–254, Seattle, US

  • Limsetto N, Waiyamai K (2011) Handling concept drift via ensemble and class distribution estimation technique. In: Proceedings of the 7th international conference on advanced data mining (ADMA 2011), pp 13–26, Beijing, CN

  • Marchetti-Bowick M, Chambers N (2012) Learning for microblogs with distant supervision: political forecasting with Twitter. In: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), pp 603–612, Avignon, FR

  • Martínez-Cámara E, Martín-Valdivia MT, López LAU, Ráez AM (2014) Sentiment analysis in Twitter. Nat Lang Eng 20(1):1–28

  • Mejova Y, Weber I, Macy MW (eds) (2015) Twitter: a digital socioscope. Cambridge University Press, Cambridge

  • Milli L, Monreale A, Rossetti G, Giannotti F, Pedreschi D, Sebastiani F (2013) Quantification trees. In: Proceedings of the 13th IEEE international conference on data mining (ICDM 2013), pp 528–536, Dallas, US

  • Mohammad SM, Kiritchenko S, Zhu X (2013) NRC-Canada: building the state-of-the-art in sentiment analysis of tweets. In: Proceedings of the 7th international workshop on semantic evaluation (SemEval 2013), pp 321–327, Atlanta, US

  • Murphy KP (2012) Machine learning. A probabilistic perspective. The MIT Press, Cambridge

  • Nakov P, Rosenthal S, Kozareva Z, Stoyanov V, Ritter A, Wilson T (2013) SemEval-2013 Task 2: sentiment analysis in Twitter. In: Proceedings of the 7th international workshop on semantic evaluation (SemEval 2013), pp 312–320, Atlanta, US

  • Nakov P, Ritter A, Rosenthal S, Sebastiani F, Stoyanov V (2016) SemEval-2016 Task 4: sentiment analysis in Twitter. In: Proceedings of the 10th international workshop on semantic evaluation (SemEval 2016), San Diego, US (forthcoming)

  • Narasimhan H, Li S, Kar P, Chawla S, Sebastiani F (2016) Stochastic optimization techniques for quantification performance measures. Submitted for publication

  • O’Connor B, Balasubramanyan R, Routledge BR, Smith NA (2010) From tweets to polls: linking text sentiment to public opinion time series. In: Proceedings of the 4th AAAI Conference on Weblogs and Social Media (ICWSM 2010), Washington, US

  • Olteanu A, Vieweg S, Castillo C (2015) What to expect when the unexpected happens: social media communications across crises. In: Proceedings of the 18th ACM conference on computer supported cooperative work and social computing (CSCW 2015), pp 994–1009, Vancouver, CA

  • Pan W, Zhong E, Yang Q (2012) Transfer learning for text mining. In: Aggarwal CC, Zhai CX (eds) Mining text data. Springer, Heidelberg, pp 223–258

  • Qureshi MA, O’Riordan C, Pasi G (2013) Clustering with error estimation for monitoring reputation of companies on Twitter. In: Proceedings of the 9th Asia Information Retrieval Societies Conference (AIRS 2013), pp 170–180, Singapore, SG

  • Rosenthal S, Nakov P, Kiritchenko S, Mohammad S, Ritter A, Stoyanov V (2015) SemEval-2015 Task 10: sentiment analysis in Twitter. In: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015), pp 451–463, Denver, US

  • Rosenthal S, Ritter A, Nakov P, Stoyanov V (2014) SemEval-2014 Task 9: sentiment analysis in Twitter. In: Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014), pp 73–80, Dublin, IE

  • Saerens M, Latinne P, Decaestecker C (2002) Adjusting the outputs of a classifier to new a priori probabilities: a simple procedure. Neural Comput 14(1):21–41

  • Saif H, Fernández M, He Y, Alani H (2013) Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold. In: Proceedings of the 1st international workshop on emotion and sentiment in social and expressive media (ESSEM 2013), pp 9–21, Torino, IT

  • Sánchez L, González V, Alegre E, Alaiz R (2008) Classification and quantification based on image analysis for sperm samples with uncertain damaged/intact cell proportions. In: Proceedings of the 5th international conference on image analysis and recognition (ICIAR 2008), pp 827–836, Póvoa de Varzim, PT

  • Takahashi T, Abe S, Igata N (2011) Can Twitter be an alternative of real-world sensors? In: Proceedings of the 14th international conference on human–computer interaction (HCI International 2011), pp 240–249, Orlando, US

  • Tang L, Gao H, Liu H (2010) Network quantification despite biased labels. In: Proceedings of the 8th workshop on mining and learning with graphs (MLG 2010), pp 147–154, Washington, US

  • Tsochantaridis I, Joachims T, Hofmann T, Altun Y (2005) Large margin methods for structured and interdependent output variables. J Mach Learn Res 6:1453–1484

  • Vapnik V (1998) Statistical learning theory. Wiley, New York

  • Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83

  • Wu T-F, Lin C-J, Weng RC (2004) Probability estimates for multi-class classification by pairwise coupling. J Mach Learn Res 5:975–1005

  • Xue JC, Weiss GM (2009) Quantification and semi-supervised classification methods for handling changes in class distribution. In: Proceedings of the 15th ACM international conference on knowledge discovery and data mining (SIGKDD 2009), pp 897–906, Paris, FR

  • Zhang Z, Zhou J (2010) Transfer estimation of evolving class priors in data stream classification. Pattern Recognit 43(9):3151–3161

  • Zhu X, Kiritchenko S, Mohammad SM (2014) NRC-Canada-2014: recent improvements in the sentiment analysis of tweets. In: Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014), pp 443–447, Dublin, IE

  • Zou F, Wang Y, Yang Y, Zhou K, Chen Y, Song J (2015) Supervised feature learning via L2-norm regularized logistic regression for 3D object recognition. Neurocomputing 151:603–611

Acknowledgments

We are grateful to Chih-Chung Chang and Chih-Jen Lin for making LIBSVM available, to Rong-En Fan and colleagues for making LIBLINEAR available, to Thorsten Joachims for making SVM-perf available, to Andrea Esuli for making available the code for obtaining SVM(KLD) from SVM-perf, to José Barranquero for making available the code for obtaining SVM(Q) from SVM-perf, to Shuai Li for pointing out a small mistake in a previous version, and to Carlos Castillo for several pointers to the literature.

Author information

Corresponding author

Correspondence to Fabrizio Sebastiani.

Additional information

This is an extended version of a paper with the title “Tweet Sentiment: From Classification to Quantification” which appears in the Proceedings of the 7th ACM/IEEE International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2015). Fabrizio Sebastiani is on leave from Consiglio Nazionale delle Ricerche, Italy.

About this article

Cite this article

Gao, W., Sebastiani, F. From classification to quantification in tweet sentiment analysis. Soc. Netw. Anal. Min. 6, 19 (2016). https://doi.org/10.1007/s13278-016-0327-z

  • DOI: https://doi.org/10.1007/s13278-016-0327-z
