Estimating effectiveness of twitter messages with a personalized machine learning approach

Sun, Xunhu; Chan, Philip K.

doi:10.1007/s10115-017-1088-3

Regular Paper
Published: 22 August 2017

Volume 56, pages 27–53, (2018)
Knowledge and Information Systems

Abstract

To improve a tweet on Twitter, we would like to estimate the effectiveness of a draft before it is sent. The total number of retweets a tweet receives can serve as a measure of its effectiveness. To estimate the number of retweets for an author, we propose a procedure that learns a personalized model from the author's past tweets. We propose three new types of features based on the contents of the tweets: Entity, Pair, and Topic. Empirical results from seven authors indicate that the Pair and Topic features yield statistically significant improvements in the correlation coefficient between the estimated and actual numbers of retweets. We also study different combinations of the three feature types; many of the combinations significantly improve the results further.
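
As an illustration of the evaluation measure mentioned in the abstract, the following minimal sketch computes a correlation coefficient between estimated and actual retweet counts. The abstract does not say which coefficient is used; Pearson's is assumed here, and the function name `pearson_correlation` and the sample numbers are hypothetical.

```python
import math


def pearson_correlation(estimates, actuals):
    """Pearson correlation between two equally long sequences of numbers.

    Assumption: Pearson's coefficient is used; the abstract only says
    "correlation coefficient".
    """
    n = len(estimates)
    if n != len(actuals) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_e = sum(estimates) / n
    mean_a = sum(actuals) / n

    # Sums of centered products and centered squares.
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(estimates, actuals))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_a = sum((a - mean_a) ** 2 for a in actuals)

    if var_e == 0 or var_a == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_e * var_a)


# Hypothetical estimated vs. actual retweet counts for one author.
estimated = [12.0, 3.5, 40.0, 8.0]
actual = [10, 2, 55, 9]
print(pearson_correlation(estimated, actual))
```

A value close to 1 would mean the estimates rank and scale the tweets much like the observed retweet counts do.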

Notes

http://hootsuite.com.
http://mobile.twitter.com/trends.
We define Domain Stop Words as words that belong to the web site rather than to the article. Across pages from the same web site (domain), the words in the menu, and even in the advertisements, are usually the same. For this reason, we generate an independent list of stop words for each domain. A stop word for a domain is a word that appears in more than 80% of the pages we crawled from that domain. When we extract features from a web page, we remove the words listed in its domain's Domain Stop Words.
http://mallet.cs.umass.edu/index.php.
http://twitter.com/who_to_follow/interests/social-good.
http://twitter4j.org/en/index.html.
http://dev.twitter.com/streaming/overview.
http://dev.twitter.com/rest/public.
http://www.cs.waikato.ac.nz/ml/weka/.
http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
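
The Domain Stop Words note above describes a per-domain filter: a word counts as a stop word for a domain when it appears in more than 80% of the crawled pages from that domain. A minimal sketch of that idea follows. The whitespace tokenization, the use of the URL host as the domain, and the function names `domain_stop_words` and `remove_domain_stop_words` are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def domain_stop_words(pages, threshold=0.8):
    """Build a stop-word set per domain from (url, text) pairs.

    Assumptions: the domain is the URL's host, and words are obtained by
    lowercasing and splitting on whitespace.
    """
    page_counts = Counter()          # number of crawled pages per domain
    doc_freq = defaultdict(Counter)  # per domain: pages containing each word
    for url, text in pages:
        domain = urlparse(url).netloc
        page_counts[domain] += 1
        # Count each word at most once per page.
        doc_freq[domain].update(set(text.lower().split()))

    # Keep words that appear in MORE than `threshold` of the domain's pages.
    return {
        domain: {w for w, n in freq.items() if n > threshold * page_counts[domain]}
        for domain, freq in doc_freq.items()
    }


def remove_domain_stop_words(url, text, stop_words):
    """Drop the page's domain stop words before feature extraction."""
    blocked = stop_words.get(urlparse(url).netloc, set())
    return [w for w in text.lower().split() if w not in blocked]


# Hypothetical crawl of five pages from one domain.
pages = [
    ("http://example.com/a", "home login alpha news"),
    ("http://example.com/b", "home login beta news"),
    ("http://example.com/c", "home login gamma news"),
    ("http://example.com/d", "home login delta news"),
    ("http://example.com/e", "home login epsilon"),
]
stop_words = domain_stop_words(pages)
print(remove_domain_stop_words(pages[0][0], pages[0][1], stop_words))
```

In this sample, "news" appears in exactly 80% of the pages, so under the strict "more than 80%" rule it is kept, while "home" and "login" are removed.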

Author information

Authors and Affiliations

Department of Computer Sciences, Florida Institute of Technology, 150 W. University Blvd., Melbourne, FL, 32901, USA
Xunhu Sun & Philip K. Chan

Corresponding author

Correspondence to Xunhu Sun.

About this article

Cite this article

Sun, X., Chan, P.K. Estimating effectiveness of twitter messages with a personalized machine learning approach. Knowl Inf Syst 56, 27–53 (2018). https://doi.org/10.1007/s10115-017-1088-3

Received: 03 November 2015
Revised: 20 June 2017
Accepted: 28 July 2017
Published: 22 August 2017
Issue Date: July 2018
DOI: https://doi.org/10.1007/s10115-017-1088-3
