Abstract
To improve a tweet on Twitter, we would like to estimate the effectiveness of a draft before it is sent. The total number of retweets a tweet receives can serve as a measure of its effectiveness. To estimate the number of retweets for an author, we propose a procedure that learns a personalized model from the author's past tweets. We propose three new types of features based on the contents of the tweets: Entity, Pair, and Topic. Empirical results from seven authors indicate that the Pair and Topic features yield statistically significant improvements in the correlation coefficient between the estimated and the actual numbers of retweets. We also study different combinations of the three types of features; many of these combinations significantly improve the results further.
Notes
We define Domain Stop Words as words that belong to the web site rather than to the article. Across pages from the same web site (domain), the words in the menu, and even in the advertisements, are usually the same. For this reason, we generate an independent list of stop words for each domain. A stop word in a domain is a word that appears in more than 80% of the pages we crawled from that domain. When we extract features from a web page, we remove the words listed in the Domain Stop Words for its domain.
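The note above can be illustrated with a minimal sketch. It assumes each crawled page has already been tokenized into a list of words and that all pages passed in come from a single domain; the function names `domain_stop_words` and `remove_domain_stop_words` are hypothetical and are not taken from the article. Only the 80% threshold comes from the text.

```python
from collections import Counter


def domain_stop_words(pages, threshold=0.8):
    """Return the words that appear in more than `threshold` of the pages.

    `pages` is a list of token lists, one per crawled page, all from the
    same domain (an assumption of this sketch).
    """
    if not pages:
        return set()
    page_counts = Counter()
    for tokens in pages:
        # Count each word at most once per page (document frequency).
        page_counts.update(set(tokens))
    cutoff = threshold * len(pages)
    # Strictly "more than" the threshold, as stated in the note.
    return {word for word, n in page_counts.items() if n > cutoff}


def remove_domain_stop_words(tokens, stop_words):
    """Drop domain stop words from one page's tokens before feature extraction."""
    return [t for t in tokens if t not in stop_words]


# Toy example: "home" is on all 5 pages, "about" on exactly 4 of 5 (80%).
pages = [
    ["home", "about", "cats"],
    ["home", "about", "dogs"],
    ["home", "about", "birds"],
    ["home", "about", "fish"],
    ["home", "news", "ants"],
]
stop_words = domain_stop_words(pages)
cleaned = remove_domain_stop_words(pages[0], stop_words)
```

In this toy input only "home" qualifies: "about" appears in exactly 80% of the pages, which is not more than the threshold, so it is kept.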
Cite this article
Sun, X., Chan, P.K. Estimating effectiveness of twitter messages with a personalized machine learning approach. Knowl Inf Syst 56, 27–53 (2018). https://doi.org/10.1007/s10115-017-1088-3
DOI: https://doi.org/10.1007/s10115-017-1088-3