
Predicting the popularity of tweets using internal and external knowledge: an empirical Bayes type approach


The problem of tweet popularity prediction, that is, forecasting the total number of retweets stemming from an ancestral tweet, has attracted considerable interest recently. The prediction can be accomplished by fitting a point process model to the sequence of retweet times up to a certain censoring time and projecting the fitted model to a future time point. However, models employing such an approach tend to have inferior prediction accuracy when the censoring time is too early for sufficient information to have accumulated. To overcome this, we propose an empirical Bayes type approach to parameter estimation that combines internal knowledge on the times of historical retweets up to the censoring time with external knowledge on complete retweet sequences in the training data. We demonstrate the approach using several point process models with finite-dimensional parameters, where the prior distribution for the parameter of each model is constructed from the external knowledge, and the likelihood is calculated from the internal knowledge. The mode of the posterior distribution is used as the estimator of the finite-dimensional parameter, and the mean of the predictive distribution for the number of retweets implied by each of the estimated models is used to predict the tweet popularity. Using a large Twitter data set, we show that the proposed methodology not only enables prediction at time zero, before the arrival of any retweet event, but also substantially improves the prediction performance of existing models, especially at earlier censoring times.
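The estimation and prediction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several simplifying assumptions are swapped in: the retweet process is taken to be an inhomogeneous Poisson process with intensity a * b * exp(-b * t), the prior on the log-parameters is an independent Gaussian (standing in for a prior built from training data), and the posterior mode is found with a simple compass search. All function and variable names are hypothetical.

```python
import math

def log_likelihood(eta, times, T):
    """Internal knowledge: log-likelihood of retweet times observed on [0, T].

    Assumed model: Poisson process with intensity a * b * exp(-b * t),
    where (a, b) = (exp(eta[0]), exp(eta[1])).
    """
    a, b = math.exp(eta[0]), math.exp(eta[1])
    n = len(times)
    # Sum of log-intensities at the events minus the integrated intensity.
    return n * (eta[0] + eta[1]) - b * sum(times) - a * (1.0 - math.exp(-b * T))

def log_prior(eta, prior_mean, prior_sd):
    """External knowledge: assumed independent Gaussian log-prior on eta."""
    return -0.5 * sum(((e - m) / s) ** 2
                      for e, m, s in zip(eta, prior_mean, prior_sd))

def map_estimate(times, T, prior_mean, prior_sd, step=1.0, tol=1e-8):
    """Maximize log-likelihood + log-prior by compass search from the prior mode."""
    def criterion(eta):
        return log_likelihood(eta, times, T) + log_prior(eta, prior_mean, prior_sd)

    eta = list(prior_mean)
    best = criterion(eta)
    while step > tol:
        improved = False
        for j in range(len(eta)):
            for delta in (step, -step):
                trial = list(eta)
                trial[j] += delta
                value = criterion(trial)
                if value > best:
                    eta, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # shrink the search radius when no move helps
    return eta

def predict_total(eta, n_observed, T):
    """Observed count plus the expected number of retweets after time T."""
    a, b = math.exp(eta[0]), math.exp(eta[1])
    return n_observed + a * math.exp(-b * T)

# Illustrative inputs only (not data from the paper).
prior_mean, prior_sd = (math.log(20.0), 0.0), (1.0, 1.0)
times, T = [0.1, 0.3, 0.4, 0.9, 1.5], 2.0
eta_hat = map_estimate(times, T, prior_mean, prior_sd)
print(predict_total(eta_hat, len(times), T))
```

With an empty event list and T = 0, the likelihood term is constant, so the estimate falls back to the prior mode. This mirrors how a prior built from external knowledge makes prediction possible before any retweet arrives.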


Data Availability Statement

Data are available from






The authors gratefully acknowledge the constructive comments from the reviewers, which have led to improved presentation. This research includes computations using the computational cluster Katana supported by Research Technology Services at UNSW Sydney. The research also benefited from the assistance of resources from the National Computational Infrastructure (NCI), supported by the Australian Government.


Tan was supported by UMK Fundamental Research Grant [R/FUND/A0100/01348A/001/2020/00840]. Chen was partly supported by UNSW Science Faculty Research Grant [PS35307].

Author information

Authors and Affiliations


Corresponding author

Correspondence to Wai Hong Tan.

Ethics declarations

Conflicts of interest

Not applicable.

Code availability

Code is available from the authors upon request.




See Fig. 5.

Fig. 5. A summary of the procedures involved in obtaining the empirical Bayes estimates. The final criterion function combines the knowledge internal and external to a retweet sequence, which depend, respectively, on the current log-likelihood function and the log-prior density function. When the censoring time is zero, the maximizer of the prior density function is \({\tilde{\eta }}^0\), and \(e^{{\tilde{\eta }}^0}\) is taken as the estimator of the tweet-specific model parameters.
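One way to write the criterion described in the caption is given here as a sketch. The symbols \(\ell _T\) (log-likelihood of the retweet times observed up to censoring time \(T\)) and \(\pi \) (prior density of the log-parameter \(\eta \)) are notation assumed for illustration and may differ from the paper's own.

\[
{\tilde{\eta }} = \mathop {\mathrm {arg\,max}}_{\eta } \left\{ \ell _T(\eta ) + \log \pi (\eta ) \right\}, \qquad {\tilde{\eta }}^0 = \mathop {\mathrm {arg\,max}}_{\eta } \, \pi (\eta ).
\]

At \(T = 0\) no retweet has been observed, so the likelihood term is constant in \(\eta \) and the criterion reduces to the log-prior alone. This is why the maximizer of the prior density serves as the estimate at time zero.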


About this article


Cite this article

Tan, W.H., Chen, F. Predicting the popularity of tweets using internal and external knowledge: an empirical Bayes type approach. AStA Adv Stat Anal 105, 335–352 (2021).



  • Empirical Bayes
  • Kernel smoothing
  • Maximum a posteriori (MAP) estimation
  • Nonparametric regression