Multimedia Tools and Applications, Volume 77, Issue 1, pp 1409–1436

Modeling and predicting the popularity of online news based on temporal and content-related features

  • Steven Van Canneyt
  • Philip Leroux
  • Bart Dhoedt
  • Thomas Demeester
Article

Abstract

As the market for globally available online news is large and still growing, online publishers compete strongly to reach the largest possible audience. An intelligent online publishing strategy is therefore of the highest importance to publishers. A prerequisite for optimizing any online strategy is a trustworthy prediction of how popular new online content may become. This paper presents a novel methodology to model and predict the popularity of online news. We first introduce a new strategy and mathematical model to capture view patterns of online news. After a thorough analysis of such view patterns, we show that well-chosen base functions lead to suitable models, and that the influence of day versus night on the total view patterns can be taken into account to further increase accuracy without making the models more complex. Second, we turn to the prediction of future popularity, given recently published content. Using a new real-world dataset, we show that combining features related to content, meta-data, and temporal behavior leads to significantly improved predictions compared to existing approaches, which only consider features based on the historical popularity of the considered articles. Whereas linear regression is traditionally used for this application, we show that the more expressive gradient tree boosting method is beneficial for predicting news popularity.

Keywords

Online news; Popularity modeling; Popularity prediction; Regression; Feature engineering

Notes

Acknowledgments

We thank Ke Zhou for useful suggestions on drafts of the manuscript. Steven Van Canneyt is funded by a Ph.D. grant of the Agency for Innovation by Science and Technology in Flanders (IWT). Part of the presented research was performed within the MIX-ICON project PROVIDENCE, facilitated by iMinds-Media and funded by the IWT.

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Steven Van Canneyt (1)
  • Philip Leroux (1)
  • Bart Dhoedt (1)
  • Thomas Demeester (1)

  1. Department of Information Technology, Ghent University - iMinds, Ghent, Belgium
