Prediction of News Popularity via Keywords Extraction and Trends Tracking

Part of the Communications in Computer and Information Science book series (CCIS, volume 1357)

Abstract

In recent years, news agencies have become more influential in various social groups. At the same time, the media industry has started to monetize articles distributed online through contextual advertising. However, the efficiency of online marketing depends heavily on the popularity of news articles. In this work, we present an alternative and effective way of forecasting article popularity with a two-step approach: extraction of article keywords, followed by keyword-based prediction of article popularity. We show the benefits of this technique and compare it with widely used methods, such as text embeddings and BERT-based methods. Moreover, we provide the architecture of a model for dynamic keyword tracking, trained on a new dataset of more than 280k Russian news articles and 22k keywords for popularity forecasting.
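The two-step approach named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how keywords are extracted or how popularity is predicted from them, so everything below is an assumption made for illustration: TF-IDF ranking stands in for the keyword extractor, a running mean of view counts per keyword stands in for the popularity model, and the names `extract_keywords` and `KeywordPopularityModel` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def extract_keywords(docs, top_k=3):
    """Step 1 (stand-in): return the top_k terms of each document by TF-IDF.

    This is an illustrative substitute, not the extractor used in the paper.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    result = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        scores = {
            term: (count / len(tokens))
            * math.log((1 + n_docs) / (1 + doc_freq[term]))
            for term, count in term_freq.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result.append(ranked[:top_k])
    return result


class KeywordPopularityModel:
    """Step 2 (stand-in): track the mean view count observed per keyword."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)
        self.global_total = 0.0
        self.global_count = 0

    def update(self, keywords, views):
        """Record one published article: its keywords and its view count."""
        for keyword in keywords:
            self.total[keyword] += views
            self.count[keyword] += 1
        self.global_total += views
        self.global_count += 1

    def predict(self, keywords):
        """Average the per-keyword means; fall back to the global mean."""
        known = [
            self.total[keyword] / self.count[keyword]
            for keyword in keywords
            if self.count.get(keyword)
        ]
        if known:
            return sum(known) / len(known)
        if self.global_count:
            return self.global_total / self.global_count
        return 0.0
```

Calling `update` as new articles arrive lets the per-keyword statistics follow current trends, which is the intuition behind tracking keywords dynamically. A learned model would replace the running mean in practice.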

Keywords

  • Online news popularity forecasting
  • Keyword extraction
  • Popularity prediction
  • BERT
  • Text embedding

Notes

  1. https://www.liveinternet.ru/stat/lenta.ru/summary.html?lang=en

  2. https://ria.ru/

  3. https://www.kaggle.com/yutkin/corpus-of-russian-news-articles-from-lenta

  4. https://www.liveinternet.ru/stat/RS_Total/Riaru_Total/summary.html

  5. https://toloka.yandex.com/

  6. http://docs.deeppavlov.ai/en/master/features/models/bert.html

  7. https://github.com/google-research/bert/issues/462

Author information

Correspondence to Alexander Pugachev, Anton Voronov or Ilya Makarov.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Pugachev, A., Voronov, A., Makarov, I. (2021). Prediction of News Popularity via Keywords Extraction and Trends Tracking. In: , et al. Recent Trends in Analysis of Images, Social Networks and Texts. AIST 2020. Communications in Computer and Information Science, vol 1357. Springer, Cham. https://doi.org/10.1007/978-3-030-71214-3_4

  • DOI: https://doi.org/10.1007/978-3-030-71214-3_4

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-71213-6

  • Online ISBN: 978-3-030-71214-3

  • eBook Packages: Computer Science, Computer Science (R0)