Improving content popularity prediction with k-means clustering and deep-belief networks

Multimedia Tools and Applications

Abstract

User-Generated Content (UGC) is becoming the predominant type of internet traffic. Content popularity prediction plays a pivotal role in managing this large-scale traffic, and it is therefore an increasingly important area of research in computer networking. Popularity prediction methods are generally classified into two groups: feature-driven and early-stage. Feature-driven methods predict content popularity before publication, whereas early-stage methods monitor the popularity of content shortly after publication to forecast its future popularity. Many papers have shown that early-stage methods perform better than feature-driven methods. In this paper, we improve the performance of early-stage popularity prediction by first partitioning the data into several clusters using k-means clustering with Pearson correlation distance, and then training a Deep-Belief Network (DBN) for each cluster. We evaluate our method on a dataset of YouTube videos and show that using a generative model such as a DBN for time series prediction significantly improves performance. Numerical results indicate that our proposed method outperforms other state-of-the-art methods, reducing Mean Absolute Percentage Error (MAPE) and mean Relative Square Error (mRSE) by up to 47.86% and 25.18%, respectively.
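
The clustering step and the two error metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each video is represented by a non-constant series of early view counts, that the Pearson correlation distance is 1 - r, and that mRSE is the mean of squared relative errors; the farthest-first initialization and the function names (`standardize`, `kmeans_pearson`, `mape`, `mrse`) are hypothetical choices made for illustration. The per-cluster DBN training is not shown.

```python
import numpy as np

def standardize(X):
    """Center each row and scale it to unit norm (rows must be non-constant)."""
    Xc = X - X.mean(axis=1, keepdims=True)
    return Xc / np.linalg.norm(Xc, axis=1, keepdims=True)

def kmeans_pearson(X, k, n_iter=50):
    """k-means on time series using the distance 1 - Pearson correlation."""
    Z = standardize(np.asarray(X, dtype=float))
    # Assumed initialization: farthest-first traversal starting from row 0.
    chosen = [0]
    while len(chosen) < k:
        d = (1.0 - Z @ Z[chosen].T).min(axis=1)
        chosen.append(int(d.argmax()))
    centroids = Z[chosen].copy()
    labels = np.full(len(Z), -1)
    for _ in range(n_iter):
        # For unit-norm centered rows, the dot product is the correlation.
        dist = 1.0 - Z @ standardize(centroids).T
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            members = Z[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return labels, centroids

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

def mrse(actual, predicted):
    """Mean Relative Square Error (assumed definition: mean squared relative error)."""
    return float(np.mean((predicted / actual - 1.0) ** 2))

# Toy data: two rising and two falling view-count series at different scales.
views = np.array([[1, 2, 3, 4], [10, 20, 30, 41],
                  [4, 3, 2, 1], [40, 31, 20, 10]], dtype=float)
labels, _ = kmeans_pearson(views, k=2)
```

Because the Pearson distance compares the shape of a series rather than its magnitude, series with similar growth patterns fall into the same cluster even when their view counts differ by an order of magnitude.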

Author information

Corresponding author

Correspondence to Mohammad Reza Khayyambashi.

About this article

Cite this article

Nia, Z.M., Khayyambashi, M.R. Improving content popularity prediction with k-means clustering and deep-belief networks. Multimed Tools Appl 80, 15745–15764 (2021). https://doi.org/10.1007/s11042-020-10463-x

  • DOI: https://doi.org/10.1007/s11042-020-10463-x
