Abstract
The live streaming of large-scale events over the Internet attracts a highly diverse audience. Despite the progress in streaming technologies, which employ adaptive bitrate to dynamically adjust the streaming of individual clients to reflect their resource availability, content providers still struggle to efficiently match provisioned capacity and demand in such events. In this paper, we present an in-depth characterization and modeling of client behavior during the live streaming of the 2018 FIFA World Soccer Cup. We analyze logs covering more than 60 million streaming sessions collected from servers of one of the major content providers in Latin America. We characterize key features of the workload, such as the number and duration of sessions as well as transmission quality (i.e., bitrate), shedding new light on the workload of a major content provider during a unique large-scale streaming event. We then propose a simple hierarchical model to describe the typical behavior of individual clients both at the session and video bitrate adaptation layers. Taking a step further, we employ unsupervised clustering to identify classes of client behavior and generate specialized models, one for each class. Our evaluation shows that the specialized models are more accurate than the single general model, as they can better describe the diversity of client behavior patterns.
Notes
The matches were also broadcast on television and radio.
We have evaluated different threshold values, ranging from 30s to 180s, with quantitatively similar results.
The dataset covers 46 of the 64 matches.
Note that the median client arrival rate for high audience transmissions during the previous edition of the same event, that is, during the 2014 FIFA World Cup, was only 90 clients per second [10]. Thus, there was an increase by a factor of 4 between the two events. This illustrates the great increase in the popularity of live streaming.
During the session buffering stage, the client requests 264 kbps segments to speed up the buffer filling. Therefore, very short sessions have most of their segments at this bitrate.
ISP popularity was estimated by the number of client sessions.
The weighted average MSE, referred to as wMSE, is defined as \(\sum _i p_i \times {\mathrm {MSE}}_i\), where \(p_i\) is the fraction of clients in group i and \({\mathrm {MSE}}_i\) is the mean squared error of the fitted distribution for group i.
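As a minimal sketch of the wMSE definition above, the following Python snippet computes the weighted average from per-group client counts and per-group MSE values. The function name `weighted_mse` and the numbers in the example are hypothetical and chosen only for illustration; they are not taken from the dataset.

```python
def weighted_mse(group_sizes, group_mses):
    """Return sum_i p_i * MSE_i, where p_i is the fraction of clients in group i.

    group_sizes: number of clients in each group (hypothetical input format).
    group_mses:  MSE of the fitted distribution for each group.
    """
    if len(group_sizes) != len(group_mses):
        raise ValueError("one MSE value is required per group")
    total = sum(group_sizes)
    if total <= 0:
        raise ValueError("total number of clients must be positive")
    # p_i = n_i / total, so the weights sum to one.
    return sum((n / total) * mse for n, mse in zip(group_sizes, group_mses))


# Illustrative values only: three groups with 60%, 30%, and 10% of the clients.
example_wmse = weighted_mse([600, 300, 100], [0.01, 0.02, 0.05])
print(round(example_wmse, 6))  # 0.6*0.01 + 0.3*0.02 + 0.1*0.05 = 0.017
```

Because the weights are client fractions, a poorly fitted but small group contributes little to wMSE, whereas a poor fit for a large group dominates it.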
References
Cisco vni forecast highlights tool: (2018). https://www.cisco.com/c/m/en_us/solutions/service-provider/vni-forecast-highlights.html. Accessed 7 Oct 2018
Mahanti, A.: The evolving streaming media landscape. IEEE Internet Comput. 18(1), 4–6 (2014)
Hei, X., Liang, C., Liang, J., Liu, Y., Ross, K.W.: A measurement study of a large-scale p2p IPTV system. IEEE Trans. Multimedia 9(8), 1672–1687 (2007)
Borges, A., Gomes, P., Nacif, J., Mantini, R., Almeida, J.M., Campos, S.: Characterizing SopCast client behavior. Comput. Commun. 35(8), 1004–1016 (2012)
Li, Z., Xie, G., Kaafar, M.A., Salamatian, K.: User behavior characterization of a large-scale mobile live streaming system. In: Proceedings of the 24th International Conference on World Wide Web, pp. 307–313. (2015)
Hu, S., Sun, L., Gui, C., Jammeh, E., Mkwawa, I.H.: Content-aware adaptation scheme for QOE optimized dash applications. In: 2014 IEEE Global Communications Conference, pp. 1336–1341. (2014)
Veloso, E., Almeida, V., Meira Jr., W., Bestavros, A., Jin, S.: A hierarchical characterization of a live streaming media workload. IEEE/ACM Trans. Netw. 14(1), 133–146 (2006)
Wagner, A. Jr.: Avaliação de transmissão ao vivo de grandes eventos pela internet. Master’s thesis, PGCC-UFJF, Juiz de Fora - Minas Gerais (2015)
Dobrian, F., Sekar, V., Awan, A., Stoica, I., Joseph, D., Ganjam, A., Zhan, J., Zhang, H.: Understanding the impact of video quality on user engagement. In: Proceedings of the ACM SIGCOMM 2011 Conference, pp. 362–373. (2011)
Breno, A., Gustavo, C., Wagner, A. Jr., Jussara, A., Italo, C., Alex Borges, V.: Caracterização do comportamento dos clientes de um sistema de vídeo ao vivo durante um evento de larga escala na internet. In: Proceedings of Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos - SBRC, pp. 1–14. (2016)
Martino, T., Danilo, G., Idilio, D., Marco, M., Maurizio, M.: Five years at the edge: watching internet from the ISP network. In: Proceedings of the 14th International Conference on Emerging Networking EXperiments and Technologies, vol. 28(2), pp. 561–574. (2018)
Gouta, A., Hong, C., Hong, D., Kermarrec, A., Lelouedec, Y.: Large scale analysis of http adaptive streaming in mobile networks. In: 2013 IEEE 14th International Symposium on “A World of Wireless, Mobile and Multimedia Networks” (WoWMoM), pp. 1–10. (2013)
Li, C., Liu, J., Ouyang, S.: Large-scale user behavior characterization of online video service in cellular network. IEEE Access 4, 3675–3687 (2016)
Li, Z., Kaafar, M.A., Salamatian, K., Xie, G.: Characterizing and modeling user behavior in a large-scale mobile live streaming system. IEEE Trans. Circuits Syst. Video Technol. 27(12), 2675–2686 (2017)
Chen, Y., Zhang, B., Liu, Y., Zhu, W.: Measurement and modeling of video watching time in a large-scale internet video-on-demand system. IEEE Trans. Multimedia 15(8), 2087–2098 (2013)
Breno, M., Alex, V., Italo, C., Artur, Z.: Evolução do comportamento do usuário em eventos de larga escala na internet. In: Anais do XVIII Workshop em Desempenho de Sistemas Computacionais e de Comunicação, pp. 1–14. (2019)
Nikolas, W., Michael, S., Sebastian, E.-L., Bruno, G., Pedro, C., Raimund, S.: Scoring high: Analysis and prediction of viewer behavior and engagement in the context of 2018 FIFA WC live streaming. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 807–815. (2020)
Marcus Vinicius, C.: Uso de anycast para balanceamento de carga na globo.com. http://pt.slideshare.net/marcuscesario/apresentacao-anycast-sbrc201205 (2012). Accessed 9 April 2021
Katabi, D., Wroclawski, J.: A framework for scalable global ip-anycast (gia). ACM SIGCOMM Comput. Commun. Rev. 30(4), 3–15 (2000)
Schlinker, B., Cunha, Í., Chiu, Y.C., Sundaresan, S., Katz-Bassett, E.: Internet Performance from Facebook’s Edge. In: Proc. ACM IMC, pp. 179–194. (2019)
Sun, P., Vanbever, L., Rexford, J.: Scalable programmable inbound traffic engineering. In: SOSR 2015: Proceedings of the 1st ACM SIGCOMM Symposium on Software Defined Networking Research, pp. 1–7. (2015)
Kurose, J.F., Ross, K.W.: Computer networking: a top-down approach, 7th edn. Pearson (2016)
Delivering live Youtube content via dash. https://developers.google.com/youtube/v3/live/guides/encoding-with-dash (2018). Accessed 9 April 2021
Html5 and video streaming. https://medium.com/netflix-techblog/html5-and-video-streaming-a3563b19eb02 (2018). Accessed 9 April 2021
Moreira, L.: Globo com’s live video platform for fifa world cup. https://youtu.be/-Ej4iDVKfzI (2015). Accessed 9 April 2021
Xu, J., Fan, J., Ammar, M.H., Moon, S.B.: Prefix-preserving IP Address Anonymization: Measurement-based Security Evaluation and a new cryptography-based Scheme. In: IEEE ICNP, pp. 253–272. (2002)
Huysegems, R., De Vleeschauwer, B., De Schepper, K., Hawinkel, C., Wu, T., Laevens, K., Van Leekwijck, W.: Session reconstruction for http adaptive streaming: laying the foundation for network-based QOE monitoring. In: 2012 IEEE 20th International Workshop on Quality of Service, pp. 1–9. (2012)
Guarnieri, T., Drago, I., Vieira, A.B., Cunha, I., Almeida, J.: Characterizing qoe in large-scale live streaming. In: GLOBECOM 2017 - 2017 IEEE Global Communications Conference, pp. 1–7. (2017)
Krishnan, S.S., Sitaraman, R.K.: Video stream quality impacts viewer behavior: Inferring causality using quasi-experimental designs. In: Proceedings of the 2012 Internet Measurement Conference, pp. 2001–2014. (2012)
Mansy, A., Ammar, M., Chandrashekar, J., Sheth, A.: Characterizing client behavior of commercial mobile video streaming services. In: Proceedings of Workshop on Mobile Video Delivery, pp. 1–6. (2014)
Dados de acessos de comunicação multimídia (2020). https://dados.gov.br/dataset/dados-de-acessos-de-comunicacao-multimidia. Accessed 9 Oct 2020
Quantidade mensal de acessos em serviços do serviço móvel pessoal—smp (2020). https://dados.gov.br/dataset/acessos-autorizadas-smp. Accessed 9 Oct 2020
Figueiredo, F., Almeida, J.M., Gonçalves, M.A., Benevenuto, F.: On the dynamics of social media popularity: A youtube case study. ACM Trans. Internet Technol. 14(4), 1–23 (2014)
Gonçalves, G.D., Drago, I., Vieira, A.B., da Silva, A.P.C., Almeida, J.M., Mellia, M.: Workload models and performance evaluation of cloud storage services. Comput. Netw. 109(P2), 183–199 (2016)
Calzarossa, M.C., Massari, L., Tessera, D.: Workload characterization: a survey revisited. ACM Comput. Surv. 48(3), 1–43 (2016)
Norris, J.R.: Markov Chains. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge (1998)
Venables, W.N., Ripley, B.D.: Modern applied statistics with S. Springer Publishing Company, Incorporated, New York (2010)
Chakravarti, I.M., Laha, R.G.: Handbook of Methods of Applied Statistics, vol. 1. Wiley, New York (1967)
Stephens, M.A.: Edf statistics for goodness of fit and some comparisons. J. Am. Stat. Assoc. 69(347), 730–737 (1974)
Forgy, E.: Cluster analysis of multivariate data: Efficiency versus interpretability of classification. Biometrics 21(3), 768–769 (1965)
Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp. 226–231. AAAI Press (1996)
Murtagh, F., Legendre, P.: Ward’s hierarchical agglomerative clustering method: Which algorithms implement ward’s criterion? J. Classif. 31, 274–295 (2014)
Thorndike, R.: Who belongs in the family? Psychometrika 18(4), 267–276 (1953)
Wasserman, L.: All of statistics: a concise course in statistical inference. Springer Publishing Company, Incorporated, New York (2010)
Mok, R.K.P., Chan, E.W.W., Chang, R.K.C.: Measuring the quality of experience of http video streaming. In: 12th IFIP/IEEE International Symposium on Integrated Network Management (IM 2011) and Workshops, pp. 485–492. IEEE (2011)
Ghadiyaram, D., Bovik, A.C., Yeganeh, H., Kordasiewicz, R., Gallant, M.: Study of the effects of stalling events on the quality of experience of mobile streaming videos. In: Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 989–993. (2014)
Balachandran, A., Sekar, V., Akella, A., Seshan, S., Stoica, I., Zhang, H.: Developing a predictive model of quality of experience for internet video. SIGCOMM Comput. Commun. Rev. 43(4), 339–350 (2013)
Zinner, T., Hossfeld, T., Minhas, T.N., Fiedler, M.: Controlled vs. uncontrolled degradations of QoE: The provisioning-delivery hysteresis in case of video. In: EuroITV 2010 Workshop: Quality of Experience for Multimedia Content Sharing (2010)
TvLine: Hbo go feels game of thrones fans’ wrath due to technical difficulties (2019). http://tiny.cc/s4l1kz. Accessed 28 Oct 2019
Cnet.com: Disney plus launch glitches out with service failures, login problems (2019). https://cnet.co/2KekgvV. Accessed 13 Nov 2019
Jin, S., Bestavros, A.: Gismo: a generator of internet streaming media objects and workloads. SIGMETRICS Perform. Eval. Rev. 29, 2–10 (2001)
Busari, M., Williamson, C.: Prowgen: a synthetic workload generation tool for simulation evaluation of web proxy caches. Comput. Netw. 38, 779–794 (2002)
Tang, W., Fu, Y., Cherkasova, L., Vahdat, A.: A synthetic streaming media service workload generator. In: Proceedings of the 13th International Workshop on Network and Operating Systems Support for Digital Audio and Video, pp. 12–21. (2003)
Krishnamurthy, D., Rolia, J.A., Majumdar, S.: A synthetic workload generation technique for stress testing session-based systems. IEEE Trans. Softw. Eng. 32(11), 868–882 (2006)
Ali-Eldin, A., Kihl, M., Tordsson, J., Elmroth, E.: Analysis and characterization of a video-on-demand service workload. In: Proceedings of the 6th ACM Multimedia Systems Conference, pp. 189–200 (2015)
Acknowledgements
The research leading to these results has been partly funded by CNPq, CAPES, FAPEMIG, FAPESP, and the SmartData@PoliTO center for Big Data and Machine Learning technologies. We would like to thank the Globo.com teams for providing the data.
Additional information
Communicated by P. Shenoy.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Best fitted distributions
Table 2 presents the fitting parameters of our behavior model. The clustering process yields seven client clusters. These groups follow different regimes, which is reflected in the variability of the fitted distributions.
The PMF of the Negative Binomial distribution is \(f(k,n,p) = \left( {\begin{array}{c}k+n-1\\ n-1\end{array}}\right) p^n(1-p)^k\) for \(k \ge 0\).
The PDF of the Exponentiated Weibull distribution is \(f(x,a,c) = ac(1-\mathrm{exp}(-x^c))^{a-1}\mathrm{exp}(-x^c)x^{c-1}\) for \(x>0\), \(a>0\) and \(c>0\), with a and c as shape parameters.
The PDF of the Power log-normal distribution is \(f(x,c,s) = \frac{c}{xs}\phi (\mathrm{log}(x)/s)\left( \varPhi (-\mathrm{log}(x)/s)\right) ^{c-1}\) for \(x > 0\), \(s > 0\) and \(c > 0\), where \(\phi\) is the normal PDF, \(\varPhi\) is the normal CDF, and c and s are the shape parameters.
The PDF of the Log-normal distribution is \(f(x,s) = \frac{1}{sx\sqrt{2\pi }}\mathrm{exp}\left( -\frac{1}{2}\left( \frac{\mathrm{log}(x)}{s}\right) ^2\right)\) for \(x > 0\) and \(s > 0\), with s as shape parameter.
The PDF of the Gamma distribution is \(f(x,a) = \frac{x^{a-1}\mathrm{exp}(-x)}{\varGamma (a)}\) for \(x \ge 0\) and \(a > 0\), where \(\varGamma (a)\) is the gamma function and a is the shape parameter. The Erlang distribution is a special case of the Gamma distribution in which the shape parameter a is an integer.
The PDF of the Generalized Gamma distribution is \(f(x,a,c) = \frac{|c|x^{ca-1}\mathrm{exp}(-x^c)}{\varGamma (a)}\) for \(x \ge 0\), \(a > 0\) and \(c \ne 0\), where a and c are the shape parameters and \(\varGamma (a)\) is the gamma function.
The PDF of the Weibull distribution (a.k.a. Frechet right) is \(f(x,c) = cx^{c-1}\mathrm{exp}(-x^c)\) for \(x>0\) and \(c>0\), with c as shape parameter.
The PDF of the Exponential Power distribution is \(f(x,b) = bx^{b-1}\mathrm{exp}(1+x^{b}-\mathrm{exp}(x^{b}))\) for \(x \ge 0\) and \(b > 0\), with b as shape parameter.
The PDF of the Beta distribution is \(f(x,a,b) = \frac{\varGamma (a+b)x^{a-1}(1-x)^{b-1}}{\varGamma (a)\varGamma (b)}\) for \(0 \le x \le 1\), \(a> 0\) and \(b > 0\), where \(\varGamma\) is the gamma function.
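The densities above are written in standardized form, that is, with shape parameters only. As a sanity check on a few of them, the following Python sketch evaluates the Weibull, Exponentiated Weibull, Log-normal, and Gamma PDFs exactly as defined in this appendix and verifies numerically that each integrates to approximately one. The shape values, the integration bounds, and the helper name `total_mass` are assumptions made for illustration; they are not the fitted parameters of Table 2.

```python
import math


def weibull_pdf(x, c):
    # f(x, c) = c x^(c-1) exp(-x^c)
    return c * x ** (c - 1) * math.exp(-(x ** c))


def exponweib_pdf(x, a, c):
    # f(x, a, c) = a c (1 - exp(-x^c))^(a-1) exp(-x^c) x^(c-1)
    e = math.exp(-(x ** c))
    return a * c * (1.0 - e) ** (a - 1) * e * x ** (c - 1)


def lognorm_pdf(x, s):
    # f(x, s) = exp(-(log(x)/s)^2 / 2) / (s x sqrt(2 pi))
    z = math.log(x) / s
    return math.exp(-0.5 * z * z) / (s * x * math.sqrt(2.0 * math.pi))


def gamma_pdf(x, a):
    # f(x, a) = x^(a-1) exp(-x) / Gamma(a)
    return x ** (a - 1) * math.exp(-x) / math.gamma(a)


def total_mass(pdf, lo=1e-9, hi=1e3, n=20000):
    """Trapezoidal rule on a log-spaced grid over [lo, hi] (assumed bounds)."""
    ratio = (hi / lo) ** (1.0 / n)
    total, x0, f0 = 0.0, lo, pdf(lo)
    for _ in range(n):
        x1 = x0 * ratio
        f1 = pdf(x1)
        total += 0.5 * (f0 + f1) * (x1 - x0)
        x0, f0 = x1, f1
    return total


# Illustrative shape parameters (not the values reported in Table 2).
masses = {
    "weibull": total_mass(lambda x: weibull_pdf(x, 1.5)),
    "exponweib": total_mass(lambda x: exponweib_pdf(x, 2.0, 1.5)),
    "lognorm": total_mass(lambda x: lognorm_pdf(x, 0.5)),
    "gamma": total_mass(lambda x: gamma_pdf(x, 2.5)),
}
for name, mass in masses.items():
    print(f"{name}: integral of PDF = {mass:.4f}")
```

A density that does not integrate to one under such a check usually points to a transcription error in the formula or to a missing location or scale transformation.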
Cite this article
Guarnieri, T., Drago, I., Cunha, Í. et al. Modeling large-scale live video streaming client behavior. Multimedia Systems 27, 1101–1124 (2021). https://doi.org/10.1007/s00530-021-00788-4
DOI: https://doi.org/10.1007/s00530-021-00788-4