
Multi-task Learning for User Engagement and Adoption in Live Video Streaming Events


Part of the Lecture Notes in Computer Science book series (LNAI, volume 12979)

Abstract

Nowadays, live video streaming events have become a mainstay of communication among viewers in large international enterprises. Given that viewers are distributed worldwide, the main challenge lies in how to schedule an event at the optimal time so as to improve both viewer engagement and adoption. In this paper we present a multi-task deep reinforcement learning model that selects the time of a live video streaming event, aiming to optimize viewer engagement and adoption at the same time. We consider the engagement and the adoption of the viewers as independent tasks and formulate a unified loss function to learn a common policy. In addition, we account for the fact that each task might contribute differently to the agent's training strategy. Therefore, to determine the contribution of each task to the agent's training, we design a Transformer architecture over the state-action transitions of each task. We evaluate our proposed model on four real-world datasets, generated by the live video streaming events of four large enterprises spanning January 2019 to March 2021. Our experiments demonstrate the effectiveness of the proposed model compared with several state-of-the-art strategies. For reproducibility, our evaluation datasets and implementation are publicly available at https://github.com/stefanosantaris/merlin.
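To make the abstract's description concrete, here is a minimal, hypothetical sketch of how a Transformer-weighted multi-task loss could be wired up. It assumes PyTorch; all class and function names, the dimensions, and the mean-pooled scoring head are our own illustrative choices, not the authors' implementation (see the linked repository for the actual code).

```python
import torch
import torch.nn as nn


class TaskContribution(nn.Module):
    """Encode one task's recent state-action transitions with a Transformer
    and pool them into a scalar contribution score (illustrative sketch)."""

    def __init__(self, transition_dim, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.proj = nn.Linear(transition_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.score = nn.Linear(d_model, 1)

    def forward(self, transitions):
        # transitions: (batch, seq_len, transition_dim), e.g. concatenated
        # state and action vectors from the task's recent events.
        h = self.encoder(self.proj(transitions))
        return self.score(h.mean(dim=1)).mean()  # one scalar per task


def unified_loss(task_losses, task_scores):
    """Weight each task's policy loss by a softmax over the contribution
    scores and sum into a single objective for the shared policy."""
    weights = torch.softmax(torch.stack(task_scores), dim=0)
    return (weights * torch.stack(task_losses)).sum()


# The paper's two tasks: viewer engagement and viewer adoption.
engagement_net = TaskContribution(transition_dim=32)
adoption_net = TaskContribution(transition_dim=32)
eng_score = engagement_net(torch.randn(8, 10, 32))  # batch of 8 windows
ado_score = adoption_net(torch.randn(8, 10, 32))
eng_loss = torch.tensor(1.5)  # stand-ins for the per-task policy losses
ado_loss = torch.tensor(0.7)
total = unified_loss([eng_loss, ado_loss], [eng_score, ado_score])
```

The softmax over task scores is one plausible way to let the agent learn how much each task should drive the common policy's updates; the paper's actual weighting scheme may differ.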

Keywords

  • Multi-task learning
  • Reinforcement learning
  • Live video streaming


Notes

  1. We consider only the l previous events to capture the most recent viewers' behavior. As we demonstrate in Sect. 4, considering large values of l does not necessarily improve the model's performance.

  2. Given the high risk of evaluating the learned policy \(\pi _{\theta }\) directly in the enterprises' live environments, in our study we perform offline A/B testing based on the events of each dataset [9, 30]; a minimal estimator sketch follows these notes.

  3. https://github.com/braemt/attentive-multi-task-deep-reinforcement-learning.

  4. https://github.com/deepmind/scalable_agent.

  5. https://github.com/stefanosantaris/merlin.
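
As referenced in footnote 2, here is a minimal sketch of what such an offline evaluation can look like: the self-normalized importance sampling estimator [22] scores a target policy using events logged under the deployed (logging) policy. The function name and the NumPy framing are our own assumptions; [9, 30] describe offline A/B testing protocols in detail.

```python
import numpy as np

def snips_value(rewards, target_probs, logging_probs):
    """Self-normalized importance sampling estimate of a target policy's
    value from logged events (illustrative helper, cf. [22])."""
    w = target_probs / logging_probs  # importance weights pi(a|s) / mu(a|s)
    return np.sum(w * rewards) / np.sum(w)

# Logged events: rewards plus both policies' probabilities for the
# actions that were actually taken.
rewards = np.array([1.0, 0.0, 1.0, 1.0])
mu = np.array([0.5, 0.4, 0.2, 0.3])   # logging (deployed) policy
pi = np.array([0.7, 0.1, 0.6, 0.4])   # learned policy pi_theta
print(snips_value(rewards, pi, mu))   # estimated value of pi_theta
```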

References

  1. Break up your big virtual meetings. https://hbr.org/2020/04/break-up-your-big-virtual-meetings (2020). Accessed 19 Mar 2021

  2. Gauging demand for enterprise streaming - 2020 - investment trends in times of global change. https://www.ibm.com/downloads/cas/DEAKXQ5P (2020). Accessed 29 Jan 2021

  3. Using video for internal corporate communications, training & compliance. https://www.ibm.com/downloads/cas/M0R85GDQ (2021). Accessed 30 Mar 2021

  4. Antaris, S., Rafailidis, D.: VStreamDRLS: dynamic graph representation learning with self-attention for enterprise distributed video streaming solutions. In: ASONAM, pp. 486–493 (2020)

  5. Bräm, T., Brunner, G., Richter, O., Wattenhofer, R.: Attentive multi-task deep reinforcement learning. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds.) ECML, pp. 134–149 (2020)

  6. Calandriello, D., Lazaric, A., Restelli, M.: Sparse multi-task reinforcement learning. In: NIPS (2014)

  7. Chen, M., Beutel, A., Covington, P., Jain, S., Belletti, F., Chi, E.H.: Top-K off-policy correction for a REINFORCE recommender system. In: WSDM, pp. 456–464 (2019)

  8. Espeholt, L., et al.: IMPALA: scalable distributed deep-RL with importance weighted actor-learner architectures. In: ICML, pp. 1407–1416 (2018)

  9. Gilotte, A., Calauzènes, C., Nedelec, T., Abraham, A., Dollé, S.: Offline A/B testing for recommender systems. In: WSDM, pp. 198–206 (2018)

  10. Graves, A.: Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850 (2013)

  11. Hessel, M., Soyer, H., Espeholt, L., Czarnecki, W., Schmitt, S., van Hasselt, H.: Multi-task deep reinforcement learning with PopArt. In: AAAI, pp. 3796–3803 (2019)

  12. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)

  13. Lillicrap, T.P., et al.: Continuous control with deep reinforcement learning (2019)

  14. Liu, F., Guo, H., Li, X., Tang, R., Ye, Y., He, X.: End-to-end deep reinforcement learning based recommendation with supervised embedding. In: WSDM, pp. 384–392 (2020)

  15. Loynd, R., Fernandez, R., Celikyilmaz, A., Swaminathan, A., Hausknecht, M.: Working memory graphs. In: ICML (2020)

  16. Parisotto, E., Salakhutdinov, R.: Efficient transformers in reinforcement learning using actor-learner distillation (2021)

  17. Parisotto, E., et al.: Stabilizing transformers for reinforcement learning. In: ICML, pp. 7487–7498 (2020)

  18. Polydoros, A.S., Nalpantidis, L.: Survey of model-based reinforcement learning: applications on robotics. J. Intell. Robot. Syst. 86(2), 153–173 (2017)

  19. Silver, D., et al.: Mastering chess and shogi by self-play with a general reinforcement learning algorithm (2017)

  20. Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., Riedmiller, M.: Deterministic policy gradient algorithms. In: ICML, pp. 387–395 (2014)

  21. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)

  22. Swaminathan, A., Joachims, T.: The self-normalized estimator for counterfactual learning. In: Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R. (eds.) NeurIPS, vol. 28 (2015)

  23. Teh, Y.W., et al.: Distral: robust multitask reinforcement learning. In: NIPS, pp. 4499–4509 (2017)

  24. Vaswani, A., et al.: Attention is all you need. In: NIPS, pp. 5998–6008 (2017)

  25. Vithayathil Varghese, N., Mahmoud, Q.H.: A survey of multi-task deep reinforcement learning. Electronics 9(9), 1363 (2020)

  26. Xin, X., Karatzoglou, A., Arapakis, I., Jose, J.M.: Self-supervised reinforcement learning for recommender systems. In: SIGIR, pp. 931–940 (2020)

  27. Ye, D., et al.: Mastering complex control in MOBA games with deep reinforcement learning. AAAI 34(04), 6672–6679 (2020)

  28. Zhu, H., et al.: The ingredients of real world robotic reinforcement learning. In: ICLR (2020)

  29. Zhu, Y., et al.: What to do next: modeling user behaviors by Time-LSTM. In: IJCAI, pp. 3602–3608 (2017)

  30. Zou, L., Xia, L., Ding, Z., Song, J., Liu, W., Yin, D.: Reinforcement learning to optimize long-term user engagement in recommender systems. In: KDD, pp. 2810–2818 (2019)


Author information

Correspondence to Stefanos Antaris.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Antaris, S., Rafailidis, D., Arriaza, R. (2021). Multi-task Learning for User Engagement and Adoption in Live Video Streaming Events. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12979. Springer, Cham. https://doi.org/10.1007/978-3-030-86517-7_29


  • DOI: https://doi.org/10.1007/978-3-030-86517-7_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86516-0

  • Online ISBN: 978-3-030-86517-7

  • eBook Packages: Computer Science, Computer Science (R0)