A Recurrent Neural Network Survival Model: Predicting Web User Return Time

  • Georg L. Grob
  • Ângelo Cardoso
  • C. H. Bryan Liu
  • Duncan A. Little
  • Benjamin Paul Chamberlain
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11053)


The size of a website’s active user base directly affects its value. It is therefore important to monitor and influence a user’s likelihood of returning to a site, and essential to this is predicting when a user will return. Current state-of-the-art approaches to this problem come in two flavors: (1) Recurrent Neural Network (RNN) based solutions and (2) survival analysis methods. We observe that both techniques are severely limited when applied to this problem. Survival models can only incorporate aggregate representations of users, rather than automatically learning a representation directly from a raw time series of user actions. RNNs can automatically learn features, but cannot be trained directly on examples of non-returning users, who have no target value for their return time. We develop a novel RNN survival model that removes the limitations of the state-of-the-art methods. We demonstrate that this model can be applied successfully to return time prediction on a large e-commerce dataset, and that it discriminates between returning and non-returning users better than either method applied in isolation. Code related to this paper is available at:
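To illustrate why non-returning users need a survival-style objective, here is a minimal sketch. It is not the model from the paper: it assumes a constant hazard rate per user (an exponential return-time distribution), and the function name, argument names, and toy values are hypothetical. A returning user contributes both the log-hazard and the log-survival term, while a censored (non-returning) user contributes only the log-survival term for the time observed so far, so no target return time is required.

```python
import math


def censored_exponential_nll(rates, durations, returned):
    """Mean negative log-likelihood under a constant-hazard (exponential) model.

    rates:     predicted hazard rate per user (> 0), e.g. a network output
    durations: observed return time, or time observed so far if censored
    returned:  True if the user returned, False if the user is censored
    """
    total = 0.0
    for rate, t, did_return in zip(rates, durations, returned):
        # Log-survival term, log S(t) = -rate * t, applies to every user.
        log_lik = -rate * t
        # Only returning users add the log-hazard term, log(rate).
        if did_return:
            log_lik += math.log(rate)
        total -= log_lik
    return total / len(rates)


# Toy data (made up for illustration): two returning users, one censored user.
rates = [0.5, 0.1, 0.02]
durations = [2.0, 10.0, 30.0]
returned = [True, True, False]
loss = censored_exponential_nll(rates, durations, returned)
```

In this sketch the censored user still produces a training signal: the loss penalizes a high predicted hazard for a user who has stayed away for a long time.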


Keywords: User return time · Web browse sessions · Recurrent neural network · Marked temporal point process · Survival analysis



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Georg L. Grob (1)
  • Ângelo Cardoso (2)
  • C. H. Bryan Liu (2)
  • Duncan A. Little (2)
  • Benjamin Paul Chamberlain (1, 2)

  1. Imperial College London, London, UK
  2. ASOS.com, London, UK
