An Investigation on Online Versus Batch Learning in Predicting User Behaviour

Abstract

An investigation on how to produce a fast and accurate prediction of user behaviour on the Web is conducted. First, the problem of predicting user behaviour as a classification task is formulated and then the main problems of such real-time predictions are specified: the accuracy and time complexity of the prediction. Second, a method for comparison of online and batch (offline) algorithms used for user behaviour prediction is proposed. Last, the performance of these algorithms using the data from a popular question and answer platform, Stack Overflow, is empirically explored. It is demonstrated that a simple online learning algorithm outperforms state-of-the-art batch algorithms and performs as well as a deep learning algorithm, Deep Belief Networks. The proposed method for comparison of online and offline algorithms as well as the provided experimental evidence can be used for choosing a machine learning set-up for predicting user behaviour on the Web in scenarios where the accuracy and the time performance are of main concern.
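
The kind of comparison described above, accuracy and wall-clock time for an online learner versus a batch learner, can be illustrated with a minimal sketch. This is not the authors' method or code. It assumes synthetic two-feature data, a logistic-regression model for both learners, a test-then-train protocol for the online learner, and a fixed train/test split for the batch learner. The function names (`make_data`, `run_online`, `run_batch`) and all hyperparameters are hypothetical and chosen only for illustration.

```python
import math
import random
import time


def make_data(n, seed=0):
    """Synthetic noisy binary data (an assumption, not the Stack Overflow data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        label = 1 if 2.0 * x[0] - 1.0 * x[1] + rng.gauss(0, 0.3) > 0 else 0
        data.append((x, label))
    return data


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    return 1 if score(w, b, x) > 0 else 0


def run_online(stream, lr=0.1):
    """Test-then-train: predict each example first, then update on it once."""
    w, b, correct = [0.0, 0.0], 0.0, 0
    start = time.perf_counter()
    for x, y in stream:
        correct += predict(w, b, x) == y
        err = sigmoid(score(w, b, x)) - y  # gradient of the log loss w.r.t. the score
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        b -= lr * err
    return correct / len(stream), time.perf_counter() - start


def run_batch(train, test, epochs=50, lr=0.5):
    """Full-batch gradient descent on the training split, then held-out evaluation."""
    w, b = [0.0, 0.0], 0.0
    start = time.perf_counter()
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in train:
            err = sigmoid(score(w, b, x)) - y
            gw = [g + err * xi for g, xi in zip(gw, x)]
            gb += err
        w = [wi - lr * g / len(train) for wi, g in zip(w, gw)]
        b -= lr * gb / len(train)
    elapsed = time.perf_counter() - start
    acc = sum(predict(w, b, x) == y for x, y in test) / len(test)
    return acc, elapsed


data = make_data(2000)
online_acc, online_time = run_online(data)
batch_acc, batch_time = run_batch(data[:1500], data[1500:])
print(f"online: accuracy={online_acc:.3f}, time={online_time:.4f}s")
print(f"batch:  accuracy={batch_acc:.3f}, time={batch_time:.4f}s")
```

The two accuracy figures are not measured on the same examples: the online figure is cumulative over the whole stream, while the batch figure comes from a held-out split. A fair comparison needs a protocol that reconciles the two, and proposing one is the chapter's stated contribution.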

Keywords

  • Online Learning
  • Deep Learning
  • Classification


Notes

  1. The source code is at https://github.com/Nik0l/UTemPr.

  2. http://scikit-learn.org/.

  3. https://github.com/Lasagne/Lasagne.

  4. http://www.nltk.org/.

  5. In our experiments, we tried \(L_1\) and \(L_2\) regularisation but we did not find any significant improvements in the results compared to the results without regularisation reported in this paper.


Acknowledgments

The authors are grateful to Dr Yuri Kalnishkan’s team for illuminating discussions within the project “On-line Self-Tuning Learning Algorithms for Handling Historical Information” (funded by the Leverhulme Trust).

Author information

Corresponding author

Correspondence to Nikolay Burlutskiy.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Burlutskiy, N., Petridis, M., Fish, A., Chernov, A., Ali, N. (2016). An Investigation on Online Versus Batch Learning in Predicting User Behaviour. In: Bramer, M., Petridis, M. (eds) Research and Development in Intelligent Systems XXXIII. SGAI 2016. Springer, Cham. https://doi.org/10.1007/978-3-319-47175-4_9

  • DOI: https://doi.org/10.1007/978-3-319-47175-4_9

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47174-7

  • Online ISBN: 978-3-319-47175-4

  • eBook Packages: Computer Science, Computer Science (R0)