Bot or Not? A Case Study on Bot Recognition from Web Session Logs

Quantifying and Processing Biomedical and Behavioral Signals (WIRN 2017)

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 103)

Abstract

This work reports on a study of web usage logs that examines whether computational intelligence techniques can achieve good recognition rates when distinguishing human users from automated bots. Two problem statements are given: offline (for completed sessions) and online (for sequences of individual HTTP requests). The former is solved with several standard computational intelligence tools. For the latter, a learning version of Wald’s sequential probability ratio test is used.
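
To make the online setting more concrete, the following is a minimal sketch of the classical (non-learning) form of Wald's sequential probability ratio test, applied to a stream of binary per-request observations. It is not the chapter's method: the abstract does not specify how the learning version is built, so the likelihoods are simply assumed to be known. The function name `sprt`, the parameters `p_bot` and `p_human`, and the idea of one binary feature per HTTP request are all hypothetical choices made for illustration.

```python
import math
from typing import Iterable, Tuple


def sprt(observations: Iterable[int], p_bot: float, p_human: float,
         alpha: float = 0.01, beta: float = 0.01) -> Tuple[str, int]:
    """Classical Wald SPRT over binary observations (illustrative only).

    H1 (bot):   the feature fires with probability p_bot   (assumed known)
    H0 (human): the feature fires with probability p_human (assumed known)
    alpha is the target false-positive rate, beta the target false-negative rate.
    Returns the decision and the number of observations consumed.
    """
    upper = math.log((1 - beta) / alpha)   # crossing it accepts H1 (bot)
    lower = math.log(beta / (1 - alpha))   # crossing it accepts H0 (human)
    llr = 0.0                              # cumulative log-likelihood ratio
    n = 0
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p_bot / p_human)
        else:
            llr += math.log((1 - p_bot) / (1 - p_human))
        if llr >= upper:
            return "bot", n
        if llr <= lower:
            return "human", n
    # The stream ended before either threshold was crossed.
    return "undecided", n


# Four consecutive "bot-like" requests are enough to cross the upper threshold.
print(sprt([1, 1, 1, 1, 1, 1], p_bot=0.8, p_human=0.2))  # ('bot', 4)
```

The appeal of a sequential test in this setting is that a decision can be issued as soon as the accumulated evidence is strong enough, without waiting for the session to end. The thresholds follow Wald's approximations in terms of the two target error rates.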

Acknowledgements

This work was partially supported by a STSM grant from COST Action IC1406 High-Performance Modeling and Simulation for Big Data Applications (cHiPSet).

Author information

Correspondence to Stefano Rovetta.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Rovetta, S., Cabri, A., Masulli, F., Suchacka, G. (2019). Bot or Not? A Case Study on Bot Recognition from Web Session Logs. In: Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Quantifying and Processing Biomedical and Behavioral Signals. WIRN 2017. Smart Innovation, Systems and Technologies, vol 103. Springer, Cham. https://doi.org/10.1007/978-3-319-95095-2_19
