Web Scanner Detection Based on Behavioral Differences

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1095)

Abstract

Web scanners not only consume server bandwidth but also collect sensitive information about websites and probe systems for vulnerabilities, seriously threatening website security. Accurate detection of Web scanners can effectively mitigate this kind of threat. Existing scanner detection methods extract features from logs and use machine learning to differentiate scanners from legitimate users. However, these methods cannot block scanning because they lack behavior information about clients. To solve this problem, a Web scanner detection method based on behavioral differences is proposed. It collects request information and behavior information of clients through three modules: Passive Detection, Active Injection, and Active Detection. Six kinds of features, including scanner fingerprints and the ability to execute JavaScript code, are then extracted to determine whether a client is a scanner. The method makes full use of the behavior characteristics of clients and the behavioral differences between scanners and legitimate users. The experimental results show that the method is efficient and fast in scanner detection.


Acknowledgements

We sincerely thank the anonymous SocialSec reviewers for their valuable feedback. This research was supported in part by the National Natural Science Foundation of China (U1636107, 61373168).

Author information

Correspondence to Jianming Fu.

Appendices

A Fingerprint Information of Common Scanners

| Name | Fingerprint | Location |
|------|-------------|----------|
| wpscan | wpscan | User-Agent |
| SQLmap | Sqlmap/version/#stable(http://sqlmap.org) | User-Agent |
| AppScan | APPSCAN | Request parameter, URL |
| AWVS | Acunetix-Aspect | Request header |
| W3af | w3af.org | User-Agent |
| Burpsuite | burpcollaborator.net | Request header, parameter, URL |
| WebCruiser | WebCruiser, HEAD method | User-Agent |
| Netsparker | X-Scanner:Netsparker or Netsparker | Request header, parameter |
| FileSensor | Scrapy/1.4.0 (+http://scrapy.org) | User-Agent |
| Yujian | HEAD method, User-Agent:- | User-Agent, RM |
| BBScan | BBScan/version | User-Agent |
| DirBrute | whoami=wyscan_dirfuzz | Cookie |
| Nikto | (Nikto/version) (Evasions:None) (Test:map_codes) | User-Agent |
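
The fingerprints in Appendix A lend themselves to simple substring matching against the named part of a request. The following is a minimal sketch of that idea, not the authors' implementation: the `Request` class, the `FINGERPRINTS` list layout, and the `match_fingerprint` function are hypothetical names introduced here, only a subset of the table is encoded, and case-insensitive substring matching is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Request:
    """Hypothetical, simplified view of an incoming HTTP request."""
    method: str
    url: str
    headers: Dict[str, str] = field(default_factory=dict)
    cookies: Dict[str, str] = field(default_factory=dict)


# (scanner name, location to inspect, lowercase substring to look for).
# A subset of Appendix A; version-specific parts are left out on purpose.
FINGERPRINTS = [
    ("wpscan", "user-agent", "wpscan"),
    ("SQLmap", "user-agent", "sqlmap"),
    ("W3af", "user-agent", "w3af.org"),
    ("BBScan", "user-agent", "bbscan"),
    ("Nikto", "user-agent", "nikto"),
    ("AWVS", "header", "acunetix-aspect"),
    ("AppScan", "url", "appscan"),
    ("DirBrute", "cookie", "whoami=wyscan_dirfuzz"),
]


def match_fingerprint(req: Request) -> Optional[str]:
    """Return the first scanner whose fingerprint appears in the request, else None."""
    # Header names are case-insensitive, so normalize everything to lowercase.
    headers = {k.lower(): v.lower() for k, v in req.headers.items()}
    haystacks = {
        "user-agent": headers.get("user-agent", ""),
        "header": " ".join(f"{k}: {v}" for k, v in headers.items()),
        "url": req.url.lower(),
        "cookie": "; ".join(f"{k}={v}".lower() for k, v in req.cookies.items()),
    }
    for name, location, needle in FINGERPRINTS:
        if needle in haystacks[location]:
            return name
    return None


sqlmap_req = Request("GET", "/item.php?id=1",
                     headers={"User-Agent": "sqlmap/1.3#stable (http://sqlmap.org)"})
browser_req = Request("GET", "/index.html",
                      headers={"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"})
dirbrute_req = Request("HEAD", "/admin/", cookies={"whoami": "wyscan_dirfuzz"})
```

Fingerprints that depend on the request method (for example the HEAD method listed for WebCruiser and Yujian) are omitted here, because a HEAD request alone is not specific to a scanner and would need to be combined with other evidence.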

B Testing Results of Scanners

| Scanners | Visiting frequency | Fingerprint | Resource file | Carrying cookie | JS execution | Mouse click | All methods |
|----------|--------------------|-------------|---------------|-----------------|--------------|-------------|-------------|
| Appscan | ✓ | ✓ | ✓ | × | × | ✓ | ✓ |
| w3af | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Burpsuite | ✓ | ✓ | ✓ | × | ✓ | ✓ | ✓ |
| AWVS | ✓ | ✓ | ✓ | × | ✓ | ✓ | ✓ |
| netsparker | ✓ | ✓ | ✓ | × | ✓ | ✓ | ✓ |
| htpwdScan | ✓ | × | ✓ | ✓ | ✓ | ✓ | ✓ |
| NagaScan | ✓ | × | × | × | × | × | ✓ |
| WebCruiser | ✓ | ✓ | × | ✓ | ✓ | ✓ | ✓ |
| Yujian directory scanning | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Yujian website identification | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| SQLmap | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| BBScan | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Cangibrina | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| BruteXSS | × | ✓ | × | × | ✓ | × | ✓ |
| Shuriken | × | × | × | × | ✓ | × | ✓ |
| Weakfilescan | ✓ | × | ✓ | × | ✓ | ✓ | ✓ |
| Dirsearch | ✓ | × | × | ✓ | ✓ | ✓ | ✓ |
| Pentestdb | × | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Lcyscan | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| DirBrute | ✓ | ✓ | ✓ | × | ✓ | ✓ | ✓ |
| wpscan | × | ✓ | × | × | ✓ | ✓ | ✓ |
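
Every row of Appendix B has a mark in "All methods" and at least one mark in the six feature columns, which is consistent with reading "All methods" as the union of the individual checks. That reading is an interpretation of the table, not something stated in the text shown here. Under that assumption, the combination step might be sketched as follows; the `FEATURES` tuple, the `flagged_by_any` function, and the dictionary layout are hypothetical, and the three sample rows are copied from the table.

```python
from typing import Dict

# The six feature columns of Appendix B, in table order.
FEATURES = (
    "visiting_frequency",
    "fingerprint",
    "resource_file",
    "carrying_cookie",
    "js_execution",
    "mouse_click",
)


def flagged_by_any(flags: Dict[str, bool]) -> bool:
    """Assumed union rule: a client counts as detected if any single feature flags it."""
    return any(flags.get(name, False) for name in FEATURES)


def row(*marks: bool) -> Dict[str, bool]:
    """Build a feature dictionary from six marks given in table order."""
    return dict(zip(FEATURES, marks))


# Three rows taken from Appendix B (True = detected by that feature).
SAMPLE_ROWS = {
    "NagaScan": row(True, False, False, False, False, False),
    "Shuriken": row(False, False, False, False, True, False),
    "wpscan": row(False, True, False, False, True, True),
}

for scanner, flags in SAMPLE_ROWS.items():
    hits = [name for name in FEATURES if flags[name]]
    print(f"{scanner}: detected={flagged_by_any(flags)} via {', '.join(hits)}")
```

A union rule favors recall: NagaScan and Shuriken are each caught by only one feature in the table, so dropping any single check would let one of them through. The cost, which the table does not measure, is that any one noisy feature can also flag a legitimate user.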

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Fu, J., Li, L., Wang, Y., Huang, J., Peng, G. (2019). Web Scanner Detection Based on Behavioral Differences. In: Meng, W., Furnell, S. (eds) Security and Privacy in Social Networks and Big Data. SocialSec 2019. Communications in Computer and Information Science, vol 1095. Springer, Singapore. https://doi.org/10.1007/978-981-15-0758-8_1

  • DOI: https://doi.org/10.1007/978-981-15-0758-8_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-0757-1

  • Online ISBN: 978-981-15-0758-8

  • eBook Packages: Computer Science, Computer Science (R0)
