Abstract
This paper presents “iCrawl”, a visual, high-interaction client honeypot system. Web-based cyber-attacks have increased exponentially along with the growth of cloud-based web application technologies. Web browsers provide users with an entry point to these web applications. The iCrawl system is designed to deliver a high-interaction honey client that is virtually indistinguishable from a real human-driven client. The system operates by driving an actual web browser in a fashion closely resembling a genuine user’s actions. Unlike most crawlers, iCrawl attempts to operate on the visual elements of a web page rather than its code elements. The honeypot system consists of pre-configured decoy virtual machines. Each virtual machine includes a spider program which, upon execution, automates the process of driving the web browser and crawling the targeted website. The spider browses by observing the page and simulating human user input through mouse and keyboard activity. The data collected during crawling is stored in a graph database in the form of nodes and relations. This data captures the context and the changes in system behavior caused by interaction with the crawled website. The graph data can be queried and monitored online for structural patterns and anomalies.
The iCrawl system is an enabling technology for studying sophisticated malicious websites that can avoid detection by the simpler crawlers typically used by well-known security companies.
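The idea of storing crawl observations as nodes and relations, and then querying them for anomalies, can be illustrated with a minimal in-memory sketch. This is not the authors' implementation and it does not use a real graph database: the names `Node`, `CrawlGraph`, `record_visit`, and `suspicious_pages`, the `SPAWNED` relation type, the example URLs, and the process names are all hypothetical, chosen only to show the general shape of the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A graph node; label and key are illustrative assumptions."""
    label: str  # e.g. "Page" or "Process"
    key: str    # e.g. a URL or a process name


class CrawlGraph:
    """Tiny in-memory stand-in for a graph database (assumption)."""

    def __init__(self):
        self.nodes = set()
        self.relations = []  # (source, relation_type, target) triples

    def relate(self, source, rel_type, target):
        self.nodes.update((source, target))
        self.relations.append((source, rel_type, target))

    def targets(self, source, rel_type):
        return [t for s, r, t in self.relations
                if s == source and r == rel_type]


def record_visit(graph, url, procs_before, procs_after):
    """Store a page visit and any processes that appeared during it."""
    page = Node("Page", url)
    graph.nodes.add(page)
    for name in sorted(set(procs_after) - set(procs_before)):
        graph.relate(page, "SPAWNED", Node("Process", name))
    return page


def suspicious_pages(graph, allowed):
    """Return pages linked to processes outside the allowed set."""
    flagged = []
    for node in sorted(graph.nodes, key=lambda n: n.key):
        if node.label != "Page":
            continue
        unexpected = [p.key for p in graph.targets(node, "SPAWNED")
                      if p.key not in allowed]
        if unexpected:
            flagged.append((node.key, unexpected))
    return flagged


graph = CrawlGraph()
record_visit(graph, "http://benign.example", {"browser"}, {"browser"})
record_visit(graph, "http://suspect.example",
             {"browser"}, {"browser", "dropper.exe"})
flagged = suspicious_pages(graph, allowed={"browser"})
print(flagged)
```

In this sketch the "change in system behavior" is reduced to the set of process names seen before and after a visit; a real deployment would presumably record richer context and run its queries inside the graph database itself.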
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Nagothu, D., Dolgikh, A. (2017). iCrawl: A Visual High Interaction Web Crawler. In: Rak, J., Bay, J., Kotenko, I., Popyack, L., Skormin, V., Szczypiorski, K. (eds) Computer Network Security. MMM-ACNS 2017. Lecture Notes in Computer Science, vol 10446. Springer, Cham. https://doi.org/10.1007/978-3-319-65127-9_8
DOI: https://doi.org/10.1007/978-3-319-65127-9_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65126-2
Online ISBN: 978-3-319-65127-9
eBook Packages: Computer Science (R0)