Web from preprocessor for crawling


Organizations often deploy web applications to production with vulnerabilities. To avoid this, they need to run a web application vulnerability assessment. The most common kind of assessment relies on a vulnerability scanner and can be divided into two phases: crawling and testing. The purpose of the first phase is to gather all the access points of the application. In the second phase, the tester sends malformed values to the application and analyzes the responses, looking for known vulnerability patterns. The crawling phase is critical: if the tester cannot reach the application's content, he or she cannot test that content for vulnerabilities. One of the main challenges of crawling web applications is filling out web forms with correct values. To address this challenge, web vulnerability scanners usually include a generic list of field-value pairs and let the tester add new pairs. This paper presents a novel method for searching for candidate web form field values. The goal is to map more of the application's content than is possible with the field-value pairs included by default. Our method tries to obtain form field values by executing the client-side code and by looking for candidate values in an external data source. We have tested the proposed method, and the experiments show that it can improve the crawling phase of dynamic vulnerability assessment.
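To make the baseline concrete, the following is a minimal sketch of how a crawler might fill a form from a default list of field-value pairs. It is not the method proposed in the paper: the pair list, the function names, the use of edit-distance similarity for matching field names, and the 0.6 threshold are all assumptions made for illustration. It also does not execute client-side code or query an external data source.

```python
from html.parser import HTMLParser

# Hypothetical default field-value pairs, of the kind a scanner might ship with.
DEFAULT_PAIRS = {
    "email": "tester@example.com",
    "username": "tester",
    "password": "Passw0rd!",
    "zip": "28040",
}


class FormFieldCollector(HTMLParser):
    """Collect the name attribute of every input, select and textarea tag."""

    def __init__(self):
        super().__init__()
        self.field_names = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "select", "textarea"):
            name = dict(attrs).get("name")
            if name:
                self.field_names.append(name)


def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Similarity in [0, 1]: distance divided by the longer length (an assumed normalization)."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def candidate_values(html, pairs=DEFAULT_PAIRS, threshold=0.6):
    """Map each form field name to a default value, or None if nothing is close enough."""
    collector = FormFieldCollector()
    collector.feed(html)
    result = {}
    for name in collector.field_names:
        key = name.lower()
        best = max(pairs, key=lambda known: similarity(key, known))
        result[name] = pairs[best] if similarity(key, best) >= threshold else None
    return result


if __name__ == "__main__":
    form = ('<form><input name="user_name"><input name="e-mail">'
            '<textarea name="coupon"></textarea></form>')
    print(candidate_values(form))
```

In this sketch, a field that maps to `None` has no usable default value. Those are the fields for which a crawler would need candidate values from somewhere other than the generic list.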



This work was supported by the Ministerio de Industria, Turismo y Comercio (MITyC, Spain) through the Project Avanza Competitividad I+D+I TSI-020100-2011-165 and the Agencia Española de Cooperación Internacional para el Desarrollo (AECID, Spain) through Acción Integrada MAEC-AECID MEDITERRÁNEO A1/037528/11.

Author information


Corresponding author

Correspondence to Luis Javier García Villalba.

About this article

Cite this article

Muñoz, F.R., García Villalba, L.J. Web from preprocessor for crawling. Multimed Tools Appl 74, 8559–8570 (2015).


Keywords

  • Web vulnerability scanner
  • Crawling
  • Web forms
  • Fields values
  • Deep web