International Journal of Information Security, Volume 14, Issue 1, pp 15–33

The MALICIA dataset: identification and analysis of drive-by download operations

  • Antonio Nappa
  • M. Zubair Rafique
  • Juan Caballero
Regular Contribution

DOI: 10.1007/s10207-014-0248-7

Cite this article as:
Nappa, A., Rafique, M.Z. & Caballero, J. Int. J. Inf. Secur. (2015) 14: 15. doi:10.1007/s10207-014-0248-7


Drive-by downloads are the preferred distribution vector for many malware families. In the drive-by ecosystem, many exploit servers run the same exploit kit, and it is challenging to determine whether a given exploit server is part of a larger operation. In this paper, we propose a technique to identify exploit servers managed by the same organization. We collect over time how exploit servers are configured, which exploits they use, and what malware they distribute, and we group servers with similar configurations into operations. Our operational analysis reveals that although individual exploit servers have a median lifetime of 16 h, long-lived operations exist that run for several months. To sustain long-lived operations, miscreants are turning to the cloud, with 60 % of the exploit servers hosted by specialized cloud hosting services. We also observe operations that distribute multiple malware families, and we find that pay-per-install affiliate programs manage exploit servers for their affiliates to convert traffic into installations. Furthermore, we analyze the exploit polymorphism problem, measuring the repacking rate for different exploit types. To understand how difficult it is to take down exploit servers, we analyze the abuse reporting process and issue abuse reports for 19 long-lived servers. We describe the interaction with ISPs and hosting providers and monitor the outcome of each report. We find that 61 % of the reports are not even acknowledged. On average, an exploit server remains alive for 4.3 days after a report. Finally, we detail the Malicia dataset, which we have collected and are making available to other researchers.


Drive-by download operations; Malicia dataset; Malware distribution; Cybercrime

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Antonio Nappa (1, 2)
  • M. Zubair Rafique (3)
  • Juan Caballero (1)

  1. IMDEA Software Institute, Madrid, Spain
  2. Universidad Politécnica de Madrid, Madrid, Spain
  3. iMinds-DistriNet, KU Leuven, Leuven, Belgium
