Malware homology identification based on a gene perspective

Frontiers of Information Technology & Electronic Engineering

Abstract

Malware homology identification is important for attack event tracing, emergency response scheme generation, and event trend prediction. Current malware homology identification methods still rely on manual analysis, which is inefficient and cannot respond quickly to outbreaks of attack events. To address these problems, we propose a new malware homology identification method from a gene perspective. A malware gene is represented by a subgraph that can describe the homology of malware families. We extract the key subgraph from the function dependency graph as the malware gene by selecting the key application programming interfaces (APIs) and applying a community partition algorithm. Then, we encode the genes and design a frequent subgraph mining algorithm to find the genes common to each malware family. Finally, we use the family genes to guide homology-based malware identification. We evaluate our method on a public dataset, and the experimental results show that the accuracy of malware classification reaches 97% with high efficiency.
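The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the community partition algorithm, the gene encoding, or the mining procedure. Here connected components stand in for community partition, a sorted edge tuple stands in for the gene encoding, and simple support counting stands in for frequent subgraph mining. All function names, the `key_apis` set, and the `min_support` parameter are hypothetical.

```python
from collections import Counter

def components(nodes, edges):
    """Connected components, used here as a stand-in for community partition."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, parts = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, part = [start], set()
        while stack:
            n = stack.pop()
            if n in part:
                continue
            part.add(n)
            stack.extend(adj[n] - part)
        seen |= part
        parts.append(part)
    return parts

def extract_genes(edges, key_apis):
    """Encode each partition that contains a key API as a sorted tuple of its edges."""
    nodes = {n for e in edges for n in e}
    genes = set()
    for part in components(nodes, edges):
        if part & key_apis:
            # Assumed encoding: directed (caller, callee) edges in sorted order.
            genes.add(tuple(sorted({e for e in edges if e[0] in part})))
    return genes

def family_genes(samples, key_apis, min_support):
    """Keep genes that occur in at least min_support samples of one family."""
    counts = Counter()
    for edges in samples:
        counts.update(extract_genes(edges, key_apis))
    return {g for g, c in counts.items() if c >= min_support}

def classify(edges, family_to_genes, key_apis):
    """Assign the family sharing the most genes with the sample, or None if none match."""
    genes = extract_genes(edges, key_apis)
    best = max(sorted(family_to_genes), key=lambda f: len(genes & family_to_genes[f]))
    return best if genes & family_to_genes[best] else None
```

In this sketch a sample is a list of `(caller, callee)` edges from its function dependency graph. Exact matching of encoded genes is far cruder than subgraph mining, but it shows how family genes can drive the final identification step.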

Author information

Corresponding author

Correspondence to Zheng Shan.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 61472447 and 61802435).

About this article

Cite this article

Zhao, Bl., Shan, Z., Liu, Fd. et al. Malware homology identification based on a gene perspective. Frontiers Inf Technol Electronic Eng 20, 801–815 (2019). https://doi.org/10.1631/FITEE.1800523

  • DOI: https://doi.org/10.1631/FITEE.1800523
