Abstract
One would have thought that hackers would strive to hide from public view, but we find that this is not the case: they have a public online footprint. Apart from online security forums, this footprint also appears on software development platforms, where authors create publicly accessible malware repositories to share and collaborate. With the exception of a few recent efforts, the existence and the dynamics of this community have received surprisingly limited attention. The goal of our work is to analyze this ecosystem of hackers in order to: (a) understand their collaborative patterns and (b) identify and profile its most influential authors. We develop HackerScope, a systematic approach for analyzing the dynamics of this hacker ecosystem. Leveraging our targeted data collection, we conduct an extensive study of 7389 authors of malware repositories on GitHub, which we combine with their activity on four security forums. From a modelling point of view, we study the ecosystem using three network representations: (a) the author-author network, (b) the author-repository network and (c) cross-platform egonets. Our analysis leads to the following key observations: (a) the ecosystem is growing at an accelerating rate, as the number of new malware authors per year triples every 2 years, (b) it is highly collaborative, more so than other GitHub authors, and (c) it includes influential and professional hackers. We find that 101 authors maintain an online “brand” across GitHub and the security forums we study. Our study is a significant step towards using public online information for understanding the malicious hacker community.
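To make the network representations named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not define how edges are formed, so the sketch assumes that an author-repository edge means "contributed to" and that two authors are linked when they share at least one repository. The toy data and the function names (`build_author_repo_network`, `project_to_author_network`, `egonet`) are hypothetical. The egonet shown is a plain one-hop neighborhood on a single platform; how the cross-platform variant joins GitHub and forum identities is not specified here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: (author, repository) contribution pairs.
contributions = [
    ("alice", "repoA"),
    ("bob", "repoA"),
    ("bob", "repoB"),
    ("carol", "repoB"),
    ("dave", "repoC"),
]


def build_author_repo_network(pairs):
    """Build a bipartite author-repository network as two adjacency maps."""
    author_to_repos = defaultdict(set)
    repo_to_authors = defaultdict(set)
    for author, repo in pairs:
        author_to_repos[author].add(repo)
        repo_to_authors[repo].add(author)
    return author_to_repos, repo_to_authors


def project_to_author_network(repo_to_authors):
    """Project onto authors; edge weight = number of shared repositories.

    Assumed linking rule for illustration only.
    """
    weights = defaultdict(int)
    for authors in repo_to_authors.values():
        # Sort so each undirected edge has one canonical (a, b) key.
        for a, b in combinations(sorted(authors), 2):
            weights[(a, b)] += 1
    return dict(weights)


def egonet(edges, center):
    """Return the one-hop egonet of `center`: its nodes and the edges among them."""
    neighbors = {b if a == center else a for a, b in edges if center in (a, b)}
    nodes = neighbors | {center}
    ego_edges = {e for e in edges if e[0] in nodes and e[1] in nodes}
    return nodes, ego_edges


author_to_repos, repo_to_authors = build_author_repo_network(contributions)
author_edges = project_to_author_network(repo_to_authors)
nodes, ego_edges = egonet(author_edges, "bob")
```

In this toy input, `dave` works alone and therefore has no edge in the projected author-author network, while `bob` bridges two repositories. Degree or weighted degree in such a projection is one simple proxy for how collaborative an author is.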
Acknowledgements
This work was supported by the UC Multicampus-National Lab Collaborative Research and Training (UCNLCRT) award #LFR18548554.
Cite this article
Islam, R., Rokon, M.O.F., Darki, A. et al. HackerScope: the dynamics of a massive hacker online ecosystem. Soc. Netw. Anal. Min. 11, 56 (2021). https://doi.org/10.1007/s13278-021-00758-8
DOI: https://doi.org/10.1007/s13278-021-00758-8