Mining Actionable Information from Security Forums: The Case of Malicious IP Addresses

Gharibshah, Joobin; Li, Tai Ching; Castro, Andre; Pelechrinis, Konstantinos; Papalexakis, Evangelos E.; Faloutsos, Michalis

doi:10.1007/978-3-030-11286-8_9

Part of the book series: Lecture Notes in Social Networks (LNSN)

Included in the following conference series:

IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Abstract

The goal of this work is to systematically extract information from hacker forums, whose content is generally unstructured: the text of a post does not necessarily follow any writing rules. By contrast, many security initiatives and commercial entities harness readily available public information, but they seem to focus on structured sources. Here, we focus on the problem of identifying malicious IP addresses among the IP addresses reported in the forums. We develop a method to automate the identification of malicious IP addresses, with the design goal of being independent of external sources. A key novelty is that we use a matrix decomposition method to extract latent features of the users' behavioral information, which we combine with textual information from the related posts. A key design feature of our technique is that it can be readily applied to forums in different languages, since it does not require a sophisticated natural language processing approach. In particular, our solution needs only a small number of keywords in the new language plus the users' behavior captured by specific features. We also develop a tool to automate the data collection from security forums. Using our tool, we collect approximately 600K posts from three different forums. Our method exhibits high classification accuracy, and the precision of identifying malicious IP addresses in posts is greater than 88% in all three forums. We argue that our method can provide significantly more information: we find up to three times more potentially malicious IP addresses compared to the reference blacklist VirusTotal. As cyber-wars become more intense, early access to useful information becomes more imperative for removing the hackers' first-move advantage, and our work is a solid step in this direction.
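
The abstract mentions two text-side ingredients: the IP addresses reported in posts and a small number of keywords. The following is a minimal Python sketch of how such inputs might be gathered from a single post. It is an illustration under stated assumptions, not the authors' implementation: the keyword list, the function names, and the choice to skip private address ranges are all hypothetical, and the behavioral features and the matrix decomposition step are not shown.

```python
import ipaddress
import re

# Candidate dotted-quad pattern; octet ranges are validated by ipaddress below.
# The lookarounds avoid matching inside longer dotted numbers such as 1.2.3.4.5.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?!\d|\.\d)")

# Hypothetical keyword list (an assumption; the abstract does not list keywords).
KEYWORDS = ("botnet", "malware", "attack", "infected")


def extract_ips(text):
    """Return valid, non-private IPv4 addresses found in text, in order."""
    found = []
    for match in IPV4_CANDIDATE.finditer(text):
        try:
            addr = ipaddress.IPv4Address(match.group())
        except ValueError:
            continue  # not a valid address, e.g. an octet above 255
        if not addr.is_private:  # assumption: private ranges are not of interest
            found.append(str(addr))
    return found


def keyword_features(text):
    """Count occurrences of each keyword in the lower-cased text."""
    lowered = text.lower()
    return {kw: lowered.count(kw) for kw in KEYWORDS}


post = "Botnet C&C seen at 8.8.8.8, my router is 192.168.1.1 and 999.1.1.1 is junk."
print(extract_ips(post))
print(keyword_features(post))
```

In a pipeline of the kind the abstract outlines, per-post outputs like these would be combined with per-user behavioral features before classification; how that combination is done is described in the chapter itself.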

Notes

1. Our software and datasets will be made available at http://www.hackerchatter.org/.

Acknowledgements

This material is based upon work supported by an Adobe Data Science Research Faculty Award, and DHS ST Cyber Security (DDoSD) HSHQDC-14-R-B00017 grant. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding institutions.

Author information

Authors and Affiliations

University of California Riverside, Riverside, CA, USA
Joobin Gharibshah, Tai Ching Li, Andre Castro, Evangelos E. Papalexakis & Michalis Faloutsos
School of Information Sciences, University of Pittsburgh, Pittsburgh, PA, USA
Konstantinos Pelechrinis

Corresponding author

Correspondence to Joobin Gharibshah.

Editor information

Editors and Affiliations

Department of Informatics & Computers, Hellenic Air Force Academy, Dekelia, Greece
Panagiotis Karampelas
Department of Computer Science, University of Calgary, Calgary, AB, Canada
Jalal Kawash
Department of Computer Engineering, TOBB University of Economics and Technology, Ankara, Turkey
Tansel Özyer

About this chapter

Cite this chapter

Gharibshah, J., Li, T.C., Castro, A., Pelechrinis, K., Papalexakis, E.E., Faloutsos, M. (2019). Mining Actionable Information from Security Forums: The Case of Malicious IP Addresses. In: Karampelas, P., Kawash, J., Özyer, T. (eds) From Security to Community Detection in Social Networking Platforms. ASONAM 2017. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-11286-8_9

DOI: https://doi.org/10.1007/978-3-030-11286-8_9
Published: 10 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11285-1
Online ISBN: 978-3-030-11286-8
eBook Packages: Computer Science, Computer Science (R0)
