Forensic investigation of the dark web on the Tor network: pathway toward the surface web

  • Regular contribution
International Journal of Information Security

Abstract

The Dark Web is notorious as a huge marketplace for illegal products such as indecent images of children, drugs, private data, and stolen financial data. Tracking criminals on the Dark Web requires overcoming several challenges that arise from its nature. Dark websites frequently change domain names, so investigators who rely on a common crawling method find little evidence of criminals. Furthermore, disturbing material on the Dark Web threatens investigators' mental health and reduces the effectiveness of investigations. Above all, because the Dark Web is anonymous, few clues remain for tracking criminals. To address these challenges, this article presents an advanced crawler that collects data with the Dark Web ecosystem in mind. Machine learning models that detect disturbing content are implemented to protect investigators' mental health. This article also describes tracking codes and status modules, pivotal clues that can strip perpetrators of their anonymity, alongside the cryptocurrency transactions studied in previous works. The current state of the Dark Web is introduced through an analysis of 14,993 crawled dark websites. Three case studies demonstrate that the proposed investigative methodology can identify operators of illegal dark websites by connecting dark websites with the corresponding surface websites.
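
The idea of connecting a dark website to a surface website through a shared tracking code can be illustrated with a minimal sketch. The abstract does not define the format of a tracking code, so the sketch assumes it is a web analytics identifier of the legacy Google Analytics form `UA-<account>-<property>`. The names `TRACKING_ID`, `extract_tracking_ids`, and `link_sites` are hypothetical and are not taken from the article, whose source code is not disclosed.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Assumption: a "tracking code" is a legacy Google Analytics property ID
# such as UA-1234567-1. The article may cover other formats.
TRACKING_ID = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")


def extract_tracking_ids(html: str) -> Set[str]:
    """Return every analytics-style identifier found in a page's HTML."""
    return set(TRACKING_ID.findall(html))


def link_sites(pages: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Group page URLs by shared identifier.

    `pages` is an iterable of (url, html) pairs. Only identifiers that
    appear on more than one URL are returned, since a single occurrence
    links nothing.
    """
    sites_by_id: Dict[str, Set[str]] = defaultdict(set)
    for url, html in pages:
        for tracking_id in extract_tracking_ids(html):
            sites_by_id[tracking_id].add(url)
    return {tid: urls for tid, urls in sites_by_id.items() if len(urls) > 1}
```

An identifier that appears on both an onion address and a surface domain is only a lead for further investigation: identifiers can be copied along with a page template, so a match does not by itself prove common operation.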

Data availability

The datasets generated during and/or analyzed during the current study are not publicly available due to privacy or ethical restrictions but are available from the corresponding author on reasonable request.

Notes

  1. https://duckduckgo.com (last accessed 19 March 2021).

  2. https://gist.github.com (last accessed 19 March 2021).

  3. https://pastebin.com (last accessed 19 March 2021).

  4. https://phantomjs.org/ (last accessed 19 March 2021).

  5. http://publicibkxahavzc.onion/ (last accessed 19 March 2021).

  6. https://github.com/spinscale/elasticsearch-ingest-langdetect (last accessed 19 March 2021).

  7. https://docs.python.org/3/library/difflib.html (last accessed 22 October 2020).

  8. At this stage, we use a service provided by walletexplorer.com. The site, which is served by developers and analysts of Chainalysis, a blockchain analysis tool used by investigative agencies, provides transactions of cryptocurrencies that are traded on well-known exchanges.

References

  1. Jardine, E.: Privacy, censorship, data breaches and internet freedom: the drivers of support and opposition to dark web technologies. New Media Soc. 20(8), 2824 (2018)

  2. Finklea, K.M.: Dark web. Congressional Research Service, pp. 1–19 (2017). https://fas.org/sgp/crs/misc/R44101.pdf

  3. Çalışkan, E., Minárik, T., Osula, A.M.: Technical and legal overview of the tor anonymity network. NATO Cooperative Cyber Defence Centre of Excellence, Tallinn, Estonia (2015)

  4. Soska, K., Christin, N.: Measuring the longitudinal evolution of the online anonymous marketplace ecosystem. In: 24th USENIX Security Symposium (USENIX Security 15), pp. 33–48 (2015)

  5. DiPiero, C.: Deciphering cryptocurrency: shining a light on the deep dark web. U. Ill. L. Rev. p. 1267 (2017)

  6. Chaabane, A., Manils, P., Kaafar, M.A.: Digging into anonymous traffic: a deep analysis of the tor anonymizing network. In: 2010 Fourth International Conference on Network and System Security, pp. 167–174 (2010)

  7. Kiran, K., Chalke, S.S., Usman, M., Shenoy, P.D., Venugopal, K.: Anonymity and performance analysis of stream isolation in Tor network. In: 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1–6 (2019)

  8. Oda, T., Obukata, R., Yamada, M., Ishitaki, T., Hiyama, M., Barolli, L.: A neural network based user identification for Tor networks: comparison analysis of activation function using Friedman test. In: 2016 10th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS), pp. 477–483 (2016)

  9. U.S. Department of Justice. South Korean National and Hundreds of Others Charged Worldwide in the Takedown of the Largest Darknet Child Pornography Website, Which was Funded by Bitcoin. https://www.justice.gov/opa/pr/south-korean-national-and-hundreds-others-charged-worldwide-takedown-largest-darknet-child (2019). Accessed 8 Sept 2020

  10. Ziegeldorf, J.H., Matzutt, R., Henze, M., Grossmann, F., Wehrle, K.: Secure and anonymous decentralized bitcoin mixing. Futur. Gener. Comput. Syst. 80, 448 (2018)

  11. Brady, P.Q.: Crimes against caring: exploring the risk of secondary traumatic stress, burnout, and compassion satisfaction among child exploitation investigators. J. Police Crim. Psychol. 32(4), 305 (2017)

  12. Burruss, G.W., Holt, T.J., Wall-Parker, A.: The hazards of investigating internet crimes against children: digital evidence handlers’ experiences with vicarious trauma and coping behaviors. Am. J. Crim. Justice 43(3), 433 (2018)

  13. Dingledine, R., Mathewson, N., Syverson, P.: Tor: The second-generation onion router. Tech. rep., Naval Research Lab Washington DC (2004)

  14. Zantout, B., Haraty, R., et al.: I2P data communication system. In: Proceedings of ICN, pp. 401–409 (2011)

  15. Clarke, I., Sandberg, O., Wiley, B., Hong, T.W.: Freenet: a distributed anonymous information storage and retrieval system. In: Designing Privacy Enhancing Technologies, pp. 46–66 (2001)

  16. Karunanayake, I., Ahmed, N., Malaney, R., Islam, R., Jha, S.K.: De-anonymisation attacks on Tor: a survey. IEEE Commun. Surv. Tutor. 23(4), 2324 (2021)

  17. Biryukov, A., Pustogarov, I., Thill, F., Weinmann, R.P.: Content and popularity analysis of Tor hidden services. In: 2014 IEEE 34th International Conference on Distributed Computing Systems Workshops (ICDCSW), pp. 188–193 (2014)

  18. Faizan, M., Khan, R.A.: Exploring and analyzing the dark web: a new alchemy. First Monday 24(5) (2019)

  19. Ghosh, S., Das, A., Porras, P., Yegneswaran, V., Gehani, A.: Automated categorization of onion sites for analyzing the darkweb ecosystem. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1793–1802 (2017)

  20. Barratt, M.J., Ferris, J.A., Winstock, A.R.: Use of Silk road, the online drug marketplace, in the United Kingdom, Australia and the United States. Addiction 109(5), 774 (2014)

  21. Dolliver, D.S.: Evaluating drug trafficking on the Tor network: silk road 2, the sequel. Int. J. Drug Policy 26(11), 1113 (2015)

  22. Lee, S., Yoon, C., Kang, H., Kim, Y., Kim, Y., Han, D., Son, S., Shin, S.: Cybercriminal minds: an investigative study of cryptocurrency abuses in the dark web. In: Network and Distributed System Security Symposium, pp. 1–15 (2019)

  23. Eldefrawy, K., Gehani, A., Matton, A.: Longitudinal analysis of misuse of bitcoin. In: International Conference on Applied Cryptography and Network Security, pp. 259–278 (2019)

  24. Kumar, R., Yadav, S., Daniulaityte, R., Lamy, F., Thirunarayan, K., Lokala, U., Sheth, A.: eDarkFind: Unsupervised multi-view learning for Sybil account detection. In: Proceedings of The Web Conference 2020, pp. 1955–1965 (2020)

  25. Yoon, C., Kim, K., Kim, Y., Shin, S., Son, S.: Doppelgängers on the dark web: a large-scale assessment on phishing hidden web services. In: The World Wide Web Conference, pp. 2225–2235 (2019)

  26. Dalins, J., Wilson, C., Carman, M.: Criminal motivation on the dark web: a categorisation model for law enforcement. Digit. Investig. 24, 62 (2018)

  27. Victors, J.: The onion name system: Tor-powered distributed DNS for Tor hidden services. Master’s thesis, Utah State University (2015)

  28. Trac. Tor Rendezvous Specification-Version 3. https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt (2015). Accessed 8 Sept 2020

  29. Katmagic. Shallot. https://github.com/katmagic/Shallot/ (2011). Accessed 8 Sept 2020

  30. Rawat, R., Rajawat, A.S., Mahor, V., Shaw, R.N., Ghosh, A.: Dark web—onion hidden service discovery and crawling for profiling morphing, unstructured crime and vulnerabilities prediction. In: Innovations in Electrical and Electronic Engineering: Proceedings of ICEEE 2021, pp. 717–734 (2021)

  31. Ciancaglini, V., Balduzzi, M., Goncharov, M., McArdle, R.: Deepweb and cybercrime. Trend Micro Rep. 9, 5 (2013)

  32. Jones, B., Pleno, S., Wilkinson, M.: The use of random sampling in investigations involving child abuse material. Digit. Investig. 9, S99 (2012)

  33. Powell, M., Cassematis, P., Benson, M., Smallbone, S., Wortley, R.: Police officers’ perceptions of their reactions to viewing internet child exploitation material. J. Police Crim. Psychol. 30(2), 103 (2015)

  34. Park, J., Mun, H., Lee, Y.: Improving tor hidden service crawler performance. In: 2018 IEEE Conference on Dependable and Secure Computing (DSC), pp. 1–8 (2018)

  35. Poulsen, K.: FBI admits it controlled tor servers behind mass malware attack. Retrieved September 9, 2014 (2013)

  36. Wołk, K., Marasek, K.: A sentence meaning based alignment method for parallel text corpora preparation. New Perspect. Inf. Syst. Technol. 1, 229–237 (2014)

  37. Zulkarnine, A.T., Frank, R., Monk, B., Mitchell, J., Davies, G.: Surfacing collaborated networks in dark web to find illicit and criminal content. In: 2016 IEEE Conference on Intelligence and Security Informatics (ISI), pp. 109–114 (2016)

  38. Kanemura, K., Toyoda, K., Ohtsuki, T.: Identification of darknet markets’ bitcoin addresses by voting per-address classification results. In: 2019 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), pp. 154–158 (2019)

  39. Mirea, M., Wang, V., Jung, J.: The not so dark side of the darknet: a qualitative study. Secur. J. 32(2), 102 (2019)

  40. Pastrana, S., Hutchings, A., Thomas, D., Tapiador, J.: Measuring ewhoring. In: Proceedings of the Internet Measurement Conference, pp. 463–477 (2019)

  41. Barr-Smith, F., Wright, J.: Phishing with a darknet: imitation of onion services. In: 2020 APWG Symposium on Electronic Crime Research (eCrime), pp. 1–13 (2020)

Acknowledgements

This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIP) (No. 2022-0-00281, Development of digital evidence analysis technique using artificial intelligence technology). This work was selected for an award at the 2019 Digital Forensics Idea/Thesis Contest administered by the Ministry of Culture, Sports and Tourism (MCST) and the Korea Copyright Protection Agency (KCOPA). We are grateful to the referees for their dedication and effort in reviewing our manuscript. Their insightful critiques, meticulous evaluations, and valuable suggestions were instrumental in transforming our initial submission into a significantly improved version.

Author information

Corresponding author

Correspondence to Doowon Jeong.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest regarding the publication of this study.

Ethical approval

In this study, we established the following ethical policies, based on the ethical practices considered in previous studies [39,40,41], and followed them strictly while conducting the research. (1) If a word related to illegality is found on a website, we extract and analyze only the text data of the website. For example, if the word 'child' is found, only text data is extracted and analyzed, even if the site is not actually illegal, because the web page may contain CSAM. (2) When the disturbing image detector determines that a crawled dark website has illegal images, we extract and analyze only the text data. The illegal image is not stored in our system. (3) We do not access the sub-URLs collected by the machine on the dark web. We focus on the evidence found on the main page. A special judicial police officer of the Korea Copyright Protection Agency (KCOPA) conducted an ethical review of this study. The websites were reported to the police. Under the supervision of the agency, we did not share information without approval. The source code is also not disclosed because of its potential for abuse.
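
The three collection rules above can be sketched as a small filter applied to each crawled main page. This is a minimal illustration under stated assumptions and is not the authors' implementation, which is not disclosed. The keyword set contains only the one example the text gives ('child'), and `CrawledPage`, `StoredRecord`, `apply_policy`, and the `is_disturbing` callback are hypothetical names standing in for the crawler output and the disturbing image detector.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical keyword list: the text gives only 'child' as an example.
FLAGGED_KEYWORDS = {"child"}


@dataclass
class CrawledPage:
    url: str
    text: str
    images: List[bytes] = field(default_factory=list)
    sub_urls: List[str] = field(default_factory=list)


@dataclass
class StoredRecord:
    url: str
    text: str
    images: List[bytes]
    text_only: bool


def apply_policy(page: CrawledPage,
                 is_disturbing: Callable[[bytes], bool]) -> StoredRecord:
    """Decide what may be stored for one crawled main page."""
    words = set(re.findall(r"[a-z]+", page.text.lower()))
    # Rule 1: a flagged keyword forces text-only handling.
    keyword_hit = bool(words & FLAGGED_KEYWORDS)
    # Rule 2: a positive image detection also forces text-only handling.
    image_hit = any(is_disturbing(img) for img in page.images)
    text_only = keyword_hit or image_hit
    # Images are dropped, never stored, whenever either rule fires.
    images = [] if text_only else list(page.images)
    # Rule 3: sub-URLs are deliberately not carried over, so nothing
    # downstream can follow them.
    return StoredRecord(page.url, page.text, images, text_only)
```

Keeping the decision in one function makes the policy easy to audit: the only path that stores an image is the one where neither the keyword check nor the detector has fired.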

Human rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jin, P., Kim, N., Lee, S. et al. Forensic investigation of the dark web on the Tor network: pathway toward the surface web. Int. J. Inf. Secur. 23, 331–346 (2024). https://doi.org/10.1007/s10207-023-00745-4

  • DOI: https://doi.org/10.1007/s10207-023-00745-4
