Abstract
Despite being backed by blockchain technology that promises security, immutability, and full transparency, some cryptocurrencies such as Bitcoin have been used as enablers for many illicit activities such as money laundering, terrorism financing, and ransomware payments. In this scenario, the analysis of the transactions, as well as of the entities that have generated them, has become a crucial step for law enforcement officer (LEO) investigations. However, the (pseudo) anonymity of the network, the lack of regulatory authority, the employment of anonymizer mechanisms, the evolution of entities’ behavior, and the emergence of new dynamics are just five of the main elements that make this task challenging. At the same time, the huge amount of information to be analyzed can result in a waste of time and resources, slowing the investigations. For this reason, in this work, we present Kriptosare, a tool able to classify entity behaviors belonging to Bitcoin, Bitcoin Cash, and Litecoin. On the one hand, the tool makes use of state-of-the-art machine learning techniques to reduce anonymity in the considered cryptocurrencies. The model extracts behaviors from the interactions and dynamics of different known entities involved in the transactions and then predicts the behaviors of new, unseen entities. On the other hand, Kriptosare includes a crypto simulator able to create and control a private Bitcoin, Bitcoin Cash, or Litecoin network. This unit allows the simulation of crypto transactions in a controlled way for evaluating hypotheses and/or enriching the input data. The presented tool can be used by LEOs to search and highlight the most important red flag indicators that could suggest criminal behavior, and to support their analysis by optimizing their investigation resources.
Keywords
- Cryptocurrency analysis
- Illicit transactions
- Behavior classification
- Cryptocurrency simulation
- Visualization
Introduction
Undoubtedly, the cryptocurrency industry is experiencing rapid innovation and constant evolution derived from its power and utility. Despite the promises of security, immutability, and complete transparency offered by blockchain technology, certain cryptocurrencies, particularly Bitcoin, have been utilized in both legal and illegal activities such as trading, buying goods, money laundering, scams, terrorism financing, and ransomware payments. In this sense, tackling terrorist financing through investigation, prosecution, and prevention has become a worldwide issue that extends beyond Europe. Every day, terrorists find new mediums to communicate, campaign, and finance their activities. For example, as reported by EUROPOL in the IOCTA report [1], two main trends are related to crowdfunding campaigns and generating revenue in markets. However, in both cases, to maintain anonymity, they often employ a combination of cryptocurrencies and dark market technologies [2].
Consequently, law enforcement officers (LEOs) face a critical challenge in analyzing these crypto transactions and identifying the responsible parties. The (pseudo) anonymity of the network, the absence of regulatory oversight, the utilization of anonymizer mechanisms, the changing behavior of entities, and the emergence of new dynamics all contribute to the complexity of this task. Additionally, the sheer volume of information that needs to be examined can lead to a significant waste of time and resources, thereby impeding the progress of investigations.
To tackle these needs and combat cybercrime, new paradigms such as artificial intelligence (AI) and big data can be used alongside conventional systems to create novel investigation tools. In particular, in this work, we present Kriptosare, a tool able to classify entity behaviors belonging to three main cryptocurrencies: Bitcoin (BTC), Bitcoin Cash (BCH), and Litecoin (LTC). Kriptosare extracts behaviors (or classes) from the interactions and dynamics of different known entities involved in the transactions and then predicts the behaviors of new, unseen entities. Pre-defined ML models are provided for a first classification, although users can train new models on updated information and then reclassify the whole blockchains. For this task, the blockchain information is combined with open-source external data containing information about crypto addresses and real-world entity names detected over the years. This additional information facilitates the behavior definition following the taxonomy provided by Interpol (Exchange, Mixer, Miner Pool, Marketplace, etc.) and represents the ground truth for the ML training. However, these external data show an uneven distribution, i.e., some entity behaviors are more represented than others, which introduces a class imbalance problem [3]. The imbalance problem is critical since it can strongly affect ML performance, leading the model to learn skewed scenarios. Furthermore, addressing this issue is even more challenging in cryptocurrency applications, where detecting and collecting new observation data is complex and expensive in terms of resources and costs. Indeed, it is easier to find labeled behaviors of entities related to licit transactions than of those involved in illicit activities, which are the most interesting from an investigation point of view.
For this reason, Kriptosare also includes a synthetic data generator module, i.e., a crypto simulator able to create and manage a private Bitcoin, Bitcoin Cash, or Litecoin network. The control of this crypto environment allows users to replicate real behaviors, generate synthetic data, and then use these data to address the imbalance problem introduced by external sources. More specifically, for creating their private network, users have two options: (a) deploy standard wallets, i.e., traditional, behavior-free entities, or (b) deploy pre-defined behavioral entities, i.e., intelligent wallets able to replicate the specific real behavior assigned to them. In this way, on the one hand, it is possible to enhance the performance of Kriptosare.class while reducing costs. On the other hand, LEOs can study behaviors in captivity, i.e., in an isolated and controlled environment, to improve their knowledge about them.
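To make the imbalance problem concrete, the following minimal sketch counts labeled entities per behavior class and computes how many synthetic samples a generator would have to contribute to bring every class up to the size of the largest one. The label counts and the function names are hypothetical, and matching the majority class is only one possible balancing target, not necessarily the one used in Kriptosare.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio between the most and least represented classes (1.0 means balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def synthetic_samples_needed(labels):
    """Per class, how many extra samples are needed to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Hypothetical ground-truth labels (illustrative counts, not real data).
labels = (
    ["Exchange"] * 500
    + ["Miner Pool"] * 120
    + ["Marketplace"] * 40
    + ["Mixer"] * 10
)

print(imbalance_ratio(labels))           # 50.0
print(synthetic_samples_needed(labels))  # Mixer needs 490 extra samples
```

In this toy distribution the rarest class is also the one most relevant to investigations, which is the situation the simulator is meant to compensate for.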
In summary, Kriptosare allows users to manage both the classification and the generator modules in an easy way, through an intuitive and user-friendly interface (frontend). The presented tool can be used by LEOs to search and highlight the most important red flag indicators that could suggest criminal behavior, for example, a divergence between real labels obtained from external sources and the Kriptosare.class predictions, or the usage of specific entities that are usually involved in illicit activities, such as anonymizers or tumblers. These results can also be used for supporting LEOs’ analysis and optimizing their investigation resources by focusing their effort just on the most relevant behaviors, excluding the ones that are completely unregulated and which would require longer analysis times.
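The two red flag indicators mentioned above can be illustrated with a short sketch. The record layout (`entity_id`, `external_label`, `predicted`), the function name, and the choice of "Mixer" as the only high-risk class are assumptions made for this example; they do not describe the tool's actual data model.

```python
def red_flags(entities, high_risk_classes=frozenset({"Mixer"})):
    """Return (entity_id, reasons) for every entity that raises at least one flag."""
    flagged = []
    for entity in entities:
        label = entity.get("external_label")  # may be missing for unseen entities
        predicted = entity["predicted"]
        reasons = []
        # Flag 1: the external label disagrees with the model prediction.
        if label is not None and label != predicted:
            reasons.append("label/prediction divergence")
        # Flag 2: the entity is labeled or predicted as a high-risk behavior.
        if predicted in high_risk_classes or label in high_risk_classes:
            reasons.append("high-risk behavior")
        if reasons:
            flagged.append((entity["entity_id"], reasons))
    return flagged


# Hypothetical classification results.
sample = [
    {"entity_id": "e1", "external_label": "Exchange", "predicted": "Exchange"},
    {"entity_id": "e2", "external_label": "Exchange", "predicted": "Mixer"},
    {"entity_id": "e3", "external_label": None, "predicted": "Mixer"},
]

for entity_id, reasons in red_flags(sample):
    print(entity_id, reasons)
```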
Related Work
The capability for non-transparent transactions and the absence of robust regulatory measures have spread the usage of cryptocurrency in both legal and illegal/criminal activities. The most striking case is represented by Bitcoin [4]. In fact, over the years, the number of transactions involved in activities such as money laundering, selling illegal goods, ransomware, and Ponzi schemes has sharply increased. This trend is confirmed in the “2023 Crypto Crime Report” [5] released by Chainalysis, which reports that $20.6B was moved by illicit addresses in 2022. Consequently, the task of reducing anonymity within the network and categorizing crypto entities has become challenging and essential for law enforcement agencies (LEAs) [6].
For these reasons, many studies [7,8,9] have tried to address this task by using new paradigms like artificial intelligence (AI) and machine learning (ML). However, the majority of them, although valid from an academic point of view, have not been used and validated in an operative context (investigation) by an end user. On the other hand, the most common tools like Chainalysis, Graphsense [10], BlockSci [11], Blockchair, and Ciphertrace [12] are mainly focused on detecting entity behavior by gathering tags, labels, and information from the clear and dark web, rather than using AI and ML algorithms for predicting them. In that sense, they need to be continuously fed with new external information (tags/labels) that is not always easy, or cheap, to find.
For this reason, in this study, we try to merge the two needs by introducing Kriptosare, a tool able to predict entity behaviors within cryptocurrencies using ML techniques. The tool analyzes the interactions and dynamics of entities engaged in transactions and, starting from a few known (tagged/labeled) entities, generalizes their behaviors to detect similar behaviors across the blockchain. Furthermore, Kriptosare allows the generation of synthetic data in a private and isolated environment. In this way, it is possible to reduce the issues related to the acquisition of external information.
General Architecture
As shown in Fig. 21.1, Kriptosare includes a central database (DB) and five interconnected units: four of them form the backend (kripto_data, kripto_brain, kripto_API, and kripto_twins) and one (kripto_viz) forms the frontend.
In particular, the backend is based on the following technologies:
- Python-Flask: micro web service API.
- Swagger: Python-Flask API development.
- Python scikit-learn: Python library for ML applications.
- Cassandra: database (DB).
- Litecoin and Bitcoin Core: daemons for running real wallets.
Whereas the frontend is based on:
- Vue.js: JavaScript framework.
As already described, the backend is composed of four units. The first one is the kripto_data unit, which is in charge of the data collection. More specifically, this unit allows Kriptosare to download all the available blockchains (BTC, BCH, LTC) up to the current date. This operation is done by running a blockchain daemon with a specific configuration inside a Docker container (one container for each cryptocurrency). Once these containers are created and linked to the real network (Mainnet), they start to synchronize themselves and so download the data. During this synchronization phase, a specific task is in charge of copying the blockchain data into the centralized DB so that the information can be further consumed by other units. This unit thus safeguards the network data that are created and used over the tool’s lifetime.
The second unit that composes the backend is kripto_brain. This unit represents the core of Kriptosare. In fact, it is in charge of three main processes: data preprocessing, entity creation, and feature extraction. More specifically, in the first process, blockchain data are analyzed and processed in order to extract direct relations between input and output addresses. This information is a key aspect of the LEOs’ investigations as well as for applying the follow-the-money approach [13]. Once the data are preprocessed, the entity creation process is run. This script applies common cryptocurrency heuristics [14] that allow one to link addresses controlled by the same user based on publicly available transaction information or users’ mistakes, such as address reuse. In this way, it is possible to create a cluster of addresses that represents a concrete user [15]. Finally, once the entities are created, the last process is in charge of analyzing the interactions between the entities in the blockchain and extracting the features that are the inputs of the ML model. This information is finally stored in the centralized DB. All these processes are executed as Python scripts that operate uninterrupted in the background so that new information is continuously preprocessed and updated.
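As an illustration of the entity creation step, the sketch below applies the multi-input (common-input-ownership) heuristic, one widely used address clustering heuristic: all addresses that appear together as inputs of the same transaction are assumed to be controlled by the same user. The chapter does not state which heuristics kripto_brain implements, so this is an assumed example; the function name and the transaction layout (a list of input-address lists) are hypothetical.

```python
def cluster_addresses(transactions):
    """Group addresses into entities with the multi-input heuristic.

    `transactions` is an iterable of lists, each holding the input
    addresses of one transaction. Returns a list of address sets.
    """
    parent = {}

    def find(addr):
        # Union-find lookup with path halving.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:
            continue
        first = inputs[0]
        find(first)  # register single-input addresses as well
        for other in inputs[1:]:
            union(first, other)

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return list(clusters.values())


# Hypothetical transactions: "b" is reused, which links "a", "b", and "c".
txs = [["a", "b"], ["b", "c"], ["d"], ["e", "f"]]
entities = cluster_addresses(txs)
print(sorted(sorted(entity) for entity in entities))
# [['a', 'b', 'c'], ['d'], ['e', 'f']]
```

Note that this heuristic can wrongly merge unrelated users when several parties jointly build one transaction (for example, CoinJoin), so in practice it is usually combined with additional checks.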
The primary objectives of the third unit (kripto_API) are twofold. Firstly, it serves as a conventional API, i.e., the contact point between the user interface and the data. In fact, it allows users to consult the stored information and get classification results, statistics, and so on. Secondly, kripto_API also executes the scripts that control the training of the ML models and the (re-)classification task. More specifically, the ML model used by Kriptosare follows the cascading machine learning approach presented in [16]. This ML strategy has already shown very promising performance in scientific studies. Again, all the predictions and the new models are stored in the centralized DB.
Finally, the last module that composes the backend is kripto_twins (or simulator). This unit allows generating and controlling private cryptocurrency networks (or Regtest) of BTC, BCH, or LTC. The simulator is implemented following the instructions released in [17], where Docker containers are used for simulating the different crypto wallets. In each of these containers, the appropriate crypto daemon is run, and then remote procedure call (RPC) commands are used to control these nodes for creating connections, transactions, mining blocks, and simulating complex behaviors.
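The RPC control described above can be sketched as follows, assuming a Bitcoin Core node running in regtest mode and reachable over its JSON-RPC interface. The helper names, credentials, and address placeholder are hypothetical; `generatetoaddress` is a standard Bitcoin Core RPC method, and 18443 is its default regtest RPC port. The request is only built here, not sent, so the snippet runs without a node.

```python
import base64
import json
import urllib.request


def build_rpc_request(url, user, password, method, params=None, request_id=0):
    """Build (but do not send) a JSON-RPC request for a Bitcoin Core style daemon."""
    payload = {
        "jsonrpc": "1.0",
        "id": request_id,
        "method": method,
        "params": params if params is not None else [],
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def call_rpc(request, timeout=10):
    """Send a prepared request to a running node and return its 'result' field."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["result"]


# Hypothetical node endpoint and credentials.
NODE_URL = "http://127.0.0.1:18443"

# Mine 101 blocks so that the first coinbase reward becomes spendable
# (coinbase outputs mature after 100 confirmations).
mine_request = build_rpc_request(
    NODE_URL, "rpcuser", "rpcpassword",
    "generatetoaddress", [101, "<regtest-address>"],
)
print(mine_request.get_method(), mine_request.data.decode())
```

With a node actually running, `call_rpc(mine_request)` would submit the command; other standard methods such as `getnewaddress`, `sendtoaddress`, or `addnode` can be issued the same way to create wallets' addresses, transactions, and connections.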
The frontend is based on the kripto_viz module. This unit serves as a bridge between the user and the underlying functionalities of the tool, enabling a seamless and user-friendly experience. In fact, it promotes interactivity, helping users to retrieve the classification information and to fully parametrize synthetic networks by specifying values such as the number of wallets or their behavior (within a pre-defined set of available types of behavior). It also improves the operations’ efficiency, allowing users to run complex tasks and workflows in a few steps, and at the same time, it provides meaningful error messages for recovering from mistakes and preventing critical errors. Finally, this unit allows the users to interpret and visually understand the results through a series of graphs and tables.
As shown in Fig. 21.2, the Kriptosare main page is composed of four main sections, each one highlighted with a different color. On the right-hand side (red section), there is a menu that allows the user to navigate through the tool functionalities: Classifier, for getting classification results; Model Management, dedicated to modifying or training machine learning models; Network Management, facilitating the creation, deletion, and status retrieval of private blockchains; and lastly Behavior Simulator, designed for generating transactions, mining blocks, and simulating intricate behaviors within the simulator. The central section (green area) represents the command area where the user can select the desired cryptocurrency and configure relevant parameters according to the chosen functionality. In the specific instance presented in Fig. 21.2, when users navigate to the Classifier menu, they must input a valid cryptocurrency address. At the lower part of the interface (highlighted in yellow), a log area displays all messages related to ongoing operations, their real-time statuses, and their final outcomes. This setup ensures that users remain well informed about the progress of all the operations, particularly considering that certain tasks may demand considerable time, such as training a new model or generating a substantial volume of blocks. Finally, the orange section shows the results obtained after the function’s execution. An example of the results that can be obtained using the Classifier function is reported in Fig. 21.3. Kriptosare shows statistical information related to the searched address as well as the behavior prediction (provided by the ML model) of the entity that controls or can control the address, using a very intuitive view.
All the functionalities provided by Kriptosare are available to all users without any prior knowledge. However, it is possible to distinguish two groups of users: basic users and ML experts. The first group includes users who rely on the interface to investigate crypto address predictions and on the generator to create private networks that validate their hypotheses (Classifier, Network Management, and Behavior Simulator menus). The second group includes users who know how beneficial it can be to train a new ML model and reclassify the whole blockchain, and who know how to include the synthetic data in the loop to improve the model’s abilities. In this sense, the ML experts fully exploit the model management features.
Validation and Conclusions
Kriptosare has been evaluated in two different European projects: TITANIUM and Tools4LEAs. In the first one, the initial version of the tool (a prototype) was made available to the project stakeholders (mainly LEAs from Germany, Spain, Finland, and Interpol) during two events called Field Labs. These events were Capture-The-Flag (CTF) exercises, in which LEAs used Kriptosare and other project tools to tackle challenges associated with criminal investigations and terrorist activities involving virtual currencies and underground markets in the darknet. This approach provided valuable insights from end-users regarding the tool’s relevance to their investigations and day-to-day responsibilities. Additionally, it allowed us to gather feedback on how to improve the tool, i.e., include new functionalities, increase the interoperability of the tool, and improve usability and user experience.
Thus, the tool was improved, and its maturity level was enhanced thanks to the Tools4LEAs project. In this second project, the tool was again evaluated by domain experts selected by the European Anti-Cybercrime Technology Development Association (EACTDA). The purpose of EACTDA is to support the collaboration of multiple essential stakeholders and provide technological solutions for European Law Enforcement Agencies and Forensic Laboratories to use in their fight against crime. In this second validation, domain experts had the chance to read the tool documentation (installation and user guide), and then they tested it freely, using only the provided materials as a guide. After that, they evaluated the tool according to the eight software characteristics defined by the ISO/IEC 25010:2011 standard. Finally, the experts answered some final questions, such as the type of enhancements they would suggest for the tool and whether they considered the tool valuable in their investigations. At the end of this process, Kriptosare was a fully tested and operational tool ready to be used by EU public security organizations for fighting cybercrime.
In this chapter, Kriptosare, a tool for cryptocurrency entity behavioral analysis and simulation, is presented. Some preliminary results gathered from LEAs, practitioners, and domain experts showed the potential of this tool and its application in use case investigations. However, on its first deployment, the tool takes a long time to bring all the blockchains up to date (depending on the physical resources). In fact, each time a new instance of Kriptosare is run, it needs months to download, preprocess, train, and classify all the data of the three blockchains, considering that the Bitcoin blockchain alone has about 866 M transactions and more than 1000 M addresses generated in 14 years (until the publication date).
As a product of the Tools4LEAs project, Kriptosare is now accessible to EU public security organizations, practitioners, and customers. To gain access to the tool, interested parties may reach out to EACTDA at info@eactda.eu.
References
Europol. (2021). Internet Organised Crime Threat Assessment (IOCTA). Publications Office of the European Union.
Europol. (2022). European Union terrorism situation and trend report. Publications Office of the European Union.
Zola, F., et al. (2022). Attacking bitcoin anonymity: Generative adversarial networks for improving Bitcoin entity classification. Applied Intelligence, 52(15), 17289–17314.
Dhali, M., et al. (2023). Cryptocurrency in the Darknet: Sustainability of the current national legislation. International Journal of Law and Management.
Chainalysis. (2023). The 2023 crypto crime report.
Zola, F., et al. (2019). Bitcoin and cybersecurity: Temporal dissection of blockchain data to unveil changes in entity behavioral patterns. Applied Sciences, 9(23).
Mujlid, H. (2023). A survey on machine learning approaches in cryptocurrency: Challenges and opportunities. In 2023 4th international conference on Computing, Mathematics and Engineering Technologies (iCoMET). IEEE.
Turner, A. B., McCombie, S., & Uhlmann, A. J. (2020). Analysis techniques for illicit bitcoin transactions. Frontiers in Computer Science, 2.
Lorenz, J., et al. (2020). Machine learning methods to detect money laundering in the bitcoin blockchain in the presence of label scarcity. In Proceedings of the first ACM international conference on AI in finance.
Haslhofer, B., et al. (2021). GraphSense: A general-purpose cryptoasset analytics platform. arXiv preprint arXiv:2102.13613.
Kalodner, H., et al. (2020). BlockSci: Design and applications of a blockchain analysis platform. In 29th USENIX Security Symposium (pp. 2721–2738).
Srivasthav, D. P., Maddali, L. P., & Vigneswaran, R. (2021). Study of blockchain forensics and analytics tools. In 2021 3rd conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS). IEEE.
Dearden, T. E., & Tucker, S. E. (2023). Follow the money: Analyzing Darknet activity using cryptocurrency and the bitcoin Blockchain. Journal of Contemporary Criminal Justice, 39(2), 257–275.
Zhang, Y., Wang, J., & Luo, J. (2020). Heuristic-based address clustering in bitcoin. IEEE Access, 8, 210582–210591.
Androulaki, E., et al. (2013). Evaluating user privacy in bitcoin. In Financial cryptography and data security: 17th international conference, FC 2013, Okinawa, Japan, April 1–5, 2013 (Revised Selected Papers 17). Springer.
Zola, F., Eguimendia, M., Bruse, J. L., & Urrutia, R. O. (2019). Cascading machine learning to attack bitcoin anonymity. In 2019 IEEE international conference on Blockchain (Blockchain) (pp. 10–17). IEEE.
Zola, F., Pérez-Solà, C., Zubia, J. E., Eguimendia, M., & Herrera-Joancomartí, J. (2019). Kriptosare. gen, a dockerized bitcoin testbed: Analysis of server performance. In 2019 10th IFIP international conference on new technologies, mobility and security (NTMS) (pp. 1–5). IEEE.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2025 The Author(s)
About this chapter
Cite this chapter
Zola, F., Elduayen, J., Pallin, I., Orduna-Urrutia, R. (2025). Kriptosare: Behavior Analysis in Cryptocurrency Transactions. In: Gkotsis, I., Kavallieros, D., Stoianov, N., Vrochidis, S., Diagourtas, D., Akhgar, B. (eds) Paradigms on Technology Development for Security Practitioners. Security Informatics and Law Enforcement. Springer, Cham. https://doi.org/10.1007/978-3-031-62083-6_21
DOI: https://doi.org/10.1007/978-3-031-62083-6_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-62082-9
Online ISBN: 978-3-031-62083-6
eBook Packages: Physics and Astronomy (R0)