Abstract
Traditional anti-anonymity technologies for Bitcoin transactions fall into two types. The first is network-layer anti-anonymity technology, which locates the originating IP of a specific transaction by inferring the propagation path of the transaction through the network. The second is transaction-layer anti-anonymity technology, which analyzes Bitcoin ledger data to build an on-chain behavior portrait of the user to whom a specific wallet address belongs. In this work, we propose a new anti-anonymity technology: we construct transaction behavior vectors from Bitcoin ledger data and social behavior vectors from off-chain social data, and build a model that maps and aligns the two kinds of vectors. Experimental tests show that the proposed technology is more accurate and more effective in practice. Furthermore, the technology is also suitable for the anti-anonymity of other virtual currencies.
1 Introduction
Bitcoin is a purely peer-to-peer version of electronic cash [1], which allows online payments to be sent directly from one party to another without going through a financial institution. It relies on digital signatures to prove ownership and on a public history of transactions to prevent double-spending. Bitcoin does not rely on third-party credit and has strong anonymity, which is reflected in three aspects. First, transaction addresses are anonymous: a Bitcoin address is created by the user independently, is unrelated to the user’s identity information, and requires no third party to create or use. Second, transaction behavior is fragmented: the Bitcoin system lets users generate a different address for each transaction, so a user’s transaction information can be dispersed arbitrarily across many anonymous addresses. Third, the source of a Bitcoin transaction packet is difficult to find in the network: the Bitcoin communication network uses a P2P protocol with no central node, and transaction information is broadcast across the whole network, so it is difficult to track the origin of a transaction by monitoring a single server. Because of this strong anonymity, Bitcoin is often used in gambling, illegal fund-raising, fraud, pyramid selling, money laundering and other illegal activities.
Traditional anti-anonymity technology for Bitcoin transactions mainly includes two types. The first is the network-layer method, which detects and collects the transaction information broadcast at the Bitcoin network layer, analyzes the propagation path of a specific Bitcoin transaction in the P2P network, infers the IP address of the service node that originated the transaction, and then locates the IP of the user who made the transaction. The second is the transaction-layer method, which obtains user portrait information for a specific wallet address by analyzing the transaction relationships between different addresses, especially with the help of address labels for exchanges, mining pools and other institutions. Both types have limited practical effect because they cannot trace the social identity of the user to whom a transaction address belongs.
To address these shortcomings, this paper integrates on-chain and off-chain data and proposes an anti-anonymity technology for Bitcoin transactions based on a behavior vector mapping and aligning model. We build social behavior vectors from off-chain social data and establish a mapping and aligning model between them and the transaction behavior vectors built from Bitcoin ledger data, which realizes the anti-anonymity of Bitcoin addresses and transactions. Because the social behavior vector contains the real social identity information of users, the proposed technology has a better practical effect than the traditional anti-anonymity technologies.
2 Bitcoin Transaction Overview
Every transaction in the blockchain has a list of inputs and a list of outputs, which record the addresses used in the transaction and the amount of coins spent. The inputs of the current transaction come from the outputs of previous transactions, and the outputs of the current transaction will be used as inputs in other transactions, which forms a transaction chain (see Fig. 1).
A typical payment has either a single input from a larger previous transaction or multiple inputs combining smaller amounts, and at most two outputs: one for the payment, and one returning the change, if any, back to the sender. The change output will be automatically selected by the Bitcoin client as an input in future transactions.
Bitcoin transactions can be roughly divided into two types. The first type is the mining reward transaction. Each block has one mining reward transaction, which has no inputs and only outputs: the system transfers the mining reward of the block and the fees of the transactions contained in the block to the outputs. The second type is the ordinary transaction, which has several inputs and several outputs.
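The input/output chaining described above can be illustrated with a minimal Python sketch. The class names, the in-memory `ledger` dictionary, and all amounts are hypothetical and chosen only for illustration; real Bitcoin transactions also carry scripts and signatures, which are omitted here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TxInput:
    prev_txid: str   # transaction whose output is being spent
    prev_index: int  # position of that output in the previous transaction


@dataclass(frozen=True)
class TxOutput:
    address: str
    value: int  # amount in satoshis


@dataclass
class Transaction:
    txid: str
    inputs: List[TxInput]
    outputs: List[TxOutput]

    @property
    def is_mining_reward(self) -> bool:
        # A mining reward transaction has no inputs, only outputs.
        return len(self.inputs) == 0


def input_value(tx: Transaction, ledger: Dict[str, Transaction]) -> int:
    """Sum the previous outputs that this transaction's inputs point to."""
    return sum(ledger[i.prev_txid].outputs[i.prev_index].value for i in tx.inputs)


def fee(tx: Transaction, ledger: Dict[str, Transaction]) -> int:
    """Fee of an ordinary transaction: input value minus output value."""
    if tx.is_mining_reward:
        return 0  # mining reward transactions spend nothing, so they pay no fee
    return input_value(tx, ledger) - sum(o.value for o in tx.outputs)


# Hypothetical two-transaction chain: a mining reward, then a payment with change.
reward = Transaction("tx0", [], [TxOutput("miner", 625_000_000)])
payment = Transaction(
    "tx1",
    [TxInput("tx0", 0)],
    [TxOutput("shop", 100_000_000), TxOutput("miner_change", 524_990_000)],
)
ledger = {tx.txid: tx for tx in (reward, payment)}
payment_fee = fee(payment, ledger)
```

In this sketch the payment spends the single output of the reward transaction and returns the change to a new address, so the difference between input and output value is the fee left for the miner.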
The multiple input addresses of a transaction correspond to different private keys, and spending each input requires a signature from the corresponding private key. Therefore, it is generally believed that the multiple input addresses of one transaction belong to the same entity. With the help of transaction address clustering, the dispersed transaction behaviors of the same entity in the ledger can be gathered together, which makes it easier to grasp the behavior characteristics of the entity.
There are four kinds of transaction address clustering technology [2]. The first is clustering based on multiple input addresses: the multiple input addresses of a transaction belong to the same address cluster. The second is clustering based on the change address: the change address of a transaction belongs to the same address cluster as its input addresses, and, with the change address as the connecting link, the input addresses of two transactions can be merged into the same address cluster. The third is clustering based on mining reward transactions: the multiple output addresses of a mining reward transaction belong to the same address cluster. The fourth is comprehensive clustering, which combines the three techniques above.
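The comprehensive clustering idea can be sketched with a union-find (disjoint-set) structure. This is a minimal illustration and not the implementation of [2]: the dictionary layout of a transaction and the `change` field are assumptions, and the sketch takes the change address as given because the text does not specify how it is detected.

```python
class AddressClusters:
    """Union-find over Bitcoin addresses (illustrative helper)."""

    def __init__(self):
        self._parent = {}

    def find(self, addr):
        self._parent.setdefault(addr, addr)
        root = addr
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited address directly at the root.
        while self._parent[addr] != root:
            self._parent[addr], addr = root, self._parent[addr]
        return root

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_cluster(self, a, b):
        return self.find(a) == self.find(b)


def cluster_addresses(transactions):
    clusters = AddressClusters()
    for tx in transactions:
        inputs, outputs = tx["inputs"], tx["outputs"]
        for addr in inputs + outputs:
            clusters.find(addr)  # register every address, even if it stays alone
        if not inputs:
            linked = list(outputs)  # mining reward: outputs share one cluster
        else:
            linked = list(inputs)   # multi-input: inputs share one cluster
            if tx.get("change"):
                linked.append(tx["change"])  # change joins the input cluster
        for addr in linked[1:]:
            clusters.union(linked[0], addr)
    return clusters


# Hypothetical transactions; address names are placeholders.
transactions = [
    {"inputs": ["A1", "A2"], "outputs": ["X", "A3"], "change": "A3"},
    {"inputs": ["A3", "A4"], "outputs": ["Y"], "change": None},
    {"inputs": [], "outputs": ["M1", "M2"]},
]
clusters = cluster_addresses(transactions)
```

Here the change address A3 links the two ordinary transactions, so A1, A2, A3 and A4 end up in one cluster, while the payment recipients X and Y stay separate.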
3 Transaction Scene Graph Structure
Bitcoin transaction scenes include mining rewards, deposits and withdrawals on exchanges, gambling, blackmail, MLM fraud, etc. Among them, deposits and withdrawals of Bitcoin on exchanges are more common.
A deposit transaction transfers Bitcoin held at the user’s personal wallet address to the deposit wallet address assigned to the user by the exchange. The private key of the deposit wallet address is controlled by the exchange, and different deposit wallet addresses correspond to different users. Deposit transactions include the customer-to-customer (C2C) transaction scene and the business-to-customer (B2C) transaction scene.
The graph structure of a C2C deposit transaction generally has a small number of transaction inputs and two outputs, one of which is the user’s deposit wallet address, whose cluster label is the name of the exchange (see Fig. 2).
The graph of the B2C deposit transaction scene has a 1-to-N structure. It is generally characterized by a small number of transaction input addresses and a large number of transaction outputs, in which the output addresses are the deposit wallet addresses of a large number of different users, and the cluster labels of the output addresses may be the same exchange or different exchanges (see Fig. 3).
A withdrawal transaction transfers Bitcoin hosted on the exchange to the wallet address specified by the user. To reduce transaction fees, an exchange usually collects the withdrawal orders of multiple users and transfers Bitcoin to their wallet addresses in one transaction.
The graph structure of a withdrawal transaction is also 1-to-N. The cluster labels of the transaction input addresses are the same exchange, and the transaction output addresses are specified by a large number of different users (see Fig. 4).
Because each transaction incurs a fee, deposit and withdrawal are in practice sometimes combined in one transaction: a user withdraws Bitcoin from one exchange and deposits it directly to another exchange.
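The graph-structure characteristics above can be turned into a rough rule-based classifier. The following Python sketch is only an illustration: the text says "a small number" of inputs and "a large number" of outputs without giving values, so the thresholds `MAX_FEW_INPUTS` and `MIN_MANY_OUTPUTS` are assumptions, as is the convention that a cluster label is an exchange name and `None` means an unlabeled address. The combined withdrawal-and-deposit case is not handled.

```python
from typing import List, Optional

MAX_FEW_INPUTS = 3     # assumed meaning of "a small number of inputs"
MIN_MANY_OUTPUTS = 10  # assumed meaning of "a large number of outputs"


def classify_scene(
    input_labels: List[Optional[str]],
    output_labels: List[Optional[str]],
) -> str:
    """Guess the transaction scene from cluster labels of inputs and outputs."""
    input_exchanges = {label for label in input_labels if label}
    labeled_outputs = [label for label in output_labels if label]

    inputs_from_one_exchange = len(input_exchanges) == 1 and all(input_labels)
    few_inputs = 1 <= len(input_labels) <= MAX_FEW_INPUTS
    many_outputs = len(output_labels) >= MIN_MANY_OUTPUTS

    # Withdrawal: inputs carry the same exchange label, 1-to-N structure.
    if inputs_from_one_exchange and many_outputs:
        return "withdrawal"
    # C2C deposit: few inputs, two outputs, exactly one is an exchange address.
    if few_inputs and len(output_labels) == 2 and len(labeled_outputs) == 1:
        return "c2c_deposit"
    # B2C deposit: few inputs, many outputs, all outputs are exchange addresses.
    if few_inputs and many_outputs and len(labeled_outputs) == len(output_labels):
        return "b2c_deposit"
    return "unknown"


# Hypothetical examples; "ExchangeA" and "ExchangeB" are placeholder labels.
c2c = classify_scene([None], ["ExchangeA", None])
b2c = classify_scene([None, None], ["ExchangeA"] * 6 + ["ExchangeB"] * 6)
withdrawal = classify_scene(["ExchangeA"], [None] * 12)
other = classify_scene([None], [None, None])
```

Fixed thresholds are a deliberate simplification; in practice the boundary between "few" and "many" would have to be tuned against labeled transactions.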
4 Traditional Bitcoin Anti-anonymity Technology
Traditional Bitcoin anti-anonymity technology mainly includes network layer anti-anonymity technology and transaction layer anti-anonymity technology.
Network layer anti-anonymity technology [3] refers to collecting the transaction packets transmitted in the Bitcoin P2P network, analyzing the propagation path of a specific Bitcoin transaction packet in the P2P network, and inferring the server IP of the node that first broadcast it. For example, Koshy et al. [4] used special transactions to find the originating node. Most normal transactions will be forwarded once by multiple nodes, while transactions with a wrong format will only be forwarded once by the originating node. Therefore, this feature can be used to identify the originating node of special transactions. However, because special transactions make up only a small proportion of all transactions, the effect of this method is limited. In addition, Biryukov et al. [5, 6] proposed a transaction tracing mechanism based on neighbor nodes, which improves tracing accuracy by using neighbor nodes as the basis for judgment. However, the scheme needs to continuously send packets to all nodes in the Bitcoin network, which may cause serious interference to the network.
Network layer anti-anonymity technology can infer, with a certain probability, the IP of the service node that originated a transaction. Gao Feng, Mao Hong-liang and others [3] achieved anti-anonymity tracing with a recall rate of 60% and an accuracy rate of 35.3%. Tracing and locating from the service node IP to the end-user IP further requires the operator’s traffic analysis technology and IP positioning data.
Transaction layer anti-anonymity technology refers to finding the correlation between different Bitcoin addresses by analyzing the transaction records in the Bitcoin ledger, so as to infer the transaction behavior patterns and capital flow of a transaction address. Liao et al. [7] analyzed the extortion process of the CryptoLocker ransomware using Bitcoin ledger data, found multiple Bitcoin addresses belonging to the extortion organization, and identified a large number of Bitcoin ransom transactions. Meiklejohn et al. [8] used heuristic cluster analysis to identify multiple Bitcoin addresses belonging to the Silk Road website. Guo Wen-sheng et al. [9] studied how to classify Bitcoin entities with different types of characteristics through machine learning on Bitcoin ledger data.
Transaction layer anti-anonymity technology can analyze and infer the on-chain trading behavior characteristics of a specific wallet address. Combined with the label information of exchanges, mining pools and other platform institutions, it can infer the ownership of some wallet addresses, but it is difficult to determine the user’s social identity information. In practice, investigations of many Bitcoin hacking incidents analyze the transaction data of the Bitcoin ledger, track the exchange into which the Bitcoin is transferred, and then ask the exchange to provide the user information for the Bitcoin addresses involved.
In recent years, Bitcoin anti-anonymity technology that integrates on-chain and off-chain data has gradually become a research hotspot. Jawaheri et al. [10] identified Tor hidden services and their users by integrating online social network data and Bitcoin ledger data.
5 Behavior Vectors Mapping and Aligning Model
Due to the anonymity of Bitcoin transaction addresses and the trading process, and the poor readability of Bitcoin ledger data, most centralized institutions or platforms, such as exchanges and mixing services, synchronously record the user identity information and behavior information corresponding to Bitcoin ledger data. We call this data off-chain social data. Although it does not contain Bitcoin addresses, making full use of it makes it possible to locate and de-anonymize the transaction behavior recorded in the Bitcoin ledger.
We define the social behavior vector S with five dimensions: [time, value, scene, name, account]. Time is the time when the user receives the social data, value is the amount of Bitcoin stated in the social data, scene is the transaction scene described in the social data, name is the platform name, and account is the user’s social account. If only time and value are considered, and the transaction scene and platform name are missing or ignored, the accuracy of anti-anonymity will suffer in some complex cases.
Similarly, we define the transaction behavior vector E with seven dimensions: [time, value, scene, input label, output label, input address, output address]. Time is the transaction time recorded in the Bitcoin ledger, value is the amount of Bitcoin in the transaction output, scene is the transaction scene inferred through graph structure analysis, input label is the cluster label of the transaction input address, output label is the cluster label of the transaction output address (non-change address), input address is the transaction input address, and output address is the transaction output address (non-change address). Suppose the transaction behavior vector E and the social behavior vector S satisfy the following conditions:
① The difference between S.time and E.time is small, that is, the social time is close to the Bitcoin ledger transaction time, such as less than 10 min;
② S.value is equal to E.value, that is, the transaction values on and off the chain are consistent;
③ S.scene is equal to E.scene, that is, the trading scenes on and off the chain are consistent;
④ For a deposit transaction, S.name is equal to E.output label, that is, the platform name is consistent with the address cluster label on the chain.
Then the user’s social account S.account can be considered to correspond to the Bitcoin transaction address E.output address. A social account is more unique and more socially meaningful than an IP address or a user behavior portrait, and therefore better reflects the user’s social identity information.
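The two vectors and the four alignment conditions can be written down directly. The sketch below is a minimal Python illustration, not the authors’ implementation: the class and function names are hypothetical, `Decimal` is an assumed choice that avoids floating-point surprises in the equality test on values, and for scenes other than deposit no label check is applied because the text states condition ④ only for deposit transactions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class SocialVector:  # S = [time, value, scene, name, account]
    time: datetime
    value: Decimal
    scene: str
    name: str
    account: str


@dataclass(frozen=True)
class TransactionVector:  # E = [time, value, scene, labels, addresses]
    time: datetime
    value: Decimal
    scene: str
    input_label: Optional[str]
    output_label: Optional[str]
    input_address: str
    output_address: str


def is_aligned(
    s: SocialVector,
    e: TransactionVector,
    max_gap: timedelta = timedelta(minutes=10),
) -> bool:
    """Check conditions 1 to 4 for one social vector and one transaction vector."""
    close_in_time = abs(s.time - e.time) < max_gap          # condition 1
    same_value = s.value == e.value                         # condition 2
    same_scene = s.scene == e.scene                         # condition 3
    label_ok = s.scene != "deposit" or s.name == e.output_label  # condition 4
    return close_in_time and same_value and same_scene and label_ok


# Hypothetical vectors; names, addresses and amounts are placeholders.
s = SocialVector(datetime(2020, 5, 12, 14, 24), Decimal("0.0100"),
                 "deposit", "ExampleExchange", "account_A")
e = TransactionVector(datetime(2020, 5, 12, 14, 19), Decimal("0.0100"),
                      "deposit", None, "ExampleExchange",
                      "addr_in", "addr_deposit")
mapping = {e.output_address: s.account} if is_aligned(s, e) else {}
```

When all conditions hold, the output address of E is mapped to the social account of S, which is the link between the on-chain address and the off-chain identity.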
6 Experiment and Result Analysis
To verify that the on-chain and off-chain behavior vector mapping and aligning model can realize the anti-anonymity of Bitcoin transactions more accurately, we conducted an experimental test on the deposit (recharge) transactions of a platform. The experimental process is as follows:
① Deposit Bitcoin to the two deposit wallet addresses assigned by the exchange, then receive 26 social messages sent by the exchange through two social accounts. The 26 social messages correspond to 26 social behavior vectors, of which 11 belong to social account A and 15 belong to social account B. A sample social behavior vector after anonymization is: [‘2020-05-12 14:24’, ‘0.010 *’, deposit, * exchange, ‘account’]
② Determine the time window of the Bitcoin ledger data. In this experiment, the window starts 20 min before the time of the social behavior vector and ends 10 min after it.
③ Extract the time and value fields of each social behavior vector, compare the value with the output values of the Bitcoin ledger transactions in the time window, and choose the transaction outputs with an equal value.
④ Analyze the graph structure of each chosen transaction, and keep the transactions whose transaction scene is the same as the scene of the social behavior vector.
⑤ Among the transaction output addresses, choose the address whose cluster label is consistent with the exchange name in the social behavior vector.
The experimental results are shown in the following table (see Table 1):
Eleven social behavior vectors of social account A are respectively aligned with eleven C2C deposit transaction behavior vectors, and these Bitcoin transaction behavior vectors belong to one Bitcoin address, which is also the deposit address opened by the exchange for user A. Fifteen social behavior vectors of social account B are respectively aligned with fifteen B2C deposit transaction behavior vectors, and these Bitcoin transaction behavior vectors belong to one Bitcoin address, which is also the deposit address opened by the exchange for user B.
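The filtering steps of the experiment can be sketched as a small Python pipeline. This is an illustration under stated assumptions, not the code used in the experiment: the dictionary keys, the function name `candidate_addresses`, and all sample records are hypothetical, and only the asymmetric time window (20 min before, 10 min after) is taken from the text.

```python
from datetime import datetime, timedelta
from decimal import Decimal

WINDOW_BEFORE = timedelta(minutes=20)  # window start, from step 2
WINDOW_AFTER = timedelta(minutes=10)   # window end, from step 2


def candidate_addresses(social, ledger_outputs):
    """Apply steps 2 to 5 to one social behavior vector."""
    start = social["time"] - WINDOW_BEFORE
    end = social["time"] + WINDOW_AFTER
    in_window = [o for o in ledger_outputs if start <= o["time"] <= end]
    same_value = [o for o in in_window if o["value"] == social["value"]]
    same_scene = [o for o in same_value if o["scene"] == social["scene"]]
    same_label = [o for o in same_scene if o["output_label"] == social["name"]]
    return [o["output_address"] for o in same_label]


# Hypothetical social message and ledger outputs.
social = {
    "time": datetime(2020, 5, 12, 14, 24),
    "value": Decimal("0.0100"),
    "scene": "deposit",
    "name": "ExampleExchange",
    "account": "account_A",
}


def output(hour, minute, value, label, address):
    return {
        "time": datetime(2020, 5, 12, hour, minute),
        "value": Decimal(value),
        "scene": "deposit",
        "output_label": label,
        "output_address": address,
    }


ledger_outputs = [
    output(14, 10, "0.0100", "ExampleExchange", "addr_A"),  # passes every step
    output(14, 10, "0.0200", "ExampleExchange", "addr_B"),  # different value
    output(13, 50, "0.0100", "ExampleExchange", "addr_C"),  # outside the window
    output(14, 20, "0.0100", "OtherExchange", "addr_D"),    # different label
]
matches = candidate_addresses(social, ledger_outputs)
```

Applying the filters in this order keeps the cheap time and value comparisons first, so the graph structure analysis behind the scene field only has to be run on the few transactions that survive them.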
7 Conclusion
The anti-anonymity technology for Bitcoin transactions proposed in this paper, based on the behavior vector mapping and aligning model, realizes the fusion analysis of on-chain and off-chain data. Compared with the traditional anti-anonymity technologies, it has a stronger practical effect because it links transaction addresses to users’ social accounts. The proposed technology is also applicable to the anti-anonymity of other virtual currencies, such as Ethereum and Tether (USDT).
References
1. Nakamoto, S.: Bitcoin: A peer-to-peer electronic cash system [EB/OL]. https://www.Bitcoin.org/Bitcoin.pdf. Last accessed 18 May 2022
2. Hong-liang, M.A., Zhen, W.U., Min, H.E., Ji-qiang, T.A., Meng, S.H.: Heuristic approaches based clustering of bitcoin addresses. J. Beijing Univ. Posts Telecom 41(2), 27–31 (2018)
3. Gao, F., Mao, H.L., Wu, Z., Shen, M., Zhu, L., Li, Y.D.: Lightweight transaction tracing technology for bitcoin. Chin. J. Comput. 41(5), 899–1004 (2018)
4. Koshy, P., Koshy, D., McDaniel, P.: An analysis of anonymity in Bitcoin using P2P network traffic. In: Christin, N., Safavi-Naini, R. (eds.) FC 2014. LNCS, vol. 8437, pp. 469–485. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45472-5_30
5. Biryukov, A., Khovratovich, D., Pustogarov, I.: Deanonymisation of clients in Bitcoin P2P network. In: Proceedings of the 21st ACM Conference on Computer and Communications Security, pp. 15–29. New York, USA (2014)
6. Pustogarov, I.: Deanonymisation Techniques for Tor and Bitcoin [Ph.D. dissertation]. University of Luxembourg, Luxembourg (2015)
7. Liao, K., Zhao, Z., Doupe, A., et al.: Behind closed doors: measurement and analysis of CryptoLocker ransoms in Bitcoin. In: Proceedings of the Symposium on Electronic Crime Research, pp. 1–13. Toronto, Canada (2016)
8. Meiklejohn, S., Pomarole, M., Jordan, G., et al.: A fistful of Bitcoins: characterizing payments among men with no names. In: Proceedings of the Conference on Internet Measurement, pp. 127–140. Barcelona, Spain (2013)
9. Guo, W., Yang, X., Feng, Z., Zhang, L., Yang, J.: Research of de-anonymizing method based on machine learning for bitcoin. Comput. Eng. 47(12), 47–53 (2021)
10. Jawaheri, H.A., Sabah, M.A., Boshmaf, Y., et al.: When a small leak sinks a great ship: deanonymizing Tor hidden service users through Bitcoin transactions analysis [EB/OL]. https://arxiv.org/pdf/1801.07501.pdf. Last accessed 18 May 2022
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
About this paper
Cite this paper
Lin, S., Mao, H., Wu, Z., Yang, J. (2022). Research on Bitcoin Anti-anonymity Technology Based on Behavior Vectors Mapping and Aligning Model. In: Lu, W., Zhang, Y., Wen, W., Yan, H., Li, C. (eds) Cyber Security. CNCERT 2022. Communications in Computer and Information Science, vol 1699. Springer, Singapore. https://doi.org/10.1007/978-981-19-8285-9_10
DOI: https://doi.org/10.1007/978-981-19-8285-9_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8284-2
Online ISBN: 978-981-19-8285-9
eBook Packages: Computer Science, Computer Science (R0)