1 Introduction

Since its introduction, Bitcoin [23] has revolutionized the world of finance and of distributed consensus. Systems such as Paxos [18] had previously provided decentralized consensus, but these were fairly complex systems and did not guarantee availability. In contrast, Bitcoin introduced Nakamoto consensus, a probabilistic consensus that provides high availability. Other protocols have since attempted to improve upon its design. Ethereum [30] introduced quasi-Turing complete smart contracts. Tezos [11] sought more graceful evolution of a blockchain protocol by letting clients vote on blockchain proposals and automatically upgrading to successful proposals, thus avoiding many problems with hard forks. Several protocols sought to improve on the consensus mechanism, either by modifying the proof-of-work mechanism [16, 17, 22] or by eliminating it entirely [7, 12, 15, 28].

The blockchain is central to the design of Bitcoin (footnote 1). With this data structure, a block is chained to the previous block by a cryptographic hash. This chain of hashes continues back to the first block in the chain, referred to as the genesis block. With this design, any change made to a block invalidates all subsequent blocks. Conflicting blocks might both be valid, but blockchain protocols resolve this with clear rules for prioritizing blocks; in the case of Bitcoin, the blockchain with the most hashing power used in its computation takes priority.
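The chaining property can be illustrated with a minimal sketch. The block layout and the function names (hashBlock, buildChain, firstInvalidBlock) are assumptions made for illustration; they are not Bitcoin's or SpartanGold's actual formats.

```javascript
const crypto = require('crypto');

// Hash a block's data together with the hash of its predecessor.
function hashBlock(prevHash, data) {
  return crypto.createHash('sha256')
    .update(prevHash + JSON.stringify(data))
    .digest('hex');
}

// Build a chain in which every block stores its predecessor's hash.
function buildChain(entries) {
  const chain = [];
  let prevHash = '0'.repeat(64); // placeholder predecessor for the genesis block
  for (const data of entries) {
    const hash = hashBlock(prevHash, data);
    chain.push({ prevHash, data, hash });
    prevHash = hash;
  }
  return chain;
}

// Return the index of the first block that fails validation, or -1 if none does.
function firstInvalidBlock(chain) {
  let prevHash = '0'.repeat(64);
  for (let i = 0; i < chain.length; i++) {
    const block = chain[i];
    const linked = block.prevHash === prevHash;
    const intact = hashBlock(block.prevHash, block.data) === block.hash;
    if (!linked || !intact) return i;
    prevHash = block.hash;
  }
  return -1;
}
```

In this sketch, editing a block's data makes that block invalid, and recomputing only that block's hash simply moves the failure to the next block, since its stored predecessor hash no longer matches.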

The blockchain data structure has traditionally been used in decentralized, distributed systems. In this paper, we show how the blockchain may be useful in a centralized system.

We create a logging framework that stores its logging messages as transactions in a local proof-of-work blockchain. If the system is later compromised by an attacker, the blockchain helps to prevent the attacker from modifying the logging messages without detection. An attacker could log fraudulent messages after the compromise. They could even modify recent log messages by rewriting the blockchain and expending the necessary computation to redo the search for valid proof-of-work solutions. However, the older the message, and the more deeply that message is buried in the blockchain, the more difficult it becomes for the attacker to tamper with the log without detection.

To evaluate our idea, we implemented our logging framework using the SpartanGold blockchain framework [4]. We produced a dataset of 30 sample blockchains in JSON format, and then experimented with various attacks.

Our results show that unsophisticated attacks where an attacker simply rebuilds the blockchain from a specific block are easily detected. A more subtle attacker would modify the block times so that each rebuilt block takes only slightly longer than would be typical for a block to be produced. Our results show that the attacker can only change the block creation times by a fairly slight amount to avoid detection.

The algorithm used to detect this type of tampering is a deep learning model trained on pairs of hash IDs and timestamp differences between the current and previous block (this approximates the time required to create the current block). We noticed that the model was not able to detect such an attack when the difference between the timestamps was comparable with the average time of the creation of a legitimate block. Precisely, the model became unreliable when the range of time between the current timestamp and the previous one was between 2 and 8 s.

Our defense gives the attacker a challenging dilemma. The more aggressively the attacker increments the block time, the more likely they are to be caught; however, if they are too cautious, an outside observer might notice the change before the blockchain can be rebuilt.

2 Background and Related Work

Many have observed the potential utility of storage on the blockchain and its value in providing a public, tamper-resistant data store. Indeed, storing data on the Bitcoin blockchain was frequent enough that Bitcoin added support for an OP_RETURN opcode to allow the storage to be done in a more efficient manner.

A few protocols have focused on using storage as part of their consensus. Permacoin [22] sought to reduce Bitcoin’s proof-of-work requirements by requiring that miners also prove that they are storing a portion of the Library of Congress; storage here is a “public good”, and is not intended to be used for storing arbitrary data. Spacemint [25] replaces the proof-of-work portion of Bitcoin with a proof-of-space system; this data is “junk” data, in the sense that it has no utility outside of the protocol.

Namecoin [24] is a fork of Bitcoin focused on storing arbitrary data, acting as a decentralized and distributed key-value store. Ali et al. introduce Blockstack [2] to serve as a decentralized public-key infrastructure (PKI); their initial model used Namecoin, though they later migrated to Bitcoin due to the greater security offered by the Bitcoin network’s stronger hashing power.

Ethereum offers storage on its blockchain, but it is expensive. Several blockchain protocols have attempted to provide storage, including Filecoin [10], Storj [27], Siacoin [29], and 0Chain [21].

Intrusion detection has been tackled with machine learning algorithms with promising results. Since the first intrusion detection system (IDS) was proposed in 1980 [3], the field has evolved through several stages. Analytical and statistical techniques have recently been replaced by machine learning models, which are better at detecting attacks. In [9], deep learning is compared with traditional intrusion analysis, showing how much more effective machine learning is in this field. Another example is the survey in [1], which analyzes and compares different machine learning techniques for network intrusion detection. The survey concludes that deep learning is the most widely applied and best-performing approach.

Network Intrusion Detection Systems (NIDS) are among the most explored applications of IDS; however, IDS have been applied in a vast range of areas, such as detecting compromised land vehicles and aerial drones. In [26], the authors explore attacks on automotive Controller Area Networks (CANs), applying machine learning techniques to identify the theft of a vehicle. The work in [14] uses machine learning and blockchain technology to guard against intrusions in the piloting system of Unmanned Aerial Vehicles (UAVs). In this case, the blockchain is used to share and upgrade several machine learning models, ensuring that the most up-to-date model is used at all times without external compromise.

Biometric evaluation is also applied to identify intrusion. The work in [13], for example, assumes that the intruder has access to real users’ gesture data and applies Convolutional Neural Networks (CNNs) to detect tampering of the data to bypass user authentication.

While many papers [6, 8, 19] have combined blockchain technology and machine learning, the literature appears to lack work that assumes a complete compromise of the machines hosting the blockchain. To the best of our knowledge, no application of machine learning to detecting compromised blocks has yet been proposed.

3 Logger Design and Implementation

Our logging framework writes messages to transactions on a local blockchain, helping to ensure the integrity of logging messages even should the system hosting the logs be compromised.

In this section, we briefly give an overview of the SpartanGold blockchain framework, review the code for our logging library, and finally discuss possible future extensions for the library.

3.1 SpartanGold Overview

SpartanGold [4] is a simplified blockchain written in JavaScript designed for experimentation and education; as such, it is an ideal tool for our experiments. Like Bitcoin, it uses the hashcash protocol [5] for proof-of-work consensus. However, there are some notable differences from Bitcoin:

  • SpartanGold uses an account-based model, similar to Ethereum, rather than Bitcoin’s unspent transaction output (UTXO) model.

  • The proof-of-work target for mining a block in SpartanGold does not adjust over time.

  • SpartanGold does not support smart contracts.

  • Transactions in SpartanGold are stored directly in the block, rather than in a Merkle tree [20].

  • A block in SpartanGold does not have a strict size limit.

Although SpartanGold does not support smart contracts, new functionality can be introduced by extending the Miner or Block classes. The homepage for SpartanGold (footnote 2) includes several sample implementations of different blockchain protocols.

The most notable limitation of SpartanGold for our purposes is that it does not use a Merkle tree. In Bitcoin, all transactions are contained in a single hash value (the Merkle root) that is hashed during the mining process. In contrast, all transactions in SpartanGold are stored in a map; the full contents of the map are directly hashed during the mining process. As a result, if there are a large number of transactions, or even a single large transaction, the size of the block might exceed the size of the hash function’s input; when this happens, the mining power required to find a block is multiplied by the number of hashing rounds required to hash the block once.

In our performance experiments, we avoided this problem by leaving the blocks at a constant size. In a real implementation, we could resolve the issue by extending SpartanGold’s Block class to store a Merkle root of transactions rather than the transactions themselves.
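As a rough illustration of the Merkle-root alternative, the sketch below reduces a list of transactions to a single fixed-size hash. It is a simplification under stated assumptions: it uses single SHA-256 over hex strings and JSON serialization, and the merkleRoot function is hypothetical rather than part of SpartanGold.

```javascript
const crypto = require('crypto');

function sha256(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Compute a Merkle root over serialized transactions.
// An unpaired node at any level is paired with itself.
function merkleRoot(transactions) {
  if (transactions.length === 0) return sha256('');
  let level = transactions.map((tx) => sha256(JSON.stringify(tx)));
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      const right = i + 1 < level.length ? level[i + 1] : level[i];
      next.push(sha256(level[i] + right));
    }
    level = next;
  }
  return level[0];
}
```

With this approach the miner hashes only the 64-character root on each proof attempt, so the cost per attempt no longer grows with the number or size of the log messages in the block.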

3.2 Logging Framework Codebase

We adapt SpartanGold to add the functionality needed for a logging framework. In our design, every log message writes a transaction to the blockchain. Every transaction therefore must include both a logging level and the log message itself. Since the time of the log message might not correspond to the time of the block, every transaction must also include a timestamp.

Figure 1 shows the Logger class. The log levels specified as constants at the top of the file (lines 8–9) are loosely based on the log levels for Apache’s Log4J logging framework (footnote 3). Convenience methods at the end of the file (lines 52–70) write log messages at the corresponding levels. One exception is the BLOCK_TIME level; our logging framework uses this level to track the time that the miner began searching for a proof for the block. Every block should have exactly one transaction with this log level.

Fig. 1. Logger class

The constructor (lines 13–16) initializes the blockchain, including setting the proof-of-work target (powLeadingZeroes). It calls the initializeBlockchain method (lines 35–46), which creates a new miner (line 33), makes the genesis block (lines 35–42; footnote 4), and then triggers the miner to begin mining new blocks (line 45). (The FakeNet class (lines 32 and 43) uses events to simulate network traffic between clients and miners. It is not particularly useful in this case, but is required boilerplate code.)

Once the logger is initialized, the startServer method (lines 18–29) will start listening for incoming TCP/IP connections on the specified port. When a message is received (line 21), the message is converted to a string (line 22) and the logging level and the message are extracted from the string (lines 23–24). This information is then used to invoke the log method of the logger class.

Most of SpartanGold’s standard classes work well for our implementation. However, we extend SpartanGold’s Miner class to make LoggingMiner. Two changes are worthy of closer attention.

The startNewSearch method is called whenever a search for a new block proof begins. Its responsibility is to create the block of transactions and initialize the proof field so that the proof-of-work search can begin. We extend this method to add a special transaction with the timestamp of the block. The constant BLOCK_TIME_LEVEL is set to 5, matching the BLOCK_TIME constant defined in the logging class. (As we shall see in Sect. 4.2, this method also gives us a good hook to introduce a delay in block production, simulating an attack on the log.)

The code for the startNewSearch method is shown below:

figure a
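The behavior described above can be approximated by the following minimal sketch. It is not the framework's code: BaseMiner is a hypothetical stand-in for SpartanGold's Miner class, and the block and transaction layouts, as well as the message text, are assumptions made for illustration.

```javascript
const BLOCK_TIME_LEVEL = 5; // matches the BLOCK_TIME level described in the text

// Hypothetical stand-in for the parent miner class (not SpartanGold's actual code).
class BaseMiner {
  startNewSearch() {
    // Create an empty block and reset the proof so the search can begin.
    this.currentBlock = { transactions: [], proof: 0 };
  }
}

class LoggingMinerSketch extends BaseMiner {
  startNewSearch() {
    super.startNewSearch();
    const now = Date.now();
    // Add the special transaction recording when the proof search began.
    this.currentBlock.transactions.push({
      data: { level: BLOCK_TIME_LEVEL, message: 'Block search started', timestamp: now },
    });
    this.currentBlock.timestamp = now;
  }
}
```

Because the transaction is added each time a search begins, every block in this sketch carries exactly one BLOCK_TIME entry.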

The postLoggingTransaction method sends a transaction to the miner for inclusion in the blockchain. It calls postGenericTransaction from its parent class, which handles many of the routine details of posting the transaction. Since we do not care about coins for these transactions, the outputs field specifying who gets paid is empty, and the fee of coins to pay the miner for including this transaction is set to 0.

The data field of a SpartanGold transaction is deliberately unspecified to allow for greater ease in expanding the code. In our case, we include the level of the log, the message, and the timestamp.

The method is shown below:

figure b
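A minimal sketch of a method with this shape follows. BaseClient is a hypothetical stand-in that merely records what is posted; the real postGenericTransaction in SpartanGold performs additional work, and the exact field layout used here is an assumption.

```javascript
// Hypothetical stand-in for the parent class; it only records posted transactions.
class BaseClient {
  constructor() {
    this.posted = [];
  }

  postGenericTransaction(txData) {
    this.posted.push(txData);
    return txData;
  }
}

class LoggingClientSketch extends BaseClient {
  postLoggingTransaction(level, message) {
    return this.postGenericTransaction({
      outputs: [], // nobody is paid by a logging transaction
      fee: 0,      // no coins are offered to the miner
      data: { level, message, timestamp: Date.now() },
    });
  }
}
```

For example, `new LoggingClientSketch().postLoggingTransaction(3, 'disk usage at 91%')` would yield a transaction with empty outputs, a zero fee, and the level, message, and timestamp in its data field.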

3.3 Extensions

Our design uses a blockchain locally for storing messages. Given the blockchain’s utility in decentralized systems, incorporating multiple servers is a natural extension.

Instead of sending transactions to a single miner, it would be straightforward to send the transactions to a network of machines, allowing them to come to consensus through the usual mining process. This approach might be useful in a company with a large network of machines.

Alternatively, a smaller company might write its logs to an external blockchain. This approach might raise concerns if the logs contain any confidential data. Additionally, the cost of blockchain storage might be prohibitive.

Instead, if the logger periodically writes the hash of the latest block to the external blockchain, then we can detect any tampering of the local log file. However, it would not be possible to recreate the original logging data with this approach.
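The hash-anchoring idea can be sketched as follows. The `anchors` array stands in for the external blockchain, and the function names and block layout are hypothetical; a real deployment would replace the array with transactions on an actual network.

```javascript
const crypto = require('crypto');

function blockHash(block) {
  return crypto.createHash('sha256').update(JSON.stringify(block)).digest('hex');
}

// Record the hash of the latest local block; `anchors` stands in for the external chain.
function anchorLatest(chain, anchors) {
  const height = chain.length - 1;
  anchors.push({ height, hash: blockHash(chain[height]) });
}

// Return the heights of anchored blocks whose local contents no longer match.
function findTamperedAnchors(chain, anchors) {
  return anchors
    .filter(({ height, hash }) => height >= chain.length || blockHash(chain[height]) !== hash)
    .map(({ height }) => height);
}
```

Only a fixed-size hash leaves the machine, so the log contents stay private and the external storage cost per anchor is constant. Since each block includes its predecessor's hash, a consistent rewrite of any earlier block changes every later block and therefore every later anchor.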

4 Experimental Results

To validate our design, we have generated a series of blockchains to serve as our dataset of untampered blockchains (that is, blockchains whose production of blocks has been continuous). In Sect. 4.2 we then show how an attempt to change a block is likely to be detected, especially if a substantial portion of blocks must be rewritten to maintain the internal consistency of the blockchain data structure.

A more subtle attacker might try to adjust the times of blocks to hide the change. In Sect. 4.3 we show how this attack may still be detected unless the attacker is able to keep the change in block times very minimal; we note that keeping the block production timestamps within this level slows down how quickly the new blockchain could be forged, and thus gives administrators more time to recognize the discrepancy. All code and data samples are available at https://github.com/taustin/hardenedLogger.

4.1 Untampered Blockchain Dataset

To generate a sample untampered blockchain, we ran SpartanGold with one message per block until 1000 blocks were created. We repeated this process 30 times to create our dataset of untampered blockchains. The dataset was generated on a MacBook Pro with a 10-core Apple M1 Pro chip and 16 GB of memory, running macOS 12.4. We used SpartanGold v1.0.7. The proof-of-work target was fixed at 19 leading zeroes (in binary).

Each block was printed in JavaScript Object Notation (JSON). A sample block is shown below, modified for readability:

figure c

The first hash is the block ID. Similar to Bitcoin, the IDs are generated in the search for a valid proof-of-work; as a result, these IDs always begin with several leading zeroes.

The timestamp field identifies when the search for a proof-of-work for the block began. Contrasting this value with the timestamp of the next block indicates the total time that was required to find a valid proof.
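The per-block times and their summary statistics can be derived from the timestamps with a short sketch like the one below. It assumes each block object exposes a numeric `timestamp` in milliseconds and uses the population standard deviation; the paper does not state which variant it reports.

```javascript
// Time spent on each block: the gap between its timestamp and the next block's.
function blockDurations(blocks) {
  const durations = [];
  for (let i = 0; i + 1 < blocks.length; i++) {
    durations.push(blocks[i + 1].timestamp - blocks[i].timestamp);
  }
  return durations;
}

function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Population standard deviation (an assumption, see the note above).
function stdDev(values) {
  const m = mean(values);
  return Math.sqrt(mean(values.map((v) => (v - m) ** 2)));
}
```

Note that the final block in a chain has no successor, so a chain of n blocks yields n - 1 durations.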

In SpartanGold, the proof-of-work is discovered by initializing the proof field to 0 and then incrementing that field until the hash value meets the proof-of-work target. Since we search through the space of proofs sequentially, there should be a positive correlation between the proof and the duration needed to find the block. Of course, an attacker would not have to follow this rule and could search the space of possible proofs in any order that they desired.
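The sequential search can be sketched as follows. This is an illustration of the general hashcash-style loop, not SpartanGold's implementation: the serialization, the use of SHA-256, and the helper names are assumptions.

```javascript
const crypto = require('crypto');

// Count the leading zero bits of a hex-encoded hash.
function leadingZeroBits(hex) {
  let bits = 0;
  for (const ch of hex) {
    const nibble = parseInt(ch, 16);
    if (nibble === 0) {
      bits += 4;
      continue;
    }
    bits += Math.clz32(nibble) - 28; // zero bits at the top of this 4-bit digit
    break;
  }
  return bits;
}

// Start at proof 0 and increment until the hash meets the target.
function findProof(blockData, powLeadingZeroes) {
  let proof = 0;
  for (;;) {
    const hash = crypto.createHash('sha256')
      .update(JSON.stringify({ ...blockData, proof }))
      .digest('hex');
    if (leadingZeroBits(hash) >= powLeadingZeroes) return { proof, hash };
    proof++;
  }
}
```

Because the loop visits proofs in increasing order, the returned proof equals the number of failed attempts, which is why a larger proof should generally correspond to a longer search.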

Table 1. Block production time

Table 1 shows the average time to produce a block and the standard deviation for all of the sample blockchains in our untampered dataset. Results are reported with a precision of 5 digits.

4.2 Simple Attack

For our first experiment, we simulate an unsophisticated attack where the intruder attempts to rewrite a portion of the blockchain and continue the logs from that branch of the blockchain.

To simulate this attack, we introduced a pause at a randomly selected block before the block production was allowed to continue. As with our benign dataset, we produced 1000 blocks for each blockchain sample. For each of these blockchains, a block in the range of 25–975 was selected randomly for the delay. Other applications and system processes were allowed to run, simulating a realistic environment.

When a LoggingMiner is constructed, we specify the field compromisedBlockNumber to indicate which block should be delayed; the duration of the delay is specified in the compromiseDuration field. At the beginning of the startNewSearch method, we introduce a check to see if the search should be delayed; if so, setTimeout is called to reinvoke the method after the delay. The compromisedBlockNumber is then deleted to allow the new call to continue as normal, and we return from the method to prevent the search from beginning early. After this check, the method runs as normal. The code is shown below:

figure d
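The delay check can be approximated with the sketch below. The class is a self-contained stand-in rather than the actual LoggingMiner: the `nextBlockNumber` and `searchesStarted` fields are hypothetical, and the real method would go on to build the block and begin mining.

```javascript
class DelayedMinerSketch {
  constructor({ compromisedBlockNumber, compromiseDuration } = {}) {
    this.compromisedBlockNumber = compromisedBlockNumber;
    this.compromiseDuration = compromiseDuration; // delay in milliseconds
    this.nextBlockNumber = 1;  // hypothetical counter for the block being mined
    this.searchesStarted = 0;  // hypothetical counter, used only for illustration
  }

  startNewSearch() {
    if (this.compromisedBlockNumber === this.nextBlockNumber) {
      // Re-invoke this method after the delay, then clear the marker so the
      // delayed call proceeds normally, and return without starting the search.
      setTimeout(() => this.startNewSearch(), this.compromiseDuration);
      delete this.compromisedBlockNumber;
      return;
    }
    this.searchesStarted++; // the normal search would begin here
  }
}
```

Deleting the marker before returning is what keeps the rescheduled call from delaying itself again.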

When the delay was set to 5 min, the compromised block took the greatest amount of time in two cases, and was among the 3 slowest blocks to be produced in all cases. The results for a five-minute delay are summarized below.

File                             Time to mine block   Order (out of 1000)
blockchain-rewritten1-05.json    301414 ms            3rd
blockchain-rewritten2-05.json    302700 ms            1st
blockchain-rewritten3-05.json    301815 ms            1st
blockchain-rewritten4-05.json    303038 ms            3rd
blockchain-rewritten5-05.json    305717 ms            2nd

We can improve the results by making the logging process a higher priority. When we use the Unix nice command with a priority of -10 (footnote 5), the compromised block is the slowest to be produced in all but one of our test cases, as shown below.

File                                Time to mine block   Order (out of 1000)
blockchain-rewritten1-05-HP.json    300339 ms            1st
blockchain-rewritten2-05-HP.json    306931 ms            1st
blockchain-rewritten3-05-HP.json    302706 ms            1st
blockchain-rewritten4-05-HP.json    302156 ms            2nd
blockchain-rewritten5-05-HP.json    305554 ms            1st

We note that 5 min is also relatively recent activity. The deeper the change is in the blockchain, the more likely it is that this attack would be detected. These analyses do not attempt to account for the proof values (nonces), which should be positively correlated with the time needed to find the proof-of-work. They also do not account for other activity on the system. A more careful analysis could consider these factors.
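The ordering reported for each compromised block can be computed from the per-block mining times with a sketch such as the following. The function names are hypothetical, and `durations` is assumed to be an array of mining times in milliseconds indexed by block.

```javascript
// 1-based rank of block `index` when blocks are ordered from slowest to fastest.
function slownessRank(durations, index) {
  const target = durations[index];
  return 1 + durations.filter((d) => d > target).length;
}

// Indices of the k slowest blocks, slowest first.
function slowestBlocks(durations, k) {
  return durations
    .map((d, i) => [d, i])
    .sort((a, b) => b[0] - a[0])
    .slice(0, k)
    .map(([, i]) => i);
}
```

An administrator could, for instance, inspect the few slowest blocks of a chain and treat any that fall far outside the usual production time as candidates for a rewrite point.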

4.3 Subtle Attack

When the attacker attempts to recreate the blockchain by interrupting the creation of a block for a fixed amount of time, the machine learning model is immediately able to recognize the discrepancy between the block ID and the amount of time required to compute it. The detection accuracy in this case is close to 100%. The attacker therefore needs a more subtle approach.

Instead of adding a fixed pause before computing new blocks, we introduce a pause that lasts a different amount of time for each block. To simulate this, we set the timestamp of every recreated block to the timestamp of the previous block plus a random value selected from a specified range. The new timestamp can never be less than the timestamp of the previous block: the two differ by a number of milliseconds between 1 and the upper bound of the range. For example, the attacker pauses the process for a random number of milliseconds taken from the range 1 to 60000 (one minute). In this way, the timestamp differences processed by the model, that is, the gaps between each block’s timestamp and the previous one, are not fixed but vary randomly, imitating a more realistic scenario.
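The timestamp manipulation can be sketched as follows. The forgeTimestamps function and its parameters are hypothetical, and the uniform distribution over integer milliseconds is an assumption, since the text specifies only that the value is random within the range.

```javascript
// Rewrite timestamps from `startIndex` onward: each becomes the previous block's
// timestamp plus a random whole number of milliseconds in [1, upperBoundMs].
function forgeTimestamps(blocks, startIndex, upperBoundMs, random = Math.random) {
  const forged = blocks.map((b) => ({ ...b })); // leave the input untouched
  for (let i = Math.max(startIndex, 1); i < forged.length; i++) {
    const delta = 1 + Math.floor(random() * upperBoundMs);
    forged[i].timestamp = forged[i - 1].timestamp + delta;
  }
  return forged;
}
```

Calling `forgeTimestamps(blocks, 500, 60000)` would correspond to the one-minute example: every block from index 500 onward appears to have taken between 1 and 60000 ms.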

Fig. 2. Accuracy of the machine learning model for different values of the upper bound.

In Fig. 2, we see the results of this experiment as we vary the upper bound of the range from which the number of milliseconds is selected (from 1 to 180 s). When the range is close to the legitimate average time between block timestamps, the detection of a compromised block becomes unreliable. This region is highlighted in red and spans 2 to 8 s.

5 Discussion and Future Work

In this paper, we have shown how a logger with a local blockchain can be used to detect log tampering. We have shown that a simple attack that only restarts the blockchain log from a given block is easily detected, and a more subtle attacker can only succeed by moving slowly, and thus opening themselves to a longer window of detection while they rewrite the blockchain.

While we have focused on a single miner, we could easily expand the system to write to multiple mining processes, or even to broadcast to external blockchain networks, which would strengthen the defenses of the log. An interesting future direction of this research is to use external blockchains in this way while keeping the approach cost-effective.

We are also interested in further understanding the types of attacks that an attacker could perform, and in studying detection techniques capable of identifying them.

Increased accuracy in detection could be achieved by applying different machine learning techniques such as Profile Hidden Markov Models and Ensemble Learning. The model could be trained on sequences of blocks’ information to increase the ability to detect tampering. Furthermore, the hash ID can be combined with additional block information to find the best combination of training inputs to achieve both higher accuracy and higher efficiency.