1 Introduction

Since Nakamoto first proposed Bitcoin as a decentralized cryptocurrency [41], several early works have focused on analyzing and/or predicting its behavior under different attack scenarios [5, 22, 23, 32, 43, 47, 48]. However, a core question remained unanswered for quite a while:

What security goal does Bitcoin achieve under what assumptions?

An intuitive answer to this question was already given in Nakamoto’s original white paper [41]: Bitcoin aims to achieve some form of consensus on a set of valid transactions. The core difference of this consensus mechanism with traditional consensus [37,38,39, 45] is that it does not rely on having a known (permissioned) set of participants, but everyone can join and leave at any point in time. This is often referred to as the permissionless model. Consensus in this model is achieved by shifting from the traditional assumptions on the fraction of cheating versus honest participants, to assumptions on the collective computing power of the cheating participants compared to the total computing power of the parties that support the consensus mechanism. The core idea is that in order for a party’s action to affect the system’s behavior, it needs to prove that it is investing sufficient computing resources. In Bitcoin, these resources are measured by means of solutions to a presumably computation-intensive problem.

Although the above idea is implicit in [41], a formal description of Bitcoin’s goal had not been proposed or known to be achieved (and under what assumptions) until the first, seminal works of Garay, Kiayias, and Leonardos [24] and Pass, Seeman, and shelat [44], which mainly influenced this work. In a nutshell, these works set forth models of computation and, in these models, an abstraction of Bitcoin as a distributed protocol and proved that the output of this protocol satisfies certain security properties, for example the common prefix [24] or consistency [44] property. This property confirms—under the assumption that not too much of the total computing power of the system is invested in breaking it, where the exact threshold has been the study of recent works [20, 27]—a heuristic argument used by the Bitcoin specification: if some block makes it deep enough into the blockchain of an honest party, then it will eventually make it into the blockchain of every honest party and will never be reversed.Footnote 1 In addition to the common prefix property, other quality properties of the output of the abstracted blockchain protocol were also defined and proved.

1.1 Bitcoin as a Service for Cryptographic Protocols

Evidently, the main use of the Bitcoin protocol is as a decentralized monetary system with a payment mechanism, which is what it was designed for. And although the exact economic forces that guide its sustainability are still being researched, and certain rational models predict it is not a stable solution, it is a fact that Bitcoin has not met any of these pessimistic predictions for several years and it is not clear it ever will do. And even if it does, the research community has produced and is testing several alternative decentralized cryptocurrencies, e.g., [6, 12, 19, 30, 31, 40, 42, 46], that are more functional or based on different resource assumptions than Bitcoin, some of which base their analysis on earlier versions of this article [11].

This leads to the natural questions of how one can use this new reality to improve the security and/or efficiency of cryptographic protocols. First answers to this question were given in [1,2,3, 10, 28, 29, 33, 35] where it was shown how Bitcoin can be used as a punishment mechanism to incentivize honest behavior in higher-level cryptographic protocols such as fair lotteries, poker, and general multi-party computation.

But in order to formally define and prove the security of the above constructions in a widely accepted cryptographic framework for multi-party protocols, one needs to define what it means for these protocols to be run in a world that gives them access to the Bitcoin network as a resource to improve their security. In other words, the question now becomes:

What functionality can Bitcoin provide to cryptographic protocols?

To address this question, Bentov and Kumaresan [10] introduced a model of computation in which protocols can use a punishment mechanism to incentivize adversaries to adhere to their protocol instructions. As a basis, they use the universal composition framework of Canetti [13], but the proposed modifications do not support composition and it is not clear how standard UC cryptographic protocols can be cast as protocols in that model.

In a different direction, Kiayias, Zhou, and Zikas [36] connected the above question with the original question of Bitcoin’s security goal. More concretely, they proposed identifying the resource that Bitcoin (or other decentralized cryptocurrencies) offers to cryptographic protocols as its security goal, and expressing it in a standard language compatible with the existing literature on cryptographic multi-party protocols. More specifically, they modeled the ideal guarantees as a transaction-ledger functionality in the (global) universal composition framework.

In a nutshell, the ledger proposed by [36] corresponds to a trusted third party which keeps a state of blocks of transactions and makes it available, upon request, to any party. Furthermore, it accepts messages/transactions from any party and records them as long as they pass an appropriate validation procedure that depends on the above publicly available state as well as other registered messages. Periodically, this ledger puts the transactions that were recently registered into a block and adds them into the state. The state is available to everyone. As proved in [36], giving multi-party protocols access to such a transaction-ledger functionality allows for formally capturing the mechanism of leveraging security loss with coins. The proposed ledger functionality guarantees in an ideal manner all properties that one could expect from Bitcoin and encompasses the properties in [24, 44]. Therefore, it is natural to postulate that it is a candidate for defining the security goal of Bitcoin (and potentially other decentralized cryptocurrencies). However, the ledger functionality proposed by [36] was not accompanied by a security proof that any of the known cryptocurrencies implements it.

However, as we show, despite being a step in the right direction, the ledger proposed in [36] cannot be realized under standard assumptions about the Bitcoin network. On the positive side, we specify a new transaction ledger functionality which still guarantees all properties postulated in [24, 44], and prove that a reasonable abstraction of the Bitcoin protocol implements this ledger. In our construction, we describe Bitcoin as a UC protocol which generalizes both the protocols proposed in [24, 44]. We leave it as an interesting open problem to integrate more recent analyses [7, 17, 25, 26] in our UC model, where the main changes are expected in formulating the setup assumptions and restrictions along the lines we show in Sect. 8.1 for the initial models. Still, the main goal remains to UC-realize our ledger functionality.

1.2 Our Contributions

We put forth the first universally composable (simulation-based) proof of security of Bitcoin. We design a general ledger functionality whose parameters we subsequently concretely instantiate for the Bitcoin setting. We observe that the first attempts in defining such a functionality, notably the ledger proposed by Kiayias et al. [36], are too strong to be implemented by our UC abstraction of Bitcoin, the main reason being that the functionality allows too little interference of the simulator with its state, making it impossible to emulate adversarial attacks that result, e.g., in the adversary inserting only transactions coming from parties it wants or that result in parties holding chains of different lengths. We detail this in Sect. 4.1. Therefore, we propose an alternative ledger functionality in Sect. 4.2 which shares certain design properties with the proposal in [36] but which can be provably implemented by a UC abstraction of the Bitcoin protocol, where our protocol abstraction makes use of hybrid (idealized) functionalities such as the bounded-delay network, the clock to model (lock-step) synchrony, and the random oracle to idealize hash queries.

Our ledger is parametrized by a set of parameters, for example by a generic transaction validation predicate, which enables it to capture decentralized blockchain protocols beyond Bitcoin. Our functionality allows for parties/miners to join and leave the computation and we support adaptive corruptions.

We formally prove for which choice of parameters the proposed ledger functionality is implemented by Bitcoin under the assumption that miners which deviate from the Bitcoin protocol do not control a majority of the total hashing power at any point. The description of concrete parameters is given in Sect. 6, and the UC realization proof appears in Sect. 7. To this end, we first describe in detail an abstraction of the Bitcoin protocol as a UC protocol in Sect. 5. Casting Bitcoin in UC allows to precisely model the protocol assumptions, for example the knowledge of the network delay and the number of hash-function calls per round. We model Bitcoin to work over a network which basically consists of bounded-delay channels. We explain how such a network could be implemented by running the message-diffusion mechanism of the Bitcoin network (which is run over a lower level network of unicast channels). Intuitively, this network is built by every miner, upon joining the system, choosing some existing miners of its choice to use them as relay nodes. Similar to the protocol in [44], the miners are not aware of (an upper bound on) the actual delay that the network induces. As we argue, this is a strictly weaker model assumption than assuming that the network delay is publicly known such as in [24]. We devote Sect. 3 to modeling the UC execution with the appropriate setups.

Our security proof proposes a useful modularization of the Bitcoin protocol. Concretely, we first identify the part of the Bitcoin code which intuitively corresponds to the lottery aspect, provide an ideal UC functionality that reflects this lottery aspect, and prove that this part of the Bitcoin code realizes the proposed functionality. We then analyze the remainder of the protocol in the simpler world where the respective code that implements the lottery aspect is replaced by invocations of the corresponding functionality. Using the UC composition theorem, we can then immediately combine the two parts into a proof of the full protocol. Finally, in Sect. 8, we show how one can cast the theorem’s assumptions as part of the setup functionalities of the protocol. We thus obtain a desirable corollary where we do not have to restrict the environment regarding the distribution of hashing power (but where the restriction is enforced by the setup functionalities), which improves the way this protocol can be formally composed with other protocols.

As is the case with the so-called backbone protocol from [24], our above UC protocol description of Bitcoin relies only on proofs of work and not on digital signatures. As a result, it implements a somewhat weaker ledger, which does not guarantee that transactions submitted by honest parties will eventually make it into the blockchain.Footnote 2 As a last result, we show in Sect. 9 that (similarly to [24]) by incorporating public-key cryptography, i.e., taking signatures into account in the validation predicate, we can implement a stronger ledger that ensures that transactions issued by honest users—i.e., users who do not sign contradicting transactions and who keep their signing keys for themselves—are guaranteed to be eventually included into the blockchain. The fact that our protocol is described in UC makes this a straight-forward, modular construction using the proposed transaction ledger as a hybrid. In particular, we do not need to consider the specifics of the Bitcoin protocol in the proof of this step. This also allows us to identify the maximum (worst-case) delay a user needs to wait before being guaranteed to see its transaction on the blockchain and be assured that it will not be inverted.

Future directions. The presented analysis in UC corresponds to the first analysis of a blockchain protocol and requires a couple of novel modeling concepts to accurately model the execution of such decentralized protocols whose security are based on rate-limiting resources. For Bitcoin in particular, a few interesting extensions to this work are conceivable which all relate to the topic of bringing the model closer to reality. First, it is an interesting open question to what extent the reliance on a local random oracle can be relaxed while still achieving composition. A second line of research would be to model a more realistic network functionality, taking into account limited message omissions or bandwidth constraints. Finally, the full Bitcoin protocol includes adjusting the difficulty of the PoW puzzles per epoch based on the observed historic performance. It appears as a very interesting theoretical question whether the Bitcoin protocol (or possibly a variant of it) can in fact be understood as the modular composition of a component estimating participation and a ledger component as analyzed in this work.

1.3 Overview of Bitcoin and Related Work

High-level introduction. At a high level, the Bitcoin protocol works as follows: The parties (also referred to as miners) collect and circulate messages (transactions) from users of the network, check that they satisfy some commonly agreed validity property, put the valid transactions into a block, and then try to find appropriate metadata such that the hash of the block-contents and this metadata is of a specific form—concretely that, parsed as a binary string, it has a sufficient number of leading zeros. This is often referred to as a solving a mining puzzle and the intuition behind it is that the best strategy for finding such metadata is supposedly by trial and error. Thus, informally, the probability that some party finds appropriate metadata increases proportional to the number of times some party attempts a hash computation. And the more leading zeros we require from a correct puzzle solution the harder it is to find one, since the solution space of the puzzle is smaller.

Intuitively, a successful solution can be seen as a proof-of-work (POW) that testifies to the fact that the miner presenting has in fact tried a large number of hash queries. Once a miner finds such a solution, he puts it into a block and sends it to the other miners. The miners who receive it check that it satisfies some validity property (see below) and if so create new metadata using the hash of this (newly minted) block and put this metadata together with transactions that are still valid into a new block and start working on solving the puzzle induced by this block. Since a block is rendered valid by a miner only if it includes a hash-pointer to a previous valid block in the view of this miner, the view consists of a set of linked lists, namely a sequence of valid blocks each with a hash-pointer to its predecessor in the list. Each such list is called a blockchain or simply chain. All lists have a common starting point which is the so-called genesis block of Bitcoin. Hence, the entire view of a miner could be modeled as a tree, where the root is the genesis block, the nodes are valid blocks, and the hash-pointers correspond to (directed) edges.

The works of Garay, Kiayias, and Leonardos [24] and that of Pass, Seeman, and shelat [44] contain the first formal specifications and security proofs of the Bitcoin protocol. The proved security in these works is property-based. They prove that conditioned on the largest part of the network following the Bitcoin protocol (in fact an abstraction and generalization thereof), the output of this so-called backbone protocol satisfies three properties with overwhelming probability. We only informally describe these properties here. We will meet their formalization when analyzing the Bitcoin protocol in UC. In the following, let \(t_1 \le t_2\) be two points in time during the protocol execution.

  • Common prefix: Any two valid chains \(\mathcal {C}_{t_1}\), \(\mathcal {C}_{t_2}\) adopted by some honest parties at times \(t_1\) and \(t_2\), respectively, share a large common prefix. This is typically quantified by specifying a value k (the common-prefix parameter) and the size of the common prefix is required to be at least \(|\mathcal {C}_{t_1}|-k\).

  • Chain growth: For time-intervals \([t_1,t_2]\) of reasonable extent, the increase in number of blocks—measured as the difference between any two valid chains \(\mathcal {C}_{t_1}\) and \(\mathcal {C}_{t_2}\) adopted by some honest parties at times \(t_1\) and \(t_2\), respectively—is guaranteed to be substantial. The relationship between time and chain-length is typically referred to as the chain-growth coefficient.

  • Chain quality: For any honest party and its adopted valid chain \(\mathcal {C}_t\) at time t, it holds that any consecutive sequence of blocks of reasonable extent in \(\mathcal {C}_t\) is guaranteed to contain blocks contributed by honest parties. The proportion of honestly mined blocks is typically refereed to as the chain-quality coefficient.

Chain quality and chain growth are often expressed with respect to the common-prefix parameter k. That is, as the fraction of honestly mined blocks in a consecutive sequence of k blocks, and as the time interval within which an increase of k blocks is guaranteed (except with negligible probability in k).

Network assumptions and random oracle. Both [24] and [44] assume a multicast network—i.e., a network where a party sends messages to arbitrary other partiesFootnote 3—and abstract the hash function as a random oracle. Furthermore, they both have an explicit round-based model of execution where parties proceed in rounds. There are some slight differences between the two models. For example, in [24] every party makes q hash-queries (i.e., q RO calls) in each round as opposed to [44] where every party makes one hash-query per round. Second, in [44], the adversary might choose to delay message delivery but the statements are proved assuming no message is delayed by more than \(\Delta \) rounds—also known as the partial-synchronous setting—while the initial model taken by [24] was more synchronous (and was lifted to the partial synchronous model later). We note that since the number of hash-queries is fixed in both models, this implies that parties know exactly in which round they are, as they could simply count the number of queries made to the random oracle (and by definition of their models no party goes to round \(r+1\) before all parties have finished round r). Note that the partial-synchronous protocol execution model in [44] is a strictly weaker setting than a synchronous execution model with a fixed delay of one round.

Property-based vs simulation-based security. Proving that Bitcoin satisfies the above properties has been an essential step into the direction of understanding the security goals of Bitcoin. But as argued above, this does not offer the tool to be able to argue security of cryptographic protocols that use Bitcoin—e.g., to achieve an improved fairness notion [1,2,3, 10, 28, 29, 33, 35]—without the need to always look at the Bitcoin specifics. In other words, such property-based security definitions do not support composition. The standard way to allow for such a generic use of blockchain protocols as a cryptographic resource is to prove that it implements an ideal functionality in a composable framework. Intuitively, in such frameworks, a composition theorem states that we can replace calls to a functionality with invocation of a protocol implementing it without worrying about the protocol’s internals.

2 Preliminaries

2.1 Overview of the UC Framework

We use the universal composability (UC) framework introduced by Canetti [13, 14]. We give a brief introduction into the main notation of this framework.

2.1.1 Basics

The goal of the UC framework is to capture what it means for a protocol to securely carry out a task. UC first defines the process of executing a protocol in some environment and in the presence of an adversary, next it defines an ideal process to formalize what securely carrying out the task means, and finally one has to prove that no (efficient) environment can distinguish the real process and the ideal process. The core defining element of the ideal process is the ideal functionality, which can be thought of as an incorruptible party. We briefly describe the main ingredients first and then describe the real and ideal process.

Protocol and protocol instances. Formally, a protocol \(\pi \) is an algorithm for a distributed system and formalized as an interactive Turing machine. An ITM has several tapes, for example an identity tape (read-only), an activation tape, or input/output tapes to pass values to its program and return values back to the caller (e.g., the environment). An ITM also has communication tapes that model messages sent to and received from the network.

While an ITM is a static object, UC defines the notion of an ITM instance (denoted ITI), which is defined by the so-called extended identity (eid) of the form (Mid), where M is the description of an ITM and \(id = (\textsf{sid}, \textsf{pid})\) is an identity string consisting of a session identifier \(\textsf{sid}\) and a party identifier \(\textsf{pid}\). Each instance is associated with a configuration, which is as usual the contents of all of its tapes and the heads, and the control state of that ITM.

An instance, also called a session, of a protocol \(\pi \) (represented as an ITM \(M_\pi \)) with respect to a session id \(\textsf{sid}\) is defined as a set of ITIs \((M_\pi , id_i)\) with \(id_i = (\textsf{pid}_i, \textsf{sid})\).

Network and adversary. The UC model does not give any guarantee for its built-in network. The network is asynchronous without guaranteed delivery or authenticity of the originator. The messages are routed and controlled by the adversary unless a stronger network is available (such as the one we define in this work). The adversary \(\mathcal {A}\) is also defined as an ITM. Aside of its capabilities to send and read messages, it can at any time issue special corruption messages to corrupt protocol ITMs. When an ITM is corrupted, the adversary does not only learn the contents of all tapes, but it can also act in the name of this ITM, meaning that whenever this ITM is activated, the adversary gets actually activated and can decide on the next steps. This corruption dynamics is the standard form of corruption and we call such an adversary active and adaptive.

2.1.2 Real-World Process

The real-world process for a protocol \(\pi \) is defined as follows. Let \(\mathcal {Z}\) be an environment machine and let \(\mathcal {A}\) denote the adversary. The execution consists of a sequence of activations, initiated by \(\mathcal {Z}\), where in each activation, either \(\mathcal {Z}\), \(\mathcal {A}\) or some ITI running \(\pi \) is activated. We say that \(\mathcal {Z}\) invokes a new ITI \(\mathcal {Z}\) if it activates an ITI for the first time (by passing some inputs) upon which this new instance gets created (in the default configuration). All ITIs invoked by \(\mathcal {Z}\) need to have unique extended identities and need to have the same session-identifier (which is chosen by \(\mathcal {Z}\)).

Activations and execution rules. An activated ITI can change its configuration based on its code. By the UC system model (i.e., by the definition of external-write requests), an ITI loses its activation when (1) passing an input value to a (subsidiary) ITI (like a hybrid functionality), or (2) producing an output, i.e., writing to its subroutine output tape, or (3) providing output to the adversary (e.g., by an ideal functionality). The next activated ITI is the ITI that was addressed in the external-write request, or the environment if no external-write request is made.

The environment \(\mathcal {Z}\) can pass inputs to and read outputs from the input/output tape of any party, respectively. The environment can thereby emulate all outside processes and how they interact with the (challenge) protocol session. In these inputs, the environment thereby also specifies a source (extended) identity of the input (to which supposedly some output will be returned). We call such identities external. It is convenient to parametrize an environment with a predicate \(\xi \) that restricts the set of allowed external identities to use. One natural standard predicate to enforce is the one that disallows \(\mathcal {Z}\) to use as an external identity an extended identity of any ITI that it provides input to in the system. Other choices of predicates may be helpful in various scenarios. Clearly, the more relaxed the predicate \(\xi \), the more general the security statement. More restrictive predicates in turn lead to more restrictions on the contexts in which the protocols proved secure with respect to those predicates can be deployed.

The adversary \(\mathcal {A}\) can access the so-called backdoor tapes of the ITIs and in the plain network model, thereby deliver messages. Following the external-write rules, if in some activation, the adversary delivers a message to an ITI, then this ITI is activated next. In addition, the adversary can corrupt parties as described above. The environment learns the party id \(\textsf{pid}\) of any corrupted ITI via a special corruption-aggregation mechanism.

The UC model also follows some activation rules (specified by the control function). As already stated, the environment is activated first, and upon completion of its actions (entering a special waiting state), the adversary is activated as a second entity. The remaining execution proceeds as described above. As a convention, in addition to the above rules, the UC execution model requires that if an ITI completes without external-write request (for example not sending a message), then the environment is activated next.

Output and transcript. The output of the protocol execution is the output of \(\mathcal {Z}\), and we assume that this output is a binary value \(v \in \{0,1\}\). We denote this output by \(\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}(k, z, r)\) where k is the security parameter, \(z \in \{0,1\}^*\) is the input to the environment, and randomness r for the entire experiment. Let \(\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}(k, z)\) denote the random variable obtained by choosing the randomness r uniformly at random and evaluating \(\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}(k, z, r)\). Let \(\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}\) denote the ensemble \(\{\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}(k, z)\}_{k \in \mathbb {N}, z\in \{0,1\}^*}\). By slight abuse of notation, we denote by \(T_{\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}}(k,z,r)\) the associated transcript of this execution, which is the concatenation of all inputs to \(\mathcal {Z}\), all outputs from \(\mathcal {Z}\), and all messages exchanged via the communication tapes of the ITIs (also called the joint view). The distribution \(T_{\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}}(k,z)\) and ensemble \(T_{\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}}\) are defined analogously to above.

2.1.3 Ideal-World Process

Security of protocols is defined via comparing the real-world execution with an ideal-world process that solves the task in an idealistic way. More formally, the ideal process is formulated with respect to an ITM \(\mathcal {F}\) which is called an ideal functionality. In the ideal process, the environment \(\mathcal {Z}\) interacts with \(\mathcal {F}\), an ideal-world adversary (often called the simulator) \(\mathcal {S}\) and a set of trivial, i.e., dummy ITMs representing the protocol machines. The dummy ITMs behave as follows: whenever activated with a request x, they forward the request x to \(\mathcal {F}\) and output toward \(\mathcal {Z}\) whatever they receive in return. \(\mathcal {F}\) thereby specify all outputs generated for each party, and the amount of information the ideal-world adversary learn and what its active influence is via its interaction with \(\mathcal {F}\). By definition of the corruption mechanism in UC, corruption of parties happens via special corruption messages on the backdoor tape of the ideal functionality (and the party ids \(\textsf{pid}\) of all corrupted (dummy) parties can be learned by the environment). We note that an ideal functionality itself, represented as an ITI during the protocol execution, cannot be corrupted by definition.

Based on the above definitions, the ideal-world process proceeds as the real process. It is essentially the real-world process where the ITIs running the protocol are replaced by the dummy ITIs interacting with \(\mathcal {F}\) (and only one challenge session ever exists). In this interaction, the same constraints and activation sequence restrictions are enforced by the UC control function. For further details, we refer to [14].

We denote the output of this ideal-world process by \(\textsc {exec}_{\mathcal {F},\mathcal {A},\mathcal {Z}}(k, z, r)\) where the inputs are as in the real-world process. Let \(\textsc {exec}_{\mathcal {F},\mathcal {S},\mathcal {Z}}(k, z)\) denote the random variable obtained by choosing the randomness r uniformly at random and evaluating \(\textsc {exec}_{\mathcal {F},\mathcal {S},\mathcal {Z}}(k, z, r)\). Let \(\textsc {exec}_{\mathcal {F},\mathcal {S},\mathcal {Z}}\) denote the ensemble \(\{\textsc {exec}_{\mathcal {F},\mathcal {S},\mathcal {Z}}(k, z)\}_{k \in \mathbb {N}, z\in \{0,1\}^*}\). The transcript is defined analogously as in the real-world process and denoted \(T_{\textsc {exec}_{\mathcal {F},\mathcal {S},\mathcal {Z}}}(k,z,r)\).

2.1.4 Hybrid Worlds

To model setup, the UC framework knows so-called hybrid worlds. We discuss two important cases of hybrid worlds that differ in whether the setup, typically called the hybrid functionality, is available only to an instance of a protocol session (standard), or to multiple protocol sessions at the same time (shared). Note that a protocol can assume several setup functionalities of both types.

Standard (local) setup. A standard setup is modeled in UC as an ideal-functionality available in a real-world protocol execution, i.e., as an incorruptible ITI \(\mathcal {F}\) that provides certain ideal guarantees to this protocol session. We consider here the natural case that standard setups are available in real-world processes only (note that while the following conventions can be applied to ideal-world-processes as well, it still seems like an uninteresting case to consider standard setups in ideal-processes). So, formally, the \(\mathcal {F}\)-hybrid-world process is identical to the real-world process with the following additions: The parties can interact with (an a priori unbounded number of) instances of \(\mathcal {F}\) by standard interaction (sending messages, passing output to them, or receiving input from them). Each copy of \(\mathcal {F}\), i.e., each such incorruptible ITI, is identified via a unique session identifier \(\textsf{sid}\) chosen by the protocol that passes in put to it. (This in particular implies a unique identity id of this ITI.) It is stressed that by this definition, the environment can only access \(\mathcal {F}\) via calls to parties or via the adversary.

Since a protocol makes explicit which local functionalities it assumes we omit an explicit reference in the formal expressions for simplicity. For example, we just write \(\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}\) or \(T_{\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}}\) to denote the output or the transcript distribution ensembles in such cases.

Shared (global) setup. Sometimes we want to model that a certain hybrid functionality, say \(\mathcal {G}\), to be declared as shared (often also denoted to as global setup). This breaks the isolated-session idea (subroutine respecting property) of standard UC and allows sessions to share state, or more generally a functionality, with other sessions, resp. with the environment. While the exact dynamics of such a global model is beyond the scope of this introduction, the basic idea how to model such global setup in UC is relatively easy and does not need a separate model such as the one originally proposed in [16]. Instead, we follow the modeling in [4]: a generic mapping (or operator akin to the UC composition operator) is defined that takes a protocol \(\pi \) and the to-be-treated-as-global functionality \(\mathcal {G}\) (more precisely, the ideal protocol associated with \(\mathcal {G}\) as described above) and transforms the protocol (oblivious to the protocol and the functionality) into a standard UC protocol \(\textsf{M}[\pi ,\mathcal {G}]\). This protocol has the property that it exposes \(\mathcal {G}\) to the environment, while the behavior of \(\pi \) in its interaction with other ITIs and the environment remains unchanged. We recall that the identity-bound predicate \(\xi \) is a handy tool to model which external identities must be assumed to have access to the shared functionality when proving the challenge protocol secure. Finally, since global setups are always present in the experiments, the ideal process (that is, the ideal world) can be simply expressed as \(\textsf{M}[\mathcal {F},\mathcal {G}]\), i.e., where the ideal protocol for \(\mathcal {F}\) is considered instead of \(\pi \). We point out that global setups must satisfy a so-called regularity condition, which can be achieved trivially by having any party that wants to interact with \(\mathcal {G}\) to register first with the functionality. All our ideal functionalities are of this form. If a shared setup \(\mathcal {G}\) is available in the real-world or ideal-world processes, we usually make it explicit in the notation such as \(\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}^{\mathcal {G}}\) or \(\textsc {exec}_{\mathcal {F},\mathcal {S},\mathcal {Z}}^{\mathcal {G}}\), where we understand the normal UC execution with the above transformation.

2.1.5 Secure Realization and Composition

In a nutshell, a protocol securely realizes an ideal functionality \(\mathcal {F}\) if the real-world process (where the protocol is executed) is indistinguishable from the ideal-world process (relative to \(\mathcal {F}\)). If the protocol uses setup, we technically consider the hybrid-world processes instead of the plain real-world or ideal-world processes. We directly state the definitions.

Definition 2.1

Let us denote by \(\mathcal {X} = \{X(k,z)\}_{k\in \mathbb {N}, z \in \{0,1\}^*}\) and \(\mathcal {Y} = \{Y(k,z)\}_{k\in \mathbb {N}, z \in \{0,1\}^*}\) two distribution ensembles over \(\{0,1\}\). We say that \(\mathcal {X}\) and \(\mathcal {Y}\) are indistinguishable if for any \(c,d \in \mathbb {N}\) there exists a \(k_0 \in \mathbb {N}\) such that \(|\textrm{Pr}^{}\!\left[ X(k,z)=1\right] - \textrm{Pr}^{}\!\left[ Y(k,z)=1\right] | < k^{-c}\) for all \(k>k_0\) and all \(z~\in ~\bigcup _{\kappa \le k^d}\{0,1\}^\kappa \). We use the shorthand notation \(\mathcal {X} \approx \mathcal {Y}\) to denote two indistinguishable ensembles.

Definition 2.2

Let \(n \in \mathbb {N}\), let \(\mathcal {F}\) be an ideal functionality, and let \(\pi \) be a protocol defined for the real-world, and which potentially makes use of some local setup functionality \(\mathcal {H}\) and some global setup \(\mathcal {G}\). We say that \(\pi \) securely realizes \(\mathcal {F}\) (in the presence of these setup functionalities) if for any (efficient) adversary \(\mathcal {A}\) there exists an (efficient) ideal-world adversary (the simulator) \(\mathcal {S}\) such that for every (efficient) environment \(\mathcal {Z}\) it holds that \(\textsc {exec}_{\pi ,\mathcal {A},\mathcal {Z}}^\mathcal {G}\approx \textsc {exec}_{\mathcal {F},\mathcal {S},\mathcal {Z}}^\mathcal {G}\).

In the literature, the above condition is often referred to as \(\pi \) securely realizing functionality \(\mathcal {F}\) in the \((\mathcal {G}, \mathcal {H})\)-hybrid world, where the type of setup is inferred by the context.

Composition. The notion of secure realization is composable. We do not give a detailed explanation as it is not important to follow the results in this work. In a nutshell, assume first that a protocol \(\pi \) securely realizes \(\mathcal {F}\) in the \(\mathcal {H}\)-hybrid world, where \(\mathcal {H}\) denotes a standard (local) setup functionality. Let further \(\phi \) be a protocol that securely realizes \(\mathcal {F}\). Then, the protocol \(\pi '\), where each call to \(\mathcal {H}\) is replaced by an invocation of protocol \(\phi \), securely realizes \(\mathcal {F}\). We refer the interested reader to [14] for the general formal statement and on the exact definition of \(\pi '\). Along similar lines, a composition theorem can be proven (following from the standard UC composition theorem) where local hybrid functionalities are replaced by the protocols securely realizing them, all in the presence of shared setups [4]. Finally, we only note in passing that one can also consider replacing shared functionalities (i.e., the global setups) by suitable protocols. This, however, is a very subtle issue for which we refer the interested reader to [9].

2.2 Large Deviation Bounds

We use some known results to derive large deviation bounds in our probabilistic arguments. For proofs and further discussions we refer to [21].

Theorem 2.3

(Chernoff bound) Let \(X_1, \ldots , X_T\) be independent random variables with \({{\,\mathrm{\mathbb {E}}\,}}[X_i] = p_i\) and \(X_i \in [0,1]\). Let \(X = \sum _{i =1}^T X_i\) and \(\mu = \sum _{i =1}^T p_i = {{\,\mathrm{\mathbb {E}}\,}}[X]\). Then, for all \(\Lambda \ge 0\),

$$\begin{aligned}&\Pr [X \ge (1 + \Lambda )\mu ] \le e^{-\frac{\Lambda ^2}{2+\Lambda } \mu }\,;\\&\Pr [X \le (1 - \Lambda )\mu ] \le e^{-\frac{\Lambda ^2}{2+\Lambda } \mu }\,. \end{aligned}$$

Theorem 2.4

(Azuma’s inequality (Azuma; Hoeffding)) Let \(X_0, \ldots , X_n\) be a sequence of real-valued random variables so that, for all t, \(|X_{t+1} - X_t| \le c\) for some constant c. If \({{\,\mathrm{\mathbb {E}}\,}}[X_{t+1} \,|\, X_0, \ldots , X_{t}] \le X_t\) for all t, then for every \(\Lambda \ge 0\)

$$\begin{aligned} \Pr [X_n - X_0 \ge \Lambda ] \le \exp \left( -\frac{\Lambda ^2}{2nc^2}\right) . \end{aligned}$$

Alternatively, if \({{\,\mathrm{\mathbb {E}}\,}}[X_{t+1} \,|\, X_0, \ldots , X_{t}] \ge X_t\) for all t, then for every \(\Lambda \ge 0\)

$$\begin{aligned} \Pr [X_n - X_0 \le -\Lambda ] \le \exp \left( -\frac{\Lambda ^2}{2nc^2}\right) . \end{aligned}$$

3 UC Execution Model for Permissionless PoW Blockchains

In this section, we describe our UC model of execution for the Bitcoin protocol. We remark that providing such a formal model of execution forces us to make explicit all the implicit assumptions from previous works. As we lay down the theoretical framework, we will also discuss these assumptions along with their strengths and differences.

Bitcoin miners are formally represented as ITIs, which we refer to for notational convenience by \(P _i\), i.e, \(P _i = (\pi , id_i)\), where \(id_i=(\textsf{pid}_i,\textsf{sid})\) and where \(\pi \) will be the Bitcoin protocol (running in session \(\textsf{sid}\)). We refer to \(P _i\) as a party for short. The index i is used to distinguish two identifiers, i.e., \(P _i \ne P _j\), and otherwise carries no meaning. Parties interact which each other by exchanging messages over an unauthenticated multicast network with eventual delivery (see below) and might make queries to a common random oracle. We will assume a central adversary \(\mathcal {A}\) who gets to corrupt miners and might use them to attempt to break the protocol’s security. As is common in UC, the resources available to the parties are described as hybrid functionalities (some of which are treated as shared or global as we discuss later). Before we provide the formal specification of such functionalities, we first discuss a delicate issue that relates to the set of parties (ITIs) that might interact with an ideal functionality.

3.1 Functionalities with Dynamic Party Sets

In many UC functionalities, the set of parties is defined upon initiation of the functionality and is not subject to change throughout the lifecycle of the execution. Nonetheless, UC does provide support for a completely dynamic generation of ITIs and thus making the set of parties that might interact with the functionality dynamic. This feature is important when modeling the Bitcoin protocol—where miners come and go at will. In this work, we make this explicit by means of the following mechanism: all the functionalities considered here include the instructions below that allow parties to join or leave the set \(\mathcal {P}\) that the functionality interacts with, and inform the adversary about the current set of registered parties. Note that making the set of parties dynamic means that the adversary needs to be informed about which parties are currently in the computation so that it can chose how many (and which) parties to corrupt.

  • Upon receiving \((\textsc {register}, \text {sid})\) from some party \(P \)Footnote 4 (or from \(\mathcal {A}_{}\) on behalf of a corrupted \(P \)), the functionality sets \(\mathcal {P}= \mathcal {P}\cup \left\{ { P }\right\} \) and returns \((\textsc {register}, \text {sid}, P)\) to the caller.

  • Upon receiving \((\textsc {de-register}, \text {sid})\) from some party \(P \in \mathcal {P}\) (or from \(\mathcal {A}_{}\) on behalf of a corrupted \(P \in \mathcal {P}\)), the functionality sets \(\mathcal {P}:=\mathcal {P}{\setminus } \{P \}\) and returns \((\textsc {de-register}, \text {sid},P)\) to the caller.

  • Upon receiving \((\textsc {get-registered}, \text {sid})\) from \(\mathcal {A}_{}\), the functionality returns the response \((\textsc {get-registered}, \text {sid},\mathcal {P})\) to \(\mathcal {A}_{}\).

Finally, our functionalities will only interact with those parties that are effectively registered to it. This makes any functionality (hybrid or realized) in this work regular which is important for global UC.

3.2 Modeling Network Assumptions

In many situations, one cannot tolerate a complete asynchronous network such as the standard UC communication mechanism. For example, we want to argue about liveness properties of blockchains, which requires communication with eventual delivery guarantees as time goes by (see below how we model time). We describe such a network based on ideas from [8, 18, 34]. In particular, we capture such communication by a network functionality that provides each party or miner \(P _{s}\in \mathcal {P}\) the capability to multicast a message. For every newly sent message, say m, the network functionality creates a unique identifier \(\textsf {mid}\) for each triple \((P _j,P _j, m)\), where \(P _j \in \mathcal {P}\) is a potential receiver. This handle is needed to succinctly refer to a message circulating in the network in a fine-grained manner. The network does not provide any information to any receiver about who else is using it or where a message originates from. More precisely, messages are buffered but the information of who is the sender is never provided to a receiver.

The adversary—who is informed about both the content of the messages and about the handles—is allowed to delay messages by any finite amount, and allowed to deliver them in an arbitrary out-of-order manner. To ensure that the adversary cannot arbitrarily delay the delivery of messages submitted by honest parties, we use the following idea: The network works in a “fetch message” mode, which means that parties need to actively query for the message (for example, a party can query for messages once in a round). If the adversary wishes to delay the delivery of some message with message ID mid, he needs to submit an integer value \(T_\textsf {mid}\)—the delay for the message-in-transmission with identifier mid. For example, if \(\textsf {mid}\) refers to the triple \((P _s,P _j,m)\), this will have the effect that only after the next \(T_\textsf {mid}\) fetch attempts by \(P _j\), \(P _j\) will be able to report the receipt of this particular message m. Importantly, the network does not accept more than \(\Delta \) accumulative delay for any \(\textsf {mid}\). To allow the adversary freedom in scheduling the delivery of messages, we allow him to input delays more than once, which are added to the current delay amount. If the adversary wants to deliver the message in the next activation, all he needs to do is submit a negative delay. Furthermore, we allow the adversary to schedule more than one messages to be delivered in the same “fetch” command. Finally, to ensure that the adversary is able to re-order such batches of messages arbitrarily, we allow \(\mathcal {A}_{}\) to send special \((\textsc {swap},\textsf {mid},\textsf {mid}')\) commands that have as an effect to change the order of the corresponding messages. Last but not least, the adversary is further allowed to do partial and inconsistent multicasts, i.e., where different messages are sent to different parties. This is the main difference of such a multicast network from a broadcast network. The description appears in Fig. 1.

Fig. 1
figure 1

The network functionality with eventual delivery guarantees. Note that for a list \(\vec {M}\) we denote by the symbol || the operation which appends a new element to \(\vec {M}\)

3.2.1 Multicast from Unicast

The above multicast functionality is an ideal abstraction of a large network, where we idealize the network delay and network topology. While the network delay is an explicit parameter in the analysis (and could be estimated from real-world deployed networks), the topology does not appear in any way. Hence, a natural question is how to get the above multicast network from simpler channels. Note that in Bitcoin, parties/miners communicate over an incomplete network and a standard diffusion mechanism is employed: The sender sends the message it wishes to multicast to all its neighbors who check that a message with the same content was not received before, and if this is the case forward it to their neighbors, who then do the same check, and so on. In fact, a multicast network can be built from unicast channels. That is, one essentially assumes for each miner \(P _{R}\in \mathcal {P}\) a channel functionality —which is parameterized by a receiver \(P _R\) and an upper bound on the delay \(\Delta \)—to which any other party \(P _{i}\in \mathcal {P}\) can connect and input messages to be delivered to \(P _{R}\). A miner connecting to the unicast channel with receiver \(P _R\) models the real-world process of looking up \(P _R\) (e.g., a public node in the network) and using this party to disseminate future messages. The unicast channel should have some similar properties as the above network, namely:

  • They guarantee (reliable) delivery of messages within a delay parameter but are otherwise specified to be of asynchronous nature (see below) and hence no protocol can rely on timings regarding the delivery of messages. The adversary might delay any message sent through such a channel, but at most by \(\Delta \). In particular, the adversary cannot block messages. However, he can induce an arbitrary order on the messages sent to some party.

  • The receiver gets no information other than the messages themselves. In particular, a receiver cannot link a message to its sender nor can he observe whether or not two messages were sent from the same sender.

  • The channel offers no privacy guarantees. The adversary is given read access to all messages sent on the network.

In Appendix A, we provide this channel functionality for completeness and explain how a simple round-based diffusion mechanism can be used to implement a multicast mechanism from unicast channels as long as the corresponding network among honest parties stays strongly connected—where a network graph is called strongly connected if there is a directed path between any two nodes in the network where the unicast channels are the directed edges from senders to receivers.

3.3 Modeling Time and Clock-Dependent Protocol Execution

Katz et al. [34] proposed a methodology for casting synchronous protocols in UC by assuming they have access to an ideal functionality , the clock, that allows parties to ensure that they proceed in synchronized rounds. Informally, the idea is that the clock keeps track of a round variable whose value the parties can request by sending it \((\textsc {clock-read}, \text {sid}_C)\). This value is updated only once all honest parties sent the clock a \((\textsc {clock-update}, \text {sid}_C)\) command. We use a variant of their clock as a global setup in this work. The description is given in Fig. 2, where we also make explicit the behavior of the clock-update upon corruption.

Fig. 2
figure 2

The clock functionality that we use as a global setup

Given a clock, the authors of [34] describe how synchronous protocols can maintain their necessary round structure in UC: For every round \(\rho \) each party first executes all its round-\(\rho \) instructions and then sends the clock a \(\textsc {clock-update}\) command. Subsequently, whenever activated, it sends the clock a \(\textsc {clock-read} \) command and does not advance to round \(\rho +1\) before it sees the clocks variable being updated. This ensures that no honest party will start round \(\rho +1\) before every honest party has completed round \(\rho \).

Idealized progression of time. We know from [34] that if we want to capture the ideal guarantee of eventual-delivery, or more generally speaking, idealized progression of time, an ideal functionality needs to keep track of the number of activations that an honest party gets—so that it knows when to enforce progress in the time-domain. As a general principle, the functionality would then have to issue a clock-update command in the name of a party, once that party is done with its round actions (the overall clock ticks for a session once all honest parties are done with their round actions). We now define a notion in Definition 3.1 that simplifies this bookkeeping, that is, instead of having the functionality manage time-progression per party, it can do it on a more coarse-grained, session level and only contact the clock once per round.

To follow the definition recall the mechanics of activations in UC. In a protocol execution, an ITI gets activated either by receiving an input from the environment, subroutine output from one of its hybrid-functionalities, or an input on the backdoor tape from the adversary. Any activation results in the activated ITI performing some computation on its view of the protocol and its local state and ends with either the party providing output to some of its hybrid functionalities, to the environment, or to the adversary. In either of these cases (formally dubbed external-write requests), the ITI loses the activation.Footnote 5

For any given protocol execution, we define the honest-input sequence \(\vec {\mathcal {I}}_H \) to consist of all inputs given to a main ITI by the environment and the corruption messages by the adversary (listed in the order that they were given). The inputs are annotated with the identity of the ITI that received the input. For an execution with an environment and adversary, where the honest parties in session \(\textsf{sid}\) have received m inputs in total, \(\vec {\mathcal {I}}_H \) is a vector of the form \(((x_1,id_1),\ldots ,(x_m,id_m))\), where \(x_i\) is the ith input that was given to machine \(id_i\), or a corruption message with target \(id_i\). We further define the extended timed honest-input sequence, denoted as \(\vec {\mathcal {I}}^T_H \), to be the honest-input sequence augmented with the respective clock time when an input was given. If the timed honest-input sequence of an execution is \(\vec {\mathcal {I}}^T_H =((x_1,id_1,\tau _1),\ldots ,(x_m,id_m,\tau _m))\), this means that for each \(i\in [n]\), \(\tau _i\) is the time of the global clock when input \(x_i\) was handed to \(id_i\).

Definition 3.1

A -hybrid protocol \(\Pi \) has a predictable synchronization pattern iff there exist an efficiently computable algorithm \(\textsf {predict-time} _\Pi (\cdot )\) such that for any possible execution of \(\Pi \) in a session \(\textsf{sid}\) (i.e., for any adversary and environment, and any choice of random coins) the following holds: If \(\vec {\mathcal {I}}^T_H =((x_1,id_1,\tau _1),\ldots ,(x_m,id_m,\tau _m))\) is the corresponding timed honest-input sequence for this session, then for any \(i\in [m-1]:\)

$$\textsf {predict-time} _\Pi ((x_1,id_1,\tau _1),\ldots ,(x_i,id_i,\tau _i))=\tau _{i+1}.$$

3.3.1 Using the Clock as a Global Setup/Shared Subroutine

Treating the clock as a global setup or shared subroutine has the benefit of allowing parties across protocols to have a common denomination of time, and to be able to specify observable time-dependent ideal properties. However, modeling a setup as global also comes with complications [4]: to complete the specification in UC to what extent the usage to coordinate on time across protocols is sound, we have to define the (identity bound) predicate that specifies the applicable context. Recall that this predicate is intended to restrict the set of extended identities that the environment can claim when contacting protocols.Footnote 6 We define the following identity bound \(\xi _{sync}\) first suggested in [4]: The environment is not able to issue any request to the clock which has as a source ID the ID of a party (i.e., ITI) that already exists in the system, or to spawn any ITI for which it already claimed an external identity before in an interaction with the clock. Furthermore, the environment is not allowed to access the corruption information of the session \(\text {sid}_C\) from the clock directly for the sake of a well-defined PID-wise corruption model [4].

This identity-bound thus ensures that the clock can be used to model lock-step progression of protocols, and composition is guaranteed in any context that does not trivially break the lock-step execution style. Finally, in order to keep the standard PID-wise corruption model simple, the environment is only allowed to access the actual corruption set through the main session’s corruption aggregation machine, not through the shared subroutine’s session [4].

3.4 Modeling Hash Queries

As usual in cryptographic proofs, the queries to the hash function are modeled by assuming access to a random oracle (functionality) \(\mathcal {F}_{\textsc {RO}} ^\kappa \). This functionality is specified as follows: upon receiving a query \((\textsc {eval}, \text {sid}, x)\) from a registered party, if x has not been queried before, a value y is chosen uniformly at random from \(\{0,1\}^\kappa \) (for security parameter \(\kappa \)) and returned to the party (and the mapping (xy) is internally stored). If x has been queried before, the corresponding y is returned. The description appears in Fig. 3.

Fig. 3
figure 3

The random oracle functionality. The functionality makes no changes to the standard corruption mode

3.5 Which Setup Functionalities as Shared Subroutines?

The choice which setup functionalities should be treated as local and which one as shared is an important one and is steered by several considerations. First, from the point of view of an outsider party, the question is how much of the inner workings of a session are indeed relevant to observe. Second, modeling a resource as shared comes with complications and must justify the insights. For example, if the network was modeled as a shared network, then the protocol’s effect on the network must be replicated in the ideal world (it does not just exist in the simulator’s head). This complication might be justified when one studies the problem of protocols competing for bandwidth. Third, to enable simulation, the simulator needs some edge over the real-world adversary which often comes from the fact that the inner workings of a session are not publicly verifiable and thus the simulator can be in charge of creating for example a local CRS or program a random oracle.

In our work, we have three setups: the network, the clock, and the RO, where only the clock is a shared setup. The network is a local resource to simplify the proofs. Making it global would only make the ideal world more complex without providing more insights. Finally, abstracting hash-queries as calls to a global random oracle (GRO) runs into intrinsic problems in the PoW-setting because of two reasons. First, the model needs a reasonably elegant way to achieve some closure on the amount of work invested into a PoW blockchain, and thus it seems natural to say all work invested in Bitcoin are the RO queries made in that session. Second, at the more technical level, a global random oracle would force the simulator to create blocks that indeed carry sufficient work due to the lack of programmability. Since the simulator needs to also simulate the hash queries of honest parties, this would only be feasible if it had a much larger query budget than the real-world adversary (and not having to do this amount of work is the simulator’s edge since it can program the RO). Clearly, one could consider a global random oracle that supports programmability [15], but such a more complicated model does not appear to offer more insights than a local RO, as it basically is a shared subroutine that offers a per-session (and hence a local) advantage to the simulator. Additionally, the programmability feature comes at the price of a technical condition via which an adversary can always make a protocol abort, which is unrealistic.

3.6 Assumptions as UC-Functionality Wrappers

In order to prove statements about cryptographic protocols, one often makes assumptions about what the environment (or the adversary) can or cannot do. For example, a standard assumption in [24, 44] is that in each round the adversary cannot do more calls to the random oracle than what the honest parties (collectively) can do. This can be captured by assuming a restricted environment and adversary which balances the amount of times that the adversary queries the random oracle. In a property-based treatment such as [24, 44], this assumption is typically acceptable. Also in a composable model such restrictions can be formulated. However, restricting the environment is not compliant with a general composition theorem.

Therefore, instead of restricting the class of environments/adversaries, we present an alternative approach to capture the fact that the adversary’s access to real-world resource is restricted. The general methodology is to capture restrictions by means of a functionality wrapper that wraps the hybrid resources and enforces the restrictions on the adversary by limiting its access to the resource. Such restrictions can become quite complex and we show concrete examples in Sect. 8 to cast the assumptions and derive the equivalent composable statements.

A toy example. To illustrate the general methodology, consider the example of limiting the rate of RO queries of an adversary over time. We can capture this assumption by means of a functionality wrapper in a hybrid world that wraps the RO functionality and enforces a bound on the adversary, for example by assigning to each corrupted party at most \(q \) activations per clock-tick for some parameter \(q \). For completeness the wrapped random-oracle functionality \(\mathcal {W}^{q}(\mathcal {F}_{\textsc {RO}} ^\kappa )\) is given in Fig. 4.

Fig. 4
figure 4

The wrapped random oracle

4 The Basic Transaction-Ledger Functionality

The purpose of this section is to describe the basic structure of a ledger functionality . The presented functionality is very generic in the sense that it is parameterizable by several elements. The idea is that concrete blockchain protocols yield concrete instances of these parameters, while the basic structure, as presented here, remains the same and can be seen as the greatest common divisor of any such blockchain protocol proposal.

4.1 Introduction and Overview

Our ledger is parametrized by certain algorithms/predicates that allow us to capture a more general version of a ledger which can be instantiated by various cryptocurrencies. Since our abstraction of the Bitcoin protocol is in the synchronous model of computation (this is consistent with known approaches in the cryptographic literature), our ledger is also designed for this synchronous model. Nonetheless, several of our modeling choices are made with the foresight of removing or limiting the use of the clock and leaving room for less synchrony.

At a high level, our ledger has a similar structure as the ledger proposed in [36]. Concretely, anyone (whether an honest miner or the adversary) might submit a transaction which is validated by means of a predicate \(\textsf {Validate}\), and if it is found valid it is added to a buffer \(\texttt {buffer}\). The adversary \(\mathcal {A}_{}\) is informed that the transaction was received and is given its contents.Footnote 7 Informally, this buffer also contains transactions that, although validated, are not yet deep enough in the blockchain to be considered out-of-reach for an adversary.Footnote 8 Periodically, fetches some of the transactions in the buffer, and using an algorithm \(\textsf {Blockify}\) creates a block including these transactions and adds this block to its permanent state \(\texttt {state}\), which is a data structure that includes the part of the blockchain the adversary can no longer change. This corresponds to the common prefix in [24, 44]. Any miner or the adversary is allowed to request a read of the contents of the state.

This sketched specification is simple, but in order to have a ledger that can be implemented by existing blockchain protocols, we need to relax this functionality by giving the adversary more power to interfere with it and influence its behavior. Before sketching the necessary relaxations we discuss the need for a new ledger definition and its potential use as a global setup.

Impossibility to realize the ledger of [36]. The main reasons why the ledger functionality in [36] is not realizable by known protocols under reasonable assumptions are as follows: first, their ledger guarantees that parties always obtain the same common state. Even with strong synchrony assumptions, this is not realizable since an adversary, who just mined a new block, is not forced to inform each party instantaneously (or at all) and thus could, for example, make parties observe different lengths of the same prefix. Second, the adversarial influence is restricted to permuting the buffer. This is too optimistic, as in reality the adversary can try to mine a new block and possibly exclude certain transactions. Also, this excludes any possibility to quantify quality. Third, letting the update rate be fixed does not adequately reflect the probabilistic nature of Nakamoto-style blockchain protocols.

On the sound usage of a ledger as a global setup.

As already pointed out in [11], one has to be extra careful when replacing a global setup by its implementation, e.g., in the case of  by the UC Bitcoin protocol. Indeed, such a replacement does not, in general, preserve a realization proof of some ideal functionality \(\mathcal {F}\) that is conducted in ledger-hybrid world where the ledger is treated as a shared subroutine, because the simulator in that proof might rely on specific capabilities that are not available any more after replacement (as the global setup is also replaced in the ideal world). A recent follow-up work by Badertscher et al. [9] explores the facets of this question and gives conditions when a replacement is sound.

As this work focuses on the realization of ledger functionalities per se, this complication is not relevant to this work. We know from [9] that the distinction on whether a functionality is “global/shared” or “local” is decision of how the functionality is being used by a protocol (which stands in sharp contrast to prior global UC models that assign it a new type, which can be problematic [4]). Therefore, the functionality  is a standard UC functionality in our realization proofs.

4.2 The General Ledger Functionality

We present here the formal description of the ledger functionality. An overview of its parameters and state variables is given in Table 1 and a in-depth explanation follows in the next section.

figure d

4.3 On the Defining Features

We explain several of the features of the ledger functionality and give an overview of the relevant parameters and state variables in Table 1.

Table 1 Overview of main ledger elements including its parameters and state variables

4.3.1 State-Buffer Validation

The first relaxation is with respect to the invariant that is enforced by the validation predicate \(\textsf {Validate}\). Concretely, in [36] it is assumed that the validation predicate enforces that the buffer does not include conflicting transactions, i.e., upon receipt of a transaction, \(\textsf {Validate}\) checks that it is not in conflict with the state and the buffer, and if so the transaction is added to the buffer. However, in reality we do not know how to implement such a strong filter, as different miners might be working on different, potentially conflicting sets of transactions.Footnote 9 The only time when it becomes clear which of these conflicting transactions will make it into the state is once one of them has been inserted into a block which has made it deep enough into the blockchain (i.e., has become part of \(\texttt {state}\)). Hence, given that the buffer includes all transactions that might end up in the state, it might at some point include both conflicting transactions.

To enable us for a provably implementable ledger, in this work we take a different approach. The validate predicate will be less restrictive as to which transactions make it into the buffer. Concretely, at the very least, \(\textsf {Validate}\) will enforce the invariant that no single transaction in the buffer contradicts the state \(\texttt {state}\), while different transactions in \(\texttt {buffer}\) might contradict each other. Looking ahead, a stronger version that is achievable by employing digital signatures (presented in Sect. 9) could enforce that no submitted transaction contradicts other submitted transactions. As in [36], whenever a new transaction \(\texttt {tx} \) is submitted to , it is passed to \(\textsf {Validate}\) which takes as input a transaction and the current state and decides if \(\texttt {tx} \) should be added to the buffer. Additionally, as \(\texttt {buffer}\) might include conflicts, whenever a new block is added to the state, the buffer (i.e., every single transaction in \(\texttt {buffer}\)) is re-validated using \(\textsf {Validate}\) and invalid transactions in \(\texttt {buffer}\) are removed. To allow for this re-validation to be generic, transactions that are added to the buffer are accompanied by certain metadata, i.e., the identity of the submitter, a unique transaction ID txid,Footnote 10 or the time \(\tau \) when \(\texttt {tx} \) was received.

4.3.2 State Update Policy and Security Guarantees

The second relaxation is with respect to the rate and the form and/or origin of transactions that make it into a block. Concretely, instead of assuming that the state is extended in fixed time intervals, we allow the adversary to define when this update occurs. This is done by allowing the adversary, at any point, to propose what we refer to as the next-block candidate \(\texttt {NxtBC}\). This is a data structure containing the contents of the next block that \(\mathcal {A}_{}\) wants to have inserted into the state. Leaving \(\texttt {NxtBC}\) empty can be interpreted as the adversary signaling that it does not want the state to be updated in the current clock tick.

Of course allowing the adversary to always decide what makes it into the state \(\texttt {state}\), or if anything ever does, yields a very weak ledger. Intuitively, this would be a ledger that only guarantees the common prefix property [24] but no liveness or chain quality. Therefore, to enable us to capture also stronger properties of blockchain protocols we parameterize the ledger by an algorithm \(\textsf {ExtendPolicy}\) that, informally, enforces a state-update policy restricting the freedom of the adversary to choose the next block and implementing an appropriate compliance-enforcing mechanism in case the adversary does not follow the policy. This enforcing mechanism simply returns a default policy-complying block using the current contents of the buffer. We point out that a good simulator for realizing the ledger will avoid triggering this compliance-enforcing mechanism, as this could result in an uncontrolled update of the state which would yield a potential distinguishing advantage. In other words, a good simulator, i.e., ideal-world adversary, always complies with the policy.

In a nutshell, \(\textsf {ExtendPolicy}\) is a possibly stateful algorithm that takes the ledger state, the current contents of the buffer \(\texttt {buffer}\), along with the adversary’s recommendation \(\texttt {NxtBC}\). The output of \(\textsf {ExtendPolicy}\) is a vector including the blocks to be appended to the state (where again, \(\textsf {ExtendPolicy}\) outputting an empty vector is a signal to not extend), together with its update internal state. To ensure that \(\textsf {ExtendPolicy}\) can also enforce properties that depend on who inserted how many (or which) blocks into the state—e.g., the so-called chain quality property from [24]—we also pass to it the timed honest-input sequence \(\vec {\mathcal {I}}^T_H \) (cf. Sect. 3).

Some examples of how \(\textsf {ExtendPolicy}\) allows us to define ways that the protocol might restrict the adversary’s interference in the state-update include the following properties from [24]:

  • Liveness corresponds to \(\textsf {ExtendPolicy}\) enforcing the following policy: If the state has not been extended for more that a certain number of rounds and the simulator keeps recommending an empty \(\texttt {NxtBC}\), \(\textsf {ExtendPolicy}\) can choose some of the transactions in the buffer (e.g., those that have been in the buffer for a long time) and add them to the next block. Note that a good simulator or ideal-world adversary will never allow for this automatic update to happen and will make sure that he keeps the state extend rate within the right amount.

  • Chain quality corresponds to \(\textsf {ExtendPolicy}\) enforcing the following policy: Every block proposal made by the simulator is examined as to whether it is maximally filled with valid transactions. Such blocks must appear frequently. If this is not the case, the ledger will define and add a default block to the state. We point out that unlike the original chain-quality property from [24], this policy does not enforce which miner should receive the reward for honest blocks and it is up to the simulator to do so (via the so-called coinbase transaction).Footnote 11

We note that \(\textsf {ExtendPolicy}\) is a general concept capable of formulating various properties of blockchain protocols. For example, we can capture that honest (and non-conflicting) transactions eventually make it into the state. Another property could be to formalize that transactions with higher rewards make it into a block faster than others (which we do not consider in this work).

In Sect. 6, we provide one possible specification of \(\textsf {ExtendPolicy}\) that can be guaranteed for the UC Bitcoin protocol.

4.3.3 Output Slackness and Sliding Window of State Blocks

The common prefix property guarantees that blocks that are sufficiently deep in the blockchain of an honest miner will eventually be included in the blockchain of every honest miner. Stated differently, if an honest miner receives as output from the ledger a state \(\texttt {state}\), every honest miner will eventually receive \(\texttt {state}\) as its output. However, in reality we cannot guarantee that at any given point in time all honest miners see exactly the same blockchain length; this is especially the case when network delays are incorporated into the model, but it is also true in the zero-delay model of [24]. Thus, it is unclear how \(\texttt {state}\) can be defined so that at any point all parties have the same view on it.

Therefore, to have a ledger implementable by standard assumptions we make the following relaxation: We interpret \(\texttt {state} \) as the view of the state of the miner with the longest blockchain. And we allow the adversary to define for every honest miner \(P _i\) a subchain \(\texttt {state} _i\) of \(\texttt {state}\) of length \(|\texttt {state} _i|=\texttt {pt} _i\) that corresponds to what \(P _i\) gets as a response when he reads the state of the ledger (formally, the adversary can fix a pointer \(\texttt {pt} _i\)). For convenience, we denote by \(\texttt {state} |_{\texttt {pt} _i}\) the subchain of \(\texttt {state} \) that finishes in the \(\texttt {pt} _i\)th block. Once again, to avoid over-relaxing the functionality to an unuseful setup, our ledger allows the adversary to only move the pointers forward and it forbids the adversary to define pointers for honest miners that are too far apart, i.e., more than \(\texttt {windowSize} \) state blocks. The parameter \(\texttt {windowSize} \in \mathbb {N}\) denotes a core parameter of the ledger. In particular, the parameter \(\texttt {windowSize} \) reflects the similarity of the blockchain to the dynamics of a so-called sliding window, where the window of size \(\texttt {windowSize} \) contains the possible views of honest miners onto \(\texttt {state} \) and where the head of the window advances with the head of the \(\texttt {state} \). In addition, it is convenient to express security properties of concrete blockchain protocols, including the properties discussed above, as assertions that hold within such a sliding window (for any point in time).

4.3.4 Synchrony Aspects and De-Synchronized Parties

In order to keep the ideal execution indistinguishable from the real execution, the progression of time must be the same. Since the protocol advances the clock as an effect of executing the protocol, the ledger needs to ensure this in the ideal world (note that we model that the protocol can make advancement without the adversary being in the loop to capture liveness). To simplify clock-progression management, recall Definition 3.1, where we introduce \(\textsf {predict-time} (\vec {\mathcal {I}}^T_H)\), to enable a modular view how the clock proceeds of an entire session. Thus, instead of managing each party individually, the ledger simply registers itself to the clock, records the timed honest-input sequence \(\vec {\mathcal {I}}^T_H\) of its session, and signals the clock when the session is ready to advance to the next round. Observe that the ledger can infer all protocol-relevant inputs to honest parties and thus keep track of the honest inputs sequence \(\vec {\mathcal {I}}^T_H\). As the other functions explained above, we make the function \(\textsf {predict-time} \) a parameter of the (general) ledger functionality that needs to be instantiated when realizing a specific ledger such as the Bitcoin ledger (which is the topic of Sect. 6).

A final observation is with respect to guarantees that the protocol (and therefore also the ledger) can give to recently registered honest parties. We introduce an additional party set, \(\mathcal {P}_{DS}\), which consist of honest parties for which we are not able to give the full guarantees yet because they are de-synchronized. The ledger parameter \(\texttt {Delay} \) describes the time (in number of clock ticks) it takes for a newly joining party, that joins later than at the onset of the execution, to become fully synchronized.

To provide more intuition why we need such a set, consider the following scenario: An honest party registers as miner in round r and waits to receive from honest parties the transactions to mine and the current longest blockchain. In Bitcoin, upon joining, the miner sends out a special request on the network—we denote this here as a special \(\textsc {new-miner} \)-message—and as soon as any party receives it, it responds with the set of transactions and longest blockchain it knows. Due to the network delay, it can take a full round-trip time before the longest chain arrives to the newcomer. However, because we do not make any assumption on honest parties knowing \(\Delta \) they start mining as soon as they see network traffic. But now the adversary, in the worst case, can make these parties mine on any block he wants and have them accept any valid chain he wants as the current state while they wait for the network’s response (by maximally delaying everything sent to these parties by other honest parties, and instead immediately deliver what he wants them to work on). However, after a constant number of rounds, this effect will be resolved and the parties will be synchronized with the longest chain.

5 Bitcoin as a UC Protocol

5.1 Basics

A blockchain \(\mathcal {C} = \textbf{B} _1, \dots , \textbf{B} _n\) is a (finite) sequence of blocks where a block \(\textbf{B} _i = \langle \texttt {s} _i,\texttt {st} _i,\texttt {n} _i \rangle \) is a triple consisting of the pointer \(\texttt {s} _i\) (identifying the predecessor block via its hash), the state block \(\texttt {st} _i\), and the nonce \(\texttt {n} _i\). The head of chain \(\mathcal {C} \) is the block \(\texttt {head} (\mathcal {C}):= \textbf{B} _n\) and the length \(\texttt {length} (\mathcal {C})\) of the chain is the number of blocks, i.e., \(\texttt {length} (\mathcal {C})=n\). The chain \(\mathcal {C} ^{\lceil k}\) is the (potentially empty) sequence of the first \(\texttt {length} (\mathcal {C})-k\) blocks of \(\mathcal {C} \). A special block is the genesis block \(\textbf{G} = \langle \bot ,\texttt {gen},\bot \rangle \) which contains the genesis state \(\texttt {gen}:=\varepsilon \) and, as we will see later, is required to be the first block in the sequence.

The state \(\vec {\texttt {st}}\) encoded in \(\mathcal {C} \) is defined as a sequence of the corresponding state blocks, i.e., \(\vec {\texttt {st}}:= \texttt {st} _1||\dots ||\texttt {st} _n\). In other words, one should think of the blockchain \(\mathcal {C}\) as an encoding of its underlying state \(\vec {\texttt {st}}\); such an encoding might, e.g., organize \(\mathcal {C}\) is an efficient searchable data structure as is the case in the Bitcoin protocol where a blockchain is a linked list implemented with hash-pointers. In the protocol, the blockchain is the data structure storing a sequence of entries, often referred to as transactions. Furthermore, as in [36], in order to capture a range of blockchains with syntactically different state encoding, we assume a generic algorithm to map a vector of transactions into a state block. Thus, each block \(\texttt {st} \in \vec {\texttt {st}}\) (except the genesis state) of the state encoded in the blockchain has the form where \(\vec {N}\) is a vector of elements that we simply call transactions, although our treatment is generic and does not fix the type of data the ledger is carrying.

5.1.1 Validity and Longest Valid Chains

The validity of a blockchain \(\mathcal {C} \) depends on two aspects: chain-level validity, also referred to as syntactic validity, and a state-level validity also referred to as semantic validity.

Syntactic validity. This is defined with respect to a difficulty parameter \(\texttt {D} \in [2^\kappa ]\), where \(\kappa \) is the security parameter, and a given hash function \(H:\left\{ { 0,1 }\right\} ^* \rightarrow \left\{ { 0,1 }\right\} ^{\kappa }\); it requires that, for each \(i>1\), the value \(\texttt {s} _i\) contained in \(\textbf{B} _i\) satisfies \(\texttt {s} _i = H[\textbf{B} _{i-1}]\) and that \(H[\textbf{B} _i] < \texttt {D} \) holds (for non-genesis blocks), where the output of the hash-function is understood as an integer. Note that for notational simplicity, we omit the hash function as an explicit superscript.

figure e

Semantic validity. This is defined on the state \(\vec {\texttt {st}}\) encoded in the blockchain \(\mathcal {C} \) and specifies whether this content is valid (which might depend on a particular application). We go with a generic semantic validity check of the blockchain defined by algorithm below. We assume a generic validation predicate for single transactions that we refer to by (and which is an algorithm that takes a state and the transaction that is being validated as inputs and outputs a bit). For the sake of generality, this validity predicate is completely generic and looking ahead, our main theorem holds for any choice of this predicate, whenever the ledger parameter \(\textsf {Validate} \) is chosen accordingly as we show in Sect. 6.

The pseudo-code of the algorithm which builds upon is provided below. In a nutshell, the algorithm checks that a given blockchain state can be built in an iterative manner, such that each contained transaction is considered valid according to upon insertion. It further ensures that the state starts with the genesis state and that state blocks contain a special coin-base transaction \(\texttt {tx} ^{\text {coin-base}}_{\text {minerID}}\) which assigns them to a miner.

figure f

Definition 5.1

A chain \(\mathcal {C} \) is valid if it satisfies syntactic and semantic validity, i.e., if, for the chain and its encoded state \(\vec {\texttt {st}}\), the following predicate evaluates to true:

Longest valid chain. In the Bitcoin protocol, the notion of the longest valid chain is very crucial. The reason is that the party defines the ledger state at a certain time as a prefix of the state encoded in the longest valid chain it knows at that time. We stick to the nomenclature of [24] and call the function .

figure g

5.1.2 Extending Chains and Proofs-of-Work

A core step in Bitcoin is to extend a given chain \(\mathcal {C} \) by a new block \(\textbf{B} \) (with certain state content) to yield a longer chain \(\mathcal {C} || \textbf{B} \). As presented in [24] this can be captured by an algorithm \(\textsf {extendchain} _{\texttt {D}}(\cdot )\) that takes a chain \(\mathcal {C} \), a state block \(\texttt {st} \) and the number of attempts \(q \) as inputs. It tries to find a proof-of-work which allows to extend the \(\mathcal {C} \) by a block which encodes \(\texttt {st} \).

figure h

5.2 Overview and Modeling Decisions

In Bitcoin, each party maintains a local blockchain which initially consists of the genesis block. The chains of honest parties might differ (but as we will prove, it will have a common prefix which will define the ledger state). New transactions are added in a ‘mining process’. First, a party collects valid transactions (according to ) and creates a new state block \(\texttt {st} \) using . Next, the party attempts to mine a new block by solving a puzzle (and hence finding a proof-of-work) which upon success could then be validly added to their local blockchain. After each mining attempt parties will multicast their current chain. A party will replace its local chain if it obtains or receives a longer valid chain. When queried to output the state of the ledger, a party reports a prefix of the state encoded in its longest valid chain—obtained by ignoring (or chopping-off) the most recent \(T \) blocks (a party outputs \(\varepsilon \) if the state has less than \(T \) blocks). This behavior will ensure that all honest parties output a consistent ledger state. \(T \) is a crucial parameter of the Bitcoin protocol and typically, the guarantees of the security statements depend on \(T \) (and in addition on the usual security parameter \(\kappa \)).

5.2.1 The Round Structure

As already mentioned in the introduction, we model Bitcoin as a lock-step (sometimes dubbed semi-) synchronous protocol: The protocol can proceed in rounds—enabled by having access to a global synchronization clock —but is not aware of the actual delay of the network. In each round, two logical tasks have to be executed: an updating or information-fetching step (where new messages from the network are processed) and a working or mining-step, where each party tries to extend its local chain.

To simplify the UC activation handling in the analysis, we divide each logical round into two sub-rounds (where each sub-round corresponds to a logical task; see below for more details). This means that each logical round correspond to two actual clock-ticks (also known as mini-rounds in the MPC literature). We say that a protocol is in round r if the current time of the clock is \(\tau \in \{2r, 2r+1\}\).

Having two clock-ticks per round is a standard way to model in synchronous UC that messages (e.g., a block) sent within a round are delivered at the beginning of the next round. In our case, each round is divided into two mini-rounds, where each mini-round corresponds to a clock tick. We treat the first mini-round as the updating mini-round (fetch messages from the network to obtain messages sent previous rounds) and the second mini-round as the working mini-round (solving the puzzle and multicasting solutions).

5.2.2 Handling Interrupts

A protocol command might consists of a sequence of operations. However, certain operations, such as sending a message to another party, result in the protocol machine losing the activation token. We briefly describe a standard way to formalize that a party that loses an activation in the middle of a multi-step command is able to resume and complete the command following the implicit proposal of [34]. Their mechanism can be made explicit by introducing an anchor a that stores a pointer to the current operation; the protocol associates each anchor with such a multiple command and an input I, so that when such an input is received it directly jumps to the stored anchor, executes the next operation(s) and updates (increases) the anchor before releasing the activation. We refer to such an execution as being I-interruptible.

As an example, consider a protocol that requires that upon receiving input I, the party should run a command that consists of m steps Step 1, Step \(2, \ldots , \) Step m, but some of these steps might result in the party losing its activation. Running this command in an I-interruptible manner means executing the following code: Upon receiving input I if \(a<m\) go to Step a and increase \(a=a+1\) before executing the first operation that releases the activation; otherwise go to Step 1 and set \(a=2\) before executing any operation that releases the activation.

5.3 The Formal Protocol Description

We can now formally define our blockchain protocol \(\textsf {Ledger-Protocol} _{q,\texttt {D},T}\) (we usually omit the parameters when clear from the context). The protocol allows an arbitrary number of parties/miners to communicate by means of a multicast network . Note that this means that the adversary can send different messages to different parties. New miners might dynamically join or leave the protocol by means of the registration/de-registration commands: when they join they register with all associated functionalities and when they leave they deregister. The pseudo-code of this UC blockchain protocol is given in the remainder of this section. For the general structure of our UC blockchain model, we refer to Fig. 5.

The Bitcoin ledger protocol assumes as hybrids a random oracle \(\mathcal {F}_{\textsc {RO}} ^\kappa \), a network for blockchains, a network for transactions, and clock . Note that the two networks are simply (named) instances of and can be realized from a single network using different message-IDs. The protocol is parametrized by \(q \), \(\texttt {D} \), \(T \) where \(q \) is the number of mining attempts per round, \(\texttt {D} \) is the difficulty of the proof-of-work, and \(T \) is the number of blocks chopped off to obtain the ledger state.

Fig. 5
figure 5

The main structure of the UC blockchain protocol

5.3.1 Registration, De-Registration and Initialization

Recall from Sects. 2.1.4 and 3 that we model explicit registrations to make all our functionalities, and in particular the ledger functionality, regular. And thus this has to be reflected in the real-world protocol. However, registration and de-registration can be seen as explicit commands to start the operation of a protocol machine, and to stop the operation of a protocol machine, and to have this feature explicitly exposed to the higher-level protocol.

The formal registration process in the protocol works as follows. If a party receives \((\textsc {register}, \text {sid})\) from the environment it registers to all hybrid functionalities. Once registration has succeeded the party returns activation to the environment. Upon the next activation to maintain the ledger (\(\textsc {maintain-ledger} \)), the party initializes its local variables, multicasts a special \(\textsc {new-party} \) message over the network, and executes the main maintenance sub-protocol (in an interruptible manner as further explained below). De-registering from the ledger (via a query \((\textsc {de-register}, \text {sid})\)) from the environment) works analogously, upon which the party erases all its state and becomes idle until its is freshly invoked with a \(\textsc {register} \)-query.

Recall that the notion of de-synchronized parties is strongly connected to these registrations: if an active honest party is not registered to all hybrids for long enough after joining the protocol execution at some time \(\tau > 0\), it is considered de-synchronized (and otherwise the party is synchronized). In particular, honest parties that register at the onset of the protocol execution are synchronized (until they get corrupted or de-registered).

5.3.2 Ledger-Specific Queries

Ledger-specific queries are the specific features that one wishes to implement. Our very basic ledger supports three operations (after registration):

Submitting a transaction. This one is very simple: when given a transaction a party multicasts the transaction.

Ledger maintenance. Ledger maintenance refers to activating the main mining procedure of Bitcoin and is given in Fig. 6. Since ledger maintenance consists of several complex steps that in particular lose activations, the execution proceeds in an interruptible manner as explained in Sect. 5.2.2. The main structure of maintenance enforces the mini-round structure: in a working mini-round, the protocol tries to obtain the solution to a proof-of-work puzzle for a newly generated state block. The core sub-protocol thereby is:

figure j

It then enters an idle mode for maintenance queries until the clock advances and enters an update mini-round where new information is fetched from the network.

figure k

Again the protocol is idle for maintenance queries until the clock advances.

Reading the state. When asked to report the current ledger state, the protocol outputs the prefix of the exported state, i.e., a prefix of the state encoded in \(\mathcal {C} _{\textrm{ exp}}\). By the mini-round structure, the exported state is updated exactly once in every update mini-rounds (after initialization is complete).

5.3.3 Predictable Synchronization Pattern

We now show that the ledger protocol has a predictable synchronization pattern according to Definition 3.1.

Lemma 5.2

The protocol \(\textsf {Ledger-Protocol} _{q,\texttt {D},T}\) satisfies Definition 3.1. More specifically, there is a predicate \(\textsf {predict-time} _{BC}\) that predicts the synchronization pattern of the UC Bitcoin protocol as required by Definition 3.1.

Proof Sketch

This is follows by inspection of our ledger protocol (and all protocols that share the same structure as we will see later in all the respective hybrid worlds they are executed in). The predicate \(\textsf {predict-time} \) can be implemented as follows: browse through the entire sequence \(\vec {\mathcal {I}}^T_H \) and determine how many times the clock advances. The clock advances for the first time, when all miners got sufficient maintain commands to complete their mini-round operation. By definition of \(\textsf {Ledger-Protocol} \), this implies that each party has sent a clock-update to the clock and hence the clock advances. By an inductive argument, whenever the clock has ticked, the check when the clock advances the next time is checked exactly the same way. Overall, this allows to check whether the next activation of an honest party, given the history of activations will provoke a clock update (and knowing which parties are corrupted). \(\square \)

Fig. 6
figure 6

The maintenance procedure of the UC Bitcoin protocol

6 The Bitcoin Ledger

We next show how to instantiate the ledger functionality from Sect. 4 with appropriate parameters so that it is implemented by protocol \(\textsf {Ledger-Protocol}\). The proof of this appears in the next section. To define this Bitcoin ledger , we give the specific instantiations of the relevant functions \(\textsf {Validate} \), \(\textsf {Blockify} \), \(\textsf {ExtendPolicy} \), and \(\textsf {predict-time} \).

Synchrony pattern. First, \(\textsf {predict-time} \) is defined to be \(\textsf {predict-time} _{BC}\) to reflect the synchronization pattern of the UC Bitcoin protocol as described in the proof of Lemma 5.2. This shows the dependency of the realized ledger from the protocol that achieves it.

State-buffer-validation. Similarly, in case of \(\textsf {Validate} \) we use the same predicate as the protocol uses to validate the states: For a given transaction \(\texttt {tx} \) and a given state \(\texttt {state} \), the predicate decides whether this transaction is valid with respect to \(\texttt {state} \). Given such a validation predicate, the ledger validation predicate takes a specific simple form which, excludes dependency on anything other than the transaction \(\texttt {tx} \) and the state \(\texttt {state} \), i.e., for any values of \(\text {txid}, \tau _L, P _i, \) and \(\texttt {buffer} \):

Ledger-output format. As with the above parameters, the function \(\textsf {Blockify}\) is defined to be , i.e., the function used in the UC Bitcoin protocol. In principle, any formatting function can be used and the security proof goes through (as long as the same function is used in the protocol \(\textsf {Ledger-Protocol}\) and functionality ). However, as we observe below in Definition 6.1, a meaningful \(\textsf {Blockify}\) should be in certain relation with the ledger’s \(\textsf {Validate}\) predicate. This relation is satisfied by the Bitcoin protocol.

The ledger policy. Finally, we define \(\textsf {ExtendPolicy} \). At a high level, upon receiving a list of possible candidate blocks which should go into the state of the ledger, \(\textsf {ExtendPolicy} \) does the following: for each block it first verifies that the blocks are valid with respect to the state they extend. Only valid blocks might be added to the state. In particular, \(\textsf {ExtendPolicy} \) is parameterized by three parameters—two positive integers \(\texttt {maxTime} _\texttt {window},\texttt {minTime} _\texttt {window} \), and a positive fraction \(\eta \)—and ensures the following property:

  1. 1.

    The speed of the ledger is not too slow. This is implemented by defining an upper limit \(\texttt {maxTime} _\texttt {window} \) on the time (number of clock-ticks) it takes to add \(\texttt {windowSize} \) state blocks. The enforced minimal ledger growth rate is expressed as the fraction \(\frac{\texttt {windowSize}}{\texttt {maxTime} _\texttt {window}}\).

  2. 2.

    The speed of the ledger is not too fast. This is implemented by defining a lower bound \(\texttt {minTime} _\texttt {window} \) on the time it takes to add \(\texttt {windowSize} \) state blocks. The enforced maximal ledger growth rate is expressed as \(\frac{\texttt {windowSize}}{\texttt {minTime} _\texttt {window}}\).

  3. 3.

    The adversary cannot create too many blocks with arbitrary (but valid) contents. This is formally enforced by defining an upper bound \(\eta \) on the ratio these so-called adversarial blocks within any sequence of \(\texttt {windowSize} \) (or more) state blocks. This is known as chain quality. Formally, this is enforced by requiring that a certain fraction of blocks need to satisfy higher-quality standards (to model blocks that are honestly generated).

  4. 4.

    Last but not least, \(\textsf {ExtendPolicy} \) guarantees that if a transaction is “old enough”, and still valid with respect to the actual state, then it is included into the state. This is a weak form of guaranteeing that a transaction will make it into the state unless it is in conflict. As we show in Sect. 9, this guarantee can be amplified by using digital signatures.

In order to enforce these policies, \(\textsf {ExtendPolicy} \) first defines alternative blocks which satisfy all of the above criteria in an ideal way, and whenever it catches the adversary in trying to propose blocks that do not obey the policies, it punishes the adversary by proposing its own generated blocks. In particular, if the adversary violates the policy regarding minimal chain-growth, the \(\textsf {ExtendPolicy} \) will directly propose a sequence of complying blocks and hence ensure liveness in a strong sense. The precise formal description of the extend policy (as pseudo-code) for is given in Appendix B for completeness.

On the relation between \(\textsf {Blockify} \) and \(\textsf {Validate} \). As already discussed above, \(\textsf {ExtendPolicy} \) guarantees that the adversary cannot block the extension of the state indefinitely, and that occasionally an honest miner will create a block. These are implications of the chain-growth and chain-quality properties from [24]. However, our generic \(\textsf {ExtendPolicy} \) makes explicit that a priori, we cannot exclude that the chain always extends with blocks that include, for example, only a coin-base transaction, i.e., any submitted transaction is ignored and never inserted into a new block. This issue is an orthogonal one to ensuring that honest transactions are not invalidated by adversarial interaction—which, as argued in [24], is achieved by adding digital signatures.

To see where this could be problematic in general, consider a blockify that, at a certain point, creates a block that renders all possible future transactions invalid. Observe that this does not mean that our protocol is insecure and that this is as well possible for the protocols of [24, 44]; indeed, our proof shows that the protocol will give exactly the same guarantees as an parametrized with such an algorithm \(\textsf {Blockify}\).

Nonetheless, a look in reality indicates that this situation never occurs with Bitcoin. To capture that this is the case, \(\textsf {Validate}\) and \(\textsf {Blockify}\) need to be in a certain relation with each other. Informally, this relation should ensure that the above sketched situation does not occur, i.e., \(\textsf {Blockify} \) should “not affect” the “true validity” of a transaction. A way to ensure this, which is already implemented by the Bitcoin protocol, is by restricting \(\textsf {Blockify}\) to only make an invertible manipulation of the blocks when they are inserted into the state—e.g., be an encoding function—and define \(\textsf {Validate}\) to depend on the inverse of \(\textsf {Blockify}\). This is captured in the following definition.

Definition 6.1

A co-design of \(\textsf {Blockify} \) and \(\textsf {Validate} \) is non-self-disqualifying if there exists an efficiently computable function \(\textsf {Dec} \) mapping outputs of \(\textsf {Blockify}\) to vectors \(\vec {N}\) such that there exists a validate predicate \(\textsf {Validate} '\) for which the following properties hold for any possible state \(\texttt {state} =\texttt {st} _1||\cdots ||\texttt {st} _\ell \), buffer \(\texttt {buffer}\) vectors \(\vec {N}:=(\texttt {tx} _1,\dots ,\texttt {tx} _m)\), and transaction \(\texttt {tx} \):

$$\begin{aligned} 1. \ {}&\ \textsf {Validate} (\texttt {tx},\texttt {state},\texttt {buffer}) = \textsf {Validate} '(\texttt {tx},\textsf {Dec} (\texttt {st} _1)||\cdots ||\textsf {Dec} (\texttt {st} _\ell ),\\&\texttt {buffer})\\ 2. \ {}&\ \textsf {Validate} (\texttt {tx},\texttt {state} ||\textsf {Blockify} (\vec {N}),\texttt {buffer}) = \textsf {Validate} '(\texttt {tx},\textsf {Dec} (\texttt {st} _1)||\ldots ||\\ {}&\textsf {Dec} (\texttt {st} _\ell )||\vec {N},\texttt {buffer}) \end{aligned}$$

We remark that the actual validation of Bitcoin does satisfy the above definition, since a transaction is only rendered invalid with respect to the state if the coins it is trying to spend have already been spent, and this only depends on the transactions in the state and not the metadata added by \(\textsf {Blockify}\). Hence, in the following, we assume that and satisfy the relation in Definition 6.1.

7 Security Analysis

7.1 Overview

In this section, we prove our main theorem, namely that, under appropriate assumptions, Bitcoin realizes the instantiation of the ledger functionality from the previous section. We prove our main theorem which can be described informally as follows:

Theorem

(Informal). For the security parameter \(\kappa \) and assuming \(\texttt {windowSize} =\omega (\log \kappa )\), then the protocol \(\textsf {Ledger-Protocol} \) securely realizes the concrete ledger functionality  defined in the previous section. The assumptions on network delays and mining power, where mining power is roughly understood as the ability to find proofs of work via queries to the random oracle (and will be formally defined later), are as follows:

  • In any round of the protocol execution, the collective mining power of the adversary, contributed by corrupted and temporarily de-synchronized miners, does not exceed the mining power of honest (and synchronized) parties. The exact relation additionally captures the (negative) impact of network delays on the coordination of mining power of honest parties.

  • No message can be delayed in the network by more than \(\Delta =O(1)\) rounds.

We prove the above theorem via what we believe is a useful modularization of the Bitcoin protocol (cf. Fig. 7). Informally, this modularization distills out from the protocol a reactive state-extend subprocess which captures the lottery that decides which miner gets to advance the blockchain next and additionally the process of propagating this state to other miners. In Lemma 7.2, we show that the state-extend-and-exchange module/subprocess implements an appropriate reactive UC functionality \(\mathcal {F}_{\textsc {StX}}\). We can then use the UC composition theorem which allows us to argue security of \(\textsf {Ledger-Protocol} \) in a simpler hybrid world where, instead of using this subprocess, parties make calls to the functionality \(\mathcal {F}_{\textsc {StX}}\), which then leads us to our main analysis in Theorem 7.9.

Fig. 7
figure 7

Modularization of the Bitcoin protocol

7.2 First Proof Step

In a first step, we distill out from the protocol \(\textsf {Ledger-Protocol}\) a state-extend module/subprocess, denoted as \(\textsf {StateExchange-Protocol}\), and show, using a “game-hopping” argument, that a modular description of the \(\textsf {Ledger-Protocol}\) in which every party makes invocations of this subprocess, yields an equivalent protocol. We abstract the service provided by this subprocess by a new lottery-functionality denoted \(\mathcal {F}_{\textsc {StX}}\). The modularized protocol, defined for the \(\mathcal {F}_{\textsc {StX}}\)-hybrid world, is denoted by \(\textsf {Modular-Ledger-Protocol}\).

As we prove, the subprocess \(\textsf {StateExchange-Protocol}\) UC-realizes \(\mathcal {F}_{\textsc {StX}}\) and hence the original protocol \(\textsf {Ledger-Protocol}\) and the modularized protocol \(\textsf {Modular-Ledger-Protocol}\) are in fact indistinguishable. This final step is a direct consequence of the universal composition theorem: \(\textsf {Ledger-Protocol}\) UC emulates \(\textsf {Modular-Ledger-Protocol}\) where invocations of \(\textsf {StateExchange-Protocol}\) are replaced by invocations of \(\mathcal {F}_{\textsc {StX}}\) (for appropriate parameters as precisely defined below).

Looking ahead, in the next section, we can hence focus on analyzing the simpler protocol \(\textsf {Modular-Ledger-Protocol}\) in order to show that the UC Bitcoin protocol realizes the Bitcoin Ledger of Sect. 6—again by invoking the composition theorem.

7.2.1 The State-Exchange Functionality

The state-exchange functionality \(\mathcal {F}_{\textsc {StX}} ^{\Delta ,p_H,p_A}\) allows parties to submit ledger states which are accepted with a certain probability. Accepted states are then multicast to all parties. Informally, it can be seen as lottery on which (valid) states are exchanged among the participants. Note that for simplicity of notation we do not write the parameters when clear from the context.

Parties can use \(\mathcal {F}_{\textsc {StX}}\) to multicast a valid state, but instead of accepting any submitted state and sending it to all (registered) parties, \(\mathcal {F}_{\textsc {StX}}\) keeps track of all states that it ever saw, and implements the following mechanism upon submission of a state \(\vec {\texttt {st}}\) and a new block \(\texttt {st} \) from any party: If \(\vec {\texttt {st}}\) was previously accepted by \(\mathcal {F}_{\textsc {StX}} \) and \(\vec {\texttt {st}}|| \texttt {st} \) is a valid new state, then \(\mathcal {F}_{\textsc {StX}} \) accepts \(\vec {\texttt {st}}|| \texttt {st} \) with probability \(p_H\) (resp. \(p_A\) for dishonest parties) and sends it to registered parties. Each submission is evaluated independently. The formal specification is found in Fig. 8.

Fig. 8
figure 8

The state exchange functionality. Parameters are the delay \(\Delta \) and the success probabilities \(p_H\) and \(p_A\) for honest and adversarial submissions

7.2.2 Realizing the State-Exchange Functionality

The state-exchange functionality is realized by the protocol given below. It is obtained by identifying the relevant instructions from the UC-ledger protocol. More precisely, protocol \(\textsf {StateExchange-Protocol} \) UC-realizes the \(\mathcal {F}_{\textsc {StX}} \) functionality in the \((\mathcal {F}_{\textsc {RO}} ^\kappa \), -hybrid world. Note that is a (named) instance of the functionality. The protocol is parametrized by \(q \) and \(\texttt {D} \) where \(q \) is the number of mining attempts per submission attempt and \(\texttt {D} \) is the difficulty of the proof-of-work.

figure l

Lemma 7.1

Let \(p:=\frac{\texttt {D}}{2^\kappa }\). The protocol \(\textsf {StateExchange-Protocol} _{q,\texttt {D}}\) UC-realizes functionality \(\mathcal {F}_{\textsc {StX}} ^{\Delta , p_H,p_A}\) in the -hybrid model where \(p_A:= p\) and \(p_H:= 1-(1-p)^q\).

Proof

We consider the following simulator:

figure m

The proof works similar as the one for Lemma 5.1 in [44]. Recall the notation from Sect. 2.1 and introduce the shorthand notation \(T_{\textrm{real}}:= T_{\textsc {exec}_{\textsf {StateExchange-Protocol},\mathcal {A},\mathcal {Z}}(\kappa ,z)}\) which is the (distribution of the) joint view of all parties in the execution of \(\textsf {StateExchange-Protocol} \) for adversary \(\mathcal {A}\) and environment \(\mathcal {Z}\) (upon some input z). Denote by \(T_\textrm{ideal}:= T_{\textsc {exec}_{\mathcal {F}_{\textsc {StX}},\mathcal {S}_{\text {stx}},\mathcal {Z}}}(\kappa ,z)\) the joint view of all parties for \(\mathcal {F}_{\textsc {StX}} \) with simulator \(\mathcal {S}_{\text {stx}} \). In the following, we treat the arguments \(\kappa \) and z as implicit.

Define a new hybrid world, via the following random experiment: the experiment is defined as the real-world execution except that the random oracle aborts on collisions with \(\textsc {collision-error} \) and that adversarial oracle queries are emulated as in \(\mathcal {S}_{\text {stx}} \). We use the shorthand \(\textsc {HYB} _{\mathcal {A},\mathcal {Z}}\) to refer to this hybrid world (defined analogously to \(\textsc {exec}_{\cdot ,\cdot ,\mathcal {Z}}\)). The only difference is thus that in the hybrid world we may abort with \(\textsc {collision-error} \) or \(\textsc {tree-error} \) as in the ideal execution. Let \(T_\textrm{hyb}\) be the associated distribution of the joint view.

Let \(\textsf{event1}\) be the event that some parties query two different values \(v,v'\) such that \(H[v] = H[v']\), i.e. the event that a hash-collision occurs (this event is a condition on the realized transcript tr in the support of \(T_\textrm{real}\) or \(T_\textrm{hyb}\), respectively). For any two queries the probability that they return the same hash value is \(2^{-\kappa }\). By a union bound over all queries we have that \(\textsf{event1}\) happens with probability at most \(\textsf{poly}(\kappa )\cdot 2^{-\kappa }\) in both worlds. Note that if \(\textsf{event1}\) does not happen the hybrid random experiment does not abort with \(\textsc {collision-error} \).

Let \(\textsf{event2}\) be the event that some party makes a query \(H[(\texttt {s},\cdot ,\cdot )]\) where no v exists such that \(H[v] = \texttt {s} \), but later some party makes a query \(v'\) such that \(H[v'] = \texttt {s} \). The probability that any query \(H[(\texttt {s},\cdot ,\cdot )]\) a later query returns \(\texttt {s} \) is \(2^{-\kappa }\) in both worlds By a union bound over all queries we have that \(\textsf{event2}\) happens with probability at most \(\textsf{poly}(\kappa )\cdot 2^{-\kappa }\) in both worlds.

Next, we show that the \(\textsc {tree-error} \) abort does not occur in the hybrid world execution conditioned under \(\textsf{event1}\) and \(\textsf{event2}\) not happening. Assume for contradiction that \(\textsc {HYB} _{\mathcal {A},\mathcal {Z}}\) aborts with \(\textsc {tree-error} \) with \(\textsf{event1}\) and \(\textsf{event2}\) not happening. Let \(\mathcal {C} = \textbf{B} _1,\dots ,\textbf{B} _n\) be the shortest valid chain created in the experiment \(\textsc {HYB} _{\mathcal {A},\mathcal {Z}}\) such that \(\textbf{B} _1,\dots ,\textbf{B} _{n-1} \in \mathcal {T}\) but \(\textbf{B} _1,\dots ,\textbf{B} _{n} \not \in \mathcal {T}\). Let \(\textbf{B} _i = (\texttt {s} _i,\texttt {st} _i,\texttt {n} _i)\). Since \(\mathcal {C} \) is a valid chain we have \(H[(\texttt {s} _{n},\texttt {st} _{n}\texttt {n} _{n})] < D\). But at the time \(\textbf{B} _n\) was added to \(H\) no valid chain existed where the last block has hash value \(\texttt {s} _n\) (otherwise \(\mathcal {C} \) would be in \(\mathcal {T}\)). This implies that no earlier query to \(H\) could have returned \(\texttt {s} _n\), since if the query was \(\textbf{B} _{n-1}\) \(\mathcal {C} \) would not be the shortest chain with the above property and if the query was not \(\textbf{B} _{n-1}\) the event \(\textsf{event1}\) must have happened. This implies that \(\textsf{event2}\) must have happened, which is a contradiction.

This implies that conditioned under \(\textsf{event1}\) and \(\textsf{event2}\) not happening, the hybrid-world execution proceeds the same as the real-world execution and hence the two worlds are statistically close with respect to efficient environments \(\mathcal {Z}\), i.e., \(\textsc {exec}_{\textsf {StateExchange-Protocol},\mathcal {A},\mathcal {Z}} \approx \textsc {HYB} _{\mathcal {A},\mathcal {Z}}\).

Now we compare \(\textsc {HYB} _{\mathcal {A},\mathcal {Z}}\) and \(\textsc {exec}_{\mathcal {F}_{\textsc {StX}},\mathcal {S}_{\text {stx}},\mathcal {Z}}\). Consider the event where a honest miner queries a block \((\texttt {s},\texttt {st},\texttt {n})\) and fails, i.e., where \(H[(\texttt {s},\texttt {st},\texttt {n})]>\texttt {D} \). In the hybrid execution, this query is stored in the random oracle table while the simulator in the ideal world does not store the query in the random oracle table. Under the condition that such failed queries are not repeated, the hybrid-world execution and the ideal-world execution proceed in identical ways (note that the network simulation in \(\mathcal {S}_{\text {stx}} \) perfectly mimics the real and the hybrid worlds).

Note that the nonce \(\texttt {n} \) in a ‘failed’ query \((\texttt {s},\texttt {st},\texttt {n})\) is chosen uniformly at random from \(\left\{ { 0,1 }\right\} ^{\kappa }\) by honest parties. This implies that with probability \(\textsf{poly}(\kappa )\cdot 2^{-\kappa }\) it was never queried before. As honest miner discard ‘failed’ queries (and failed queries do not leave the ITI and hence are hidden from the adversary) it also follows that except with probability \(\textsf{poly}(\kappa )\cdot 2^{-\kappa }\) the query will not be queried again (by any honest or corrupted party) unless the nonce of that failed query would be successfully guessed. By a union bound over all failed queries we have that failed queries are never queried twice except with probability \(\textsf{poly}(\kappa )\cdot 2^{-\kappa }\). Thus, \(\textsc {exec}_{\mathcal {F}_{\textsc {StX}},\mathcal {S}_{\text {stx}},\mathcal {Z}} \approx \textsc {HYB} _{\mathcal {A},\mathcal {Z}}\).

This concludes the proof. \(\square \)

7.3 Modularizing the Ledger-Protocol

From the ledger protocol \(\textsf {Ledger-Protocol} _{q,\texttt {D},T}\) we can derive what we denote \(\textsf {Modular-Ledger-Protocol} _{T}\), which uses the state-exchange functionality to extend and exchange blockchain states. This is defined in Fig. 9, where the only non-trivial modifications (aside of some minor structural changes) are the replaced implementations of algorithms \(\textsf {ExtendState} (\texttt {st})\) and \(\textsf {FetchInformation} \). The new implementations are as follows:

Fig. 9
figure 9

The modular ledger protocol (differences to original protocol shown)

figure n

and

figure o

We prove the soundness of this decomposition in the following lemma, which involves a sequence of hybrid steps to convert the original protocol into the suitable modular form, and finally by invoking Lemma,7.1.

Lemma 7.2

The UC Bitcoin protocol \(\textsf {Ledger-Protocol} _{q,\texttt {D},T}\) UC emulates \(\textsf {Modular-Ledger-Protocol} _{T}\) that runs in a hybrid world with access to the functionality \(\mathcal {F}_{\textsc {StX}} ^{\Delta ,p_H,p_A}\) with \(p_A:= \frac{\texttt {D}}{2^\kappa }\) and \(p_H= 1-(1-p_A)^q \), and where \(\Delta \) denotes the upper bound on the network delay.

Proof

We first provide the sequence of modifications, morphing from the original protocol to the modularized protocol in a “game-hopping” style:

We start with the original \(\textsf {Ledger-Protocol} \) and consider the protocol part below where will alter the protocol step by step.

figure p

Modification 1. The first modification of the protocol (see below) proceeds as \(\textsf {Ledger-Protocol} \) except (a) it stores a history of all valid chains in a tree \(\mathcal {T}\) and (b) in the ExtendState(st) procedure it checks that \(\vec {\texttt {st}}||\texttt {st} \) is a valid state and that there exists a chain in \(\mathcal {T}\) which encodes the state \(\vec {\texttt {st}}\). We observe that the protocol calls ExtendState(st) only with \(\texttt {st} \) where \(\vec {\texttt {st}}||\texttt {st} \) is a valid state. This implies that the first check is always satisfied. Moreover, note that the current local chain \(\mathcal {C} _{loc}\) which encodes state \(\vec {\texttt {st}}\) is at any time stored in the tree \(\mathcal {T}\). We therefore call the state encoded in \(\mathcal {C} _{loc}\) by \(\vec {\texttt {st}}_\textrm{loc}\) and see that the second check is therefore also always satisfied. Hence, the modified protocol has the same input/output behavior as \(\textsf {Ledger-Protocol} \).

figure q

Modification 2. In Modification 2 (see below) the local state \(\vec {\texttt {st}}_\textrm{loc}\) is stored directly instead of being encoded in chain \(\mathcal {C} _{loc}\). The procedures ExtendState(st) and FetchInformation are modified to accommodate this change. Note that the \(\mathcal {C} _{loc}\) is stored in \(\mathcal {T}\) as we have seen in the first modification. This implies that the behavior of ExtendState(st) remains the same as in the first modification.

figure r

Modification 3. In Modification 3 (see below) parts of the procedures ExtendState(st) and FetchInformation are split off into separate sub-procedures. Otherwise the protocol remains the same. As there are no changes to the program logic, the protocol still has the same behavior as the original protocol.

figure s

Final Considerations. We identify that Modification 3 above, in particular procedures \(\textsc {submit-new} \), \(\textsc {continue} \), and \(\textsc {fetch-new} \), are as defined in \(\textsf {StateExchange-Protocol} \), hence invocations of this protocol. Consider the part of \(\textsf {Modular-Ledger-Protocol} \) below that we observe is the same as Modification 3 except that the handling of chains (variable \(\mathcal {T}\)) and the calls to sub-procedures \(\textsc {submit-new} \), \(\textsc {continue} \), and \(\textsc {fetch-new} \) are now handled managed by the functionality \(\mathcal {F}_{\textsc {StX}} \) that is invoked at the respective places:

figure t

By Lemma 7.1, we know that \(\textsf {StateExchange-Protocol} \) UC-realizes \(\mathcal {F}_{\textsc {StX}} \), therefore replacing calls to \(\textsf {StateExchange-Protocol} \) by calls to the ideal process \(\mathcal {F}_{\textsc {StX}} \) yields an indistinguishable protocol to \(\textsf {Ledger-Protocol} \). This concludes the proof of the lemma. \(\square \)

7.4 Second Proof Step

We now proof that if honest parties have some advantage over the dishonest parties in winning the lottery, then the UC Bitcoin protocol \(\textsf {Modular-Ledger-Protocol} _{T}\) realizes the ledger functionality. By the composition theorem, we can directly conclude that \(\textsf {Ledger-Protocol} _{q,\texttt {D},T}\) realizes the Bitcoin ledger functionality.

7.4.1 Relevant Quantities of the Analysis

The main theorem will require a condition on the power of the adversary, and it is useful to describe here the random variables induced by a pair \((\mathcal {Z}, \mathcal {A})\).

Recall from Sects. 5.2.1 and 5.3.1 that a party is honest-and-synchronized if it either joined at the onset of the execution or it joined a sufficient number of rounds ago (depending on the delay). Furthermore, recall that a logical round consists of two clock-ticks. In the following, we denote the round number by r (which consists of two mini-rounds).

Definition 7.3

(Query Power) We define for the real-world execution of \(\textsf {Modular-Ledger-Protocol} _{T}\) with respect to the pair \((\mathcal {Z}, \mathcal {A})\) the sequence of random variables \(Q_H^{(r)}\) to measure the number of distinct honest-and-synchronized parties that are activated in the working mini-round of round r to submit a query to \(\mathcal {F}_{\textsc {StX}} ^{\Delta ,p_H,p_A}\). Analogously, denote by \(Q_A^{(r)}\) the number of submit-queries to \(\mathcal {F}_{\textsc {StX}} ^{\Delta ,p_H,p_A}\) from corrupted parties in round r, and by \(Q_{H,DS}^{(r)}\) the number submit-queries by honest-but-desynchronized parties in the working mini-round of round r.

Definition 7.4

(Mining Power.) We define mining power as simple functions of the query-power. Note that, in our analysis, \(p_A\) and \(p_H\) are constants. We have:

  • The total mining power \(\texttt {T} _{mp}^{(r)}:= Q^{(r)}_{A} \cdot p_A+ (Q^{(r)}_{H} + Q^{(r)}_{H,DS}) \cdot p_H\).

  • The adversarial mining power \(\beta ^{(r)}:= p_A\cdot Q^{(r)}_A \, + \, p_H\cdot Q^{(r)}_{H,DS}\).

  • The honest mining power \(\alpha ^{(r)}:= 1 - (1-p_H)^{Q_H^{(r)}}\).

It might be useful to recall that from Bernoulli’s inequality we have \(\alpha ^{(r)} \le p_H\cdot Q_H^{(r)}\). For small values of \(p_H\) (as usual in Bitcoin), this upper bound is a good approximation of \(\alpha ^{(r)}\).

Note that \(\alpha ^{(r)}\), \(\beta ^{(r)}\), and \(\texttt {T} _{mp}^{(r)}\) are random variables (on integer domains). For example. \(\alpha ^{(r)}\) maps the number of honest-and-synchronized submit-queries to the probability that at least one is a successful query. More formally, conditioned on \(Q^{(r)}_H = q\), the random variable \(\alpha ^{(r)}\) is the probability of at least one success among q queries and the expected value of \(\alpha ^{(r)}\) corresponds to the probability of at least one successful state-extension in round r of the execution. The reason is that \(\mathcal {F}_{\textsc {StX}} ^{\Delta ,p_H,p_A}\) treats each submit-query independently at random. This is the main motivation to introduce this intermediate step.

7.4.2 The Analysis

In the analysis of Bitcoin, conditions are needed that allow to reasonably lower and upper bound expected values of the above random variables (and their variances). As we will quickly recap below, it is shown in  [44] that if the involved query power exceeds any limits in the constant-difficulty case, then no security guarantees can be obtained. We start with the following definition.

Definition 7.5

(Query and Mining Pattern) We say that the pair \((\mathcal {Z},\mathcal {A})\), running for R rounds (referred to by numbers \(0,\dots ,R-1\)) obeys the query pattern \((\vec {h}, \vec {a}, \vec {d})\) if, for any round r, we have

$$\begin{aligned}&Q_H^{(r)} \ge h_r, \quad Q_A^{(r)} \le a_r, \quad Q_{H,DS}^{(r)} \le d_r \end{aligned}$$

where \(\vec {h} = (h_0,\dots ,h_{R-1}), \vec {a}=(a_0,\dots ,a_{R-1}), \vec {d} = (d_0,\dots ,d_{R-1})\) are vectors consisting of positive integers. Consequently, the pair \((\mathcal {Z},\mathcal {A})\) obeys the associated mining pattern denoted by \((\vec {\alpha },\vec {\beta })\), where vectors \(\vec {\alpha }=(\alpha _0,\dots ,\alpha _{R-1})\) and \(\vec {\beta } = (\beta _0,\dots ,\beta _{R-1})\) are defined by the mapping

$$\begin{aligned}&\alpha ^{(r)} \ge 1-(1-p_H)^{h_r} =: \alpha _r\\&\beta ^{(r)} \le p_A\cdot a_r + p_H\cdot d_r =: \beta _r. \end{aligned}$$

Technically, these definitions imply lower and upper bounds on the expectations of the random variables \(\alpha ^{(r)}\) and \(\beta ^{(r)}\), respectively, which is what will be eventually needed.

Definition 7.6

(Power Limits) The pair \((\mathcal {Z},\mathcal {A})\) is said to be \(q_{tot}\)-query-limited if \(Q^{(r)}_H + Q^{(r)}_A + Q^{(r)}_{H,DS} \le q_{tot}\). The pair \((\mathcal {Z},\mathcal {A})\) is said to be \(\texttt {T} _{mp}\)-mining limited if for all r,

$$\begin{aligned} \texttt {T} _{mp}^{(r)} \le \texttt {T} _{mp}. \end{aligned}$$

The bounds in the theorem will depend on several worst-case quantities that we introduce below.

Definition 7.7

For mining patterns \((\vec {\alpha }, \vec {\beta })\), we use the shorthand notation

$$\begin{aligned}&\alpha _{min} := \min \,\{\alpha _r\}_{r\in [0,R-1]} \quad \text {and}\quad \alpha _{max} := \max \,\{\alpha _r\}_{r\in [0,R-1]}; \\&\beta _{min} := \min \,\{\beta _r\}_{r\in [0,R-1]} \quad \text {and} \quad \beta _{max} := \max \,\{\beta _r\}_{r\in [0,R-1]}. \end{aligned}$$

For a (non-empty) subset \(S \subseteq \{0,\dots , R-1\}\) of rounds, we define the corresponding averages by

$$\begin{aligned}&\overline{\alpha }_S := \frac{1}{|S|}\cdot \sum _{r\in S} \alpha _r \quad \text {and}\quad \overline{\beta }_S := \frac{1}{|S|}\cdot \sum _{r\in S} \beta _r. \end{aligned}$$

For \(\texttt {T} _{mp}\)-mining limited pairs \((\mathcal {Z},\mathcal {A})\), we define the relative-power fractions

$$\begin{aligned} \rho _{h} := \frac{\alpha _{min}}{\texttt {T} _{mp}} \quad \text {and} \quad \rho _{a} := \frac{\beta _{min}}{\texttt {T} _{mp}}. \end{aligned}$$

We call a subset S of rounds an interval if it consists of consecutive round numbers \(r,\dots ,r+t\) for some integers \(r,t \ge 0\).

Following [44], the theorem will take into account that the network delay \(\Delta \) decreases the effectiveness of the actual honest mining power:

Definition 7.8

(Discount function.) We define the function \(\gamma (\alpha ,\Delta ):= \frac{\alpha }{1+\alpha \Delta }\) for \(\alpha , \Delta >0\).

We are now ready to state and prove the main theorem which assures that we can realize the ledger for a given range of parameters.Footnote 12

Theorem 7.9

Let \(p\in (0,1)\), integer \(q \ge 1\), \(p_H= 1-(1-p)^q\), and \(p_A=p\). Let \(\Delta \ge 1\) be the upper bound on the network delay, let \(\kappa \) be the security parameter, and let \(T = \omega (\log \kappa )\) be the main protocol parameter of the ledger protocol. For all pairs \((\mathcal {Z}, \mathcal {A})\) of PPT environments \(\mathcal {Z}\) (w.r.t. identity bound \(\xi _{sync}\)) and PPT adversaries \(\mathcal {A}\) running for R rounds which obey the \((\vec {\alpha }, \vec {\beta })\) mining pattern as of Definition 7.5 and which are \(\texttt {T} _{mp}\)-limited as of Definition 7.6, the real-world execution of protocol \(\textsf {Modular-Ledger-Protocol} _T \) (in the -hybrid world) is indistinguishable from the ideal-world execution with ledger functionality (and the simulator defined in the proof), if for some \(\lambda >1\), it holds that for any interval S of rounds of size \(t\ge 1\) and any \(S'\subseteq S\) of size \(t'\in [\max \{1, t\cdot (1-\Delta \alpha _{max})\}, \dots , t]\) the relation

$$\begin{aligned} \overline{\alpha }_{S'} \cdot (1-2\cdot (\Delta +1)\cdot \texttt {T} _{mp}) \ge \lambda \cdot \overline{\beta }_S \end{aligned}$$
(1)

holds, and if the parameters of fulfill the relations

$$\begin{aligned} \texttt {windowSize} = T \quad&\text { and } \quad \, \texttt {Delay} = 4\Delta , \\ \frac{(1-\delta )}{2}\cdot \gamma _{min} \ge \frac{\texttt {windowSize}}{\texttt {maxTime} _\texttt {window}} \,\,\,&\text { and } \,\,\, \frac{(1+\delta )}{2} \cdot \texttt {T} _{mp}\le \frac{\texttt {windowSize}}{\texttt {minTime} _\texttt {window}}, \\ \eta \ge \min \{(1+\delta )&\cdot \frac{\beta _{max}}{\gamma _{min}}, \, 1\}, \end{aligned}$$

where the quantities are defined as in Definition 7.7 and where \(\gamma _{min}:= \gamma (\alpha _{min},\Delta )\), and \(\delta >0\) is an arbitrary constant. In particular, the distinguishing advantage is bounded by \(R \cdot \text {negl}(T)\), where \(\text {negl}(T)\) denotes a negligible function in T.

Remark

Before proving the theorem, it is instructive to recall the flat model of Bitcoin and to see how the above quantities appear there. By the above definitions and theorem statement, we see that we only make statements if the honest mining power is not too small, the dishonest mining power is not too large (and stands in a certain relation to the honest mining power) and if the respective mining power values are in a reasonable range to the overall mining power. In particular, the theorem expresses a condition that the average honest mining power dominates the average mining power of the adversary, even if the honest average is taken over slightly smaller intervals (note that in particular, for each singleton set S, we obtain that the familiar condition that \(\alpha _r\) should dominate \(\beta _r\)).

Note that \(\beta _{min}\) is the most restrictive restriction (but not a lower-bound) on the adversary (similarly, \(\alpha _{max}\) is the best guaranteed lower-bound for honest-and-synchronous mining power). In general, the adversary (and hence the environment) is free to activate as many ITIs unless it would exceed \(\texttt {T} _{mp}\) if the environment is \(\texttt {T} _{mp}\)-bounded, and no more than what is allowed by \(\vec {\beta }\). This is a more general setting in the fixed-difficulty setting compared to previous works in the same setting. Furthermore, we show in the next subsection how to get a better bound for chain-quality.

Looking ahead, for example in [44], the overall number of parties is fixed to be some number n and there is an upper bound on the number of dishonest parties \(\rho n\) (and de-synchronized parties are not allowed by definition). Assume for simplicity that \(p_H= p_A= p\) for a very small value \(p>0\). We then obtain \(\alpha _{min} \approx (1-\rho ) \cdot n \cdot p\) and \(\beta _{max} \approx \rho _H \cdot n \cdot p\). By \(\texttt {T} _{mp}= n\cdot p\) and since the mining pattern as defined above is flat in flat models (cf. Sect. 8.1), the correspondence \(\rho _a = \rho \) and \(\rho _h = (1-\rho )\) follows.

Also, as pointed out by [44], for too large values of p in a range that would yield \(\texttt {T} _{mp}= n\cdot p > \frac{1}{ \Delta }\) (where \(\Delta \) is the network delay), there is an attack against the protocol, even if one assumes an honest majority. This indicates that the main condition of the theorem in Eq. (1) is also necessary up to a constant factor, and recent works have revealed the exact threshold for security [20, 27].

We now prove our main theorem.

Proof of Theorem 7.9

We start with an overview followed by a sequence of claims.

Overview. We prove the theorem using the formalism of [4] to be able to model shared functionalities. In more detail, we specify the simulator \(\mathcal {S}_{\text {ledg}} \) as pseudo-code in Appendix C to prove that . Recall from Sect. 2.1.4 that this means that in the real world, parties are running , and in the ideal world the parties are running , where the operator obliviously transforms the defined protocols into standard UC protocols without changing their behavior, and the indistinguishability notion is the standard UC-emulation notion.

Let us explain the general structure of the simulator and the proof: the simulator internally runs the round-based mining procedure of every honest party. Whenever a working mini-round is over, i.e., whenever the real world parties have issued their queries to \(\mathcal {F}_{\textsc {StX}} \), then the simulator will assemble the views of its simulated honest-and-synchronized miners and determine their common prefix of states, which is the longest state stored or received by each simulated party when chopping off T blocks. The adversary will then propose a new block candidate, i.e., a list of transactions, to the ledger to announce that the common prefix has increased (procedure ExtendLedgerState). The ledger will apply the \(\textsf {Blockify} \) on this list of transactions and add it to the state. Note that since \(\textsf {Blockify} \) does not depend on time, the current time of the ledger has no influence on this output. To reflect that not all parties have the same view on this common prefix, the simulator can adjust the state pointers accordingly (procedure AdjustView). The simulation inside the simulator is perfect and is simply the emulation of real-world processes. What restricts a perfect simulation is the requirement of a consistent prefix and the restrictions imposed by \(\textsf {ExtendPolicy} \). In order to show that these restrictions are not forbidding a proper simulation, we have to justify, why the choice of the parameters in the theorem are sufficient to guarantee that (except with negligible probability). To this end, we analyze the real-world execution to bound the corresponding bad events that prevent a perfect simulation.

We basically follow the proof ideas of Pass, Seeman, and shelat [44] to bound the bad events and adapt their observations to our setting. The analysis is divided into several different claims about the real-world execution. They include properties such as a lower-bound on the chain growth, the chain quality, or an upper-bound on the chain growth. These claims prove that our simulator can simulate the real-world view perfectly, since the restrictions imposed by the ledger prohibit that only with negligible probability, where the distinguishing advantage is upper bounded by \(R \cdot \text {negl}(T)\), where R denotes the number of rounds the protocol is running and \(\text {negl}(\cdot )\) denotes a negligible function in the parameter T.

Recall that each round consists of two time-ticks. Hence, if a statement is expressed with respect to a certain number t of rounds, it can equivalently be expressed with respect to 2t clock-ticks. Recall that the ledger parameters have to be given with respect to the clock, since the clock is the formal reference point of time. However, for the analysis, it is easier to think in rounds. In the following sections, if we refer to an interval \(r,\dots ,r+t\), this refers to t full rounds, i.e., the time window when the clock first switched to the value \(\tau =2r\) up to any point where the clock value satisfies \(\tau \in \{2(r+t),2(r+t)+1\}\).

Chain dissemination. We first state an obvious useful fact about the protocol’s operation.

Lemma 7.10

(State dissemination) Let \(P _i\) and \(P _j\) be miners, and let \(r \ge 0\). Assume \(P _i\) is honest in round r, and its adopted state has length \(\ell \). For any honest miner \(P _j\) in round \(r+\Delta \) who registered to the network before round r, it holds that its adopted state must have at least length \(\ell \).

Proof

By assumption, all messages, and in particular transmitted states of honest miners, are delayed maximally by \(\Delta \) rounds. Thus, if such a miner receives a state of length \(\ell \), then any other honest miner will receive this state within the next \(\Delta \) rounds since the protocol relays its adopted state. Additionally, if an honest miner successfully extends a ledger-state in round r, the new state is fetched by other honest miner at latest after \(\Delta \) rounds if they were registered before round r. Hence by then, they will have adopted a chain of length at least \(\ell \). \(\square \)

Probably the most useful corollary which is used in the sequel, is to apply the above lemma to the sub-class of honest-and-synchronized miners. Note that if \(P _j\) in the above lemma is honest-and-synchronized at round \(r+\Delta \) it must have been registered to the network not later than at round \(\max \{0, r-\Delta \}\) and hence the statement applies.

Analyzing chain growth. We now state the relation between time (measured in number of rounds) and guaranteed number of new state blocks.

Lemma 7.11

(Chain growth) Consider the real-world execution (under the conditions of the theorem). Let \(P _i\) be a miner, and let \(r \ge 0\). Assume \(P _i\) is honest-and-synchronized in round r, and the (longest) state adopted by \(P _i\) in round r has length \(\ell \). Then, in round \(r+t\), it holds that for any \(\delta >0\), except with probability \(R \cdot \text {negl}(T)\), the length of the (longest) state adopted of any honest-and-synchronized miner \(P _j\) in that round has length at least \(\ell + T\) if \(t \ge \frac{T}{(1-\delta )\cdot \gamma _{min}}\).

More generally, for an interval of rounds \(r,\dots ,r+t\), we can guarantee a length increase of \(\gamma \cdot t\) with \(\gamma := \frac{\tau }{1+\tau \Delta }\) if for all possible subsets S of rounds of size \(t'= t(1-\gamma \Delta )\) of this interval we have \(\overline{\alpha }_S \ge \tau \). The guarantee holds except with probability \(\exp (-\Omega (t\gamma ))\).

Proof

We first prove that for any real-world adversary \(\mathcal {A}_{} \), there is an adversary \(\mathcal {A}_{} '\) that, starting at the given round r, maximally delays messages and prove that in a real-world execution with \(\mathcal {A}_{} '\) the expected state length of an honest-and-synchronized miner in round \(r+t\), where the expectation is taken over the randomness of the adversarial strategy, is no larger than with adversary \(\mathcal {A}_{} \) in round \(r+t\). Given adversary \(\mathcal {A}_{} \), the adversary \(\mathcal {A}_{} '\) works as follows. It internally runs \(\mathcal {A}_{} \) until and including round r without any modifications. After round r, \(\mathcal {A}_{} '\) first delays all current messages in the network to the maximally possible delay. Also, after round r, whenever an honest-and-synchronized party sends a message containing a state, \(\mathcal {A}_{} '\) sets the maximal delay \(\Delta \) for this message. Message delays defined by \(\mathcal {A}_{} \) for messages that contain valid states of honest parties are ignored. The adversary further ignores any message sent by \(\mathcal {A}_{} \) on behalf of corrupted parties after round r.

We define the following “hybrid world”, which equals the real world execution, but with fixed randomness as follows: for random strings \(\sigma \), \(\sigma '\), we define \(\textsc {HYB} _{\mathcal {F}_{\textsc {StX}} (\sigma '), \mathcal {A}(\sigma ),\mathcal {Z}}\) to be defined analogously to \(\textsc {exec}_{\cdot ,\cdot ,\mathcal {Z}}\) but where the internal coins of \(\mathcal {A}_{} \) and \(\mathcal {F}_{\textsc {StX}} \) are fixed to \(\sigma \) and \(\sigma '\) respectively (note that both are poly-bounded by the run-time restrictions of UC). Let \(T^{\textrm{hyb}}_{\mathcal {A}_{} (\sigma ),\mathcal {F}_{\textsc {StX}} (\sigma '),\mathcal {Z}}\) be the associated distribution of the joint view (induced by the random coins of \(\mathcal {Z}\)). Let be the function that maps a transcript T (of real-world and hybrid-world executions) to the length of the (longest) adopted chain by (honest-and-synchronized) miner i in round r.

We first give an inductive proof to show that for any \(r > 0\), and all strings \(\sigma ,\sigma '\),

\(\square \)

Base Case(s): We give the base cases \(t=0\) and \(=1\) to already include the arguments for the general case below. We argue for any fixed \(\sigma _Z\) and show that the condition in the event cannot be violated. Since adversary \(\mathcal {A}_{} \) and \(\mathcal {A}_{} '\) behave identical up to and including round r, the length of the longest state known or received by any party is the same. The reason is that \(\mathcal {A}_{} '\) and \(\mathcal {A}_{} \) play exactly the same strategy when the randomness is fixed, since \(\mathcal {A}_{} '\) itself does not use additional random coins and thus case \(t=0\) follows. Furthermore, when the randomness \(\sigma '\) of \(\mathcal {F}_{\textsc {StX}} \) is fixed, a miner i in any round \(r'\) is successful, if and only if it is successful in round \(r'\) with adversary \(\mathcal {A}_{} '\). Thus, the condition for \(t=1\) would only be violated if player i receives a longer state in round \(r+1\). However, since \(\mathcal {A}_{} '\) maximally delays messages, if any state arrives in round \(r+1\) in the real execution with \(\mathcal {A}_{} '\), then it arrives no later than \(r+1\) in the real execution with \(\mathcal {A}_{} \). This concludes the base cases.

Induction Step: \(t \rightarrow t+1\): By the induction hypothesis, we have that the condition

holds with probability one. We argue that holds as well (on the above arguments) with probability one. Assume this was not the case, then following the above reasoning, it can only be due to miner i receiving a state in round \(r+t+1\) that would increase the value of but not the value of (since the success of miner i in round \(r+t+1\) is fixed given \(\sigma '\)). By the same reasoning as above, since \(\mathcal {A}_{} '\) maximally delays delivery of new states, if any state arrives in round r in the real execution with \(\mathcal {A}_{} '\), then it arrives no later than r in the real execution with \(\mathcal {A}_{} \). This concludes the induction proof.

We note that the hybrid world, if we sample \(\sigma , \sigma '\) this yields the distribution \(T_{\textsc {exec}_{\pi ,\mathcal {A}',\mathcal {Z}}}(\kappa ,z)\) (for any fixed input z to the environment). Let us abbreviate this by \(T_{\textrm{real}, \mathcal {A}_{} '}\) to save on notation (and assuming the input z is hard-coded in the environment). Similarly, let us denote \(T_{\textrm{real}, \mathcal {A}_{}}\) the distribution in an execution with \(\mathcal {A}_{} \).

By taking the expectation over \(\sigma ,\sigma '\) (and by the law of total probability), we immediately get from the above arguments that for any positive integer c and any round r:

where we also used that for \(t=0\), the length distributions induced by \(\mathcal {A}_{} \) and \(\mathcal {A}_{} '\) are identical. Hence, chain growth can be analyzed w.r.t. adversary \(\mathcal {A}_{} '\) to yield a useful statement for any adversary \(\mathcal {A}_{} \).

Let us use the following terminology: We say a round \(r'\) is uniform if holds (where tr is a transcript), for all honest-and-synchronized miners i and j. Recall that adversary \(\mathcal {A}_{} '\) does not broadcast adversarially generated states and any new state is delayed by exactly \(\Delta \) rounds. The slowest progress of the overall maximal state length known to an honest-and-synchronized party occurs in case uniform rounds are the only successful rounds (if at all). Otherwise, the honest miner with the longest state could be successful and broadcast a longer state at round \(r'\), which would be guaranteed to arrive to any other honest miner in \(r + \Delta \). Furthermore, by a standard coupling argument, the probability of success of any honest-and-synchronized party in some round \(r'\) is minimized by an environment \(\mathcal {Z}\) that activates just enough parties to obey the mining pattern \(\alpha _{r'}\). The coupling with any other environment can be obtained by letting the activation results be the same up to the point where enough parties have been activated to satisfy the mining pattern. Further activations honest-and-synchronized participants can only induce more successful state extension than what \(\mathcal {Z}\) obtained.

We are thus left with analyzing growth w.r.t. a simple adversary and an environment \(\mathcal {Z}\) with a fixed activation pattern per round to match the mining pattern.

Obtaining a tail bound depending on number of blocks. Now, fix some round r. If in round \(s=r+t\), the length increase of the overall longest state of an honest-and-synchronized miner is less than c blocks, then at most \(c\cdot \Delta \) non-uniform rounds occurred. According to above, we can associate to each round i a random variable \(X_{i}\) which is 1 if at least one honest-and-synchronized miner successfully extended the state by a query to \(\mathcal {F}_{\textsc {StX}} \). The \(X_i\)’s are independent by construction and there must be at least \(t-c\cdot \Delta \) uniform rounds. On the other hand, for any concrete sub-sequence of rounds \(S \subset (r,\dots ,r+t)\) of size \(t'\), the Chernoff-Hoeffding bound in Theorem 2.3 implies for our setting (of independent heterogeneous variables) that

$$\begin{aligned} \Pr \left[ \sum _{i \in S} X_i \le (1-\delta ) \cdot \overline{\alpha }_S \cdot t' \right] \le \exp (-\Omega (\overline{\alpha }_S \cdot t')), \end{aligned}$$
(2)

where \(\overline{\alpha }_S:= \frac{1}{t'}\sum _{i \in S}\alpha _i\).

We conclude that if for the sub-sequence S of rounds in the interval from r to s, the relations \(c = {{\,\mathrm{\mathbb {E}}\,}}\left[ \sum _{i \in S} X_i\right] = \overline{\alpha }_S \cdot t'\) and \(|S| =: t' = t- c\Delta \) hold, we can derive a tail-estimate depending on the number of blocks. We can define

$$\begin{aligned} c_S := \frac{\overline{\alpha }_S t}{1+\overline{\alpha }_S\Delta } \end{aligned}$$

and assign a corresponding growth coefficient

$$\begin{aligned} \gamma _S := \frac{\overline{\alpha }_S}{1+\overline{\alpha }_S\Delta }. \end{aligned}$$

and thus except with exponentially small probability in \(t\gamma _S=c_S\), the length-increase is at least \(c_S\) for this particular interval.

For the first part of the statement, observe that \(\overline{\alpha }_S \ge \alpha _{min}\), for all subsets S, and that the function \(\frac{x}{1+kx}\), where k is a positive integer and \(x\in (0,1)\), is monotone in x. We get the guaranteed minimal growth by \(t\cdot \gamma _{min}\) in any interval of size t rounds for an honest-and-synchronized party except with negligible probability in \(t\cdot \gamma _{min}\) by taking the union bound overall all rounds r. What remains to prove is that this bound applies also to the growth of the state if one compares any two honest-and-synchronized miners which we do below (still following the proof steps of [44]).

For the second part of the statement, we generalize the above observation: if we have a guaranteed lower bound \(\tau \) on the average \(\overline{\alpha }_S\) (better than \(\alpha _{min}\) as used before) with respect to any subset of the required size within the given interval \(r,\dots ,r+t\) (note that indeed we only have a bound for the size of S in our experiments but no guarantee that a particularly “good” one is chosen), the second part of the statement follows.

Bound for any honest-and-synchronized party. By Lemma 7.10, we know that if an honest-and-synchronized miner knows some state, then within \(\Delta \) rounds, every other honest miner will be aware of that state. A similar calculation shows that the lower bound on the time to have a state increase by T blocks by all honest-and-synchronized parties follows the same law (and hence the perceived ledger speed is the same). By requiring \(s=r+t-\Delta \) above, and thus considering \(t':= t- \Delta - c\cdot \Delta = t-(c+1)\Delta \) does not change the asymptotic behavior since \(\gamma _S t - 1< \gamma _S t - \gamma _S \Delta < \gamma _S t\) for all t and S since \(\Delta \gamma _S < 1\). Hence, this additional additive term can be compensated by choosing a sufficiently small constant \(\delta \) in Eq. (2). \(\square \)

Mining limits. We state some helpful facts about bounds on the mining behavior.

Lemma 7.12

The number of successful state-extensions that happen with \(\mathcal {F}_{\textsc {StX}} ^{\Delta ,p_H,p_A}\) in any given interval of t rounds (in the real-world execution under the theorem conditions), where \(p_A=p\) and \(p_H=1-(1-p)^q\) for some \(q \ge 1\) and \(p \in (0,1)\) is bounded by \((1+\delta )\cdot t\cdot \texttt {T} _{mp}\) for any \(\delta >0\), except with probability \(\text {negl}(\texttt {T} _{mp}\cdot t)\). Consequently, for a number T of state-extensions to occur, the number of required rounds is less than \(\frac{T}{(1+\delta )\texttt {T} _{mp}}\) only with negligible probability in T. Finally, the number of adversarial state extensions in a sub-set S of t rounds is no more than \((1+\delta )\overline{\beta }_S\cdot t\) except with probability \(\exp (-\Omega (\overline{\beta }_S \cdot t))\) (for any \(\delta > 0\).

Proof

Since the state-exchange functionality evaluates each query independently, we can upper bound the number of successes of these independent Bernoulli-trials. We prove the bound for the environment \(\mathcal {Z}\) (and \(\mathcal {A}_{} \)) that makes as many queries as allowed per round (as limited by \(\beta _r\) and \(\texttt {T} _{mp}\)). As in the previous lemma, a coupling argument shows that any other query-distribution cannot induce a larger probability exceeding the given bound than \(\mathcal {Z}\), for which the query distribution is fixed. For a round, let \(X^{(r)} = \sum _i X_i\) model the sum of the involved independent trials to the state-exchange functionality. Clearly, \(\beta _r \le {{\,\mathrm{\mathbb {E}}\,}}[X^{(r)}] \le \texttt {T} _{mp}\). Let S be a set of t rounds. By linearity of expectation and invoking Theorem 2.3 we get the tail-estimate

$$\begin{aligned} \Pr \left[ \sum _{i \in S} X^{(i)} \ge (1+\delta )\cdot t \cdot \texttt {T} _{mp}\right]&\le \exp (-\Omega (\overline{\beta }_S \cdot t)) \\&\le \exp (-\Omega (\texttt {T} _{mp}\cdot t)), \end{aligned}$$

where the last step invokes the theorem assumption that \(\forall r: \beta _r \ge \rho _a \texttt {T} _{mp}\) for the relative-power coefficient \(\rho _a\).

Similarly, denote by \(Y^{(r)} = \sum _i Y_i\) the number of adversarial state-extensions in round r. Again it is sufficient to consider a maximizing \(\mathcal {Z}\) which has an expected value of \(t\cdot \overline{\beta }_S\) over a sub-set of rounds of size t. Hence, we again can obtain an estimate of the form

$$\begin{aligned} \Pr \left[ \sum _{i \in S} Y^{(i)} \ge (1+\delta )\cdot t \cdot \overline{\beta }_S\right]&\le \exp (-\Omega (\overline{\beta }_S \cdot t)). \end{aligned}$$

As a final conclusion, we observe that for any number of state blocks T, the probability that for any \(\delta >0\) it takes less than \(t=\frac{T}{(1+\delta )\texttt {T} _{mp}}\) rounds to get T state extensions is negligible in T. Consequently, for this large time interval, all tail bounds hold except with probability \(\exp (-\Omega (T))\), where the constant hidden in \(\Omega (\cdot )\) depend on \(\delta \) and on the relative-power coefficient \(\rho _a\). \(\square \)

Block withholding. From chain growth and the theorem’s condition, we derive that if an honest-and-synchronized miner adopts a new state that contains a block the adversary obtained by \(\mathcal {F}_{\textsc {StX}} \) then either this block has been published by the adversary before, or it was mined quite recently by a corrupted party.

Lemma 7.13

(Bound on Withholding strategies) In the real-world execution (under the conditions of the theorem), assume that in round r, an honest-and-synchronized miner adopts a new longer state \(\texttt {state} \). Assume there is a block \(\texttt {st} \) in this new state that was accepted upon an adversarial query to \(\mathcal {F}_{\textsc {StX}} \) and that is not part of any state adopted by any honest-and-synchronized party before round r. The probability that such a block \(\texttt {st} \) was first accepted by \(\mathcal {F}_{\textsc {StX}} \) before round \(r-\omega t\) happens only with probability \(\text {negl}(\overline{\beta }_S\cdot t)\), for any constant \(0< \omega < 1\), where S denotes the interval \(r-\omega t,\dots , r\).

Proof

Let us define \(\vec {\texttt {st}}^{(r)} = \texttt {st} _0||\dots ||\texttt {st} _{k}\) to be the state adopted by the honest-and-synchronized miner in round r as assumed in the lemma statement. Let \(\vec {\texttt {st}}^{(r')}\) be the longest prefix of \(\vec {\texttt {st}}^{(r)}\) such that \(\vec {\texttt {st}}^{(r')}\) is either the genesis block or a state newly accepted by \(\mathcal {F}_{\textsc {StX}} \) upon a query by an honest-and-synchronized party in round \(r'\le r\). Hence all the blocks in that prefix are known to at least one honest-and-synchronized party by round \(r'\). In light of the lemma statement, we consider the case that \(r-r' \ge \omega t\).

Let S denote the set of rounds from \(r'\) to r. The number of new states mined by the adversary does not exceed \((1+\delta ')\cdot \overline{\beta }_S \omega t\) (except with probability \(\text {negl}(\overline{\beta }_S \cdot t)\)) by the previous lemma.

At the same time, Eq. (1) implies that on any subset \(S'\) of size \(t'=\omega t(1-\alpha _{max}\Delta )\) the condition \(\overline{\alpha }_{S'}(1-\Delta \overline{\alpha }_{S'}) \ge (1+\delta )\overline{\beta }_S\) has to hold for some constant \(\delta \in (0,1)\). This is the case since for all \(x, \Delta > 0\), \(\frac{x}{1+x\Delta } > x(1-x\Delta )\) (and \(\texttt {T} _{mp}\ge \overline{\alpha }_{S'}\)) and this implies that \(\gamma :=\frac{\overline{\alpha }_{S'}}{1+\overline{\alpha }_{S'}\Delta } \ge (1+\delta ) \overline{\beta }_S\). Lemma 7.11 gives us a chain growth of \(|\vec {\texttt {st}}^{(r)}| - |\vec {\texttt {st}}^{(r')}| \ge (1-\delta ')\cdot \gamma \omega t\) except with probability \(\text {negl}(\overline{\beta }_S \cdot t)\).

Since all \(|\vec {\texttt {st}}^{(r)}| - |\vec {\texttt {st}}^{(r')}|\) blocks must have been mined by the adversary, we have \(|\vec {\texttt {st}}^{(r)}| - |\vec {\texttt {st}}^{(r')}| \le (1+\delta '')\cdot \overline{\beta }_S \omega t\). We get a contradiction, since now

$$\begin{aligned} (1-\delta ')\cdot \gamma \omega t \le (1+\delta '')\cdot \overline{\beta }_S\cdot \omega t, \end{aligned}$$

which, for sufficiently small \(\delta ', \delta ''\) would imply that \(\gamma < (1+\delta )\overline{\beta }_S\). \(\square \)

Chain-growth upper-bound. Our ledger also restricts the growth over time. This is based on the following observation.

Lemma 7.14

(Chain-Growth Upperbound) Consider the real-world execution (under the conditions of the theorem) and let \(P _i\) be a miner, and let \(r \ge 0\). Assume \(P _i\) is honest-and-synchronized in round r, and the longest state received or stored by \(P _i\) in round r has length \(\ell \). Then, in round \(r+t\), it holds, except with probability \(R \cdot \text {negl}(T)\), that the length of the longest state (received or stored) of at least one honest-and-synchronized miner \(P _j\) in that round has length at most \(\ell + T\) if \(t \le \frac{T}{(1+\delta )\cdot \texttt {T} _{mp}}\) for any \(\delta > 0\).

Proof

We can combine the previous observations to upper bound the number of accepted blocks. By Lemma 7.12, the number of rounds to generate T new extensions of states is at least \(t' \ge \frac{T}{(1+\delta ') \texttt {T} _{mp}}\) except with probability \(\text {negl}(T)\) (for any \(\delta '>0\)) and thus with overwhelming probability, in \(t' \le \frac{T}{(1+\delta ')\texttt {T} _{mp}}\), no more than T new blocks are mined.

In addition, we can invoke Lemma 7.13 to conclude that a new state that contains a block that the adversary is withholding since a round prior to \(r-\omega t\) is accepted by an honest-and-synchronized party only with probability \(\text {negl}(\beta _{min} t)\), for any \(0< \omega < 1\) (since \(\beta _{min}\) can be achieved in any round by an adversarial strategy and hence can serve as the lower bound in the exponent of the tail bound). Analogously to Lemma 7.12, by the definition \(\rho _a \cdot \texttt {T} _{mp}= \beta _{min}\) this error probability is thus negligible in T.

Both observations together imply that in \(t'= t(1+\omega ) \le \frac{T}{(1+\delta ')\texttt {T} _{mp}}\) rounds, no honest-and-synchronized party experiences a state increase of more than T blocks for any \(\delta '\) except with negligible probability in T. This is equivalent to the condition that \(t \le \frac{T}{(1+\omega )(1+\delta ')\texttt {T} _{mp}}\) and we can choose \(\delta '\) sufficiently small to obtain the bound with respect to \(t \le \frac{T}{(1+\delta )\texttt {T} _{mp}}\) and any given \(\delta >0\) as required by the statement. The claim follows by taking the union bound over all rounds as the arguments above hold for any round r. \(\square \)

Worst-case chain quality. We give a very coarse bound on the overall chain quality in any sequence of T blocks as follows:

Lemma 7.15

(Fraction of honest blocks) Let \(P _i\) be a miner, and let \(r \ge 0\). Assume \(P _i\) is honest-and-synchronized in round r and that the length of the longest state received or stored is \(\ell \ge T\). The fraction of adversarially mined blocks within a sequence of T blocks in the state is at most \(\min \{1, (1+\delta )\cdot \frac{\beta _{max}}{\gamma _{min}}\}\) except with probability \(R\cdot \text {negl}(T)\) for any \(\delta >0\).

Proof

Let us assume that at round r, the state adopted by miner \(P _i\) is \(\vec {\texttt {st}}_{r'} = \texttt {st} _0||\dots ||\texttt {st} _k\). We show that in any sub-sequence of T state blocks \(\texttt {st} _{j+1},\dots ,\texttt {st} _{j+T}\) in \(\vec {\texttt {st}}_{r}\), the fraction of adversarially mined blocks is bounded. Without loss of generality, one can assume that the state \(\vec {\texttt {st}}^{<j}:= \texttt {st} _0||\dots ||\texttt {st} _j\) as well as the state \(\vec {\texttt {st}}^{>j+T}:= \texttt {st} _0||\dots ||\texttt {st} _{j+T+1}\) are mined by honest-and-synchronized miners (or \(j+T\) equals the length of the state). Otherwise, one can enlarge T to meet this condition — this can only increase the fraction of adversarial blocks in the sequence of T blocks and since any state is finite and starts with the genesis block, the condition will be fulfilled for some T. We further assume that \(\vec {\texttt {st}}^{<j}\) is mined at round \(r'\), and that in round \(r'+t\), the state \(\vec {\texttt {st}}^{>j+T}\) appears for the first time as the state, or the prefix of a state, of at least one honest-and-synchronized miner. We conclude that if an adversary successfully extended the state during some round by a new state block \(\texttt {st} _{j+s}\) of the above sequence \(\texttt {st} _{j+1},\dots , \texttt {st} _{j+T}\), then this happens in a round between \(r'\) and \(r'+t\).

We now relate the number t of rounds to the number T of blocks. Since t is assumed to be the minimal number of rounds until the first honest-and-synchronized miner adopted a state containing \(\texttt {st} _{j+1}\), we can make use of the minimal chain-growth Lemma 7.11 to conclude that the probability that the condition \(t > \frac{T}{(1-\delta ')\gamma _{min}}\) occurs in such an execution is at most \(\text {negl}(T)\). We hence have \(t \le \frac{T}{(1-\delta ')\gamma _{min}}\) with overwhelming probability in T.

Similar to above, by Lemma 7.12 the time it takes to generate T blocks is at least \(t \ge \frac{T}{(1+\delta ) \texttt {T} _{mp}}\) except with probability \(\text {negl}(T)\) and thus with overwhelming probability, in \(t \le \frac{T}{(1+\delta )\texttt {T} _{mp}}\), no more than T blocks are mined.

Furthermore, also by Lemma 7.12, we get a worst-case upper bound. Let \(N_A^t\) denote the expected value in t rounds, invoking Lemma 7.12 gives us that \(N_A^t \le (1+\delta )\beta _{max}t\) except with probability \(\text {negl}(\beta _{min} t)\) (where we again use the minimum to bound the average of any interval). Hence, since \(\rho _a \cdot \texttt {T} _{mp}= \beta _{min}\) by definition it follows as in previous lemmata that the bound holds except with probability \(\text {negl}(T)\).

Putting things together, we conclude that except with negligible probability in T, the number of times the adversary was successful in extending the state by one block is upper bounded by the quantity

$$\begin{aligned} N_A^{\frac{T}{(1-\delta ')\gamma }} \le \frac{1+\delta }{1-\delta '} \cdot T \cdot \frac{\beta _{max}}{\gamma _{min}}. \end{aligned}$$

Hence, the fraction of adversarial blocks within T consecutive blocks cannot be more than \(f = \min \{1, (1+\delta '')\frac{\beta _{max}}{\gamma _{min}}\}\) for any \(\delta ''\) and sufficiently small constants \(\delta ,\delta ' >0\), except with negligible probability in the length T of the sequence.

Since our arguments hold for any interval, the proof is concluded by taking the union bound over the number of such sequences (which is in the order of number of rounds). \(\square \)

Consistency (common prefix). We now state the lemma on the common-prefix property in our setting.

Lemma 7.16

(Consistent states) Consider the real-world execution under the condition of the main theorem. Let \(P _i\) and \(P _j\) be miners (potentially the same), and let \(r' \ge r \ge 0\). Assume \(P _i\) is honest-and-synchronized in round r, and \(P _j\) is honest-and-synchronized in round \(r'\). Assume that the length of the longest state received or stored by \(P _i\) in round r is \(\ell \ge T\). Then, the \(\ell -T\)-prefix of that longest state of \(P _i\) in round r is identical to the \(\ell -T\)-prefix of the state of \(P _j\) stored or received in round \(r'\) except with probability \(R \cdot \text {negl}(T)\).

Proof

We again follow the basic line of reasoning in [44] and adapt the appropriate arguments to our setting. First, since an inconsistency at round r implies an inconsistency at round \(r'>r\), if the claim is proven for the case \(r \le r' \le r+1\), then by an inductive argument, the claim holds for any \(r'\ge r\).

The protocol mandates that the honest-and-synchronized miners truncates the T newest blocks from the current respective state. Thus, we need to argue that the block which is \(T+1\) far away from the head will be part of any state output by any honest-and-synchronized miner. Suppose we are at round \(r'\) in the protocol, then the time it takes to generate the last T blocks is at least \(t \ge \frac{T}{(1+\delta ) \texttt {T} _{mp}}\) except with negligible probability in T as established in Lemma 7.12 and any \(0< \delta < 1\).

Looking ahead, we will eventually conclude that with overwhelming probability within the interval of rounds \(s = r-t, \dots , r'\in \{r,r+1\}\) (where \(r\ge t\)), the honest-and-synchronized miners have an opportunity to agree on a common state and hence at round \(r'\), they will still agree on a large common prefix of the current state at round \(r'\).

In the interval of rounds, let this set be denoted as usual by S, between round s and round \(r'=r\), the expected number of rounds, where at lest one honest-and-synchronized miner is successful, is at least \(\overline{\alpha }_S t\). Thus, again by a standard Chernoff bound, the probability that the number of these successful rounds is smaller than \(\bar{q}_{min}:= (1-\delta ')\cdot \overline{\alpha }_S t\) is no more than \(\exp (-\Omega (t\overline{\alpha }_S))\) in the real-world UC random experiment. Again, a coupling argument as in Lemma 7.11 yields that this tail-bound (where the environment activates the least number of parties possible and hence the random variables that describe the success are independent) applies to any environment. Finally, the conditions of the theorem in particular assure that \(\overline{\alpha }_S > \beta _{min}\) and hence this probability can be upper bounded by \(\text {negl}(\beta _{min} t)\).

Unfortunately, the “race” between the good guys and the bad guys is not yet conclusively analyzed, since the mere superiority of honestly mined blocks does not imply that the honest parties will reach agreement. In particular, not all of the expected honestly mined blocks are equally useful to obtain a so-called convergence opportunity. In particular, we need to know how many of the honestly mined blocks happen in isolated, sufficiently silent intervals.

Formally, let us introduce the random variable \(R_i\) that measures the number of elapsed round between successful round \(i-1\) and successful round i in the real-world UC execution, where \(R_1\) measures the number of elapsed rounds to the first successful round. Based on \(R_i\), the random variable \(X_i\) is defined as follows: \(X_i = 1\) if and only if \(R_i > \Delta \) and exactly one honest-and-synchronized miner mines a new state (i.e., successfully appends a new block to the state) in the ith successful round.

Let \(E^i_1\) be the event that there is at least one successful round in the interval of \(\Delta \) rounds starting after successful round \(i-1\) (or at the onset of the experiment). Let \(E^i_2\) be the event that strictly more than one miner is successful in the following successful round i.

Overall, our goal is to suitably bound the number of blocks that prevent those events of “success & silence” (i.e., bound the probability of the event \(X_i=0\)) in an interval of t rounds. We call these the undesirable blocks. They have to be infrequent enough such that in combination with adversarially mined blocks, they do not prevent too many convergence opportunities. We hence need to suitably bound the occurrence of the above two bad events \(E^i_j\) in our experiment.

By a union bound, and invoking that \(\alpha _r \le \texttt {T} _{mp}\), we directly have that \(\Pr [X_i=0] = \Pr [E^i_1 \cup E^i_2] \le \Delta \texttt {T} _{mp}+ \texttt {T} _{mp}\), hence, on the positive side, \(\Pr [X_i=1] \ge 1-\texttt {T} _{mp}(\Delta +1)\).

Let \(X:= \sum _{i=1}^{\bar{q}_{min}} X_i\), and let us define \(\bar{q}_{min}':= (1-\delta '')\cdot (1-\texttt {T} _{mp}(\Delta +1)) \cdot \bar{q}_{min}\). Since by Eq. (1) the term \(1-2(\Delta +1)\texttt {T} _{mp}\) must be positive, we have that \(\texttt {T} _{mp}(\Delta +1) \le \frac{1}{2}\) and, because \(\mathcal {F}_{\textsc {StX}} \) treats each new state-submission independently of previous submission, we conclude that \(\Pr [X_i=1 \,|\, X_1,\dots ,X_{i-1}] \ge \frac{1}{2}\). Since we do not argue here about any particular optimal strategy by an environment-adversary pair \((\mathcal {Z},\mathcal {A}_{})\), we need to invoke Lemma 7.17 from which we get

$$\begin{aligned} \Pr [X \le \bar{q}_{min}'] \le \exp \left( -(\delta '')^2 \bar{q}_{min}/2\right) . \end{aligned}$$
(3)

To express this w.r.t. \(\beta _{min}\), observe that not only \(\alpha _r > \beta _r\) (and thus \(\alpha _{min} > \beta _{min}\)) by Eq. (1) but also there is an actual constant \(0< \widehat{\delta } < 1\) such that \(\texttt {T} _{mp}(\Delta + 1) < 1-\widehat{\delta }\). This is true since by the theorem condition we deduce that

$$\begin{aligned}&(1-2(\Delta +1)\texttt {T} _{mp}) \ge \lambda (\beta _{min}/\alpha _{min}) \\&\implies 1-\lambda (\beta _{min}/\alpha _{min}) \ge 2(\Delta +1)\texttt {T} _{mp}> (\Delta +1)\texttt {T} _{mp}. \end{aligned}$$

And since \(\lambda > 1\), i.e., we get can bound the constant by \(0<\widehat{\delta } < \lambda (\beta _{min}/\alpha _{min})\) and obtain

$$\begin{aligned} (1-\texttt {T} _{mp}(\Delta +1)) \cdot \bar{q}_{min}> \widehat{\delta }(1-\delta ')\cdot \overline{\alpha }_S t > \widehat{\delta }(1-\delta ')\cdot \overline{\beta _S} t. \end{aligned}$$

And hence conclude by Eq. (3) that \(\Pr [X \le \bar{q}_{min}'] \le \exp (-\Omega (\beta _{min} t))\). We thus have a (high-probability) lower bound on the number of silent patterns.

We are actually interested in the number of times that \(X_i = X_{i+1} = 1\). This situation, as already introduced above, means that we have a situation, in which for \(\Delta \) rounds, no miner is successful, then exactly one honest-and-synchronized miner is successful, and afterward, we again have \(\Delta \) rounds of silence. This is denoted in [44] as a convergence opportunity. For example, a convergence opportunity has the desirable property, that at the end of such an opportunity, if the adversary is unable to provide a longer state to the honest-and-synchronized miners during this period, all honest-and-synchronized miners will reach an agreement on the current longest state. Thus, in order to prevent this, an adversary needs to be successful in mining roughly at the rate of the number of convergence opportunities within t rounds.

We have already seen that with overwhelming probability, there are at least \(\bar{q}_{min}\) successful rounds, and among which \((\bar{q}_{min}-\bar{q}_{min}')\) can disturb convergence opportunities. Since a single disturbing round can at most prevent two convergence opportunities (it violates the condition for a convergence opportunity with its neighbors in the sequence \(X_1,\dots ,X_k\)), the number of effective convergence opportunities C is lower bounded (except with negligible probability) by

$$\begin{aligned}&C \ge \bar{q}_{min} - 2(\bar{q}_{min}-\bar{q}_{min}') = 2 \bar{q}_{min}' - \bar{q}_{min} \\&\ge (1-\delta ')\overline{\alpha }_St[1-2\texttt {T} _{mp}(\Delta +1)-2\delta '']. \end{aligned}$$

For any constant \(\epsilon \), by picking \(\delta '\) and \(\delta ''\) sufficiently small, this yields a bound (except with negligible probability as derived above) of

$$\begin{aligned} C > (1-\epsilon )(1-2\texttt {T} _{mp}(\Delta +1))\overline{\alpha }_S t. \end{aligned}$$

The final argument is a counting argument. Let us denote by \(\mathcal {S}_{r'}\) the set of maximal states known to \(\mathcal {F}_{\textsc {StX}} \) at round \(r'\) (i.e., any path from the root to some a leaf) of length at least \(\ell + C\), where \(\ell \) is the length of the longest state known to at least one honest-and-synchronized miner at round s. Note that \(\mathcal {S}_{r'}^{\ell +C}\) is non-empty: since each convergence opportunity increases the length by at least one, and before each successful round, there is a period of \(\Delta \) rounds where no honest miner mines a new state, there has to exist at least one state with length at least \(\ell +C\) at round \(r'\).

Assume that the number of successful state extensions made by the adversary (to \(\mathcal {F}_{\textsc {StX}} \)) between round s and \(r'\) is \(T_{\mathcal {A}_{}} < C\). Then, by the pigeonhole principle, for all \(\vec {\texttt {st}} \in \mathcal {S}_{r'}\), it holds that there is at least one block \(\texttt {st} _k\), such that functionality \(\mathcal {F}_{\textsc {StX}} \) is successfully queried by exactly one honest-and-synchronized miner \(P \) in round i to extend the state to length \(k+1\), but no query by the adversary to extend a state of length k to a state of length \(k+1\) has been successful up to and including round \(r'\). Even more, \(T_{\mathcal {A}_{}} < C\) implies that such an i has to exist that also constitutes a convergence opportunity.

After this convergence opportunity at round i, all honest-and-synchronized miners have a state whose first \(k+1\) blocks are \(\vec {\texttt {st}}_i = \texttt {st} _0\dots ,\texttt {st} _k\). Unless the adversary provides an alternative state with a prefix \(\vec {\texttt {st}}_i'\) of length \(k+1\), such that \(\texttt {st} _l' \ne \texttt {st} _l\) for at least one index \(0< l \le k\), no honest-and-synchronized miner will ever mine on a state whose first \(k+1\) blocks do not agree with \(\vec {\texttt {st}}_i\).

The existence of an alternative prefix \(\vec {\texttt {st}}_i'\) of length \(k+1\) for any such i and for all states \(\vec {\texttt {st}} \in \mathcal {S}_{r'}^{\ell +C}\) implies \(T_{\mathcal {A}_{}} \ge C\) and therefore contradicts the assumption that \(T_{\mathcal {A}_{}} < C\).

What is left to prove is that for any such interval of size t (from round s to round \(r'\)), the probability that \(T_{\mathcal {A}_{}} < C\) holds in any real-world execution except with negligible probability in \(\beta _{min} t\). Analogously to Lemma 7.12, by the definition \(\rho _a \cdot \texttt {T} _{mp}= \beta _{min}\) (and recalling that we established a lower bound on t in the beginning) we get that this error probability is negligible in T.

First, by Lemma 7.13, for any \(\omega > 0\), the probability that any new state accepted by an honest-and-synchronized miner during the period of at most \(t+1\) rounds (from s to \(r'\)) is actually a state extension that the adversary withheld since round \(s-\omega (t+1)\) (or even before) is at most \(\text {negl}(\beta _{min} t)\). By Lemma 7.12, the number of adversarial blocks (i.e., successful state extensions by \(\mathcal {A}_{} \)) generated within this slightly larger interval \(S'\) of size \(|S'|=(1+\omega ) (t+1)\) rounds is (except with probability \(\text {negl}(\beta _{min} t)\)) upper bounded by \(T_{\mathcal {A}_{}} \le (1+\delta )(1+\omega ) (t+1)\overline{\beta }_{S'}\). Also by picking constant \(\omega \) sufficiently small, we have that \(|S| \ge (1-\alpha _{max}\Delta )|S'|\) and thus \(\overline{\alpha }_S\) dominates \(\overline{\beta }_{S'}\) by the theorem assumptions. We hence get \(T_{\mathcal {A}_{}} \le \frac{(1+\delta )(1+\omega )}{\lambda } (t+1)\overline{\alpha }_S \cdot (1-2\texttt {T} _{mp}\cdot (\Delta +1))\) by Eq. (1). By picking the constants \(\delta \) and \(\omega \), and \(\epsilon \) sufficiently small relative to \(\lambda \), we hence get \(T_{\mathcal {A}_{}} < C\) except with probability \(\text {negl}(\beta _{min} t)\). Since our arguments hold for any particular intervals S, we again apply the union bound over the number of rounds and get the desired claim. \(\square \)

We used the following useful lemma in the previous proof to bound the deviation with respect to an arbitrary environment (inducing a certain sequence of random variables):

Lemma 7.17

Let \(\tau \ge \frac{1}{2}\) and consider boolean random variables \(X_1, \ldots , X_n\) for which it holds that \(\Pr [X_i=1 \,|\, X_1,\dots ,X_{i-1}] \ge \tau \). Then, for any \(\delta > 0\),

$$\begin{aligned} \Pr \left[ \sum _{i=1}^n X_i \le (1 - \delta ) \tau n\right] \le \exp \left( -\delta ^2 n/2\right) . \end{aligned}$$

Proof

We define the random variables \(Y_k:= \sum _{i=1}^k (X_i - \tau ) = (\sum _{i=1}^k X_i) - k\tau \). First, they satisfy the sub-martingale condition, i.e., for all k, \({{\,\mathrm{\mathbb {E}}\,}}[Y_k\,|\, Y_1, \ldots Y_{k-1}] \ge Y_{k-1}\): let \(\Pr [Y_k = y_{k-1} + (1-\tau ) \, |\, Y_{k-1}= y_{k-1}] = \Pr [X_k = 1 \,|\, X_1,\dots ,X_{k-1}] =:p_1 \ge \tau \) and \(\Pr [Y_k = y_{k-1} + (-\tau ) \, |\, Y_{k-1}= y_{k-1}] = \Pr [X_k = 0 \,|\, X_1,\dots ,X_{k-1}]:= p_0 \le 1-\tau \). The (conditional) expected value is \(p_1(y_{k-1}+(1-\tau )) + p_0(y_{k-1}-\tau ) \ge y_{k-1} + p_1(1-\tau ) - p_0\tau \ge y_{k-1} + [\tau (1-\tau ) - (1-\tau )\tau ] = y_{k-1}\).

Second, we have a bounded difference of \(|Y_k - Y_{k-1}| \le \max (\tau , 1-\tau ) = \tau \) by the condition \(\tau \ge 1/2\). Applying the Azuma-Hoeffding bound given by Theorem 2.4 to the variables \(Y_k\) gives

$$\begin{aligned} \Pr [Y_n \le -\delta \tau n] \le \exp (-\delta ^2n/2). \end{aligned}$$

And by definition \(Y_n \le -\delta \tau n \leftrightarrow X_n \le n\tau - n\delta \tau \), the statement follows. \(\square \)

Concluding observations. Finally, we conclude the proof by noting that after a delay of \(\Delta \) rounds, all honestly multicast transactions are known to all honest-and-synchronized miners and would be included into the next honestly minded block if valid. In the simulation, the simulator also does it in the ideal world and hence will never propose blocks of honest parties that do not comply with the conditions of the defined \(\textsf {ExtendPolicy} \) of . Further, the synchronization of a party takes at most \(\texttt {Delay} =4\Delta \) clock ticks: if \(P _j\) joins the network, his knowledge of the longest chain and the set of valid transactions relative to that state, which is known to at least one honest and synchronized miner is only reliable after \(2\Delta \) rounds (\(4\Delta \) clock ticks) since it takes at most \(\Delta \) rounds to multicast the initial message that the miner has joined the network, and additional \(\Delta \) rounds until the replies are received. During this \(2\Delta \) round the new miner will also have received all messages sent at or after he joined the network, and in particular all transactions that are more than \(\Delta \) rounds (\(2\Delta = \frac{\texttt {Delay}}{2}\)) old and potentially valid.

The pointers of honest-and-synchronized parties can also not be too distant, i.e., the slackness is upper bounded by \(\texttt {windowSize} \ge T\) as otherwise we would have a common-prefix violation in that execution (assume the prefix of the chain known to a honest-and-synchronized party was further away than T blocks from the prefix of the actual longest chain, this would yield a fork with substantial probability). The theorem follows. \(\square \)

7.5 Improving the Chain-Quality Parameter

As long as \(\alpha _{min} > \beta _{max}\), we see that among \(\texttt {windowSize} \) state blocks, there is at least an honestly generated block, because then, by Eq. (1), we also have \(\gamma _{min} > \beta _{max}\) and thus \(\frac{\beta _{max}}{\gamma _{min}} < 1\). Such an assumption is usually taken in existing analyses. However, we can derive more general bounds for chain-quality (where the above case is one special case) to obtain bounds for more general scenarios. In light of the chain-growth statement in Lemma 7.11, we introduce the following useful quantity:

Definition 7.18

Let the mining pattern be \((\vec {\alpha },\vec {\beta })\) for R rounds, let the network delay be \(\Delta \), and let S be an interval. Define

$$\begin{aligned} cg_\Delta (S)&:= \max \{\tau \in (0,1) \, | \, \forall S'\subseteq S \text { with } |S'|\ge \max \{1, |S|(1-\Delta \cdot \gamma (\tau ,\Delta ))\} \, : \, \overline{\alpha }_{S'}\\&\qquad \ge \tau \}; \end{aligned}$$

and define the fraction

$$\begin{aligned} f_{cq} := \max _{S \subseteq \{0,\dots ,R-1\}} \frac{\overline{\beta }_S}{\gamma (cg_\Delta (S),\Delta )}. \end{aligned}$$

Both quantities are well-defined as functions since we assume that \(\forall r: \alpha _r > 0\). We derive a more general worst-case guarantee for the fraction of adversarial blocks which in particular shows that this fraction is less than one under the theorem condition.

Lemma 7.19

(Generalization of Lemma 7.15) Consider a real-world execution as in Theorem 7.9. Let \(P _i\) be a miner, and let \(r \ge 0\). Assume \(P _i\) is honest-and-synchronized in round r and that the length of the longest state received or stored is \(\ell \ge T\). The fraction of adversarially mined blocks within a sequence of T blocks in the state is at most \(\min \{1, (1+\delta )\cdot f_{cq}\}\) except with probability \(R\cdot \text {negl}(T)\) for any \(\delta >0\) and where \(f_{cq}\) is defined as in Definition 7.18. Under the condition of Theorem 7.9, this means that for the ledger , we can guarantee

$$\begin{aligned} \eta \ge \min \{(1+\delta )\cdot f_{cq}, \, 1\}, \end{aligned}$$

with \(f_{cq} < 1\) (and for any \(\delta >0\)).

Proof

The proof proceeds as the one of Lemma 7.15: consider any sub-sequence of T state blocks \(\texttt {st} _{j+1},\dots ,\texttt {st} _{j+T}\) in \(\vec {\texttt {st}}_r\). We again assume that \(\vec {\texttt {st}}^{<j}\) is mined at round \(r'\) (by an honest-and-synchronized party), and that in round \(r'+t\), the state \(\vec {\texttt {st}}^{>j+T}\) appears for the first time as the state, or the prefix of a state, of at least one honest-and-synchronized miner. Recall that if an adversary successfully extended the state during some round by a new state block \(\texttt {st} _{j+s}\) of the above sequence \(\texttt {st} _{j+1},\dots , \texttt {st} _{j+T}\), then this happens in a round between \(r'\) and \(r'+t\). Let us denote this interval by the set S of rounds.

Since t is assumed to be the minimal number of rounds until the first honest miner adopted a state containing \(\texttt {st} _{j+1}\), we can actually make use of the general part of Lemma7.11 to conclude that the probability that the condition \(t \ge \frac{T}{(1-\delta ')\gamma (cg_\Delta (S),\Delta )}\) occurs in such an execution is at most \(\text {negl}(T)\) and obtain \(t \le \frac{T}{(1-\delta ')\gamma (cg_\Delta (S),\Delta )}\) with overwhelming probability in T. On the other hand, the lower bound on t is as in the proof of Lemma 7.15.

Let again \(N_A^t\) denote the expected value of adversarial blocks in t rounds, invoking Lemma 7.12 gives us that \(N_A^t \le (1+\delta )\overline{\beta }_St\) except with probability \(\text {negl}(\overline{\beta }_{S} t)\).

The number of times the adversary was successful in extending the state by one block can therefore be upper bounded by the quantity

$$\begin{aligned} N_A^{\frac{T}{(1-\delta ')\gamma }} \le \frac{1+\delta }{1-\delta '} \cdot T \cdot \frac{\overline{\beta }_{S}}{\gamma (cg_\Delta (S),\Delta )}. \end{aligned}$$

Since our arguments hold for any interval, the proof is concluded by taking the worst case over all rounds and the maximal fraction equals \(f_{cq}\) as claimed.

To establish the last part of the statement, we observe that Eq. (1) in particular implies that for any interval S (of sufficient size), we have that any subset \(S'\) of rounds of size \((1-\alpha _{max}\Delta )|S|\) fulfills \(\overline{\alpha }_{S'}(1-\texttt {T} _{mp}\Delta ) > (1+\epsilon )\overline{\beta }_S\) for some \(\epsilon > 0\). Since a lower bound x for \(\overline{\alpha }_{S'}\) over all subsets of size \((1-\alpha _{max}\Delta )|S|\) implies that x is also a lower bound for any larger subset \(S''\) and hence for \(cg_\Delta (S)\). Observing that for \(x, \Delta > 0\), \(\frac{x}{1+x\Delta } > x(1-x\Delta )\) and \(\texttt {T} _{mp}\ge cg_\Delta (S)\), we get \(\gamma (cg_\Delta (S),\Delta ) > \overline{\beta }_S\) as required to conclude that \(f_{cq} < 1\). \(\square \)

8 Special Cases of our Model and Functionality Wrappers

In this section, we first explain how our main theorem relates to the influential initial provable security analyses of Bitcoin. Afterward, we show how to use functionality wrappers to enforce the main theorem’s conditions in order to obtain composable statements (i.e., with respect to all environments).

8.1 Special Cases and Existing Works

We demonstrate how the protocols, assumptions, and results from the two existing works analyzing security of Bitcoin (in a property based manner) can be cast as special cases of our construction. We focus on the early analyses of Pass et al. [44] (PSs for short) and the original analysis of Garay et al. [24] (GKL for short). Both initial models assume a fixed upper bound n on the number of active participants in the protocol execution. All honest parties are assumed to be synchronized (e.g., by special initialization messages by the environment).

GKL analysis (fixed difficulty and delay). We start with the result in [24], in particular with the so-called flat (every party has the same hashing power) and synchronous model with next-round delivery. The relevant variables are defined as follows:

  • Each party is allowed to perform \(q \ge 1\) hash queries. This translates to a success probability of \(p_H= 1 - (1-p)^q\) and \(p_A= p\), and to a total mining power \(\texttt {T} _{mp}^{\textrm{GKL}}:= p \cdot q \cdot n\).

  • The adversary gets (at most) q queries per corrupted party with probability \(p_A= p\) (there are no desynchronized parties). If \(t_r\) denotes the number of corrupted parties in round r, the expected value would be \(t_r\cdot q \cdot p\), and thus, we can define the upper bound on the adversarial mining power \(\beta ^{\textrm{GKL}}_{\textrm{max}} =p \cdot q \cdot (\rho \cdot n)\), where \(\rho n\) is the (assumed) upper bound on the number of miners contributing to the adversarial mining power (independent of r). Since the adversary is free to go to the limit in the model, the mining pattern is also flat: \(\vec {\beta }=(\beta ^{\textrm{GKL}}_{\textrm{max}},\dots ,\beta ^{\textrm{GKL}}_{\textrm{max}})\).

  • Each honest and synchronized miner gets exactly one activation per round and has success probability \(p_H= 1-(1-p)^q \in (0,1)\), for some integer \(q>0\) and hence we get a minimal honest mining power of \(\alpha ^{\textrm{GKL}}_{\textrm{min}} = 1-(1-p)^{q(1-\rho )\cdot n}\) (independent of r). Note that since n is assumed to be fixed in their model, \(q(1-\rho )\cdot n\) is in fact a lower bound on the honest and synchronized hashing power. Since the model assumes that this lower bound could potentially always be allowed, we again define the flat mining pattern \(\vec {\alpha } = (\alpha ^{\textrm{GKL}}_{\textrm{min}},\dots ,\alpha ^{\textrm{GKL}}_{\textrm{min}})\).

  • If instant delivery is assumed, this translates to defining \(\Delta ^{\textrm{GKL}}:=1\), i.e., guaranteed delivery in the next round.

PSs analysis (fixed difficulty). Similarly, we can instantiate the above values with the assumptions of [44]:

  • For each corrupted party, the adversary gets at most one query per round. Each honest miner makes exactly one query per round. In total, there are n parties among which \(\rho n\) can be corrupted (in any round).

  • In the PSs model, \(p_H= p_A= p\) and hence \(\texttt {T} _{mp}^{\textrm{PSs}} = p\cdot n\). With these values we get \(\beta ^{\textrm{PSs}}_{\textrm{max}} = p \cdot (\rho \cdot n)\). Putting things together, we also have \(\alpha ^{\textrm{PSs}}_{\textrm{min}} = 1-(1-p)^{(1-\rho )\cdot n}\), where \((1-\rho )\cdot n\) is the lower bound on the honest (and hence also synchronized) parties. As before, the mining pattern is flat.

  • The delay of the network is upper bounded by a constant \(\Delta ^{\textrm{PSs}}\) (as usual, unknown to the participants).

The security is established by the following lemma:

Lemma 8.1

For the special settings above, if we impose the assumption that

$$\begin{aligned} \alpha ^{\{\textrm{GKL},\, \textrm{PSs}\}}_{\textrm{min}} \cdot (1-2\cdot (\Delta ^{\{\textrm{GKL},\, \textrm{PSs}\}}+1)\cdot \alpha ^{\{\textrm{GKL},\, \textrm{PSs}\}}_{\textrm{min}}) \ge \lambda \cdot \beta ^{\{\textrm{GKL},\, \textrm{PSs}\}}_{\textrm{max}} \end{aligned}$$
(4)

then this implies the secure realization of the Bitcoin ledger with the parameters assured by Theorem 7.9 for the above choices of values, respectively.

Proof Sketch

The statement of course follows from the arguments given in the respective works [24] and [44] since our execution model in particular allows us to formulate the above assumptions. However, it is instructive to see how the security follows in view of Theorem 7.9. In particular, why security follows when replacing the condition in Eq. (1) by Eq. (4). At first sight, the condition is stronger as it implies that the best strategy of the adversary is dominated by the worst strategy of the honest players. However, the discount factor \((1-2\cdot (\Delta ^{\{\textrm{GKL},\, \textrm{PSs}\}}+1)\cdot \alpha ^{\{\textrm{GKL},\, \textrm{PSs}\}}_{\textrm{min}})\) is better than \((1-2\cdot (\Delta ^{\{\textrm{GKL},\, \textrm{PSs}\}}+1)\cdot \texttt {T} _{mp}^{\textrm{GKL},\textrm{PSs}})\). The key observation why Eq. (4) subsumes Eq. (1) in the special cases described above are the following:

  • Since the number n of parties is fixed and exactly divided into honest and adversarial, and because the worst-case honest strategy still dominates the adversary’s best strategy, we can use to following argument to justify why Eq. (4) is actually sufficient. Still, the best strategy of the adversary is to activate as many corrupted parties, say t, as allowed by the upper bound \(\beta _{max}\). Since the number of parties is fixed, this implies that at most \(n-t\) activations of honest parties remain and by definition \(\alpha _{min}=1-(1-p)^(n-t)\) is the matching lower bound. Hence, and in contrast to the more general setting, here the best strategy for corrupted parties induces a concrete strategy for honest parties.Footnote 13 A bit more formally, let x denote the number of queries such that \(\alpha _{min} = 1-(1-p)^x\) holds. Assume in some round r, more honest parties are activated, say \(q_H^r\). By definition, \(\beta _{max} \ge p \cdot (n-x)\) and we can formally assign the difference \((q_H^r-x)\) to the adversary’s budget (and the condition \(\alpha _{min} > \beta _{max}\) is preserved as stated below). First, observe that for integers \(x,y>1\),

    $$\begin{aligned} \alpha _r - \alpha&= (1-(1-p)^{x+y}) - (1-(1-p)^x) = (1-p)^x - (1-p)^{x+y} \\&=(1-p)^x\cdot (1-(1-p)^y) \le (1-(1-p)^y) \le (1-(1-y\cdot p)) \\&= y\cdot p, \end{aligned}$$

    where the last inequality is a consequence of Bernoulli’s inequality. The adversary’s mining power is thus increased, however not beyond \(\beta _{max}\) since the identity \(n-x = (n-q_H^r)+(q_H^r-x)\) is guaranteed because n and x are fixed for the analysis.

  • Looking at the proof of Theorem 7.9, we see that the quantities \(\overline{\alpha }_S\) and \(\overline{\beta }_S\) can be identified by \(\alpha _{min}\) and \(\beta _{max}\), respectively, and in addition the relationship \(\alpha _{min} > \beta _{max}\) is implied by Eq. (4) (and thus \(\overline{\alpha }_S > \overline{\beta }_S\) for any subset S of rounds of any size. With this, all Lemmata in the proof of Theorem 7.9 simplify and no further condition in addition to Eq. (4) is needed.

With this in mind, replacing the condition in Eq. (1) by Eq. (4) the proof of Theorem 7.9, under the conditions imposed by the above models, yields the statement of the lemma. \(\square \)

8.2 Restrictions and Composition

Note that the theorem statement a-priori holds for any environment (but simply yields a void statement if the conditions are violated). In order to turn this into a composable statement without restrictions, we follow the approach proposed in Sect. 3 and model restrictions in the setup of the protocol via wrapper functionalities. The general conceptual principle behind this is the following: For the hybrid world, that consists of a network , a clock and a random oracle \(\mathcal {F}_{\textsc {RO}} ^\kappa \) with output length \(\kappa \) (or alternatively the state-exchange functionality \(\mathcal {F}_{\textsc {StX}} \) instead of the random oracle), define a wrapper functionality \(\mathcal {W}\) which enforces a given mining pattern \((\vec {\alpha },\vec {\beta })\) (and the upper bounds on the mining power). If the conditions of Theorem 7.9 are met, then we get a UC-realization statement with respect to all (efficient) environments.

A general wrapper. We define such a general wrapper for our setting and denote it by in Fig. 10. Note that this wrapper slightly changes the synchrony pattern of the real-world execution: since a lower bound on honest mining power is enforced (otherwise, the clock does not go on), we realize the ledger with a slightly different predicate \(\textsf {predict-time} _{BC}\) to reflect this assumption. It is easy to see that this is a straightforward extension to the derivation in Lemma 5.2. We note that this change to the synchronization pattern just stems from the fact how we implement such restricting assumptions but does not affect other modeling decisions and the we still realize the ledger (this is actually a major motivation to abstract the time-dependency of the ledger using such an abstract predicate, such that minor details have only local effects). For this wrapper we have the following desired corollary to Theorem 7.9 and Lemma,7.2. This statement is guaranteed to compose according to the UC composition theorem.

Corollary 8.2

The protocol \(\textsf {Ledger-Protocol} _{q,\texttt {D},T}\), defined in the -hybrid world, UC-realizes functionality (for the parameters established by Theorem 7.9 and the extended predicate \(\textsf {predict-time} _{BC}\) as described above) if the parameters of the wrapper (and thus formally enforced by the setup-functionality of the protocol), satisfy equation (1).

Fig. 10
figure 10

The wrapper that restricts the adversarial access to the real-world resources

Remark

It is straightforward to design different wrappers capturing a range of assumptions that one might want to make (and which imply the conditions of Theorem 7.9), such as an explicit restriction on number of active participants etc. An additional interesting observation is that if Bitcoin did require that block extensions are accompanied by their hash explicitly, i.e., report \(y:= H[\textbf{B} _i]\) along with a new block \(\textbf{B} _i\) (and honest parties consequently ignored a received block that is not paired with the right hash value), then the wrapper could be simplified. As the adversary would not be able to abuse honest verifications as actual (adversarial) work, the wrapper would not have to consider the special case of adversarial multicast-messages in Fig. 10 when computing the budget.

As a final remark, we point out that the wrapper enforces that honest and adversarial attempts at solving PoW-puzzles (modeled via evaluations of the RO) are made in tandem, that is, the adversary cannot consume the budget of future rounds ahead of time. This also means that, effectively, the execution starts at the same time for everyone. This is one of two ways how to render pre-computation attacks ineffective, the other one being to define the start of the execution to be the point when a random genesis block is published—by an additional hybrid functionality as done for example in [6] in the PoS case.

9 Modular Constructions Based on the Ledger

The ledger functionality can be enhanced in a modular way in various directions. In this section, we show a simple extension to strengthen liveness thanks signatures. Informally, the stronger guarantee ensures that every transaction submitted by an honest participant will eventually make it into the state. In this section we present this stronger ledger and show how such an implementation can be captured as a UC protocol which makes black-box use of the \(\textsf {Ledger-Protocol}\) to implement this ledger. The UC composition theorem makes such a proof immediate, as we do not need to think about the specifics of the invoked ledger protocol, and we can instead argue security in a hybrid world with access to .

Protection of transactions using addresses. In Bitcoin, a participant creates a unique address denoted by \(\textsf{AddrID}\) by generating a signature key pair and hashing the public key. Any transaction of this party includes this address, i.e., \(\texttt {tx} = (\textsf{AddrID}, \texttt {tx} ')\). An important property is that a transaction of a certain address cannot be invalidated by a transaction with a different address ID. Hence, to protect the validity of a transaction, upon submitting \(\texttt {tx} \), party \(P _i\) has to sign it, append the signature and verification key to get a transaction \(((\textsf{AddrID}, \texttt {tx} '), vk , \sigma )\). The validation predicate now additionally has to check that the address is the hash of the public key and that the signature \(\sigma \) is valid with respect to the verification key \( vk \). Roughly, an adversary can invalidate \(\texttt {tx} \), only by either forging a signature relative to \( vk \), or by possessing key pair whose hash of the public key collides with the address of the honest party.

The realized ledger abstraction, denoted by , is a ledger functionality as the one from the previous section, but which additionally allows parties to create unique addresses. Upon receiving a transaction from party \(P _i\), only accepts a transaction containing the \(\textsf{AddrID}\) that was previously associated to \(P _i\) and ensures that parties are restricted to issue transactions using their own addresses. As we explain, this amplifies transaction liveness.

9.1 A Stronger Ledger with Address Management

9.1.1 Overview and Definitions

To achieve stronger guarantees than our original Bitcoin ledger, a party issues transactions relative to an address. More abstractly speaking, a transaction contains an identifier, \(\textsf{AddrID}\), which can be seen as the abstract identity that claims ownership of the transaction. More specifically, we can represent this situation by having transactions \(\texttt {tx} \) be pairs \((\textsf{AddrID}, \texttt {tx} ')\) with the above meaning. Signatures enter the picture at this level: an honest participant will issue only signed transactions. In order to link verification key to the address, \(\textsf{AddrID}\) is the hash of the verification keys, where we require collision resistance. More concretely, whenever a miner is supposed to submit a transaction \(\texttt {tx} \), it signs it and appends the signature and its verification key. The validation consists of three parts. First, it is verified that the public key matches the address, second, the signature is verified, and third, its validated whether the actual transaction \((\textsf{AddrID},\texttt {tx} ')\) is valid, with respect to a separate validation predicate on states and transactions \(\texttt {tx} \) of the above format. Only if all three tests succeed, the transactions is valid. We make use of an existentially unforgeable digital signature scheme and recall its definition here:

Definition 9.1

A digital signature scheme \(\textrm{DSS}:= (\textsf{Gen}, \textsf{Sign}, \textsf{Ver})\) for a message space \(\mathcal {M}\), signature space \(\mathcal {S}\), and key space \(\mathcal {K}= \mathcal{S}\mathcal{K}\times \mathcal{P}\mathcal{K}\) consists of a (probabilistic) key generation algorithm \(\textsf{Gen}\) that returns a key pair \(( sk , vk ) \in \mathcal {K}\), a (possibly probabilistic) signing algorithm \(\textsf{Sign}\), that given a message \(m\in \mathcal {M}\) and the signing key \( sk \in \mathcal{S}\mathcal{K}\) returns a signature \(s\leftarrow \textsf{Sign}( sk , m)\), and a (possibly probabilistic, but usually deterministic) verification algorithm \(\textsf{Ver}\), that given a message \(m \in \mathcal {M}\), a candidate signature \(s' \in \mathcal {S}\), and the verification key \( vk \in \mathcal{P}\mathcal{K}\) returns a bit \(\textsf{Ver}( vk , m, s')\). The bit 1 is interpreted as a successful verification and 0 as a failed verification. We require correctness, that is, we demand that \(\textsf{Ver}( vk , m, \textsf{Sign}( sk , m))=1\) for all \(m \in \mathcal {M}\) and all pairs \(( vk , sk )\) in the support of \(\textsf{Gen}\).

Definition 9.2

A digital signatures scheme is existentially unforgeable under chosen message attacks if no efficient adversary A can win the following security experiment better than with negligible probability: the challenger first chooses a key pair \(( sk , vk ) \leftarrow \textsf{Gen}\). Then it acts as a signing oracle, receiving messages \(m \in \mathcal {M}\) from the adversary and responding with \(\textsf{Sign}( sk , m)\). At any point, A can undertake a forging attempt by providing a message \(m'\) and a candidate signature \(s'\) to the challenger. The adversary wins if and only if \(\textsf{Ver}( vk , m',s')=1\) and \(m'\) was never queried before by A.

9.1.2 The Protocol for Address Management

Hybrid ledger functionality. Let and be as in the previous section but with the following additional property: each transaction is a pair \(\texttt {tx} = (\textsf{AddrID}, \texttt {tx} ')\) where the first part is bitstring of fixed length and the second part is an arbitrary transaction payload. In addition we require the following property: for any state \(\texttt {state} \) and any transaction \(\texttt {tx} \) it holds that implies, for any state extension \(\texttt {state} ||\texttt {st} '\), that , if \(\texttt {st} '\) does not contain a transaction with the same identifier \(\textsf{AddrID}\) (this is clearly satisfied for Bitcion for example). Recall that we also assume that Definition 6.1 is satisfied.

Our protocol is defined w.r.t. a Bitcoin ledger functionality with the following validation predicate, which is defined relative to a hash function H, and a signature scheme \(\textrm{DSS}\).

figure u

Protocol. The protocol is straightforward: whenever the protocol is given an input of the form \((\textsf{AddrID}, \texttt {tx})\) it first checks that it is the party associated with this address ID. Then, it receives the newest state from the ledger and checks, whether this input is valid with respect to the current state. If this is the case, the party signs the input and submits it to the ledger.

figure v

9.1.3 The Enhanced Ledger Functionality

We present an enhanced ledger functionality with a validation predicate that enforces that an adversarial transaction cannot prevent a transaction by an honest party to eventually make it into the stable state of the ledger.

figure w

The following validation predicate is used within .

figure x

On the better guarantees. The stronger guarantee for honestly submitted transactions stems from two facts. First, by Definition 6.1, the state blocks contain transactions beyond coin-base transactions. Second, since a transaction of a party is associated with its address, and cannot be invalidated by another transaction with a different address, this implies that the transaction remains valid relative to \(\texttt {state} \) (unless the honest party itself issues a transaction that contradicts a previous transaction for one of its addresses, but we neglect this here). As an example, assume an honest party submits a single transaction for one of its addresses, and assume this transaction is valid relative to the state \(\texttt {state} \). Then, by the defined enforcing mechanism of \(\textsf {ExtendPolicy} \), this transaction is guaranteed to enter the state after staying in the buffer for long enough, because the ledger continuously enforces that a certain fraction of blocks contain all those unconfirmed (and still valid) transactions that are older than a certain threshold.

We have the following lemma:

Lemma 9.3

Let \(\textrm{DSS}\) be a secure digital signature scheme and let H be a collision resistant hash function. Then the protocol \(\textsf{addrMgmt}\) in the -hybrid world UC-realizes ledger , where the functionalities are instantiated as described above.

Proof Sketch

It is straightforward to write a simulator in the ideal-world execution that perfectly mimics the protocol as long as no hash-collision or signature forgery occurs. This is because the only non-trivial property that the ledger enforces (in addition to what the assumed ledger guarantees) is that only the address holder can submit a transaction but no one else. If no hash-function collision is found, the only possible way is to forge a signature. If both events do not happen, the real world indeed implements the stronger validation predicate. Assuming a collision-resistant hash function and a signature scheme that is unforgeable under chosen-message attacks, this implies the statement. \(\square \)