SoK: unraveling Bitcoin smart contracts

Although the primary use of Bitcoin is to exchange currency, its blockchain and consensus mechanism can also be exploited to securely execute some forms of smart contracts. These are agreements among mutually distrusting parties, which can be automatically enforced without resorting to a trusted intermediary. Over the last few years a variety of smart contracts for Bitcoin have been proposed, both by the academic community and by developers. However, the heterogeneity in their treatment, the informal (often incomplete or imprecise) descriptions, and the use of poorly documented Bitcoin features pose obstacles to research. In this paper we present a comprehensive survey of smart contracts on Bitcoin, in a uniform framework. Our treatment is based on a new formal specification language for smart contracts, which also helps us to highlight some subtleties in existing informal descriptions, making a step towards automatic verification. We discuss some obstacles to the diffusion of smart contracts on Bitcoin, and we identify the most promising open research challenges.


Introduction
The term "smart contract" was conceived in [43] to describe agreements between two or more parties that can be automatically enforced without a trusted intermediary. Having fallen into oblivion for several years, the idea of smart contracts has been resurrected with the recent surge of distributed ledger technologies, led by Ethereum and Hyperledger. In such incarnations, smart contracts are rendered as computer programs. Users can request the execution of contracts by sending suitable transactions to the nodes of a peer-to-peer network. These nodes collectively maintain the history of all transactions in a public, append-only data structure, called blockchain. The sequence of transactions on the blockchain determines the state of each contract, and, accordingly, the assets of each user.
A crucial feature of smart contracts is that their correct execution does not rely on a trusted authority: rather, the nodes which process transactions are assumed to be mutually untrusted. Potential conflicts in the execution of contracts are resolved through a consensus protocol, whose nature depends on the specific platform (e.g., it is based on "proof-of-work" in Ethereum). Ideally, contracts execute correctly whenever the adversary does not control the majority of some resource (e.g., computational power for "proof-of-work" consensus).
The absence of a trusted intermediary, combined with the possibility of transferring money given by blockchain-based cryptocurrencies, creates a fertile ground for the development of smart contracts. For instance, a smart contract may promise to pay a reward to anyone who provides some value that satisfies a given public predicate. This generalises cryptographic puzzles, like breaking a cipher, inverting a hash function, etc.
Since smart contracts handle the ownership of valuable assets, attackers may be tempted to exploit vulnerabilities in their implementation to steal or tamper with these assets. Although analysis tools [17,30,34] may improve the security of contracts, so far they have not been able to completely prevent attacks. For instance, a series of vulnerabilities in Ethereum contracts [10] have been exploited, causing money losses in the order of hundreds of millions of dollars [3,4,5].
Using domain-specific languages (possibly, not Turing-complete) could help to overcome these security issues, by reducing the distance between contract specification and implementation. For instance, despite the discouraging limitations of its scripting language, Bitcoin has been shown to support a variety of smart contracts. Lotteries [6,14,16,36], gambling games [32], contingent payments [13,24,35], and other kinds of fair multi-party computations [8,31] are some examples of the capabilities of Bitcoin as a smart contracts platform.
Unlike Ethereum, where contracts can be expressed as computer programs with a well-defined semantics, Bitcoin contracts are usually realised as cryptographic protocols, where participants send/receive messages, verify signatures, and put/search transactions on the blockchain. The informal (often incomplete or imprecise) narration of these protocols, together with the use of poorly documented features of Bitcoin (e.g., segregated witnesses, scripts, signature modifiers, temporal constraints), and the overall heterogeneity in their treatment, pose serious obstacles to the research on smart contracts in Bitcoin.
Contributions. This paper is, to the best of our knowledge, the first systematic survey of smart contracts on Bitcoin. In order to obtain a uniform and precise treatment, we exploit a new formal model of contracts. Our model is based on a process calculus with primitives to construct Bitcoin transactions, to put them on the blockchain, and to search the blockchain for transactions matching given patterns. Our calculus allows us to give smart contracts a precise operational semantics, which describes the interactions of the (possibly dishonest) participants involved in the execution of a contract.
We exploit our model to systematically formalise a large portion of the contracts proposed so far both by researchers and Bitcoin developers. In many cases, we find that specifying a contract with the intended security properties is significantly more complex than expected after reading the informal descriptions of the contract. Usually, such informal descriptions focus on the case where all participants are honest, neglecting the cases where one needs to compensate for some unexpected behaviour of the dishonest environment.
Overall, our work aims at building a bridge between research communities: from that of cryptography, where smart contracts have been investigated first, to those of programming languages and formal methods, where smart contracts could be expressed using proper linguistic models, supporting advanced analysis and verification techniques. We outline some promising research perspectives on smart contracts, both in Bitcoin and in other cryptocurrencies, where the synergy between the two communities could have a strong impact in future research.

Background on Bitcoin transactions
In this section we give a minimalistic introduction to Bitcoin [21,38], focussing on the crucial notion of transaction. To this purpose, we rely on the model of Bitcoin transactions in [11]. Here, instead of repeating the formal machinery of [11], we introduce the needed concepts through a series of examples. We will however follow the same notation as [11], and point to the formal definitions therein, to allow the reader to make precise the intuitions provided in this paper.
Bitcoin is a decentralised infrastructure to securely transfer currency (the bitcoins, B) between users. Transfers of bitcoins are represented as transactions, and the history of all transactions is stored in a public, append-only, distributed data structure called blockchain. Each user can create an arbitrary number of pseudonyms through which to send and receive bitcoins. The balance of a user is not explicitly stored within the blockchain, but it is determined by the amount of unspent bitcoins directed to the pseudonyms under her control, through one or more transactions. The logic used for linking inputs to outputs is specified by programmable functions, called scripts.
Hereafter we will abstract from a few technical details of Bitcoin, e.g. the fact that transactions are grouped into blocks, and that each transaction must pay a fee to the "miner" who appends it to the blockchain. We refer to [11] for a discussion on the differences between the formal model and the actual Bitcoin.

Transactions
In their simplest form, Bitcoin transactions allow one to transfer bitcoins from one participant to another. The only exceptions are the so-called coinbase transactions, which can generate fresh bitcoins. Following [11], we assume that there exists a single coinbase transaction, the first one in the blockchain. We represent this transaction, say T_0, as follows:

    T_0
    in: ⊥
    wit: ⊥
    out: (λx. x < 51, 1B)

The transaction T_0 has three fields. The fields in and wit are set to ⊥, meaning that T_0 does not point backwards to any other transaction (since T_0 is the first one on the blockchain). The field out contains a pair. The first element of the pair, λx. x < 51, is a script that, given as input a value x, checks if x < 51 (this is just for didactical purposes: we will introduce more useful scripts in a while). The second element of the pair, 1B, is the amount of currency that can be transferred to other transactions.

Now, assume that participant A wants to redeem 1B from T_0, and transfer that amount under her control. To do this, A has to append to the blockchain a new transaction, e.g.:

    T_A
    in: T_0
    wit: 42
    out: (λx. versig_{k_A}(x), 1B)

The field in points to the transaction T_0 in the blockchain. To be able to redeem 1B from there, A must provide a witness which makes the script within T_0.out evaluate to true. In this case the witness is 42, hence the redeem succeeds, and T_0 is considered spent. The script within T_A.out is the most commonly used one in Bitcoin: it verifies the signature x with A's public key. The message against which the signature is verified is the transaction which attempts to redeem T_A.

Now, to transfer 1B to another participant B, A can append to the blockchain the following transaction:

    T_B
    in: T_A
    wit: sig_{k_A}(T_B)
    out: (λx. versig_{k_B}(x), 1B)

where the witness sig_{k_A}(T_B) is A's signature on T_B (but for the wit field itself).
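As an informal illustration, the redeem check in this first example can be sketched in Python. This is only a toy rendering under our own assumptions, not the formal model of [11]: the names Tx and can_redeem are hypothetical, transactions have a single input and a single output, and signature verification is replaced by a stub.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Tx:
    """Toy single-input, single-output transaction (hypothetical helper type)."""
    inp: Optional["Tx"]            # 'in' field: the redeemed transaction, None models ⊥
    wit: Any                       # 'wit' field: the witness passed to the redeemed script
    script: Callable[[Any], bool]  # script of the single output
    value: int                     # amount held by the single output, in bitcoins


def can_redeem(tx: Tx) -> bool:
    """True iff tx.wit satisfies the output script of the transaction tx points to,
    and tx does not move more currency than that output holds."""
    prev = tx.inp
    return prev is not None and bool(prev.script(tx.wit)) and tx.value <= prev.value


# T_0 carries the didactical script  λx. x < 51  and 1 bitcoin.
T0 = Tx(inp=None, wit=None, script=lambda x: x < 51, value=1)
# T_A redeems T_0 with witness 42; its own script is a stub standing in for versig.
TA = Tx(inp=T0, wit=42, script=lambda sig: sig == "sig-by-A", value=1)
```

Here can_redeem(TA) holds because 42 < 51, while the same transaction with witness 51 would be rejected.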
The ones shown above represent just the simplest cases of transactions. More generally, a Bitcoin transaction can collect bitcoins from many inputs, and split them between one or more outputs; further, it can use more complex scripts, and specify time constraints on when it can be appended to the blockchain.
Following [11], hereafter we represent transactions as tuples of the form (in, wit, out, absLock, relLock), where:
- in contains the list of inputs. An input (T, i) refers to the i-th output of transaction T.
- wit contains the list of witnesses, of the same length as the list of inputs. For each input (T, i) in the in list, the witness at the same index must make the i-th output script of T evaluate to true.
- out contains the list of outputs. Each index refers to a pair (λz.e, v), where the first component is a script, and the second is a currency value.
- absLock and relLock indicate absolute and relative time constraints on when the transaction can be added to the blockchain.
In transaction fields, we represent a list ℓ_1 ··· ℓ_n as 1 → ℓ_1, ..., n → ℓ_n, or just as ℓ_1 when n = 1. We denote with T^v_A the canonical transaction, i.e. the transaction with a single output of the form (λς.versig_{k_A}(ς), vB), and with all the other fields empty (denoted with ⊥).
Example 1. Consider the transactions in Figure 1. In T_1 there are two outputs: the first one transfers v_1 B to any transaction T which provides as witness a signature of T with key k; the second output can transfer v_2 B to a transaction whose witness satisfies the script e_1. The transaction T_2 tries to redeem v_1 B from the output at index 1 of T_1, by providing the witness σ_1. Since T_2.relLock(1) = t, T_2 can be appended only after at least t time units have passed since the transaction in T_2.in(1) (i.e., T_1) appeared on the blockchain. In T_3, the input 1 refers to the output 2 of T_1, and the input 2 refers to the output 1 of T_2. The witnesses σ_2 and σ'_2 are used to evaluate T_1.out(2), replacing the occurrences of x and x' in e_1. Similarly, σ_3 is used to evaluate T_2.out(1), replacing the occurrences of x in e_2. The transaction T_3 can be put on the blockchain only after time t'.

Scripts
In Bitcoin, scripts are small programs written in a non-Turing-complete language. Whoever provides a witness that makes the script evaluate to "true" can redeem the bitcoins retained in the associated (unspent) output. In the abstract model, scripts are terms of the form λz.e, where z is a sequence of variables occurring in e, and e is an expression with the following syntax: Besides variables x, constants k, and basic arithmetic/logical operators, the other expressions are peculiar: |e| denotes the size, in bytes, of the evaluation of e; H(e) evaluates to the hash of e; versig_k(e) evaluates to true iff the sequence of signatures e (say, of length m) is verified by using m out of the n keys in k. For instance, the script λx.versig_k(x) is satisfied if x is a signature on the redeeming transaction, verified with the key k. The expressions absAfter t : e and relAfter t : e define absolute and relative time constraints: they evaluate as e if the constraints are satisfied, otherwise they evaluate to false.
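To make the shape of such expressions more concrete, the following Python sketch evaluates a small fragment of them (variables, constants, comparison, addition, size and hash). It is an illustration under our own assumptions, not the semantics of [11]: expressions are encoded as nested tuples, SHA-256 stands in for the public hash function H, None models a failed evaluation, failures are propagated strictly, and versig and the time constraints are left out. The names evaluate and H are hypothetical.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Stand-in for the public hash function (SHA-256 is our assumption)."""
    return hashlib.sha256(data).digest()


def evaluate(expr, env):
    """Evaluate a tuple-encoded script expression under the variable map env.

    Returns None when evaluation fails, e.g. on an unbound variable.
    """
    op = expr[0]
    if op == "var":
        return env.get(expr[1])
    if op == "const":
        return expr[1]
    args = [evaluate(e, env) for e in expr[1:]]
    if any(a is None for a in args):
        return None                      # assumed: failure propagates to the result
    if op == "lt":
        return args[0] < args[1]
    if op == "eq":
        return args[0] == args[1]
    if op == "add":
        return args[0] + args[1]
    if op == "size":
        return len(args[0])              # size in bytes of a byte string
    if op == "hash":
        return H(args[0])
    raise ValueError(f"unsupported operator: {op}")


# The didactical script  λx. x < 51  and a hash-lock  λs. H(s) = h.
below_51 = ("lt", ("var", "x"), ("const", 51))
hash_lock = ("eq", ("hash", ("var", "s")), ("const", H(b"secret")))
```

For example, evaluate(below_51, {"x": 42}) yields True, and hash_lock is satisfied only by the byte string whose hash was fixed in the script.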
In Figure 2 we recap from [11] the semantics of script expressions. The function ⟦·⟧_{T,i,ρ} takes three parameters: T is the redeeming transaction, i is the index of the redeeming witness, and ρ is a map from variables to values. We use ⊥ to represent the "failure" of the evaluation, H for a public hash function, and size(n) for the size (in bytes) of an integer n. The function ver_k(σ, T, i) verifies a sequence of signatures σ against a sequence of keys k (see Section 2.3). All the Let ρ = {x → sig_k(T_3), x' → s}. To redeem T_1.out(2) with the witness T_3.wit(1), the script expression is evaluated as follows: = true as ρ(x') = s

Transaction signatures
The signatures verified with versig never apply to the whole transaction: the content of the wit field is never signed, while the other fields can be excluded from the signature according to some predefined patterns. To sign parts of a transaction, we first erase the fields which we want to neglect in the signature. Technically, we set these fields to the "null" value ⊥ using a transaction substitution.
A transaction substitution {f → d} replaces the content of field f with d. If the field is indexed (i.e., all fields but absLock), we denote with {f(i) → d} the substitution of the i-th item in field f, and with {f(≠ i) → d} the substitution of all the items of field f but the i-th. For instance, to set all the elements of the wit field of T to ⊥, we write T{wit → ⊥}, and to additionally set the second input to ⊥ we write T{wit → ⊥}{in(2) → ⊥}.
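The three forms of substitution can be mimicked by a small Python helper. This is a sketch under our own assumptions: transactions are plain dictionaries whose indexed fields are lists, indices are 1-based as in the text, None plays the role of ⊥, and the name subst and its keyword arguments are hypothetical.

```python
import copy

BOT = None  # models the "null" value ⊥


def subst(tx, field, value, index=None, all_but=None):
    """Return a copy of tx after applying a transaction substitution.

    subst(tx, f, d)             models {f → d}
    subst(tx, f, d, index=i)    models {f(i) → d}
    subst(tx, f, d, all_but=i)  models {f(≠ i) → d}
    """
    new = copy.deepcopy(tx)
    if index is not None:
        new[field][index - 1] = value
    elif all_but is not None:
        new[field] = [item if pos == all_but else value
                      for pos, item in enumerate(new[field], start=1)]
    elif isinstance(new[field], list):
        new[field] = [value] * len(new[field])   # every item of an indexed field
    else:
        new[field] = value                       # non-indexed field, e.g. absLock
    return new


T = {"in": [("T1", 2), ("T2", 1)], "wit": ["s1", "s2"],
     "out": [("script", 3)], "absLock": 0}
# T{wit → ⊥}{in(2) → ⊥}
erased = subst(subst(T, "wit", BOT), "in", BOT, index=2)
```

Working on a deep copy keeps the original transaction untouched, which matches the reading of a substitution as a function from transactions to transactions.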
In Bitcoin, there exists a fixed set of transaction substitutions. We represent them as signature modifiers, i.e. transaction substitutions which set to ⊥ the fields which will not be signed. Signatures never apply to the whole transaction: modifiers always discard the content of the wit field, while they can keep all the inputs or only one, and all the outputs, or only one, or none. Modifiers also take a parameter i, which is instantiated to the index of the witness where the signature will be included. Below we only present two signature modifiers, since the others are not commonly used in Bitcoin smart contracts.
The modifier aa_i only sets the first witness to i, and the other witnesses to ⊥ (so, all inputs and all outputs are signed). This ensures that a signature computed for being included in the witness at index i cannot be used in any witness with index j ≠ i. The modifier sa_i removes the witnesses, and all the inputs but the one at index i (so, a single input and all outputs are signed). Differently from aa_i, this modifier discards the index i, so the signature can be included in any witness.
Signatures carry information about which parts of the transaction are signed: formally, they are pairs σ = (w, µ), where µ is the modifier, and w is the signature on the transaction T modified with µ. We denote such a signature as sig^{µ,i}_k(T), where k is a key, and i is the index used by µ, if any. Verification of a signature σ for index i is denoted by ver_k(σ, T, i). Formally:

    sig^{µ,i}_k(T) = (sig_k(µ_i(T)), µ)        ver_k(σ, T, i) = ver_k(w, µ_i(T))   if σ = (w, µ)

where sig and ver are, respectively, the signing function and the verification function of a digital signature scheme.
Multi-signature verification ver_k(σ, T, i) extends verification to the case where σ is a sequence of signatures and k is a sequence of keys. Intuitively, if |σ| = m and |k| = n, it implements an m-of-n multi-signature scheme, evaluating to true if all the m signatures match (some of) the keys in k. The actual definition also takes into account the order of signatures, as formalised in Definition 6 of [11].
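The effect of the two modifiers on signature reuse can be illustrated with the following Python sketch. It rests on several assumptions of ours: transactions are dictionaries, sa_i is rendered by keeping the i-th input as the only input, and an HMAC over a JSON encoding is used as a toy stand-in for a digital signature scheme (a real scheme would use distinct signing and verification keys). The names aa, sa, sign and verify are hypothetical.

```python
import hashlib
import hmac
import json


def aa(tx, i):
    """Keep all inputs and outputs; record the index i in the first witness."""
    return {**tx, "wit": [i] + [None] * (len(tx["in"]) - 1)}


def sa(tx, i):
    """Keep only the input at (1-based) index i; drop witnesses and the index."""
    return {**tx, "in": [tx["in"][i - 1]], "wit": [None]}


MODIFIERS = {"aa": aa, "sa": sa}


def _tag(key, tx):
    # Toy stand-in for signing: a keyed hash of a canonical encoding of tx.
    msg = json.dumps(tx, sort_keys=True).encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def sign(key, tx, mod, i):
    """Return the pair (w, µ): a tag on the modified transaction, plus the modifier."""
    return (_tag(key, MODIFIERS[mod](tx, i)), mod)


def verify(key, sig, tx, i):
    """Check sig against tx, for a signature placed in the witness at index i."""
    w, mod = sig
    return hmac.compare_digest(w, _tag(key, MODIFIERS[mod](tx, i)))


KEY = b"toy-key"
two_inputs = {"in": [["T1", 1], ["T2", 1]], "wit": [None, None], "out": [["versig kB", 2]]}
sig_aa = sign(KEY, two_inputs, "aa", 1)

one_input = {"in": [["Ti", 1]], "wit": [None], "out": [["versig kV", 5]]}
sig_sa = sign(KEY, one_input, "sa", 1)
grown = {**one_input, "in": [["Tj", 3], ["Ti", 1]], "wit": [None, None]}
```

In this sketch sig_aa verifies only at index 1 of two_inputs, whereas sig_sa, computed on a transaction with a single input, still verifies at index 2 of grown, where the signed input has been moved after another input has been added.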

Blockchain and consistency
Abstracting away from the fact that the actual Bitcoin blockchain is formed by blocks of transactions, here we represent a blockchain B as a sequence of pairs (T_i, t_i), where t_i is the time when T_i has been appended, and the values t_i are increasing. We say that the j-th output of the transaction T_i in the blockchain is spent (or, for brevity, that (T_i, j) is spent) if there exists some transaction T_{i'} in the blockchain (with i' > i) and some j' such that T_{i'}.in(j') = (T_i, j).
We now describe when a pair (T, t) can be appended to B = (T_0, t_0) ··· (T_n, t_n). Following [11], we say that T is a consistent update of B at time t, in symbols B ▷ (T, t), when the following conditions hold:
1. for each input i of T, with T.in(i) = (T', j): (a) T' is one of the transactions in B; (b) the output (T', j) is unspent in B; (c) the witness T.wit(i) makes the script in T'.out(j) evaluate to true;
2. the time constraints absLock and relLock in T are satisfied at time t ≥ t_n;
3. the sum of the amounts of the inputs of T is greater than or equal to the sum of the amounts of its outputs.
We assume that each transaction T_i in the blockchain is a consistent update of the sequence of past transactions (T_0, t_0) ··· (T_{i−1}, t_{i−1}). The consistency of the blockchain is actually ensured by the Bitcoin consensus protocol.
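The consistency conditions can be read as a simple check over the blockchain. The Python sketch below is a simplified rendering under our own assumptions, not the definition in [11]: transactions are dictionaries identified by a hypothetical name field, inputs are (name, j) pairs with 1-based j, and scripts are Python callables applied to the witness, the redeeming transaction and the input index.

```python
def is_consistent_update(chain, tx, t):
    """Check whether (tx, t) may extend chain, a list of (transaction, time) pairs.

    Transactions are dicts with keys name, in, wit, out and, optionally,
    absLock (a time) and relLock (one delay per input). Outputs are
    (script, value) pairs.
    """
    by_name = {T["name"]: (T, ti) for T, ti in chain}
    spent = {inp for T, _ in chain for inp in T["in"]}
    if chain and t < chain[-1][1]:
        return False                                  # time cannot go backwards
    if t < tx.get("absLock", 0):
        return False                                  # condition 2, absolute lock
    rel_locks = tx.get("relLock", [0] * len(tx["in"]))
    total_in = 0
    for i, (name, j) in enumerate(tx["in"]):
        if name not in by_name:
            return False                              # condition 1a
        if (name, j) in spent:
            return False                              # condition 1b
        prev, t_prev = by_name[name]
        script, value = prev["out"][j - 1]
        if not script(tx["wit"][i], tx, i + 1):
            return False                              # condition 1c
        if t - t_prev < rel_locks[i]:
            return False                              # condition 2, relative lock
        total_in += value
    return total_in >= sum(v for _, v in tx["out"])   # condition 3


T0 = {"name": "T0", "in": [], "wit": [], "out": [(lambda w, tx, i: w < 51, 1)]}
TA = {"name": "TA", "in": [("T0", 1)], "wit": [42], "out": [(lambda w, tx, i: True, 1)]}
chain = [(T0, 0)]
```

With these definitions, is_consistent_update(chain, TA, 1) holds, while a second transaction redeeming ("T0", 1) after TA has been appended is rejected by the check on spent outputs.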
Example 3. Recall the transactions in Figure 1. Assume a blockchain B whose last pair is (T_1, t_1) and t_1 ≥ t', while T_2 and T_3 are not in B.
We verify that (T_2, t_2) is a consistent update of B, assuming t_2 = t_1 + t and that σ_1 is the signature of T_2 with (the private part of) key k. The only input of T_2 is (T_1, 1). Conditions 1a and 1b are satisfied, since (T_1, 1) is unspent in B. Condition 1c holds because versig_k(σ_1) evaluates to true. Condition 2 holds: indeed the relative timelock in T_2 is satisfied because t_2 − t_1 ≥ t. Condition 3 holds because the amount of the input of T_2, i.e. v_1 B, is equal to the amount of its output. Note instead that (T_3, t_2) would not be a consistent update of B, since it violates condition 1a on the second input. Now, let B' = B(T_2, t_2). We verify that (T_3, t_3) is a consistent update of B', assuming t_3 ≥ t_2, e_1 as in Example 2, and e_2 = versig_k(x). Further, let σ_2 = sig_k(T_3), let σ'_2 = s, and σ_3 = sig_k(T_3). Conditions 1a and 1b hold, because T_1 and T_2 are in B', and the referred outputs are unspent. Condition 1c holds because the output scripts T_1.out(2) and T_2.out(1) against σ_2, σ'_2 and σ_3 evaluate to true. Condition 2 is satisfied at t_3 ≥ t_2 ≥ t_1 ≥ t'. Finally, condition 3 holds because the amount (v_1 + v_2)B in T_3.out(1) is equal to the sum of the amounts in T_1.out(2) and T_2.out(1).

Modelling Bitcoin contracts
In this section we introduce a formal model of the behavior of the participants in a contract, building upon the model of Bitcoin transactions in [11].
We start by formalising a simple language of expressions, which represent both the messages sent over the network, and the values used in internal computations made by the participants. Hereafter, we assume a set Var of variables, and we define the set Val of values comprising constants k ∈ Z, signatures σ, scripts λz.e, transactions T, and currency values v.
Definition 1 (Contract expressions). We define contract expressions through the following syntax: where E denotes a finite sequence of expressions (i.e., E = E_1 ··· E_n). We define the function ⟦·⟧ from (variable-free) contract expressions to values in Figure 3. As a notational shorthand, we omit the index i in sig (resp. versig) when the signed (resp. verified) transactions have a single input.
Intuitively, when T evaluates to a transaction T, the expression T{f(i) → E} represents the transaction obtained from T by substituting the field f(i) with the sequence of values obtained by evaluating E. For instance, T{wit(1) → σ} denotes the transaction obtained from T by replacing the witness at index 1 with the signature σ. Further, sig^{µ,i}_k(T) evaluates to the signature of the transaction represented by T, and versig_k(E, T, i) represents the m-of-n multi-signature verification of the transaction represented by T. Both for the signing and verification, the parameter i represents the index where the signature will be used. We assume a simple type system (not specified here) that rules out ill-formed expressions, like e.g. k{wit(1) → T}.
We formalise the behaviour of a participant as an endpoint protocol, i.e. a process where the participant can perform the following actions: (i) send/receive messages to/from other participants; (ii) put a transaction on the ledger; (iii) wait until some transactions appear on the blockchain; (iv) do some internal computation. Note that the last kind of operation allows a participant to craft a transaction before putting it on the blockchain, e.g. setting the wit field to her signature, and later on adding the signature received from another participant.
Definition 2 (Endpoint protocols). Assume a set of participants (named A, B, C, ...). We define prefixes π, and protocols P, Q, R, ... as follows: We assume that each name X has a unique defining equation X(x) = P where the free variables in P are included in x. We use the following syntactic sugar:
- τ ≜ check true, the internal action;
- 0 ≜ Σ_∅ P, the terminated protocol (as usual, we omit trailing 0s);
The behaviour of protocols is defined in terms of an LTS between systems, i.e. the parallel composition of the protocols of all participants, and the blockchain.

Definition 3 (Semantics of protocols).
A system S is a term of the form A_1[P_1] | ··· | A_n[P_n] | (B, t), where (i) all the A_i are distinct; (ii) there exists a single component (B, t), representing the current state of the blockchain B, and the current time t; (iii) systems are considered up to commutativity and associativity of |. We define the relation → between systems in Figure 4, where match_B(T) is the set of all the transactions in B that are equal to T, except for the witnesses. When writing S | S' we intend that the conditions above are respected.
Intuitively, a guarded choice Σ_i π_i.P_i can behave as one of the branches P_i. A parallel composition P | Q executes concurrently P and Q. All the rules (except the last two) specify how a protocol (π.P + Q) | R evolves within a system. Rule [Com] models a message exchange between A and B: participant A sends messages E, which are received by B on variables x. Communication is synchronous, i.e. A is blocked until B is ready to receive. Rule [Check] allows the branch P of a sum to proceed if the condition represented by E is true. Rule [Put] allows A to append a transaction to the blockchain, provided that the update is consistent. Rule [Ask] allows the branch P of a sum to proceed only when the blockchain contains some transactions T_1 ··· T_n obtained by instantiating some ⊥ fields in T (see Section 2). This form of pattern matching is crucial because the value of some fields (e.g., wit) may not be known at the time the protocol is written. When the ask prefix unblocks, the variables x in P are bound to T_1 ··· T_n, so making it possible to inspect their actual fields. Rule [Def] allows a named process X(E) to evolve as P, assuming a defining equation X(x) = P. The variables x in P are substituted with the results of the evaluation of E. Such defining equations can be used to specify recursive behaviours. Finally, rule [Delay] allows time to pass.
Fig. 4: Semantics of endpoint protocols.
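The lookup performed when a participant waits for a transaction can be sketched as follows. This is a minimal Python rendering of the set of blockchain transactions that are equal to a given one except for the witnesses; the dictionary encoding of transactions and the name match are our own assumptions.

```python
def match(chain, pattern):
    """Return the transactions in chain that equal pattern up to the wit field.

    chain is a list of (transaction, time) pairs; transactions are dicts.
    """
    def strip(tx):
        return {field: val for field, val in tx.items() if field != "wit"}

    return [tx for tx, _ in chain if strip(tx) == strip(pattern)]


published = {"in": [("TA", 1)], "wit": ["sig-by-A"], "out": [("versig kB", 1)]}
other = {"in": [("TC", 1)], "wit": ["sig-by-C"], "out": [("versig kB", 1)]}
chain = [(published, 3), (other, 4)]

# The pattern is written before the witness is known, so wit is still ⊥ (None).
pattern = {"in": [("TA", 1)], "wit": [None], "out": [("versig kB", 1)]}
found = match(chain, pattern)
```

The transactions returned carry their actual witnesses, which is what allows a waiting participant to inspect values that were unknown when the protocol was written.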

Example 4 (Naïve escrow).
A buyer A wants to buy an item from the seller B, but they do not trust each other. So, they would like to use a contract to ensure that B will get paid if and only if A gets her item. In a naïve attempt to realise this, they use the transactions in Figure 5, where we assume that (T_A, 1), used in T.in, is a transaction output redeemable by A through her key k_A. The transaction T makes A deposit 1B, which can be redeemed by a transaction carrying the signatures of both A and B. The transactions T_A and T_B redeem T, transferring the money to A or B, respectively.
The protocols of A and B are, respectively, P_A and Q_B: First, A adds her signature to T, and puts it on the blockchain. Then, she internally chooses whether to unblock the deposit for B or to request a refund. In the first case, A sends sig^{aa}_{k_A}(T_B) to B. In the second case, she waits to receive the signature sig^{aa}_{k_B}(T_A) from B (saving it in the variable x); afterwards, she puts T_A on the blockchain (after setting wit) to redeem the deposit. The seller B waits to see T on the blockchain. Then, he chooses either to receive the signature sig^{aa}_{k_A}(T_B) from A (and then redeem the payment by putting T_B on the blockchain), or to refund A, by sending his signature sig^{aa}_{k_B}(T_A). This contract is not secure if either A or B is dishonest. On the one hand, a dishonest A can prevent B from redeeming the deposit, even if she had already received the item (to do that, it suffices not to send her signature, taking the rightmost branch in P'). On the other hand, a dishonest B can just avoid sending the item and the signature (taking the leftmost branch in Q_B): in this way, the deposit gets frozen. For instance, let S = A[P_A] | B[Q_B] | (B, t), where B contains T_A unredeemed. The scenario where A has never received the item, while B dishonestly attempts to receive the payment, is modelled as follows: At this point the computation is stuck, because both A and B are waiting for a message from the other participant. We will show in Section 4.3 how to design a secure escrow contract, with the intermediation of a trusted arbiter.

A survey of smart contracts on Bitcoin
We now present a comprehensive survey of smart contracts on Bitcoin, comprising those published in the academic literature, and those found online. To this aim we exploit the model of computation introduced in Section 3. Remarkably, all the following contracts can be implemented by only using so-called standard transactions, e.g. via the compilation technique in [11]. This is crucial, because non-standard transactions are currently discarded by the Bitcoin network.

Oracle
In many concrete scenarios one would like to make the execution of a contract depend on some real-world events, e.g. results of football matches for a betting contract, or feeds of flight delays for an insurance contract. However, the evaluation of Bitcoin scripts cannot depend on the environment, so in these scenarios one has to resort to a trusted third party, or oracle [2,19], who notifies real-world events by providing signatures on certain transactions. For example, assume that A wants to transfer vB to B only if a certain event, notified by an oracle O, happens. To do that, A puts on the blockchain the transaction T in Figure 6, which can be redeemed by a transaction carrying the signatures of both B and O. Further, A instructs the oracle to provide his signature to B upon the occurrence of the expected event.
We model the behaviour of B as the following protocol: Here, B waits to receive the signature sig^{aa}_{k_O}(T_B) from O, then he puts T_B on the blockchain (after setting its wit) to redeem T. In practice, oracles like the one needed in this contract are available as services in the Bitcoin ecosystem.
Notice that, in case the event certified by the oracle never happens, the vB within T are frozen forever. To avoid this situation, one can add a time constraint to the output script of T, e.g. as in the transaction T_bond in Figure 10.

Crowdfunding
Assume that the curator C of a crowdfunding campaign wants to fund a venture V by collecting vB from a set {A_i}_{i∈I} of investors. The investors want to be guaranteed that either the required amount vB is reached, or they will be able to redeem their funds. To this purpose, C can employ the following contract. She starts with a canonical transaction T^v_V (with empty in field) which has a single output of vB to be redeemed by V. Intuitively, each A_i can invest money in the campaign by "filling in" the in field of T^v_V with a transaction output under their control. To do this, A_i sends to C a transaction output (T_i, j_i), together with the signature σ_i required to redeem it. We denote with val(T_i, j_i) the value of such output. Notice that, since the signature σ_i has been made on T^v_V, the only valid output is the one of vB to be redeemed by V. Upon the reception of the message from A_i, C updates T^v_V: the provided output is appended to the in field, and the signature is added to the corresponding wit field. If all the outputs (T_i, j_i) are distinct (and not redeemed) and the signatures are valid, when Σ_i val(T_i, j_i) ≥ v the filled transaction T^v_V can be put on the blockchain. If C collects v'B with v' > v, the difference v' − v goes to the miners as transaction fee.
The endpoint protocol of the curator is defined as X(T^v_V, 1, 0), where: while the protocol of each investor A_i is the following: Note that the transactions sent by investors are not known a priori, so they cannot just create the final transaction and sign it. Instead, to allow C to complete the transaction T^v_V without invalidating the signatures, they compute them using the modifier sa_1. In this way, only a single input is signed, and when verifying the corresponding signature, the others are neglected.
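The bookkeeping performed by the curator can be sketched in Python as follows. This is not the endpoint protocol X itself, only an illustration of its accumulation loop under our own assumptions: each contribution is a triple made of an output identifier, its value, and a flag telling whether the accompanying signature has already been checked as valid; the name collect is hypothetical.

```python
def collect(target, contributions):
    """Accumulate investors' outputs until their total value reaches target.

    contributions is an iterable of (output, value, signature_ok) triples.
    Returns the inputs of the filled transaction and the fee left to miners,
    or None if the target is never reached.
    """
    inputs, seen, total = [], set(), 0
    for output, value, signature_ok in contributions:
        if output in seen or not signature_ok:
            continue                      # duplicated output or invalid signature
        seen.add(output)
        inputs.append(output)
        total += value
        if total >= target:
            return {"in": inputs, "fee": total - target}
    return None


offers = [
    (("T1", 1), 4, True),
    (("T1", 1), 4, True),    # same output offered twice: ignored
    (("T2", 1), 3, False),   # invalid signature: ignored
    (("T3", 2), 7, True),
]
campaign = collect(10, offers)
```

In this run the campaign closes with the two valid outputs, and the surplus of 1 over the target is what would go to the miners as fee.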

Escrow
In Example 4 we have discussed a naïve escrow contract, which is secure only if both the buyer A and the seller B are honest (so making the contract pointless). Rather, one would like to guarantee that, even if either A or B (or both) are dishonest, exactly one of them will be able to redeem the money: in case they disagree, a trusted participant C, who plays the role of arbiter, will decide who gets the money (possibly splitting the initial deposit in two parts) [1,19].
The output script of the transaction T in Figure 7 is a 2-of-3 multi-signature scheme. This means that T can be redeemed either with the signatures of A and B (in case they agree), or with the signature of C (with key k_C) and the signature of A or that of B (in case they disagree). The transaction T_AB(z) in Figure 7 allows the arbiter to issue a partial refund of zB to A, and of (1 − z)B to B. Instead, to issue a full refund to either A or B, the arbiter signs, respectively, the transactions T_A = T^{1B}_A{in(1) → (T, 1)} or T_B = T^{1B}_B{in(1) → (T, 1)} (not shown in the figure). The protocols of A and B are similar to those in Example 4, except for the part where they ask C for an arbitration: In the summation within P_A, participant A internally chooses whether to send her signature to B (so allowing B to redeem 1B via T_B), or to proceed with P'. There, A waits to receive either B's signature (which allows A to redeem 1B by putting T_A on the blockchain), or a response from the arbiter, in the process P''. The three cases in the summation of check in P'' correspond, respectively, to the case where A gets a full refund (z = 1), a partial refund (0 < z < 1), or no refund at all (z = 0).
The protocol for B is dual to that of A: If an arbitration is requested, C internally decides (through the τ actions) who between A and B can redeem the deposit in T, by sending its signature to one of the two participants, or decides for a partial refund of z and 1 − z bitcoins, respectively, to A and B, by sending its signature on T_AB to both participants: Note that, in the unlikely case where both A and B choose to send their signature to the other participant, the 1B deposit becomes "frozen". In a more concrete version of this contract, a participant could keep listening for the signature, and attempt to redeem the deposit when (unexpectedly) receiving it.
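The 2-of-3 condition guarding the deposit can be illustrated with a small Python sketch of ordered multi-signature matching. The assumptions are ours: signatures must appear in the same order as the keys they match, each key can be used at most once, and the predicate check stands in for single-signature verification. The names ver_multi and escrow_redeemable are hypothetical.

```python
def ver_multi(sigs, keys, check):
    """True iff every signature in sigs matches a distinct key in keys, in order."""
    k = 0
    for s in sigs:
        # Skip keys that do not match the current signature.
        while k < len(keys) and not check(s, keys[k]):
            k += 1
        if k == len(keys):
            return False          # no remaining key matches this signature
        k += 1                    # the matched key cannot be reused
    return True


def escrow_redeemable(sigs, check):
    """2-of-3 condition on the deposit: two signatures among those of A, B and C."""
    return len(sigs) == 2 and ver_multi(sigs, ["kA", "kB", "kC"], check)


def toy_check(sig, key):
    # Toy verification: a signature is just a string naming the signing key.
    return sig == "sig:" + key
```

Under these assumptions the pairs (A, B), (A, C) and (B, C) all unlock the deposit, while a single participant signing twice, or a pair listed in the wrong order, does not.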

Intermediated payment
Assume that A wants to send an indirect payment of v_C B to C, routing it through an intermediary B who retains a fee of v_B < v_C bitcoins. Since A does not trust B, she wants to use a contract to guarantee that: (i) if B is honest, then v_C B are transferred to C; (ii) if B is not honest, then A does not lose money. The contract uses the transactions in Figure 8: T_AB transfers (v_B + v_C)B from A to B, and T_BC splits the amount to B (v_B B) and to C (v_C B). We assume that (T_A, 1) is a transaction output redeemable by A. The behaviour of A is as follows: Here, A receives from B his signature on T_BC, which makes it possible to pay C later on. The τ branch and the else branch ensure that A will correctly terminate also if B is dishonest (i.e., B does not send anything, or he sends an invalid signature). If A receives a valid signature, she puts T_AB on the blockchain, adding her signature to the wit field. Then, she also appends T_BC, adding to the wit field her signature and B's one. Since A takes care of publishing both transactions, the behaviour of B consists just in sending his signature on T_BC. Therefore, B's protocol can be modelled as a single send action. This contract relies on SegWit. In Bitcoin without SegWit, the identifier of T_AB is affected by the instantiation of the wit field. So, when T_AB is put on the blockchain, the input in T_BC (which was computed before) does not point to it.
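The role of SegWit in this contract can be illustrated by comparing two ways of computing a transaction identifier. The Python sketch below is a toy model under our own assumptions: the identifier is a SHA-256 hash of a JSON encoding, whereas Bitcoin hashes a binary serialisation; the function txid and its segwit flag are hypothetical.

```python
import hashlib
import json


def txid(tx, segwit=True):
    """Toy transaction identifier: a hash of the transaction fields.

    With segwit=True the wit field is excluded from the hashed data,
    so filling in the witness later does not change the identifier.
    """
    body = {field: val for field, val in tx.items()
            if not (segwit and field == "wit")}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


# T_AB before and after A adds her signature to the wit field.
unsigned = {"in": [["TA", 1]], "wit": [None], "out": [["versig kA kB", 3]]}
signed = {**unsigned, "wit": ["sig-by-A"]}

# T_BC is prepared in advance, pointing to the identifier of the unsigned T_AB.
pointer_with_segwit = txid(unsigned, segwit=True)
pointer_without_segwit = txid(unsigned, segwit=False)
```

With the witness excluded, the pointer computed in advance still designates the signed transaction; with the witness included, the two identifiers differ and the input prepared for T_BC would dangle.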

Timed commitment
Assume that A wants to choose a secret s, and reveal it after some time, while guaranteeing that the revealed value corresponds to the chosen secret (or paying a penalty otherwise). This can be obtained through a timed commitment [20], a protocol with applications e.g. in gambling games [25,28,42], where the secret contains the player's move, and the delay in the revelation of the secret is intended to prevent other players from altering the outcome of the game. Here we formalise the version of the timed commitment protocol presented in [8].
Intuitively, A starts by exposing the hash of the secret, i.e. h = H(s), and at the same time depositing some amount vB in a transaction.The participant B has the guarantee that after t time units, he will either know the secret s, or he will be able to redeem vB.
The transactions of the protocol are shown in Figure 9, where we assume that (T A , 1) is a transaction output redeemable by A. The behaviour of A is modelled as the following protocol:

Participant A starts by putting the transaction T com on the blockchain. Note that within this transaction A is committing the hash of the chosen secret: indeed, h is encoded within the output script T com .out. Then, A sends to B her signature on T pay . Note that this transaction can be redeemed by B only when t time units have passed since T com has been published on the blockchain, because of the relative timelock declared in T pay .relLock. After sending her signature on T pay , A internally chooses whether to reveal the secret, or do nothing (via the τ actions). In the first case, A must put the transaction T open on the blockchain. Since it redeems T com , she needs to write in T open .wit both the secret s and her signature, so making the former public.

Fig. 10: Transactions of the micropayment channel contract. T bond : in: (T A , 1); wit: ⊥; out: (λςς′. versig k A k B (ςς′) or relAfter t : versig k A (ς), kB). T ref : in: (T bond , 1); wit: ⊥; out: (λς. versig k A (ς), vB); relLock: t.
A possible behaviour of the receiver B is the following:

In this protocol, B first receives from A (and saves in x) her signature on the transaction T pay . Then, B checks if the signature is valid: if not, he aborts the protocol. Even if the signature is valid, B cannot put T pay on the blockchain and redeem the deposit immediately, since the transaction has a timelock t. Note that B cannot change the timelock: indeed, doing so would invalidate A's signature on T pay . If, after t time units, A has not published T open yet, B can proceed to put T pay on the blockchain, writing A's and his own signatures in the witness. Otherwise, B retrieves T open from the blockchain, from which he can obtain the secret, and use it in Q .
A variant of this contract, which implements the timeout in T com .out, and does not require the signature exchange, is used in Section 4.7.
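The two ways of spending the deposit described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class `TimedCommitment`, its method names, and the use of SHA-256 for H are hypothetical choices made for illustration, signatures are not modelled, and the object stands in for the unspent output of T com rather than for real Bitcoin transactions.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Hash used for the commitment (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


class TimedCommitment:
    """Toy model of the output of T_com: it can be spent once, by A or by B."""

    def __init__(self, secret_hash: bytes, deposit: int, timeout: int):
        self.h = secret_hash      # h = H(s), published by A
        self.deposit = deposit    # v, the amount A puts at stake
        self.timeout = timeout    # plays the role of the timelock t
        self.spent_by = None      # "A" (via T_open) or "B" (via T_pay)
        self.revealed = None      # the secret, once it has been made public

    def open(self, secret: bytes) -> bool:
        """A redeems her deposit by revealing a preimage of h (T_open)."""
        if self.spent_by is None and H(secret) == self.h:
            self.spent_by, self.revealed = "A", secret
            return True
        return False

    def pay(self, now: int) -> bool:
        """B redeems the deposit once the timelock has expired (T_pay)."""
        if self.spent_by is None and now >= self.timeout:
            self.spent_by = "B"
            return True
        return False
```

In either outcome B gets what the contract promises: after `open` succeeds, `revealed` holds the secret; otherwise `pay` succeeds once the timeout has passed.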

Micropayment channels
Assume that A wants to make a series of micropayments to B, e.g. a small fraction of B every few minutes. Doing so with one transaction per payment would result in conspicuous fees⁸, so A and B use a micropayment channel contract [29]. A starts by depositing kB; then, she signs a transaction that pays vB to B and (k − v)B back to herself, and she sends that transaction to B. Participant B can choose to publish that transaction immediately and redeem its payment, or to wait in case A sends another transaction with increased value. A can stop sending signatures at any time. If B redeems, then A can get back the remaining amount. If B does not cooperate, A can redeem all the amount after a timeout.
The protocol of A is the following (the transactions are in Figure 10). A publishes the transaction T bond , depositing kB that can be spent with her signature and that of B, or with her signature alone, after time t. A can redeem the deposit by publishing the transaction T ref .
To pay for the service, A sends to B the amount v she is paying, and her signature on T pay (v).Then, she can decide to increase v and recur, or to terminate.
The participant B waits for T bond to appear on the blockchain, then receives the first value v and A's signature σ. Then, B checks if σ is valid, otherwise he aborts the protocol. At this point, B waits for another pair (v′, σ′), or, after a timeout, he redeems vB using T pay (v).
Note that Q B should redeem T pay before the timeout expires, which is not modelled in Q B . This could be obtained by enriching the calculus with time-constraining operators (see Footnote 5).
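The bookkeeping behind the channel can be sketched as follows. This is a toy model rather than the protocol of [29]: the class `Channel` and its methods are hypothetical names, signatures and transactions are not modelled, and the sketch assumes (as a simplification) that B must close strictly before the timeout.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    capacity: int          # k, deposited by A in T_bond
    timeout: int           # t, after which A alone can redeem the deposit
    best_v: int = 0        # highest amount for which B holds a signed T_pay(v)
    closed: bool = False

    def receive_payment(self, v: int) -> bool:
        """B accepts a signed T_pay(v) only if it increases his payout."""
        if self.closed or not (self.best_v < v <= self.capacity):
            return False
        self.best_v = v
        return True

    def close_by_b(self, now: int):
        """B publishes the latest T_pay before the timeout: (to_A, to_B)."""
        if self.closed or now >= self.timeout or self.best_v == 0:
            return None
        self.closed = True
        return (self.capacity - self.best_v, self.best_v)

    def refund_a(self, now: int):
        """A publishes T_ref after the timeout and recovers the deposit."""
        if self.closed or now < self.timeout:
            return None
        self.closed = True
        return (self.capacity, 0)
```

Only the last accepted value matters: intermediate signed transactions never reach the blockchain, which is where the fee saving comes from.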

Fair lotteries
A multiparty lottery is a protocol where N players put their bets in a pot, and a winner, uniformly chosen among the players, redeems the whole pot. Various contracts for multiparty lotteries on Bitcoin have been proposed in [8,9,12,14,16,36]. These contracts enjoy a fairness property, which roughly guarantees that: (i) each honest player will have (on average) a non-negative payoff, even in the presence of adversaries; (ii) when all the players are honest, the protocol behaves as an ideal lottery: one player wins the whole pot (with probability 1/N), while all the others lose their bets (with probability (N − 1)/N).
Here we illustrate the lottery in [8], for N = 2. Consider two players A and B who want to bet 1B each. Their protocol is composed of two phases. The first phase is a timed commitment (as in Section 4.5): each player chooses a secret (s A and s B ) and commits its hash (h A = H(s A ) and h B = H(s B )). In doing that, both players put a deposit of 2B on the ledger, which is used to compensate the other player in case one chooses not to reveal the secret later on. In the second phase, the two bets are put on the ledger. After that, the players reveal their secrets, and redeem their deposits. Then, the secrets are used to compute the winner of the lottery in a fair manner. Finally, the winner redeems the bets.
The transactions needed for this lottery are displayed in Figure 11 (we only show A's transactions, as those of B are similar). The transactions for the commitment phase (T com , T open , T pay ) are similar to those in Section 4.5: they only differ in the script of T com .out, which now also checks that the length of the secret is either 128 or 129. This check forces the players to choose their secret so that it has one of these lengths, and reveal it (using T open ) before the absLock deadline, since otherwise they will lose their deposits (enabling T pay ).
The bets are put using T lottery , whose output script computes the winner using the secrets, which can then be revealed. For this, the secret lengths are compared: if equal, A wins, otherwise B wins. In this way, the lottery is equivalent to a coin toss. Note that, if a malicious player chooses a secret having another length than 128 or 129, the T lottery transaction will become stuck, but its opponent will be compensated using the deposit.
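As a sanity check of condition (ii), the expected payoff of a player in the ideal lottery can be worked out directly. Assuming that each of the N players bets the same amount b (the symbol b is introduced here only for illustration), the winner's net gain is (N − 1)b and each loser's net loss is b:

```latex
\[
\mathbb{E}[\text{payoff}]
  = \frac{1}{N}\,(N-1)\,b \;+\; \frac{N-1}{N}\,(-b)
  = 0 .
\]
```

The ideal lottery is therefore a zero-sum fair game, which is consistent with the non-negative average payoff required of honest players in condition (i).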
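The winner computation described above amounts to a comparison of secret lengths. The following sketch makes that explicit; the function name `winner` and the choice of measuring lengths in bytes are assumptions of this example, not details taken from [8].

```python
ALLOWED_LENGTHS = (128, 129)   # the two secret lengths accepted by T_com.out


def winner(secret_a: bytes, secret_b: bytes) -> str:
    """Return "A" if the two secrets have equal length, "B" otherwise."""
    for s in (secret_a, secret_b):
        if len(s) not in ALLOWED_LENGTHS:
            # Such a secret would be rejected by the commitment script,
            # so this case never reaches the lottery transaction.
            raise ValueError("secret length must be 128 or 129")
    return "A" if len(secret_a) == len(secret_b) else "B"
```

If at least one player picks the length uniformly at random between the two allowed values, the lengths are equal with probability 1/2, whatever the other player does. This is why the comparison behaves like a coin toss.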
The endpoint protocol P A of player A follows (the one for B is similar):  (T lottery ) on the ledger. If this is not possible (e.g., because one of the T bet is already spent), A aborts using P open . After T lottery is on the ledger, A reveals her secret and redeems her deposit with P open . In parallel, with P win she waits for the secret of B to be revealed, and then attempts to redeem the pot (T Awin ).
The fairness of this lottery has been established in [8]. This protocol can be generalised to N > 2 players [8,9], but in this case the deposit grows quadratically with N. The works [14,36] have proposed fair multiparty lotteries that require, respectively, zero and constant (≥ 0) deposit. More precisely, [36] devises two variants of the protocol: the first one only relies on SegWit, but requires each player to statically sign O(2^N) transactions; the second variant reduces the number of signatures to O(N^2), at the cost of introducing a custom opcode. Also the protocol in [14] assumes an extension of Bitcoin, i.e. the malleability of in fields, to obtain an ideal fair lottery with O(N) signatures per player (see Section 5).

Contingent payments
Assume a participant A who wants to pay vB to receive a value s which makes a public predicate p true, where p(s) can be verified efficiently. A seller B who knows such s is willing to reveal it to A, but only under the guarantee that he will be paid vB. Similarly, the buyer wants to pay only if guaranteed to obtain s.
A naïve attempt to implement this contract in Bitcoin is the following: A creates a transaction T such that T.out(ς, x) evaluates to true if and only if p(x) holds and ς is a signature of B. Hence, B can redeem vB from T by revealing s. In practice, though, this approach is of limited use, since it requires coding p in the Bitcoin scripting language, whose expressiveness is quite limited.
More general contingent payment contracts can be obtained by exploiting zero-knowledge proofs [13,24,35]. In this setting, the seller generates a fresh key k, and sends to the buyer the encryption e s = E k (s), together with the hash h k = H(k), and a zero-knowledge proof guaranteeing that such messages have the intended form. After verifying this proof, A is sure that B knows a preimage k′ of h k (by collision resistance, k′ = k) such that D k′ (e s ) satisfies the predicate p, and so she can buy the preimage k of h k with the naïve protocol, so obtaining the solution s by decrypting e s with k.
The transactions implementing this contract are displayed in Figure 12. The relAfter clause in T cp allows A to redeem vB if no solution is provided by the deadline t. The behaviour of the buyer A can be modelled as follows:

Upon receiving e s , h k and the proof z⁹, the buyer verifies z. If the verification succeeds, A puts T cp (h k ) on the blockchain. Then, she waits for T open , from which she can retrieve the key k, and so use the solution D get k (x) (e s ) in P . In this way, B can redeem vB. If B does not put T open , after t time units A can get her deposit back through T refund . The protocol of B is simple, so it is omitted.
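The data flow of the key-selling step can be sketched as follows. This is an illustration only: the zero-knowledge proof z is not modelled at all, H is instantiated with SHA-256, and E/D are a toy XOR stream cipher built from SHA-256, chosen so that the example needs only the standard library. The function names `seller_offer` and `buyer_recover` are hypothetical.

```python
import hashlib
import secrets


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (for illustration only)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def E(key: bytes, msg: bytes) -> bytes:
    """XOR the message with the keystream derived from the key."""
    return bytes(m ^ k for m, k in zip(msg, keystream(key, len(msg))))


D = E  # XOR encryption is its own inverse


def seller_offer(s: bytes):
    """B picks a fresh key k and sends e_s = E_k(s) and h_k = H(k) to A."""
    k = secrets.token_bytes(32)
    return k, E(k, s), H(k)


def buyer_recover(e_s: bytes, h_k: bytes, revealed_k: bytes, p):
    """A checks the key revealed on-chain against h_k, then decrypts e_s."""
    if H(revealed_k) != h_k:
        return None
    s = D(revealed_k, e_s)
    return s if p(s) else None
```

In the contract, the check `H(revealed_k) == h_k` is the part enforced by the output script of T cp , while the guarantee that the decrypted value satisfies p comes from the zero-knowledge proof, which this sketch replaces with a direct check.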

Research challenges and perspectives
Extensions to Bitcoin. The formal model of smart contracts we have proposed is based on the current mechanisms of Bitcoin; indeed, this makes it possible to translate endpoint protocols into actual implementations interacting with the Bitcoin blockchain. However, constraining smart contracts to perfectly adhere to Bitcoin greatly reduces their expressiveness. Indeed, the Bitcoin scripting language features a very limited set of operations¹⁰, and over the years many useful (and apparently harmless) opcodes have been disabled without a clear understanding of their alleged insecurity¹¹. This is the case, e.g., of bitwise logic operators, shift operators, integer multiplication, division and modulus.
For this reason some developers proposed to re-enable some disabled opcodes¹², and some works in the literature proposed extensions to the Bitcoin scripting language so as to enhance the expressiveness of smart contracts.
A possible extension is covenants [37], a mechanism that allows an output script to constrain the structure of the redeeming transaction. This is obtained through a new opcode, called CHECKOUTPUTVERIFY, which checks if a given out of the redeeming transaction matches a specific pattern. Covenants are also studied in [41], where they are implemented using the opcode CAT (currently disabled) and a new opcode CHECKSIGFROMSTACK which verifies a signature against an arbitrary bitstring on the stack. In both works, covenants can also be recursive, e.g. a covenant can check if the redeeming transaction contains itself. Recursive covenants make it possible to implement a state machine through a sequence of transactions that store its state.
Secure cash distribution with penalties [8,16,32] is a cryptographic primitive which allows a set of participants to make a deposit, and then provide inputs to a function whose evaluation determines how the deposits are distributed among the participants. This primitive guarantees that dishonest participants (who, e.g., abort the protocol after learning the value of the function) will pay a penalty to the honest participants. This primitive does not seem to be directly implementable in Bitcoin, but it becomes so by extending the scripting language with the opcode CHECKSIGFROMSTACK discussed above. Secure cash distribution with penalties can be instantiated to a variety of smart contracts, e.g. lotteries [8], poker [32], and contingent payments. The latter smart contract can also be obtained through the opcode CHECKKEYPAIRVERIFY in [24], which checks if the two top elements of the stack are a valid key pair.
Another new opcode, called MULTIINPUT [36], consumes from the stack a signature σ and a sequence of in values (T 1 , j 1 ) · · · (T n , j n ), with the following two effects: (i) it verifies the signature σ against the redeeming transaction T, neglecting T.in; (ii) it requires T.in to be equal to some of the T i . Exploiting this opcode, [36] devises a fair N-party lottery which requires zero deposit, and O(N^2) off-chain signed transactions. The first one of these effects can be alternatively obtained by extending, instead of the scripting language, the signature modifiers. More specifically, [14] introduces a new signature modifier, which can set to ⊥ all the inputs of a transaction (i.e., no input is signed). In this way they obtain a fair multi-party lottery with similar properties to the one in [36].
Another way to improve the expressiveness of smart contracts is to replace the Bitcoin scripting language, e.g. with the one in [40]. This would also make it possible to establish bounds on the computational resources needed to run scripts.
Unfortunately, none of the proposed extensions has yet been included in the main branch of the Bitcoin Core client, and nothing suggests that they will be considered in the near future. Indeed, the development of Bitcoin is extremely conservative, as any change to its protocol requires an overwhelming consensus of the miners. So far, new opcodes can only be empirically assessed through the Elements alpha project¹³, a testnet for experimenting with new Bitcoin features. A significant research challenge would be that of formally proving that new opcodes do not introduce vulnerabilities, exploitable e.g. by Denial-of-Service attacks. For instance, unconstrained uses of the opcode CAT may cause an exponential space blow-up in the verification of transactions.
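The blow-up mentioned for CAT can be seen on a toy stack machine. The sketch below is not a Bitcoin Script interpreter: it only assumes that DUP duplicates the top stack element and that CAT replaces the two top elements with their concatenation, and the function name `run_dup_cat` is hypothetical.

```python
def run_dup_cat(rounds: int, seed: bytes = b"\x00") -> int:
    """Run the script `DUP CAT` repeated `rounds` times; return the top size."""
    stack = [seed]
    for _ in range(rounds):
        stack.append(stack[-1])      # DUP: copy the top element
        b = stack.pop()
        a = stack.pop()
        stack.append(a + b)          # CAT: concatenate the two top elements
    return len(stack[-1])
```

Each `DUP CAT` pair doubles the size of the top element, so a script of 2n opcodes produces a value of 2^n bytes from a one-byte seed. For example, `run_dup_cat(10)` returns 1024. Bounding the size of stack elements is one way to rule this out.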
Formal methods for Bitcoin smart contracts. As witnessed in Section 4, designing secure smart contracts on Bitcoin is an error-prone task, similarly to designing secure cryptographic protocols. The reason lies in the fact that, to devise a secure contract, a designer has to anticipate any possible (mis-)behaviour of the other participants. The side effect is that endpoint protocols may be quite convoluted, as they must include compensations at all the points where something can go wrong. Therefore, tools to automate the analysis and verification of smart contracts may be of great help.
Recent works [7] propose to verify Bitcoin smart contracts by modelling the behaviour of participants as timed automata, and then using UPPAAL [15] to check properties against an attacker. This approach correctly captures the time constraints within the contracts. The downside is that encoding this UPPAAL model into an actual implementation with Bitcoin transactions is a complex task. Indeed, a designer without a deep knowledge of Bitcoin technicalities is likely to produce an UPPAAL model that cannot be encoded in Bitcoin. A relevant research challenge is to study specification languages for Bitcoin contracts (e.g. the one in Section 3), and techniques to automatically encode them in a model that can be verified by a model checker.
Remarkably, the verification of security properties of smart contracts requires dealing with non-trivial aspects, like temporal constraints and probabilities. This is the case, e.g., for the verification of fairness of lotteries (e.g. the one discussed in Section 4.7); a further problem is that fairness must hold against any adversarial strategy. It is not clear whether in this case it is sufficient to consider a "most powerful" adversary, as e.g. in the symbolic Dolev-Yao model. In case a contract is not secure against arbitrary (PTIME) adversaries, one would like to verify that, at least, it is secure against rational ones [27], which is a relevant research issue. Additional issues arise when considering more concrete models of the Bitcoin blockchain, with respect to the one in Section 2. This would require modelling forks, i.e. the possibility that a recent transaction is removed from the blockchain. This could happen with rational (but dishonest) miners [33].
DSLs for smart contracts. As witnessed in Section 4, modelling Bitcoin smart contracts is complex and error-prone. A possible way to address this complexity is to devise high-level domain-specific languages (DSLs) for contracts, to be compiled into low-level protocols (e.g., the ones in Section 3). Indeed, the recent proliferation of non-Turing complete DSLs for smart contracts [18,22,26] suggests that this is an emerging research direction.
A first proposal of a high-level language implemented on top of Bitcoin is Typecoin [23]. This language makes it possible to model the updates of a state machine as affine logic propositions. Users can "run" this machine by putting transactions on the Bitcoin blockchain. The security of the blockchain guarantees that only the legitimate updates of the machine can be triggered by users. A downside of this approach is that liveness is guaranteed only by assuming cooperation among the participants, i.e., a dishonest participant can make the others unable to complete an execution. Note instead that the smart contracts in Section 4 allow honest participants to terminate, regardless of the behaviour of the environment. In some cases, e.g. in the lottery in Section 4.7, abandoning the contract may even result in penalties (i.e., loss of the deposit paid upfront to stipulate the contract).
We use syntactic sugar for expressions, e.g. false denotes 1 = 0, true denotes 1 = 1, while e and e′ denotes if e then e′ else false.

Example 2. Recall the transactions in Figure 1. Let e 1 (the script expression within T 1 .out(2)) be defined as e 1 = absAfter t : versig k (x) and H(x′) = h, for h and t constants such that T 3 .absLock ≥ t. Further, let σ 2 and σ′ 2 (the witnesses within T 3 .wit(1)) be respectively sig k (T 3 ) and s, where sig k (T 3 ) is the signature of T 3 (excluding its witnesses) with key k, and s is a preimage of h, i.e. h = H(s).
\[
\begin{aligned}
 & \mathsf{ask}\ T_{Bcom}\ \mathsf{as}\ y.\,P' \;+\; \tau.P_{open} \\
P' &= \mathsf{let}\ h_B = \mathsf{get}_{hash}(y)\ \mathsf{in}\ \mathsf{if}\ h_B \neq h_A\ \mathsf{then}\ P_{pay} \mid P''\ \mathsf{else}\ P_{pay} \mid P_{open} \\
P'' &= B\,?\,x.\,P''' \;+\; \tau.P_{open} \\
P''' &= \mathsf{let}\ \sigma = \mathsf{sig}^{aa,1}_{k_A}(T_{lottery}(h_A, h_B))\ \mathsf{in} \\
 &\quad\ \mathsf{put}\ T_{lottery}(h_A, h_B)\{\mathsf{wit}(1) \mapsto \sigma\}\{\mathsf{wit}(2) \mapsto x\}.\,(P_{open} \mid P_{win}) \;+\; \tau.P_{open} \\
P_{pay} &= \mathsf{put}\ T_{Bpay}\{\mathsf{wit} \mapsto \bot\ \mathsf{sig}^{aa}_{k_A}(T_{Bpay})\} \\
P_{open} &= \mathsf{put}\ T_{Aopen}\{\mathsf{wit} \mapsto s_A\ \mathsf{sig}^{aa}_{k_A}(T_{Aopen})\} \\
P_{win} &= \mathsf{ask}\ T_{Bopen}\ \mathsf{as}\ z.\ \mathsf{put}\ T_{Awin}(h_A, h_B)\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_{Awin}(h_A, h_B))\ s_A\ \mathsf{get}_{secret}(z)\}
\end{aligned}
\]

Player A starts by putting T Acom on the blockchain, then she waits for B doing the same. If B does not cooperate, A can safely abort the protocol taking its τ.P open branch, so redeeming her deposit with T Aopen (as usual, here with τ we are modelling a timeout). If B commits his secret, A executes P′, extracting the hash h B of B's secret, and checking whether it is distinct from h A . If the hashes are found to be equal, A aborts the protocol using P open . Otherwise, A runs P″ | P pay . The P pay component attempts to redeem B's deposit, as soon as the absLock deadline of T Bpay expires, forcing B to timely reveal his secret. Instead, P″ proceeds with the lottery, asking B for his signature of T lottery . If B does not sign, A aborts using P open . Then, A runs P‴, finally putting the bets