Summing Up Smart Transitions

Some of the most significant high-level properties of currencies are the sums of certain account balances. Properties of such sums can ensure the integrity of currencies and transactions. For example, the sum of balances should not be changed by a transfer operation. Currencies manipulated by code present a verification challenge: to mathematically prove their integrity by reasoning about computer programs that operate over them, e.g., in Solidity. The ability to reason about sums is essential: even the simplest ERC-20 token standard of the Ethereum community provides a way to access the total supply of balances. Unfortunately, reasoning about code written against this interface is non-trivial: the number of addresses is unbounded, and establishing global invariants like the preservation of the sum of the balances by operations like transfer requires higher-order reasoning. In particular, automated reasoners do not provide ways to specify summations of arbitrary length. In this paper, we present a generalization of first-order logic which can express the unbounded sum of balances. We prove the decidability of one of our extensions and the undecidability of a slightly richer one. We introduce first-order encodings to automate reasoning over software transitions with summations. We demonstrate the applicability of our results by using SMT solvers and first-order provers for validating the correctness of common transitions in smart contracts.


Introduction
A basic challenge in smart contract verification is how to express the functional correctness of transactions, such as currency minting or transferring between accounts. Typically, the correctness of such a transaction can be verified by proving that the transaction leaves the sum of certain account balances unchanged. Consider for example the task of minting an unbounded number of tokens in the simplified ERC-20 token standard of the Ethereum community [31], as illustrated in Figure 1. This example deposits the minted amount (n) into the receiver's address (a), and we need to ensure that the mint operation only changed the balance of the receiver. To do so, in addition to (i) proving that the balance of the receiver has been increased by n, we also need to verify that (ii) the account balance of every user address a' different from a has not been changed during the mint operation and that (iii) the sum of all balances changed exactly by the amount that was minted. The validity of these three requirements (i)-(iii), formulated as the post-conditions of Figure 1, implies its functional correctness. Surprisingly, proving formulas similar to the post-conditions of Figure 1 is challenging for state-of-the-art automated reasoners, such as SMT solvers [7,6,9] and first-order provers [18,10,33]: it requires reasoning that links local changes of the receiver (a) with a global state capturing the sum of all balances, as well as constructing that global state as an aggregate of an unbounded but finite number of Address balances. Moreover, our encoding of the problem uses discrete coins that are minted and deposited, whose number is unbounded but finite as well.
In this paper we address verification challenges of software transactions with aggregate properties, such as preservation of sums by transitions that manipulate low-level, individual entities. Such properties are best expressed in higher-order logic, hindering the use of existing automated reasoners for proving them. To overcome such a reasoning limitation, we introduce Sum Logic (SL) as a generalization of first-order logic, in particular of Presburger arithmetic. Previous works [20,30,11] have also introduced extensions of first-order logic with aggregates by counting quantifiers or generalized quantifiers. In Sum Logic (SL) we only consider the special case of integer sums over uninterpreted functions, allowing us to formalize SL properties with and about unbounded sums, in particular sums of account balances, without higher-order operations (Section 3). We prove the decidability of one of our SL extensions and the undecidability of a slightly richer one (Section 4). Given previous results [20], our undecidability result is not surprising. In contrast, what may be unexpected is our decidability result and the fact that we can use our first-order fragment for a convenient and practical new way to verify the correctness of smart contracts.
We further introduce first-order encodings which enable automated reasoning over software transactions with summations in SL (Section 5). Unlike [5], where SMT-specific extensions supporting higher-order reasoning have been introduced, the logical encodings we propose allow one to use existing reasoners without any modification. We are not restricted to SMT reasoning, but can also leverage generic automated reasoners, such as first-order theorem provers, supporting first-order logic. We believe our results ease applying automated reasoning to smart contract verification even for non-experts.
We demonstrate the practical applicability of our results by using SMT solvers and first-order provers for validating the correctness of common financial transitions appearing in smart contracts (Section 6). We refer to these transitions as smart transitions. We encode SL into pure first-order logic by adding another sort that represents the tokens of the crypto-currency themselves (which we dub "coins").
Although the encodings of Section 5 do not translate to our decidable SL fragment from Section 4, our experimental results show that automated reasoning engines can handle them consistently and fast. The decidability results of Section 4 set the boundaries for what one can expect to achieve, while our experiments from Section 6 demonstrate that the unknown middle-ground can still be automated.
While our work is mainly motivated by smart contract verification, our results can be used for arbitrary software transactions implementing sum/aggregate properties. Further, when compared to the smart contract verification framework of [32], we note that we are not restricted to proving the correctness of smart contracts as finite-state machines, but can deal with semantic properties expressing financial transactions in smart contracts, such as currency minting/transfers.
While ghost variable approaches [13] can reason about changes to the global state (the sum), our approach allows the verifier to specify only the local changes and automatically prove the impact on the global state.
Contributions. In summary, this paper makes the following contributions:
- We present a generalization of Presburger arithmetic (SL, in Section 3) that allows expressing properties about summations. We show how we can formalize verification problems of smart contracts in SL.
- We discuss the decidability problem of checking validity of SL formulas (Section 4): we prove that it is undecidable in the general case, but also that there exists a small decidable fragment.
- We show different encodings of SL to first-order logic (Section 5). To this end, we consider theory-specific reasoning and variations of SL, for example by replacing non-negative integer reasoning with term algebra properties.
- We evaluate our results with SMT solvers and first-order theorem provers, by using 31 new benchmarks encoding smart transitions and their properties (Section 6). Our experiments demonstrate the applicability of our results within automated reasoning, in a fully automated manner, without any user guidance.

Preliminaries
We consider many-sorted first-order logic (FOL) with equality, defined in the standard way. The equality symbol is denoted by ≈.
We denote by STRUCT [Σ] the set of all structures for the vocabulary Σ. A structure A ∈ STRUCT [Σ] is a pair (D, I), where for each sort s, its domain in A is D(s), and for each symbol S, its interpretation in A is I(S). Note that models of a formula ϕ over a vocabulary Σ are structures A ∈ STRUCT [Σ].
A first-order theory is a set of first-order formulas closed under logical consequence. We will consider the first-order theory of the natural numbers with addition. This is Presburger arithmetic (PA), which is decidable [26]. We write N to denote the set of natural numbers. We consider 0 ∈ N and write N⁺ to explicitly exclude 0 from N. The vocabulary of PA is $\Sigma_{\text{Presburger}} = \{0, 1, c_1, \ldots, c_l, +^2\}$, with all constants $0, 1, c_i$ of sort Nat. A structure A = (D, I) ∈ STRUCT[$\Sigma_{\text{Presburger}}$] is called a Standard Model of Arithmetic when D(Nat) = N and $+^2$ is interpreted as the standard binary addition function + over the naturals. The vocabulary $\Sigma_{\text{Presburger}}$ can be extended with a total order relation, yielding $\Sigma^{*}_{\text{Presburger}} = \{0, 1, +^2, \leq^2\}$, where $\leq^2$ is interpreted as the binary relation ≤ in Standard Models of Arithmetic.

Sum Logic (SL)
We now define Sum Logic (SL) as a generalization of Presburger arithmetic, extending Presburger arithmetic with unbounded sums. SL is motivated by applications of financial transactions over cryptocurrencies in smart contracts. Smart contracts are decentralized computer programs executed on a blockchain-based system, as explained in [27]. Among other tasks, they automate financial transactions such as transferring and minting money. We refer to these transactions as smart transitions. The aim of this paper and SL in particular is to express and reason about the post-conditions of smart transitions similar to Figure 1.
SL expresses smart transition relations among sums of accounts of various kinds, e.g., at different banks, times, etc. Each such kind, j, is modeled by an uninterpreted function symbol, $b_j$, where $b_j(a)$ denotes the balance of a's account of kind j, and a constant symbol $s_j$, which denotes the sum of all outputs of $b_j$. As such, our SL generalizes Presburger arithmetic with (i) a sort Address corresponding to the (unbounded) set of account addresses; (ii) balance functions $b_j$ mapping account addresses from Address to account values of sort Nat; and (iii) sum constants $s_j$ of sort Nat capturing the total sum of all account balances represented by $b_j$. Formally, the vocabulary of SL is defined as follows.
- (Addresses) The constants $a_1, \ldots, a_l$ are of sort Address;
- (Balance functions) $b_1^1, \ldots, b_m^1$ are unary function symbols from Address to Nat;
- (Constants and Sums) The constants $c_1, \ldots, c_d$, $s_1, \ldots, s_m$ and 0, 1 are of sort Nat;
- $+^2$ is a binary function Nat × Nat → Nat;
- $\leq^2$ is a binary relation over Nat × Nat.
[Table 1: ERC-20 Token Standard]
In what follows, when the cardinalities in an SL vocabulary are clear from context, we simply write Σ instead of $\Sigma^{l,m,d}_{+,\leq}$. Further, by $\Sigma^{l,m,d}_{\not{+},\not{\leq}}$ we denote the sub-vocabulary where the crossed-out symbols are not available. Note that even when addition is not available, we still allow writing numerals larger than 1.
We restrict ourselves to universal sentences over an SL vocabulary, with quantification only over the Address sort.
We now extend the Tarskian semantics of first-order logic to ensure that the sum constants of an SL vocabulary (s 1 , . . . , s m ) are equal to the sum of outputs of their associated balance functions (b j for each s j ) over the respective entire domains of sort Address.
Let Σ be an SL vocabulary. An SL structure A = (D, I) ∈ STRUCT[Σ] representing a model for an SL formula ϕ is called an SL model iff, for every $1 \leq j \leq m$, $I(s_j) = \sum_{a \in D(\text{Address})} I(b_j)(a)$. (Sum Property) We write $A \models_{SL} \phi$ to mean that A is an SL model of ϕ. When it is clear from context, we simply write $A \models \phi$.
Example 1 (Encoding ERC-20 in SL). As a use case of SL, we showcase the encoding of the ERC-20 token standard of the Ethereum community [31] in SL.
To this end, we consider an SL vocabulary $\Sigma^{l,2,d}$. We respectively denote the balance functions and their associated sums as b, b', s, s' in the SL structure over $\Sigma^{l,2,d}$. The resulting instance of SL can then be used to encode ERC-20 operations/smart transitions as SL formulas, as shown in Table 1. Using this encoding, the post-condition of Figure 1 is expressed as the SL formula formalizing the correctness of the smart transition of minting n tokens in Figure 1. In the applied verification examples in Section 6, rather than verifying the low-level implementation of built-in functions such as $\text{mint}_n$, we assume their correctness by including suitable axioms.
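To make the Sum Property and the three mint post-conditions (i)-(iii) concrete, the following is a minimal sketch over a finite Address domain, with balances stored in Python dictionaries. The names `is_sl_model`, `mint`, and `mint_postconditions` are illustrative assumptions and are not part of ERC-20 or of the formalization above.

```python
from typing import Dict

Balances = Dict[str, int]  # finite Address domain -> Nat


def is_sl_model(bal: Balances, total: int) -> bool:
    """Sum Property: the sum constant equals the sum of all balances."""
    return all(v >= 0 for v in bal.values()) and total == sum(bal.values())


def mint(bal: Balances, a: str, n: int) -> Balances:
    """Return the balances after depositing n freshly minted tokens into a."""
    new_bal = dict(bal)
    new_bal[a] = new_bal.get(a, 0) + n
    return new_bal


def mint_postconditions(old: Balances, new: Balances, a: str, n: int) -> bool:
    """Check (i) receiver grew by n, (ii) others unchanged, (iii) sum grew by n."""
    receiver_ok = new.get(a, 0) == old.get(a, 0) + n
    others_ok = all(
        new.get(x, 0) == old.get(x, 0) for x in set(old) | set(new) if x != a
    )
    sum_ok = sum(new.values()) == sum(old.values()) + n
    return receiver_ok and others_ok and sum_ok


old = {"alice": 3, "bob": 5}
new = mint(old, "alice", 4)
```

On a finite domain, (iii) follows from (i) and (ii) by direct computation; the difficulty addressed in this paper is proving that implication for an unbounded number of addresses.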

Decidability of SL
We consider the decidability problem of verifying formulas in SL. We show that when there are several function symbols $b_j$ to sum over, the satisfiability problem for SL becomes undecidable. We first present, however, a useful decidable fragment of SL.

A Decidable Fragment of SL
We prove decidability for a fragment of SL, which we call the (l, 1, d)-FRAG fragment of SL (Theorem 4). For doing so, we reduce the fragment to Presburger arithmetic, by using regular Presburger constructs to encode SL extensions, that is, the uninterpreted functions and sum constants of SL. The first step of our reduction proof is to consider distinct models, which are models where the Address constants $a_i$ represent distinct elements in the domain D(Address). While this restriction is somewhat unnatural, we show that for each vocabulary and formula that has a model, there exists an equisatisfiable formula over a different vocabulary that has a distinct model (Theorem 1). The crux of our decidability proof is then proving that (l, 1, d)-FRAG has small Address space: given a formula ϕ, if it is satisfiable, then there exists a model where |D(Address)| ≤ κ(|ϕ|), where |ϕ| is the length of ϕ and κ(.) is some computable function (Theorem 3).
Distinct Models. An SL structure A is considered distinct when the l Address constants represent l distinct elements in D(Address), that is, $|\{I(a_1), \ldots, I(a_l)\}| = l$.
Since each SL model induces an equivalence relation over the Address constants, we consider partitions P over $\{a_1, \ldots, a_l\}$. For each possible partition P we define a transformation of terms and formulas $T_P$ that substitutes equivalent Address constants with a single Address constant. The resulting formulas are defined over a vocabulary that has |P| Address constants. We show that given an SL formula ϕ, if ϕ has a model, we can always find a partition P such that each of its classes corresponds to an equivalence class induced by that model.
Theorem 1 (Distinct Models). Let ϕ be an SL formula over Σ. Then ϕ has a model iff there exists a partition P of $\{a_1, \ldots, a_l\}$ such that $T_P(\phi)$ has a distinct model.
Small Address Space In order to construct a reduction to Presburger arithmetic, we bound the size of the Address sort. For a fragment of SL to be decidable, we therefore need a way to bound its models upfront. We formalize this requirement as follows.
Definition 2 (Small Address Space). Let FRAG be some fragment of SL over vocabulary $\Sigma = \Sigma^{l,m,d}_{+,\leq}$. FRAG is said to have small Address space if there exists a computable function $\kappa_\Sigma(.)$, such that for any SL formula ϕ ∈ FRAG, ϕ has a distinct model iff ϕ has a distinct model A = (D, I) with small Address space, where $|D(\text{Address})| \leq \kappa_\Sigma(|\phi|)$.
We call κ Σ (.) the bound function of FRAG; when the vocabulary is clear from context we simply write κ(.).
One instance of a fragment (or rather, family of fragments) that satisfies this property is the (l, 1, d)-FRAG fragment: the simple case of a single uninterpreted "balance" function (and its associated sum constant), further restricted by removing the binary function + and the binary relation ≤. Therefore, we derive the following theorem: Theorem 2 (Small Address Space of (l, 1, d)-FRAG).
An attempt to trivially extend Theorem 2 for a fragment of SL with two balance functions falls apart in a few places, but most importantly when comparing balances to the sum of a different balance function. In Section 4.2 we show that these comparisons are essential for proving our undecidability result in SL.

Presburger Reduction
For showing decidability of some FRAG fragment of SL, we describe a Turing reduction to pure Presburger arithmetic. We introduce a transformation τ(.) of formulas in SL into formulas in Presburger arithmetic. It maps universal quantifiers to conjunctions over the bounded set of addresses, and sums to explicit addition of all balances. In addition, we define an auxiliary formula η(ϕ), which ensures only valid addresses are considered, and that invalid addresses have zero balances. The formal definitions of τ(.) and η(ϕ) can be found in Appendix A.
By relying on the properties of distinctness and small Address space we get the following results.

Theorem 3 (Presburger Reduction). An SL formula ϕ has a distinct SL model with small Address space iff τ(ϕ) ∧ η(ϕ) has a Standard Model of Arithmetic.
Theorem 4 (SL Decidability). Let FRAG be a fragment of SL that has small Address space, as defined in Definition 2. Then, FRAG is decidable.
Proof (Theorem 4). Let ϕ be a formula in FRAG. Then ϕ has an SL model iff for some partition P of {a 1 , . . . , a l }, T P (ϕ) has a distinct SL model. For any P , the formula T P (ϕ) is in FRAG, therefore T P (ϕ) has a distinct SL model iff it has a distinct SL model with small Address space.
From Theorem 3, we get that for any P, $\phi_P \triangleq T_P(\phi)$ has a distinct SL model iff $\tau(\phi_P) \wedge \eta(\phi_P)$ has a Standard Model of Arithmetic. By using the PA decision procedure as an oracle, we obtain the following decision procedure for a FRAG formula ϕ:
- For each possible partition P of $\{a_1, \ldots, a_l\}$, let $\phi_P = T_P(\phi)$;
- Using a PA decision procedure, check whether $\tau(\phi_P) \wedge \eta(\phi_P)$ has a model, for each P;
- If a model for some partition P was found, the formula $\phi_P$ has a distinct SL model, and therefore ϕ has an SL model;
- Otherwise, there is no distinct SL model for any partition P, and therefore there is no SL model for ϕ.
Remark 1. Our decision procedure for Theorem 4 requires $B_l$ Presburger queries, where $B_l$ is the Bell number counting all possible partitions of a set of size l.
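The outer loop of this decision procedure can be sketched as follows. This is only an illustration under stated assumptions: the callback `has_distinct_model` is a hypothetical stand-in for applying $T_P$ and querying a PA decision procedure on the transformed formula, which is not implemented here; `partitions`, `bell`, and `decide` are names chosen for this sketch.

```python
from typing import Callable, Iterator, List, Sequence

Partition = List[List[str]]


def partitions(items: Sequence[str]) -> Iterator[Partition]:
    """Yield every partition of `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller


def bell(n: int) -> int:
    """Bell number B_n, computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        nxt = [row[-1]]
        for v in row:
            nxt.append(nxt[-1] + v)
        row = nxt
    return row[0]


def decide(constants: Sequence[str],
           has_distinct_model: Callable[[Partition], bool]) -> bool:
    """Return True iff the (hypothetical) oracle accepts some partition."""
    return any(has_distinct_model(p) for p in partitions(constants))
```

Since `any` stops at the first accepting partition, the number of oracle calls is at most $B_l$; the bound is reached when the formula has no SL model.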
Using Theorem 4 and Theorem 2, we then obtain the following result: for any l and d, the (l, 1, d)-FRAG fragment of SL is decidable.

SL Undecidability
We now show that simple extensions of our decidable (l, 1, d)-FRAG fragment lose decidability (Theorem 5). For doing so, we encode the halting problem of a two-counter machine using SL with 3 balance functions, thereby proving that the resulting SL fragment is undecidable. Consider a two-counter machine, whose transitions are encoded by the Presburger formula $\pi(c_1, c_2, p, c_1', c_2', p')$ with 6 free variables: 2 for each of the three registers, one of which is the program counter (pc). We assume w.l.o.g. that all three registers are within N⁺, allowing us to use addresses with a zero balance as a special "separator". In addition, we assume that the program counter is 1 at the start of the execution, and that there exists a single halting statement at line H. That is, the two-counter machine halts iff the pc is equal to H.

Reduction Setting
We have 4 Address elements for each time-step: 3 of them hold one register each, and one separates consecutive groups of Address elements (see Table 2). We have 3 uninterpreted functions from Address to Nat ("balances"). For readability we denote these functions as c, l, g (instead of $b_1, b_2, b_3$) and their respective sums as $s_c, s_l, s_g$:
1. Function c: Cardinality function, used to force size constraints. We set its value for all addresses to be 1, and therefore the number of addresses is $s_c$.
2. Function l: Labeling function, to order the time-steps. We choose one element to have a maximal value of $s_c - 1$ and ensure that l is injective. This means that the values of l are exactly the distinct values in $[0, s_c - 1]$.
3. Function g: General purpose function, which holds either one of the registers or 0 to mark the Address element as a separating one.
[Table 2: columns Address, l(Address), c(Address), g(Address)]
Each group representing a time-step consists of 4 Address elements, ordered as follows:
1. First, a separating Address element x (where g(x) is 0).
2. Then, the two general-purpose counters.
3. Lastly, the program counter.
In addition we have 2 Address constants, $a_0$ and $a_1$, which represent the pc value at the start and at the end of the execution. The element $a_1$ also holds the maximal value of l, that is, $l(a_1) + 1 \approx s_c$. Further, $a_0$ holds the fourth-minimal value, since it is the last element of the first group, and each group has four elements.
Formalization Using a Two-Counter Machine We now formalize our reduction, proving undecidability of SL.
(i) We impose an injective labeling.
(ii) We next formalize properties over the program counter pc. The Address constant that represents the program counter pc value of the last time-step is set to have the maximal labeling, that is, $l(a_1) + 1 \approx s_c$. Further, the Address constant that represents the pc value of the first time-step has the fourth labeling. Finally, the first and last values of the program counter are respectively 1 and H.
We express cardinality constraints ensuring that there are as many Address elements as the labeling of the last Address constant ($a_1$) + 1.
We encode the transitions of the two-counter machine as follows. For every 8 Address elements, if they represent two sequential time-steps, then the formula for the transitions of the two-counter machine is valid for the registers they hold. The premise of this implication is the conjunction $F_1 \wedge F_2 \wedge F_3$, which expresses that $x_1, \ldots, x_8$ are two sequential time-steps. In particular, $F_1$, $F_2$ and $F_3$ formalize that $x_1, \ldots, x_8$ have sequential labeling, starting with one zero-valued Address element ("separator") and continuing with 3 non-zero elements.
Based on the above formalization, the formula $\phi = \phi_1 \wedge \cdots \wedge \phi_6$ is satisfiable iff the two-counter machine halts within a finite number of time-steps (and the exact number would be given by $s_c / 4$). Since the halting problem for two-counter machines is undecidable, our SL, already with 3 uninterpreted functions and their associated sums, is also undecidable.
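The layout of Address elements used by this reduction can be illustrated on a concrete run. The sketch below lays out a hand-written trace of (counter 1, counter 2, pc) triples, all at least 1, and checks the structural constraints on l, c and g described above. It does not encode the transition formula π; the names `layout` and `check` and the example trace are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]  # one Address element with its l, c and g values


def layout(trace: List[Tuple[int, int, int]]) -> List[Row]:
    """Per time-step: a separator, then the two counters, then the pc."""
    rows: List[Row] = []
    for c1, c2, pc in trace:
        for g in (0, c1, c2, pc):
            rows.append({"l": len(rows), "c": 1, "g": g})
    return rows


def check(rows: List[Row], halt_line: int) -> bool:
    """Check the structural constraints of the reduction on a layout."""
    s_c = sum(r["c"] for r in rows)        # sum of c = number of addresses
    labels = [r["l"] for r in rows]
    a0, a1 = rows[3], rows[-1]             # pc of the first and last time-step
    return (
        len(set(labels)) == len(labels)    # l is injective
        and a1["l"] + 1 == s_c             # a1 carries the maximal label
        and a0["l"] == 3                   # a0 carries the fourth-minimal label
        and a0["g"] == 1                   # pc starts at 1
        and a1["g"] == halt_line           # pc ends at the halting line H
        and all((r["g"] == 0) == (r["l"] % 4 == 0) for r in rows)  # separators
    )


# A hypothetical three-step run that halts at line H = 3.
trace = [(1, 1, 1), (2, 1, 2), (2, 1, 3)]
rows = layout(trace)
steps = sum(r["c"] for r in rows) // 4     # number of time-steps, s_c / 4
```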
Theorem 5. For any l ≥ 2, m ≥ 3 and d, any fragment of SL over $\Sigma^{l,m,d}_{+,\leq}$ is undecidable.
Remark 2. Note that in the above formalization the only use of associated sums comes from expressing the size of the set of Address elements. As for our uninterpreted function c(.) we have ∀x.c(x) ≈ 1, its sum s c is thus the amount of addresses. Hence, we can encode the halting problem for two-counter machines in an almost identical way to the encoding presented here, using a generalization of PA with two uninterpreted functions for l(.) and g(.), and a size operation replacing c(.) and its associated sum.

SL Encodings of Smart Transitions
The definition of SL models in Sections 3 and 4 ensured that the summation constants s j were respectively equal to the actual summation of all balances b j (.). In this section, we address the challenge to formalize relations between s j and b j (.) in a way that the resulting encodings can be expressed in the logical frameworks of automated reasoners, in particular of SMT solvers and first-order theorem provers.
In what follows, we consider a single transaction or one time-step of multiple transactions over s j , b j (.). We refer to such transitions as smart transitions. Smart transitions are common in smart contracts, expressing for example the minting and/or transferring of some coins, as evidenced in Figure 1 and discussed later.
Based on Section 3, our smart transitions are encoded in the $\Sigma^{l,2,d}$ fragment of SL. Note, however, that neither decidability nor undecidability of this fragment is implied by Theorem 4 or Theorem 5. In this section, we show that our SL encoding of smart transitions is expressible in first-order logic. We first introduce a sound, implicit SL encoding, by "hiding" away sum semantics and using invariant relations over smart transitions (Section 5.1). This encoding does not allow us to directly assert the values of any balance or sum, but we can prove that this implicit encoding is complete, relative to a translation function (Section 5.2).
By further restricting our implicit SL encoding to this relatively complete setting, we consider counting properties to explicitly reason with balances and directly express verification conditions with unbounded sums on $s_j$ and $b_j(.)$. This is shown in Section 5.3, and we evaluate different variants of the explicit SL encoding in Section 6, showcasing their practical use and relevance within automated reasoning.
To directly present our SL encodings and results in the smart contract domain, in what follows we rely on the notation of Table 1. As such, we respectively denote b, b' by old-bal, new-bal and write old-sum, new-sum for s, s'. As already discussed in Figure 1, the prefixes old- and new- refer to the entire state expressed in the encoding before and after the smart transition. We explicitly indicate this state using old-world, new-world respectively. The non-prefixed versions bal and sum are stand-ins for both the old- and new- versions. Figure 2 illustrates our setting for the smart transition of minting one coin.
With this SL notation at hand, we are thus interested in finding first-order formulas that verify smart transition relations between old-sum and new-sum, given the relation between old-bal and new-bal. In this paper, we mainly focus on the smart transitions of minting and transferring money, yet our results could be used in the context of other financial transactions/software transitions over unbounded sums.
Example 2. In the case of minting n coins in Figure 1, we require formulas that (a) describe the state before the transition (the old-world, thus pre-condition), (b) formalize the transition (the relation between old-bal and new-bal; (i)-(ii) in Figure 1) and (c) imply the consequences for the new-world ((iii) in Figure 1). These formulas verify that minting and depositing n coins into some address result in an increase of the sum by n, that is new-sum = old-sum + n, as expressed in the functional correctness formula (1) of Figure 1.

SL Encoding using Implicit Balances and Sums
The first encoding we present is a set of first-order formulas with equality over sorts {Coin, Address}. No additional theories are considered. The Coin sort represents money, where one coin is one unit of money. The Address sort represents the account addresses as before. As a consequence, balance functions and sum constants only exist implicitly in this encoding. As such, the property $\text{sum} = \sum_{a \in \text{Address}} \text{bal}(a)$ cannot be directly expressed in this encoding. Instead, we formalize this property by using so-called smart invariant relations between two predicates has-coin and active over coins c ∈ Coin and a ∈ Address, as follows.

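Definition 3 (Smart Invariants). Let has-coin ⊆ Address × Coin and consider active ⊆ Coin. A smart invariant of the pair (has-coin, active), written inv(has-coin, active), is the conjunction of the following three formulas.
1. Only active coins c can be owned by an address a: ∀c : Coin. ∀a : Address. has-coin(a, c) → active(c). (I1)
2. Every active coin c belongs to some address a: ∀c : Coin. active(c) → ∃a : Address. has-coin(a, c). (I2)
3. Every coin c belongs to at most one address a: ∀c : Coin. ∀a, a' : Address. has-coin(a, c) ∧ has-coin(a', c) → a ≈ a'. (I3)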
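On finite sets, the smart invariants can be checked directly. The sketch below is illustrative only: it reads the invariants as "owned coins are active", "active coins have an owner" and, as an assumption consistent with the intuition that an active coin is owned by precisely one address, "no coin has two owners". The names `smart_invariant` and `balance` are chosen for this example.

```python
from typing import Set, Tuple

HasCoin = Set[Tuple[str, str]]  # pairs (address, coin)


def smart_invariant(has_coin: HasCoin, active: Set[str]) -> bool:
    """Check the three smart invariants on finite relations."""
    owned = {c for (_, c) in has_coin}
    only_active_owned = owned <= active              # owned coins are active
    every_active_owned = active <= owned             # active coins have an owner
    at_most_one_owner = len(owned) == len(has_coin)  # no coin has two owners
    return only_active_owned and every_active_owned and at_most_one_owner


def balance(has_coin: HasCoin, a: str) -> int:
    """Implicit balance of a: the number of coins a owns."""
    return sum(1 for (x, _) in has_coin if x == a)


has_coin = {("a1", "c1"), ("a1", "c2"), ("a2", "c3")}
active = {"c1", "c2", "c3"}
total = balance(has_coin, "a1") + balance(has_coin, "a2")
```

When the invariants hold, every active coin is counted in exactly one balance, so the number of active coins equals the sum of the implicit balances.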
Intuitively, our smart invariants ensure that a coin c is active iff it is owned by precisely one address a. Our smart invariants imply the soundness of our implicit SL encoding, as follows.
We say that a smart transition preserves smart invariants when inv(old-has-coin, old-active) ⟹ inv(new-has-coin, new-active), where old-has-coin, old-active and new-has-coin, new-active respectively denote the predicates has-coin, active in the states before and after the smart transition. Based on the soundness of our implicit SL encoding, we formalize smart transitions preserving smart invariants as first-order formulas. We only discuss smart transitions implementing minting n coins here, but other transitions, such as transferring coins, can be handled in a similar manner. We first focus on minting a single coin, as follows.
Definition 4 (Transition mint 1 (a, c)). Let there be c ∈ Coin, a ∈ Address. The transition mint 1 (a, c) activates coin c and deposits it into address a.
1. The coin c was inactive before and is active now: ¬old-active(c) ∧ new-active(c). (M1)
2. The address a owns the new coin c: new-has-coin(a, c) ∧ ∀a' : Address. ¬old-has-coin(a', c).
3. Everything else stays the same.
By minting one coin, the balance of precisely one address, that is, of the receiver's address, increases by one, whereas all other balances remain unchanged. Thus, the expected impact on the sum of account balances is also an increase by one, as illustrated in Figure 2. The following theorem proves that the definition of mint 1 is sound. That is, mint 1 affects the implicit balances and sums as expected and hence mint 1 preserves smart invariants.
Theorem 7 (Soundness of mint 1 (a, c)). Let c ∈ Coin, a ∈ Address such that mint 1 (a, c). Consider balance functions old-bal, new-bal : Address → N, non-negative integer constants old-sum, new-sum, unary predicates old-active, new-active ⊆ Coin and binary predicates old-has-coin, new-has-coin ⊆ Address × Coin such that |old-active| = old-sum, |new-active| = new-sum, and for every address a', we have old-bal(a') = |{c' ∈ Coin | old-has-coin(a', c')}| and new-bal(a') = |{c' ∈ Coin | new-has-coin(a', c')}|. Then new-sum = old-sum + 1, new-bal(a) = old-bal(a) + 1, and new-bal(a') = old-bal(a') for every address a' different from a.
Smart transitions minting an arbitrary number of n coins, as in our Figure 1, are then realized by repeating the mint 1 transition n times. Based on the soundness of mint 1, ensuring that mint 1 preserves smart invariants, we conclude by induction that n repetitions of mint 1, that is, minting n coins, also preserve smart invariants. The precise definition of mint n together with the soundness result is stated in Appendix B.2.
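The effect of mint 1 and of its n-fold repetition on the implicit balances can be sketched on finite sets as follows. The function names `mint_1`, `mint_n` and `bal`, and the representation of has-coin and active as Python sets, are assumptions of this illustration and not the definitions of Appendix B.2.

```python
from typing import Iterable, Set, Tuple

HasCoin = Set[Tuple[str, str]]  # pairs (address, coin)


def mint_1(has_coin: HasCoin, active: Set[str], a: str, c: str):
    """Activate coin c and deposit it into address a; nothing else changes."""
    if c in active or any(coin == c for (_, coin) in has_coin):
        raise ValueError("coin must be inactive and unowned before minting")
    return has_coin | {(a, c)}, active | {c}


def mint_n(has_coin: HasCoin, active: Set[str], a: str, coins: Iterable[str]):
    """Mint several coins into a by repeating mint_1 once per coin."""
    for c in coins:
        has_coin, active = mint_1(has_coin, active, a, c)
    return has_coin, active


def bal(has_coin: HasCoin, a: str) -> int:
    """Implicit balance of a: the number of coins a owns."""
    return sum(1 for (x, _) in has_coin if x == a)


old_has, old_act = {("a1", "c1")}, {"c1"}
new_has, new_act = mint_n(old_has, old_act, "a2", ["c2", "c3"])
```

Each call to `mint_1` returns new sets rather than mutating its arguments, so the old world and the new world remain available side by side for comparison.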

Completeness Relative to a Translation Function
Smart invariants provide sufficient conditions for ensuring soundness of our SL encodings (Theorem 6). We next show that, under additional constraints, smart invariants are also necessary conditions, establishing thus (relative) completeness of our encodings.
A straightforward extension of Theorem 6, however, does not hold. Namely, under only the assumptions of Theorem 6, the following formula is not valid: $\text{sum} = \sum_{a \in \text{Address}} \text{bal}(a)$ ⇐⇒ inv(has-coin, active).
As a counterexample, assume (i) sum = |active|, (ii) for every a ∈ Address it holds that bal(a) = |{c ∈ Coin | (a, c) ∈ has-coin}|, that is, the assumptions of Theorem 6. Further, let (iii) the smart invariants inv(has-coin, active) hold for all but the coins $c_1, c_2$ ∈ Coin and all but the addresses $a_1, a_2$ ∈ Address. We also assume that (iv) $c_1$ is active but not owned by any address and (v) $c_2$ is active and owned by the two distinct addresses $a_1, a_2$. We thus have $\text{sum} = \sum_{a \in \text{Address}} \text{bal}(a)$, yet inv(has-coin, active) does not hold.
To ensure completeness of our encodings, we therefore introduce a translation function f that restricts the set $F \triangleq 2^{\text{Address} \times \text{Coin}} \times 2^{\text{Coin}}$ of (has-coin, active) pairs, as follows. We exclude from F those pairs (has-coin, active) that violate smart invariants by both (i) not satisfying (I2), as (I2) ensures that there are not too many active coins, and by (ii) not satisfying at least one of (I1) and (I3), as (I1) and (I3) ensure that there are not too few active coins. The required translation function f (Appendix B.3) now assigns every pair (bal, sum) the set of all (has-coin, active) ∈ F that satisfy sum = |active|, bal(a) = |{c ∈ Coin | has-coin(a, c)}| for every address a and have not been excluded.
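The counterexample can be replayed on a two-address, two-coin instance. The sketch below is illustrative; the variable names are chosen for this example only. It computes the implicit sum and balances and shows that they agree even though one active coin is unowned and another is owned twice.

```python
addresses = ["a1", "a2"]
active = {"c1", "c2"}
# c1 is active but unowned; c2 is owned by both a1 and a2.
has_coin = {("a1", "c2"), ("a2", "c2")}

total = len(active)                                    # sum = |active|
bal = {a: sum(1 for (x, _) in has_coin if x == a) for a in addresses}

owned = {c for (_, c) in has_coin}
every_active_coin_owned = active <= owned              # fails because of c1
no_coin_owned_twice = len(owned) == len(has_coin)      # fails because of c2
sum_matches = total == sum(bal.values())               # 2 == 1 + 1
```

The missing owner of c1 and the extra owner of c2 cancel out in the count, which is why the sum equality alone does not imply the invariants.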

SL Encodings using Explicit Balances and Sums
We now restrict our SL encoding from Section 5.1 to explicitly reason with balance functions during smart transitions. We do so by expressing our translation function f from Section 5.2 in first-order logic. We now use the summation constant sum ∈ N and the balance function bal : Address → N in our SL encoding. In particular, we use our smart invariants inv(has-coin, active) in this explicit SL encoding together with two additional axioms (Ax1, Ax2), ensuring that sum = |active| and bal(a) = |{c ∈ Coin | has-coin(a, c)}| for all a ∈ Address.
To formalize the additional properties, we introduce two counting mechanisms in our SL encoding. The first one is a bijective function count : Coin → N⁺ and the second one is a function idx : Address × Coin → N⁺, where idx(a, .) : Coin → N⁺ is bijective for every a ∈ Address. To ensure that count and idx(a, .) count coins, we impose the following two properties:
∀c : Coin. active(c) ⇐⇒ count(c) ≤ sum, (Ax1)
∀c : Coin. ∀a : Address. has-coin(a, c) ⇐⇒ idx(a, c) ≤ bal(a). (Ax2)
Figure 3 illustrates our revised SL encoding for our smart transition mint 1. We next ensure soundness of our resulting explicit encoding for summation, as follows.
In particular, we have sum = Σ_{a ∈ Address} bal(a).
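The counting mechanism can be illustrated on a finite toy model. This is a sketch under assumptions: the coin set is finite, so count and idx(a, ·) are bijections onto {1, ..., |Coin|} rather than onto N⁺; the concrete values are made up; and the predicate active is derived from count in the same way that (Ax2) derives has-coin from idx.

```python
coins = ["c1", "c2", "c3", "c4"]
addresses = ["a1", "a2"]

# count numbers all coins; idx(a, .) numbers all coins separately per address.
count = {"c1": 1, "c2": 2, "c3": 3, "c4": 4}
idx = {
    ("a1", "c1"): 1, ("a1", "c2"): 2, ("a1", "c3"): 3, ("a1", "c4"): 4,
    ("a2", "c3"): 1, ("a2", "c1"): 2, ("a2", "c2"): 3, ("a2", "c4"): 4,
}

total = 3                     # plays the role of the constant sum
bal = {"a1": 2, "a2": 1}      # explicit balance function

# A coin is active iff its count lies in [1, total] (assumed analogue of Ax2).
active = {c for c in coins if count[c] <= total}

# Ax2: address a has coin c iff idx(a, c) lies in [1, bal(a)].
has_coin = {(a, c) for a in addresses for c in coins if idx[(a, c)] <= bal[a]}

# Because each numbering is a bijection, the derived sets have exactly
# total and bal(a) elements, respectively.
owned = {a: sum(1 for (o, _c) in has_coin if o == a) for a in addresses}
```

In this model the derived has-coin relation also satisfies the ownership invariants (each active coin has exactly one owner), which is why the balances add up to `total`.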
When compared to Section 5.1, our explicit SL encoding introduced above uses our smart invariants as axioms of our encoding, together with (Ax1) and (Ax2). In our explicit SL encoding, the post-conditions asserting functional correctness of smart transitions thus express relations between old-sum and new-sum. For example, for mint_n we are interested in ensuring (2). By using two new constants old-total, new-total ∈ N, we can use sum = total as smart invariant for mint_n. As a result, the property to be ensured is then (3). It is easy to see that the negations of (2) and (3) are equisatisfiable. We note however that the additional constants old-total, new-total used in (3) lead to unstable results within automated reasoners, as discussed in Section 6.

Experiments
From Theory to Practice. To make our explicit SL encodings handier for automated reasoners, we improved the setting illustrated in Figure 3 by applying the following restrictions without losing any generality. (i) The predicates has-coin and active were removed from the explicit SL encodings, by replacing them by their equivalent expressions (Ax1)-(Ax2). (ii) The surjectivity assertions of count and idx were restricted to the relevant intervals [1, sum], [1, bal(a)] respectively. (iii) Compared to Figure 3, only one mutual count function and one mutual idx function were used. We however conclude that we do not lose expressivity of our resulting SL encoding, as shown in Appendix B.5. (iv) When our SL encoding contains expressions such as ∀c : 1}, then it can be assumed that the coins in those intervals are in the same order for both functions (Appendix B.6). Based on the above, we derive three different explicit SL encodings to be used in automated reasoning about smart transitions. We respectively denote these explicit SL encodings by int, nat and id, and describe them next.
Benchmarks. In our experiments, we consider four smart transitions mint_1, mint_n, transferFrom_1 and transferFrom_n, respectively denoting minting and transferring one and n coins. These transitions capture the main operations of linear integer arithmetic. In particular, mint_n implements the smart transition of our running example from Figure 1.
For each of the four smart transitions, we implement four SL encodings: the implicit SL encoding uf from Section 5.1 using only uninterpreted functions and three explicit encodings int, nat and id as variants of Section 5.3. We also consider three additional arithmetic benchmarks using int, which are not directly motivated by smart contracts. Together with variants of int and nat presented in the sequel, our benchmark set contains 31 examples altogether, with each example being formalized in the SMT-LIB input syntax [1]. In addition to our encodings, we also proved consistency of the axioms used in our encodings.
SL Encodings and Relaxations. Our explicit SL encoding int uses linear integer arithmetic, whereas nat and id are based on natural numbers. As naturals are not a built-in theory in SMT-LIB, we assert the axioms of Presburger arithmetic directly in the encodings of nat and id.
In our id encodings, inductive datatypes are additionally used to order coins. There exists one linked list of all coins for count and one for each idx(a, ·), a ∈ Address. Additionally, there exists a "null" coin, which is the first element of every list and is not owned by any address. As shown in Figure 4, the numbering of each coin is defined by its position in the respective list. This way surjectivity for count and idx can respectively be asserted by the formulas ∃c : Coin. count(c) ≈ sum and ∀a : Address. ∃c : Coin. idx(a, c) ≈ bal(a). However, asserting surjectivity for int and nat cannot be achieved without quantifying over N⁺. Such quantification would drastically affect the performance of automated reasoners in (fragments of) first-order logics. As a remedy, within the default encodings of int and nat, we only consider relevant instances of surjectivity.
Further, we consider variations of int and nat by asserting proper surjectivity to the relevant intervals of idx and count (denoted as surj) and/or adding the total constants mentioned in Section 5.3 (denoted as with total, no total). These variations of int and nat are implemented for mint_1 and transferFrom_1.
Experimental Setting. We evaluated our benchmark set of 31 examples using the SMT solvers Z3 [7] and CVC4 [6], as well as the first-order theorem prover Vampire [18]. Our experiments were run on a standard machine with an Intel Core i5-6200U CPU (2.30GHz, 2.40GHz) and 8 GB RAM. The time is given in seconds and we ran all experiments with a time limit of 300s. A timeout is indicated by the symbol ×. The default parameters were used for each solver, unless stated otherwise in the corresponding tables. The precise calls of the solvers, together with examples of the encodings, can be found in Appendix C.
Experimental Analysis. We first report on our experiments using different variations of int and nat. Table 3 shows that asserting complete surjectivity for int and nat is computationally hard and indeed significantly affects the performance of automated reasoners. Thus, for the following experiments only relevant instances of surjectivity, such as ∃c : Coin. count(c) = sum, were asserted in int and nat. Table 3 also illustrates the instability of using the total constant. Some tasks seem to be easier even though their reasoning difficulty increased strictly by adding additional constants.
Our most important experimental findings are shown in Table 4, demonstrating that our SL encodings are suitable for automated reasoners. Thanks to our explicit SL encodings, each solver can certify every smart transition in at least one encoding. Our explicit SL encodings are more relevant than the implicit encoding uf as we can express and compare any two non-negative integer sums, whereas for uf handling arbitrary values n can only be done by iterating over the mint_1 (or transferFrom_1) transition. This iteration requires inductive reasoning, which currently only Vampire could do [14], as indicated by the superscript *. Nevertheless, the transactions mint_1, transferFrom_1, which involve only one coin in uf, require no inductive reasoning as the actual sum is not considered; each of our solvers can certify these examples. We note that the tasks mint_n and transferFrom_n from Table 4 yield a huge search space when using their explicit SL encodings within automated reasoners. We split these tasks into proving intermediate lemmas and proved each of these lemmas independently, by the respective solver. In particular, we used one lemma for mint_n and four lemmas for transferFrom_n. In our experiments, we only used the recent theory reasoning framework of Vampire with split queues [12] and indicate the corresponding results by superscript †.
We further remark that our explicit SL encoding id using inductive datatypes also requires inductive reasoning about smart transitions and beyond. The need for induction explains why SMT solvers failed to prove our id benchmarks, as shown in Table 4. We note that Vampire found a proof using built-in induction [14] and theory-specific reasoning [12], as indicated by superscript ‡.
We conclude by showing the generality of our approach beyond smart transitions. It in fact enables fully automated reasoning about any two summations Σ_{i ∈ I} g(i), Σ_{i ∈ I} h(i) of non-negative integer values g(i), h(i) (i ∈ I) over a mutual finite set I. The examples of Table 5 affirm this claim.

Related work
Smart Contract Safety. Formal verification of smart contracts is an emerging hot topic because of the value of the assets stored in smart contracts, e.g. the DeFi software [3]. Due to the nature of the blockchain, bugs in smart contracts are irreversible and thus the demand for provably bug-free smart contracts is high.
The K interactive framework has been used to verify safety of a smart contract, e.g. in [22]. Isabelle [21] was also shown to be useful in manual, interactive verification of smart contracts [16]. We, however, focus on automated approaches.
There are also efforts to perform deductive verification of smart contracts both on the source level in languages such as Solidity [32,4,13] and Move [34], as well as on the Ethereum virtual machine (EVM) level [2,28]. This paper improves the effectiveness of these approaches by developing techniques for automatically reasoning about unbounded sums. This way, we believe we support a more semantic-based verification of smart contracts.
Our approach differs from works using ghost variables [13], since we do not manually update the "ghost state". Instead, the verifier needs only to reason about the local changes, and the aggregate state is maintained by the axioms. That means other approaches assume (a) the local changes and (b) the impact on ghost variables (sum), whereas we only assume (a) and automatically prove (a) ⇒ (b). This way, we reduce the user guidance needed for providing and proving (b).
Our work complements approaches that verify smart contracts as finite state machines [32] and methods, like ZEUS [17], using symbolic model checking and abstract interpretation to verify generic safety properties for smart contracts.
The work in [29] provides an extensive evaluation of ERC-20 and ERC-721 tokens. ERC-721 extends ERC-20 with ownership functions, one of which being "approve". It enables transactions on another party's behalf. This is independent of our ability to express sums in first-order logic, as the transaction's initiator is irrelevant to its effect.
Reasoning about Financial Applications. Recently, the Imandra prover introduced an automated reasoning framework for financial applications [23,24,25]. Similarly to our approach, these works use SMT procedures to verify and/or generate counter-examples to safety properties of low- and high-level algorithms. In particular, results of [23,24,25] include examples of verifying ranking orders in matching logics of exchanges, proving high-level properties such as transitivity and anti-symmetry of such orders. In contrast, we focus on verifying properties relating local changes in balances to changes of the global state (the sum). Moreover, our encodings enable automated reasoning both in SMT solving and first-order theorem proving.
Automated Aggregate Reasoning. The theory of first-order logic with aggregate operators has been thoroughly studied in [15,20]. Such logics have been proven to be strictly more expressive than first-order logic, both in the case of general aggregates and in the case of simple counting logics. In this paper, we nevertheless present a practical way to encode a weakened version of aggregates (specifically sums) in first-order logic. Our encoding (as in Section 5) works by expressing particular sums of interest, harnessing domain knowledge to avoid the need for general aggregate operators.
Previous works [19,5] in the field of higher-order reasoning do not directly discuss aggregates. The work of [19] extends Presburger arithmetic with Boolean algebra for finite, unbounded sets of uninterpreted elements. This includes a way to express the set cardinalities and to compare them against integer variables, but does not support uninterpreted functions, such as the balance functions we use throughout our approach.
The SMT-based framework of [5] takes a different, white-box approach, modifying the inner workings of SMT solvers to support higher-order logic. We on the other hand treat theorem provers and SMT solvers as black-boxes, constructing first-order formulas that are tailored to their capabilities. This allows us to use any off-the-shelf SMT solver.
In [8], an SMT module for the theory of FO(Agg) is presented, which can be used in all DPLL-based SAT, SMT and ASP solvers. However, FO(Agg) only provides a way to express functions that have sets or similar constructs as inputs, but not to verify their semantic behavior.

Conclusions
We present a methodology for reasoning about unbounded sums in the context of smart transitions, that is, transitions that occur in smart contracts modeling transactions. Our sum logic SL and its usage of sum constants, instead of fully-fledged sum operators, turns out to be most appropriate for the setting of smart contracts. We show that SL has decidable fragments (Section 4.1), as well as undecidable ones (Section 4.2). Using two phases to first implicitly encode SL in first-order logic (Section 5.1), and then explicitly encode it (Section 5.3), allows us to use off-the-shelf automated reasoners in new ways, and automatically verify the semantic correctness of smart transitions.
Showing the (un)decidability of the SL fragment with two sets of uninterpreted functions and sums is an interesting step for further work, as this fragment supports encoding smart transition systems. Another interesting direction of future work is to apply our approach to different aggregates, such as minimum and maximum and to reason about under which conditions these values stay above/below certain thresholds. A slightly modified setting of our SL axioms can already handle min/max aggregates in a basic way, namely by using ≥ and ≤ instead of equality and dropping the injectivity/surjectivity (respectively) axioms of the counting mechanisms.
Summing over multidimensional arrays in various ways is yet another direction of future research. Our approach supports the summation over all values in all dimensions by adding the required number of parameters to the predicate idx and by adapting the axioms accordingly.
Let Σ be an SL vocabulary. We write a structure A = (D, I) ∈ STRUCT[Σ] as a tuple
We always assume that D(Nat) = N, and that 0, 1, +² and ≤² are interpreted naturally. For brevity, we omit them when describing SL structures.

Distinct Models Proof
Observation 1. For any set X and any partition P thereof, it holds that |P | ≤ |X|.

Definition 6 (Partition-Induced Function).
Let P be a partition of a finite set X of size l, written P = {A_1, ..., A_{l'}} where l' ≤ l.
We define the partition-induced function f_P(x) (for any x ∈ X) as the index i such that A_i ∈ P and x ∈ A_i.
For brevity, we denote f_P(x) as P(x).
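As a small illustration of Definition 6, the following sketch computes the partition-induced function for a partition given as an ordered list of blocks. The function name and the representation of the partition as a Python list of sets are assumptions made for the example.

```python
def partition_induced_function(partition):
    """Map each element x to the 1-based index i of the block A_i containing x.

    `partition` is an ordered list of disjoint, non-empty sets A_1, ..., A_l'.
    """
    index_of = {}
    for i, block in enumerate(partition, start=1):
        if not block:
            raise ValueError("blocks of a partition must be non-empty")
        for x in block:
            if x in index_of:
                raise ValueError(f"{x!r} occurs in more than one block")
            index_of[x] = i
    return index_of


# A partition of X = {a1, a2, a3} into l' = 2 blocks.
P = [{"a1", "a3"}, {"a2"}]
f_P = partition_induced_function(P)
```

With this partition, a1 and a3 are mapped to index 1 and a2 to index 2, so constants in the same block are treated as denoting the same Address element.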

Definition 7 (Function-Induced Equivalence Class).
Let f be some function over some set X. We define the function-induced equivalence class for each x ∈ X as

Definition 8 (Function-Induced Partition).
Let f be some function defined over some set X. We define the function-induced partition P_f as

Definition 9 (Partitioning Sum Terms by P).
Let t be some term over an SL vocabulary Σ = Σ^{l,m,d} (with l Address constants) and let P be some partition of {a_1, ..., a_l}.
We define a transformation T_P(t) inductively as a term over an SL vocabulary Σ_P = Σ^{l',m,d} with l' = |P| ≤ l Address constants:

Definition 10 (Partitioning an SL Formula by P). We naturally extend the terms transformation T_P to formulas.
Observation 2. For any SL vocabulary Σ, Σ_P ⊆ Σ, since l' ≤ l. Therefore, for any formula ϕ in some fragment FRAG of SL, T_P(ϕ) ∈ FRAG as well.

Definition 11 (Distinct Structures).
An SL structure A is considered distinct when |{a_1^A, ..., a_l^A}| = l, i.e., the l Address constants represent l distinct elements in D(Address).
Theorem 1 (Distinct Models). Let ϕ be an SL formula over Σ, then ϕ has a model iff there exists a partition P of {a 1 , . . . , a l } such that T P (ϕ) has a distinct model.
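Theorem 1 quantifies over the partitions of the Address constants, of which there are only finitely many. The sketch below merely enumerates these candidate partitions; it is not the decision procedure itself, and the generator name and the list-of-lists representation are choices made for this example.

```python
def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + smaller


constants = ["a1", "a2", "a3"]
candidates = list(set_partitions(constants))
```

For three constants this yields five partitions, ranging from one block (all constants denote the same address) to three blocks (all constants are distinct). Each candidate P would then be used to build T_P(ϕ) and to look for a distinct model.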

Proof of Theorem 1
Part 1: If ϕ has an SL model, then there exists some partition P such that T P (ϕ) has a distinct SL model (⇒) Let A be some SL model of ϕ and let f be the mapping from {a 1 , . . . , a l } to A, Let P be the partition (of size l ) induced by f and we construct a distinct SL model   Proof. Since T P (t) = t for all terms except terms containing a i , and since A is identical to A except for Address constants, we only need to consider this kind of terms.
Moreover, since T P is defined inductively, it suffices to prove the claim for the basis terms a i .
Proof. Identical to the proof of Claim 1.
Claim 3. Let ξ be a sub-formula of ϕ, therefore: 1. If ξ is a closed formula then A ξ ⇐⇒ A T P (ξ) 2. If ξ is a formula with free variables x 1 , . . . , x r then for any α 1 , . . . , α r ∈ A, Since ϕ is a closed formula, and since A ϕ, it holds that A T P (ϕ) and therefore A is a distinct SL model for T P (ϕ).
2. If ξ is a formula with free variables x 1 , . . . , x r then for every α 1 , . . . , α r ∈ A, Since ϕ is a closed formula, and since A T P (ϕ) we get that A ϕ.
Proof (Claim 6). In the same vein as Claim 3, this follows from Claims 4 and 5.

Small Address Space Proof
Definition 2 (Small Address Space). Let FRAG be some fragment of SL over vocabulary Σ = Σ l,m,d +,≤ . FRAG is said to have small Address space if there exists a computable function κ Σ (.), such that for any SL formula ϕ ∈ FRAG, ϕ has a distinct model iff ϕ has a distinct model A = (D, I) with small Address space, where |D(Address)| ≤ κ Σ (|ϕ|).
We call κ Σ (.) the bound function of FRAG; when the vocabulary is clear from context we simply write κ(.).

Proof of Theorem 2
Let there be some universal, closed formula ϕ over Σ = Σ l,1,d +, ≤ and let there be some minimal structure A ∈ STRUCT [Σ] such that A SL ϕ (i.e. A is an SL model for ϕ).
We denote the (finite) size of A as z ≜ |A|, and we assume towards contradiction that z ≥ l + |ϕ| + 1 (as our bound function is κ(x) = l + x + 1). We construct a smaller model A' for ϕ, thus contradicting the minimality of A and proving our desired claim.
We write out the given model . . , c A n , s A We know that |A| = z > l, and therefore the set is not empty. Let us define . We construct the smaller SL structure and we postpone defining c A k for now. We observe that: Firstly, we prove the following claim: Claim 7. A holds the sum property.
Proof. Since A holds the sum property for s A : The definition for c A k depends on b * . If b * = 0 then simply c A k = c A k . In this case, we prove the following: Lemma 1. For any term t, assignment ∆, Proof. Since b * = 0, we get that s A = s A and therefore the interpretations of A and A are identical -I = I .
Corollary 2. Since the domain of A is a strict subset of the domain of A, for any formula ξ, A ξ ⇒ A ξ, and in particular A is also an SL model for ϕ.
In the case that b * > 0, we define and the proof is more involved. We firstly make the following observations: The central claim we need to prove is: Claim 8. Let ξ be a sub-formula of ϕ,

If ξ is a closed, quantifier-free formula then
A ξ ⇐⇒ A ξ 2. If ξ is a quantifier-free formula with free variables x 1 , . . . , x r , then for every α 1 , . . . , α r ∈ A , 3. If ξ is a closed, universally quantified formula then If ξ is a universally quantified formula with free variables x 1 , . . . , x r , then for every α 1 , . . . , α r ∈ A , Proof. Since ϕ is a closed, universally quantified sub-formula of itself, ] and since it is given that A ϕ, we get from Claim 8 that A ϕ.
In order to prove Claim 8 we firstly need to prove the following two lemmas: What remains to prove is that for any α ∈ A , b A (α) < s A . A has at least l + |ϕ| + 1 elements, and therefore A \ a A 1 , . . . , a A l has at least 2 elements. Let us denote them: α 1 , α 2 .
For both of these elements, since otherwise they would have been chosen as α * -contradicting b * 's minimality.
For any element α, since A holds the sum property, We can re-arrange and get that and since A \ {α} contains either α 1 or α 2 , it must be that . It has at least |ϕ| + 1 elements.
For any α ∈ S, b A (α) > 0, otherwise it would have been chosen as α * and we'd have b * = 0 -which contradicts b * 's minimality.
Therefore, on the one hand, And, on the other hand, since S ⊆ A , we know that And combining the two results we get that |ϕ| < s A , and since b * > 0, Proof (Proof of Claim 8). We prove the claim using structural induction.
Step 1: ξ = t 1 ≈ t 2 without free variables ξ is a closed, quantifier-free formula, so we prove that A ξ ⇐⇒ A ξ. We consider the following cases: Since ξ is a sub-formula of ϕ, |ξ| ≤ |ϕ|, and therefore the numeral is less than |ϕ|.
However, s A , s A > |ϕ| from Lemma 3 and therefore A, A ξ.
However, since ξ is a sub-formula of ϕ, the numeral is less than |ϕ|, and s A > |ϕ|, from Lemma 3. Therefore, A, A ξ. Case 1.6 : t 1 = c k1 , t 2 = c k2 Trivial, from Observation 7. Case 1.7 : However, from Lemma 2 we know that for any a ∈ A (and in particular for Trivial, since the interpretation of the Address constants is identical in A, A .
Any other case is symmetrical to one of the cases above.
Step 2: ξ = t 1 ≈ t 2 with free variables x 1 , . . . , x r ξ is a quantifier-free formula, so we prove that for any α 1 , . . . , α r ∈ A We consider the following cases: From Lemma 2 we know that for any Let there be α ∈ A , we separate into the following cases: Case 2.5 : Trivial from Lemma 2. Case 2.6 : Trivial from Lemma 2. Case 2.7 : t 1 = a i , t 2 = x Trivial, since the interpretation of the address constants is identical in A and A . Case 2.8 : Trivially holds for any a ∈ A . Any other case is symmetrical to one of the cases above.
Step 3: ξ = ¬ζ without free variables Since ϕ is a universal formula we can assume it is in prenex form, and therefore, ζ is a closed, quantifier-free formula, shorter than ξ and from the induction hypothesis, A ζ ⇐⇒ A ζ, and therefore Step 4: ξ = ¬ζ with free variables x 1 , . . . , x r Similarly to the closed formula case, ζ is a quantifier-free formula with free variables x 1 , . . . , x r and the claim holds from the induction hypothesis.
Step 5: ξ = ζ 1 ∨ ζ 2 without free variables Similarly to the negation formula case, ζ 1 , ζ 2 are closed, quantifier-free formulas and the claim holds from the induction hypothesis.
Step 6: ξ = ζ 1 ∨ ζ 2 with free variables x 1 , . . . , x r Similarly to the closed formula case, and the claim holds from the induction hypothesis.
Step 7: ξ = ∀v.ζ without free variables Since ξ is a universal formula, we need to show that if A ξ then A ξ (but not vice versa).
ζ is a universally quantified formula with (at most) one free variable x. If A ξ then for every α ∈ A, A ζ [α/x] and in particular for any α ∈ A A. ζ is shorter than ξ and therefore the induction hypothesis holds: for any α ∈ A , and therefore A ξ.

Presburger Reduction Proof
Defining the Transformations The transformation of formulas from SL to PA works by explicitly writing out sums as additions and universal quantifiers as conjunctions. Since we're dealing with a fragment of SL that has some bound function κ(·), we know that for given formula ϕ, there is a model with at most κ(|ϕ|) elements of Address sort. Moreover, we useκ max {κ(|ϕ|), l} as the upper bound (where l is the amount of Address constants). Since we're looking for distinct models, it is obvious that we need at least l distinct elements.
For each balance function b 1 j we haveκ constants b 1,j , . . . , bκ ,j . In addition we haveκ indicator constants a 1 , . . . , aκ, to mark if an Address element is "active". An inactive element has all zero balances, and is skipped over in universal quantifiers.
Any Address constant a_i or Address variable x is handled in two ways, depending on the context they appear in:
- If they are compared, we replace the comparison with ⊤ or ⊥; we know statically if the comparison holds, since the Address constants are distinct and every universal quantifier is written out as a conjunction.
- Otherwise, they must be used in some balance function b¹_j, and then they are substituted with the corresponding b_{i,j} or b_{x,j} (which will be determined once the universal quantifiers are unrolled).
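The unrolling idea described above can be sketched as a small string-based translation. This is a simplified illustration and not the transformation τ itself: the textual naming scheme `b_i_j` and `a_i`, the guard `a_i > 0 → ...` used to skip inactive elements, and the helper names are all assumptions made for the example.

```python
def unroll_sum(j, bound):
    """Write the sum constant s_j as an explicit addition of balance constants."""
    return " + ".join(f"b_{i}_{j}" for i in range(1, bound + 1))


def compare_addresses(i, k):
    """Address comparisons are decided statically once indices are known."""
    return "⊤" if i == k else "⊥"


def unroll_forall(body, bound):
    """Write ∀x. body(x) as a conjunction over the indices 1..bound.

    `body` maps an index i to the formula text for the i-th Address element.
    Inactive elements are skipped by guarding with the indicator a_i.
    """
    return " ∧ ".join(f"(a_{i} > 0 → {body(i)})" for i in range(1, bound + 1))


# Example: ∀x. b_1(x) ≤ s_1 with a bound of 3 Address elements.
s1 = unroll_sum(1, 3)
formula = unroll_forall(lambda i: f"b_{i}_1 ≤ {s1}", 3)
```

The resulting text contains only natural-number constants, additions and comparisons, which is what allows it to be read as a formula of Presburger arithmetic.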
The integral constants c_1, ..., c_d are simply copied over. In summary:
Firstly, we define the simpler auxiliary formula η(ϕ) in three parts:
Definition 13. We require that inactive Address elements have zero balances -
Definition 14. And that elements referred by Address constants be active -
Definition 15. Finally, we require that the active elements are a continuous sequence starting at 1. Or, put differently, once an indicator is zero, all indicators following it are also zero:
The complete auxiliary formula is then η(ϕ) = η_1(ϕ) ∧ η_2(ϕ) ∧ η_3(ϕ). In order to define τ(ϕ), we firstly define the transformation for terms, and then build up the complete transformation, using several substitutions:
Definition 16. We define the terms transformation inductively, and we substitute balances and Address terms (constants or variables) with placeholders (marked with *), which are further substituted:
Definition 17. Next we define the transformation for formulas, replacing only variable placeholders:
We can see that for any formula ξ containing arbitrary terms, τ_1(ξ) only has a*_i and b*_j placeholders (but no x* ones).
Definition 18. Now we define a substitution σ_1 that removes Address comparisons by evaluating them: where i, i' ∈ [1,κ].
Note 1. We first replace comparisons of the form a*_i ≈ a*_i, which are equivalent to true (⊤). Any remaining comparison must then be of the form a*_i ≈ a*_{i'} with i ≠ i', and is therefore equivalent to false (⊥).
Definition 19. Finally, we're left with placeholders inside balance functions, which we substitute by the corresponding balance constant: Definition 20. The complete transformation is then: Given the above definitions, let us recall the Presburger Reduction Theorem:

Theorem 3 (Presburger Reduction). An SL formula ϕ has a distinct SL model with small Address space iff τ(ϕ) ∧ η(ϕ) has a Standard Model of Arithmetic.

Proof of Theorem 3
We first define congruence between SL structures and structures over the corresponding Presburger vocabulary, and we prove a general theorem about them. We use that congruence theorem to prove that a formula ϕ has a distinct SL model with small Address space iff τ(ϕ) ∧ η(ϕ) has a Standard Model of Arithmetic.
, and for any i > z, b A i,j = 0. 6. For any i ∈ [1, z], a A i > 0 and for any i > z, a A i = 0. 7. A is distinct, and in particular l ≤ z.
Lemma 4. Let A, A be two congruent structures for SL vocabulary Σ, bound κ and formula ϕ. For any ground term t of sort Nat over Σ, Proof. We prove the lemma using structural induction over all possible ground terms: Step 1.1: t = s j for any j ∈ [1, m] From Congruence Condition 1 A holds the sum property, and therefore: , and for any i ∈ [z + 1,κ], b A i,j = 0, therefore we can write the sum above as From the definition of τ 0 we get: Since we have no placeholders, σ 2 has no effect, and we get the same expression as for I(t).
From Congruence Condition 4, a A i = α i , and we get: From the definition of τ 0 and σ 2 we get: And therefore, since A is distinct, i ≤ l ≤ z, and from Congruence Condition 5, Follows from the induction hypothesis for t 1 and t 2 (since + is interpreted in the same way in A and A ).
Lemma 5. Let A, A be two congruent structures for SL vocabulary Σ, bound κ and formula ϕ. For any term t with at most r free variables x 1 , . . . , x r , for any indices i 1 , . . . , i r ∈ [1, z], for any assignment ∆ we define and the following holds: Proof. We prove the lemma using structural induction over all possible terms with free variables: Step 1.4: t = b j (x) For any i ∈ [1, z], By definition, τ 0 (t) = b * j (x * ), and therefore Step 1.5: t = t 1 + t 2 Either t 1 or t 2 has free variables, and let us assume w.l.o.g. that t 1 does. Therefore, t 2 is either a ground term, or also has free variables. If t 2 has no free variables, the substitution of free variables wouldn't affect it.
In both cases, from the induction hypothesis and from Lemma 4, for any i 1 , . . . , i r ∈ [1, z], where v ∈ {1, 2}, and we get desired equality for t as well.
Lemma 6. Let A, A be two congruent structures for SL vocabulary Σ, bound κ and formula ϕ. Let ξ be a sub-formula of ϕ, therefore: 1. If ξ is a closed formula, then A τ (ξ) ⇐⇒ A ξ.
2. If ξ is a formula with free variables x 1 , . . . , x r then for any i 1 , . . . , i r ∈ [1, z]: Proof. Let us separate into the following steps: Step 1.6: ξ = t 1 ≈ t 2 without free variables, where t 1 , t 2 are of Address sort Since t 1 , t 2 are Addresses, there are two indices i 1 , i 2 ∈ [1, l] such that t 1 = a i1 , t 2 = a i2 . From Congruence Condition 7, A is distinct and therefore As for τ (ξ), Step 1.7: ξ = t 1 ≈ t 2 without free variables, where t 1 , t 2 are of sort Nat In this case, σ 1 would not change the formula and we can apply σ 2 to each term: Since t 1 , t 2 are of sort Nat, from Lemma 4 we get that Step 1.8: ξ = t 1 ≈ t 2 with free variables x 1 , . . . , x r , where t 1 , t 2 are of Address sort Let us first define σ = [α i1 /x 1 , . . . , α ir /x r ], σ = a * i1 /x * 1 , . . . , a * ir /x * r . If t 1 , t 2 both have free variables then we can write them as t 1 = x 1 , t 2 = x 2 and after substituting σ we get that As for A , we get And we get that A τ 1 (ξ)σ σ 1 σ 2 ⇐⇒ i 1 = i 2 ⇐⇒ A ξσ.
Part 2: Proof of Theorem 3 (⇒): If ϕ has a distinct SL model with small Address space, then ϕ has a Standard Model of Arithmetic Let there be a distinct SL model for ϕ with small Address space: We can represent its Addresses set as A = {α 1 , . . . , α z } where z = |A|, and for every i ∈ [1, l], a A i = α i since A is distinct. Combined with the fact that A has small Address space we know that z ≤κ.
We define A , the Standard Model of Arithmetic for ϕ , as follows: where the indicators are and the natural constants are c A k = c A k .
Proof. We show that A satisfies η 1 (ϕ), η 2 (ϕ) and η 3 (ϕ): Step 2.1: A satisfies η 1 (ϕ) We need to show that for each i ∈ [1,κ], Step 2.2: A satisfies η 2 (ϕ) We need to show that for each i ∈ [1, l], Let there be some index i such that a A i = 0, therefore, by definition, i > z. For any i > i it also holds that i > z and therefore a A i = 0.
Proof. We show that A and A are congruent, and since A ϕ, from Lemma 6, A τ (ϕ): Arithmetic for ϕ . Since A η 3 (ϕ), we know that there exists some maximal index z ≤κ such that a A z = 0 and for any i > z, a A i = 0. Since A η 1 (ϕ) we know that z ≥ l.
We construct an SL model A for ϕ as follows: the Address constants are for any i ∈ A, j ∈ [1, m]; the natural constants are c A k = c A k and the sums are defined as We show that A, A are congruent: 1. By construction, A holds the sum property. 2. It is given that A satisfies η(ϕ). 3. We define A to be the set [1, z], and therefore |A| ≤κ. 4. a A i = α i as defined above.

By construction, for any
i,j and z was chosen such that for any i > z, b A i,j = 0. 6. z was chosen such that for any i ∈ [1, z], a A i > 0 and for any i > z, a A i = 0. 7. A is distinct by construction.
Given that A ϕ , we know in particular that A τ (ϕ), and from Lemma 6, A ϕ as a many-sorted, first-order formula. Since A holds the sum property, A SL ϕ. In addition, by construction |A| = z ≤κ. Therefore, A is a distinct SL model for ϕ with small Address space.

B.1 Soundness of mint 1
For readability reasons, the theorem is stated again.
Theorem 7 (Soundness of mint 1 (a, c)). Let c ∈ Coin, a ∈ Address such that mint 1 (a, c). Consider balance functions old-bal, new-bal : Address → N, non-negative integer constants old-sum, new-sum, unary predicates old-active, new-active ⊆ Coin and binary predicates old-has-coin, new-has-coin ⊆ Address × Coin such that

B.2 Soundness of mint n
To define the smart transition mint_n we need one pair of predicates for every time step. Thus, active and has-coin take an additional "parameter" i, the i-th time step, instead of using the prefixes old- and new-. Other than that, the definition and the soundness result are analogous to the setting of mint_1.

Definition 22 (Transition mint_n(a)). Let a ∈ Address. Then, the transition mint_n(a) activates n coins and deposits them into address a, one coin c in each time step.
1. The coin c was inactive before and is active now:
2. The address a owns the new coin c: has-coin(a, c, i + 1) ∧ ∀a' : Address. ¬has-coin(a', c, i) .
The transition mint_n(a) is defined as ∀i : Nat. ∃c :
The soundness result we get is similar to Theorem 7 but extended by the new parameter.
Theorem 10 (Soundness of mint_n(a)). Let a ∈ Address such that mint_n(a). Consider a balance function bal : Address × N → N, a summation function sum : N → N⁺, a binary predicate active ⊆ Coin × N and a ternary predicate has-coin ⊆ Address × Coin × N such that for every i ∈ N, |active(., i)| = sum(i), and for every address a' and i ∈ N, we have bal(a', i) = |{c ∈ Coin | (a', c, i) ∈ has-coin}|.
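The time-indexed setting of Theorem 10 can be illustrated on a finite instance. The following is a minimal sketch, not the paper's encoding: it assumes that coins are integers, that each step activates exactly one fresh coin and assigns it to a, and that nothing else changes. The names mint_n and bal below are hypothetical helpers chosen for this illustration.

```python
def mint_n(active, has_coin, a, n):
    """Simulate n single-coin mint steps into address a (assumed semantics).

    Returns the states (active_i, has_coin_i) for i = 0..n, where has_coin_i
    is a set of (address, coin) pairs. Coins are plain integers here.
    """
    states = [(frozenset(active), frozenset(has_coin))]
    used = set(active) | {c for _, c in has_coin}
    next_coin = max(used, default=0) + 1
    for _ in range(n):
        act, has = states[-1]
        c = next_coin  # fresh coin: inactive and owned by nobody at step i
        next_coin += 1
        # Step i -> i + 1: activate c and give it to a; everything else is kept.
        states.append((act | {c}, has | {(a, c)}))
    return states


def bal(has_coin, addr):
    """bal(addr) = number of coins c with (addr, c) in has_coin."""
    return sum(1 for owner, _ in has_coin if owner == addr)
```

On such a trace, the hypotheses of the theorem can be checked step by step: |active(., i)| plays the role of sum(i), and bal is computed from has-coin by counting. In this sketch the balance of a grows by one per step while every other balance stays fixed.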

B.3 Soundness and Completeness relative to f
To prove Theorem 8, we first have to state formal definitions of the concepts that were explained only informally in the paper. The exclusion of certain elements of F is based on an equivalence relation ∼.
To properly prove that ∼ is an equivalence relation, we have to define V_≤ and V_≥ first.
Definition 24. Let (active, has-coin) ∈ F be a pair. For an address a, we define C_a := {c ∈ Coin | has-coin(a, c)}. Further, we define three types of error coins:
and one type of error pairs M_Pairs := {(a, c) | c ∈ C_a ∧ ∃b. a ≉ b ∧ c ∈ C_b} to refine the number of mistakes caused by the violation of (I3). The number of violations of (I2) is now V_≤ := |M_Least|, and the number of violations of (I1) and (I3) is defined as V_≥ := |M_Inact| + |M_Pairs| − |M_Most|.

Lemma 7. The relation ∼ is an equivalence relation on F.
- Symmetry of ∼. Let p_1, p_2 ∈ F such that p_1 ∼ p_2. Then, due to the symmetry of =, also p_2 ∼ p_1 holds.
- Transitivity of ∼. Let p_1, p_2, p_3 ∈ F such that p_1 ∼ p_2 and p_2 ∼ p_3. Then, due to the transitivity of =, also p_1 ∼ p_3 holds.
The translation function f can now be defined as a function that assigns to every pair (bal, sum) a class from F/∼.
Definition (Function f). The function f : N^Address × N → F/∼, (bal, sum) ↦ [(has-coin, active)]_∼, is defined to satisfy the following conditions for an arbitrary (has-coin, active) ∈ [(has-coin, active)]_∼.
Theorem 8. The function f is well-defined and injective, ensuring soundness and completeness of our SL encodings relative to f.
Proof. The proof is organized in four steps. The first step provides a technicality that is needed for steps 2 and 3; the last step proves the claim.
1. Consider any pair (has-coin, active) ∈ F with V_≤ = V_≥ = 0. Since no coins or addresses violate the invariants here, we have ⋃_{a∈Address} C_a = active and all the C_a are disjoint. Thus, Σ_{a∈Address} |C_a| = |⋃_{a∈Address} C_a| = |active|.
2. Now we only assume V_≥ = 0. Consider Then the pair p = (has-coin, active \ M_Least) satisfies V_≤ = V_≥ = 0, because all the coins that were not active originally are active now and we did not change any of the other mistake sets. From the first step we now get Σ_{a∈Address} |C_a| = |active \ M_Least| and therefore
3. Similarly to the second step, we now only assume V_≤ = 0. By definition it holds that M_Inact ∩ active = ∅, M_Pairs ⊆ has-coin and M_Most ⊆ active ∪ M_Inact. We now consider the pair Clearly, no coin is assigned to two different addresses in p. However, all the coins that were in two different addresses before are now not assigned to any address, which is why these coins have to be removed from active ∪ M_Inact. Also, there are no coins that are active without belonging to any address. Further, all active coins are still assigned to an address as
From Theorem 8, Theorem 6 also follows immediately by stating the properties of the function f.
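The counting in steps 1 to 3 can be tried out on small finite instances. The sketch below rests on assumptions that are not stated in the surviving text: it reads M_Inact as the owned coins that are not active, M_Least as the active coins owned by no address, and M_Most as the coins owned by more than one address. M_Pairs, V_≤ and V_≥ follow Definition 24. The function names are hypothetical.

```python
def violations(active, has_coin):
    """Return (V_le, V_ge) for a finite pair (has_coin, active).

    has_coin is a set of (address, coin) pairs; active is a set of coins.
    The error sets follow the assumed readings given in the lead-in.
    """
    owners = {}
    for a, c in has_coin:
        owners.setdefault(c, set()).add(a)
    m_inact = {c for c in owners if c not in active}        # owned but inactive
    m_least = {c for c in active if c not in owners}        # active but unowned
    m_most = {c for c, os in owners.items() if len(os) >= 2}  # several owners
    m_pairs = {(a, c) for (a, c) in has_coin if c in m_most}
    v_le = len(m_least)
    v_ge = len(m_inact) + len(m_pairs) - len(m_most)
    return v_le, v_ge


def total_balance(has_coin):
    """Sum over all addresses a of |C_a|, i.e. the number of ownership pairs."""
    return len(set(has_coin))
```

Under these assumed definitions, a pair with V_≤ = V_≥ = 0 has total balance |active|, as in step 1, and in general the total balance equals |active| + V_≥ − V_≤ on the instances tried here.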

B.4 Soundness of Explicit SL Encodings
Theorem 9 (Soundness of Explicit SL Encodings). Let there be a pair (bal, sum) ∈ N^Address × N, a pair (has-coin, active) ∈ F, and functions count : Coin → N⁺ and idx : Address × Coin → N⁺.
In particular, we have sum = Σ_{a∈Address} bal(a).

B.5 No Loss of Generality: Restricting idx and count
We want to prove that we do not lose any generality when considering common count and idx functions for the old and the new world. To do so, we need the following preliminary lemmas.
Proof. We proceed by constructing p_y = (y-has-coin, y-active) ∈ f(h_y) such that it satisfies properties (1)-(3). To fulfill property (1), let y-active := x-active ∪ S, where S ⊆ Coin \ x-active and |S| = y-sum − x-sum. Then |y-active| = y-sum also holds. To construct the C_{y,a} properly, the set y-active has to be partitioned, since p_y ∈ f(h_y) and thus inv(y-has-coin, y-active). For every a with x-bal(a) ≤ y-bal(a) we require C_{x,a} ⊆ C_{y,a}. Therefore there are Σ_{a: x-bal(a) > y-bal(a)} (x-bal(a) − y-bal(a)) additional spare coins. For a with x-bal(a) ≥ y-bal(a) we want C_{y,a} ⊆ C_{x,a}, which leaves us with
By replacing z-sum by Σ_{a∈Address} z-bal(a), all the summands with either y-bal(a) > x-bal(a) or x-bal(a) > y-bal(a) disappear and the remaining value is 0. Therefore, such a partition of y-active exists and thus there exists p_y = (y-has-coin, y-active) ∈ f(h_y) satisfying (1), (2) and (3).

Lemma 9. Given two pairs h_x, h_y with Σ_{a∈Address} z-bal(a) = z-sum, p_z ∈ f(h_z), for z ∈ {x, y} and x-sum ≤ y-sum as in Lemma 8. Then, there exist a bijective function count : Coin → N⁺ with count(z-active) = [1, z-sum] and bijective functions idx(a, .) : Coin → N⁺ with idx(a, C_{z,a}) = [1, z-bal(a)], for z ∈ {x, y}, a ∈ Address.
Proof. First, we construct count. We know y-active = x-active ∪ S, where |y-active| = y-sum, |x-active| = x-sum and |S| = y-sum − x-sum. Thus, we can easily find an injective function with count(x-active) = [1, x-sum] and count(S) = [x-sum + 1, y-sum]. Further, this function can be bijectively extended onto N⁺. Similarly, for the addresses a, we construct idx(a, .) in the following way. Since we know |C_{z,a}| = z-bal(a), we can find an injective function with idx(a, C_{x,a}) = [1, x-bal(a)]. For all a where y-bal(a) ≤ x-bal(a), we can assume that idx(a, C_{y,a}) = [1, y-bal(a)], as C_{y,a} ⊆ C_{x,a}. For these addresses a, idx(a, .) can now also be extended bijectively onto N⁺. Finally, for a with x-bal(a) ≤ y-bal(a) we know C_{x,a} ⊆ C_{y,a} and can thus assume idx(a, C_{y,a} \ C_{x,a}) = [x-bal(a) + 1, y-bal(a)]. Now these idx(a, .) can also be extended bijectively onto N⁺.
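The construction in this proof can be sketched for finite coin sets. The code below is an illustration under simplifying assumptions: coins are sortable values, only the finite parts of count and idx(a, .) are built (the bijective extension onto N⁺ is not represented), and the helper names build_count and build_idx are hypothetical.

```python
def build_count(x_active, s):
    """Number x_active with 1..x_sum and the extra coins S with x_sum+1..y_sum."""
    count = {}
    for k, c in enumerate(sorted(x_active), start=1):
        count[c] = k
    for k, c in enumerate(sorted(s), start=len(x_active) + 1):
        count[c] = k
    return count


def build_idx(cx, cy):
    """Build idx(a, .) for one address a from cx = C_{x,a} and cy = C_{y,a}.

    The two sets must be nested, as in the proof. The smaller set is numbered
    first, so each set is mapped onto an initial segment [1, its size].
    """
    if cy <= cx:      # y-bal(a) <= x-bal(a), hence C_{y,a} is a subset of C_{x,a}
        first, rest = cy, cx - cy
    elif cx <= cy:    # x-bal(a) <= y-bal(a), hence C_{x,a} is a subset of C_{y,a}
        first, rest = cx, cy - cx
    else:
        raise ValueError("C_{x,a} and C_{y,a} must be nested")
    ordered = sorted(first) + sorted(rest)
    return {c: k for k, c in enumerate(ordered, start=1)}
```

Numbering the smaller set first is what lets a single function serve both the old and the new world: the coins shared by both states keep their indices, and only the difference receives the indices above the smaller balance.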
Having these two lemmas at hand, we can now state and prove the following result.
Proof. Let h_x ∈ {h_o, h_n} such that x-sum = min{old-sum, new-sum}. The other pair gets the prefix 'y-' from now on. Elements of f(h_x) and f(h_y) are named accordingly.

B.6 No Loss of Generality: Ordering of Coins
The property to prove is that whenever a block of coins has the same order in two of our counting functions and the block does not cross its crucial value (sum, bal(a_i)), then we can assume that the coins are ordered in the same way. To do so, we have to formalize the notion of the former invariants inv'(idx, count). They are the formulas one gets by replacing has-coin and active by count and idx according to (Ax1) and (Ax2) in the invariants (I1)-(I3).
Proof. For simplicity, assume f_0 = idx(a_0, .) and f_1 = idx(a_1, .); the proof works analogously for count. We proceed by constructing idx' and count' and then showing that properties (i)-(vi) and ∀c : Coin. f_0(c) ∈ [l_0, u_0] → f_0(c) + l_1 = f_1(c) + l_0 hold. We set count' := count. We construct idx' in the following way.
Finally, we have to prove (iv). Once we have shown idx(a, c) ≤ bal(a) iff idx'(a, c) ≤ bal(a) for all a and for all c, the property follows immediately, since count' = count. For all a, c with a ≠ a_1 or idx(a_1, c) ∉ [l_1, u_1], the equivalence follows from the definition of idx'. Consider now a_1 with a c such that idx(a_1, c) ∈ [l_1, u_1]. Using the implication of (v), we get idx'(a_1, c) ∈ [l_1, u_1]. Now using property (vi), we know that either u_1 ≤ bal(a_1), in which case idx(a_1, c), idx'(a_1, c) ≤ bal(a_1), or bal(a_1) < l_1, which implies idx(a_1, c), idx'(a_1, c) > bal(a_1). This concludes the proof of property (iv) and thus of the theorem.