1 Introduction

Apple’s iMessage app works across iOS (iPhone, iPad) and OS X (MacBook) devices. Laudably, it aims to provide end-to-end security. At its heart is a signcryption scheme.

The current scheme—we refer to the version in iOS 9.3 onwards, revised after the attacks of GGKMR [26] on the iOS 9.0 version—is of interest on two fronts. (1) Applied: iMessage encrypts (according to an Internet estimate) 63 quadrillion messages per year. It is important to determine whether or not the scheme provides the security expected by its users. (2) Theoretical: The scheme involves (symmetric) encryption of a message under a key that is derived from the message itself, an uncommon and intriguing technique inviting formalization and a foundational treatment.

Signcryption theory: We extend the prior signcryption definitions of ADR [3] to capture elements particular to messaging systems, and give general results that simplify the analysis of the candidate schemes.

EMDK: We introduce, and give definitions (syntax and security) for, Encryption under Message Derived Keys.

iMessage EMDK scheme: We extract from iMessage an EMDK scheme and prove its security in the random-oracle model.

Composition and iMessage Signcryption: We give a way to compose EMDK, PKE and signatures to get signcryption, prove it works, and thereby validate the iMessage signcryption scheme for appropriate parameter choices.

By default, the iMessage chatting app encrypts communications between any two iMessage users. The encryption is end-to-end, under keys stored on the devices, meaning Apple itself cannot decrypt. In this way, iMessage joins Signal, WhatsApp and other secure messaging apps as a means to counter mass surveillance, but the cryptography used is quite different. While the cryptography underlying Signal and WhatsApp, namely ratcheting, has received an extensive theoretical treatment [2, 12, 19, 22, 28, 29, 33], that underlying iMessage has not.

Fig. 1. Encryption in \(\mathsf {iMsg1}\) (left) and \(\mathsf {iMsg2}\) (right). The keys involved are the recipient’s public RSA encryption key, the sender’s ECDSA secret signing key and the sender’s ECDSA public verification key. Our analysis and proofs consider general schemes of which the above emerge as instantiations corresponding to particular choices of primitives and parameters.

In 2016, Garman, Green, Kaptchuk, Miers and Rushanan (GGKMR) [26] gave chosen-ciphertext attacks on the then-current iOS 9 version of iMessage, which we denote \(\mathsf {iMsg1}\). Its encryption algorithm is shown on the left in Fig. 1. In response, Apple acknowledged the attack as CVE-2016-1788 [20] and revised the protocol for iOS 9.3. We denote this version \(\mathsf {iMsg2}\); its encryption algorithm is shown on the right in Fig. 1. It has been stable since iOS 9.3. It was this revision that, for the specific purpose of countering the GGKMR attack, introduced (symmetric) encryption with message-derived keys: message M at line 4 is encrypted under a key K derived, via lines 1–3, from M itself. The question we ask is, does the fix work?

To meaningfully answer the above question we must first, of course, identify the formal primitive and security goal being targeted. Neither Apple’s iOS Security Guide [4], nor GGKMR [26], explicitly do so. We suggest that it is signcryption. Introduced by Zheng [36], signcryption aims to simultaneously provide privacy of the message (under the receiver’s public encryption key) and authenticity (under the sender’s secret signing key), and can be seen as the asymmetric analogue of symmetric authenticated encryption. A formalization was given by An, Dodis and Rabin (ADR) [3]. They distinguish between outsider security (the adversary is not one of the users) and the stronger insider security (the adversary could be a sender or receiver).

Identifying the iMessage goal as signcryption gives some perspective on, and understanding of, the schemes and history. The iMessage schemes can be seen as using some form of ADR’s Encrypt-then-Sign (\(\mathcal {E} t \mathcal {S}\)) method. The \(\mathsf {iMsg1}\) scheme turns out to be a simple scheme from ADR [3]. It may be outsider-secure, but ADR give an attack that shows it is not insider secure. (The adversary queries the sender encryption oracle to get a ciphertext \(((C_1,C_2),S)\), substitutes S with a signature \(S'\) of \(H =\mathsf {SHA1}(C_1\Vert C_2)\) under its own signing key, which it can do as an insider, and then queries this modified ciphertext to the recipient decryption oracle to get back the message underlying the original ciphertext.) The GGKMR [26] attack on \(\mathsf {iMsg1}\) is a clever improvement and real-world rendition of the ADR attack. That Apple acknowledged the GGKMR attack, and modified the scheme to protect against it, indicates that they want insider security, not just outsider security, for their modified \(\mathsf {iMsg2}\) scheme. So the question becomes whether this goal is achieved.

We could answer the above question relative to ADR’s (existing) definitions of insider-secure signcryption, but we do more, affirming the \(\mathsf {iMsg2}\) signcryption scheme under stronger definitions that capture elements particular to messaging systems, making our results of more applied value.

When you send an iMessage communication to Alice, it is encrypted to all her devices (her iPhone, MacBook, iPad, ...), so that she can chat seamlessly across them. To capture this, we enhance signcryption syntax, making the encryption algorithm multi-recipient. (It takes not one, but a list of receiver public encryption keys.) We also allow associated data as in symmetric authenticated encryption [35].

We give, like in prior work [3], a privacy definition (priv) and an authenticity definition (auth); but, unlike prior work, we also give a strong, unified definition (sec) that implies auth+priv. We show that (under certain conditions) sec is implied by auth+priv, mirroring analogous results for symmetric authenticated encryption [9, 15]. Proving that a scheme satisfies sec (the definition more intuitively capturing the practical setting) now reduces to the simpler tasks of separately showing it satisfies auth and priv. These definitions and results are for both insider and outsider security, and parameterized by choices of relaxing relations that allow us to easily capture variants reflecting issues like plaintext or ciphertext integrity [8], gCCA2 [3] and RCCA [18].

Recall that a scheme for conventional symmetric encryption specifies a key-generation algorithm that is run once, a priori, to return a key k; the encryption algorithm then takes k and message m to return a ciphertext. In our definition of a scheme for (symmetric) Encryption under Message-Derived Keys (EMDK), there is no dedicated key-generation algorithm. Encryption algorithm \(\mathsf {\mathsf {EMDK}.Enc}\) takes only a message m, returning both a key k and a ciphertext c, so that k may depend on m. Decryption algorithm \(\mathsf {\mathsf {EMDK}.Dec}\) takes k—in the overlying signcryption scheme, this is communicated to the receiver via asymmetric encryption—and c to return either m or \(\bot \).

We impose two security requirements on an EMDK scheme. (1) The first, called \(\mathsf {ae}\), adapts the authenticated encryption requirement of symmetric encryption [35]. (Our game formalizing \(\mathsf {ae}\) is in Fig. 8.) (2) The second, called \(\mathsf {rob}\), is a form of robustness or wrong-key detection [1, 17, 23, 24]. (Our game formalizing \(\mathsf {rob}\) is also in Fig. 8.) Of course one may define many other and alternative security goals for EMDK, so why these? We have focused on these simply because they suffice for our results.

EMDK is different from both (Symmetric) Encryption of Key-Dependent Messages (EKDM) [14, 16] and (Symmetric) Encryption secure against Related-Key Attack (ERKA) [7]. To begin with, these definitions apply to syntactically different objects. Namely, both EKDM and ERKA are security metrics for the standard symmetric encryption syntax where the encryption algorithm takes a key and message as input and returns a ciphertext, while in EMDK the encryption algorithm takes only a message and itself produces a key along with the ciphertext. (Note that the latter is also different from the syntax of a Key-Encapsulation mechanism, where encryption does produce a key and ciphertext, but takes no input message.) These syntactic differences make comparison moot, but one can still discuss intuitively how the security requirements relate. In the security games for EKDM there is an honestly and randomly chosen target key k, and challenge messages to be encrypted may depend on k, but in our security games for EMDK, the key is not chosen honestly and could depend on the message being encrypted. In ERKA also, like EKDM but unlike EMDK, a target key k is chosen honestly and at random. One can now have the game apply the encryption algorithm under a key \(k'\) derived from k, but this does not capture the encryption algorithm not taking a key as input but itself producing it as a function of the message, as in EMDK.

Equipped with the above, we show how to cast the \(\mathsf {iMsg2}\) signcryption scheme as the result of a general transform (that we specify and call \(\mathsf {IMSG\text {-}SC}\)) on a particular EMDK scheme (that we specify) and some standard auxiliary primitives (that we also specify). In Sect. 5, we prove that \(\mathsf {IMSG\text {-}SC}\) works, reducing insider security (priv, auth, sec) of the signcryption scheme to the security of the constituents, leaving us with what is the main technical task, namely showing security of the EMDK scheme.

In more detail, \(\mathsf {IMSG\text {-}SC}\) takes a scheme \(\mathsf {EMDK}\) for encryption under message-derived keys, a public-key encryption scheme \(\mathsf {PKE}\) and a digital signature scheme \(\mathsf {DS}\) to return a signcryption scheme \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {EMDK}, \mathsf {PKE}, \mathsf {DS}]\). (In the body of the paper, this is done in two steps, with a multi-recipient public-key encryption scheme [6] as an intermediate point, but for simplicity we elide this here.) Both iMessage signcryption schemes (i.e. \(\mathsf {iMsg1}\) and \(\mathsf {iMsg2}\)) can be seen as results of this transform. The two make the same choices of \(\mathsf {PKE}\) and \(\mathsf {DS}\), namely \(\mathsf {RSA}\hbox {-}\mathsf {OAEP}\) and \(\mathsf {EC}\hbox {-}\mathsf {DSA}\) respectively, differing only in their choice of \(\mathsf {EMDK}\), which for \(\mathsf {iMsg1}\) is a trivial scheme that we call the basic scheme, and for \(\mathsf {iMsg2}\) a more interesting scheme that we denote \(\mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\) and discuss below. Our Sect. 5 result is that signcryption scheme \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {EMDK}, \mathsf {PKE}, \mathsf {DS}]\) provides insider security (priv, auth, sec) assuming \(\mathsf {ae}\)- and \(\mathsf {rob}\)-security of \(\mathsf {EMDK}\) and under standard assumptions on \(\mathsf {PKE}\) and \(\mathsf {DS}\).

In Fig. 10 we specify an EMDK scheme \(\mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\) constructed from a given function family \(\mathsf {F}\) and a given, ordinary one-time (assumed deterministic) symmetric encryption scheme \(\mathsf {SE}\). Setting \(\mathsf {F}\) to \(\mathsf {HMAC}\) and \(\mathsf {SE}\) to \(\mathsf {AES}\hbox {-}\mathsf {CTR}\) recovers the EMDK scheme underlying \(\mathsf {iMsg2}\) signcryption. This EMDK scheme captures the heart of \(\mathsf {iMsg2}\) signcryption, namely lines 1–4 of the right side of Fig. 1.
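To make the shape of such a scheme concrete, the following Python sketch mimics the message-derived-key idea using the parameter sizes the text attributes to \(\mathsf {iMsg2}\): an 88-bit random value L, a 40-bit truncated tag h, and a 128-bit key K formed from them. Several details are assumptions made for illustration only and are not taken from Fig. 1 or Fig. 10: the use of HMAC-SHA256, the exact HMAC input, the concatenation order of L and h, the decryption-time check, and above all the stand-in stream cipher, which is built from SHA-256 in counter mode and is not \(\mathsf {AES}\hbox {-}\mathsf {CTR}\).

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

SEED_BYTES = 11  # 88-bit random value L
TAG_BYTES = 5    # 40-bit truncated tag h


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in one-time deterministic cipher (an assumption, NOT AES-CTR):
    XOR the data with SHA-256(key || counter) blocks."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, block))
    return bytes(out)


def emdk_enc(message: bytes) -> Tuple[bytes, bytes]:
    """Take only a message; return (key, ciphertext). The key depends on the message."""
    seed = secrets.token_bytes(SEED_BYTES)                            # L
    tag = hmac.new(seed, message, hashlib.sha256).digest()[:TAG_BYTES]  # h
    key = seed + tag                                                  # K = L || h (assumed order)
    return key, _keystream_xor(key, message)


def emdk_dec(key: bytes, ciphertext: bytes) -> Optional[bytes]:
    """Return the message, or None (standing in for the error symbol) on failure."""
    if len(key) != SEED_BYTES + TAG_BYTES:
        return None
    message = _keystream_xor(key, ciphertext)
    seed, tag = key[:SEED_BYTES], key[SEED_BYTES:]
    expected = hmac.new(seed, message, hashlib.sha256).digest()[:TAG_BYTES]
    # Assumed check: the key must be consistent with the recovered message.
    return message if hmac.compare_digest(tag, expected) else None
```

In this sketch there is no separate key-generation step: `emdk_enc` produces the key as a by-product of encryption, and `emdk_dec` rejects a key that is inconsistent with the message it decrypts to.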

The security analysis of \(\mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\) is somewhat complex. We prove \(\mathsf {ae}\)-security of this EMDK scheme assuming \(\mathsf {F}\) is a random oracle and \(\mathsf {SE}\) has the following properties: one-time IND-CPA privacy, a property we define called uniqueness, and partial key recovery security. The latter strengthens key recovery security to say that, not only is it hard to recover the key, but it is hard to recover even a prefix, of a certain prescribed length, of this key. We prove \(\mathsf {rob}\)-security of the EMDK scheme assuming \(\mathsf {F}\) is a random oracle and \(\mathsf {SE}\) satisfies uniqueness and weak robustness. The properties assumed of \(\mathsf {SE}\) appear to be true for the \(\mathsf {AES}\hbox {-}\mathsf {CTR}\) used in iMessage, and could be shown in idealized models.

Fig. 2. Lower bounds for the bit-security of privacy achieved by iMessage, depending on the key size of \(\mathsf {AES}\hbox {-}\mathsf {CTR}\) and the length of the authentication tag returned by \(\mathsf {HMAC}\). iMessage 10 uses a 128-bit AES key and a 40-bit \(\mathsf {HMAC}\) authentication tag, and hence guarantees at least 39 bits of security for privacy. (Any choice of parameters guarantees 71 bits of security for authenticity.)

What we have proved is that \(\mathsf {iMsg2}\) signcryption is secure in principle, in the sense that the underlying template is sound. (That is, the signcryption scheme given by our \(\mathsf {IMSG\text {-}SC}\) transform is secure assuming the underlying primitives are secure.) For the practical implications, we must consider the quantitative security guaranteed by our theorems based on the particular choices of parameters and primitives made in \(\mathsf {iMsg2}\) signcryption scheme. Here, things seem a bit borderline, because \(\mathsf {iMsg2}\) signcryption has made some specific parameter choices that seem dangerous. Considering again the right side of Fig. 1, the 128-bit \(\mathsf {AES}\) key K at line 3 has only 88 bits of entropy—all the entropy is from the choice of L at line 1—which is not only considered small in practice but also is less than for \(\mathsf {iMsg1}\). (On the left side of the Figure we see that line 1 selects an \(\mathsf {AES}\) key K with the full 128 bits of entropy.) Also the tag h produced at line 2 of the right-hand-side of the Figure is only 40 bits, shorter than recommended lengths for authentication tags. To estimate the impact of these choices, we give concrete attacks on the scheme. They show that the bounds in our theorems are tight, but do not contradict our provable-security results.

Numerical estimates based on our provable-security results say that iMessage 10 guarantees at least 39 bits of security for privacy, and 71 bits of security for authenticity, if \(\mathsf {HMAC}\) and \(\mathsf {AES}\) are modeled as ideal primitives. Figure 2 shows the guaranteed bit-security of privacy for different choices of \(\mathsf {AES}\) key length and \(\mathsf {HMAC}\) tag length. For the small parameter choices made in \(\mathsf {iMsg2}\) signcryption, the attacks do approach feasibility in terms of computational effort, but we wouldn’t claim they are practical, for two reasons. First, they only violate the very stringent security goals that are the target of our proofs. Second, following the GGKMR [26] attacks, Apple has implemented decryption-oracle throttling that will also curtail our attacks.

Still, ideally, a practical scheme would implement cryptography that meets even our stringent security goals without recourse to extraneous measures like throttling. We suggest that parameter and primitive choices in iMessage signcryption be revisited, for if they are chosen properly, our results do guarantee that the scheme provides strong security properties.

When a new primitive (like EMDK) is defined, the first question of a theoretical cryptographer is often, does it exist, meaning, can it be built, and under what assumptions? At least in the random-oracle model [10] in which our results are shown, it is quite easy to build, under standard assumptions, an EMDK scheme that provides the \(\mathsf {ae}\)+\(\mathsf {rob}\)-security we define, and we show such a scheme in Fig. 9. The issue of interest for us is less existence (to build some secure EMDK scheme) and more the security of the particular \(\mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\) scheme underlying \(\mathsf {iMsg2}\) signcryption. The motivation is mainly applied, stemming from this scheme running in security software (iMessage) that is used by millions.

But, one may then ask, why did Apple use their (strange) EMDK scheme instead of one like that in Fig. 9, which is simpler and provable under weaker assumptions? We do not know. In that vein, one may even ask, why did Apple use EMDK at all? The literature gives signcryption schemes that are efficient and based on standard assumptions. Why did they not just take one of them? Again, we do not know for sure, but we can speculate. The EMDK-based template that we capture in our \(\mathsf {IMSG\text {-}SC}\) transform provides backwards decryption compatibility; an \(\mathsf {iMsg1}\) implementation can decrypt an \(\mathsf {iMsg2}\) ciphertext. (Of course, security guarantees revert to those of the \(\mathsf {iMsg1}\) scheme under such usage, but this could be offset by operational gains.) Moving to an entirely new signcryption scheme would not provide this backwards compatibility. But we stress again that this is mere speculation; we did not find any Apple documents giving reasons for their choices.

We have discussed some related work above. However, signcryption is a big research area with a lot of work. We overview this in [13].

2 Preliminaries

In [13] we provide the following standard definitions. We state syntax, correctness and security definitions for function families, symmetric encryption, digital signatures, public-key encryption, and multi-recipient public-key encryption. We define the random oracle model, the ideal cipher model, and provide the birthday attack bounds. In this section we introduce the basic notation and conventions we use throughout the paper.

Let \({{\mathbb N}}=\{1, 2, \ldots \}\) be the set of positive integers. For \(i \in {{\mathbb N}}\) we let [i] denote the set \(\{1, \ldots , i\}\). If \({X}\) is a finite set, we let \(x {\leftarrow }{\$}\, {X}\) denote picking an element of \({X}\) uniformly at random and assigning it to x. Let \(\varepsilon \) denote the empty string. By \(x{\,\Vert \,}y\) we denote the concatenation of strings x and y. If \(x\in \{0,1\}^*\) is a string then |x| denotes its length, x[i] denotes its i-th bit, and \(x[i..j]=x[i]\ldots x[j]\) for \(1\le i\le j\le |x|\). If \(\mathsf {mem}\) is a table, we use \(\mathsf {mem}[i]\) to denote the element of the table that is indexed by i. We use a special symbol \(\bot \) to denote an empty table position; we also return it as an error code indicating an invalid input to an algorithm or an oracle, including invalid decryption. We assume that adversaries never pass \(\bot \) as input to their oracles.

We write \(\langle a, b, \ldots \rangle \) to denote a string that is a uniquely decodable encoding of \(a, b, \ldots \), where each of the encoded elements can have an arbitrary type (e.g. string or set). For any \(n\in {{\mathbb N}}\) let \(x_1, \ldots , x_n\) and \(y_1, \ldots , y_n\) be two sequences of elements such that for each \(i\in [n]\) the following holds: either \(x_i = y_i\), or both \(x_i\) and \(y_i\) are strings of the same length. Then we require that \(\left| \langle x_1, \ldots , x_n \rangle \right| = \left| \langle y_1, \ldots , y_n \rangle \right| \), and that \(\langle x_1, \ldots , x_{i-1}, x_i, x_{i+1}, \ldots , x_n \rangle \oplus \langle x_1, \ldots , x_{i-1}, y_i, x_{i+1}, \ldots , x_n \rangle = \langle x_1, \ldots , x_{i-1}, (x_i \oplus y_i), x_{i+1}, \ldots , x_n \rangle \) for all \(i\in [n]\).

Algorithms may be randomized unless otherwise indicated. Running time is worst case. If A is an algorithm, we let \(y \leftarrow A(x_1,\ldots ;r)\) denote running A with random coins r on inputs \(x_1,\ldots \) and assigning the output to y. We let \(y {\leftarrow }{\$}\, A(x_1,\ldots )\) be the result of picking r at random and letting \(y \leftarrow A(x_1,\ldots ;r)\). We let \([A(x_1,\ldots )]\) denote the set of all possible outputs of A when invoked with inputs \(x_1,\ldots \). Adversaries are algorithms.

We use the code based game playing framework of [11]. (See Fig. 5 for an example.) We let \(\Pr [\mathrm {G}]\) denote the probability that game \(\mathrm {G}\) returns \(\mathsf {true}\). In the security reductions, we omit specifying the running times of the constructed adversaries when they are roughly the same as the running time of the initial adversary.

In algorithms and games, uninitialized integers are assumed to be initialized to 0, Booleans to \(\mathsf {false}\), strings to the empty string, sets to the empty set, and tables are initially empty.

Let \(\mathsf {prim}\) be any cryptographic primitive, and let \(\mathsf {sec}\) be any security notion defined for this primitive. We say that \(\mathsf {prim}\) has n bits of security with respect to \(\mathsf {sec}\) (or n bits of \(\mathsf {sec}\)-security) if for every adversary \(\mathcal{A}\) that has advantage \(\epsilon _\mathcal{A}\) and runtime \(T_\mathcal{A}\) against \(\mathsf {sec}\)-security of \(\mathsf {prim}\) it is true that \(\epsilon _\mathcal{A}/ T_\mathcal{A}< 2^{-n}\). In other words, if there exists an adversary \(\mathcal{A}\) with advantage \(\epsilon _\mathcal{A}\) and runtime \(T_\mathcal{A}\) against \(\mathsf {sec}\)-security of \(\mathsf {prim}\), then \(\mathsf {prim}\) has at most \(- \log _2 (\epsilon _\mathcal{A}/ T_\mathcal{A})\) bits of security with respect to \(\mathsf {sec}\). This is the folklore definition of bit-security for cryptographic primitives. Micciancio and Walter [31] recently proposed an alternative definition for bit-security.
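As a small numerical illustration of the folklore definition above, the following Python sketch computes the upper bound on bit-security implied by a single adversary with a given advantage and runtime. The function name and the input validation are illustrative choices, not part of the paper.

```python
import math


def bit_security_upper_bound(advantage: float, runtime: float) -> float:
    """Upper bound -log2(advantage / runtime) on the bit-security of a
    primitive, implied by one adversary with this advantage and runtime."""
    if not 0.0 < advantage <= 1.0:
        raise ValueError("advantage must lie in (0, 1]")
    if runtime < 1.0:
        raise ValueError("runtime must be at least one step")
    return -math.log2(advantage / runtime)
```

For instance, an exhaustive search that succeeds with advantage 1 after \(2^{88}\) steps caps the bit-security at 88, while a one-step guess that succeeds with probability \(2^{-40}\) caps it at 40.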

Let \(\mathcal {BS}(\mathsf {prim}, \mathsf {sec})\) denote the bit-security of cryptographic primitive \(\mathsf {prim}\) with respect to security notion \(\mathsf {sec}\). Consider any security reduction showing \(\mathsf {Adv}^{\mathsf {\mathsf {sec}}}_{\mathsf {prim}}(\mathcal{A}) \le \sum _i \mathsf {Adv}^{\mathsf {\mathsf {sec}_i}}_{\mathsf {prim}_i}(\mathcal{B}^{\mathcal{A}}_i)\) by constructing for any adversary \(\mathcal{A}\) and for each i a new adversary \(\mathcal{B}^{\mathcal{A}}_i\) with runtime roughly \(T_\mathcal{A}\). Then we can lower bound the bit-security of \(\mathsf {prim}\) with respect to \(\mathsf {sec}\) as

$$\begin{aligned} \mathcal {BS}(\mathsf {prim}, \mathsf {sec})&= \min _{\forall \mathcal{A}}\; -\log _2\left( \frac{\epsilon _\mathcal{A}}{T_\mathcal{A}}\right) \ge \min _{\forall \mathcal{A}}\; -\log _2 \left( \frac{\sum _i \mathsf {Adv}^{\mathsf {\mathsf {sec}_i}}_{\mathsf {prim}_i}(\mathcal{B}^{\mathcal{A}}_i)}{T_\mathcal{A}} \right) \\&\ge - \log _2 \left( \sum _i 2^{- \mathcal {BS}(\mathsf {prim}_i, \mathsf {sec}_i)} \right) . \end{aligned}$$
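The final expression in the bound above is easy to evaluate numerically. The following Python sketch does so for a list of component bit-securities; it assumes, as in the text, a reduction whose constructed adversaries run in roughly the same time as the original adversary. The function name is an illustrative choice.

```python
import math
from typing import Iterable


def bit_security_lower_bound(component_bits: Iterable[float]) -> float:
    """Evaluate -log2(sum_i 2^(-BS_i)) for the bit-securities BS_i of the
    constituent (primitive, notion) pairs appearing in the reduction."""
    total = sum(2.0 ** (-bits) for bits in component_bits)
    if total <= 0.0:
        raise ValueError("need at least one component")
    return -math.log2(total)
```

The weakest component dominates: two components with 128 bits each give 127 bits, whereas components with 40 and 88 bits give a bound only marginally below 40.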

3 Signcryption

In this section we define syntax, correctness and security notions for multi-recipient signcryption schemes. We assume that upon generating any signcryption key pair \((\mathit {pk}, \mathit {sk})\), it gets associated to some identity \(\mathit {id}\). This captures a system where users can independently generate their cryptographic keys prior to registering them with a public-key infrastructure. We require that all identities are distinct values in \(\{0,1\}^*\). Depending on the system, each identity serves as a label that uniquely identifies a device or a user. Note that \(\mathit {pk}\) cannot be used in place of the identity, because different devices can happen to use the same public keys (either due to generating the same key pairs by chance, or due to maliciously claiming someone else’s public key). We emphasize that our syntax is not meant to capture identity-based signcryption, where a public key would have to depend on the identity. In [13] we provide an extensive summary of prior work on signcryption.

We focus on authenticity and privacy of signcryption in the insider setting, meaning that the adversary is allowed to adaptively compromise secret keys of any identities as long as that does not enable the adversary to trivially win the security games. Our definitions can also capture the outsider setting by considering limited classes of adversaries. We define our security notions with respect to relaxing relations. This allows us to capture a number of weaker security notions in a fine-grained way, by choosing an appropriate relaxing relation in each case. In [13] we define a combined security notion for signcryption that simultaneously encompasses authenticity and privacy, and prove that it is equivalent to the separate notions under certain conditions.

Fig. 3. Syntax of the constituent algorithms of signcryption scheme \(\mathsf {SC}\).

A multi-recipient signcryption scheme \(\mathsf {SC}\) specifies algorithms \(\mathsf {\mathsf {SC}.Setup}\), \(\mathsf {\mathsf {SC}.Kg}\), \(\mathsf {\mathsf {SC}.SigEnc}\), \(\mathsf {\mathsf {SC}.VerDec}\), where \(\mathsf {\mathsf {SC}.VerDec}\) is deterministic. Associated to \(\mathsf {SC}\) is an identity space \(\mathsf {\mathsf {SC}.ID}\). The setup algorithm \(\mathsf {\mathsf {SC}.Setup}\) returns public parameters \(\pi \). The key generation algorithm \(\mathsf {\mathsf {SC}.Kg}\) takes \(\pi \) to return a key pair \((\mathit {pk}, \mathit {sk})\), where \(\mathit {pk}\) is a public key and \(\mathit {sk}\) is a secret key. The signcryption algorithm \(\mathsf {\mathsf {SC}.SigEnc}\) takes \(\pi \), sender’s identity \(\mathit {id}_s\), sender’s public key \(\mathit {pk}_s\), sender’s secret key \(\mathit {sk}_s\), a set \(\mathcal {R}\) of pairs \((\mathit {id}_r, \mathit {pk}_r)\) containing recipient identities and public keys, a plaintext \(m\in \{0,1\}^*\), and associated data \(\mathit {ad}\in \{0,1\}^*\) to return a set \(\mathcal {C}\) of pairs \((\mathit {id}_r, c)\), each denoting that signcryption ciphertext c should be sent to the recipient with identity \(\mathit {id}_r\). The unsigncryption algorithm \(\mathsf {\mathsf {SC}.VerDec}\) takes \(\pi \), sender’s identity \(\mathit {id}_s\), sender’s public key \(\mathit {pk}_s\), recipient’s identity \(\mathit {id}_r\), recipient’s public key \(\mathit {pk}_r\), recipient’s secret key \(\mathit {sk}_r\), signcryption ciphertext c, and associated data \(\mathit {ad}\) to return \(m\in \{0,1\}^*\cup \{\perp \}\), where \(\perp \) indicates a failure to recover plaintext. The syntax used for the constituent algorithms of \(\mathsf {SC}\) is summarized in Fig. 3.

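The correctness of a signcryption scheme \(\mathsf {SC}\) requires that for all \(\pi \in [\mathsf {\mathsf {SC}.Setup}]\), all \(n\in {{\mathbb N}}\), all \((\mathit {pk}_0, \mathit {sk}_0), \ldots , (\mathit {pk}_n, \mathit {sk}_n) \in [\mathsf {\mathsf {SC}.Kg}(\pi )]\), all \(\mathit {id}_0 \in \mathsf {\mathsf {SC}.ID}\), all distinct \(\mathit {id}_1, \ldots , \mathit {id}_n \in \mathsf {\mathsf {SC}.ID}\), all \(m\in \{0,1\}^*\), and all \(\mathit {ad}\in \{0,1\}^*\) the following conditions hold. Let \(\mathcal {R}= \{(\mathit {id}_i, \mathit {pk}_i)\}_{1\le i\le n}\). We require that for all \(\mathcal {C}\in [\mathsf {\mathsf {SC}.SigEnc}(\pi , \mathit {id}_0, \mathit {pk}_0, \mathit {sk}_0, \mathcal {R}, m, \mathit {ad})]\): (i) \(\left| \mathcal {C}\right| = \left| \mathcal {R}\right| \); (ii) for each \(i\in \{1,\ldots ,n\}\) there exists a unique \(c\in \{0,1\}^*\) such that \((\mathit {id}_i, c) \in \mathcal {C}\); (iii) for each \(i\in \{1,\ldots ,n\}\) and each c such that \((\mathit {id}_i, c) \in \mathcal {C}\) we have \(m = \mathsf {\mathsf {SC}.VerDec}(\pi , \mathit {id}_0, \mathit {pk}_0, \mathit {id}_i, \mathit {pk}_i, \mathit {sk}_i, c, \mathit {ad})\).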
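The correctness conditions (i)–(iii) can be spot-checked mechanically. The Python sketch below does this for one sampled instance rather than for all inputs, so it is a sanity test and not a proof. The class `ToySC` is a deliberately insecure placeholder invented here only to exercise the interface; its method names, argument order and the identity strings are assumptions loosely modelled on the syntax described above.

```python
import hashlib
import secrets
from typing import Optional, Set, Tuple


class ToySC:
    """Insecure placeholder scheme (hypothetical): provides neither privacy
    nor authenticity; it exists only to exercise the multi-recipient syntax."""

    def setup(self) -> bytes:
        return b"toy-params"

    def kg(self, pi: bytes) -> Tuple[bytes, bytes]:
        sk = secrets.token_bytes(16)
        return hashlib.sha256(pi + sk).digest(), sk  # (pk, sk)

    def sig_enc(self, pi, id_s, pk_s, sk_s, recipients, m, ad) -> Set[Tuple[bytes, bytes]]:
        # One (recipient identity, ciphertext) pair per recipient.
        return {(id_r, b"toy:" + m) for (id_r, _pk_r) in recipients}

    def ver_dec(self, pi, id_s, pk_s, id_r, pk_r, sk_r, c, ad) -> Optional[bytes]:
        return c[4:] if c.startswith(b"toy:") else None  # None stands for failure


def check_correctness(sc, n: int, m: bytes, ad: bytes) -> bool:
    """Spot-check conditions (i)-(iii) on one freshly sampled instance."""
    pi = sc.setup()
    pk0, sk0 = sc.kg(pi)
    id0 = b"sender"
    keys = [sc.kg(pi) for _ in range(n)]
    ids = [b"recipient-%d" % i for i in range(n)]  # distinct identities
    recipients = {(ids[i], keys[i][0]) for i in range(n)}
    cts = sc.sig_enc(pi, id0, pk0, sk0, recipients, m, ad)
    if len(cts) != len(recipients):                 # condition (i)
        return False
    for i in range(n):
        mine = [c for (rid, c) in cts if rid == ids[i]]
        if len(mine) != 1:                          # condition (ii)
            return False
        pk_i, sk_i = keys[i]
        if sc.ver_dec(pi, id0, pk0, ids[i], pk_i, sk_i, mine[0], ad) != m:
            return False                            # condition (iii)
    return True
```

A scheme that silently drops a recipient, or that returns two ciphertexts for the same identity, would fail this check at condition (i) or (ii) respectively.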

Fig. 4. Relaxing relations \({\mathsf {R}_{\mathsf {m}}}\) and \(\mathsf {R}_{\mathsf {id}}\).

A relaxing relation \(\mathsf {R}\subseteq \{0,1\}^* \times \{0,1\}^*\) is a set containing pairs of arbitrary strings. Associated to a relaxing relation \(\mathsf {R}\) is a membership verification algorithm \(\mathsf {R.Vf}\) that takes inputs \(z_0, z_1\in \{0,1\}^*\) to return a decision in \(\{\mathsf {true},\mathsf {false}\}\) such that \(\forall z_0,z_1\in \{0,1\}^* :\mathsf {R.Vf}(z_0,z_1) = \mathsf {true}\) iff \((z_0,z_1) \in \mathsf {R}\). We will normally define relaxing relations by specifying their membership verification algorithms. Two relaxing relations that will be used throughout the paper are defined in Fig. 4.

We define our security notions for signcryption with respect to relaxing relations. Relaxing relations are used to restrict the queries that an adversary is allowed to make to its unsigncryption oracle. The choice of different relaxing relations can be used to capture a variety of different security notions for signcryption in a fine-grained way. We will use relaxing relations \(\mathsf {R}_{\mathsf {id}}\) and \({\mathsf {R}_{\mathsf {m}}}\) to capture strong vs. standard authenticity (or unforgeability) of signcryption, and IND-CCA vs. RCCA [18, 27] style indistinguishability of signcryption. In Sect. 5.3 we will also define unforgeability of digital signatures with respect to relaxing relations, allowing us to capture standard and strong unforgeability notions in a unified way.

Fig. 5. Game defining authenticity of signcryption scheme \(\mathsf {SC}\) with respect to relaxing relation \(\mathsf {R}\).

Consider game \(\mathrm {G}^{\mathsf {auth}}\) of Fig. 5 associated to a signcryption scheme \(\mathsf {SC}\), a relaxing relation \(\mathsf {R}\) and an adversary \(\mathcal{F}\). The advantage of adversary \(\mathcal{F}\) in breaking the \(\mathrm {AUTH}\)-security of \(\mathsf {SC}\) with respect to \(\mathsf {R}\) is defined as \(\mathsf {Adv}^{\mathsf {auth}}_{\mathsf {SC}, \mathsf {R}}(\mathcal{F}) = \Pr [\mathrm {G}^{\mathsf {auth}}_{\mathsf {SC}, \mathsf {R}, \mathcal{F}}]\). Adversary \(\mathcal{F}\) has access to oracles \(\textsc {NewH}\), \(\textsc {NewC}\), \(\textsc {Exp}\), \(\textsc {SigEnc}\), and \(\textsc {VerDec}\). The oracles can be called in any order. Oracle \(\textsc {NewH}\) generates a key pair for a new honest identity \(\mathit {id}\). Oracle \(\textsc {NewC}\) associates a key pair \((\mathit {pk}, \mathit {sk})\) of the adversary’s choice to a new corrupted identity \(\mathit {id}\); it permits malformed keys, meaning \(\mathit {sk}\) need not be a valid secret key that matches \(\mathit {pk}\). Oracle \(\textsc {Exp}\) can be called to expose the secret key of any identity. The game maintains a table \(\mathsf {exp}\) to mark which identities are exposed; all corrupted identities that were created by calling oracle \(\textsc {NewC}\) are marked as exposed right away. The signcryption oracle \(\textsc {SigEnc}\) returns ciphertexts produced by sender identity \(\mathit {id}_s\) to each of the recipient identities contained in set \(\mathcal {I}\), encrypting message m with associated data \(\mathit {ad}\). Oracle \(\textsc {VerDec}\) returns the plaintext obtained as the result of unsigncrypting the ciphertext c sent from sender \(\mathit {id}_s\) to recipient \(\mathit {id}_r\), with associated data \(\mathit {ad}\). The goal of adversary \(\mathcal{F}\) is to forge a valid signcryption ciphertext, and query it to oracle \(\textsc {VerDec}\). The game does not let the adversary win by querying oracle \(\textsc {VerDec}\) with a forgery that was produced for an exposed sender identity \(\mathit {id}_s\), since the adversary could have trivially produced a valid ciphertext due to its knowledge of the sender’s secret key. 
Certain choices of relaxing relation \(\mathsf {R}\) can lead to another trivial attack.

When adversary \(\mathcal{F}\) in game \(\mathrm {G}^{\mathsf {auth}}_{\mathsf {SC}, \mathsf {R}, \mathcal{F}}\) calls oracle \(\textsc {SigEnc}\) on inputs \(\mathit {id}_s, \mathcal {I}, m, \mathit {ad}\), then for each ciphertext c produced for a recipient \(\mathit {id}_r \in \mathcal {I}\) the game adds a tuple consisting of \(\mathit {id}_s\), \(\mathit {id}_r\), m, \(\mathit {ad}\) and c to set \(Q\). This set is then used inside oracle \(\textsc {VerDec}\). Oracle \(\textsc {VerDec}\) constructs the corresponding tuple \(z_0\) from its own inputs and prevents the adversary from winning the game if \(\mathsf {R.Vf}(z_0, z_1)\) is true for any \(z_1 \in Q\). If the relaxing relation is empty (meaning \(\mathsf {R}= \emptyset \) and hence \(\mathsf {R.Vf}(z_0, z_1) = \mathsf {false}\) for all \(z_0, z_1\in \{0,1\}^*\)) then an adversary is allowed to trivially win the game by calling oracle \(\textsc {SigEnc}\) and claiming any of the resulting ciphertexts as a forgery (without changing the sender and recipient identities). Let us call this a “ciphertext replay” attack.

In order to capture a meaningful security notion, the \(\mathrm {AUTH}\)-security of \(\mathsf {SC}\) should be considered with respect to a relaxing relation that prohibits the above trivial attack. The strongest such security notion is achieved by considering \(\mathrm {AUTH}\)-security of \(\mathsf {SC}\) with respect to the relaxing relation \(\mathsf {R}_{\mathsf {id}}\) that is defined in Fig. 4; this relaxing relation prevents only the ciphertext replay attack. The resulting security notion captures the strong authenticity (or unforgeability) of signcryption. Alternatively, one could think of this notion as capturing the ciphertext integrity of signcryption.

Note that a relaxing relation \(\mathsf {R}\) prohibits the ciphertext replay attack iff \(\mathsf {R}_{\mathsf {id}}\subseteq \mathsf {R}\). Now consider the relaxing relation \({\mathsf {R}_{\mathsf {m}}}\) as defined in Fig. 4; it is a proper superset of \(\mathsf {R}_{\mathsf {id}}\). The \(\mathrm {AUTH}\)-security of \(\mathsf {SC}\) with respect to \({\mathsf {R}_{\mathsf {m}}}\) captures the standard authenticity (or unforgeability, or plaintext integrity) of signcryption. The resulting security notion does not let the adversary win by merely replaying an encryption of m with associated data \(\mathit {ad}\) from \(\mathit {id}_s\) to \(\mathit {id}_r\) for any fixed \((\mathit {id}_s, \mathit {id}_r, m, \mathit {ad})\), even if the adversary can produce a new ciphertext that was not seen before.
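The role of the two relaxing relations can be illustrated with a short Python sketch. The layout of the recorded tuples, a context (sender, recipient, message, associated data) paired with a ciphertext, is an assumption made here for illustration, since Fig. 4 and Fig. 5 are not reproduced in the text; the function names are likewise hypothetical.

```python
from typing import Callable, Iterable, Tuple

Context = Tuple[bytes, bytes, bytes, bytes]  # (sender, recipient, message, associated data)
Record = Tuple[Context, bytes]               # (context, ciphertext); assumed layout


def r_id_vf(z0: Record, z1: Record) -> bool:
    """Identity relation: related only if everything, including the ciphertext, matches."""
    return z0 == z1


def r_m_vf(z0: Record, z1: Record) -> bool:
    """Message relation: related whenever the contexts match, whatever the ciphertexts."""
    (x0, _c0), (x1, _c1) = z0, z1
    return x0 == x1


def r_empty_vf(z0: Record, z1: Record) -> bool:
    """Empty relation: nothing is related, so replays are not filtered out."""
    return False


def is_trivial(z0: Record, seen: Iterable[Record], vf: Callable[[Record, Record], bool]) -> bool:
    """True if the query z0 is related to some recorded signcryption output,
    in which case it does not count as a forgery."""
    return any(vf(z0, z1) for z1 in seen)
```

Under these assumptions, replaying a recorded ciphertext is filtered by both non-empty relations, while a fresh ciphertext for an already signcrypted context is filtered only by the message relation. This matches the distinction between strong and standard authenticity drawn above.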

Game \(\mathrm {G}^{\mathsf {auth}}_{\mathsf {SC}, \mathsf {R}, \mathcal{F}}\) captures the authenticity of \(\mathsf {SC}\) in the insider setting, because it allows the adversary to win by producing a forgery from an honest sender identity to an exposed recipient identity. This, in particular, implies that \(\mathsf {SC}\) assures non-repudiation, meaning that the sender cannot deny the validity of a ciphertext it sent to a recipient (since the knowledge of the recipient’s secret key does not help to produce a forgery). In contrast, the outsider authenticity only requires \(\mathsf {SC}\) to be secure when both the sender and the recipient are honest. Our definition can capture the notion of outsider authenticity by considering a class of outsider adversaries that never attempt a forgery to an exposed recipient identity.

Fig. 6. Games defining privacy of signcryption scheme \(\mathsf {SC}\) with respect to relaxing relation \(\mathsf {R}\).

Consider game \(\mathrm {G}^{\mathsf {priv}}\) of Fig. 6 associated to a signcryption scheme \(\mathsf {SC}\), a relaxing relation \(\mathsf {R}\) and an adversary \(\mathcal{D}\). The advantage of adversary \(\mathcal{D}\) in breaking the \(\mathrm {PRIV}\)-security of \(\mathsf {SC}\) with respect to \(\mathsf {R}\) is defined as \(\mathsf {Adv}^{\mathsf {priv}}_{\mathsf {SC}, \mathsf {R}}(\mathcal{D}) = 2\Pr [\mathrm {G}^{\mathsf {priv}}_{\mathsf {SC}, \mathsf {R}, \mathcal{D}}] - 1\). The game samples a challenge bit \(b\in \{0,1\}\), and the adversary is required to guess it. Adversary \(\mathcal{D}\) has access to oracles \(\textsc {NewH}\), \(\textsc {NewC}\), \(\textsc {Exp}\), \(\textsc {LR}\), and \(\textsc {VerDec}\). The oracles can be called in any order. Oracles \(\textsc {NewH}\), \(\textsc {NewC}\), and \(\textsc {Exp}\) are the same as in the authenticity game (with the exception of oracle \(\textsc {Exp}\) also checking table \(\mathsf {ch}\), which is explained below). Oracle \(\textsc {LR}\) encrypts challenge message \(m_b\), together with the given associated data, from the given sender identity to each of the recipient identities contained in set \(\mathcal {I}\). Oracle \(\textsc {LR}\) aborts if \(m_0 \ne m_1\) and the recipient set \(\mathcal {I}\) contains an identity that is exposed. Without this check, the adversary would be able to trivially win the game by using the exposed recipient’s secret key to decrypt a challenge ciphertext produced by this oracle. If \(m_0 \ne m_1\) and none of the recipient identities is exposed, then oracle \(\textsc {LR}\) uses table \(\mathsf {ch}\) to mark each of the recipient identities; the game will no longer allow the adversary to expose any of these identities by calling oracle \(\textsc {Exp}\). Oracle \(\textsc {VerDec}\) returns the plaintext obtained as the result of unsigncrypting a ciphertext c sent from a given sender to a given recipient with the given associated data. We discuss the choice of a relaxing relation \(\mathsf {R}\) below. However, note that oracle \(\textsc {LR}\) updates the set \(Q\) (used by the relaxing relation) only when \(m_0 \ne m_1\). This is because the output of \(\textsc {LR}\) does not depend on the challenge bit when \(m_0 = m_1\), and hence such queries should not affect the set of prohibited queries to oracle \(\textsc {VerDec}\).

The output of oracle \(\textsc {VerDec}\) in game \(\mathrm {G}^{\mathsf {priv}}\) is a pair containing the plaintext (or the incorrect decryption symbol \(\perp \)) as its first element, and the status message as its second element. This ensures that the adversary can distinguish whether \(\textsc {VerDec}\) returned \(\perp \) because it failed to decrypt the ciphertext (yields error message \(\text {``}\mathrm {dec}\text {''}\)), or because the relaxing relation prohibits the query (yields error message \(\text {``}\mathrm {priv}\text {''}\)). Giving more information to the adversary results in a stronger security definition, and will help us prove equivalence between the joint and separate security notions of signcryption in [13]. Note that an adversary can distinguish between different output branches of all other oracles used in our authenticity and privacy games.

Consider relaxing relations \(\mathsf {R}_{\mathsf {id}}\) and \({\mathsf {R}_{\mathsf {m}}}\) that are defined in Fig. 4. We recover IND-CCA security of \(\mathsf {SC}\) as the \(\mathrm {PRIV}\)-security of \(\mathsf {SC}\) with respect to \(\mathsf {R}_{\mathsf {id}}\). And we capture the RCCA security of \(\mathsf {SC}\) as the \(\mathrm {PRIV}\)-security of \(\mathsf {SC}\) with respect to \({\mathsf {R}_{\mathsf {m}}}\). Recall that the intuition behind the RCCA security [18, 27] is to prohibit the adversary from querying its decryption oracle with ciphertexts that encrypt a previously queried challenge message. In particular, this is the reason that two elements are added to set \(Q\) during each call to oracle \(\textsc {LR}\), one for each of \(m_0\) and \(m_1\). Our definition of RCCA security for \(\mathsf {SC}\) is very similar to that of IND-gCCA2 security as proposed by An, Dodis and Rabin [3]. The difference is that our definition passes the decrypted message as input to the relation, whereas IND-gCCA2 instead allows relations that take public keys of sender and recipient as input. It is not clear that having the relation take the public key would make our definition meaningfully stronger.

Game \(\mathrm {G}^{\mathsf {priv}}_{\mathsf {SC}, \mathsf {R}, \mathcal{D}}\) captures the privacy of \(\mathsf {SC}\) in the insider setting, meaning that the adversary is allowed to request challenge encryptions from a sender to a recipient even when the sender is exposed. This implies some form of forward security because exposing the sender’s key does not help the adversary win the indistinguishability game. To recover the notion of outsider privacy, consider a class of outsider adversaries that never query oracle \(\textsc {LR}\) with an exposed sender identity.

4 Encryption Under Message Derived Keys

We now define Encryption under Message Derived Keys (EMDK). It can be thought of as a special type of symmetric encryption that allows the use of keys that depend on the messages to be encrypted. This type of primitive will be at the core of analyzing the security of the iMessage-based signcryption scheme. In Sect. 4.1 we define syntax, correctness and basic security notions for EMDK schemes. In Sect. 4.2 we define the iMessage-based EMDK scheme and analyze its security.

4.1 Syntax, Correctness and Security of EMDK

We start by defining the syntax and correctness of encryption schemes under message derived keys. The interaction between constituent algorithms of \(\mathsf {EMDK}\) is shown in Fig. 7. The main security notions for EMDK schemes are \(\mathrm {AE}\) (authenticated encryption) and \(\mathrm {ROB}\) (robustness). We also define the \(\mathrm {IND}\) (indistinguishability) notion that will be used in Sect. 4.2 for an intermediate result towards showing the \(\mathrm {AE}\)-security of the iMessage-based EMDK scheme.

Fig. 7. Constituent algorithms of encryption scheme under message derived keys \(\mathsf {EMDK}\).

An encryption scheme under message derived keys \(\mathsf {EMDK}\) specifies algorithms \(\mathsf {\mathsf {EMDK}.Enc}\) and \(\mathsf {\mathsf {EMDK}.Dec}\), where \(\mathsf {\mathsf {EMDK}.Dec}\) is deterministic. Associated to \(\mathsf {EMDK}\) is a key length \(\mathsf {\mathsf {EMDK}.kl}\in {{\mathbb N}}\). The encryption algorithm \(\mathsf {\mathsf {EMDK}.Enc}\) takes a message \(m\in \{0,1\}^*\) to return a key \(k\in \{0,1\}^{\mathsf {\mathsf {EMDK}.kl}}\) and a ciphertext \(c\in \{0,1\}^*\). The decryption algorithm \(\mathsf {\mathsf {EMDK}.Dec}\) takes \(k, c\) to return message \(m \in \{0,1\}^* \cup \{\perp \}\), where \(\perp \) denotes incorrect decryption. Decryption correctness requires that \(\mathsf {\mathsf {EMDK}.Dec}(k, c) = m\) for all \(m\in \{0,1\}^*\), and all \((k, c) \in [\mathsf {\mathsf {EMDK}.Enc}(m)]\).

Fig. 8. Games defining indistinguishability, authenticated encryption security, and robustness of encryption scheme under message derived keys \(\mathsf {EMDK}\).

Consider game \(\mathrm {G}^{\mathsf {ind}}\) of Fig. 8, associated to an encryption scheme under message derived keys \(\mathsf {EMDK}\), and to an adversary \(\mathcal{D}\). The advantage of \(\mathcal{D}\) in breaking the \(\mathrm {IND}\) security of \(\mathsf {EMDK}\) is defined as \(\mathsf {Adv}^{\mathsf {ind}}_{\mathsf {EMDK}}(\mathcal{D}) = 2 \cdot \Pr [\mathrm {G}^{\mathsf {ind}}_{\mathsf {EMDK}, \mathcal{D}}] - 1\). The game samples a random challenge bit b and requires the adversary to guess it. The adversary has access to an encryption oracle \(\textsc {LR}\) that takes two challenge messages \(m_0, m_1\) to return an \(\mathsf {EMDK}\) encryption of \(m_b\).

Consider game \(\mathrm {G}^{\mathsf {ae}}\) of Fig. 8, associated to an encryption scheme under message derived keys \(\mathsf {EMDK}\), and to an adversary \(\mathcal{D}\). The advantage of \(\mathcal{D}\) in breaking the \(\mathrm {AE}\) security of \(\mathsf {EMDK}\) is defined as \(\mathsf {Adv}^{\mathsf {ae}}_{\mathsf {EMDK}}(\mathcal{D}) = 2 \cdot \Pr [\mathrm {G}^{\mathsf {ae}}_{\mathsf {EMDK}, \mathcal{D}}] - 1\). Compared to the indistinguishability game from above, game \(\mathrm {G}^{\mathsf {ae}}\) saves the keys and ciphertexts produced by oracle \(\textsc {LR}\), and also provides a decryption oracle \(\textsc {Dec}\) to adversary \(\mathcal{D}\). The decryption oracle allows the adversary to decrypt a ciphertext with any key that was saved by oracle \(\textsc {LR}\), returning either the actual decryption m (if \(b = 1\)) or the incorrect decryption symbol \(\perp \) (if \(b = 0\)). To prevent trivial wins, the adversary is not allowed to query oracle \(\textsc {Dec}\) with a key-ciphertext pair that was produced by the same \(\textsc {LR}\) query.

Consider game \(\mathrm {G}^{\mathsf {rob}}\) of Fig. 8, associated to an encryption scheme under message derived keys \(\mathsf {EMDK}\), and to an adversary \(\mathcal{G}\). The advantage of \(\mathcal{G}\) in breaking the \(\mathrm {ROB}\) security of \(\mathsf {EMDK}\) is defined as \(\mathsf {Adv}^{\mathsf {rob}}_{\mathsf {EMDK}}(\mathcal{G}) = \Pr [\mathrm {G}^{\mathsf {rob}}_{\mathsf {EMDK}, \mathcal{G}}]\). To win the game, adversary \(\mathcal{G}\) is required to find \((c, k_0, k_1, m_0, m_1)\) such that c decrypts to \(m_0\) under key \(k_0\), and c decrypts to \(m_1\) under key \(k_1\), but \(m_0 \ne m_1\). Furthermore, the game requires that the ciphertext (along with one of the keys) was produced during a call to oracle \(\textsc {Enc}\) that takes a message m as input to return the output \((k, c)\) of running \(\mathsf {\mathsf {EMDK}.Enc}(m)\) with honestly generated random coins. The other key can be arbitrarily chosen by the adversary. In the symmetric encryption setting, a similar notion called wrong-key detection was previously defined by Canetti et al. [17]. The notion of robustness for public-key encryption was formalized by Abdalla et al. [1] and further extended by Farshim et al. [23].

Fig. 9. Sample EMDK scheme \(\mathsf {EMDK}= \mathsf {SIMPLE\text {-}EMDK}\) in the ROM.

It is easy to build an EMDK scheme that is both \(\mathrm {AE}\)-secure and \(\mathrm {ROB}\)-secure. One example of such a scheme is the construction \(\mathsf {SIMPLE\text {-}EMDK}\) in the random oracle model (ROM) that is defined in Fig. 9. In the next section we will define the EMDK scheme used in iMessage; it looks convoluted, and its security is hard to prove even in the ideal models. In [13] we define the EMDK scheme that was initially used in iMessage; it was replaced with the current EMDK scheme in order to fix a security flaw in the iMessage design. We believe that the design of the currently used EMDK scheme was chosen based on a requirement to maintain backward-compatibility across the initial and the current versions of the iMessage protocol.

4.2 iMessage-Based EMDK Scheme

In this section we define the EMDK scheme \(\mathsf {IMSG\text {-}EMDK}\) that is used as the core building block in the construction of iMessage (we use it to specify the iMessage-based signcryption scheme in Sect. 5). We will provide reductions showing the \(\mathrm {AE}\)-security and the \(\mathrm {ROB}\)-security of \(\mathsf {IMSG\text {-}EMDK}\). These security reductions will first require us to introduce two new security notions for symmetric encryption schemes: partial key recovery and weak robustness.

Fig. 10. iMessage-based EMDK scheme \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\).

Let \(\mathsf {SE}\) be a symmetric encryption scheme. Let \(\mathsf {F}\) be a function family with \(\mathsf {F.In}= \{0,1\}^*\) such that \(\mathsf {F.kl}+ \mathsf {F.ol}= \mathsf {\mathsf {SE}.kl}\). Then \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\) is the EMDK scheme as defined in Fig. 10, with key length \(\mathsf {\mathsf {EMDK}.kl}=\mathsf {\mathsf {SE}.kl}\).

Informally, the encryption algorithm \(\mathsf {\mathsf {EMDK}.Enc}(m)\) samples a hash function key \(r_0\) and computes the hash \(r_1 \leftarrow \mathsf {F.Ev}(r_0, m)\). It then encrypts m by running \(\mathsf {\mathsf {SE}.Enc}(k, m)\), where \(k = r_0 {\,\Vert \,}r_1\) is a message-derived key. The decryption algorithm splits k into \(r_0\) and \(r_1\) and, upon recovering m, checks that \(r_1 = \mathsf {F.Ev}(r_0, m)\). In the iMessage construction, \(\mathsf {SE}\) is instantiated with AES-CTR using 128-bit keys and a fixed IV=1, whereas \(\mathsf {F}\) is instantiated with HMAC-SHA256 using \(\mathsf {F.kl}= 88\) and \(\mathsf {F.ol}= 40\).
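The informal description above can be illustrated with a minimal Python sketch. It is not the construction of Fig. 10 verbatim: the function names are hypothetical, HMAC-SHA256 is simply truncated to \(\mathsf {F.ol}\) bits (an assumption about how the 40-bit output is obtained), and AES-CTR is replaced by an insecure SHA-256-based keystream so that the example runs with the standard library only.

```python
import hashlib
import hmac
import secrets

F_KL = 11  # F.kl = 88 bits, expressed in bytes
F_OL = 5   # F.ol = 40 bits, expressed in bytes


def f_ev(r0: bytes, m: bytes) -> bytes:
    """F.Ev(r0, m): HMAC-SHA256 keyed with r0, truncated to F.ol bits (assumed)."""
    return hmac.new(r0, m, hashlib.sha256).digest()[:F_OL]


def se_enc(k: bytes, m: bytes) -> bytes:
    """Stand-in for SE.Enc (NOT AES-CTR): XOR with a hash-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(m):
        stream += hashlib.sha256(k + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(m, stream))


se_dec = se_enc  # XOR stream ciphers decrypt by re-applying the keystream


def emdk_enc(m: bytes):
    """EMDK.Enc(m): sample r0, derive r1 from the message, encrypt under r0 || r1."""
    r0 = secrets.token_bytes(F_KL)
    r1 = f_ev(r0, m)
    k = r0 + r1
    return k, se_enc(k, m)


def emdk_dec(k: bytes, c: bytes):
    """EMDK.Dec(k, c): recover m, then check r1 = F.Ev(r0, m); None models the symbol for incorrect decryption."""
    r0, r1 = k[:F_KL], k[F_KL:]
    m = se_dec(k, c)
    if not hmac.compare_digest(r1, f_ev(r0, m)):
        return None
    return m
```

Under these assumptions, `emdk_dec(*emdk_enc(m))` returns `m`, and the key has length \(\mathsf {F.kl}+ \mathsf {F.ol}= 128\) bits as required by \(\mathsf {\mathsf {SE}.kl}\).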

Fig. 11. Games defining partial key recovery security of symmetric encryption scheme \(\mathsf {SE}\) with respect to prefix length \(\ell \), and weak robustness of deterministic symmetric encryption scheme \(\mathsf {SE}\) with respect to randomized key-suffix length \(\ell \).

Consider game \(\mathrm {G}^{\mathsf {pkr}}\) of Fig. 11, associated to a symmetric encryption scheme \(\mathsf {SE}\), a prefix length \(\ell \in {{\mathbb N}}\) and an adversary \(\mathcal{P}\). The advantage of \(\mathcal{P}\) in breaking the \(\mathrm {PKR}\)-security of \(\mathsf {SE}\) with respect to \(\ell \) is defined as \(\mathsf {Adv}^{\mathsf {pkr}}_{\mathsf {SE}, \ell }(\mathcal{P}) = \Pr [\mathrm {G}^{\mathsf {pkr}}_{\mathsf {SE}, \ell , \mathcal{P}}]\). The adversary \(\mathcal{P}\) has access to oracle \(\textsc {Enc}\) that takes a message m and encrypts it under a uniformly random key k (independently sampled for each oracle call). The goal of the adversary is to recover the first \(\ell \) bits of any secret key that was used in prior \(\textsc {Enc}\) queries.
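As a rough illustration of the partial key recovery game just described, the following Python sketch models the \(\textsc {Enc}\) oracle and the winning condition. The class name, the `guess` method (the way the adversary submits its candidate prefix), the byte-granular prefix length, and the hash-based stand-in cipher are all assumptions made for this example; Fig. 11 remains the authoritative definition.

```python
import hashlib
import secrets

KEY_LEN = 16  # SE.kl in bytes (128-bit keys assumed)


def se_enc(k: bytes, m: bytes) -> bytes:
    """Insecure stand-in for SE.Enc: XOR with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(m):
        stream += hashlib.sha256(k + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(m, stream))


class PKRGame:
    """Sketch of the PKR game with prefix length given in bytes (assumption)."""

    def __init__(self, prefix_len: int):
        self.prefix_len = prefix_len
        self.prefixes = set()  # key prefixes used by earlier Enc queries
        self.win = False

    def enc(self, m: bytes) -> bytes:
        # A fresh uniformly random key is sampled for every oracle call.
        k = secrets.token_bytes(KEY_LEN)
        self.prefixes.add(k[: self.prefix_len])
        return se_enc(k, m)

    def guess(self, p: bytes) -> None:
        # The adversary wins if p is the prefix of any previously used key.
        if p in self.prefixes:
            self.win = True
```

With \(\ell = \mathsf {F.kl}= 88\) bits one would instantiate `PKRGame(11)`; the advantage is then the probability that `win` is set after the adversary's interaction.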

Consider game \(\mathrm {G}^{\mathsf {wrob}}\) of Fig. 11, associated to a deterministic symmetric encryption scheme \(\mathsf {SE}\), a randomized key-suffix length \(\ell \in {{\mathbb N}}\), and an adversary \(\mathcal{G}\). The advantage of \(\mathcal{G}\) in breaking the \(\mathrm {WROB}\)-security of \(\mathsf {SE}\) with respect to \(\ell \) is defined as \(\mathsf {Adv}^{\mathsf {wrob}}_{\mathsf {SE}, \ell }(\mathcal{G}) = \Pr [\mathrm {G}^{\mathsf {wrob}}_{\mathsf {SE}, \ell , \mathcal{G}}]\). The adversary has access to oracle \(\textsc {Enc}\). The oracle takes a prefix of an encryption key \(r_0\in \{0,1\}^{\mathsf {\mathsf {SE}.kl}- \ell }\) and message m as input. It then randomly samples the suffix of the key \(r_1\in \{0,1\}^{\ell }\) and returns it to the adversary. The adversary wins if it succeeds in querying \(\textsc {Enc}\) on some inputs \((r_0, m)\) and \((r_0', m')\) such that \(m \ne m'\) yet the oracle mapped both queries to the same ciphertext c. In other words, the goal of the adversary is to find \(k_0, m_0, k_1, m_1\) such that \(\mathsf {\mathsf {SE}.Enc}(k_0, m_0) = \mathsf {\mathsf {SE}.Enc}(k_1, m_1)\) and \(m_0 \ne m_1\) (which also implies \(k_0 \ne k_1\)), and the adversary has only a partial control over the choice of \(k_0\) and \(k_1\). Note that this assumption can be validated in the ideal cipher model.

We now provide the reductions for \(\mathrm {AE}\)-security and \(\mathrm {ROB}\)-security of \(\mathsf {IMSG\text {-}EMDK}\). The former is split into Theorems 1 and 2, whereas the latter is provided in Theorem 3. Note that in [13] we provide the standard definitions for the random oracle model, the \(\mathrm {UNIQUE}\)-security and the \(\mathrm {OTIND}\)-security of symmetric encryption, and the \(\mathrm {TCR}\)-security of function families. The proofs of Theorems 1, 2 and 3 are in the full version [13].

Theorem 1

Let \(\mathsf {SE}\) be a symmetric encryption scheme. Let \(\mathsf {F}\) be a function family with \(\mathsf {F.In}= \{0,1\}^*\), such that \(\mathsf {F.kl}+ \mathsf {F.ol}= \mathsf {\mathsf {SE}.kl}\). Let \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\). Let \(\mathcal{D}_\mathrm {AE}\) be an adversary against the \(\mathrm {AE}\)-security of \(\mathsf {EMDK}\). Then we build an adversary \(\mathcal{U}\) against the \(\mathrm {UNIQUE}\)-security of \(\mathsf {SE}\), an adversary \(\mathcal{H}\) against the \(\mathrm {TCR}\)-security of \(\mathsf {F}\), and an adversary \(\mathcal{D}_\mathrm {IND}\) against the \(\mathrm {IND}\)-security of \(\mathsf {EMDK}\) such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {ae}}_{\mathsf {EMDK}}(\mathcal{D}_\mathrm {AE}) \le 2 \cdot \mathsf {Adv}^{\mathsf {unique}}_{\mathsf {SE}}(\mathcal{U}) + 2 \cdot \mathsf {Adv}^{\mathsf {tcr}}_{\mathsf {F}}(\mathcal{H}) + \mathsf {Adv}^{\mathsf {ind}}_{\mathsf {EMDK}}(\mathcal{D}_\mathrm {IND}). \end{aligned}$$

Theorem 2

Let \(\mathsf {SE}\) be a symmetric encryption scheme. Let \(\mathsf {F}\) be a function family with \(\mathsf {F.In}= \{0,1\}^*\) and \(\mathsf {F.kl}+ \mathsf {F.ol}= \mathsf {\mathsf {SE}.kl}\), defined by \(\mathsf {F.Ev}^{\textsc {RO}}(r, m) = \textsc {RO}(\langle r, m \rangle , \mathsf {F.ol})\) in the random oracle model. Let \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\). Let \(\mathcal{D}_\mathsf {EMDK}\) be an adversary against the \(\mathrm {IND}\)-security of \(\mathsf {EMDK}\) that makes \(q_\textsc {LR}\) queries to its \(\textsc {LR}\) oracle and \(q_\textsc {RO}\) queries to random oracle \(\textsc {RO}\). Then we build an adversary \(\mathcal{P}\) against the \(\mathrm {PKR}\)-security of \(\mathsf {SE}\) with respect to \(\mathsf {F.kl}\), and an adversary \(\mathcal{D}_\mathsf {SE}\) against the \(\mathrm {OTIND}\)-security of \(\mathsf {SE}\), such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {ind}}_{\mathsf {EMDK}}(\mathcal{D}_\mathsf {EMDK}) \le 2 \cdot \gamma + 2 \cdot \mathsf {Adv}^{\mathsf {pkr}}_{\mathsf {SE}, \mathsf {F.kl}}(\mathcal{P}) + \mathsf {Adv}^{\mathsf {otind}}_{\mathsf {SE}}(\mathcal{D}_\mathsf {SE}), \end{aligned}$$

where

$$\begin{aligned} \gamma = \frac{(2 \cdot q_{\textsc {RO}} + q_\textsc {LR}- 1) \cdot q_\textsc {LR}}{2^{\mathsf {F.kl}+ 1}}. \end{aligned}$$
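To get a feeling for the size of \(\gamma \), one can plug in the iMessage parameter \(\mathsf {F.kl}= 88\) together with hypothetical query counts; the values \(q_\textsc {LR}= 2^{20}\) and \(q_\textsc {RO}= 2^{60}\) below are chosen purely for illustration and are not taken from the analysis:

$$\begin{aligned} \gamma = \frac{(2 \cdot 2^{60} + 2^{20} - 1) \cdot 2^{20}}{2^{89}} \approx \frac{2^{61} \cdot 2^{20}}{2^{89}} = 2^{-8}. \end{aligned}$$

In the special case \(q_\textsc {LR}= 1\) the expression simplifies to \(\gamma = q_\textsc {RO}/ 2^{\mathsf {F.kl}}\), so for a single challenge query the term is governed by the number of random oracle queries relative to \(2^{\mathsf {F.kl}}\).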

Theorem 3

Let \(\mathsf {SE}\) be a deterministic symmetric encryption scheme. Let \(\mathsf {F}\) be a function family with \(\mathsf {F.In}= \{0,1\}^*\) and \(\mathsf {F.kl}+ \mathsf {F.ol}= \mathsf {\mathsf {SE}.kl}\), defined by \(\mathsf {F.Ev}^{\textsc {RO}}(r, m) = \textsc {RO}(\langle r, m \rangle , \mathsf {F.ol})\) in the random oracle model. Let \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\). Let \(\mathcal{G}_\mathsf {EMDK}\) be an adversary against the \(\mathrm {ROB}\)-security of \(\mathsf {EMDK}\). Then we build an adversary \(\mathcal{U}\) against the \(\mathrm {UNIQUE}\)-security of \(\mathsf {SE}\), and an adversary \(\mathcal{G}_\mathsf {SE}\) against the \(\mathrm {WROB}\)-security of \(\mathsf {SE}\) with respect to \(\mathsf {F.ol}\) such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {rob}}_{\mathsf {EMDK}}(\mathcal{G}_\mathsf {EMDK}) \le \mathsf {Adv}^{\mathsf {unique}}_{\mathsf {SE}}(\mathcal{U}) + \mathsf {Adv}^{\mathsf {wrob}}_{\mathsf {SE}, \mathsf {F.ol}}(\mathcal{G}_\mathsf {SE}). \end{aligned}$$

5 Design and Security of iMessage

In this section we define a signcryption scheme that models the current design of the iMessage protocol for end-to-end encrypted messaging, and we analyze its security. All publicly available information about the iMessage protocol is provided by Apple in the iOS Security Guide [4], which is regularly updated but is very limited and vague. So in addition to the iOS Security Guide, we also reference work that attempted to reverse-engineer [32, 34] and attack [26] the prior versions of iMessage. A message-recovery attack against iMessage was previously found and implemented by Garman et al. [26] in 2016, and subsequently fixed by Apple starting from version 9.3 of iOS, and version 10.11.4 of Mac OS X. The implemented changes to the protocol prevented the attack, but also made the protocol design less intuitive. It appears that one of the goals of the updated protocol design was to preserve backward-compatibility, and that could be the reason why the current design is a lot more sophisticated than otherwise necessary. Apple has not formalized any claims about the security achieved by the initial or the current iMessage protocol, or the assumptions that are required from the cryptographic primitives that serve as the building blocks. We fill in the gap by providing precise claims about the security of the iMessage design when modeled by our signcryption scheme. In this section we focus only on the current protocol design of iMessage. In [13] we provide the design of the initial iMessage protocol, we explain the attack proposed by Garman et al. [26], and we introduce the goal of backward-compatibility for signcryption schemes.

Fig. 12. Modular design of iMessage-based signcryption scheme. The boxed nodes in the diagram denote transforms that build a new cryptographic scheme from two underlying primitives.

5.1 iMessage-Based Signcryption Scheme \(\mathsf {IMSG\text {-}SC}\)

The design of iMessage combines multiple cryptographic primitives to build an end-to-end encrypted messaging protocol. It uses HMAC-SHA256, AES-CTR, RSA-OAEP and ECDSA as the underlying primitives. Apple’s iOS Security Guide [4] and prior work on reverse-engineering and analysis of iMessage [26, 32, 34] do not explicitly indicate what type of cryptographic scheme is built as the result of combining these primitives. We identify it as a signcryption scheme. We define the iMessage-based signcryption scheme \(\mathsf {IMSG\text {-}SC}\) in a modular way that facilitates its security analysis. Figure 12 shows the order in which the underlying primitives are combined to build \(\mathsf {IMSG\text {-}SC}\), while also providing intermediate constructions along the way. We now explain this step by step.

Our construction starts by choosing a function family \(\mathsf {F}\) and a symmetric encryption scheme \(\mathsf {SE}\) (instantiated with HMAC-SHA256 and AES-CTR in iMessage). It combines them to build an encryption scheme under message derived keys \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\). The resulting \(\mathsf {EMDK}\) scheme is combined with public-key encryption scheme \(\mathsf {PKE}\) (instantiated with RSA-OAEP in iMessage) to build a multi-recipient public-key encryption scheme \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\) (syntax and correctness of MRPKE schemes are defined in [13]). Finally, \(\mathsf {MRPKE}\) and digital signature scheme \(\mathsf {DS}\) (instantiated with ECDSA in iMessage) are combined to build the iMessage-based signcryption scheme \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {MRPKE}, \mathsf {DS}]\). The definition of \(\mathsf {IMSG\text {-}EMDK}\) was provided in Sect. 4.2. We now define \(\mathsf {IMSG\text {-}SC}\) and \(\mathsf {IMSG\text {-}MRPKE}\).

Fig. 13. Signcryption scheme \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {MRPKE}, \mathsf {DS}]\).

Let \(\mathsf {MRPKE}\) be a multi-recipient public-key encryption scheme. Let \(\mathsf {DS}\) be a digital signature scheme. Then \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {MRPKE}, \mathsf {DS}]\) is the signcryption scheme as defined in Fig. 13, with \(\mathsf {\mathsf {SC}.ID}= \{0,1\}^*\). In order to produce a signcryption of message m with associated data , algorithm \(\mathsf {\mathsf {SC}.SigEnc}\) performs the following steps. It builds a new message as the unique encoding of , where \(\mathcal {I}\) is the set of recipients. It then calls \(\mathsf {MRPKE.Enc}\) to encrypt the same message for every recipient. Algorithm \(\mathsf {MRPKE.Enc}\) returns a set containing pairs , each indicating that an \(\mathsf {MRPKE}\) ciphertext was produced for recipient . For each recipient, the corresponding ciphertext is then encoded with the associated data into and signed using the signing key of sender identity , producing a signature \(\sigma \). The pair is then added to the output set of algorithm \(\mathsf {\mathsf {SC}.SigEnc}\). When running the unsigncryption of ciphertext c sent from to , algorithm \(\mathsf {\mathsf {SC}.VerDec}\) ensures that the recovered \(\mathsf {MRPKE}\) plaintext is consistent with and .

Fig. 14. Multi-recipient public-key encryption scheme \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\).

Let \(\mathsf {EMDK}\) be an encryption scheme under message derived keys. Let \(\mathsf {PKE}\) be a public-key encryption scheme with \(\mathsf {PKE.In}= \{0,1\}^{\mathsf {\mathsf {EMDK}.kl}}\). Then \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\) is the multi-recipient public-key encryption scheme as defined in Fig. 14. Algorithm \(\mathsf {MRPKE.Enc}\) first runs \(\mathsf {\mathsf {EMDK}.Enc}(m)\) to produce an \(\mathsf {EMDK}\) ciphertext that encrypts m under key k. The obtained key k is then independently encrypted for each recipient identity using its \(\mathsf {PKE}\) encryption key, and the corresponding tuple is added to the output set of algorithm \(\mathsf {MRPKE.Enc}\).
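The composition described above can be sketched in Python as a function that is generic in the two underlying schemes. This is an illustration only: the function and parameter names are hypothetical, the shape of the output tuples is an assumption (Fig. 14 is authoritative), and the two `toy_*` helpers are insecure stand-ins that exist solely to make the example runnable.

```python
import secrets
from typing import Callable, Dict, Set, Tuple

EmdkEnc = Callable[[bytes], Tuple[bytes, bytes]]  # m -> (k, c_se)
PkeEnc = Callable[[bytes, bytes], bytes]          # (ek, k) -> c_pke


def mrpke_enc(
    recipients: Dict[str, bytes],
    m: bytes,
    emdk_enc: EmdkEnc,
    pke_enc: PkeEnc,
) -> Set[Tuple[str, Tuple[bytes, bytes]]]:
    """Encrypt m once under a message-derived key, then wrap the key per recipient."""
    k, c_se = emdk_enc(m)  # one EMDK ciphertext shared by all recipients
    out = set()
    for ident, ek in recipients.items():
        c_pke = pke_enc(ek, k)  # key k encrypted independently for each recipient
        out.add((ident, (c_se, c_pke)))
    return out


def toy_emdk_enc(m: bytes) -> Tuple[bytes, bytes]:
    """INSECURE placeholder for EMDK.Enc: random key, repeating-key XOR."""
    k = secrets.token_bytes(16)
    return k, bytes(b ^ k[i % 16] for i, b in enumerate(m))


def toy_pke_enc(ek: bytes, k: bytes) -> bytes:
    """INSECURE placeholder for PKE.Enc: XOR of the key with the 'public key'."""
    return bytes(a ^ b for a, b in zip(ek, k))
```

For example, `mrpke_enc({"alice": ek_a, "bob": ek_b}, m, toy_emdk_enc, toy_pke_enc)` yields one entry per recipient, all sharing the same symmetric ciphertext component.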

Fig. 15. Algorithms \(\mathsf {\mathsf {SC}.SigEnc}\) (left panel) and \(\mathsf {\mathsf {SC}.VerDec}\) (right panel) for \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {MRPKE}, \mathsf {DS}]\), where \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\) and \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\). For simplicity, we consider a single recipient, and we do not show how to parse inputs and combine outputs for the displayed algorithms. The dotted lines inside \(\mathsf {\mathsf {SC}.VerDec}\) denote equality check, and the dotted arrow denotes membership check.

Fig. 16. The resources used by adversaries \(\mathcal{D}_{\mathsf {exhaustive}, n}\), \(\mathcal{D}_\mathsf {birthday}\) and \(\mathcal{D}_\mathsf {ADR02}\), and the advantage achieved by each of them. Columns labeled \(q_\mathsf {O}\) denote the number of queries an adversary makes to oracle \(\mathsf {O}\). All adversaries make 2 queries to oracle \(\textsc {NewH}\), and 0 queries to oracle \(\textsc {Exp}\). See Lemma 4 for necessary assumptions.

Let \(\mathsf {SC}\) be the iMessage-based signcryption scheme that is produced by combining all of the underlying primitives described above. Then the data flow within the fully expanded algorithms \(\mathsf {\mathsf {SC}.SigEnc}\) and \(\mathsf {\mathsf {SC}.VerDec}\) is schematically displayed in Fig. 15. For simplicity, the diagrams show the case when a message m is sent to a single recipient.

5.2 Parameter-Choice Induced Attacks on Privacy of iMessage

The iMessage-based signcryption scheme \(\mathsf {SC}\) uses the EMDK scheme \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\) as one of its underlying primitives. Recall that in order to encrypt a payload \(m'\), the \(\mathsf {EMDK}\) scheme samples a function key \(r_0\), computes a hash of \(m'\) as \(r_1 \leftarrow \mathsf {F.Ev}(r_0, m')\), sets the encryption key \(k \leftarrow r_0 {\,\Vert \,}r_1\), and produces a ciphertext by running \(\mathsf {\mathsf {SE}.Enc}(k, m')\). The implementation of iMessage uses parameters \(\mathsf {F.kl}= 88\) and \(\mathsf {F.ol}= 40\). In this section we provide three adversaries against the privacy of \(\mathsf {SC}\) whose success depends on the choice of \(\mathsf {F.kl}\) and \(\mathsf {F.ol}\). In the next sections we will provide security proofs for \(\mathsf {SC}\). We will show that each adversary in this section arises from an attack against a different step in our security proofs. We will be able to conclude that these are roughly the best attacks that arise from the choice of \(\mathsf {EMDK}\) parameters. We will also explain why it is hard to construct any adversaries against the authenticity of \(\mathsf {SC}\). Now consider the adversaries of Fig. 17. The full version of this paper [13] provides a detailed explanation for each adversary.

Fig. 17. Adversaries \(\mathcal{D}_{\mathsf {exhaustive}, n}\), \(\mathcal{D}_\mathsf {birthday}\) and \(\mathcal{D}_\mathsf {ADR02}\) against the \(\mathrm {PRIV}\)-security of \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {MRPKE}, \mathsf {DS}]\), where \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\) and \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\). Adversary \(\mathcal{D}_\mathsf {ADR02}\) requires that \(\mathsf {SE}\) is AES-CTR with a fixed IV.

We provide the number of queries, the runtime complexity and the advantage of each adversary in Fig. 16. The assumptions necessary to prove the advantage are stated in Lemma 4 below. Note that \(\mathcal{D}_\mathsf {birthday}\) represents a purely theoretical attack, but both \(\mathcal{D}_\mathsf {exhaustive}\) and \(\mathcal{D}_\mathsf {ADR02}\) can lead to practical message-recovery attacks (the latter used by Garman et al. [26]).

Let \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\). Adversary \(\mathcal{D}_\mathsf {ADR02}\) shows that \(\mathsf {EMDK}\) can have at most \(\mathsf {F.ol}\) bits of security with respect to \(\mathrm {PRIV}\), and adversary \(\mathcal{D}_\mathsf {birthday}\) shows that \(\mathsf {EMDK}\) can have at most \(\approx \mathsf {F.kl}/ 2 + \log _2 \mathsf {F.kl}\) bits of security with respect to \(\mathrm {PRIV}\). It follows that setting \(\mathsf {F.ol}\approx \mathsf {F.kl}/ 2\) is a good initial guideline, and roughly corresponds to the parameter choices made in iMessage. We will provide a more detailed analysis in Sect. 5.5. The proof of Lemma 4 is in the full version [13].
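As a quick sanity check of this guideline, one can evaluate both upper bounds for the iMessage parameters \(\mathsf {F.kl}= 88\) and \(\mathsf {F.ol}= 40\) quoted earlier (the numerical rounding is ours):

$$\begin{aligned} \mathsf {F.ol}= 40, \qquad \mathsf {F.kl}/ 2 + \log _2 \mathsf {F.kl}= 44 + \log _2 88 \approx 44 + 6.46 = 50.46. \end{aligned}$$

Under this reading, the bound coming from \(\mathcal{D}_\mathsf {ADR02}\) is the smaller of the two, and \(\mathsf {F.ol}= 40\) is indeed close to \(\mathsf {F.kl}/ 2 = 44\).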

Lemma 4

Let \(\mathsf {SE}\) be a symmetric encryption scheme. Let \(\mathsf {F}\) be a function family with \(\mathsf {F.In}= \{0,1\}^*\) such that \(\mathsf {F.kl}+ \mathsf {F.ol}= \mathsf {\mathsf {SE}.kl}\). Let \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}[\mathsf {F}, \mathsf {SE}]\). Let \(\mathsf {PKE}\) be a public-key encryption scheme with \(\mathsf {PKE.In}= \{0,1\}^{\mathsf {\mathsf {SE}.kl}}\). Let \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\). Let \(\mathsf {DS}\) be a digital signature scheme. Let \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {MRPKE}, \mathsf {DS}]\). Let \(\mathsf {R}\subseteq \{0,1\}^* \times \{0,1\}^*\) be any relaxing relation. Then for any \(n > \mathsf {\mathsf {SE}.kl}\),

$$\begin{aligned} \mathsf {Adv}^{\mathsf {priv}}_{\mathsf {SC}, \mathsf {R}}(\mathcal{D}_{\mathsf {exhaustive}, n}) \ge 1 - 2^{\mathsf {\mathsf {SE}.kl}- n}. \end{aligned}$$

Furthermore, for any \(1 \le \mathsf {F.kl}\le 124\), if \(\mathsf {SE}\) is AES-CTR with a fixed IV, and if AES is modeled as the ideal cipher, then

$$\begin{aligned} \mathsf {Adv}^{\mathsf {priv}}_{\mathsf {SC}, \mathsf {R}}(\mathcal{D}_\mathsf {birthday}) > 1/8 - 2^{\mathsf {F.kl}- 128}. \end{aligned}$$

Let \({\mathsf {R}_{\mathsf {m}}}\) be the relaxing relation defined in Fig. 4. If \(\mathsf {SE}\) is AES-CTR with a fixed IV, and if \(\mathsf {F}\) is defined as \(\mathsf {F.Ev}^{\textsc {RO}}(r, m) = \textsc {RO}(\langle r, m \rangle , \mathsf {F.ol})\) in the random oracle model, then

$$\begin{aligned} \mathsf {Adv}^{\mathsf {priv}}_{\mathsf {SC}, {\mathsf {R}_{\mathsf {m}}}}(\mathcal{D}_{\mathsf {ADR02}}) = 2^{-\mathsf {F.ol}}. \end{aligned}$$
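The three attack bounds of Lemma 4 are easy to evaluate exactly. The following is a minimal sketch using rational arithmetic; the function names are illustrative and do not come from the paper. The sample parameters used in the usage note (\(\mathsf {\mathsf {SE}.kl}= 128\), \(\mathsf {F.kl}= 88\), \(\mathsf {F.ol}= 40\)) are an assumption, obtained by combining the tag length 40 discussed in Sect. 5.5 with the constraint \(\mathsf {F.kl}+ \mathsf {F.ol}= \mathsf {\mathsf {SE}.kl}\) for a 128-bit key.

```python
from fractions import Fraction


def adv_exhaustive(se_kl: int, n: int) -> Fraction:
    """Lower bound 1 - 2^(SE.kl - n) on the advantage of D_exhaustive,n."""
    if n <= se_kl:
        raise ValueError("Lemma 4 requires n > SE.kl")
    return 1 - Fraction(1, 2 ** (n - se_kl))


def adv_birthday(f_kl: int) -> Fraction:
    """Value 1/8 - 2^(F.kl - 128); Lemma 4 states the advantage is strictly larger."""
    if not 1 <= f_kl <= 124:
        raise ValueError("Lemma 4 states this bound for 1 <= F.kl <= 124")
    return Fraction(1, 8) - Fraction(1, 2 ** (128 - f_kl))


def adv_adr02(f_ol: int) -> Fraction:
    """Exact advantage 2^(-F.ol) of D_ADR02 (random oracle model, relation R_m)."""
    return Fraction(1, 2 ** f_ol)
```

With the assumed parameters, `adv_adr02(40)` is a constant-time attack with advantage \(2^{-40}\), while `adv_birthday(88)` stays close to 1/8 at a much higher running time; this illustrates why the tag length \(\mathsf {F.ol}\) and the key-prefix length \(\mathsf {F.kl}\) have to be balanced against each other.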

5.3 Authenticity of iMessage

In this section we reduce the authenticity of the iMessage-based signcryption scheme \(\mathsf {SC}\) to the security of its underlying primitives. First we reduce the authenticity of \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {MRPKE}, \mathsf {DS}]\) to the unforgeability of \(\mathsf {DS}\) and to the robustness of \(\mathsf {MRPKE}\). Then we reduce the robustness of \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\) to the robustness of either \(\mathsf {PKE}\) or \(\mathsf {EMDK}\); it is sufficient that only one of the two is robust.

Recall that an \(\mathsf {SC}\) ciphertext is a pair that consists of an \(\mathsf {MRPKE}\) ciphertext (encrypting some ) and a \(\mathsf {DS}\) signature \(\sigma \) of . Intuitively, the authenticity of \(\mathsf {SC}\) requires some type of unforgeability from \(\mathsf {DS}\) in order to prevent the adversary from producing a valid signature on arbitrary and of its own choice. However, the unforgeability of \(\mathsf {DS}\) is not a sufficient condition, because the adversary is allowed to win the game \(\mathrm {G}^{\mathsf {auth}}\) by forging an \(\mathsf {SC}\) ciphertext for a corrupted recipient identity that uses maliciously chosen \(\mathsf {SC}\) keys. So an additional requirement is that the adversary should not be able to find an \(\mathsf {SC}\) key pair that successfully decrypts an honestly produced \(\mathsf {SC}\) ciphertext to an unintended message. To ensure this, we require that \(\mathsf {MRPKE}\) is robust (as defined in the full version of this paper [13]). Note that finding a new key pair that decrypts the ciphertext to the original message will not help the adversary to win the game because then the decryption will fail by not finding the corrupted recipient’s identity in recipient set \(\mathcal {I}\).

We define unforgeability \(\mathrm {UF}\) of a digital signature scheme with respect to a relaxing relation \(\mathsf {R}\), such that the standard unforgeability is captured with respect to \({\mathsf {R}_{\mathsf {m}}}\) and the strong unforgeability is captured with respect to \(\mathsf {R}_{\mathsf {id}}\). The formal definition is in the full version [13]. We show that if \(\mathsf {DS}\) is \(\mathrm {UF}\)-secure with respect to a relaxing relation \(\mathsf {R}^* \in \{{\mathsf {R}_{\mathsf {m}}}, \mathsf {R}_{\mathsf {id}}\}\) then \(\mathsf {SC}\) is \(\mathrm {AUTH}\)-secure with respect to the corresponding parameterized relaxing relation \(\mathsf {IMSG\text {-}AUTH\text {-}REL}[\mathsf {R}^*]\), which we define below. ECDSA signatures are not strongly unforgeable [25], so iMessage is \(\mathrm {AUTH}\)-secure with respect to \(\mathsf {IMSG\text {-}AUTH\text {-}REL}[{\mathsf {R}_{\mathsf {m}}}]\).

Fig. 18. Relaxing relation \(\mathsf {IMSG\text {-}AUTH\text {-}REL}[\mathsf {R}^*]\).

Let \({\mathsf {R}_{\mathsf {m}}}\) and \(\mathsf {R}_{\mathsf {id}}\) be the relaxing relations defined in Sect. 3. Let \(\mathsf {R}^*\in \{{\mathsf {R}_{\mathsf {m}}}, \mathsf {R}_{\mathsf {id}}\}\). Then \(\mathsf {IMSG\text {-}AUTH\text {-}REL}[\mathsf {R}^*]\) is the relaxing relation as defined in Fig. 18. Note that

$$\begin{aligned} \mathsf {R}_{\mathsf {id}}= \mathsf {IMSG\text {-}AUTH\text {-}REL}[\mathsf {R}_{\mathsf {id}}] \subset \mathsf {IMSG\text {-}AUTH\text {-}REL}[{\mathsf {R}_{\mathsf {m}}}] \subset {\mathsf {R}_{\mathsf {m}}}, \end{aligned}$$

where \(\mathrm {AUTH}\)-security with respect to \(\mathsf {R}_{\mathsf {id}}\) is the strongest of these definitions, because it imposes the fewest restrictions on which queries are permitted to oracle \(\textsc {VerDec}\). Relaxing relation \(\mathsf {IMSG\text {-}AUTH\text {-}REL}[{\mathsf {R}_{\mathsf {m}}}]\) does not allow the adversary to win the authenticity game by only mauling the signature \(\sigma \) and changing nothing else.

Theorem 5

Let \(\mathsf {MRPKE}\) be a multi-recipient public-key encryption scheme. Let \(\mathsf {DS}\) be a digital signature scheme. Let \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {MRPKE}, \mathsf {DS}]\). Let \(\mathsf {R}^*\in \{{\mathsf {R}_{\mathsf {m}}}, \mathsf {R}_{\mathsf {id}}\}\). Let \(\mathcal{F}_\mathsf {SC}\) be an adversary against the \(\mathrm {AUTH}\)-security of \(\mathsf {SC}\) with respect to relaxing relation \(\mathsf {R}= \mathsf {IMSG\text {-}AUTH\text {-}REL}[\mathsf {R}^*]\). Then we build an adversary \(\mathcal{F}_\mathsf {DS}\) against the \(\mathrm {UF}\)-security of \(\mathsf {DS}\) with respect to \(\mathsf {R}^*\), and an adversary \(\mathcal{G}\) against the \(\mathrm {ROB}\)-security of \(\mathsf {MRPKE}\) such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {auth}}_{\mathsf {SC}, \mathsf {R}}(\mathcal{F}_\mathsf {SC}) \le \mathsf {Adv}^{\mathsf {uf}}_{\mathsf {DS}, \mathsf {R}^*}(\mathcal{F}_\mathsf {DS}) + \mathsf {Adv}^{\mathsf {rob}}_{\mathsf {MRPKE}}(\mathcal{G}). \end{aligned}$$

The proof of Theorem 5 is in the full version [13].

A ciphertext of \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\) is a pair consisting of an \(\mathsf {EMDK}\) ciphertext encrypting some message \(m^*\) and a \(\mathsf {PKE}\) ciphertext encrypting the corresponding \(\mathsf {EMDK}\) key k. The decryption algorithm of \(\mathsf {MRPKE}\) first uses the \(\mathsf {PKE}\) key pair to decrypt the \(\mathsf {PKE}\) ciphertext, and then uses the recovered \(\mathsf {EMDK}\) key k to decrypt the \(\mathsf {EMDK}\) ciphertext. We show that just one of \(\mathsf {PKE}\) and \(\mathsf {EMDK}\) being robust implies that \(\mathsf {MRPKE}\) is also robust. Our definition of robustness for public-key encryption requires that it is hard to find a key pair that decrypts an honestly produced ciphertext to a plaintext that is different from the originally encrypted message. If this condition holds for \(\mathsf {PKE}\), then clearly \(\mathsf {MRPKE}\) is robust regardless of whether \(\mathsf {EMDK}\) is robust. On the other hand, if \(\mathsf {PKE}\) is not robust, then the robustness of \(\mathsf {EMDK}\) (as defined in Sect. 4) would guarantee that the adversary is unlikely to decrypt to a message other than \(m^*\) even if it has full control over the choice of \(\mathsf {EMDK}\) key k. It is not known whether RSA-OAEP is robust, so our concrete security analysis of iMessage in Sect. 5.5 will rely entirely on the robustness of \(\mathsf {EMDK}= \mathsf {IMSG\text {-}EMDK}\). The formal definition of robustness for PKE and the proof of Theorem 6 are in the full version [13].

Theorem 6

Let \(\mathsf {EMDK}\) be an encryption scheme under message derived keys. Let \(\mathsf {PKE}\) be a public-key encryption scheme with \(\mathsf {PKE.In}= \{0,1\}^{\mathsf {\mathsf {EMDK}.kl}}\). Let \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\). Let \(\mathcal{G}_\mathsf {MRPKE}\) be an adversary against the \(\mathrm {ROB}\)-security of \(\mathsf {MRPKE}\). Then we build an adversary \(\mathcal{G}_\mathsf {EMDK}\) against the \(\mathrm {ROB}\)-security of \(\mathsf {EMDK}\) such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {rob}}_{\mathsf {MRPKE}}(\mathcal{G}_\mathsf {MRPKE}) \le \mathsf {Adv}^{\mathsf {rob}}_{\mathsf {EMDK}}(\mathcal{G}_\mathsf {EMDK}), \end{aligned}$$

and an adversary \(\mathcal{G}_\mathsf {PKE}\) against the \(\mathrm {ROB}\)-security of \(\mathsf {PKE}\) such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {rob}}_{\mathsf {MRPKE}}(\mathcal{G}_\mathsf {MRPKE}) \le \mathsf {Adv}^{\mathsf {rob}}_{\mathsf {PKE}}(\mathcal{G}_\mathsf {PKE}). \end{aligned}$$

5.4 Privacy of iMessage

In this section we reduce the \(\mathrm {PRIV}\)-security of \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {MRPKE}, \mathsf {DS}]\) to the \(\mathrm {INDCCA}\)-security of \(\mathsf {MRPKE}\), then reduce the \(\mathrm {INDCCA}\)-security of \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\) to the \(\mathrm {AE}\)-security of \(\mathsf {EMDK}\) and the \(\mathrm {INDCCA}\)-security of \(\mathsf {PKE}\). The reductions are straightforward.

An adversary attacking the \(\mathrm {PRIV}\)-security of \(\mathsf {SC}\) is allowed to query oracle \(\textsc {LR}\) and get a challenge ciphertext from an exposed sender as long as the recipient is honest. This means that the adversary can use the sender’s \(\mathsf {DS}\) signing key to arbitrarily change associated data and signature \(\sigma \) of any challenge ciphertext prior to querying it to oracle \(\textsc {VerDec}\). Our security reduction for \(\mathrm {PRIV}\)-security of \(\mathsf {SC}\) will be with respect to a relation that prohibits the adversary from trivially winning this way. Note that if \(\mathsf {IMSG\text {-}SC}\) was defined to instead put inside , then our security reduction would be able to show the \(\mathrm {PRIV}\)-security of \(\mathsf {SC}\) with respect to \(\mathsf {R}_{\mathsf {id}}\) assuming \(\mathsf {DS}\) had unique signatures. However, ECDSA does not have this property (for the same reason it is not strongly unforgeable, as explained in [25]).

Fig. 19. Relaxing relation \(\mathsf {IMSG\text {-}PRIV\text {-}REL}\).

Let \(\mathsf {IMSG\text {-}PRIV\text {-}REL}\) be the relaxing relation defined in Fig. 19. It first discards the associated data and the signature \(\sigma \), and then compares the resulting tuples against each other. This reflects the intuition that an adversary can trivially change the values of the associated data and \(\sigma \) in any challenge ciphertext when attacking the \(\mathrm {PRIV}\)-security of \(\mathsf {IMSG\text {-}SC}\).

Theorem 7

Let \(\mathsf {MRPKE}\) be a multi-recipient public-key encryption scheme. Let \(\mathsf {DS}\) be a digital signature scheme. Let \(\mathsf {SC}= \mathsf {IMSG\text {-}SC}[\mathsf {MRPKE}, \mathsf {DS}]\). Let \(\mathcal{D}_\mathsf {SC}\) be an adversary against the \(\mathrm {PRIV}\)-security of \(\mathsf {SC}\) with respect to the relaxing relation \(\mathsf {R}= \mathsf {IMSG\text {-}PRIV\text {-}REL}\). Then we build an adversary \(\mathcal{D}_\mathsf {MRPKE}\) against the \(\mathrm {INDCCA}\)-security of \(\mathsf {MRPKE}\) such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {priv}}_{\mathsf {SC}, \mathsf {R}}(\mathcal{D}_\mathsf {SC}) \le \mathsf {Adv}^{\mathsf {indcca}}_{\mathsf {MRPKE}}(\mathcal{D}_\mathsf {MRPKE}). \end{aligned}$$

Theorem 8

Let \(\mathsf {EMDK}\) be an encryption scheme under message derived keys. Let \(\mathsf {PKE}\) be a public-key encryption scheme with input set \(\mathsf {PKE.In}= \{0,1\}^{\mathsf {\mathsf {EMDK}.kl}}\). Let \(\mathsf {MRPKE}= \mathsf {IMSG\text {-}MRPKE}[\mathsf {EMDK}, \mathsf {PKE}]\). Let \(\mathcal{D}_\mathsf {MRPKE}\) be an adversary against the \(\mathrm {INDCCA}\)-security of \(\mathsf {MRPKE}\). Then we build an adversary \(\mathcal{D}_\mathsf {PKE}\) against the \(\mathrm {INDCCA}\)-security of \(\mathsf {PKE}\), and an adversary \(\mathcal{D}_\mathsf {EMDK}\) against the \(\mathrm {AE}\)-security of \(\mathsf {EMDK}\) such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {indcca}}_{\mathsf {MRPKE}}(\mathcal{D}_\mathsf {MRPKE}) \le 2 \cdot \mathsf {Adv}^{\mathsf {indcca}}_{\mathsf {PKE}}(\mathcal{D}_\mathsf {PKE}) + \mathsf {Adv}^{\mathsf {ae}}_{\mathsf {EMDK}}(\mathcal{D}_\mathsf {EMDK}). \end{aligned}$$

The proofs of Theorems 7 and 8 are in the full version [13].

5.5 Concrete Security of iMessage

In this section we summarize the results concerning the security of our iMessage-based signcryption scheme. For simplicity, we use the constructions and primitives from all across our work without formally redefining each of them.

Let \(\mathsf {SC}\) be the iMessage-based signcryption scheme, defined based on the appropriate underlying primitives. Let \(\mathsf {R}_\mathsf {auth}= \mathsf {IMSG\text {-}AUTH\text {-}REL}[\mathsf {R}^*]\) and \(\mathsf {R}_\mathsf {priv}= \mathsf {IMSG\text {-}PRIV\text {-}REL}\). Then for any adversary \(\mathcal{F}_\mathsf {SC}\) attacking the \(\mathrm {AUTH}\)-security of \(\mathsf {SC}\) we can build new adversaries such that:

$$\begin{aligned} \mathsf {Adv}^{\mathsf {auth}}_{\mathsf {SC}, \mathsf {R}_\mathsf {auth}}(\mathcal{F}_\mathsf {SC}) \le \mathsf {Adv}^{\mathsf {uf}}_{\mathsf {DS}, \mathsf {R}^*}(\mathcal{F}_\mathsf {DS}) + \min (\mathsf {Adv}^{\mathsf {rob}}_{\mathsf {PKE}}(\mathcal{G}_\mathsf {PKE}), \alpha ), \end{aligned}$$

where

$$\begin{aligned} \alpha = \mathsf {Adv}^{\mathsf {unique}}_{\mathsf {SE}}(\mathcal{U}_0) + \mathsf {Adv}^{\mathsf {wrob}}_{\mathsf {SE}, \mathsf {F.ol}}(\mathcal{G}_\mathsf {SE}). \end{aligned}$$

For any adversary \(\mathcal{D}_\mathsf {SC}\) attacking the \(\mathrm {PRIV}\)-security of \(\mathsf {SC}\), making \(q_\textsc {LR}\) queries to the \(\textsc {LR}\) oracle and \(q_\textsc {RO}\) queries to the \(\textsc {RO}\) oracle, we can build new adversaries such that:

$$\begin{aligned} \mathsf {Adv}^{\mathsf {priv}}_{\mathsf {SC}, \mathsf {R}_\mathsf {priv}}(\mathcal{D}_\mathsf {SC}) \le 2 \cdot (\beta + \gamma ) + \mathsf {Adv}^{\mathsf {otind}}_{\mathsf {SE}}(\mathcal{D}_\mathsf {SE}), \end{aligned}$$

where

$$\begin{aligned} \beta = \mathsf {Adv}^{\mathsf {indcca}}_{\mathsf {PKE}}(\mathcal{D}_\mathsf {PKE}) + \mathsf {Adv}^{\mathsf {unique}}_{\mathsf {SE}}(\mathcal{U}_1) + \mathsf {Adv}^{\mathsf {tcr}}_{\mathsf {F}}(\mathcal{H}) + \mathsf {Adv}^{\mathsf {pkr}}_{\mathsf {SE}, \mathsf {F.kl}}(\mathcal{P}), \end{aligned}$$
$$\begin{aligned} \gamma = \frac{(2 \cdot q_{\textsc {RO}} + q_\textsc {LR}- 1) \cdot q_\textsc {LR}}{2^{\mathsf {F.kl}+ 1}}. \end{aligned}$$
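The two summary bounds above are plain arithmetic over the component advantages, so they can be written down directly. The sketch below is illustrative only: the function and parameter names are hypothetical, and the caller is expected to supply each component advantage as a number (for instance a `Fraction`), since the paper does not fix concrete values at this point.

```python
from fractions import Fraction


def gamma(q_ro: int, q_lr: int, f_kl: int) -> Fraction:
    """gamma = (2*q_RO + q_LR - 1) * q_LR / 2^(F.kl + 1)."""
    return Fraction((2 * q_ro + q_lr - 1) * q_lr, 2 ** (f_kl + 1))


def auth_bound(uf_ds, rob_pke, unique_u0, wrob_se):
    """Adv^auth <= Adv^uf_DS + min(Adv^rob_PKE, alpha), alpha = unique + wrob."""
    alpha = unique_u0 + wrob_se
    return uf_ds + min(rob_pke, alpha)


def priv_bound(indcca_pke, unique_u1, tcr_f, pkr_se, gamma_term, otind_se):
    """Adv^priv <= 2 * (beta + gamma) + Adv^otind_SE, beta = sum of four terms."""
    beta = indcca_pke + unique_u1 + tcr_f + pkr_se
    return 2 * (beta + gamma_term) + otind_se
```

Note how `auth_bound` takes the minimum of the two robustness terms: this is the arithmetic counterpart of the statement that robustness of either \(\mathsf {PKE}\) or \(\mathsf {EMDK}\) suffices. If nothing is known about the robustness of \(\mathsf {PKE}\), passing the trivial value 1 for `rob_pke` leaves \(\alpha \) as the effective term.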

We now assess the concrete security of iMessage when the abstract schemes that constitute \(\mathsf {SC}\) are instantiated with real-world primitives. First, note that \(\mathsf {Adv}^{\mathsf {unique}}_{\mathsf {SE}}(\mathcal{U}) = 0\) for any \(\mathcal{U}\) when \(\mathsf {SE}\) is AES-CTR. We will approximate the bit-security of \(\mathsf {SC}\) based on the other terms above.

We assume that ECDSA with 256-bit keys (on the NIST P-256 curve) has 128 bits of \(\mathrm {UF}\)-security with respect to \({\mathsf {R}_{\mathsf {m}}}\) [5, 21]. We assume that RSA-OAEP with 1280-bit keys has 80 bits of \(\mathrm {INDCCA}\)-security [21, 30]. \(\mathsf {SE}\) is AES-CTR with key length \(\mathsf {\mathsf {SE}.kl}\); we assume that \(\mathsf {SE}\) has \(\mathsf {\mathsf {SE}.kl}\) bits of \(\mathrm {OTIND}\)-security.

For every other term used above, we approximate the corresponding bit-security based on the advantage \(\epsilon \) and the runtime T of the best adversary we can come up with. For simplicity, we model \(\mathsf {F}\) as the random oracle and we model \(\mathsf {SE}\) as the ideal cipher. This simplifies the task of finding the “best possible” adversary against each security notion and then calculating its advantage. In each case we consider either a constant-time adversary making a single guess in its security game (achieving some advantage \(\epsilon \) in time \(T\approx 1\)), or an adversary that runs a birthday attack (achieving advantage \(\epsilon \ge 0.3 \cdot \frac{q \cdot (q-1)}{N}\) in time \(T\approx q \cdot \log _2 q\) for \(q = \sqrt{2N}\)). We use the following adversaries:

(i) Assume \(\mathsf {SE}\) is AES-CTR where AES is modeled as the ideal cipher with block length 128. In game \(\mathrm {G}^{\mathsf {wrob}}_{\mathsf {SE}, \mathsf {F.ol}, \mathcal{G}}\) consider an adversary \(\mathcal{G}\) that repeatedly queries its oracle \(\textsc {Enc}\) on inputs \((r_0, m)\) where all \(r_0\in \{0,1\}^{\mathsf {F.kl}}\) are distinct and all \(m\in \{0,1\}^{128}\) are distinct. The adversary wins if a collision occurs across the 128-bit outputs of \(\mathsf {\mathsf {SE}.Enc}\). Then \(\epsilon = \mathsf {Adv}^{\mathsf {wrob}}_{\mathsf {SE}, \mathsf {F.ol}}(\mathcal{G}_\mathsf {SE}) \ge 0.3 \cdot \frac{q_\textsc {Enc}\cdot (q_\textsc {Enc}- 1)}{2^{128}}\) and \(T = q_\textsc {Enc}\cdot \log _2 q_\textsc {Enc}\) for \(q_\textsc {Enc}= \sqrt{2^{128+1}}\).

(ii) In game \(\mathrm {G}^{\mathsf {tcr}}_{\mathsf {F}, \mathcal{H}}\) consider an adversary \(\mathcal{H}\) that queries its oracle \(\textsc {NewKey}(x_0)\) for any \(x_0\in \{0,1\}^*\) and then makes a guess \((1, x_1)\) for any \(x_0 \ne x_1\). Then \(\epsilon = \mathsf {Adv}^{\mathsf {tcr}}_{\mathsf {F}}(\mathcal{H}) = 2^{-\mathsf {F.ol}}\) and \(T \approx 1\) in the random oracle model.

(iii) In game \(\mathrm {G}^{\mathsf {pkr}}_{\mathsf {SE}, \mathsf {F.kl}, \mathcal{P}}\) consider an adversary \(\mathcal{P}\) that makes a single call to \(\textsc {Enc}\) and then randomly guesses any key prefix \(p\in \{0,1\}^{\mathsf {F.kl}}\). Then \(\epsilon = \mathsf {Adv}^{\mathsf {pkr}}_{\mathsf {SE}, \mathsf {F.kl}}(\mathcal{P}) = 2^{-\mathsf {F.kl}}\) and \(T \approx 1\) in the ideal cipher model.

(iv) The term \(\gamma \) upper bounds the probability of an adversary finding a collision when running the birthday attack (in the random oracle model). The corresponding lower bound (for \(q_\textsc {RO}= 0\)) is \(\epsilon \ge 0.3 \cdot \frac{q_\textsc {LR}\cdot (q_\textsc {LR}-1)}{2^{\mathsf {F.kl}}}\) with \(T = q_\textsc {LR}\cdot \log _2 q_\textsc {LR}\) and \(q_\textsc {LR}= \sqrt{2^{\mathsf {F.kl}+ 1}}\).
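The two adversary templates used in the list above (a single guess and a birthday attack) can be sketched as follows. The helper names are hypothetical. The conversion from \((\epsilon , T)\) to a number of bits is an assumption of this sketch: it uses \(\log _2 (T / \epsilon )\) as a common stand-in, whereas the actual measure is the one defined in Sect. 2, which may differ.

```python
import math


def single_guess(bits: int) -> tuple[float, float]:
    """Constant-time adversary guessing a `bits`-bit value: eps = 2^-bits, T ~ 1."""
    return 2.0 ** -bits, 1.0


def birthday(log2_n: int) -> tuple[float, float]:
    """Birthday adversary over N = 2^log2_n values with q = sqrt(2N) queries.

    Returns the lower bound eps = 0.3 * q * (q - 1) / N and T = q * log2(q).
    """
    q = math.sqrt(2.0 ** (log2_n + 1))
    eps = 0.3 * q * (q - 1) / 2.0 ** log2_n
    return eps, q * math.log2(q)


def bits_of_security(eps: float, t: float) -> float:
    """ASSUMPTION: bit-security approximated as log2(T / eps), not the paper's Sect. 2 definition."""
    return math.log2(t / eps)


# Adversaries (i)-(iv) for a given choice of F.kl and F.ol (hypothetical wiring).
def adversaries(f_kl: int, f_ol: int) -> dict[str, tuple[float, float]]:
    return {
        "wrob": birthday(128),      # (i) collisions on 128-bit outputs
        "tcr": single_guess(f_ol),  # (ii)
        "pkr": single_guess(f_kl),  # (iii)
        "gamma": birthday(f_kl),    # (iv) with q_RO = 0
    }
```

Under the assumed metric, a single-guess adversary against a \(b\)-bit value contributes exactly \(b\) bits, while a birthday adversary over \(2^{b}\) values contributes roughly \(b/2 + \log _2 (b/2)\) bits plus a small constant, which is why the shortest single-guess term tends to dominate.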

Fig. 20. Lower bounds for bit-security of \(\mathsf {SC}\) across different parameter choices.

We wrote a script that combines all of the above to find the lower bound for the bit-security of \(\mathsf {SC}\) (with respect to \(\mathrm {PRIV}\) and \(\mathrm {AUTH}\) security notions) for different choices of \(\mathsf {\mathsf {SE}.kl}\), \(\mathsf {F.kl}\) and \(\mathsf {F.ol}\). This assumes that the above adversaries are optimal, and computes the lower bound according to Sect. 2. Figure 2 (in Sect. 1) shows the bit-security lower bounds with respect to privacy, depending on the choice of symmetric key length \(\mathsf {\mathsf {SE}.kl}\) and authentication tag length \(\mathsf {F.ol}\). Figure 20 shows the choices of \(\mathsf {F.kl}\) and \(\mathsf {F.ol}\) that yield the best lower bounds for the bit-security of \(\mathrm {PRIV}\) for each \(\mathsf {\mathsf {SE}.kl}\in \{128, 192, 256\}\). According to our results, the security of the iMessage-based signcryption scheme would slightly improve if the value of \(\mathsf {F.ol}\) were chosen to be 48 instead of 40. The bit-security of \(\mathsf {SC}\) with respect to \(\mathrm {AUTH}\) is constant because it does not depend on the values of \(\mathsf {\mathsf {SE}.kl}\), \(\mathsf {F.kl}\), \(\mathsf {F.ol}\). The assumption that RSA-OAEP with 1280-bit keys has 80 bits of \(\mathrm {INDCCA}\)-security limits the bit-security that can be achieved when \(\mathsf {\mathsf {SE}.kl}= 256\); otherwise, the \(\mathrm {PRIV}\) bit-security for \(\mathsf {\mathsf {SE}.kl}= 256\) would allow a lower bound of 86 bits. But note that using \(\mathsf {\mathsf {SE}.kl}\in \{192, 256\}\) is likely not possible while maintaining the backward-compatibility of iMessage.
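The kind of parameter search described here can be illustrated with a much cruder proxy. The sketch below is not the script used for Fig. 2 and Fig. 20: it only balances the two attack ceilings stated in Sect. 5.2, namely \(\mathsf {F.ol}\) bits and \(\approx \mathsf {F.kl}/ 2 + \log _2 \mathsf {F.kl}\) bits, under the constraint \(\mathsf {F.kl}+ \mathsf {F.ol}= \mathsf {\mathsf {SE}.kl}\). Restricting the search to byte-aligned tag lengths is an additional assumption of this sketch, and all names are hypothetical.

```python
import math


def priv_ceiling(se_kl: int, f_ol: int) -> float:
    """Smaller of the two PRIV attack ceilings from Sect. 5.2 (a crude proxy)."""
    f_kl = se_kl - f_ol  # Lemma 4 setting: F.kl + F.ol = SE.kl
    return min(f_ol, f_kl / 2 + math.log2(f_kl))


def best_tag_length(se_kl: int, step: int = 8) -> int:
    """Tag length F.ol (a multiple of `step`, an assumption) maximizing the proxy."""
    candidates = range(step, se_kl, step)
    return max(candidates, key=lambda f_ol: priv_ceiling(se_kl, f_ol))


if __name__ == "__main__":
    for f_ol in (40, 48):
        print(f_ol, round(priv_ceiling(128, f_ol), 2))
    print("best byte-aligned F.ol for SE.kl = 128:", best_tag_length(128))
```

For \(\mathsf {\mathsf {SE}.kl}= 128\) this proxy prefers \(\mathsf {F.ol}= 48\) over \(\mathsf {F.ol}= 40\), in line with the observation above. It ignores the remaining terms of the bound, so its numeric values should not be read as the bit-security figures reported in Fig. 20.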