Fast Message Franking: From Invisible Salamanders to Encryptment

Dodis, Yevgeniy; Grubbs, Paul; Ristenpart, Thomas; Woodage, Joanne

doi:10.1007/978-3-319-96884-1_6

Conference paper
First Online: 25 July 2018

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 10991))

Abstract

Message franking enables cryptographically verifiable reporting of abusive messages in end-to-end encrypted messaging. Grubbs, Lu, and Ristenpart recently formalized the needed underlying primitive, what they call compactly committing authenticated encryption (AE), and analyze security of a number of approaches. But all known secure schemes are still slow compared to the fastest standard AE schemes. For this reason Facebook Messenger uses AES-GCM for franking of attachments such as images or videos.

We show how to break Facebook’s attachment franking scheme: a malicious user can send an objectionable image to a recipient but that recipient cannot report it as abuse. The core problem stems from use of fast but non-committing AE, and so we build the fastest compactly committing AE schemes to date. To do so we introduce a new primitive, called encryptment, which captures the essential properties needed. We prove that, unfortunately, schemes with performance profile similar to AES-GCM won’t work. Instead, we show how to efficiently transform Merkle-Damgård-style hash functions into secure encryptments, and how to efficiently build compactly committing AE from encryptment. Ultimately our main construction allows franking using just a single computation of SHA-256 or SHA-3. Encryptment proves useful for a variety of other applications, such as remotely keyed AE and concealments, and our results imply the first single-pass schemes in these settings as well.

1 Introduction

End-to-end encrypted messaging systems including WhatsApp [40], Signal [38], and Facebook Messenger [13] have increased in popularity — billions of people now rely on them for security. In these systems, intermediaries including the messaging service provider should not be able to read or modify messages. Providers simultaneously want to provide abuse reporting: should one user send another a harmful message, image, or video, the recipient should be able to report the content to the service provider. End-to-end encryption would seem to prevent the provider from verifying that the reported message was the one sent.

Facebook suggested a way to navigate this tension in the form of message franking [14, 30]. The idea is to enable the recipient to cryptographically prove to the service provider that the reported message was the one sent. Grubbs, Lu, and Ristenpart (GLR) [17] provided the first formal treatment of the problem, and introduced compactly committing authenticated encryption with associated data (ccAEAD) as the key primitive. A secure ccAEAD scheme is symmetric encryption for which a short portion of the ciphertext serves as a cryptographic commitment to the underlying message (and associated data). They detailed appropriate security notions and security proofs that provide validation of the main Facebook message franking approach and a faster custom ccAEAD scheme called Committing Encrypt-and-PRF (CEP).

The Facebook scheme composes HMAC (serving the role of a commitment) with a standard encrypt-then-MAC AEAD scheme. Their scheme therefore requires a full three cryptographic passes over messages. The CEP construction gets this down to two. But even that does not match the fastest standard AE schemes such as AES-GCM [28] and OCB [32]. These require at most one blockcipher call (on the same key) per block of message and some arithmetic operations in GF($2^n$), which are faster than a blockcipher invocation. As observed by GLR, however, these schemes are not compactly committing: one can find two distinct messages and two encryption keys that lead to the same tag. This violates what they call receiver binding, and could in theory allow a malicious recipient to report a message that was never sent.

Existing ccAEAD schemes are not considered fast enough for all applications of message franking by practitioners [30]. Facebook Messenger does not use the ccAEAD scheme mentioned above to directly encrypt attachments, rather using a kind of hybrid encryption combining ccAEAD of a symmetric key that is in turn used with AES-GCM to encrypt the attachment. Use of AES-GCM does not necessarily seem problematic despite the GLR results; the latter do not imply any concrete attack on Facebook’s system.

Breaking Facebook’s attachment franking. Our first contribution is to show an attack against Facebook’s attachment franking scheme. The attack enables a malicious sender to transmit an abusive attachment (e.g., an objectionable image or video) to a receiver so that: (1) the recipient receives the attachment (it decrypts correctly), yet (2) reporting the abusive message fails: Facebook’s systems essentially “lose” the abusive image, rendering it invisible to the abuse handling team. Instead what gets reported to Facebook is a different, innocuous image. See Fig. 3.

Perhaps confusingly, our attack does not violate the primary reason for requiring receiver binding in committing AE (preventing a malicious recipient from framing a user as having sent a message they didn’t send). Instead it violates what GLR call sender binding security: a malicious sender should not be able to force an abusive message to be received by the recipient, yet that recipient can’t report it properly. Nevertheless, the root cause of this vulnerability in Facebook’s case is the use of an AE scheme that is not a binding commitment to its message or, equivalently in this context, that is not a robust encryption scheme [1, 15, 16].

Briefly, Facebook uses a cryptographic hash of the AES-GCM ciphertext, along with a randomly-generated value, as an identifier for the attachment. For a given abusive message, our attack efficiently finds two keys and a ciphertext, such that the first key decrypts the ciphertext to the abusive attachment while the other key successfully decrypts the same ciphertext, but to another innocuous attachment. The malicious sender transmits two messages with the different keys but the same attachment ciphertext. Facebook’s systems deduplicate the two attachments, and the report will only include the non-abusive image.

We responsibly disclosed this vulnerability to Facebook, and in fact they helped us understand how our attack works against their systems (much of the abuse handling code is server-side and closed source). The severity of the issue led them to patch their (server-side) systems and to award us a bug bounty. Their fix is ad hoc and involves deduplicating more carefully. But the vulnerability would have been avoided in the first place by using a fast ccAEAD scheme that provided the binding security properties implicitly assumed of, but not actually provided by, AES-GCM.

Towards faster ccAEAD schemes: encryptment. This message franking failure motivates the need for faster schemes. As mentioned, the best known secure ccAEAD scheme from GLR is two pass, requiring computing both HMAC and AES-CTR mode (or similar) over the message. The fastest standard AE schemes [22, 28, 32], however, require just a single pass using a blockcipher with a single key. Can we build ccAEAD schemes that match this performance?

To tackle this question we first abstract out the core technical challenge underlying ccAEAD: building a one-time encryption mechanism that simultaneously encrypts and compactly commits to the message. We formalize this in a new primitive that we call encryptment. An encryptment of a message using a key $K_{{\mathsf{{EC}}}}$ is a pair $(C_{{\mathsf{{EC}}}},B_{{\mathsf{{EC}}}})$ where $C_{{\mathsf{{EC}}}}$ is a ciphertext and $B_{{\mathsf{{EC}}}}$ is a binding tag. By compactness we require that $|B_{{\mathsf{{EC}}}}|$ is independent of the length of the message. Decryption takes as input $K_{{\mathsf{{EC}}}},C_{{\mathsf{{EC}}}},B_{{\mathsf{{EC}}}}$ and returns a message (or $\bot $). Finally, there is a verification algorithm that takes a key, a message, and a binding tag, and determines whether the tag is a commitment to the message. Encryptment supports associated data also, but we defer the details to the body.

We introduce security notions for encryptment. These include a real-or-random style confidentiality goal in which the adversary must distinguish between a single encryptment and an appropriate-length sequence of random bits. Additionally we require sender binding and receiver binding notions like those from GLR (but adapted to the encryptment syntax), and finally a strong correctness property that is easy to meet. Comparatively, GLR require many-time confidentiality and integrity notions in addition to various binding notions.

Therefore encryptment is substantially simpler than ccAEAD, making analyses easier and, we think, design of constructions more intuitive. At the same time, we will be able to build ccAEAD from encryptment using simple, efficient transforms. In the other direction, we show that one can also build encryptment from ccAEAD, making the two primitives equivalent from a theoretical perspective. Encryptment also turns out to be the “right” primitive for a number of other applications: robust authenticated-encryption [1, 15, 16], concealments [12], remotely keyed authenticated encryption [12], and perhaps even more.

Fast encryptment from fixed-key blockciphers? Given a simpler formulation in hand, we turn to building fast schemes. First, we show a negative result: encryptment schemes cannot match the efficiency profile of OCB or AES-GCM. In fact we rule out any scheme that uses just a single blockcipher invocation for each block of message, with some fixed small set of keys.

The negative result makes use of a connection between encryptment and collision-resistant (CR) hashing. Because encryptment schemes are deterministic, we can think of the computation of a binding tag $B_{{\mathsf{{EC}}}}$ as a deterministic function $F(K_{{\mathsf{{EC}}}},M)$ applied to the key and message; verification simply checks that $F(K_{{\mathsf{{EC}}}},M) = B_{{\mathsf{{EC}}}}$. Then, receiver binding is achieved if and only if F is CR: the adversary shouldn’t be able to find $(K_{{\mathsf{{EC}}}},M) \ne (K_{{\mathsf{{EC}}}}',M')$ such that $F(K_{{\mathsf{{EC}}}},M) = F(K_{{\mathsf{{EC}}}}',M')$.

Given this connection, we can exploit previous work on ruling out fixed-key blockcipher-based CR hashing [34, 35, 37]. A simple corollary of [35, Theorem 1] is that one cannot prove receiver binding security for any rate-1 fixed-key blockcipher-based encryptment. (Rate-1 meaning one blockcipher call per block of message.) Since OCB and AES-GCM fall into this category of rate-1, they don’t work, but neither do other similar blockcipher-based schemes. Our negative result also rules out rate-1 ccAEAD, due to our aforementioned result that (fast) ccAEAD implies (fast) encryptment.

One-pass encryptment from hashing. Given the connection just mentioned, it is natural to turn to CR hashing as a starting point for building as-fast-as-possible encryptment. We do so and show how to achieve secure encryptment using just a single pass of a secure cryptographic hash function. The encryptment can be viewed as a mode of operation of a fixed-input-length compression function, such as the one underlying SHA-256 or other Merkle-Damgård style constructions.

Let f(x, y) be a compression function on two n-bit inputs and with output an n-bit string. Then our HFC (hash function chaining) encryptment works as shown in Fig. 8. Basically one hashes $K_{{\mathsf{{EC}}}}\, \Vert \,(M_1 \oplus K_{{\mathsf{{EC}}}}) \, \Vert \,\cdots \, \Vert \,(M_m \oplus K_{{\mathsf{{EC}}}})$ using a standard iteration of f, where $M_1, \dots , M_m$ are the message blocks. But, additionally, one uses the intermediate chaining values as pads to encrypt the message blocks. Decryption simply computes the hash, recovering message blocks as it goes.
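
The chaining idea above can be sketched in a few lines of Python. This is a minimal illustration under loudly stated assumptions, not the scheme of Fig. 8: the function names are hypothetical, the compression function f is replaced by SHA-256 applied to the concatenation of two 32-byte inputs, messages must be a whole number of 32-byte blocks, and padding and associated data are omitted.

```python
import hashlib

N = 32            # block size in bytes (n = 256 bits)
IV = bytes(N)     # all-zero initial chaining value (assumption)

def f(x: bytes, y: bytes) -> bytes:
    """Stand-in for a compression function on two N-byte inputs (assumption)."""
    return hashlib.sha256(x + y).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def split_blocks(data: bytes) -> list:
    if len(data) % N:
        raise ValueError("this sketch omits padding; length must be a multiple of N")
    return [data[i:i + N] for i in range(0, len(data), N)]

def hfc_encrypt(key: bytes, msg: bytes):
    """Return (ciphertext, binding_tag) for a one-time key."""
    v = f(IV, key)                     # first hashed block is the key itself
    ct = b""
    for m in split_blocks(msg):
        ct += xor(m, v)                # chaining value doubles as the pad
        v = f(v, xor(m, key))          # hash the next block, M_i XOR K
    return ct, v                       # final chaining value is the tag

def hfc_decrypt(key: bytes, ct: bytes, tag: bytes):
    """Recompute the hash while recovering blocks; None plays the role of bottom."""
    v = f(IV, key)
    msg = b""
    for c in split_blocks(ct):
        m = xor(c, v)
        msg += m
        v = f(v, xor(m, key))
    return msg if v == tag else None

def hfc_verify(key: bytes, msg: bytes, tag: bytes) -> bool:
    """Check that tag commits to msg under key."""
    return hfc_encrypt(key, msg)[1] == tag
```

Encryption, decryption, and verification each make one call to f per message block plus one for the key, which is the single-pass behavior described above.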

We prove that our HFC scheme is a secure encryptment. Binding is inherited from the CR of the underlying hash function. We show confidentiality assuming $f(x,y\oplus K_{{\mathsf{{EC}}}})$ is a related-key-attack-secure pseudorandom function (RKA-PRF) [3] when keyed by $K_{{\mathsf{{EC}}}}$. For standard designs, such as the Davies-Meyer construction $f(x,y\oplus K_{{\mathsf{{EC}}}}) = E(y\oplus K_{{\mathsf{{EC}}}},x)\oplus x$, we can reduce RKA-PRF security to RKA-PRP security of the underlying blockcipher E. This property is already an active target of cryptographic analysis for standard E (such as AES), giving us confidence in the assumption. Because SHA-256 uses a DM-style compression function, this also gives confidence for using SHA-256 (or SHA-384, SHA-512).

From a theoretical perspective, one might want to avoid relying on RKA security (compared to standard PRF security). We discuss approaches for doing so in the body, but the resulting constructions are not as fast or elegant as HFC.

HFC has some features in common with the Duplex authenticated-encryption mode [6] using Keccak (SHA-3) [5]. In fact the Duplex mode gives rise to a secure encryptment scheme as well. See the full version for a discussion. The way we key in HFC is also similar to the Halevi-Krawczyk construction for reducing the assumptions needed on hash functions in digital signature settings [20], but the keying serves a different role here and their analysis techniques are not applicable.

From encryptment to ccAEAD. We show several efficient transforms for building a ccAEAD scheme given a secure encryptment. First consider doing so given also a secure (standard) AE scheme. To encrypt a message M, first generate a random key $K_{{\mathsf{{EC}}}}$ and then compute an encryptment $(C_{{\mathsf{{EC}}}},B_{{\mathsf{{EC}}}})$ for $K_{{\mathsf{{EC}}}},M$. Encrypt $K_{{\mathsf{{EC}}}}$ under the long-lived AE key K using as associated data the binding tag $B_{{\mathsf{{EC}}}}$. The resulting ciphertext is the AE ciphertext (including its authentication tag) along with $C_{{\mathsf{{EC}}}},B_{{\mathsf{{EC}}}}$. We prove that this transformation provides the multi-opening confidentiality and integrity goals for ccAEAD of GLR, assuming the standard security of the AE scheme and the aforementioned security goals are met for the encryptment scheme.

One can instead use just two additional PRF calls to securely convert an encryptment scheme to a ccAEAD scheme. One can, for example, instantiate the PRF with the SHA-256 compression function, to have a total cost of at most $m + 4$ SHA-256 compression function calls for a message that can be parsed into m blocks of 256 bits. Another transform uses a single tweakable blockcipher call in addition to the encryptment. See the full version for details.

Our approach of hashing-based ccAEAD has a number of attractive features. HFC works with any hash function that iterates a secure compression function, giving us a wide variety of options for instantiation. Because of our simplified formalization via encryptment, the security proofs are modular and conceptually straightforward. As already mentioned it is fast in terms of the number of underlying primitive calls. If instantiated using SHA-256, one can use the SHA hardware instructions [18] now supported on some AMD and ARM processors, and that are likely to be incorporated in future Intel processors. Finally, HFC-based ccAEAD is simple to implement.

Other applications. Encryptment proves a useful abstraction for other applications as well. In the full version of this work, we show how it suffices for building concealments [12] (a conceptually similar, but distinct, primitive) which, in turn, can be used to build remotely keyed AE [12]. Previous constructions of these required two passes over the message. Our new encryptment-based approach gives the first single-pass concealments and remotely keyed AE. Finally, encryptment schemes give rise to robust AE [15] via some of our transforms mentioned above. We expect that encryptment will find further applications in the future.

2 Definitions and Preliminaries

Preliminaries. For an alphabet $\varSigma $, we let $\varSigma ^*$ denote the set of all strings of symbols from that alphabet, and let $\varSigma ^n$ denote the set of all such strings of length n. For a string $x \in \varSigma ^*$, we write |x| to denote the length of the string x. We let $\varepsilon $ denote the empty string, and $\perp $ denote the distinguished error symbol. We write $x {\leftarrow }{\$}\, \mathcal {X}$ to denote choosing an element x at random from the set $\mathcal {X}$.

We define the XOR of two strings of different lengths to return the XOR of the shorter string and the truncation of the longer string to the length of the shorter string. Our proofs assume a RAM model of computation where most operations are unit cost. We use big-O notation $\mathcal{O}(\cdot )$ to hide small constants related to the internal data structures (e.g., tables of queries) used by reductions.

For a deterministic algorithm A, we write $y \leftarrow A(x_1, \dots )$ to denote running A on inputs $x_1, \dots $ to produce output y. For a probabilistic algorithm A with associated coin space $\mathcal {C}$, we write $y {\leftarrow }{\$}\, A(x_1, \dots )$ to denote choosing coins $c {\leftarrow }{\$}\, \mathcal {C}$ and returning $y \leftarrow A(x_1, \dots ; c)$, where $y \leftarrow A(x_1, \dots ; c)$ denotes running A on the given inputs with coins c fixed, to deterministically produce output y.

Collision-resistant functions. Let $\mathcal{H}$ be a function on some domain. The collision resistance game $\text {CR}$ has ${\mathcal A}$ run and output a pair of messages $X,X'$. If analysis is with respect to an ideal primitive such as an ideal cipher, then ${\mathcal A}$ is given oracle access to this primitive also. The game outputs true if $\mathcal{H}(X) = \mathcal{H}(X')$ and $X \ne X'$. The $\text {CR}$ advantage of an adversary ${\mathcal A}$ against $\mathcal{H}$ is defined $\mathbf {Adv}^{\mathrm {cr}}_{\mathcal{H}}({\mathcal A}) = \Pr \left[ \, \text {CR}^{\mathcal A}_\mathcal{H}\Rightarrow \mathsf {true} \,\right] $, where the probability is over the coins of ${\mathcal A}$ and those of any ideal primitive. We measure the efficiency of the attacker in terms of their resources, e.g. run time or number of queries made to some underlying primitive.
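
To make the game concrete, the following sketch plays it against a deliberately weak function: SHA-256 truncated to two bytes. The truncated function, the adversary, and all names are illustrative assumptions, chosen only so that a collision search finishes quickly.

```python
import hashlib

def H(x: bytes, nbytes: int = 2) -> bytes:
    """Toy target: SHA-256 truncated to nbytes (deliberately not collision resistant)."""
    return hashlib.sha256(x).digest()[:nbytes]

def cr_game(adversary, nbytes: int = 2) -> bool:
    """Run the adversary and output true iff it returns a valid collision."""
    x, x_prime = adversary()
    return x != x_prime and H(x, nbytes) == H(x_prime, nbytes)

def find_collision(nbytes: int = 2):
    """Table-based search: hash distinct inputs until two digests coincide."""
    seen = {}
    i = 0
    while True:
        x = i.to_bytes(8, "big")
        d = H(x, nbytes)
        if d in seen:
            return seen[d], x
        seen[d] = x
        i += 1
```

By the pigeonhole principle the search stops after at most $2^{16}+1$ inputs for a two-byte output, so `cr_game(find_collision)` outputs true for this toy function.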

For space reasons, we direct the reader to [33] for syntax and correctness notions for AEAD. We require that AEAD schemes offer both real-or-random confidentiality and ciphertext integrity. These will be formalized in Sect. 7.

3 Invisible Salamanders: Breaking Facebook’s Franking

In this section we demonstrate an attack against Facebook’s message franking. Facebook uses AES-GCM to encrypt attachments sent via Secret Conversations. The attack creates a “colliding” GCM ciphertext which decrypts to an abusive attachment via one key and an innocuous attachment via the other. This combined with the behavior of Facebook’s server-side abuse report generation code prevents abusive messages from being reported to Facebook. Since messages in Secret Conversations are called “salamanders” by Facebook (perhaps inspired by the Axolotl ratchet used in Signal, named for an endangered salamander), ensuring Facebook does not see a message essentially makes it an invisible salamander. We responsibly disclosed the vulnerability to Facebook. They have remediated it and have given us a bug bounty for reporting the issue.

Facebook’s attachment franking. A diagram of Facebook’s franking protocol for attachments (e.g., images and videos) is in Fig. 1. The protocol uses $\text {CtE2}$, Facebook’s ccAEAD scheme for chat messages described in [14, 30] and analyzed in [17], as a subroutine. Some encryption and HMAC keys, as well as some other details like headers and associated data not important to the presentation of the protocol, have been removed for simplicity in the diagram and prose below. Consult [14, 17] for additional details. For ease of exposition we divide the protocol into three phases: the sending phase involving the sender Alice and Facebook, the receiving phase involving the receiver Bob and Facebook, and the reporting phase between Bob and Facebook.

Sending phase: In the first part of the sending phase, Alice generates a key $K_{\text {im}}$ and nonce $N_{\text {im}}$ and encrypts $M_{\text {a}}$ using AES-GCM (described in pseudocode in Fig. 2) to obtain a ciphertext $C_{\text {im}}$. The sender computes the SHA-256 digest $D_{\text {im}}$ of $N_{\text {im}}\, \Vert \,C_{\text {im}}$ and sends Facebook $N_{\text {im}}\, \Vert \,C_{\text {im}}$ for storage. Facebook generates a random identifier $\text {id}$ and puts $N_{\text {im}}\, \Vert \,C_{\text {im}}$ in a key-value data structure with key $\text {id}$. Facebook then sends $\text {id}$ to Alice. In the second part of the sending phase, Alice encrypts the message $\text {id}\, \Vert \,K_{\text {im}}\, \Vert \,D_{\text {im}}$ using $\text {CtE2}$ to obtain the ccAEAD ciphertext $C,C_B$. Below, we will call a message containing an identifier, key and digest an “attachment metadata” message. Alice sends $C,C_B$ to Facebook, which runs $\mathsf {FBTag}$ on $C_B$ (this amounts to HMAC-SHA256 with an internal Facebook key and some metadata) as in the standard message franking protocol to obtain $a$. Facebook sends $C,C_B,a$ to the receiver.

Receiving phase: Upon receiving a message $C,C_B,a$ from Alice (via Facebook), Bob runs $\text {CtE2-Dec}$ on $C,C_B$ to obtain $\text {id}\, \Vert \,K_{\text {im}}\, \Vert \,D_{\text {im}}$. Bob then sends $\text {id}$ to Facebook, which gets the value $N_{\text {im}}\, \Vert \,C_{\text {im}}$ associated with $\text {id}$ in its key-value store and sends it to Bob. Bob verifies that $D_{\text {im}}= \mathsf{SHA}\text {-}\mathsf{256}(N_{\text {im}}\, \Vert \,C_{\text {im}})$ and decrypts $C_{\text {im}}$ to obtain the attachment content $M_{\text {a}}$.

Reporting phase: Bob sends all recent messages to Facebook along with their commitment openings and $a$ values (not pictured in the diagram). For each message, Facebook verifies the commitment using $\text {CtE2-Ver}$ and the authentication tag $a$ using its internal HMAC key. Then, if the commitment verifies correctly and the message contains attachment metadata, Facebook gets the attachment ciphertext and nonce $N_{\text {im}}\, \Vert \,C_{\text {im}}$ from its key-value store using its identifier $\text {id}$. Facebook verifies that $D_{\text {im}}= \mathsf{SHA}\text {-}\mathsf{256}(N_{\text {im}}\, \Vert \,C_{\text {im}})$ and decrypts $C_{\text {im}}$ with $K_{\text {im}}$ and $N_{\text {im}}$ to obtain the attachment content $M_{\text {a}}$. If no other attachment metadata message containing identifier $\text {id}$ has already been seen, the plaintext $M_{\text {a}}$ is added to the abuse report R. (Looking ahead, this is the application-level behavior that enables the attack, which will violate the one-to-one correspondence between $\text {id}$ and plaintext that is assumed here.)

Attack intuition. The threat model of this attack is a malicious Alice who wants to send an abusive attachment to Bob, but prevent Bob from reporting it to Facebook. The attachment can be an offensive image (e.g., a picture of abusive text or of a gun) or video. We focus our discussion below on images.

The attack has two main steps: (1) generating the colliding ciphertext and (2) sending it twice to Bob. In step (1), Alice creates two GCM keys and a single GCM ciphertext which decrypts (correctly) to the abusive attachment under one key and to a different attachment under the other key. In step (2), Alice sends the ciphertext to Facebook and gets an identifier back. Alice then sends the identifier to Bob twice, once with each key.

On receiving the two messages, Bob decrypts the image twice and sees both the abusive attachment and the other one. When Bob reports the conversation to Facebook, its server-side code verifies both decryptions of the image ciphertext but only inserts the other decryption into the abuse report—the human making the abusive-or-not judgment will have no idea Bob saw the abusive attachment.

We will describe two variants of the attack. We will begin with the case where the second decryption of the colliding ciphertext is junk bytes with no particular structure. This variant is simple but easily detectable, since the junk bytes will not display correctly. Then we give a more advanced variant where the second decryption correctly displays an innocuous attachment, like a picture of a kitten.

Generating the colliding ciphertext—simple variant. Alice begins the attack with an abusive attachment $M_{\text {a}}^{\text {ab}}$. Alice chooses two distinct 128-bit GCM keys $K_{1}$ and $K_{2}$ and a nonce $N_{\text {im}}$, then computes a ciphertext $C_{\text {a}}$ via $\text {CTR-Enc}(K_{1},N_{\text {im}}+2,M_{\text {a}}^{\text {ab}})$, where $\text {CTR-Enc}$ denotes CTR-mode encryption with the given key and nonce. The nonce is $N_{\text {im}}+2$ to match GCM, see Fig. 2. In Facebook’s scheme Alice can choose the keys and the nonce, but this is not necessary—any combination of two keys and a nonce will work.

The ciphertext $C_{\text {a}}$ is almost, but not quite, the ciphertext Alice will use in the attack. To ensure GCM decryption is correct for both keys, Alice generates the colliding GCM tag and final ciphertext block using $\mathsf{{Collide}}\text {-}\mathsf{{GCM}}(K_{1},K_{2},N_{\text {im}},C_{\text {a}})$ (described in Fig. 2). The function $\mathsf{{Collide}}\text {-}\mathsf{{GCM}}$ works by computing the tags for the two keys then solving a linear equation to find the value of the last ciphertext block. We use the final ciphertext block as the variable, but a different ciphertext block or a block of associated data could be used instead. The output $N_{\text {im}}\, \Vert \,C_{\text {im}}\, \Vert \,T$ correctly decrypts to $M_{\text {a}}^{\text {ab}}$ under $K_{1}$ and to another plaintext $\mathsf {M}_{\text {j}}$ under $K_{2}$. However, the plaintext $\mathsf {M}_{\text {j}}$ will be random bytes with no structure.
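
The linear-equation step can be illustrated as follows. This sketch is an assumption-laden stand-in for the pseudocode of Fig. 2, not a reproduction of it: the hash subkeys `h1`, `h2` and tag pads `p1`, `p2` are taken as given 128-bit integers (in real GCM they would come from AES under $K_{1}$ and $K_{2}$), no associated data is used, and all function names are hypothetical. Because the GCM tag is an affine function of each ciphertext block over $\text {GF}(2^{128})$, equating the two tags gives one linear equation in the final block.

```python
R = 0xE1 << 120      # reduction constant for GCM's field representation
ONE = 1 << 127       # multiplicative identity in GCM's bit order

def gf_mul(x: int, y: int) -> int:
    """Multiply in GF(2^128) using GCM's bit-reflected convention."""
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z

def gf_inv(x: int) -> int:
    """Inverse via x^(2^128 - 2), by square-and-multiply."""
    result, e = ONE, (1 << 128) - 2
    while e:
        if e & 1:
            result = gf_mul(result, x)
        x = gf_mul(x, x)
        e >>= 1
    return result

def ghash_tag(h: int, pad: int, blocks: list, length_block: int) -> int:
    """Tag = GHASH_h(blocks || length_block) XOR pad, with no associated data."""
    y = 0
    for b in blocks + [length_block]:
        y = gf_mul(y ^ b, h)
    return y ^ pad

def collide_last_block(h1, p1, h2, p2, prefix_blocks, length_block) -> int:
    """Solve for the final block x so that both keys produce the same tag."""
    # With x = 0 the tags are t1 and t2; a nonzero x adds x*h^2 under each key.
    t1 = ghash_tag(h1, p1, prefix_blocks + [0], length_block)
    t2 = ghash_tag(h2, p2, prefix_blocks + [0], length_block)
    coeff = gf_mul(h1, h1) ^ gf_mul(h2, h2)
    if coeff == 0:
        raise ValueError("hash subkeys must differ")
    # t1 ^ x*h1^2 == t2 ^ x*h2^2  =>  x = (t1 ^ t2) / (h1^2 ^ h2^2)
    return gf_mul(t1 ^ t2, gf_inv(coeff))
```

The same approach works with any other block as the unknown; only the power of the hash subkey multiplying that block changes.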

Sending the colliding ciphertext. Alice continues the sending phase with Facebook, obtaining an identifier $\text {id}$ for the ciphertext $N_{\text {im}}\, \Vert \,C_{\text {im}}$. Alice then creates two attachment metadata messages: $\text {MD}_1 = \text {id}\, \Vert \,K_{2}\, \Vert \,D_{\text {im}}$ and $\text {MD}_2 = \text {id}\, \Vert \,K_{1}\, \Vert \,D_{\text {im}}$. Alice completes the remainder of the sending phase twice, first with $\text {MD}_1$ and then with $\text {MD}_2$. (The first message sent is associated to the junk message.) After finishing the receiving phase for $\text {MD}_1$, Bob will decrypt $C_{\text {im}}$ with $K_{2}$, giving $\mathsf {M}_{\text {j}}$. After finishing the receiving phase with $\text {MD}_2$, Bob will decrypt $C_{\text {im}}$ with $K_{1}$ and see $M_{\text {a}}^{\text {ab}}$. We emphasize that both attachment metadata messages are valid, and no security properties of $\text {CtE2}$ are violated.

When Bob reports the recent messages, Facebook will verify both $\text {MD}_1$ and $\text {MD}_2$ and check the digest $D_{\text {im}}$ matches the value $N_{\text {im}}\, \Vert \,C_{\text {im}}$ stored with identifier $\text {id}$. However, it will only insert the first decryption, the plaintext $\mathsf {M}_{\text {j}}$, into the abuse report. The system sees the second ciphertext has the same SHA-256 hash and identifier, and assumes it’s a duplicate: the human viewing the report will have no idea Bob ever saw the message $M_{\text {a}}^{\text {ab}}$.
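
The deduplication behavior can be modeled with a toy simulation. Everything here is a simplifying assumption rather than Facebook's code: the names are hypothetical, $\text {CtE2}$ and $\mathsf {FBTag}$ are omitted, and a SHA-256-based keystream stands in for AES-GCM with its tag check dropped, so the sketch shows only the report-generation logic and not the tag collision.

```python
import hashlib

def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher standing in for CTR mode (assumption, not AES)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

def build_report(store: dict, metadata_messages: list) -> list:
    """Model of the reporting phase: one plaintext per attachment identifier."""
    report, seen_ids = [], set()
    for ident, key, digest in metadata_messages:
        nonce, ct = store[ident]
        if hashlib.sha256(nonce + ct).digest() != digest:
            continue                        # digest check fails: skip message
        plaintext = ctr_xor(key, nonce, ct)
        if ident not in seen_ids:           # later messages with the same id
            seen_ids.add(ident)             # are treated as duplicates
            report.append(plaintext)
    return report

# Malicious sender: one stored ciphertext, two metadata messages.
k1, k2, nonce = b"\x01" * 16, b"\x02" * 16, b"\x00" * 12
abusive = b"abusive attachment bytes"
ct = ctr_xor(k1, nonce, abusive)            # decrypts to abusive under k1
junk = ctr_xor(k2, nonce, ct)               # decrypts to junk under k2
store = {"id-1": (nonce, ct)}
digest = hashlib.sha256(nonce + ct).digest()
md1 = ("id-1", k2, digest)                  # sent first: junk decryption
md2 = ("id-1", k1, digest)                  # sent second: abusive decryption
report = build_report(store, [md1, md2])
```

In this model `report` contains only the junk plaintext, mirroring the outcome described above: both metadata messages verify, yet the abusive plaintext never reaches the report.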

3.1 Advanced Variant and Proof of Concept

Next we will describe the advanced variant of the attack (in which both decryptions correctly display as attachments) and our proof-of-concept implementation. Ensuring both decryptions are valid attachments is important because the simple variant (where one decryption is random bytes) may not have sufficed for a practical exploit if Facebook only inserted valid images into their abuse reports. We implemented the advanced variant and crafted a colliding ciphertext for which the “abusive” decryption $M_{\text {a}}^{\text {ab}}$ is the image of an Axolotl salamander in Fig. 3. The innocuous decryption $\mathsf {M}_{\text {j}}$ is the image of a kitten in that figure. We verified both display correctly in Facebook Messenger’s browser client.

The only difference between the advanced variant and the one described above is the way Alice generates the ciphertext $C_{\text {a}}$ which is input to $\mathsf{{Collide}}\text {-}\mathsf{{GCM}}$. Instead of simply encrypting the abusive attachment $M_{\text {a}}^{\text {ab}}$, Alice first merges $M_{\text {a}}^{\text {ab}}$ and another innocuous attachment $\mathsf {M}_{\text {j}}$ using a function $\text {Att-Merge}(K_{1},K_{2},M_{\text {a}}^{\text {ab}},\mathsf {M}_{\text {j}})$ which takes the two keys and attachments and outputs a nonce $N_{\text {im}}$ and $C_{\text {a}}$ so that $\text {CTR-Dec}(K_{1},N_{\text {im}}+2,C_{\text {a}})$ displays $M_{\text {a}}^{\text {ab}}$ and $\text {CTR-Dec}(K_{2},N_{\text {im}}+2,C_{\text {a}})$ displays $\mathsf {M}_{\text {j}}$. The exact implementation of $\text {Att-Merge}$ is file-format-specific, but for most formats $\text {Att-Merge}$ has two main steps: (1) a nonce search yielding a nonce which gives a collision on some region of the ciphertext, and (2) a plaintext restructuring that expands the plaintexts with random bytes in locations that are ignored by parsers for their respective file formats. We implemented $\text {Att-Merge}$ for JPEG and BMP images (the salamander image and the kitten image, respectively), so our discussion will focus on these formats.

Before discussing our implementation of $\text {Att-Merge}$ we will briefly describe the JPEG and BMP file formats. JPEG files must begin with the two-byte sequence $\texttt {ff}{} \texttt {d8}$ and end with $\texttt {ff}{} \texttt {d9}$. JPEGs can have comments. They are indicated with the two-byte sequence $\texttt {ff}{} \texttt {fe}$ followed by a big-endian two-byte encoding of the comment length. BMP files must begin with $\texttt {42}{} \texttt {4d}$, and the next four bytes must be the length block. The length block in a BMP file is a four-byte (little-endian) encoding of the file length. All the BMP parsers we used only read the number of bytes indicated in the header and ignore trailing bytes.
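
The format facts above are what make the merge possible, and they can be expressed as small helpers. This is an illustrative sketch with hypothetical function names; it also relies on the JPEG convention that a segment's two-byte length field counts its own two bytes, which the description above does not spell out.

```python
import struct

JPEG_START = b"\xff\xd8"
JPEG_END = b"\xff\xd9"
JPEG_COMMENT = b"\xff\xfe"
BMP_MAGIC = b"\x42\x4d"

def looks_like_jpeg(data: bytes) -> bool:
    """Check the fixed two-byte sequences at the start and end of a JPEG."""
    return data[:2] == JPEG_START and data[-2:] == JPEG_END

def jpeg_comment(payload: bytes) -> bytes:
    """Wrap arbitrary bytes in a comment segment that JPEG parsers skip."""
    if len(payload) > 0xFFFF - 2:
        raise ValueError("payload too long for a single JPEG comment")
    # Big-endian length; it includes the two length bytes themselves.
    return JPEG_COMMENT + struct.pack(">H", len(payload) + 2) + payload

def bmp_declared_length(data: bytes) -> int:
    """Read the four-byte little-endian file length that follows the magic bytes."""
    if data[:2] != BMP_MAGIC:
        raise ValueError("not a BMP file")
    return struct.unpack("<I", data[2:6])[0]

def bmp_visible_bytes(data: bytes) -> bytes:
    """Bytes a parser reads if it honors the header length and ignores the rest."""
    return data[:bmp_declared_length(data)]
```

Bytes hidden inside `jpeg_comment(...)` or appended after the declared BMP length are ignored by such parsers, which is where the restructuring step can place bytes that only matter for the other file.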

Nonce search. Since file formats generally have some internal structure (like having a fixed byte sequence at the beginning or end) $\text {Att-Merge}$ must choose a nonce so that the keystreams for the two keys respect this structure. JPEG and BMP files must begin with different fixed two-byte sequences, so the keystreams XORed with those sequences must result in a collision for the first two bytes. The plaintext restructuring step will need the JPEG to have a comment header in the next two bytes, which in the BMP plaintext contain the file length. Thus, the nonce output by $\text {Att-Merge}$ must produce a collision in the first four bytes of the ciphertext (marked $C^0$ through $C^3$ in Fig. 4), which happens for about one in $2^{32}$ nonces. We wrote a simple Python script to search through nonces until we found 10606665379, which produces the required collision. Finding that nonce took roughly three hours on a 3.4 GHz quad-core Intel i7.
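
A search of this shape might look like the sketch below. It is not the script we used: the keystream is a SHA-256-based stand-in for AES-CTR, the names are hypothetical, and the prefix length is a parameter so that the search can be tried on two bytes (about $2^{16}$ expected attempts) rather than four.

```python
import hashlib

def keystream_prefix(key: bytes, nonce: int, n: int) -> bytes:
    """Stand-in for the first n keystream bytes (assumption: not AES-CTR; n <= 32)."""
    return hashlib.sha256(key + nonce.to_bytes(12, "big")).digest()[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def nonce_search(k1: bytes, k2: bytes, jpeg_prefix: bytes, bmp_prefix: bytes,
                 max_tries: int):
    """Find a nonce for which both plaintext prefixes encrypt to the same bytes."""
    if len(jpeg_prefix) != len(bmp_prefix):
        raise ValueError("prefixes must have equal length")
    n = len(jpeg_prefix)
    # ks1 ^ jpeg == ks2 ^ bmp  is equivalent to  ks1 ^ ks2 == jpeg ^ bmp.
    target = xor(jpeg_prefix, bmp_prefix)
    for nonce in range(max_tries):
        ks1 = keystream_prefix(k1, nonce, n)
        ks2 = keystream_prefix(k2, nonce, n)
        if xor(ks1, ks2) == target:
            return nonce
    return None
```

Each additional prefix byte multiplies the expected number of attempts by 256, which is why a four-byte collision costs on the order of $2^{32}$ keystream evaluations.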

Plaintext restructuring. After the nonce search, the two plaintexts can be restructured. For JPEG and BMP images $\text {Att-Merge}$ performs the following steps: (1) inserting the decryption (under $K_{1}$) of the BMP ciphertext into a comment region at the beginning of the JPEG, (2) inserting an additional comment at the end of the JPEG so the bytes randomized by $\mathsf{{Collide}}\text {-}\mathsf{{GCM}}$ are ignored by the JPEG parser, and (3) appending the decryption (under $K_{2}$) of the JPEG ciphertext to the end of the BMP plaintext. See Fig. 4 for a diagram of the JPEG and BMP plaintexts after restructuring.

One important subtlety is that JPEG comments are at most $2^{16}$ bytes in length, so the BMP image must be smaller than $2^{16}$ bytes. In fact, it is advantageous for the BMP to be as small as possible because the comment length bytes in the JPEG are not fixed by the nonce search. A more detailed explanation of this issue and plaintext restructuring in general will be given in the full version of this work.

Implementing $\mathsf{{Collide}}\text {-}\mathsf{{GCM}}$. We implemented $\mathsf{{Collide}}\text {-}\mathsf{{GCM}}$ in Python 2.7 and verified that arbitrary colliding ciphertexts can be generated in roughly 45 s using an unoptimized implementation of $\text {GF}(2^{128})$ arithmetic. We checked decryption correctness using cryptography.io, a Python cryptography library which uses OpenSSL’s GCM implementation. This sufficed as a proof-of-concept exploit for Facebook’s engineering team.

3.2 Discussion and Mitigation

We chose JPEG and BMP files for our $\text {Att-Merge}$ proof of concept because their formats can tolerate random bytes in different regions of the file (the beginning and the end, respectively). We did not try to extend the $\text {Att-Merge}$ to other common image formats but it is possible. We did not try to implement $\text {Att-Merge}$ for video file formats. Such formats are substantially more complex than image formats, but we conjecture it is possible to extend the attack to video files.

Relation to GLR. In [17] GLR proved $\text {CtE2}$ is a ccAEAD scheme, and one may wonder whether this attack shows their proof is incorrect. Their proof only applies to $\text {CtE2}$ itself, not to the composition of $\text {CtE2}$ and GCM. Concretely, GLR analyzed $\text {CtE2}$ as it is used for text chat messages in Messenger, but did not analyze how it is used for attachments. This attack points to a gap between GLR’s analysis and what Facebook actually uses, but it does not mean GLR’s proof is incorrect. Indeed, the fact that the attack works without breaking $\text {CtE2}$’s binding highlights the surprising subtlety of security notions for this setting.

The $\mathsf{{Collide}}\text {-}\mathsf{{GCM}}$ algorithm in Fig. 2 is related to the ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ attack against GCM given by GLR [17]. However, their attack is insufficient to exploit Facebook’s attachment franking: it only creates ciphertexts with colliding tags, but not the same ciphertext. Thus using it against Facebook wouldn’t work, because the SHA-256 hashes of the two images would not collide. The $\mathsf{{Collide}}\text {-}\mathsf{{GCM}}$ algorithm works even if the entire ciphertext, including any headers and the nonce, acts as the commitment and the only opening is the encryption key.

Mitigating the attack. There are two main ways this attack can be mitigated. The first is a “server-software-only” patch that ensures abuse reports containing attachments are not deduplicated by attachment identifier. The second is changing the Messenger clients to use a ccAEAD scheme instead of GCM to encrypt attachments. In response to our bug report, Facebook deployed the first mitigation, primarily because it did not require patching the Messenger clients (an expensive and time-consuming process). Despite requiring less engineering effort, we believe this mitigation has some important drawbacks. Most notably, it leaves the underlying cryptographic issue intact: attachments are still encrypted using GCM. This means future changes to either the Messenger client or Facebook’s server-side code could re-expose the vulnerability. Using a ccAEAD in place of GCM for attachment encryption would immediately prevent any deduplication behavior from being exploited, since the binding security of ccAEAD implies attachment identifiers uniquely identify the attachment plaintexts.

4 A New Primitive: Encryptment

In this section, we introduce a new primitive called an encryptment scheme. Encryptment schemes allow both encryption of, and commitment to^{Footnote 1}, a message. Moreover, the schemes which we target and ultimately build achieve both security goals with only a single pass over the underlying data.

While the syntax of encryptment schemes is similar to that of the ccAEAD schemes we ultimately look to build, the key difference is that we expect far more minimal security notions from encryptment schemes (see Sect. 7 for a more detailed discussion). Looking ahead, we shall see that a secure encryptment scheme is the key building block for more complex primitives such as ccAEAD schemes, robust encryption [1, 15, 16], cryptographic concealments [12], and domain extension for authenticated encryption and remotely keyed AE [12], facilitating the construction of very efficient instantiations of these primitives. In Sect. 7.3 we show how to build ccAEAD from encryptment. The other primitives are deferred to the full version of this work.

Encryptment schemes. Applying the encryptment algorithm to a given key, header and message tuple $(K_{{\mathsf{{EC}}}}, H, M)$ returns a pair $(C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}})$ which we call an encryptment. We refer to encryptment component $C_{{\mathsf{{EC}}}}$ as the ciphertext, and to $B_{{\mathsf{{EC}}}}$ as the binding tag. Together the ciphertext/binding tag pair $(C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}})$ function as an encryption of M under key $K_{{\mathsf{{EC}}}}$, so that given $(K_{{\mathsf{{EC}}}}, H, C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}})$, the opening algorithm ${\textsf {DO}}$ can recover the underlying message M. The binding tag $B_{{\mathsf{{EC}}}}$ simultaneously acts as a commitment to the underlying header and message, with opening $K_{{\mathsf{{EC}}}}$; the validity of this commitment to a given pair (H, M) is checked by the verification algorithm ${\textsf {EVer}}$. Looking ahead, we will actually require that $B_{{\mathsf{{EC}}}}$ acts as a commitment to the opening $K_{{\mathsf{{EC}}}}$ also, in that it should be infeasible to find $K_{{\mathsf{{EC}}}}\ne K_{{\mathsf{{EC}}}}'$ which verify the same $B_{{\mathsf{{EC}}}}$.

Formally an encryptment scheme is a tuple ${\mathsf{{EC}}}= ({\textsf {EKg}}, {\textsf {EC}},{\textsf {DO}},{\textsf {EVer}})$ defined as follows. Associated to the scheme is a key space $\mathcal{K}_{{\mathsf{{EC}}}}\subseteq \varSigma ^*$, header space $\mathcal{H}_{{\mathsf{{EC}}}} \subseteq \varSigma ^*$, message space $\mathcal{M}_{{\mathsf{{EC}}}} \subseteq \varSigma ^*$, ciphertext space $\mathcal{C}_{{\mathsf{{EC}}}} \subseteq \varSigma ^*$, and binding tag space $\mathcal{T}_{{\mathsf{{EC}}}} \subseteq \varSigma ^*$.

The randomized key generation ${\textsf {EKg}}$ algorithm takes no input, and outputs a key $K_{{\mathsf{{EC}}}}\in \mathcal{K}_{{\mathsf{{EC}}}}$.
The encryptment algorithm ${\textsf {EC}}$ is a deterministic algorithm which takes as input a key $K_{{\mathsf{{EC}}}}\in \mathcal{K}_{{\mathsf{{EC}}}}$, a header $H \in \mathcal{H}_{{\mathsf{{EC}}}}$, and a message $M \in \mathcal{M}_{\mathsf{{EC}}}$, and outputs an encryptment $(C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}}) \in \mathcal{C}_{{\mathsf{{EC}}}} \times \mathcal{T}_{{\mathsf{{EC}}}}$.
The decryptment algorithm ${\textsf {DO}}$ is a deterministic algorithm which takes as input a key $K_{{\mathsf{{EC}}}}\in \mathcal{K}_{{\mathsf{{EC}}}}$, a header $H \in \mathcal{H}_{{\mathsf{{EC}}}}$, and an encryptment $(C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}}) \in \mathcal{C}_{{\mathsf{{EC}}}} \times \mathcal{T}_{{\mathsf{{EC}}}}$, and outputs a message $M\in \mathcal{M}_{{\mathsf{{EC}}}}$ or the error symbol $\perp $. We assume that if $(K_{{\mathsf{{EC}}}}, H, C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}}) \notin \mathcal{K}_{{\mathsf{{EC}}}}\times \mathcal{H}_{{\mathsf{{EC}}}} \times \mathcal{C}_{{\mathsf{{EC}}}} \times \mathcal{T}_{{\mathsf{{EC}}}}$, then $\perp \leftarrow {\textsf {DO}}(K_{{\mathsf{{EC}}}}, H, C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}})$.
The verification algorithm ${\textsf {EVer}}$ is a deterministic algorithm which takes as input a header $H \in \mathcal{H}_{{\mathsf{{EC}}}}$, a message $M \in \mathcal{M}_{{\mathsf{{EC}}}}$, a key $K_{{\mathsf{{EC}}}}\in \mathcal{K}_{{\mathsf{{EC}}}}$, and a binding tag $B_{{\mathsf{{EC}}}}\in \mathcal{T}_{{\mathsf{{EC}}}}$, and returns a bit b. We assume that if $(H, M, K_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}}) \notin \mathcal{H}_{{\mathsf{{EC}}}} \times \mathcal{M}_{{\mathsf{{EC}}}} \times \mathcal{K}_{{\mathsf{{EC}}}}\times \mathcal{T}_{{\mathsf{{EC}}}}$ then $0 \leftarrow {\textsf {EVer}}(H, M, K_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}})$.

Length regularity and compactness. We impose two requirements on the lengths of the encryptments output by encryptment schemes. First, we require compactness: that the binding tags $B_{{\mathsf{{EC}}}}$ output by an encryptment scheme are of constant length $\textsf {btlen}$ regardless of the length of the underlying message, and that $\textsf {btlen}$ is linear in the key size. Second, we require length regularity: that the length of ciphertexts $C_{{\mathsf{{EC}}}}$ depends only on the length of the underlying message. Formally, we require there exists a function $\textsf {clen}{{}:{}}{{\mathbb N}}\ \rightarrow {{\mathbb N}}$ such that for all $(H,M) \in \mathcal{H}_{{\mathsf{{EC}}}}\times \mathcal{M}_{{\mathsf{{EC}}}}$ it holds that $|C_{{\mathsf{{EC}}}}| = \textsf {clen}(|M|)$ with probability one for the sequence of algorithm executions: $K_{{\mathsf{{EC}}}}\leftarrow \$\, {\textsf {EKg}}$; $(C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}}) \leftarrow {\textsf {EC}}(K_{{\mathsf{{EC}}}}, H, M)$.

Correctness. We define two correctness notions for encryptment schemes, which we formalize via the games $\text {COR}$ and $\text {S-COR}$ shown in Fig. 5. We require that all encryptment schemes satisfy our all-in-one correctness notion, which requires that honestly generated encryptments both decrypt to the correct underlying message, and successfully verify, with probability one. Formally, we say that an encryptment scheme ${\mathsf{{EC}}}= ({\textsf {EKg}}, {\textsf {EC}}, {\textsf {DO}}, {\textsf {EVer}})$ is correct if for all header/message pairs $(H, M) \in \mathcal{H}_{{\mathsf{{EC}}}} \times \mathcal{M}_{{\mathsf{{EC}}}}$, it holds that $\Pr \left[ \, \text {COR}_{{\mathsf{{EC}}}}(H, M) \Rightarrow 1 \,\right] = 1$, where the probability is over the coins of ${\textsf {EKg}}$.

We additionally define strong correctness, which requires that for each tuple $(K_{{\mathsf{{EC}}}}, H, M) \in \mathcal{K}_{{\mathsf{{EC}}}}\times \mathcal{H}_{{\mathsf{{EC}}}} \times \mathcal{M}_{{\mathsf{{EC}}}}$ there is a unique encryptment $(C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}})$ such that $M \leftarrow {\textsf {DO}}(K_{{\mathsf{{EC}}}}, H, C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}})$. We formalize this in game $\text {S-COR}$, and say that an encryptment scheme ${\mathsf{{EC}}}= ({\textsf {EKg}}, {\textsf {EC}}, {\textsf {DO}}, {\textsf {EVer}})$ is strongly correct if for all tuples $(K_{{\mathsf{{EC}}}}, H, C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}}) \in \mathcal{K}_{{\mathsf{{EC}}}}\times \mathcal{H}_{{\mathsf{{EC}}}} \times \mathcal{C}_{{\mathsf{{EC}}}} \times \mathcal{T}_{{\mathsf{{EC}}}}$, it holds that $\Pr \left[ \, \text {S-COR}_{{\mathsf{{EC}}}}(K_{{\mathsf{{EC}}}}, H, C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}}) \Rightarrow 1 \,\right] = 1$. While we only require that encryptment schemes satisfy correctness, the schemes we build will also possess the stronger property (which simplifies their security proofs). We note that strong correctness can be added to any encryptment scheme by making ${\textsf {DO}}$ recompute a ciphertext after decrypting, and returning $\perp $ if the two do not match; however, for efficiency we target schemes which achieve strong correctness without this.
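The two correctness notions can be sketched as executable checks. The games below are reconstructed from the prose description rather than copied from Fig. 5, so they may differ from the figure in detail. The toy scheme (`toy_kg`, `toy_ec`, `toy_do`, `toy_ever`) is a hypothetical two-pass construction used only to exercise the games; it is not one of the schemes proposed in this work and makes no security claim.

```python
import hashlib
import os


def _stream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (illustrative only)."""
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:length]


def _tag(key: bytes, header: bytes, msg: bytes) -> bytes:
    # Length-prefix the header so that (header, msg) is encoded unambiguously.
    return hashlib.sha256(key + len(header).to_bytes(8, "big") + header + msg).digest()


def toy_kg() -> bytes:
    return os.urandom(32)


def toy_ec(key: bytes, header: bytes, msg: bytes):
    ctxt = bytes(a ^ b for a, b in zip(msg, _stream(key, len(msg))))
    return ctxt, _tag(key, header, msg)


def toy_do(key: bytes, header: bytes, ctxt: bytes, tag: bytes):
    msg = bytes(a ^ b for a, b in zip(ctxt, _stream(key, len(ctxt))))
    return msg if _tag(key, header, msg) == tag else None  # None stands for the error symbol


def toy_ever(header: bytes, msg: bytes, key: bytes, tag: bytes) -> bool:
    return _tag(key, header, msg) == tag


def cor_game(kg, ec, do, ever, header: bytes, msg: bytes) -> bool:
    """All-in-one correctness: an honest encryptment decrypts and verifies."""
    key = kg()
    ctxt, tag = ec(key, header, msg)
    return do(key, header, ctxt, tag) == msg and ever(header, msg, key, tag)


def s_cor_game(ec, do, key: bytes, header: bytes, ctxt: bytes, tag: bytes) -> bool:
    """Strong correctness: whatever decrypts must be the unique honest encryptment."""
    msg = do(key, header, ctxt, tag)
    if msg is None:
        return True  # nothing to check when decryption fails
    return ec(key, header, msg) == (ctxt, tag)
```

The toy decryption recomputes the tag, which is the generic way of obtaining strong correctness mentioned above, at the cost of a second pass.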

4.1 Security Goals for Encryptment

We require encryptment schemes to satisfy both one-time real-or-random ($\text {otROR}$) security, and a variant of one-time ciphertext integrity ($\text {SCU}$) which requires forging a ciphertext for a given binding tag with a known key; we motivate this variant below. The security games for both notions are shown in Fig. 6.

Confidentiality. We define $\text {otROR}$ security for an encryptment scheme ${\mathsf{{EC}}}= ({\textsf {EKg}}, {\textsf {EC}}, {\textsf {DO}}, {\textsf {EVer}})$ in terms of games $\text {otROR0}$ and $\text {otROR1}$. Each game allows an attacker ${\mathcal A}$ to make one query of the form (H, M) to his real-or-random encryption oracle; in game $\text {otROR0}$ he receives back the real encryptment $(C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}})$ encrypting the input under a secret key, and in game $\text {otROR1}$ he receives back random bit strings. For an encryptment scheme ${\mathsf{{EC}}}$ and adversary ${\mathcal A}$, we define the $\text {otROR}$ advantage of ${\mathcal A}$ against ${\mathsf{{EC}}}$ as

$$\begin{aligned} \mathbf {Adv}^{{\mathrm{{ot}}{\text {-}}\mathrm{{ror}}}}_{{\mathsf{{EC}}}}({\mathcal A}) = \bigg |\Pr \left[ \, \text {otROR0}^{\mathcal A}_{{\mathsf{{EC}}}}\Rightarrow 1 \,\right] - \Pr \left[ \, \text {otROR1}^{\mathcal A}_{{\mathsf{{EC}}}}\Rightarrow 1 \,\right] \bigg |\; , \end{aligned}$$

where the probability is over the coins of ${\textsf {EKg}}$ and ${\mathcal A}$.

Second-ciphertext unforgeability. We also ask that encryptment schemes meet an unforgeability goal that we call second-ciphertext unforgeability ($\text {SCU}$). In this game, the attacker first learns an encryptment $(C_{{\mathsf{{EC}}}},B_{{\mathsf{{EC}}}})$ corresponding to a chosen header/message pair (H, M) under key $K_{{\mathsf{{EC}}}}$. We then require that the attacker shouldn’t be able to find a distinct header and ciphertext pair $(H', C_{{\mathsf{{EC}}}}')\ne (H, C_{{\mathsf{{EC}}}})$ such that ${\textsf {DO}}(K_{{\mathsf{{EC}}}}, H', C_{{\mathsf{{EC}}}}',B_{{\mathsf{{EC}}}})$ does not return an error. This should hold even if the attacker knows $K_{{\mathsf{{EC}}}}$. Looking ahead, this is a necessary and sufficient condition needed from encryptment when using it to build ccAEAD schemes from fixed domain authenticated encryption.

Formally, the game $\text {SCU}$ is shown in Fig. 6. To an encryptment scheme ${\mathsf{{EC}}}$ and adversary ${\mathcal A}$, we define the second-ciphertext unforgeability ($\text {SCU}$) advantage to be $\mathbf {Adv}^{\mathrm {scu}}_{{\mathsf{{EC}}}}({\mathcal A}) = \Pr \left[ \, \text {SCU}^{{\mathcal A}}_{{\mathsf{{EC}}}}\Rightarrow \mathsf {true} \,\right] $, where the probability is again over the coins of ${\textsf {EKg}}$ and ${\mathcal A}$.

Binding security. We finally require that encryptment schemes satisfy certain binding notions. We start by generalizing the receiver binding notion ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ for ccAEAD schemes from [17], and adapting the syntax to the encryptment setting. ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ security requires that no computationally efficient adversary can find two key, header, message triples $(K_{{\mathsf{{EC}}}},H,M)$ and $(K_{{\mathsf{{EC}}}}',H',M')$ and a binding tag $B_{{\mathsf{{EC}}}}$ such that $(H,M) \ne (H',M')$ and ${\textsf {EVer}}(H,M,K_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}}) = {\textsf {EVer}}(H',M',K_{{\mathsf{{EC}}}}', B_{{\mathsf{{EC}}}}) = 1$. A simple strengthening of this notion, which we denote $\mathrm{{sr}}{\text {-}}\mathrm{{BIND}}$ (for strong receiver binding), allows the adversary to instead win if $(H,M, K_{{\mathsf{{EC}}}}) \ne (H',M', K_{{\mathsf{{EC}}}}')$. The pseudocode game $\mathrm{{sr}}{\text {-}}\mathrm{{BIND}}$ is shown in Fig. 6, where we define the $\mathrm{{sr}}{\text {-}}\mathrm{{BIND}}$ advantage of an adversary ${\mathcal A}$ against ${\mathsf{{EC}}}$ as $\mathbf {Adv}^{\mathrm{{sr}}{\text {-}}\mathrm{{bind}}}_{{\mathsf{{EC}}}}({\mathcal A}) = \Pr \left[ \, \mathrm{{sr}}{\text {-}}\mathrm{{BIND}}_{{\mathsf{{EC}}}}^{{\mathcal A}}\Rightarrow \mathsf {true} \,\right] $. The corresponding game and advantage term for ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ security are defined analogously. The stronger receiver binding notion implies the prior notion, and indeed is strictly stronger. We defer the details to the full version. For our purposes, it will simplify our negative results about rate-1 blockcipher-based encryptment.
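The difference between the two winning conditions is small enough to state as code. The predicates below follow the prose definitions above; `keyless_ever` is a hypothetical verifier, introduced here purely for illustration, whose tag ignores the key. It hints at why the strong notion can be strictly stronger: two different keys trivially open the same tag, yet winning the weaker game would still require a SHA-256 collision.

```python
import hashlib


def r_bind_win(ever, t1, t2, tag) -> bool:
    """r-BIND: both triples verify under `tag` and the (H, M) pairs differ."""
    (k1, h1, m1), (k2, h2, m2) = t1, t2
    return (h1, m1) != (h2, m2) and ever(h1, m1, k1, tag) and ever(h2, m2, k2, tag)


def sr_bind_win(ever, t1, t2, tag) -> bool:
    """sr-BIND: both triples verify under `tag` and the (H, M, K) triples differ."""
    (k1, h1, m1), (k2, h2, m2) = t1, t2
    return (h1, m1, k1) != (h2, m2, k2) and ever(h1, m1, k1, tag) and ever(h2, m2, k2, tag)


def keyless_tag(header: bytes, msg: bytes) -> bytes:
    """Hypothetical tag that commits to (H, M) but not to the key."""
    return hashlib.sha256(len(header).to_bytes(8, "big") + header + msg).digest()


def keyless_ever(header: bytes, msg: bytes, key: bytes, tag: bytes) -> bool:
    return keyless_tag(header, msg) == tag
```

With `t1 = (b"k1", b"h", b"m")` and `t2 = (b"k2", b"h", b"m")`, the pair wins the strong game against `keyless_ever` but not the original one.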

We additionally define the notion of sender binding. It ensures that a sender must itself commit to the message underlying an encryptment, by requiring that it is infeasible to find an encryptment which decrypts correctly but for which verification fails. Without this requirement, a malicious sender may be able to send an abusive message to a receiver with a faulty commitment such that a receiver is unable to report it. We define sender binding security formally via the game $\text {s-BIND}$ in Fig. 6. We define the $\text {s-BIND}$ advantage of an adversary ${\mathcal A}$ against an encryptment scheme ${\mathsf{{EC}}}$ as $\mathbf {Adv}^{\mathrm{{s}}{\text {-}}\mathrm{{bind}}}_{{\mathsf{{EC}}}}({\mathcal A}) = \Pr \left[ \, \text {s-BIND}_{{\mathsf{{EC}}}}^{{\mathcal A}}\Rightarrow \mathsf {true} \,\right] $.

Binding notions and the Facebook attack. Looking ahead, the analogous strong receiver binding notion for ccAEAD schemes is the property that would have prevented the Facebook attack, had they used a scheme that enjoyed it. This is because receiver binding implies that it is computationally intractable for an attacker to find two distinct keys that verify the same binding tag. In the Facebook attack, the sender was able to exploit this weakness to violate a security property similar to GLR’s sender binding notion [17], which ensures decryption can only succeed if the binding tag commits to the underlying plaintext. Canonically, however, receiver binding is modeling the ability of a malicious receiver to frame the sender as having sent a message they did not, in fact, send. Such an attack doesn’t work against Facebook’s attachment franking scheme because the encryption of the AES-GCM key enjoys receiver binding, and prevents the recipient from forging an abuse report for an image that wasn’t sent.

Relation to ccAEAD. Given the simpler security properties expected of them, building highly efficient secure encryptment schemes is a more straightforward task than constructing a ccAEAD scheme directly. However, as we shall see, encryptment isolates the core complexity of building ccAEAD schemes with multi-opening security. In particular, in Sect. 7.3 we give a generic transform which allows one to build a multi-opening secure ccAEAD scheme from a secure encryptment scheme and a secure AEAD scheme. Armed with this transform, in Sect. 6 we show how to construct a secure encryptment scheme from cryptographic hash functions. Together, our results will yield the first single-pass, single-primitive constructions of ccAEAD.

Binding and correctness imply ciphertext integrity. One reason we have introduced encryptment as a standalone primitive (instead of directly working with the ccAEAD formulation from GLR) is that it simplifies security analyses. One useful tool towards this is the following lemma, which states that for any encryptment scheme ${\mathsf{{EC}}}$ that enjoys strong correctness, the combination of ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ and $\text {s-BIND}$ security suffices to prove $\text {SCU}$ security.

Lemma 1

Let ${\mathsf{{EC}}}= ({\textsf {EKg}}, {\textsf {EC}}, {\textsf {DO}}, {\textsf {EVer}}) $ be a strongly correct encryptment scheme, and consider an attacker ${\mathcal A}$ in the $\text {SCU} $ game against ${\mathsf{{EC}}}$. Then there exist attackers ${\mathcal B}$ and ${\mathcal C}$ such that $\mathbf {Adv}^{\mathrm {scu}}_{{\mathsf{{EC}}}}({\mathcal A}) \le \mathbf {Adv}^{\mathrm{{s}}{\text {-}}\mathrm{{bind}}}_{{\mathsf{{EC}}}}({\mathcal B}) + \mathbf {Adv}^{{\mathrm{{r}}{\text {-}}\mathrm{{bind}}}}_{{\mathsf{{EC}}}}({\mathcal C})$, and moreover ${\mathcal B}$ and ${\mathcal C}$ both run in the same time as ${\mathcal A}$.

We give a proof sketch and defer details to the full version. Let $((C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}}), K_{{\mathsf{{EC}}}})$ be the tuple corresponding to ${\mathcal A}$’s single encryption query (H, M) in the $\text {SCU}$ game, and suppose that ${\mathcal A}$ subsequently wins the game with decryption oracle query $(H', C_{{\mathsf{{EC}}}}')$, meaning that ${\textsf {DO}}(K_{{\mathsf{{EC}}}}, H', C_{{\mathsf{{EC}}}}', B_{{\mathsf{{EC}}}})= M'\ne \perp $ and $(H', C_{{\mathsf{{EC}}}}') \ne (H, C_{{\mathsf{{EC}}}})$. The proof first argues that if the scheme is $\text {s-BIND}$-secure, then any ciphertext which decrypts correctly must also verify correctly. As such, it follows that if $(H, M)\ne (H', M')$ for the winning query, then this can be used to construct a winning tuple for an attacker in the ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ game against ${\mathsf{{EC}}}$; we bound the probability that this occurs with a reduction to ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ security. On the other hand, if $(H, M) = (H', M')$, then it must be the case that $C_{{\mathsf{{EC}}}}\ne C_{{\mathsf{{EC}}}}'$ — but this in turn implies that we have found two distinct encryptments which decrypt to the same header and message under $K_{{\mathsf{{EC}}}}$, violating strong correctness.

A simple encryptment construction. It is straightforward to construct an encryptment scheme by composing a secure encryption scheme and a commitment scheme. One can just use a simple adaptation of the CtE2 ccAEAD scheme from [17]. We defer the details to the full version. But such generic compositions are inherently two pass and we seek faster schemes.

5 On Efficient Fixed-Key Blockcipher-Based Encryptment

We are interested in building encryptment schemes — and ultimately, more complex primitives such as ccAEAD schemes — from just a blockcipher used on a small number of keys and other primitive arithmetic operations (XOR, finite field arithmetic, etc.). Beyond being an interesting theoretical question, there is the practical motivation that the current fastest AEAD schemes, such as OCB [32], fall into this category.

As a simple motivating example illustrating the challenging nature of this task, we note that OCB does not satisfy ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ security (see Sect. 4) when reframed as an encryptment scheme in the natural way. The high level reason for this (modulo a number of details), is that in OCB the binding tag is computed as a function over the XOR of the message blocks. As such, it is straightforward to construct two distinct messages such that the blocks XOR to the same value (and thus produce the same binding tag), thereby violating ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ security. Full details of the scheme and attack are given in the full version.

For the remainder of this section, we formally define high-rate encryptment schemes, and show how prior results on the impossibility of high-rate CR functions can be used to rule out high-rate encryptment schemes as well.

A connection between hashing and encryptment. Towards showing negative results, we must first define more carefully what we mean by the rate of encryptment schemes. We are inspired by (and will later exploit connections to) the definitions of rate from the blockcipher-based hash function literature [9, 34, 35]. Consider a compression function $\mathcal {H}:\{0,1\}^{mn}\rightarrow \{0,1\}^{rn}$ for $m>r\ge 1$ and $n \ge 1$, which uses $k \ge 1$ calls of a blockcipher $E{{}:{}}\{0,1\}^\kappa \times \{0,1\}^n\rightarrow \{0,1\}^n$ ($m,r,n,k, \kappa \in \mathbb {N}$). Then following [35], we may write $\mathcal {H}$ as shown in Fig. 7, where we let $K_1,\ldots ,K_k$ be any fixed strings^{Footnote 2}, and $f_i{{}:{}}\{0,1\}^{(m+(i-1))n}\rightarrow \{0,1\}^n$ ($i = 1, \dots , k$), $g{{}:{}}\{0,1\}^{(m+k)n}\rightarrow \{0,1\}^{rn}$ are functions.

The rate of $\mathcal {H}$ is defined to be $m{\slash }k$; so a rate-$\frac{1}{\beta }$ function $\mathcal {H}$ makes $\beta $ blockcipher calls per n-bits of input. For example, a rate-1 $\mathcal {H}$ would achieve a single blockcipher call per n-bit block of input. A consequence of the more general results of [35] (see below) is that they rule out rate-1 functions achieving security past $2^{n/4}$ queries to E by an adversary, when modeling E as an ideal cipher. We would like to exploit their negative results to similarly rule out rate-1 encryptment schemes.

We now focus attention on encryptment schemes that fall into a certain form. Consider an encryptment scheme ${\mathsf{{EC}}}= ({\textsf {EKg}}, {\textsf {EC}}, {\textsf {DO}}, {\textsf {EVer}})$. Because ${\textsf {EC}}$ is deterministic, we can view computing the binding tag as a function $F(K_{{\mathsf{{EC}}}},H,M)$ defined by computing $(C_{{\mathsf{{EC}}}},B_{{\mathsf{{EC}}}}) = {\textsf {EC}}(K_{{\mathsf{{EC}}}},H,M)$ and outputting $B_{{\mathsf{{EC}}}}$. The verification algorithm ${\textsf {EVer}}(H,M,K_{{\mathsf{{EC}}}},B_{{\mathsf{{EC}}}})$ checks that $F(K_{{\mathsf{{EC}}}},H,M) = B_{{\mathsf{{EC}}}}$. (One can generalize this definition by allowing ${\textsf {EC}}$ and ${\textsf {EVer}}$ to use different functions F,$F'$ to compute the binding tag; the lower bounds given in this section on the rate of such functions readily extend to this case also.)

With this in place, we can define the rate of verification for encryptment analogously to defining the rate of a hash function $\mathcal {H}$, by saying that an encryptment scheme has rate-$\frac{1}{\beta }$ if the associated function F makes $\beta $ blockcipher calls per n-bits of header and message data (or equivalently, can process (H, M) of combined length mn-bits using $\beta m$ blockcipher calls).

Now we can give a generic, essentially syntactic, transform from an encryptment scheme to a hash function. For an encryptment scheme ${\mathsf{{EC}}}$, let F be the associated binding tag computation function as per above. Let $\mathcal {H}{{}:{}}\{0,1\}^*\rightarrow \{0,1\}^n$ be the function defined as $\mathcal {H}(X) = F(K_{{\mathsf{{EC}}}},\varepsilon ,X)$ for $K_{{\mathsf{{EC}}}}$ an arbitrary, fixed bit string. (Here we take $H = \varepsilon $, so that the number of block cipher calls required to compute F is solely determined by the length of the input X). The following is simple to prove.

Theorem 1

Let ${\mathsf{{EC}}}$ be an encryptment scheme with binding codes, and let $\mathcal {H}$ be defined as in the previous paragraph. For any collision-resistance adversary ${\mathcal A}$, we give an ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ adversary ${\mathcal B}$ so that $\mathbf {Adv}^{\mathrm {cr}}_{\mathcal {H}}({\mathcal A}) \le \mathbf {Adv}^{{\mathrm{{r}}{\text {-}}\mathrm{{bind}}}}_{{\mathsf{{EC}}}}({\mathcal B})$. The adversary ${\mathcal B}$ runs in the same amount of time as ${\mathcal A}$.
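The reduction behind Theorem 1 is essentially syntactic and can be sketched in a few lines. In the sketch below, `toy_F` is a hypothetical stand-in for the binding-tag function $F$ of some encryptment scheme and `FIXED_KEY` is an arbitrary fixed key; neither is taken from the paper. Any collision $X_1 \ne X_2$ in the derived hash yields two triples with the same key, the empty header, and distinct messages that open the same tag.

```python
import hashlib

FIXED_KEY = bytes(32)  # arbitrary fixed bit string (assumption: any value works)
EMPTY_HEADER = b""     # H is taken to be the empty string


def toy_F(key: bytes, header: bytes, msg: bytes) -> bytes:
    """Hypothetical binding-tag function F(K, H, M), used only for illustration."""
    return hashlib.sha256(key + len(header).to_bytes(8, "big") + header + msg).digest()


def derived_hash(x: bytes) -> bytes:
    """The hash of the transform: H(X) = F(K, empty header, X) for the fixed key."""
    return toy_F(FIXED_KEY, EMPTY_HEADER, x)


def r_bind_tuple_from_collision(x1: bytes, x2: bytes):
    """Turn a claimed collision (x1, x2) of derived_hash into an r-BIND output.

    If derived_hash(x1) == derived_hash(x2) and x1 != x2, both triples verify
    under the returned tag and their (H, M) pairs differ, so the adversary wins.
    """
    tag = derived_hash(x1)
    return (FIXED_KEY, EMPTY_HEADER, x1), (FIXED_KEY, EMPTY_HEADER, x2), tag
```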

Theorem 1 allows us to apply known negative results about efficient CR-hashing. For example, we have the following corollary of Theorem 1 and [35, Theorem 1]:

Corollary 1

Fix $m > r \ge 1$ and $n > 0$ ($m, r, n \in \mathbb {N}$). Let $N = 2^n$. Let ${\mathsf{{EC}}}$ be an encryptment scheme with ideal-cipher-based binding codes of length rn and that has message space including strings of length mn. Then there is a runnable adversary ${\mathcal A}$ making $q = k(N^{1-(m-r)/k} +1)$ ideal cipher queries and achieving $\mathbf {Adv}^{{\mathrm{{r}}{\text {-}}\mathrm{{bind}}}}_{{\mathsf{{EC}}}}({\mathcal A}) = 1$, where $k \in \mathbb {N}$ denotes the number of permutation calls required to compute the binding code for an mn-bit input.

This immediately rules out security of rate-1 schemes that achieve the efficiency of OCB, i.e., having $k = m$, m arbitrarily large, and $r = 1$. Consider the minimal case that $m = 2$ (two block messages); then substituting $k = m = 2$ and $r = 1$ into the bound of Corollary 1 shows that ${\mathcal A}$ only requires $q = 2(2^{n/2} + 1)$ queries to succeed. Stronger results ruling out rate-$\frac{1}{2}$ verification can be similarly lifted from [35, Theorem 2] under some technical conditions about the verification function and the adversary. The results above were cast in terms of ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$ security, but extend to $\mathrm{{sr}}{\text {-}}\mathrm{{BIND}}$ security because the latter implies the former.
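As a worked instance of the query count in Corollary 1, take the rate-1 setting $k = m$ with $r = 1$, so that the exponent becomes $1 - (m-1)/m = 1/m$ and, with $N = 2^n$,

$$\begin{aligned} q = k\left( N^{1-(m-r)/k} + 1\right) = m\left( 2^{n/m} + 1\right) . \end{aligned}$$

For $m = 2$ this is $2(2^{n/2}+1)$, and the count shrinks quickly as messages get longer: for example, with $n = 128$ and $m = 8$ it is $8(2^{16}+1)$ queries.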

Ultimately these negative results indicate that for an ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$-secure encryptment scheme, the best we can hope for is either a rate-$\frac{1}{3}$ construction with a small set of keys, or to allow rekeying with each block of message. We therefore turn to building constructions that are as efficient as possible.

In Sect. 7, we will describe how the existence of an ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$-secure ccAEAD scheme of a given rate implies the existence of a given ${\mathrm{{r}}{\text {-}}\mathrm{{BIND}}}$-secure encryptment scheme of the same rate, and so the results of this section exclude the existence of rate-1 or rate-$\frac{1}{2}$ ccAEAD schemes also.

6 Encryptment from Hashing

In this section, we turn our attention to building secure and efficient encryptment schemes. As we shall see in Sect. 7, these can be lifted to multi-opening, many-time secure ccAEAD via simple and efficient transforms.

As one might expect given the close relationship between binding and CR hashing discussed previously in Sect. 5, our starting point will be cryptographic hashing. A slightly simplified version of the construction is shown in Fig. 8 (padding details are omitted), where $\mathsf {f}$ is a compression function. In summary, the scheme hashes the key, associated data and message data (the latter two of which are repeatedly XOR’d with the key). Intermediate chaining variables from the hash computation are used as pads to encrypt the message data, while the final chaining variable constitutes the binding tag.

Intuitively, (strong) receiver binding derives from the collision resistance of the underlying hash function. We XOR the key into all the associated data and message blocks to ensure that every application of the compression function is keyed. This is critical; just prepending (or both prepending and appending) the key to the data leads to a scheme whose confidentiality is easily broken. Likewise one cannot dispense with the additional initial block that simply processes the key, otherwise the encoding of the key, associated data, and message would not be injective and binding attacks result.

Some notation. Before defining the full scheme, we first give some additional notation which will simplify the presentation. The algorithm $\mathrm {Parse}_d$ is used to partition a string into d-bit blocks. Formally, we define $\mathrm {Parse}_d$ to be the algorithm which on input X outputs $(X_1, \dots , X_\ell )$ such that $|X_i| =d$ for $1 \le i \le \ell - 1$ and $|X_\ell | = |X| \mod d$. For correctness, we require that $X = X_1\, \Vert \,\dots \, \Vert \,X_\ell $. Similarly, we define $\mathrm {Trunc}_r$ to be the algorithm which on input X outputs the r leftmost bits of X. We write $\langle y\rangle _{64}$ to be the encoding of y as a 64-bit string.

Our scheme utilizes a padding scheme $\textsf {PadS}\,{=}\, (\mathrm {PadH}, \mathrm {PadM}, \mathrm {PadSuf}, \mathrm {Pad})$. The padding scheme is parameterized by a pair of numbers d, n, but we omit these in the notation for simplicity. We assume $d \ge n \ge 128$. The algorithms $\mathrm {PadH}, \mathrm {PadM}$, and $\mathrm {PadSuf}$ are shown in Fig. 9. Notice that for all header and message pairs (H, M), it holds that if $|M| \bmod n = r$, then $r + |\mathrm {PadSuf}(|H|,|M|)|$ will be equal to either d or 2d. The full padding function is then defined to be $\mathrm {Pad}(H,M) = {\mathrm {PadH}(H)} \, \Vert \,{\mathrm {PadM}(M)}\, \Vert \,\mathrm {PadSuf}(|H|,|M|)$. Note that $|\mathrm {Pad}(H, M)|$ is a multiple of d and that the function $\mathrm {Pad}(H, M)$ is injective, i.e., for all pairs $(H, M), (H', M')$, $\mathrm {Pad}(H,M) = \mathrm {Pad}(H',M')$ only if $(H, M) = (H', M')$.

Next we define iterated functions. Let $\mathsf {f}{{}:{}}\{0,1\}^n\times \{0,1\}^d\rightarrow \{0,1\}^n$ be a function for some $d \ge n \ge 128$, let $D^+ = \cup _{i\ge 1} \{0,1\}^{id}$ and let $V_0 \in \{0,1\}^n$. Then $\mathsf {f}^+{{}:{}}\{0,1\}^n\times D^+\rightarrow \{0,1\}^n$ denotes the iteration of $\mathsf {f}$, where $\mathsf {f}^+(V_0,X_1\, \Vert \,\cdots \, \Vert \,X_m) = V_m$ is computed via $V_i = \mathsf {f}(V_{i-1},X_i)$ for $1 \le i \le m$.
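The iteration $\mathsf {f}^+$ is a plain left fold over the $d$-bit blocks. A minimal sketch follows, working in bytes rather than bits and using SHA-256 over the concatenation of its two inputs as a stand-in for $\mathsf {f}$; that stand-in and the parameter choices `N = 32`, `D = 64` are assumptions made for illustration, not the instantiation analyzed in this work.

```python
import hashlib

N = 32  # chaining-variable length in bytes (n bits in the text)
D = 64  # block length in bytes (d bits in the text)


def f(v: bytes, x: bytes) -> bytes:
    """Stand-in compression function from N + D bytes to N bytes (assumption)."""
    assert len(v) == N and len(x) == D
    return hashlib.sha256(v + x).digest()


def f_plus(v0: bytes, data: bytes) -> bytes:
    """Iterate f over the D-byte blocks of `data`, starting from v0."""
    if len(data) == 0 or len(data) % D != 0:
        raise ValueError("input length must be a positive multiple of D bytes")
    v = v0
    for i in range(0, len(data), D):
        v = f(v, data[i:i + D])  # V_i = f(V_{i-1}, X_i)
    return v
```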

The HFC encryptment scheme. The hash-function-chaining encryptment scheme $\mathsf{{HFC}}= (\mathsf{{HFC}}{\textsf {Kg}},\mathsf{{HFC}}{\textsf {Enc}},\mathsf{{HFC}}{\textsf {Dec}},\mathsf{{HFC}}{\textsf {Ver}})$ is based on a compression function $\mathsf {f}{{}:{}}\{0,1\}^n \times \{0,1\}^d \rightarrow \{0,1\}^n$. The pseudocode for the encryptment and decryptment algorithms is presented in Fig. 10.

Key generation $\mathsf{{HFC}}{\textsf {Kg}}$ simply chooses a key $K_{{\mathsf{{EC}}}}$ uniformly at random from $\{0,1\}^d$. Encryptment first pads the header and message using the padding functions $\mathrm {PadH}$ and $\mathrm {PadM}$ respectively. We let $IV\in \{0,1\}^n$ be a fixed constant value (also called an initialization vector). The scheme computes an initial chaining variable as $V_0 = \mathsf {f}(IV, K_{{\mathsf{{EC}}}})$. It then hashes ${\mathrm {PadH}(H)}\, \Vert \,{\mathrm {PadM}(M)} \, \Vert \,\mathrm {PadSuf}(|H|, |M|)$ with $\mathsf {f}^+$, the iteration of the compression function $\mathsf {f}$, where the secret encryptment key $K_{{\mathsf{{EC}}}}$ is XORed into each d-bit block prior to hashing. The final chaining variable produced by this process forms the binding tag $B_{{\mathsf{{EC}}}}$. Notice that while the compression function takes d-bit inputs, the way in which the message data is padded means we only process n bits of message in each compression function call. We will see that the collision resistance of the iterated hash function when instantiated with an appropriate compression function implies the $\mathrm{{sr}}{\text {-}}\mathrm{{BIND}}$ security of the construction.

Rather than running a separate encryption algorithm alongside this process to encrypt the message, we instead generate ciphertext blocks by XORing the message blocks $M_i$ with intermediate chaining variables, yielding $C_{i} = V_{h + i - 1} \oplus M_i$ for $1 \le i \le m$ where h denotes the number of header blocks. Recall that in our notation $X \oplus Y$ silently truncates the longer string to the length of the shorter string, and so only the n bits of message data in each d-bit padded message block are XORed with the n-bit chaining variable; similarly, if message M is such that $|M| \bmod n = r$, then the final ciphertext block produced by this process is truncated to the leftmost r bits. The properties of the compression function ensure that the chaining variables are pseudorandom, thus yielding the required $\text {otROR}$ security. By ‘reusing’ chaining variables as random pads we can achieve encryptment with no additional overhead over just computing the binding tag, a significant efficiency saving (see further discussion below).

Decryption ${\textsf {DO}}(K_{{\mathsf{{EC}}}}, H, C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}})$ begins by padding H into d-bit blocks via $\mathrm {PadH}(H)$ and parsing $C_{{\mathsf{{EC}}}}$ into n-bit blocks. The algorithm computes the initial chaining variable as $V_0 = \mathsf {f}(IV, K_{{\mathsf{{EC}}}})$, then hashes the padded header as in encryption. The scheme then recovers the first message block $M_1$ by XORing the chaining variable into the first ciphertext block $C_1$. This is then used to compute the next chaining variable via application of $\mathsf {f}$, and so on. Notice how at most n bits of message data are recovered in each such step; this is why we must process only n bits of message data in each compression function call, else the decryptor would be unable to compute the next chaining variable. Finally, ${\textsf {DO}}$ recomputes and verifies the binding tag, returning the message only if verification succeeds.

The verification algorithm (not shown), on input $(K_{{\mathsf{{EC}}}}, H, M, B_{{\mathsf{{EC}}}})$, pads the message to $\mathrm {PadH}(H) \, \Vert \,\mathrm {PadM}(M) \, \Vert \,\mathrm {PadSuf}(|H|,|M|)$, XORs $K_{{\mathsf{{EC}}}}$ into every block, and hashes the resulting string with $\mathsf {f}^+$ with initial chaining variable $V_0 = \mathsf {f}(IV, K_{{\mathsf{{EC}}}})$, checking that the output matches the binding tag $B_{{\mathsf{{EC}}}}$.
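The encryptment, decryptment, and verification steps described above can be sketched as follows. This is a toy illustration under several assumptions, not the scheme of Fig. 10: it fixes $d = n = 256$ bits, works on bytes rather than bits, uses the full SHA-256 hash as a stand-in for the compression function $\mathsf {f}$, and replaces the padding scheme of Fig. 9 with a hypothetical one (zero-pad the header and message to block boundaries, then append one block holding 64-bit encodings of both lengths). All function names are invented for the example.

```python
import hashlib
import hmac

N = 32          # block size in bytes; toy setting d = n = 256 bits
IV = bytes(N)   # fixed public constant (arbitrary choice for this sketch)


def f(v: bytes, x: bytes) -> bytes:
    """Stand-in compression function (NOT the raw SHA-256 compression function)."""
    return hashlib.sha256(v + x).digest()


def xor(a: bytes, b: bytes) -> bytes:
    """XOR that silently truncates to the shorter input, as in the text."""
    return bytes(p ^ q for p, q in zip(a, b))


def blocks(data: bytes) -> list:
    """Split into N-byte blocks, zero-padding the last one (hypothetical padding)."""
    return [data[i:i + N].ljust(N, b"\x00") for i in range(0, len(data), N)]


def suffix(hlen: int, mlen: int) -> bytes:
    """Hypothetical length suffix: 64-bit encodings of |H| and |M| in one block."""
    return (hlen.to_bytes(8, "big") + mlen.to_bytes(8, "big")).ljust(N, b"\x00")


def absorb_header(key: bytes, header: bytes) -> bytes:
    """Compute V_0 = f(IV, K) and chain through the key-masked header blocks."""
    if len(key) != N:
        raise ValueError("key must be exactly one block long")
    v = f(IV, key)
    for hb in blocks(header):
        v = f(v, xor(key, hb))
    return v


def hfc_enc(key: bytes, header: bytes, msg: bytes):
    v = absorb_header(key, header)
    ct = b""
    for mb in blocks(msg):
        ct += xor(v, mb)            # C_i = V_{h+i-1} xor M_i
        v = f(v, xor(key, mb))      # next chaining variable
    ct = ct[:len(msg)]              # truncate the final ciphertext block
    tag = f(v, xor(key, suffix(len(header), len(msg))))
    return ct, tag


def hfc_dec(key: bytes, header: bytes, ct: bytes, tag: bytes):
    v = absorb_header(key, header)
    msg = b""
    for i in range(0, len(ct), N):
        m_i = xor(v, ct[i:i + N])   # recover at most n bits per step
        msg += m_i
        v = f(v, xor(key, m_i.ljust(N, b"\x00")))
    expected = f(v, xor(key, suffix(len(header), len(msg))))
    return msg if hmac.compare_digest(expected, tag) else None


def hfc_ver(key: bytes, header: bytes, msg: bytes, tag: bytes) -> bool:
    _, expected = hfc_enc(key, header, msg)
    return hmac.compare_digest(expected, tag)
```

Each message block costs exactly one call to `f`, which produces both the next pad and the next chaining variable; this mirrors the observation that encryptment adds no overhead over computing the binding tag alone.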

Our padding scheme is a variant of MD strengthening. We will not rely on the strengthening for its traditional purpose of forming a suffix-free padding scheme; we use strengthening only for injectivity and will assume more of $\mathsf {f}$.

Efficiency. The efficiency of the scheme (in terms of throughput) depends on the parameters d, n, where recall that $\mathsf {f}{{}:{}}\{0,1\}^n \times \{0,1\}^d \rightarrow \{0,1\}^n$. As discussed previously, at most n-bits of message data can be processed in each compression function call. As such, the HFC encryptment scheme achieves optimal throughput when $d = n$. In this case no padding is applied to the message blocks, and so computing the full encryptment incurs no overhead over simply computing the binding tag. If $d > n$, then some throughput is lost due to the padding. In the full version we present an alternative padding scheme for this case, which recovers some throughput by padding message blocks with header data.

6.1 Analyzing the $\mathsf{{HFC}}$ Encryptment Scheme

In this section, we analyze the security of the $\mathsf{{HFC}}$ encryptment scheme, relative to the security goals detailed in Sect. 4. We also discuss some of the options for instantiating the compression function $\mathsf {f}$.

Strong receiver binding. We begin by proving that the $\mathsf{{HFC}}$ encryptment scheme satisfies strong receiver binding. Observe that the binding tag computation performed by $\mathsf{{HFC}}{\textsf {Enc}}$ on input tuple $(K_{{\mathsf{{EC}}}}, H, M)$ is equivalent to XORing $K_{{\mathsf{{EC}}}}$ into each d-bit block of $0^d \, \Vert \,\mathrm {Pad}(H, M)$ (we refer to this as ‘encoding’ the tuple), and hashing the resulting string with $\mathsf {f}^+$. Moreover, it is straightforward to verify that the injectivity of $\mathrm {Pad}$ implies that the encoding map is injective also. So any tuple breaking the $\mathrm{{sr}}{\text {-}}\mathrm{{BIND}}$ security of $\mathsf{{HFC}}$ is a collision against $\mathsf {f}^+$.

A well-known folklore result (see [2]) gives that $\mathsf {f}^+$ is collision-resistant provided the underlying compression function is collision-resistant, and that it is hard to find an input which hashes to the $IV$. Standard compression functions satisfy both properties. The full proof of the following is given in the full version. The conditions on d, n below are due to the padding scheme and can be relaxed.

Theorem 2

Let $\mathsf{{HFC}}$ be as shown in Fig. 10, using compression function $\mathsf {f}{{}:{}}\{0,1\}^{n}\times \{0,1\}^{d}\rightarrow \{0,1\}^{n}$ where $d \ge n \ge 128$. Then for any adversary ${\mathcal A}$ in the $\mathrm{{sr}}{\text {-}}\mathrm{{BIND}}$ game against $\mathsf{{HFC}}$, there exists an adversary ${\mathcal B}$ such that $\mathbf {Adv}^{\mathrm{{sr}}{\text {-}}\mathrm{{bind}}}_{\mathsf{{HFC}}}({\mathcal A}) \le \mathbf {Adv}^{\mathrm {cr}}_{\mathsf {f}^+}({\mathcal B})$, where adversary ${\mathcal B}$ runs in the same time as ${\mathcal A}$.

Sender binding and correctness. The $\text {s-BIND}$ security of $\mathsf{{HFC}}$ is immediate because decryption verifies the binding tag. Similarly, it is straightforward to verify that the scheme is strongly correct. Therefore Lemma 1 allows us to bound the $\text {SCU}$ security of $\mathsf{{HFC}}$ as an immediate consequence of these observations coupled with Theorem 2.

One-time confidentiality. All that remains is to bound the $\text {otROR}$ security of $\mathsf{{HFC}}$. We do this in the next theorem, by reducing $\text {otROR}$ security of $\mathsf{{HFC}}$ to the related-key attack (RKA) PRF security [3] of $\mathsf {f}$ for a specific class of related-key deriving functions.

Let $F{{}:{}}\{0,1\}^{n}\times \{0,1\}^{d}\rightarrow \{0,1\}^{n}$ be a function, and consider the games $\text {RKA-PRF0}$ and $\text {RKA-PRF1}$. In both games a key $K_{\text {prf}}$ is chosen uniformly at random from $\{0,1\}^{d}$. The attacker is given access to an oracle to which he may submit queries of the form $(X,Y) \in \{0,1\}^{n}\times \{0,1\}^{d}$. In game $\text {RKA-PRF0}$, the oracle returns $F(X, Y \oplus K_{\text {prf}})$. In game $\text {RKA-PRF1}$, the oracle returns a random bit string for each query, answering consistently if $(X,Y\oplus K_{\text {prf}})$ collides with a previous query. The linear-only RKA-PRF advantage of an adversary ${\mathcal A}$ is defined as

$$\begin{aligned} \mathbf {Adv}^{\oplus \text {-}\mathrm{{prf}}}_{F}({\mathcal A}) = \left| \Pr \left[ \, \text {RKA-PRF0}^{\mathcal A}_{F}\Rightarrow 1 \,\right] - \Pr \left[ \, \text {RKA-PRF1}^{\mathcal A}\Rightarrow 1 \,\right] \right| \;, \end{aligned}$$

where the probabilities are over the coins used in the games.
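The two games can be sketched as a single oracle class with a flag selecting the real or random world. The class name, the byte-level sizes, and the use of the full SHA-256 hash as a stand-in for $F$ are assumptions made for illustration only.

```python
import hashlib
import os

N_BYTES = 32  # n = 256 bits (assumed)
D_BYTES = 64  # d = 512 bits (assumed)


def F(x: bytes, y: bytes) -> bytes:
    """Stand-in for the function F under analysis."""
    return hashlib.sha256(x + y).digest()


class RkaPrfGame:
    """real=True plays RKA-PRF0; real=False plays RKA-PRF1."""

    def __init__(self, real: bool):
        self.real = real
        self.k_prf = os.urandom(D_BYTES)  # key sampled uniformly at random
        self.table = {}                   # lazily sampled random function

    def oracle(self, x: bytes, y: bytes) -> bytes:
        if len(x) != N_BYTES or len(y) != D_BYTES:
            raise ValueError("query must lie in {0,1}^n x {0,1}^d")
        shifted = bytes(a ^ b for a, b in zip(y, self.k_prf))
        if self.real:
            return F(x, shifted)
        # Random world: answer consistently on repeated (X, Y xor K_prf).
        if (x, shifted) not in self.table:
            self.table[(x, shifted)] = os.urandom(N_BYTES)
        return self.table[(x, shifted)]
```

An adversary interacts only with `oracle` and outputs a bit; the advantage is the gap between its acceptance probabilities in the two worlds.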

The proof of the following theorem then follows from a reduction to the RKA-PRF security of $\mathsf {f}$, coupled with a birthday bound to account for collisions during the challenge ciphertext computation. The proof is given in the full version.

Theorem 3

Let $\mathsf{{HFC}}$ be as shown in Fig. 10, using compression function $\mathsf {f}{{}:{}}\{0,1\}^{n}\times \{0,1\}^{d}\rightarrow \{0,1\}^{n}$ where $d \ge n \ge 128$. Then for any adversary ${\mathcal A}$ in the $\text {otROR}$ game against $\mathsf{{HFC}}$, there exists an adversary ${\mathcal B}$ such that $\mathbf {Adv}^{{\mathrm{{ot}}{\text {-}}\mathrm{{ror}}}}_{\mathsf{{HFC}}}({\mathcal A}) \le \mathbf {Adv}^{\oplus \text {-}\mathrm{{prf}}}_{\mathsf {f}}({\mathcal B}) + \frac{\ell ^2}{2^n}$, where $\ell \cdot d$ denotes the length of ${\mathcal A}$’s encryption query after padding. The adversary ${\mathcal B}$ runs in the same time as ${\mathcal A}$ plus an $\mathcal{O}(\ell )$ overhead and makes at most $\ell $ queries.
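To get a feel for the additive term $\frac{\ell ^2}{2^n}$ in Theorem 3, the following small helper evaluates its base-2 logarithm. The function name and the example parameters are illustrative assumptions.

```python
import math


def birthday_term_log2(num_blocks: int, n: int) -> float:
    """Return log2 of l^2 / 2^n for l = num_blocks padded d-bit blocks."""
    if num_blocks <= 0:
        raise ValueError("number of blocks must be positive")
    return 2 * math.log2(num_blocks) - n
```

For instance, a padded input of 1 GiB with $d = 512$ consists of $\ell = 2^{24}$ blocks, so the term is $2^{-208}$ for $n = 256$ and $2^{-80}$ for $n = 128$.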

Instantiations. The obvious (and probably best) choice to instantiate $\mathsf {f}$ is the SHA-256 or SHA-512 compression function. These provide good software performance, and there is a shift towards widespread hardware support in the form of the Intel SHA instructions [11, 18, 39]. Extensive cryptanalysis for the CR (e.g., [23, 26, 36]), preimage resistance (e.g., [19, 23]), and RKA-PRP security of the associated SHACAL-2 blockcipher (e.g., [21, 24, 25, 27]) gives confidence in its security. Another approach would be to use AES via a PGV compression function [31] like Davies-Meyer (DM). Security of AES has been studied extensively, and known attacks do not falsify the assumptions we need [7, 8]. On systems with AES-NI, $\mathsf{{HFC}}$ instantiated with DM-AES will have very good performance. More problematic is that binding can only hold up to $2^{64}$ work (the birthday bound for a 128-bit chaining variable), which is in general insufficient in practice. Other options, although in some cases less well-studied cryptanalytically, include SHA-3 finalists. In particular, a variant of the HFC construction using a sponge-based mode such as Keccak, in which the key is fed to the sponge prior to hashing the message blocks, would allow us to avoid the RKA assumption. We could also remove the assumption by using a compression function with a dedicated key input such as LP231 [34]. We discuss both cases, and include a more thorough discussion of instantiations, in the full version.

7 Compactly Committing AEAD from Encryptment

In this section we recall the formal notions for compactly committing AEAD schemes (ccAEAD schemes), following the treatment given by GLR [17], and compare these to encryptment. With this in place, we show in Sect. 7.3 how to build ccAEAD from encryptment with very efficient transforms. In the full version, we will show how to construct a secure encryptment scheme from a ccAEAD scheme in a way that transfers our negative results from Sect. 5 to ccAEAD; this result does not appear here for space reasons.

7.1 ccAEAD Syntax and Correctness

Encryptment can be viewed as a one-time secure, deterministic variant of ccAEAD. We discuss further the differences between the two primitives later in the section.

ccAEAD schemes. Formally, a ccAEAD scheme is a tuple of algorithms $ {\textsf {CE}} = ({\textsf {Kg}},{\textsf {Enc}},{\textsf {Dec}},{\textsf {Ver}})$ with associated key space $\mathcal{K}\subseteq \varSigma ^*$, header space $\mathcal{H}\subseteq \varSigma ^*$, message space $\mathcal{M}\subseteq \varSigma ^*$, ciphertext space $\mathcal{C}\subseteq \varSigma ^*$, opening space $\mathcal{K}_f\subseteq \varSigma ^*$, and binding tag space $\mathcal{T}\subseteq \varSigma ^*$, defined as follows. The randomized key generation algorithm ${\textsf {Kg}}$ takes no input, and outputs a secret key $K\in \mathcal{K}$. The randomized encryption algorithm ${\textsf {Enc}}$ takes as input a tuple $(K, H, M) \in \mathcal{K}\times \mathcal{H}\times \mathcal{M}$ and outputs a ciphertext/binding tag pair $(C, C_B) \in \mathcal{C}\times \mathcal{T}$. The deterministic decryption algorithm ${\textsf {Dec}}$ takes as input a tuple $(K, H, C, C_B) \in \mathcal{K}\times \mathcal{H}\times \mathcal{C}\times \mathcal{T}$, and outputs a message/opening pair $(M, K_{f}) \in \mathcal{M}\times \mathcal{K}_f$ or the error symbol $\perp $. The deterministic verification algorithm ${\textsf {Ver}}$ takes as input a tuple $(H, M, K_{f}, C_B) \in \mathcal{H}\times \mathcal{M}\times \mathcal{K}_f\times \mathcal {T}$, and outputs a bit b. We assume that if ${\textsf {Dec}}$ and ${\textsf {Ver}}$ are queried on inputs which do not lie in their defined input spaces, then they return $\perp $ and 0 respectively.

Correctness and compactness. Correctness for ccAEAD schemes is defined identically to the COR correctness notion for encryptment schemes (Fig. 5), except in the ccAEAD case the probability is now over the coins of ${\textsf {Enc}}$ also. We require that the structure of ciphertexts $C$ depend only on the length of the underlying message. Formally, let $M^*= \{i \ | \ \exists m \in \mathcal{M}:|m| = i\}$. Then we require that the ciphertext space $\mathcal{C}$ can be partitioned into disjoint sets $\mathcal{C}(i)\subseteq \mathcal{C}$, $i \in M^*$, such that for all $(H,M) \in \mathcal{H}\times \mathcal{M}$ it holds that $C\in \mathcal{C}(|M|)$ with probability one whenever $K$ is output by ${\textsf {Kg}}$ and $(C, C_B)$ is output by ${\textsf {Enc}}(K, H, M)$. Finally, we require that the binding tags $C_B$ are compact, by which we mean that all $C_B$ returned by a ccAEAD scheme are of constant length $\textsf {blen}$ which is linear in the key size.

Comparison with encryptment. With this in place, we highlight the key differences between encryptment and ccAEAD schemes. The overarching difference is that encryptment schemes are single-use (a key is only ever used to encrypt a single message), whereas ccAEAD schemes are multi-use. To support this, the encryption algorithm for ccAEAD schemes is randomized, whereas for encryptment this algorithm is deterministic. This is necessary for achieving schemes that enjoy security in the face of attackers that can obtain multiple encryptions. Moreover, while encryptment schemes are restricted to use the same key for verification as they use for encryptment, ccAEAD schemes output an explicit opening key $K_{f}$ during decryption. There is no requirement that this equal the secret key used for encryption. Again, outputting an opening key distinct from the encryption key allows for ccAEAD schemes that maintain confidentiality and integrity even after some ciphertexts produced under a given encryption key have been opened.

AEAD schemes. The usual definition of AEAD schemes (see Sect. 2) can be recovered from the above definition of ccAEAD schemes by noticing that the tuple of AEAD algorithms $ \textsf {AEAD} = ( \textsf {AEAD} {.}\mathsf{kg}, \textsf {AEAD} {.}\mathsf{enc}, \textsf {AEAD} {.}\mathsf{dec})$ can be defined identically to their ccAEAD variants, except we view the ciphertext/binding tag pair as a single ciphertext, and modify decryption to no longer output the opening, in the AEAD case. This framing allows us to define security notions for AEAD schemes as a special case of those notions for ccAEAD schemes for conciseness and ease of comparison. Similarly regular AE schemes are defined to be the same as AEAD schemes but with all references to the header removed.

7.2 Security Notions for Compactly Committing AEAD

We now define the security notions for ccAEAD schemes, following GLR. They adapt the familiar security notions of real-or-random (ROR) ciphertext indistinguishability [33], and ciphertext integrity (CTXT) [4] for AE schemes to the ccAEAD setting. We focus on GLR’s multi-opening (MO) security notions. ${\mathrm{{MO}}\text {-}\mathrm{{ROR}}}$ (resp. ${\mathrm{{MO}}\text {-}\mathrm{{CTXT}}}$) requires that if multiple messages are encrypted under the same key, then learning the message/opening pair $(M, K_{f})$ for some of the resulting ciphertexts does not compromise the ROR (resp. CTXT) security of the remaining unopened ciphertexts. This precludes schemes which for example have the opening key $K_{f}$ equal to the secret encryption key K.

Confidentiality. Games $\text {MO-REAL}$ and $\text {MO-RAND}$ are shown in Fig. 11. In both variants, the attacker is given access to an oracle $\mathbf{ChalEnc }$ to which he may submit message/header pairs. This oracle returns real (resp. random) ciphertext/binding tag pairs in game $\text {MO-REAL}$ (resp. $\text {MO-RAND}$). The attacker is then challenged to distinguish between the two games. To model multi-opening security, the attacker is also given a pair of encryption/decryption oracles, $\mathbf{Enc }$ and $\mathbf{Dec }$, and may submit the (real) ciphertexts generated via a query to the former to the latter, learning the openings of these ciphertexts in the process. The decryption oracle $\mathbf{Dec }$ will return $\perp $ for any ciphertext not generated via the encryption oracle $\mathbf{Enc }$, to prevent the attacker trivially winning by decrypting a ciphertext returned by $\mathbf{ChalEnc }$. We define the advantage of an attacker ${\mathcal A}$ in game ${\mathrm{{MO}}\text {-}\mathrm{{ROR}}}$ against a ccAEAD scheme $ {\textsf {CE}} $ as

$$\begin{aligned} \mathbf {Adv}^{{\mathrm{{mo}}\text {-}\mathrm{{ror}}}}_{ {\textsf {CE}} }({\mathcal A}) = \left| \Pr \left[ \, \text {MO-REAL}_{ {\textsf {CE}} }^{{\mathcal A}}\Rightarrow 1 \,\right] - \Pr \left[ \, \text {MO-RAND}_{ {\textsf {CE}} }^{{\mathcal A}}\Rightarrow 1 \,\right] \right| \;. \end{aligned}$$

Ciphertext integrity. Ciphertext integrity guarantees that an attacker cannot produce a fresh ciphertext which will decrypt correctly. The multi-opening adaptation to the ccAEAD setting ${\mathrm{{MO}}\text {-}\mathrm{{CTXT}}}$ is shown in Fig. 11. The attacker ${\mathcal A}$ is given access to encryption oracle $\mathbf{Enc }$ and a challenge decryption oracle $\mathbf{ChalDec }$. The attacker wins if he submits a ciphertext to $\mathbf{ChalDec }$ which decrypts correctly and which wasn’t the result of a previous query to the encryption oracle. To model multi-opening security, the attacker is given access to a further oracle $\mathbf{Dec }$ via which he may decrypt ciphertexts and learn the corresponding openings. The advantage of an attacker ${\mathcal A}$ in game ${\mathrm{{MO}}\text {-}\mathrm{{CTXT}}}$ against a ccAEAD scheme $ {\textsf {CE}} $ is then defined

$$\begin{aligned} \mathbf {Adv}^{{\mathrm{{mo}}\text {-}\mathrm{{ctxt}}}}_{ {\textsf {CE}} }({\mathcal A}) = \Pr \left[ \, {\mathrm{{MO}}\text {-}\mathrm{{CTXT}}}_{ {\textsf {CE}} }^{{\mathcal A}}\Rightarrow \mathsf {true} \,\right] \;. \end{aligned}$$

Security for standard AEAD. We note that the familiar $\text {ROR}$ and $\text {CTXT}$ notions for AEAD schemes can be recovered from the corresponding ccAEAD games in Fig. 11 by reframing the ccAEAD scheme as an AEAD scheme as described previously, removing access to oracle $\mathbf{Dec }$ in all games, and removing $\mathbf{Enc }$ in $\text {MO-REAL}$ and $\text {MO-RAND}$. Advantage functions are defined analogously. Since here we are removing attacker capabilities, it follows that security for a ccAEAD scheme with respect to these notions implies security for the derived AEAD scheme also.

Receiver and sender binding. Strong receiver binding for ccAEAD schemes is the same as for encryptment (Fig. 6), except the attacker outputs openings $K_{f}, K_{f}'$ rather than secret keys $K, K'$ as part of his guess. The sender binding game for a ccAEAD scheme challenges an attacker ${\mathcal A}$ to output a tuple $(K,H,C, C_B)$ such that $(M, K_{f}) \leftarrow {\textsf {Dec}}(K,H,C,C_B)$ does not equal $\perp $ but ${\textsf {Ver}}(H,M,K_{f},C_B) = 0$. This is the same as the associated game for encryptment, except that the opening $K_{f}$ recovered during decryption is used for verification rather than the key output by ${\mathcal A}$. Given the similarities, we abuse notation by using the same names for ccAEAD binding notion games and advantage terms as in the encryptment case; which version is meant will be clear from the context.

Given that both target certain binding notions, a natural question is whether an $\mathrm{{sr}}{\text {-}}\mathrm{{BIND}}$ secure ccAEAD scheme is also robust [16], and vice versa. In the full version, we show that neither notion implies the other in generality. We also discuss the conditions under which the ccAEAD schemes we build from secure encryptment are robust.

7.3 Encryptment to ccAEAD Transforms

We now turn to building ccAEAD from encryptment. Fix an encryptment scheme ${\mathsf{{EC}}}= ({\textsf {EKg}}, {\textsf {EC}}, {\textsf {DO}}, {\textsf {EVer}})$ and a standard AEAD scheme $ \textsf {AEAD} = ( \textsf {AEAD} {.}\textsf {Kg}, \textsf {AEAD} {.}\mathsf{enc}, \textsf {AEAD} {.}\mathsf{dec})$. Let $ {\textsf {CE}} [{\mathsf{{EC}}}, \textsf {AEAD} ] = ({\textsf {Kg}},{\textsf {Enc}},{\textsf {Dec}},{\textsf {Ver}})$ be the ccAEAD scheme whose encryption, decryption, and verification algorithms are shown in Fig. 12. Key generation ${\textsf {Kg}}$ runs the AEAD key generation algorithm $ \textsf {AEAD} {.}\textsf {Kg}$ to obtain a key $K$ and outputs $K$.

To encrypt a header/message pair (H, M), ${\textsf {Enc}}$ uses the key generation algorithm ${\textsf {EKg}}$ of the encryptment scheme to generate a one-time encryptment key $K_{{\mathsf{{EC}}}}$, and computes the encryptment of the header and message via $(C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}}) \leftarrow {\textsf {EC}}(K_{{\mathsf{{EC}}}}, H, M)$. The scheme then uses the encryption algorithm of the $ \textsf {AEAD} $ scheme to encrypt the one-time key $K_{{\mathsf{{EC}}}}$ with header $B_{{\mathsf{{EC}}}}$, producing $C_{ \textsf {AE} } \leftarrow \textsf {AEAD} {.}\mathsf{enc}(K, B_{{\mathsf{{EC}}}}, K_{{\mathsf{{EC}}}})$, and outputs $((C_{{\mathsf{{EC}}}}, C_{ \textsf {AE} }), B_{{\mathsf{{EC}}}})$. On input $(K, H, (C_{{\mathsf{{EC}}}}, C_{ \textsf {AE} }), B_{{\mathsf{{EC}}}})$, ${\textsf {Dec}}$ computes $K_{{\mathsf{{EC}}}}\leftarrow \textsf {AEAD} {.}\mathsf{dec}(K, B_{{\mathsf{{EC}}}}, C_{ \textsf {AE} })$ and if $K_{{\mathsf{{EC}}}}= \perp $ returns $\perp $ since this clearly indicates that $C_{ \textsf {AE} }$ is invalid. The recovered key $K_{{\mathsf{{EC}}}}$ is in turn used to recover the message via $M \leftarrow {\textsf {DO}}(K_{{\mathsf{{EC}}}}, H, C_{{\mathsf{{EC}}}}, B_{{\mathsf{{EC}}}})$. If $M = \perp $, the scheme returns $\perp $; otherwise, ${\textsf {Dec}}$ returns $(M, K_{{\mathsf{{EC}}}})$ as the message/opening pair. ${\textsf {Ver}}$ simply applies the verification algorithm ${\textsf {EVer}}$ of the underlying encryptment scheme to the input tuple and returns the result.

Notice that by including the binding tag $B_{{\mathsf{{EC}}}}$ as the header in the authenticated encryption, this ensures the integrity of $B_{{\mathsf{{EC}}}}$. If we did not authenticate $B_{{\mathsf{{EC}}}}$ then an attacker could trivially break the ${\mathrm{{MO}}\text {-}\mathrm{{CTXT}}}$-security of the scheme by using an $\mathbf{Enc }$ query to obtain ciphertext $((C_{{\mathsf{{EC}}}}, C_{ \textsf {AE} }), B_{{\mathsf{{EC}}}})$ for a pair (H, M), submitting that ciphertext to $\mathbf{Dec }$ to recover the opening/key $K_{{\mathsf{{EC}}}}$, with which he can easily create a valid forgery by computing $(C_{{\mathsf{{EC}}}}', B_{{\mathsf{{EC}}}}') \leftarrow {\textsf {EC}}(K_{{\mathsf{{EC}}}}, H', M')$ for some distinct header/message pair and outputting $((C_{{\mathsf{{EC}}}}', C_{ \textsf {AE} }), B_{{\mathsf{{EC}}}}')$. Including the binding tag as the header in the AEAD ciphertext means that an attacker trying to replicate the above mix-and-match attack must create a forgery for an encryptment binding tag and key already returned as the result of an $\mathbf{Enc }$ query, thus violating the $\text {SCU}$ security of the underlying encryptment scheme.
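The transform and the mix-and-match attack described above can be sketched as follows. Everything other than the four transform functions is a placeholder chosen so the example is self-contained: the encryptment here is a simple hash keystream with a hash binding tag (it is not $\mathsf{{HFC}}$), and the AEAD is a toy encrypt-then-MAC construction for 32-byte plaintexts standing in for a real scheme such as AES-GCM or OCB. All names and the 32-byte key sizes are assumptions.

```python
import hashlib
import hmac
import os


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))


# Placeholder encryptment (NOT HFC): hash keystream plus a hash binding tag.
def _stream(k_ec: bytes, length: int) -> bytes:
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(b"ks" + k_ec + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:length]


def _tag(k_ec: bytes, header: bytes, msg: bytes) -> bytes:
    prefix = b"tag" + k_ec + len(header).to_bytes(8, "big")
    return hashlib.sha256(prefix + header + msg).digest()


def ec_enc(k_ec, header, msg):
    return xor(_stream(k_ec, len(msg)), msg), _tag(k_ec, header, msg)


def ec_ver(k_ec, header, msg, b_ec):
    return hmac.compare_digest(_tag(k_ec, header, msg), b_ec)


def ec_dec(k_ec, header, c_ec, b_ec):
    msg = xor(_stream(k_ec, len(c_ec)), c_ec)
    return msg if ec_ver(k_ec, header, msg, b_ec) else None


# Toy AEAD for 32-byte plaintexts (stand-in for AES-GCM or OCB).
def aead_enc(k, ad, pt):
    nonce = os.urandom(16)
    body = nonce + xor(hashlib.sha256(b"enc" + k + nonce).digest(), pt)
    return body + hmac.new(k, b"mac" + ad + body, hashlib.sha256).digest()


def aead_dec(k, ad, ct):
    body, mac = ct[:-32], ct[-32:]
    good = hmac.new(k, b"mac" + ad + body, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, good):
        return None
    return xor(hashlib.sha256(b"enc" + k + body[:16]).digest(), body[16:])


# The transform CE[EC, AEAD].
def kg() -> bytes:
    return os.urandom(32)                    # K from AEAD key generation


def enc(k, header, msg):
    k_ec = os.urandom(32)                    # fresh one-time encryptment key
    c_ec, b_ec = ec_enc(k_ec, header, msg)
    c_ae = aead_enc(k, b_ec, k_ec)           # binding tag is the AEAD header
    return (c_ec, c_ae), b_ec


def dec(k, header, c, b_ec):
    c_ec, c_ae = c
    k_ec = aead_dec(k, b_ec, c_ae)
    if k_ec is None:
        return None
    msg = ec_dec(k_ec, header, c_ec, b_ec)
    return None if msg is None else (msg, k_ec)


def ver(header, msg, k_f, b_ec) -> bool:
    return ec_ver(k_f, header, msg, b_ec)
```

An attacker who learns the opening $K_{{\mathsf{{EC}}}}$ of one ciphertext and re-encrypts a different pair $(H', M')$ under it obtains a new binding tag $B_{{\mathsf{{EC}}}}'$; pairing this with the old $C_{ \textsf {AE} }$ makes `aead_dec` reject, because $C_{ \textsf {AE} }$ was authenticated under the original tag.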

Security of the transform. Next, we analyze the security of the ccAEAD scheme $ {\textsf {CE}} [{\mathsf{{EC}}}, \textsf {AEAD} ]$ shown in Fig. 12. We begin with confidentiality. The proof of the following theorem follows from reductions to the ROR security of the underlying encryptment and AEAD schemes, and is given in the full version.

Theorem 4

Let ${\mathsf{{EC}}}$ be an encryptment scheme, $ \textsf {AEAD} $ be an authenticated encryption scheme, and let $ {\textsf {CE}} [{\mathsf{{EC}}}, \textsf {AEAD} ]$ be the ccAEAD scheme built from ${\mathsf{{EC}}}$ according to Fig. 12. Then for any adversary ${\mathcal A}$ in the ${\mathrm{{MO}}\text {-}\mathrm{{ROR}}}$ game against $ {\textsf {CE}} $ making a total of q queries, of which $q_c$ are to $\mathbf{ChalEnc } $ and $q_e$ are to $\mathbf{Enc } $, there exist adversaries ${\mathcal B}$ and ${\mathcal C}$ such that

$$\begin{aligned} \mathbf {Adv}^{{\mathrm{{mo}}\text {-}\mathrm{{ror}}}}_{ {\textsf {CE}} }({\mathcal A}) \le 2 \cdot \mathbf {Adv}^{\mathrm {ror}}_{ \textsf {AEAD} }({\mathcal B}) + q_c \cdot \mathbf {Adv}^{{\mathrm{{ot}}{\text {-}}\mathrm{{ror}}}}_{{\mathsf{{EC}}}}({\mathcal C})\; . \end{aligned}$$

Adversaries ${\mathcal B}$ and ${\mathcal C}$ run in the same time as ${\mathcal A}$ with an $\mathcal{O}(q)$ overhead, and adversary ${\mathcal B}$ makes at most $q_c + q_e$ encryption oracle queries.

Next we bound the ${\mathrm{{MO}}\text {-}\mathrm{{CTXT}}}$ advantage of any adversary against $ {\textsf {CE}} [{\mathsf{{EC}}}, \textsf {AEAD} ]$, via a reduction to the $\text {CTXT}$ security of the underlying AEAD scheme, and the $\text {SCU}$ security of the encryptment scheme; we defer the proof to the full version.

Theorem 5

Let ${\mathsf{{EC}}}$ be an encryptment scheme, $ \textsf {AEAD} $ be an authenticated encryption scheme, and let $ {\textsf {CE}} [{\mathsf{{EC}}}, \textsf {AEAD} ]$ be the ccAEAD scheme built from ${\mathsf{{EC}}}$ according to Fig. 12. Then for any adversary ${\mathcal A}$ in the ${\mathrm{{MO}}\text {-}\mathrm{{CTXT}}}$ game against $ {\textsf {CE}} $ making a total of q queries, of which $q_e$ are to $\mathbf{Enc } $, there exist adversaries ${\mathcal B}$ and ${\mathcal C}$ such that

$$\begin{aligned} \mathbf {Adv}^{{\mathrm{{mo}}\text {-}\mathrm{{ctxt}}}}_{ {\textsf {CE}} }({\mathcal A}) \le \mathbf {Adv}^{\mathrm {ctxt}}_{ \textsf {AEAD} }({\mathcal B}) + q_e \cdot \mathbf {Adv}^{\mathrm {scu}}_{{\mathsf{{EC}}}}({\mathcal C}) \; . \end{aligned}$$

Adversaries ${\mathcal B}$ and ${\mathcal C}$ run in the same time as ${\mathcal A}$ with an $\mathcal{O}(q)$ overhead, and adversary ${\mathcal B}$ makes at most as many queries as ${\mathcal A}$.

We omit bounding the $\text {s-BIND}$ and $\mathrm{{sr}}{\text {-}}\mathrm{{BIND}}$ security of $ {\textsf {CE}} [{\mathsf{{EC}}}, \textsf {AEAD} ]$, since $ {\textsf {CE}} $ inherits these properties directly from ${\mathsf{{EC}}}$. By reframing $ {\textsf {CE}} $ as a regular AEAD scheme, our transform yields a $\text {ROR}$ and $\text {CTXT}$ secure single-pass AEAD scheme. To implement the transform, the fixed-input-length AE scheme must be instantiated. One can use, for example, AES-GCM or OCB. In the full version of the paper, we provide two other approaches for building ccAEAD from encryptment, which use a PRF and a tweakable block cipher respectively.

Notes

1.
A secure commitment allows a user to commit to a message without revealing its content; see [10] for further discussion.
2.
One can modify our definitions so keys can be picked from a set as a function of the current round and messages, what Rogaway and Steinberger refer to as the no-fixed order model, and as first done in [9]. A negative result based on [9, Theorem 5] would rule out encryptment using any rate-1 no-fixed order verification algorithm.

References

Abdalla, M., Bellare, M., Neven, G.: Robust encryption. In: Micciancio, D. (ed.) TCC 2010. LNCS, vol. 5978, pp. 480–497. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-11799-2_28
Bellare, M., Jaeger, J., Len, J.: Better than advertised: improved collision-resistance guarantees for MD-based hash functions. In: ACM CCS (2017)
Bellare, M., Kohno, T.: A theoretical treatment of related-key attacks: RKA-PRPs, RKA-PRFs, and applications. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 491–506. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-39200-9_31
Bellare, M., Namprempre, C.: Authenticated encryption: relations among notions and analysis of the generic composition paradigm. In: Okamoto, T. (ed.) ASIACRYPT 2000. LNCS, vol. 1976, pp. 531–545. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-44448-3_41
Bertoni, G., Daemen, J., Peeters, M., Van Assche, G.: Keccak sponge function family main document. Submission to NIST SHA3 (2009)
Bertoni, G., Daemen, J., Peeters, M., Van Assche, G.: Duplexing the sponge: single-pass authenticated encryption and other applications. In: Miri, A., Vaudenay, S. (eds.) SAC 2011. LNCS, vol. 7118, pp. 320–337. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28496-0_19
Biryukov, A., Khovratovich, D.: Related-key cryptanalysis of the full AES-192 and AES-256. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 1–18. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-10366-7_1
Biryukov, A., Khovratovich, D., Nikolić, I.: Distinguisher and related-key attack on the full AES-256. In: Halevi, S. (ed.) CRYPTO 2009. LNCS, vol. 5677, pp. 231–249. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03356-8_14
Black, J., Cochran, M., Shrimpton, T.: On the impossibility of highly-efficient blockcipher-based hash functions. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 526–541. Springer, Heidelberg (2005). https://doi.org/10.1007/11426639_31
Brassard, G., Chaum, D., Crépeau, C.: Minimum disclosure proofs of knowledge. JCSS 37, 156–189 (1988)
Advanced Micro Devices: The ZEN microarchitecture (2016). https://www.amd.com/en/technologies/zen-core
Dodis, Y., An, J.H.: Concealment and its applications to authenticated encryption. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 312–329. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-39200-9_19
Facebook: Facebook Messenger app (2016). https://www.messenger.com/
Facebook: Messenger Secret Conversations Technical Whitepaper (2016)
Farshim, P., Libert, B., Paterson, K.G., Quaglia, E.A.: Robust encryption, revisited. In: Kurosawa, K., Hanaoka, G. (eds.) PKC 2013. LNCS, vol. 7778, pp. 352–368. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-36362-7_22
Farshim, P., Orlandi, C., Rosie, R.: Security of symmetric primitives under incorrect usage of keys. In: FSE (2017)
Grubbs, P., Lu, J., Ristenpart, T.: Message franking via committing authenticated encryption. In: Katz, J., Shacham, H. (eds.) CRYPTO 2017. LNCS, vol. 10403, pp. 66–97. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63697-9_3
Gulley, S., Gopal, V., Yap, K., Feghali, W., Guilford, J.: Intel SHA extensions (2013). https://software.intel.com/en-us/articles/intel-sha-extensions
Guo, J., Ling, S., Rechberger, C., Wang, H.: Advanced meet-in-the-middle preimage attacks: first results on full tiger, and improved results on MD4 and SHA-2. In: Abe, M. (ed.) ASIACRYPT 2010. LNCS, vol. 6477, pp. 56–75. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-17373-8_4
Halevi, S., Krawczyk, H.: Strengthening digital signatures via randomized hashing. In: Dwork, C. (ed.) CRYPTO 2006. LNCS, vol. 4117, pp. 41–59. Springer, Heidelberg (2006). https://doi.org/10.1007/11818175_3
Hong, S., Kim, J., Lee, S., Preneel, B.: Related-key rectangle attacks on reduced versions of SHACAL-1 and AES-192. In: Gilbert, H., Handschuh, H. (eds.) FSE 2005. LNCS, vol. 3557, pp. 368–383. Springer, Heidelberg (2005). https://doi.org/10.1007/11502760_25
Jutla, C.S.: Encryption modes with almost free message integrity. In: Pfitzmann, B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045, pp. 529–544. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44987-6_32
Khovratovich, D., Rechberger, C., Savelieva, A.: Bicliques for preimages: attacks on Skein-512 and the SHA-2 family. In: Canteaut, A. (ed.) FSE 2012. LNCS, vol. 7549, pp. 244–263. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34047-5_15
Kim, J., Kim, G., Hong, S., Lee, S., Hong, D.: The related-key rectangle attack – application to SHACAL-1. In: Wang, H., Pieprzyk, J., Varadharajan, V. (eds.) ACISP 2004. LNCS, vol. 3108, pp. 123–136. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27800-9_11
Kim, J., Kim, G., Lee, S., Lim, J., Song, J.: Related-key attacks on reduced rounds of SHACAL-2. In: Canteaut, A., Viswanathan, K. (eds.) INDOCRYPT 2004. LNCS, vol. 3348, pp. 175–190. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30556-9_15
Lamberger, M., Mendel, F.: Higher-order differential attack on reduced SHA-256. IACR ePrint, Report 2011/037 (2011)
Lu, J., Kim, J., Keller, N., Dunkelman, O.: Related-key rectangle attack on 42-round SHACAL-2. In: Katsikas, S.K., López, J., Backes, M., Gritzalis, S., Preneel, B. (eds.) ISC 2006. LNCS, vol. 4176, pp. 85–100. Springer, Heidelberg (2006). https://doi.org/10.1007/11836810_7
McGrew, D., Viega, J.: The Galois/counter mode of operation (GCM). In: NIST Modes of Operation (2004)
Millican, J.: Personal communication, Feb 2018
Millican, J.: Challenges of E2E Encryption in Facebook Messenger. RWC (2017)
Preneel, B., Govaerts, R., Vandewalle, J.: Hash functions based on block ciphers: a synthetic approach. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 368–378. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-48329-2_31
Rogaway, P., Bellare, M., Black, J.: OCB: a block-cipher mode of operation for efficient authenticated encryption. ACM TISSEC 6, 365–403 (2003)
Rogaway, P., Shrimpton, T.: A provable-security treatment of the key-wrap problem. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 373–390. Springer, Heidelberg (2006). https://doi.org/10.1007/11761679_23
Rogaway, P., Steinberger, J.: Constructing cryptographic hash functions from fixed-key blockciphers. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157, pp. 433–450. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85174-5_24
Rogaway, P., Steinberger, J.: Security/efficiency tradeoffs for permutation-based hashing. In: Smart, N. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp. 220–236. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78967-3_13
Sanadhya, S.K., Sarkar, P.: New collision attacks against up to 24-step SHA-2. In: Chowdhury, D.R., Rijmen, V., Das, A. (eds.) INDOCRYPT 2008. LNCS, vol. 5365, pp. 91–103. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-89754-5_8
Shrimpton, T., Stam, M.: Building a collision-resistant compression function from non-compressing primitives. In: Aceto, L., Damgård, I., Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008. LNCS, vol. 5126, pp. 643–654. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-70583-3_52
Open Whisper Systems: Signal (2016). https://signal.org/
van der Linde, W.: Parallel SHA-256 in NEON for use in hash-based signatures. BSc thesis, Radboud University (2016)
WhatsApp: WhatsApp (2016). https://www.whatsapp.com/

Acknowledgments

The authors thank Jon Millican for his help in understanding Facebook’s message franking systems. Dodis is partially supported by gifts from VMware Labs and Google, and NSF grants 1619158, 1319051, 1314568. Grubbs is supported by an NSF Graduate Research Fellowship. A portion of this work was completed while Grubbs visited Royal Holloway University, and he thanks Kenny Paterson for generously hosting him. Ristenpart is supported in part by NSF grants 1704527 and 1514163, as well as a gift from Microsoft. Woodage is supported by the EPSRC and the UK government as part of the Centre for Doctoral Training in Cyber Security at Royal Holloway, University of London (EP/K035584/1).

Author information

Authors and Affiliations

New York University, New York, USA
Yevgeniy Dodis
Cornell Tech, New York, USA
Paul Grubbs & Thomas Ristenpart
Royal Holloway, University of London, Egham, UK
Joanne Woodage

Corresponding author

Correspondence to Paul Grubbs.

Editor information

Editors and Affiliations

The University of Texas at Austin, Austin, Texas, USA
Hovav Shacham
Georgia Institute of Technology, Atlanta, Georgia, USA
Alexandra Boldyreva

About this paper

Cite this paper

Dodis, Y., Grubbs, P., Ristenpart, T., Woodage, J. (2018). Fast Message Franking: From Invisible Salamanders to Encryptment. In: Shacham, H., Boldyreva, A. (eds) Advances in Cryptology – CRYPTO 2018. CRYPTO 2018. Lecture Notes in Computer Science, vol 10991. Springer, Cham. https://doi.org/10.1007/978-3-319-96884-1_6

DOI: https://doi.org/10.1007/978-3-319-96884-1_6
Published: 25 July 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96883-4
Online ISBN: 978-3-319-96884-1
eBook Packages: Computer Science; Computer Science (R0)
