1 Introduction

Non-malleable codes (NMC) were introduced by Dziembowski, Pietrzak and Wichs [39] as a relaxation of error correction and error detection codes, aiming to provide strong privacy but relaxed correctness. Informally, non-malleability guarantees that any modified codeword decodes either to the original message or to a completely unrelated one, with overwhelming probability. The definition of non-malleability is simulation-based, stating that for any tampering function \(f\), there exists a simulator that simulates the tampering effect by only accessing \(f\), i.e., without making any assumptions on the distribution of the encoded message.

The main application of non-malleable codes that motivated the seminal work by Dziembowski et al. [39] is the protection of cryptographic implementations from active physical attacks against memory, known as tampering attacks. In this setting, the adversary modifies the memory of the cryptographic device, receives the output of the computation, and tries to extract sensitive information related to the private memory. Security against such types of attacks can be achieved by encoding the private memory of the device using non-malleable codes. Besides that, various applications of non-malleable codes have been proposed in subsequent works, such as CCA secure encryption schemes [29] and non-malleable commitments [4].

Due to their important applications, constructing non-malleable codes has received a lot of attention over recent years. As non-malleability against general functions is impossible [39], various subclasses of tampering functions have been considered, such as split-state functions [1,2,3, 38, 39, 51, 55], bit-wise tampering and permutations [4, 5, 39], bounded-size function classes [45], bounded depth/fan-in circuits [14], space-bounded tampering [42], and others (cf. Sect. 1.4). One characteristic shared by those function classes is that they allow full access to the codeword, while imposing structural or computational restrictions on the way the function computes over the input. In this work, we initiate a comprehensive study of non-malleability for functions that receive partial access to the codeword, which is an important yet overlooked class, as we elaborate below.

The class of partial functions. The class of partial functions contains all functions that read/write on an arbitrary subset of codeword bits with specific cardinality. The elements of the subset can be chosen selectively or adaptively. Concretely, let \(c\) be a codeword with length \(\nu \). For \(\alpha \in [0,1)\), the function class \(\mathcal {F}^{\alpha \nu }\) (or \({\mathcal {F}}^{\alpha }\) for brevity) consists of all functions that operate over any subset of bits of \(c\) with cardinality at most \(\alpha \nu \), while leaving the remaining bits unseen and intact. The work of Cheraghchi and Guruswami [27] explicitly defines this class and uses a subclass (the one containing functions that always touch the first \(\alpha \nu \) bits of the codeword) in a negative way, namely as the tool for deriving capacity lower bounds for information-theoretic non-malleable codes against split-state functions. Partial functions were also studied implicitly by Faust et al. [45], while aiming for non-malleability against bounded-size circuits.Footnote 1

Even though capacity lower bounds for partial functions have been derived (cf. [27]), our understanding of explicit constructions is still limited. Existential results can be derived by the probabilistic method, as shown in prior works [27, 39],Footnote 2 but they do not yield explicit constructions. On the other hand, the capacity bounds do not apply to the computational setting, which could potentially allow more practical solutions. We believe that this is a direction that needs to be explored, as besides the theoretical interest, partial functions are a natural model that complies with existing attacks that require partial access to the registers of the cryptographic implementation [16, 19,20,21, 63].Footnote 3

Besides the importance of partial functions in the active setting, i.e., when the function is allowed to partially read/write the codeword, the passive analogue of the class, i.e., when the function is only given read access to the codeword, matches the model considered by All-Or-Nothing Transforms (AONTs). This notion, originally introduced by Rivest [60], provides security guarantees similar to those of leakage resilience: reading an arbitrary subset (up to some bounded cardinality) of locations of the codeword does not reveal the underlying message. As non-malleable codes provide privacy, non-malleability for partial functions is the active analogue of (and in fact implies) AONTs, which find numerous applications [22, 23, 59, 60, 62].

Plausibility. At first glance, one might think that partial functions better comply with the framework of error-correction/detection codes (ECC/EDC), as they do not touch the whole codeword. However, if we allow the adversary to access asymptotically almost the entire codeword (which can be more than the minimum distance that an ECC can achieve), it is conceivable that it can use this generous access rate, i.e., the fraction of the codeword that can be accessed (see below), to create correlated encodings; thus, we believe solving non-malleability in this setting is a natural question. Additionally, ECC/EDC cannot guarantee security against selective failure attacks for high access rate tampering adversaries and thus provide weaker security than simulation-based non-malleable codes. We elaborate below.

We illustrate the separation between the notions using the following example. Consider the set of partial functions that operate either on the right or on the left half of the codeword (the function chooses if it is going to be left or right), and the trivial encoding scheme that on input message \(s\) outputs \((s,s)\). The decoder, on input \((s,s')\), checks if \(s= s'\), in which case it outputs \(s\), otherwise it outputs \(\bot \). This scheme is clearly an EDC against the aforementioned function class,Footnote 4 as the output of the decoder is in \(\{s, \bot \}\), with probability 1; however, it is considered malleable since the tampering function can create encodings whose validity depends on the message. On the other hand, an ECC would provide a trivial solution in this setting; however, it requires restriction of the adversarial access fraction to 1/2 (of the codeword); by accessing more than this fraction, the attacker can possibly create invalid encodings depending on the message, as general ECCs do not provide privacy. Thus, the ECC/EDC setting is inapt against this type of selective failure tampering in the presence of attackers that access almost the entire codeword. Later in this section, we provide an extensive discussion on challenges of non-malleability for partial functions.
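The separation can be made concrete with a small sketch. The Python below is only an illustration of the duplication code from the example above: messages are modeled as bit strings, `None` stands for \(\bot \), and the function names are hypothetical. The tampering function touches only the left half and overwrites its first bit with 0.

```python
from typing import Optional, Tuple

Codeword = Tuple[str, str]  # (left half, right half), each a bit string


def encode(s: str) -> Codeword:
    """Trivial duplication code: the codeword is (s, s)."""
    return (s, s)


def decode(c: Codeword) -> Optional[str]:
    """Output s if both halves agree, otherwise None (standing for ⊥)."""
    left, right = c
    return left if left == right else None


def tamper_left(c: Codeword) -> Codeword:
    """Partial function on the left half only: force its first bit to 0.

    The right half is passed through untouched and unread.
    """
    left, right = c
    return ("0" + left[1:], right)


# Decoding succeeds exactly when the first message bit was already 0.
print(decode(tamper_left(encode("0110"))))  # message starts with 0
print(decode(tamper_left(encode("1110"))))  # message starts with 1
```

The decoder's output always lies in \(\{s, \bot \}\), so the code detects the error, yet whether the output is \(\bot \) reveals the first bit of \(s\). This is the selective failure behavior that the simulation-based definition rules out.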

Besides the plausibility and the lack of a comprehensive study, partial functions can potentially allow stronger primitives, as constant functions are excluded from the class. This is similar to the path followed by Jafargholi and Wichs [49], aiming to achieve tamper detection (cf. Sect. 1.4) against a class of functions that implicitly excludes constant functions and the identity function. In this work, we prove that this intuition holds, by showing that partial functions allow a stronger primitive that we define as non-malleability with manipulation detection (MD-NMC), which guarantees that any tampered codeword will either decode to the original message or to \(\bot \), and additionally, the outcome can be simulated without knowing the underlying message. This implies that the notion can defend against the selective failure tampering attacks, as the decoding of the tampered outcome would not only be either the original message or \(\bot \), but also independent of the underlying message. Thus, MD-NMC is not subject to the selective failure attacks, providing stronger security than what ECC/EDC can provide as we argued above.

Given the above, we believe that partial functions are an interesting and well-motivated model. The goal of this work is to answer the following (informally stated) question:

Is it possible to construct efficient (high information rate) non-malleable codes for partial functions, while allowing the attacker to access almost the entire codeword?

We answer the above question in the affirmative. Before presenting our results (cf. Sect. 1.1) and the high level ideas behind our techniques (cf. Sect. 1.2), we identify several challenges involved in tackling the problem.

Challenges. We first define some useful notions used throughout the paper.

  • Information rate: the ratio of message to codeword length, as the message length goes to infinity.

  • Access rate: the fraction of the number of bits that the attacker is allowed to access over the total codeword length, as the message length goes to infinity.

The access rate measures the effectiveness of a non-malleable code in the partial function setting and reflects the level of adversarial access to the codeword. In this work, we aim at constructing non-malleable codes for partial functions with high information rate and high access rate, i.e., both rates should approach 1 simultaneously. Before discussing the challenges posed by this requirement, we first review some known impossibility results. First, non-malleability for partial functions with concrete access rate 1 is impossible, as the function can fully decode the codeword and then re-encode a related message [39]. Second, information-theoretic non-malleable codes with constant information rate (e.g., 0.5) are not possible against partial functions with constant access rate [27],Footnote 5 and consequently, solutions in the information-theoretic setting such as ECC and Robust Secret Sharing (RSS) do not solve our problem. Based on these facts, in order to achieve our goal, the only path is to explore the computational setting, aiming for access rate at most \(1-\epsilon \), for some \(\epsilon >0\).

At first glance, one might think that non-malleability for partial functions is easier to achieve, compared to other function classes, as partial functions cannot touch the whole codeword. Having that in mind, it would be tempting to conclude that existing designs/techniques with minor modifications are sufficient to achieve our goal. However, we will show that this intuition is misleading, by pointing out why prior approaches fail to provide security against partial functions with high access rate.

The current state of the art in the computational setting considers tools such as (Authenticated) Encryption [1, 35, 36, 41, 51, 55], non-interactive zero-knowledge (NIZK) proofs [35, 41, 43, 55], and \(\ell \)-more extractable collision resistant hashes (ECRH) [51], while others use KEM/DEM techniques [1, 36]. Those constructions share a common structure, incorporating a short secret key \(sk\) (or a short encoding of it), as well as a long ciphertext, \(e\), and a proof \(\pi \) (or a hash value). Now, consider the partial function \(f\) that gets full access to the secret key \(sk\) and a constant number of bits of the ciphertext \(e\), partially decrypts \(e\) and modifies the codeword depending on those bits. Then, it is not hard to see that non-malleability falls apart as the security of the encryption no longer holds. The attack requires access rate only \(O ((|sk|)/(|sk| + |e| + |\pi |))\), for [35, 41, 55] and \(O(\textrm{poly}(k)/|s|)\) for [1, 36, 51]. A similar attack applies to [43], which is in the continual setting.

One possible route to tackle the above challenges is to use an encoding scheme over the ciphertext, such that partial access over it does not reveal the underlying message.Footnote 6 The guarantees that we need from such a primitive resemble the properties of AONTs; however, this primitive does not provide security against active, i.e., tampering, attacks. Another approach would be to use Reconstructable Probabilistic Encodings [14], which provide error-correcting guarantees, yet it is still unknown whether we can achieve information rate 1 for such a primitive. In addition, the techniques and tools for protecting the secret key can be used to achieve optimal information rate as they are independent of the underlying message, yet at the same time, they become the weakest point against partial functions with high access rate. Thus, the question is how to overcome the above challenges, allowing access to almost the entire codeword.

In this paper, we solve the challenges presented above based on the following observation: in existing solutions the structure of the codeword is fixed and known to the attacker, and independently of the primitives that we use, the only way to resolve the above issues is by hiding the structure via randomization. This requires a structure recovering mechanism that can either be implemented by an “external” source, or otherwise the structure needs to be reflected in the codeword in some way that the attacker cannot exploit. In the present work, we implement this mechanism in both ways, by first proposing a construction in the common reference string (CRS) model, and then we show how to remove the CRS using slightly bigger alphabets. Refer to Sect. 1.2 for a technical overview.

1.1 Our Results

We initiate the study of non-malleable codes with manipulation detection (MD-NMC), and we present the first (to our knowledge) construction for this type of codes. We focus on achieving simultaneously high information rate and high access rate in the partial functions setting, which, by the results of [27], can be achieved only in the computational setting.

Our contribution is threefold. First, we construct an information rate \(1\) non-malleable code in the CRS model, with access rate \(1-1/\varOmega (\log k)\), where k denotes the security parameter. Our construction combines Authenticated Encryption together with an inner code that protects the key of the encryption scheme (cf. Sect. 1.2). The result is informally summarized in the following theorem.

Theorem 1.1

(Informal) Assuming one-way functions, there exists an explicit computationally secure MD-NMC over the binary alphabet in the CRS model, against selective selection of codeword locations, achieving information rate \(1\) and access rate \(1-1/\varOmega (\log k)\), where k is the security parameter.

Our scheme, in order to achieve security with error \(2^{-\varOmega (k)}\), produces codewords of length \(|s|+O(k^2 \log k)\), where |s| denotes the length of the message and uses a CRS of length \(O(k^2 \log k \log (|s| +k))\). We note that our construction does not require the CRS to be fully tamper-proof, and we refer the reader to Sect. 1.2 for a discussion on the topic.
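As a quick check of the rate claim, write the additive overhead as \(c\,k^2 \log k\) for some constant \(c>0\) (the constant is introduced here only for illustration; the text gives just the asymptotic form). For a fixed security parameter \(k\), the ratio of message length to codeword length then tends to 1 as the message grows:

```latex
\[
\frac{|s|}{|s| + c\,k^{2}\log k}
  \;=\; 1 - \frac{c\,k^{2}\log k}{|s| + c\,k^{2}\log k}
  \;\longrightarrow\; 1
  \qquad \text{as } |s| \rightarrow \infty .
\]
```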

In our second result, we show how to remove the CRS by slightly increasing the size of the alphabet. Our result is a computationally secure MD-NMC in the standard model, achieving information and access rate \(1-1/\varOmega (\log k)\). Our construction is proven secure by a reduction to the security of the scheme presented in Theorem 1.1. Below, we informally state our result.

Theorem 1.2

(Informal) Assuming one-way functions, there exists an explicit, computationally secure MD-NMC in the standard model against adaptive selection of codeword locations, with alphabet length \(O(\log k)\), information rate \(1-1/\varOmega (\log k)\) and access rate \(1-1/\varOmega (\log k)\), where k is the security parameter.

Our scheme produces codewords of length \(|s|(1+1/O(\log k))+O(k^2 \log ^2 k)\).

In Sect. 1.2, we consider security against continuous attacks. We show how to achieve a weaker notion of continuous security, while avoiding the use of a self-destruct mechanism, which was originally achieved by [41]. Our notion is weaker than full continuous security [43], since the codewords need to be updated with a mechanism that is heavier than self-destruct; yet it is deterministic and more efficient than the re-encoding process of [39, 55]: it uses only shuffling and refreshing operations, i.e., we avoid cryptographic computations such as group operations and NIZKs. We call such an update mechanism a “light update.” Informally, we prove the following result.

Theorem 1.3

(Informal) One-way functions imply continuous non-malleable codes with deterministic light updates and without self-destruct against adaptive selection of codeword locations, in the standard model, with alphabet length \(O(\log k)\), information rate \(1-1/\varOmega (\log k)\) and access rate \(1-1/\varOmega (\log k)\), where k is the security parameter.

As we have already stated, non-malleable codes against partial functions imply AONTs [60]. The first AONT was presented by Boyko [22] in the random oracle model, and then, Canetti et al. [23] consider AONTs with public/private parts as well as a secret-only part, which is the full notion. Canetti et al. [23] provide efficient constructions for both settings, yet the fully secure AONT (called “secret-only” in that paper) is based on non-standard assumptions.Footnote 7

Assuming one-way functions, our results yield efficient, fully secure AONTs in the standard model. This resolves the open question left in [23], where the problem of constructing AONTs under standard assumptions was posed. Our result is presented in the following theorem.

Theorem 1.4

(Informal) Assuming one-way functions, there exists an explicit secret-only AONT in the standard model, with information rate \(1\) and access rate \(1-1/\varOmega (\log k)\), where k is the security parameter.

The above theorem is derived from Informal Theorem 1.1, yielding an AONT whose output consists of both the CRS and the codeword produced by the NMC scheme in the CRS model. A similar theorem can be derived with respect to Informal Theorem 1.2. Finally, and in connection to AONTs that provide leakage resilience, our results imply leakage-resilient codes [55] for partial functions. In Sect. 2.3, we present the connection between MD-NMC and AONT with a formal description.

We provide concrete instantiations of our constructions, using textbook instantiations [50] for the underlying authenticated encryption scheme. For completeness, we also provide information theoretic variants of our constructions that maintain high access rate and thus necessarily sacrifice information rate.

1.2 Technical Overview

On the manipulation detection property. In the present work, we exploit the fact that the class of partial functions does not include constant functions and we achieve a notion that is stronger than non-malleability, which we call non-malleability with manipulation detection. We formalize this notion as a strengthening of non-malleability, and we show that our constructions achieve this stronger notion. Informally, manipulation detection ensures that any tampered codeword will either decode to the original message or to \(\bot \).

An MD-NMC in the CRS model. For the exposition of our ideas, we start with a naive scheme (which does not work) and then show how we resolve all the challenges. Let \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) be a (symmetric) authenticated encryption scheme and consider the following encoding scheme: to encode a message \(s\), the encoder computes \((sk||e)\), where \(e\leftarrow {{\textsf{E}}}_{sk}(s)\) is the ciphertext and \(sk\leftarrow {\textsf{KGen}}(1^k)\) is the secret key. We observe that the scheme is secure if the tampering function can only read/write on the ciphertext, \(e\), assuming the authenticity property of the encryption scheme; however, restricting access to \(sk\), which is short, is unnatural and makes the problem trivial. On the other hand, even partial access to \(sk\) compromises the authenticity property of the scheme, and even if there is no explicit attack against the non-malleability property, there is no hope of proving security based on the properties of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) in a black-box way.

A solution to the above problems would be to protect the secret key using an inner encoding, yet the amount of tampering is now restricted by the capabilities of the inner scheme, as the attacker knows the exact locations of the “sensitive” codeword bits, i.e., the non-ciphertext bits. In our construction, we manage to protect the secret key while avoiding the bottleneck on the access rate by designing an inner encoding scheme that provides limited security guarantees when used standalone. When it is used in conjunction with a shuffling technique that permutes the inner encoding and ciphertext bit locations, however, it guarantees that any attack against the secret key will create an invalid encoding with overwhelming probability, even when allowing access to almost the entire codeword (Figs. 1, 2).

Our scheme is depicted in Fig. 3 and works as follows: on input message \(s\), the encoder (i) encrypts the message by computing \(sk\leftarrow {\textsf{KGen}}(1^k)\) and \(e\leftarrow {{\textsf{E}}}_{sk}(s)\), (ii) computes an m-out-of-m secret sharing \(z\) of \((sk|| sk^3)\) (interpreting both \(sk\) and \(sk^{3}\) as elements in some finite field),Footnote 8 and (iii) outputs a random shuffling of \((z|| e)\), denoted as \(P_{\varSigma }(z||e)\), according to the common reference string \(\varSigma \). Decoding proceeds as follows: on input \(c\), the decoder (i) inverts the shuffling operation by computing \((z|| e) \leftarrow P^{-1}_{\varSigma }(c)\), (ii) reconstructs \((sk|| sk')\), and (iii) if \(sk^3 = sk'\), outputs \({{\textsf{D}}}_{sk}(e)\), otherwise, it outputs \(\bot \).
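The encoder and decoder just described can be sketched in a few lines of Python. This is a toy illustration under several assumptions that are not taken from the text: the authenticated encryption is a hash-based encrypt-then-MAC stand-in, the finite field is the prime field of order \(2^{127}-1\), the m-out-of-m sharing is a plain XOR sharing with \(m=4\), and the CRS is modeled as an explicit permutation of bit positions. All function names are hypothetical, and none of the parameters reflect the sizes needed for the security proof.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1   # toy prime field; an assumption, not the paper's field
M = 4            # number of XOR shares (toy value)
HALF = 16        # bytes used for sk and for sk^3


def _stream(key: bytes, n: int) -> bytes:
    """SHA-256 in counter mode, used as a toy keystream."""
    blocks = (hashlib.sha256(key + b"enc" + i.to_bytes(8, "big")).digest()
              for i in range(n // 32 + 1))
    return b"".join(blocks)[:n]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def ae_encrypt(key: bytes, msg: bytes) -> bytes:
    """Toy encrypt-then-MAC; stands in for E_sk."""
    body = _xor(msg, _stream(key, len(msg)))
    return body + hmac.new(key, b"mac" + body, hashlib.sha256).digest()


def ae_decrypt(key: bytes, ct: bytes):
    """Stands in for D_sk; returns None (⊥) if the tag does not verify."""
    body, tag = ct[:-32], ct[-32:]
    good = hmac.new(key, b"mac" + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, good):
        return None
    return _xor(body, _stream(key, len(body)))


def to_bits(data: bytes):
    return [(byte >> i) & 1 for byte in data for i in range(8)]


def from_bits(bits):
    return bytes(sum(bits[j + i] << i for i in range(8))
                 for j in range(0, len(bits), 8))


def share(secret: bytes) -> bytes:
    """M-out-of-M XOR sharing: all M shares are needed to reconstruct."""
    shares = [secrets.token_bytes(len(secret)) for _ in range(M - 1)]
    last = secret
    for sh in shares:
        last = _xor(last, sh)
    return b"".join(shares + [last])


def reconstruct(z: bytes) -> bytes:
    n = len(z) // M
    out = bytes(n)
    for i in range(M):
        out = _xor(out, z[i * n:(i + 1) * n])
    return out


def codeword_bits(msg_len: int) -> int:
    return 8 * (M * 2 * HALF + msg_len + 32)


def sample_crs(n_bits: int):
    """The CRS is modeled as a uniformly random permutation of positions."""
    perm = list(range(n_bits))
    secrets.SystemRandom().shuffle(perm)
    return perm


def encode(msg: bytes, crs):
    x = secrets.randbelow(P)                  # sk as a field element
    sk = x.to_bytes(HALF, "big")
    e = ae_encrypt(sk, msg)
    z = share(sk + pow(x, 3, P).to_bytes(HALF, "big"))
    bits = to_bits(z + e)
    return [bits[src] for src in crs]         # codeword[i] = (z||e)[crs[i]]


def decode(codeword, crs):
    bits = [0] * len(codeword)
    for i, src in enumerate(crs):             # invert the shuffling
        bits[src] = codeword[i]
    data = from_bits(bits)
    z, e = data[:M * 2 * HALF], data[M * 2 * HALF:]
    secret = reconstruct(z)
    x = int.from_bytes(secret[:HALF], "big")
    y = int.from_bytes(secret[HALF:], "big")
    if x >= P or pow(x, 3, P) != y:           # check sk^3 = sk'
        return None
    return ae_decrypt(secret[:HALF], e)
```

With `crs = sample_crs(codeword_bits(len(msg)))`, `decode(encode(msg, crs), crs)` returns `msg`, while flipping any single codeword bit leads to `None` except with negligible probability. The sketch does not capture the access-rate analysis, which depends on the share count and lengths chosen in the actual construction.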

Intuitively, the properties that we require from the inner encoding scheme, i.e., the secret sharing and shuffling of \((sk|| sk^3)\), are similar to those provided by a robust secret sharing scheme [58], which guarantees tamper detection during the reconstruction phase. In our work, we additionally require simulatability of whether the reconstructed message will be the same or \(\bot \). In Sect. 3, we present the intuition behind our construction and a formal security analysis. Our instantiation yields a rate \(1\) computationally secure MD-NMC in the CRS model, with access rate \(1-1/\varOmega (\log k)\) and codewords of length \(|s|+O(k^2 \log k)\), under mild assumptions (e.g., one-way functions).

Fig. 1 Description of the scheme in the CRS model

On the CRS. In our work, the tampering function, and consequently the codeword locations that the function is given access to, are fixed before sampling the CRS and this is critical for achieving security. However, proving security in this setting is non-trivial. In addition, the tampering function receives full access to the CRS when tampering with the codeword. This is in contrast to the work by Faust et al. [45] in the information-theoretic setting, where the (internal) tampering function receives partial information over the CRS.

In addition, our results tolerate adaptive selection of the codeword locations, with respect to the CRS, in the following way: each time the attacker requests access to a location, he also learns if it corresponds to a bit of \(z\) or \(e\), together with the index of that bit in the original string. In this way, the CRS is gradually disclosed to the adversary while picking codeword locations.

Finally, our CRS sustains a substantial amount of tampering that depends on the codeword locations chosen by the attacker: an attacker that gets access to a sensitive codeword bit is allowed to modify the part of the CRS that defines the location of that bit in the codeword. The attacker is allowed to modify all but \(O(k \log (|s| + k))\) bits of the CRS, which has length \(O(k^2 \log k \log (|s|+k))\). To our knowledge, this is the first construction that tolerates even partial modification of the CRS. In contrast, existing constructions in the CRS model either use NIZKs [35, 41, 43, 55] or are based on the knowledge of exponent assumption [51]; thus, tampering access to the CRS might compromise security.

Removing the CRS. A first approach would be to store the CRS inside the codeword together with \(P_{\varSigma }(z||e)\) and give the attacker read/write access to it. However, the tampering function, besides getting direct (partial) access to the encoding of \(sk\), also gets indirect access to it by (partially) controlling the CRS. Then, it can modify the CRS in a way such that, during decoding, ciphertext locations of its choice will be treated as bits of the inner encoding, \(z\), increasing the tampering rate against \(z\) significantly. This makes the task of protecting \(sk\) hard, if not impossible (unless we restrict the access rate significantly).

To handle this challenge, we embed the structure recovering mechanism inside the codeword and we emulate the CRS effect by increasing the size of the alphabet, giving rise to a block-wise structure.Footnote 9 Notice that non-malleable codes with large alphabet size (i.e., \(\textrm{poly}(k)+|s|\) bits) might be easy to construct, as we can embed in each codeword block the verification key of a signature scheme together with a secret share of the message, as well as a signature over the share. In this way, partial access over the codeword does not compromise the security of the signature scheme while the message remains private, and the simulation is straightforward. This approach, however, comes with a large overhead, decreasing the information rate and access rate of the scheme significantly. In general, and similar to error correcting codes, we prefer smaller alphabet sizes: the larger the size, the coarser the access structure that is required, i.e., in order to access individual bits we need to access the blocks that contain them. In this work, we aim at minimizing this restriction by using small alphabets as below.

Our approach to the problem is the following. We increase the alphabet size to \(O(\log k)\) bits, and we consider two types of blocks: (i) sensitive blocks, in which we store the inner encoding, \(z\), of the secret key, \(sk\), and (ii) non-sensitive blocks, in which we store the ciphertext, \(e\), that is fragmented into blocks of size \(O(\log k)\). The first bit of each block indicates whether it is a sensitive block, i.e., we set it to 1 for sensitive blocks and to 0, otherwise. Our encoder works as follows: on input message \(s\), it computes \(z\), \(e\), as in the previous scheme and then uses sampling without replacement to generate the indices, \(\rho _1,\ldots ,\rho _{|z|}\), for the sensitive blocks. Then, for every \(i \in \{1,\ldots ,|z|\}\), \(\rho _i\) is a sensitive block, with contents \((1||i||z[i])\), while the remaining blocks keep ciphertext pieces of size \(O(\log k)\). Decoding proceeds as follows: on input codeword \(C=(C_1,\ldots ,C_{{\textsf{bn}}})\), for each \(i \in [{\textsf{bn}}]\), if \(C_i\) is a non-sensitive block, its data will be part of \(e\), otherwise, the last bit of \(C_i\) will be part of \(z\), as it is dictated by the index stored in \(C_i\). If the number of sensitive blocks is not the expected one, the decoder outputs \(\bot \), otherwise, \(z\), \(e\), have been fully recovered and decoding proceeds as in the previous scheme. Our scheme is depicted in Fig. 5.
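The block layout can be illustrated with a short Python sketch that takes the bit strings \(z\) and \(e\) as given and covers only the placement and recovery steps. The names `layout`, `recover` and `idx_bits` are hypothetical, indices are 0-based here, and the sketch assumes for simplicity that \(|e|\) is a multiple of the ciphertext payload per block. The rejection of duplicate or out-of-range indices is a choice of this sketch; the overview above only states the check on the number of sensitive blocks.

```python
import secrets


def layout(z, e, idx_bits):
    """Place z and e into blocks of idx_bits + 2 bits each.

    Sensitive block:     [1] + index bits + [z[i]]
    Non-sensitive block: [0] + (idx_bits + 1) ciphertext bits
    """
    payload = idx_bits + 1
    if len(z) > 2 ** idx_bits or len(e) % payload:
        raise ValueError("sketch assumes |z| <= 2^idx_bits and payload | |e|")
    total = len(z) + len(e) // payload
    # Sampling without replacement gives the positions rho_1, ..., rho_|z|.
    positions = secrets.SystemRandom().sample(range(total), len(z))
    blocks = [None] * total
    for i, rho in enumerate(positions):
        index = [(i >> b) & 1 for b in range(idx_bits)]
        blocks[rho] = [1] + index + [z[i]]
    chunks = (e[j:j + payload] for j in range(0, len(e), payload))
    for pos in range(total):
        if blocks[pos] is None:
            blocks[pos] = [0] + next(chunks)
    return blocks


def recover(blocks, z_len, idx_bits):
    """Return (z, e), or None (standing for ⊥) if the structure is invalid."""
    z = [None] * z_len
    e = []
    for blk in blocks:
        if blk[0] == 1:
            i = sum(bit << b for b, bit in enumerate(blk[1:1 + idx_bits]))
            if i >= z_len or z[i] is not None:
                return None          # out-of-range or duplicate index
            z[i] = blk[-1]
        else:
            e.extend(blk[1:])
    if any(bit is None for bit in z):
        return None                  # too few sensitive blocks
    return z, e
```

Because every sensitive block carries its own index, the decoder needs no external permutation: the position mapping travels inside the codeword, which is what replaces the CRS in this setting.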

The security of our construction is based on the fact that, due to our shuffling technique, the position mapping will not be completely overwritten by the attacker, and as we prove in Sect. 4, this suffices for protecting the inner encoding over \(sk\). We prove security of the current scheme (cf. Theorem 4.8) by a reduction to the security of the scheme in the CRS model. Our instantiation yields a rate \(1-1/\varOmega (\log k)\) MD-NMC in the standard model, with access rate \(1-1/\varOmega (\log k)\) and codewords of length \(|s|(1+1/O(\log k))+O(k^2 \log ^2 k)\), assuming one-way functions.

It is worth pointing out that the idea of permuting blocks containing sensitive and non-sensitive data was also considered by [61] in the context of list-decodable codes; however, the similarity is only in the fact that a permutation is being used at some point in the encoding process, and our objective, construction and proof are different.

Fig. 2 Description of the scheme in the standard model

Continuously non-malleable codes with light updates. We observe that the codewords of the block-wise scheme can be updated efficiently, using shuffling and refreshing operations. Based on this observation, we prove that our code is secure against continuous attacks, for a notion of security that is weaker than the original one [43], as we need to update our codeword. However, our update mechanism is using cheap operations, avoiding the full decoding and re-encoding of the message, which is the standard way to achieve continuous security [39, 55]. In addition, our solution avoids the usage of a self-destruction mechanism that produces \(\bot \) in all subsequent rounds after the first round in which the attacker creates an invalid codeword, which was originally achieved by [41], and makes an important step toward practicality.

The update mechanism works as follows: in each round, it randomly shuffles the blocks and refreshes the randomness of the inner encoding of \(sk\). The idea here is that, due to the continual shuffling and refreshing of the inner encoding scheme, in each round the attacker learns nothing about the secret key, and every attempt to modify the inner encoding results in an invalid key, with overwhelming probability. Our update mechanism can be made deterministic if we further encode a seed of a PRG together with the secret key, which is similar to the technique presented in [55].
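The two operations of a light update can be sketched as follows, assuming (as an illustration only) that the inner encoding is an XOR sharing whose shares are modeled as integers of a fixed bit width. The helper names are hypothetical, and the sketch uses fresh randomness rather than the PRG-based deterministic variant mentioned above.

```python
import secrets
from functools import reduce
from operator import xor


def refresh(shares, width):
    """Add a fresh XOR sharing of zero to the shares.

    The masks XOR to zero, so the reconstructed secret is unchanged
    while every individual share is re-randomized.
    """
    masks = [secrets.randbits(width) for _ in shares[:-1]]
    masks.append(reduce(xor, masks, 0))
    return [s ^ m for s, m in zip(shares, masks)]


def reshuffle(blocks):
    """Return the same blocks in a fresh uniformly random order."""
    out = list(blocks)
    secrets.SystemRandom().shuffle(out)
    return out
```

Neither operation decodes the message or touches the ciphertext contents, which is the sense in which the update is cheaper than a full decode and re-encode.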

Our results are presented in Sect. 5 (cf. Theorem 5.3), and the rates for the current scheme match those of the one-time secure, block-wise code.

1.3 Applications

Security against passive attackers - AONTs. Regarding the passive setting, our model and constructions find useful application in all settings where AONTs are useful (cf. [22, 23, 59, 60]), e.g., for increasing the security of encryption without increasing the key-size, for improving the efficiency of block ciphers and constructing remotely keyed encryption [22, 60], and also for constructing computationally secure secret sharing [59]. Other uses of AONTs are related to optimal asymmetric encryption padding [22].

Security against memory tampering - (Binary alphabets, Logarithmic length CRS). As with every NMC, the most notable application of the proposed model and constructions is the protection of cryptographic devices against memory tampering. Using our \(\textsc {CRS}\) based construction, we can protect a large tamperable memory with a small (logarithmic in the message length) tamper-proof memory that holds the \(\textsc {CRS}\).

The construction is as follows. Consider any device performing cryptographic operations, e.g., a smart card, whose memory is initialized when the card is being issued. Each card is initialized with an independent CRS, which is stored in a tamper-proof memory, while the codeword is stored in a tamperable memory. Due to the independence of the CRS values, it is plausible to assume that the adversary is not given access to the CRS prior to tampering with the card; the full CRS is given to the tampering function while it tampers with the codeword during computation. This idea is along the lines of the only computation leaks information model [56], where data can only be leaked during computation, i.e., the attacker learns the CRS when the devices perform computations that depend on it. We note that in this work we allow the tampering function to read the full CRS, in contrast to [45], in which the tampering function receives partial information over it (our CRS can also be tampered with, cf. the above discussion). In subsequent rounds, the CRS and the codeword are being updated by the device, which is the standard way to achieve security in multiple rounds while using a one-time NMC [39].

Security against memory tampering - (Logarithmic length alphabets, no CRS). In modern architectures, data are stored and transmitted in chunks; thus, our block-wise encoding scheme can provide tamper resilience in all these settings. For instance, consider the case of arithmetic circuits, having memory consisting of consecutive blocks storing integers. Considering adversaries that access the memory of such circuits in a block-wise manner is a plausible scenario. In terms of modeling, this is similar to tamper resilience for arithmetic circuits [47], in which the attacker, instead of accessing individual circuit wires carrying bits, accesses wires carrying integers. The case is similar for RAM computation, where the CPU operates over 32 or 64 bit registers (securing RAM programs using NMC was also considered by [34,35,36, 44]). We note that the memory segments in which the codeword blocks are stored do not have to be physically separated, as partial functions output values that depend on the whole input to which they receive access. This is in contrast to the split-state setting, in which the tampering function tampers with each state independently, and thus, the states need to be physically separated.

Security against adversarial channels. In Wiretap Channels [17, 57, 64] the goal is to communicate data privately against eavesdroppers, under the assumption that the channel between the sender and the adversary is “noisier” than the channel between the sender and the receiver. The model that we propose and our block-wise construction can be applied in this setting to provide privacy against a wiretap adversary, under the assumption that, due to the gap of noise, there is a small (of rate o(1)) fraction of symbols that are delivered intact to the receiver and dropped from the transmission to the adversary. This enables private, key-less communication between the parties, guaranteeing that the receiver will either receive the original message, or \(\bot \). In this way, the communication will be non-malleable in the sense that the receiver cannot be led to output \(\bot \) depending on any property of the plaintext. Our model allows the noise on the receiver side to depend on the transmission to the wiretap adversary, who tampers with a large (of rate \(1-o(1)\)) fraction of symbols, leading to an “active” variant of the wiretap model.

1.4 Related Work

Manipulation detection has been considered independently of the notion of non-malleability, in the seminal paper by Cramer et al. [30], who introduced the notion of algebraic manipulation detection (AMD) codes, providing security against additive attacks over the codeword. A similar notion was considered by Jafargholi and Wichs [49], called tamper detection, aiming to detect malicious modifications over the codeword, independently of how those affect the output of the decoder. Tamper detection ensures that the application of any (admissible) function to the codeword leads to an invalid decoding.

Non-malleable codes for other function classes have been extensively studied, such as constant split-state functions [26, 37] and block-wise tampering [24, 28], while the work of [2] develops beautiful connections among various function classes. Even richer classes have been studied in recent years, such as small depth circuits [12], bounded-degree polynomials over finite fields [11], and bounded polynomial time/depth functions [13, 33, 40]. The results of [11, 12] are information-theoretic, while the others [13, 33, 40] require stronger complexity/cryptographic assumptions. On the other hand, the constructions of this work rely only on the minimal cryptographic assumption, namely the existence of one-way functions.

In addition, other variants of non-malleable codes have been proposed, such as continuous non-malleable codes [43], augmented non-malleable codes [1], locally decodable/updatable non-malleable codes [25, 34,35,36, 44], and non-malleable codes with split-state refresh [41]. In [15], the authors consider AC0 circuits, bounded-depth decision trees and streaming, space-bounded adversaries. Leakage resilience was also considered as an additional feature, e.g., by [25, 36, 41, 53, 55].

A related line of work in tamper resilience aims to protect circuit computation against tampering attacks on circuit wires [31, 32, 46, 48] or gates [9, 10, 54] and at protecting circuits against hardware Trojans, while [18] relies on trusted hardware. In this setting, using non-malleable codes for protecting the circuit’s private memory is an option; still, in order to achieve security, the encoding and decoding procedures should be protected against fault injection attacks using the techniques from [31, 32, 46, 48, 54]. The work of [52] is the first that constructs (one-time) NMCs for the class of partial functions that tamper with almost the entire codeword. Whether NMCs could be useful in secure messaging remains an interesting open question [6,7,8].

2 Preliminaries

In this section, we present basic definitions and notation that will be used throughout the paper.

Definition 2.1

(Notation) Let t, i, j be nonnegative integers. Then, [t] is the set \(\{1,\ldots ,t\}\). For bit strings x, y, x||y is the concatenation of x and y, |x| denotes the length of x, for \(i \in [|x|]\), x[i] is the i-th bit of x, and for \(i \le j\), \(x[i:j]=x[i] || \ldots || x[j]\). For a set I, |I|, \({\mathcal {P}}(I)\), are the cardinality and power set of I, respectively, and for \(I \subseteq [|x|]\), \(x_{|_{I}}\) is the projection of the bits of x with respect to I. For a string variable c and value v, \(c \leftarrow v\) denotes the assignment of v to c, and \(c[I] \leftarrow v\), denotes an assignment such that \(c_{|_{I}}\) equals v. For a distribution D over a set \({\mathcal {X}}\), \(x \leftarrow D\) denotes sampling an element \(x \in {\mathcal {X}}\), according to D, \(x \leftarrow {\mathcal {X}}\) denotes sampling a uniform element x from \({\mathcal {X}}\), \(U_{{\mathcal {X}}}\) denotes the uniform distribution over \({\mathcal {X}}\) and \(x_1,\ldots ,x_t {\mathop {\leftarrow }\limits ^{{\textsf{rs}}}}{\mathcal {X}}\) denotes sampling a uniform subset of \({\mathcal {X}}\) with t distinct elements, using rejection sampling. The statistical distance between two random variables \(X,\ Y\), is denoted by \(\varDelta (X,Y)\), “\(\approx \)” and “\(\approx _c\)”, denote statistical and computational indistinguishability, respectively, and \({\textsf{negl}}(k)\) denotes an unspecified, negligible function, in k.

2.1 Non-malleable Codes

Below, we define coding schemes, based on the definitions of [39, 55].

Definition 2.2

(Coding scheme [39]) A \((\kappa ,\nu )\)-coding scheme, \(\kappa , \nu \in {\mathbb {N}}\), is a pair of algorithms \(({\textsf{Enc}},{\textsf{Dec}})\) such that: \({\textsf{Enc}}: \{0,1\}^{\kappa } \rightarrow \{0,1\}^\nu \) is an encoding algorithm, \({\textsf{Dec}}: \{0,1\}^\nu \rightarrow \{0,1\}^{\kappa } \cup \{\bot \}\) is a decoding algorithm, and for every \(s \in \{0,1\}^{\kappa }\), \(\Pr [{\textsf{Dec}}({\textsf{Enc}}(s))=s]=1\), where the probability runs over the randomness used by \(({\textsf{Enc}},{\textsf{Dec}})\).

We can easily generalize the above definition for larger alphabets, i.e., by considering \({\textsf{Enc}}: \{0,1\}^{\kappa } \rightarrow \varGamma ^\nu \) and \({\textsf{Dec}}: \varGamma ^\nu \rightarrow \{0,1\}^{\kappa } \cup \{\bot \}\), for some alphabet \(\varGamma \).

Definition 2.3

(Coding scheme in the Common Reference String (CRS) Model [55]) A \((\kappa ,\nu )\)-coding scheme in the CRS model, \(\kappa ,\nu \in {\mathbb {N}}\), is a triple of algorithms \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\) such that: \({\textsf{Init}}\) is a randomized algorithm which receives \(1^{k}\), where k denotes the security parameter, and produces a common reference string \(\varSigma \in \{0,1\}^{\textrm{poly}(k)}\), and \(({\textsf{Enc}}(1^{k},\varSigma ,\cdot ),{\textsf{Dec}}(1^{k},\varSigma ,\cdot ))\) is a \((\kappa ,\nu )\)-coding scheme, \(\kappa ,\nu =\textrm{poly}(k)\).

For brevity, \(1^{k}\) will be omitted from the inputs of \({\textsf{Enc}}\) and \({\textsf{Dec}}\).

Below we define non-malleable codes with manipulation detection, which is a stronger notion than the one presented in [39], in the sense that the tampered codeword will always decode to the original message or to \(\bot \). Our definition is with respect to alphabets, as in Sect. 4 we consider alphabets of size \(O(\log k)\).

Definition 2.4

(Non-Malleability with Manipulation Detection (MD-NMC)) Let \(\varGamma \) be an alphabet, let \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\) be a \((\kappa ,\nu )\)-coding scheme in the common reference string model, and \(\mathcal {F}\) be a family of functions \(f: \varGamma ^\nu \rightarrow \varGamma ^\nu \). For any \(f \in \mathcal {F}\) and \(s\in \{0,1\}^\kappa \), define the tampering experiment

$$\begin{aligned} {\textsf{Tamper}}^{f}_{s}:= \left\{ \begin{array}{c} \varSigma \leftarrow {\textsf{Init}}(1^k), c\leftarrow {\textsf{Enc}}(\varSigma ,s), {\tilde{c}}\leftarrow f_{\varSigma }(c), \tilde{s}\leftarrow {\textsf{Dec}}(\varSigma ,{\tilde{c}}) \\ \text {Output}: \tilde{s}. \end{array} \right\} \end{aligned}$$

which is a random variable over the randomness of \({\textsf{Enc}}\), \({\textsf{Dec}}\) and \({\textsf{Init}}\). The coding scheme \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\) is non-malleable with manipulation detection with respect to the function family \(\mathcal {F}\), if for all sufficiently large k and for all \(f \in \mathcal {F}\), there exists a distribution \(D_{(\varSigma ,f)}\) over \(\{0,1\}^{\kappa } \cup \{\bot , {\textsf{same}}^*\}\), such that for all \(s \in \{0,1\}^{\kappa }\), we have:

$$\begin{aligned} \left\{ {\textsf{Tamper}}^{f}_s \right\} _{k \in {\mathbb {N}}} \approx \left\{ \begin{array}{c} {\tilde{s}} \leftarrow D_{(\varSigma ,f)} \\ \text {Output }s\text { if }{\tilde{s}}={\textsf{same}}^*\text {, and }\bot \text { otherwise} \end{array} \right\} _{k \in {\mathbb {N}}} \end{aligned}$$

where \(\varSigma \leftarrow {\textsf{Init}}(1^k)\) and \(D_{(\varSigma ,f)}\) is efficiently samplable given access to f, \(\varSigma \). Here, “\(\approx \)” may refer to statistical, or computational, indistinguishability.

In the above definition, \(f\) is parameterized by \(\varSigma \) to differentiate tamper-proof input, i.e., \(\varSigma \), from tamperable input, i.e., \(c\).

2.2 Partial Functions

Below we define the tampering function class that will be used throughout the paper.

Definition 2.5

(The class of partial functions \(\mathcal {F}_{\varGamma }^{\alpha \nu }\) (or \(\mathcal {F}^{\alpha }\))) Let \(\varGamma \) be an alphabet, \(\alpha \in [0,1)\) and \(\nu \in {\mathbb {N}}\). Any \(f \in \mathcal {F}_{\varGamma }^{\alpha \nu }\), \(f: \varGamma ^{\nu } \rightarrow \varGamma ^{\nu }\), is indexed by a set \(I \subseteq [\nu ]\), \(|I| \le \alpha \nu \), and a function \(f': \varGamma ^{\alpha \nu } \rightarrow \varGamma ^{\alpha \nu }\), such that for any \(x \in \varGamma ^\nu \), \(\left( f(x) \right) _{|_{I}} = f'\left( x_{|_{I}} \right) \) and \(\left( f(x) \right) _{|_{I^{{\textsf{c}}}}} = x_{|_{I^{{\textsf{c}}}}}\), where \(I^{{\textsf{c}}}:= [\nu ] \backslash I\).

For simplicity, in the rest of the text we will use the notation f(x) and \(f(x_{|_{I}})\) (instead of \(f'\left( x_{|_{I}} \right) \)). Also, the length of the codeword, \(\nu \), according to \(\varGamma \), will be omitted from the notation, and whenever \(\varGamma \) is omitted we assume that \(\varGamma =\{0,1\}\). In Sect. 3, we consider \(\varGamma =\{0,1\}\), while in Sect. 4, \(\varGamma =\{0,1\}^{O(\log k)}\), i.e., the tampering function operates over blocks of size \(O(\log k)\). When considering the CRS model, the functions are parameterized by the common reference string.
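To make Definition 2.5 concrete, the following is a minimal Python sketch of how a partial function acts on a codeword. The helper name `apply_partial` and the 0-based indexing are our own illustrative choices (the paper indexes positions from 1), and symbols are modeled as plain integers.

```python
from typing import Callable, List, Sequence

def apply_partial(x: Sequence[int], I: Sequence[int],
                  f_prime: Callable[[List[int]], List[int]]) -> List[int]:
    """Apply the partial function indexed by (I, f_prime) to the codeword x.

    Hypothetical helper: positions are 0-based here, unlike the paper.
    """
    idx = sorted(set(I))
    view = [x[i] for i in idx]        # the projection of x onto I
    new = f_prime(view)               # f' reads and rewrites only that projection
    if len(new) != len(idx):
        raise ValueError("f' must preserve the length of its input")
    out = list(x)                     # positions outside I are left untouched
    for pos, sym in zip(idx, new):
        out[pos] = sym
    return out

if __name__ == "__main__":
    x = [1, 0, 1, 1, 0, 0]
    I = [0, 2, 3]                     # |I| = 3, i.e., access rate alpha = 0.5
    print(apply_partial(x, I, lambda v: [1 - b for b in v]))
```

The output of `f_prime` may depend on all symbols it reads jointly, which is what distinguishes partial functions from bit-wise or split-state tampering.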

The following lemma is useful for proving security throughout the paper.

Lemma 2.6

Let \(({\textsf{Enc}},{\textsf{Dec}})\) be a \((\kappa ,\nu )\)-coding scheme and \(\mathcal {F}\) be a family of functions. For every \(f\in \mathcal {F}\) and \(s\in \{0,1\}^{\kappa }\), define the tampering experiment

$$\begin{aligned} {\textsf{Tamper}}^f_s:= \left\{ \begin{array}{c} c \leftarrow {\textsf{Enc}}(s), {\tilde{c}} \leftarrow f(c), {\tilde{s}} \leftarrow {\textsf{Dec}}({\tilde{c}}) \\ \text {Output }{\textsf{same}}^*\text { if }{\tilde{s}}=s\text {, and }{\tilde{s}}\text { otherwise}. \end{array} \right\} \end{aligned}$$

which is a random variable over the randomness of \({\textsf{Enc}}\) and \({\textsf{Dec}}\). \(({\textsf{Enc}},{\textsf{Dec}})\) is an MD-NMC with respect to \(\mathcal {F}\), if for any \(f \in {\mathcal {F}}\) and all sufficiently large k: (i) for any pair of messages \(s_0\), \(s_1 \in \{0,1\}^{\kappa }\), \(\left\{ {\textsf{Tamper}}^f_{s_0} \right\} _{k \in {\mathbb {N}}} \approx \left\{ {\textsf{Tamper}}^f_{s_1} \right\} _{k \in {\mathbb {N}}}\), and (ii) for any \(s\), \(\Pr \left[ {\textsf{Tamper}}^f_{s} \notin \{\bot , s\} \right] \le {\textsf{negl}}(k)\). Here, “\(\approx \)” may refer to statistical, or computational, indistinguishability.

Proof

By Definition 2.4, we have that \(({\textsf{Enc}},{\textsf{Dec}})\) is an MD-NMC against \(\mathcal {F}\), if for any \(f \in {\mathcal {F}}\), there exists an efficiently samplable distribution \(D_{f}\) over \(\{0,1\}^{\kappa } \cup \{\bot ,{\textsf{same}}^*\}\), such that for any message s

$$\begin{aligned}&\left\{ \begin{array}{c} c \leftarrow {\textsf{Enc}}(s), {\tilde{c}} \leftarrow f(c), {\tilde{s}} \leftarrow {\textsf{Dec}}({\tilde{c}}) \\ \text {Output}: {\tilde{s}} \end{array} \right\} \\&\quad \approx \left\{ \begin{array}{c} {\tilde{s}} \leftarrow D_f \\ \text {Output }s\text { if }{\tilde{s}}={\textsf{same}}^*\text {, and }\bot \text { otherwise} \end{array} \right\} \end{aligned} \tag{1}$$

Let \({{\textsf{0}}}\) be the zero message in \(\{0,1\}^{\kappa }\). For any \(f \in {\mathcal {F}}\), we define \(D_f\) as follows:

  • Sample \(c \leftarrow {\textsf{Enc}}({{\textsf{0}}})\) and compute \({\tilde{c}}\leftarrow f(c)\), \(\tilde{s}\leftarrow {\textsf{Dec}}({\tilde{c}})\).

  • Output: if \(\tilde{s}={{\textsf{0}}}\), set \(\tilde{s}\leftarrow {\textsf{same}}^*\), else, \(\tilde{s}\leftarrow \bot \). Output \(\tilde{s}\).

From the above, we have that for any \(s\),

$$\begin{aligned}{} & {} \left\{ \begin{array}{c} {\tilde{s}} \leftarrow D_f \\ \\ \text {Output }s\text { if }{\tilde{s}}={\textsf{same}}^*\text {, and }\bot \text { otherwise} \end{array} \right\} \\{} & {} \quad \equiv \left\{ \begin{array}{c} \left\{ \begin{array}{c} c \leftarrow {\textsf{Enc}}({{\textsf{0}}}), {\tilde{c}} \leftarrow f(c), {\tilde{s}} \leftarrow {\textsf{Dec}}({\tilde{c}})\\ \text {if }{\tilde{s}}={{\textsf{0}}}, \tilde{s}\leftarrow {\textsf{same}}^*\text {, else, }\tilde{s}\leftarrow \bot \text {. Output }\tilde{s}\end{array} \right\} \\ \text {Output }s\text { if }{\tilde{s}}={\textsf{same}}^*\text {, and }\bot \text { otherwise} \end{array} \right\} \\{} & {} \quad \approx \left\{ \begin{array}{c} \left\{ \begin{array}{c} c \leftarrow {\textsf{Enc}}(s), {\tilde{c}} \leftarrow f(c), {\tilde{s}} \leftarrow {\textsf{Dec}}({\tilde{c}})\\ \text {if }{\tilde{s}}=s, \tilde{s}\leftarrow {\textsf{same}}^*\text {, else, }\tilde{s}\leftarrow \bot \text {. Output }\tilde{s}\end{array} \right\} \\ \text {Output }s\text { if }{\tilde{s}}={\textsf{same}}^*\text {, and }\bot \text { otherwise} \end{array} \right\} \\{} & {} \quad \approx \left\{ \begin{array}{c} c \leftarrow {\textsf{Enc}}(s), {\tilde{c}} \leftarrow f(c), {\tilde{s}}\leftarrow {\textsf{Dec}}({\tilde{c}}) \\ \text {Output}: {\tilde{s}} \end{array} \right\} , \end{aligned}$$

where the first relation follows by the definition of \(D_f\), the second one follows from the main assumption which states that for any pair of messages \(s_0\), \(s_1\), \({\textsf{Tamper}}^f_{s_0} \approx {\textsf{Tamper}}^f_{s_1}\), and the third one follows from the assumption that \(\Pr \left[ {\textsf{Tamper}}^f_{s} \notin \{\bot , s\} \right] \le {\textsf{negl}}(k)\). This concludes our proof since for any \(f \in {\mathcal {F}}\) and any message s, Relation 1 is satisfied. \(\square \)

For coding schemes in the CRS model the above lemma is similar, and \({\textsf{Tamper}}^f_{s}\) internally samples \(\varSigma \leftarrow {\textsf{Init}}(1^k)\).
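The distribution \(D_f\) built in the proof of Lemma 2.6 can be sketched in a few lines of Python. This is only an illustration under our own assumptions: messages and codewords are lists of bits, \(\bot \) is modeled as `None`, and `toy_enc`/`toy_dec` form a hypothetical duplicate-and-compare code that is not non-malleable and serves only to exercise the harness.

```python
SAME = "same*"   # models the special symbol same^*
BOT = None       # models the symbol ⊥

def sample_D_f(enc, dec, f, kappa):
    """Sample from D_f as in the proof: tamper with an encoding of 0^kappa."""
    zero = [0] * kappa
    s_tilde = dec(f(enc(zero)))
    return SAME if s_tilde == zero else BOT

def simulated_output(enc, dec, f, s):
    """Right-hand side of Relation (1): output s on same^*, and ⊥ otherwise."""
    return list(s) if sample_D_f(enc, dec, f, len(s)) == SAME else BOT

# Hypothetical toy code (NOT an MD-NMC): write the message twice and compare.
def toy_enc(s):
    return list(s) + list(s)

def toy_dec(c):
    half = len(c) // 2
    return c[:half] if c[:half] == c[half:] else BOT

if __name__ == "__main__":
    identity = lambda c: c
    flip_first = lambda c: [1 - c[0]] + c[1:]
    print(simulated_output(toy_enc, toy_dec, identity, [1, 0, 1]))
    print(simulated_output(toy_enc, toy_dec, flip_first, [1, 0, 1]))
```

Note that `sample_D_f` never touches the real message s; this is the point of the simulation-based definition.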

2.3 All-Or-Nothing-Transform

Here we present the definition of all-or-nothing-transform (AONT), by adopting the notion of [23] and presenting it with respect to coding schemes against partial codeword leakage.

Definition 2.7

Let \(\varGamma \) be an alphabet, and \(({\textsf{Enc}},{\textsf{Dec}})\) be a \((\kappa ,\nu )\) coding scheme over the alphabet \(\varGamma \). The encoding is an AONT for parameter \(\alpha \in (0,1)\) if for any pair of messages \((s_0,s_1)\) and every subset \(I \subset [\nu ]\) with cardinality \(\alpha \nu \), we have \(\left( s_0,s_1, {\textsf{Enc}}(s_0)_{|_{I}} \right) \approx \left( s_0,s_1, {\textsf{Enc}}(s_1)_{|_{I}}\right) \), where the indistinguishability may be information-theoretic or computational.

Next, we present a simple theorem, showing that MD-NMC against partial functions (with a sufficiently large access rate) implies AONT.

Theorem 2.8

Let \(\varGamma \) be an alphabet, and \(({\textsf{Enc}},{\textsf{Dec}})\) be a \((\kappa ,\nu )\) coding scheme over the alphabet \(\varGamma \). If the coding scheme is MD-NMC with respect to the partial function class \(\mathcal {F}^\alpha \) for some \(\alpha > 0.5\), then the coding scheme is an AONT with parameter \(\alpha \).

Proof

We first note that by Lemma 2.6, an equivalent formulation of MD-NMC against an \(f\in \mathcal {F}^\alpha \) can be stated as: (1) for any messages \(s_0,s_1\), we have \(\left\{ {\textsf{Tamper}}^f_{s_0} \right\} _{k \in {\mathbb {N}}} \approx \left\{ {\textsf{Tamper}}^f_{s_1} \right\} _{k \in {\mathbb {N}}}\), and (2) for any \(s\), \(\Pr \left[ {\textsf{Tamper}}^f_{s} \notin \{\bot , s\} \right] \le {\textsf{negl}}(k)\).

We prove the theorem via a reduction. Assume that there exist a subset I with \(|I| = \alpha \nu \), messages \(s_0,s_1\), and a distinguisher \({\mathcal {D}}\) that breaks the AONT security; then, we construct a reduction \({\mathcal {A}}\) that uses a carefully chosen function \(f\in \mathcal {F}^\alpha \) to break the MD-NMC security. Without loss of generality, we assume that \(s_0\) and \(s_1\) are both nonzero messages.

First we define the function f: the function has hardcoded \((s_0, s_1)\) and receives as input a partial codeword \(C^*\). f first runs the distinguisher, i.e., computes \(b^*= {\mathcal {D}}(s_0,s_1,C^*)\). If \(b^*=1\), then f outputs \({\textsf{Enc}}(0^\kappa )_{|_{I}}\); otherwise, f acts as the identity function, i.e., just outputting the input \(C^*\). Now, we prove that the coding scheme is not an MD-NMC.

First, we assume that \(\Pr \left[ {\textsf{Tamper}}^f_{s} \notin \{\bot , s\} \right] = {\textsf{negl}}(k)\) for any message s. Otherwise, the manipulation detection property is broken, implying a contradiction. Then, we claim that in either case of \(s=s_0\) or \(s=s_1\), we have \({\textsf{Tamper}}^f_{s} = \bot \), with overwhelming probability, whenever f overwrites its input. We observe that f has changed an \(\alpha >0.5\) fraction of the codeword into \({\textsf{Enc}}(0^\kappa )_{|_{I}}\). This effect is the same as that of a related tampering function \(g\in \mathcal {F}^\alpha \) that changes a \((1-\alpha ) < 0.5\) fraction of \({\textsf{Enc}}(0^\kappa )\) into \({\textsf{Enc}}(s)\) on the set \([\nu ]\setminus I\), meaning that \({\textsf{Tamper}}^f_{s} \approx {\textsf{Tamper}}^g_{0^\kappa }\). By assumption, we have \(\Pr \left[ {\textsf{Tamper}}^f_{s} \in \{\bot , s\} \right] = \Pr \left[ {\textsf{Tamper}}^g_{0^\kappa } \in \{\bot , 0^\kappa \} \right] = 1- {\textsf{negl}}(k)\), implying that \({\textsf{Tamper}}^f_{s} = {\textsf{Tamper}}^g_{0^\kappa } = \bot \), with overwhelming probability.

Next, we show that \(\left\{ {\textsf{Tamper}}^f_{s_0} \right\} _{k \in {\mathbb {N}}} \not \approx \left\{ {\textsf{Tamper}}^f_{s_1} \right\} _{k \in {\mathbb {N}}}\). We notice that for \(b\in \{0,1\}\), \({\textsf{Tamper}}^f_{s_b}\) outputs \(\bot \) with probability the same as that of \({\mathcal {D}}(s_0,s_1, {\textsf{Enc}}(s_b)_{|_{I}})\) outputting 1. As the distinguisher \({\mathcal {D}}\) has a non-negligible gap outputting 1 between \(b=0\) and \(b=1\), the two tampering experiments can be distinguished with non-negligible probability. This breaks the non-malleability property of the coding scheme, reaching a contradiction. \(\square \)

By the above theorem, we derive that any MD-NMC code against partial functions with sufficiently large access rate, i.e., for \(\mathcal {F}^\alpha \) with \(\alpha > 0.5\), is also an AONT with the same parameter \(\alpha \). On the other hand, we notice that \(\alpha > 0.5\) is necessary for the theorem, as otherwise we can construct a simple counterexample. First, we observe that the repetition code with majority decoding is an MD-NMC against \(\mathcal {F}^\beta \) for any \(\beta < 0.5\), as no tampering function in this class can change the outcome of the decoding. However, this code is certainly not an AONT, as reading one bit is sufficient to recover the underlying message.
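The counterexample can be checked with a short Python sketch, assuming a one-bit message and a codeword of \(\nu \) identical bits. The function names below are ours and purely illustrative.

```python
def rep_encode(bit, nu):
    """Repetition code: the one-bit message is written nu times."""
    return [bit] * nu

def rep_decode(c):
    """Majority decoding (a tie decodes to 0)."""
    return 1 if 2 * sum(c) > len(c) else 0

def tamper_minority(c, value):
    """Overwrite strictly fewer than half of the positions with `value`."""
    t = (len(c) - 1) // 2
    return [value] * t + list(c[t:])

if __name__ == "__main__":
    nu = 9
    for bit in (0, 1):
        c = rep_encode(bit, nu)
        # Tampering with a minority of positions never changes the decoding...
        assert rep_decode(tamper_minority(c, 1 - bit)) == bit
        # ...yet a single position already reveals the message (no AONT security).
        assert c[0] == bit
    print("ok")
```

The same access pattern (fewer than half of the positions) that is harmless for tampering is thus fatal for privacy, which is why the theorem needs \(\alpha > 0.5\).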

2.4 One-Time Authenticated Encryption

Below, we define the security notion of authenticated encryption required by our construction of non-malleable codes.

Definition 2.9

(Authenticated encryption) Let k be the security parameter and \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) be a symmetric encryption scheme. Then, \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is authenticated, semantically secure, and secure against (1-query) key recovery attacks, if it satisfies the following properties:

  1. (Correctness): For every message s, \(\Pr [{{\textsf{D}}}_{sk}({{\textsf{E}}}_{sk}(s))=s]=1\), where \(sk\leftarrow {\textsf{KGen}}(1^k)\).

  2. (Semantic security): For any pair of messages \(s_{0}\), \(s_{1}\), \( \Big ({{\textsf{E}}}_{sk}(s_0) \Big ) \approx \Big ({{\textsf{E}}}_{sk}(s_1) \Big ) \), where \( sk\leftarrow {\textsf{KGen}}(1^k)\).

  3. (1-query key recovery security): For any message \(s\) and any adversary \({\mathcal {A}}\), we have

    $$\begin{aligned} \Pr \left[ sk' = sk\left| \begin{array}{c} sk\leftarrow {\textsf{KGen}}(1^k);\\ e\leftarrow {{\textsf{E}}}_{sk}(s); sk' \leftarrow {\mathcal {A}}(e) \end{array} \right. \right] \le {\textsf{negl}}(k). \end{aligned}$$

  4. (Unforgeability): For any algorithm \({\mathcal {A}}=({\mathcal {A}}_1,{\mathcal {A}}_2)\),

    $$\begin{aligned} \Pr \left[ {\tilde{e}}\ne e\wedge {{\textsf{D}}}_{sk}({\tilde{e}}) \ne \bot \left| \begin{array}{c} sk\leftarrow {\textsf{KGen}}(1^k); (s,st) \leftarrow {\mathcal {A}}_1(1^k);\\ e\leftarrow {{\textsf{E}}}_{sk}(s); {\tilde{e}}\leftarrow {\mathcal {A}}_2(e,st) \end{array} \right. \right] \le {\textsf{negl}}(k). \end{aligned}$$

When the scheme is computationally secure, we consider computational indistinguishability instead of statistical, and \({\mathcal {A}}\) is PPT.

We note that semantic security and key-recovery security are incomparable in general: key-recovery security clearly does not imply semantic security, and the converse fails as well, e.g., if the key space is small (as in a one-time pad with one-bit messages and keys), the key can always be guessed correctly with probability 1/2. Nevertheless, below we instantiate two simple schemes that satisfy key-recovery security.

Next we provide the definition of one-time message authentication code (MAC) following [50].

Definition 2.10

(One-time MAC [50]) Let k be the security parameter. A message authentication code \(\varPi =({\textsf{Gen}},{\textsf{Mac}},{\textsf{Vrfy}})\) is one-time \(\epsilon \)-secure, if for all algorithms \({\mathcal {A}}=({\mathcal {A}}_1,{\mathcal {A}}_2)\),

$$\begin{aligned} \Pr [{\mathsf {Mac-forge}}_{{\mathcal {A}},\varPi }(k)=1] \le \epsilon , \end{aligned}$$

where,

$$\begin{aligned} \begin{array}{l} {\mathsf {Mac-forge}}_{{\mathcal {A}},\varPi }(k):\\ sk\leftarrow {\textsf{Gen}}(1^k)\\ (s,st) \leftarrow {\mathcal {A}}_1(1^k)\\ t \leftarrow {\textsf{Mac}}_{sk}(s)\\ ({{\tilde{s}}}, {{\tilde{t}}}) \leftarrow {\mathcal {A}}_2(t,st)\\ \text {Output 1 if }{\textsf{Vrfy}}_{sk}({{\tilde{s}}},{{\tilde{t}}})=1\text { and }{{\tilde{s}}} \ne s. \end{array} \end{aligned}$$
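As an illustration of the experiment above, the following Python sketch runs \({\mathsf {Mac-forge}}\) against a standard textbook one-time MAC, \(t = a \cdot s + b \bmod p\), which is \(1/p\)-secure for messages in \({\mathbb {Z}}_p\). This particular MAC, the prime \(p = 2^{127}-1\), and the function names are our own assumptions for the sake of the example; they are not taken from [50] or from the constructions of this work, and the input \(1^k\) of \({\mathcal {A}}_1\) is left implicit.

```python
import secrets

P = 2**127 - 1   # a Mersenne prime; an illustrative choice of field size

def gen():
    """Key generation: a uniformly random pair (a, b) over Z_P."""
    return secrets.randbelow(P), secrets.randbelow(P)

def mac(sk, s):
    a, b = sk
    return (a * s + b) % P

def vrfy(sk, s, t):
    return 1 if mac(sk, s) == t else 0

def mac_forge(adv1, adv2):
    """The Mac-forge experiment: one tag is revealed, then a forgery is attempted."""
    sk = gen()
    s, st = adv1()
    t = mac(sk, s)
    s_new, t_new = adv2(t, st)
    return 1 if vrfy(sk, s_new, t_new) == 1 and s_new != s else 0

if __name__ == "__main__":
    # A naive adversary that replays the tag on a different message.
    # It wins only if a = 0, i.e., with probability 1/P.
    print(mac_forge(lambda: (5, None), lambda t, st: (6, t)))
```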

Below we describe two instantiations of one-time authenticated encryption; the first is a computationally secure rate 1 scheme, while the latter is information-theoretically secure with a lower rate.

Instantiation 2.11

(Computationally secure authenticated encryption) Let \(F_r\) be a pseudo-random function, \(F_r: \{0,1\}^{k} \rightarrow \{0,1\}^{k}\), let \({\textsf{PRG}}\) be a pseudo-random generator, \({\textsf{PRG}}: \{0,1\}^{k} \rightarrow \{0,1\}^{|s|}\), and let \(({\textsf{MKGen}},{\textsf{Mac}},{\textsf{Vrfy}})\) be a message authentication code that outputs tags of length k (cf. [50]). We define a symmetric encryption scheme \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\), as follows:

  • \({\textsf{KGen}}(1^k)\): sample \(r \leftarrow \{0,1\}^{k}\), \(mk\leftarrow {\textsf{MKGen}}(1^k)\) and output \(sk=(r,mk)\).

  • \({{\textsf{E}}}_{sk}(\cdot )\): On input \(s\), sample \(\tau \leftarrow \{0,1\}^k\), set \(e=\left( {\textsf{PRG}}(F_r(\tau )) \oplus s, \tau \right) \), \(t={\textsf{Mac}}_{mk}(e)\), and output \((e,t)\).

  • \({{\textsf{D}}}_{sk}(\cdot )\): On input \((e,t)\), if \({\textsf{Vrfy}}_{mk}(e,t)=1\), parse \(e\) as \((e',\tau )\) and output \(s= \left( {\textsf{PRG}}(F_r(\tau ) )\oplus e' \right) \), otherwise output \(\bot \).

It is not hard to see that the scheme defined above is a rate 1, computationally secure authenticated encryption scheme [50]. Semantic security follows from the security of the PRF and the PRG. From the security of the pseudo-random function, the scheme is also secure against 1-query key recovery attacks. In particular, from any adversary that breaks the 1-query key recovery security, we can easily derive a reduction that inverts the PRF, i.e., recovers the key r given \((F_r(\tau ), \tau )\). The reduction can then be used to distinguish \(F_r(\cdot )\) from a truly random function given oracle access, as a truly random function cannot be compressed into a short key r, following the compression argument.
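The structure of Instantiation 2.11 can be sketched with the Python standard library. The concrete primitives are our own stand-ins and not part of the instantiation: HMAC-SHA256 plays the role of both \(F_r\) and the MAC, SHAKE-256 plays the role of \({\textsf{PRG}}\), and k is fixed to 256 bits. The sketch illustrates the data flow only and makes no security claim.

```python
import hashlib
import hmac
import secrets

K = 32  # key, nonce and tag length in bytes (assumption: k = 256 bits)

def kgen():
    """sk = (r, mk): a PRF key and a MAC key."""
    return secrets.token_bytes(K), secrets.token_bytes(K)

def _pad(r, tau, n):
    seed = hmac.new(r, tau, hashlib.sha256).digest()   # stand-in for F_r(tau)
    return hashlib.shake_256(seed).digest(n)           # stand-in for PRG, n bytes

def encrypt(sk, s):
    r, mk = sk
    tau = secrets.token_bytes(K)
    body = bytes(a ^ b for a, b in zip(s, _pad(r, tau, len(s))))
    e = body + tau                                     # e = (PRG(F_r(tau)) xor s, tau)
    t = hmac.new(mk, e, hashlib.sha256).digest()       # t = Mac_mk(e)
    return e, t

def decrypt(sk, e, t):
    r, mk = sk
    expected = hmac.new(mk, e, hashlib.sha256).digest()
    if len(e) < K or not hmac.compare_digest(t, expected):
        return None                                    # models ⊥
    body, tau = e[:-K], e[-K:]
    return bytes(a ^ b for a, b in zip(body, _pad(r, tau, len(body))))

if __name__ == "__main__":
    sk = kgen()
    e, t = encrypt(sk, b"attack at dawn")
    print(decrypt(sk, e, t))
```

The ciphertext is longer than the message by the nonce and the tag only, i.e., by \(2k\) bits, which is what makes the scheme rate 1 for long messages.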

Next we describe a simple one-time information theoretic construction.

Instantiation 2.12

(Information-theoretically secure authenticated encryption) Let \({\mathcal {H}}\), \({\bar{{\mathcal {H}}}}\), be pair-wise independent hash function families, such that for any \(h \in {\mathcal {H}}\), \(h: \{0,1\}^{O(|s|)} \rightarrow \{0,1\}^{|s|}\) and for any \({\bar{h}} \in {\bar{{\mathcal {H}}}}\), \({{\bar{h}}}: \{0,1\}^{O(|s|)} \rightarrow \{0,1\}^{|s|}\). We define a symmetric encryption scheme \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\), as follows:

  • \({\textsf{KGen}}(1^k)\): sample \(h \leftarrow {\mathcal {H}}\), \({{\bar{h}}} \leftarrow {\bar{{\mathcal {H}}}}\) and set \(sk= (h, {{\bar{h}}})\).

  • \({{\textsf{E}}}_{sk}(\cdot )\): On input \(s\), sample \(r \leftarrow \{0,1\}^{|s|}\), set \(e=(r||(h(r) + s))\) and output \((e, {{\bar{h}}}(e))\).

  • \({{\textsf{D}}}_{sk}(\cdot )\): On input \((e,t)\), if \({{\bar{h}}} (e)=t\), parse \(e\) as \((r || e')\) and output \(s= h(r) + e' \), otherwise output \(\bot \).

It is easy to verify that the security (i.e., semantic security and unforgeability) of the above scheme follows from the pair-wise independence of \({\mathcal {H}}\), \({{\bar{{\mathcal {H}}}}}\). As long as \(|h| > |s| + k\), the conditional entropy of \(sk\) given e is still greater than k, meaning that an information-theoretic adversary has at most \(2^{-k}\) probability of predicting \(sk\) successfully.
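A minimal Python sketch of Instantiation 2.12 follows, under assumptions of our own: messages are elements of \({\mathbb {Z}}_p\) for \(p = 2^{127}-1\), the hash families are the affine maps \(x \mapsto ax+b\) and \((x_1,x_2) \mapsto a_1x_1+a_2x_2+b\) over \({\mathbb {Z}}_p\) (both pair-wise independent), and decryption subtracts the mask, since addition modulo p is not self-inverse as it is over binary strings. The sketch does not attempt to meet the key-length condition \(|h| > |s| + k\) discussed above.

```python
import secrets

P = 2**127 - 1   # a Mersenne prime; an illustrative choice of field size

def kgen():
    """sk = (h, h_bar): keys of two affine hash functions over Z_P."""
    h = (secrets.randbelow(P), secrets.randbelow(P))
    h_bar = (secrets.randbelow(P), secrets.randbelow(P), secrets.randbelow(P))
    return h, h_bar

def _h(h, x):
    a, b = h
    return (a * x + b) % P

def _h_bar(h_bar, e):
    a1, a2, b = h_bar
    return (a1 * e[0] + a2 * e[1] + b) % P

def encrypt(sk, s):
    h, h_bar = sk
    r = secrets.randbelow(P)
    e = (r, (_h(h, r) + s) % P)       # e = r || (h(r) + s)
    return e, _h_bar(h_bar, e)        # the tag is h_bar(e)

def decrypt(sk, e, t):
    h, h_bar = sk
    if _h_bar(h_bar, e) != t:
        return None                   # models ⊥
    r, masked = e
    return (masked - _h(h, r)) % P    # remove the mask h(r)

if __name__ == "__main__":
    sk = kgen()
    e, t = encrypt(sk, 123456789)
    print(decrypt(sk, e, t))
```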

2.5 Secret Sharing

In this section, we present the definition and a concrete instantiation of the m-out-of-m secret sharing scheme later used in this work.

Definition 2.13

An m-out-of-m secret sharing scheme consists of two algorithms \(({\textsf{SS}}_{m},{\textsf{Rec}}_{m})\) that work as follows:

  • \({\textsf{SS}}_{m}\): on input x outputs shares \((z_1,\dots , z_m)\). Denote \(z = z_1 \Vert \dots \Vert z_m\), where \(\Vert \) denotes concatenation.

  • \({\textsf{Rec}}_{m}\): on input shares \((z_1,\dots , z_m)\) (denoted as z as above) outputs the message x.

The correctness requires that \({\textsf{Rec}}_{m}({\textsf{SS}}_{m}(x) ) = x\) holds with probability 1. The security requires that the message x is information-theoretically hidden given any proper subset of shares \((z_1,\dots , z_m)\).

Instantiation. Given any finite field \({\textbf{GF}}(p^e)\), where p is a prime and e is a positive integer, there is a simple additive m-out-of-m secret sharing scheme that works as follows. \({\textsf{SS}}_{m}\) takes as input \(x\in {\textbf{GF}}(p^e)\) and samples uniformly random shares \((z_1,\dots , z_m)\), where each \(z_i \in {\textbf{GF}}(p^e)\), subject to \(x = \sum _{i\in [m]} z_i \). Similarly, \({\textsf{Rec}}_{m}\) takes as input \((z_1,\dots , z_m)\) and outputs \(\sum _{i\in [m]} z_i \). It is easy to verify that both correctness and security hold.

Remark. In this work, we always refer to the particular instantiation above as \(({\textsf{SS}}_{m},{\textsf{Rec}}_{m})\). Moreover, to simplify the presentation, we use \({\textsf{SS}}_{m}(x \Vert y)\) to denote \({\textsf{SS}}_{m}(x) \Vert {\textsf{SS}}_{m}(y)\) for sharing multiple inputs. In this case, the reconstruction works analogously.
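For characteristic 2, where field addition is bit-wise XOR, the additive scheme can be sketched as follows. The names `ss` and `rec` and the representation of field elements as `nbits`-bit Python integers are our own illustrative choices.

```python
import secrets
from functools import reduce

def ss(x, m, nbits):
    """Additive m-out-of-m sharing of an nbits-bit value x (addition is XOR)."""
    if not 0 <= x < (1 << nbits):
        raise ValueError("x does not fit in nbits bits")
    shares = [secrets.randbits(nbits) for _ in range(m - 1)]   # m-1 uniform shares
    last = reduce(lambda a, b: a ^ b, shares, x)               # forces the XOR to be x
    return shares + [last]

def rec(shares):
    """Reconstruction: add (XOR) all m shares."""
    return reduce(lambda a, b: a ^ b, shares, 0)

if __name__ == "__main__":
    z = ss(0b1011, m=5, nbits=4)
    print(rec(z) == 0b1011)
```

Any \(m-1\) of the shares are jointly uniform and independent of x, which gives the information-theoretic hiding required by Definition 2.13.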

3 An MD-NMC for Partial Functions, in the CRS Model

In this section, we consider \(\varGamma =\{0,1\}\) and we construct a rate 1 MD-NMC for \({\mathcal {F}}^{\alpha }\), with access rate \(\alpha = 1-1/\varOmega (\log k)\).

Before presenting the encoding scheme for \({\mathcal {F}}^{\alpha }\), we provide the intuition behind the construction. As a starting point, we consider a naive scheme (which does not work) and then show how we resolve all the challenges. Let \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) be a (symmetric) authenticated encryption scheme and consider the following encoding scheme: to encode a message \(s\), the encoder computes \((sk||e)\), where \(e\leftarrow {{\textsf{E}}}_{sk}(s)\) is the ciphertext and \(sk\leftarrow {\textsf{KGen}}(1^k)\) is the secret key. We observe that the scheme is secure if the tampering function can only read/write on the ciphertext, \(e\), assuming the authenticity property of the encryption scheme. However, restricting access to \(sk\), which is short, is unnatural and makes the problem trivial. On the other hand, even partial access to \(sk\) compromises the authenticity property of the scheme, and even if there is no explicit attack against the non-malleability property of the code, there is no hope of proving security based on the properties of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) in a black-box way.

A solution to the above problems would be to protect the secret key using an inner encoding, yet the amount of tampering is now restricted by the capabilities of the inner scheme, as the attacker knows the exact locations of the “sensitive” codeword bits, i.e., the non-ciphertext bits. In the proposed construction, we manage to protect the secret key while avoiding the bottleneck on the access rate, by designing an inner encoding scheme that provides limited security guarantees when used standalone, still when it is used in conjunction with a shuffling technique that permutes the inner encoding and ciphertext bit locations, it guarantees that any attack against the secret key will create an invalid encoding with overwhelming probability, even when allowing access to almost the entire codeword.

The proposed scheme is depicted in Fig. 3 and works as follows: on input message \(s\), the encoder (i) encrypts the message by computing \(sk\leftarrow {\textsf{KGen}}(1^k)\) and \(e\leftarrow {{\textsf{E}}}_{sk}(s)\), (ii) computes an m-out-of-m secret sharing, \(z\), of \((sk|| sk^3)\) (interpreting both \(sk\) and \(sk^{3}\) as elements in some finite field), and (iii) outputs a random shuffling of \((z|| e)\), denoted as \(P_{\varSigma }(z||e)\), according to the common reference string, \(\varSigma \). Decoding proceeds as follows: on input \(c\), the decoder (i) inverts the shuffling operation by computing \((z|| e) \leftarrow P^{-1}_{\varSigma }(c)\), (ii) reconstructs \((sk|| sk')\), and (iii) if \(sk^3 = sk'\), it outputs \({{\textsf{D}}}_{sk}(e)\), otherwise, it outputs \(\bot \). The proposed instantiation yields a rate \(1\) computationally secure MD-NMC in the CRS model, with access rate \(1-1/\varOmega (\log k)\) and codewords of length \(|s|+O(k^2 \log k)\), under mild assumptions, e.g., one-way functions.

Below, we formally define our construction.

Fig. 3. Description of the MD-NMC scheme in the CRS model

Construction 3.1

Let k, \(m \in {\mathbb {N}}\), let \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) be a symmetric encryption scheme, let \(({\textsf{SS}}_{m},{\textsf{Rec}}_{m})\) be an m-out-of-m secret sharing scheme as in Sect. 2.5, and let \(l \leftarrow 2\,m|sk|\), where \(sk\) follows \({\textsf{KGen}}(1^k)\). We define an encoding scheme \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\) that outputs \(\nu =l+|e|\) bits, where \(e\leftarrow {{\textsf{E}}}_{sk}(s)\), as follows:

  • \({\textsf{Init}}(1^k)\): Sample \(r_1,\ldots ,r_l {\mathop {\leftarrow }\limits ^{{\textsf{rs}}}}\{0,1\}^{\log (\nu )}\), and output \(\varSigma :=(r_1,\ldots ,r_l)\).

  • \({\textsf{Enc}}(\varSigma ,\cdot )\): for input message \(s\), sample \(sk\leftarrow {\textsf{KGen}}(1^k)\), \(e\leftarrow {{\textsf{E}}}_{sk}(s)\).

    • (Secret share) Sample \(z\leftarrow {\textsf{SS}}_{m}(sk||sk^3)\), where \(z = \parallel _{i=1}^{2|sk|} z_i\), \(z \in \{0,1\}^{2\,m|sk|}\), and for \(i \in \left[ |sk| \right] \), \(z_i\) (resp. \(z_{|sk|+i}\)) is an m-out-of-m secret sharing of \(sk[i]\) (resp. \(sk^3[i]\)).

    • (Shuffle) Compute \(c\leftarrow P_{\varSigma }(z||e)\) as follows:

      1. (Sensitive bits): Set \(c\leftarrow 0^{\nu }\). For \(i \in [l]\), \(c[r_i] \leftarrow z[i]\).

      2. (Ciphertext bits): Set \(i \leftarrow 1\). For \(j \in [l + |e|]\), if \(j \notin \{ r_p \ | \ p \in [l] \}\): \(c[j] \leftarrow e[i]\), \(i \leftarrow i+1\).

    Output \(c\).

  • \({\textsf{Dec}}(\varSigma ,\cdot )\): on input \(c\), compute \((z||e) \leftarrow P_{\varSigma }^{-1}(c)\), \((sk|| sk' )\leftarrow {\textsf{Rec}}_{m}(z)\), and if \(sk^3 = sk'\), output \({{\textsf{D}}}_{sk}(e)\), otherwise output \(\bot \).

The set of indices of \(z_i\) in the codeword will be denoted by \(Z_i\).

In the above, we consider \(sk\), \(sk^3\), as elements over \({\textbf{GF}}(2^{\textrm{poly}(k)})\).
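The shuffling step \(P_{\varSigma }\) and its inverse can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the names `init_crs`, `shuffle` and `unshuffle` are hypothetical, positions are 0-indexed (the construction uses 1-indexed positions), and the sketch assumes that the CRS entries \(r_1,\ldots ,r_l\) are distinct positions obtained by rejection sampling.

```python
import secrets


def init_crs(l, nu):
    """Sample l distinct positions in range(nu) by rejection sampling (assumed reading of Init)."""
    positions, seen = [], set()
    while len(positions) < l:
        r = secrets.randbelow(nu)
        if r not in seen:  # reject repeated positions
            seen.add(r)
            positions.append(r)
    return positions


def shuffle(crs, z, e):
    """P_Sigma(z||e): z[i] is written to position crs[i]; e fills the remaining positions in order."""
    nu = len(z) + len(e)
    c = [0] * nu
    for i, r in enumerate(crs):
        c[r] = z[i]
    taken = set(crs)
    rest = iter(e)
    for j in range(nu):
        if j not in taken:
            c[j] = next(rest)
    return c


def unshuffle(crs, c):
    """Inverse of shuffle: read z from the CRS positions and e from all other positions."""
    taken = set(crs)
    z = [c[r] for r in crs]
    e = [c[j] for j in range(len(c)) if j not in taken]
    return z, e
```

For instance, with `crs = [2, 0]`, `z = [1, 1]` and `e = [0, 0, 0]`, the two sensitive bits land in positions 2 and 0 and the ciphertext bits fill positions 1, 3 and 4.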

At a high level, the construction presented above combines authenticated encryption with an inner encoding that works as follows. It interprets \(sk\) as an element in the finite field \({\textbf{GF}}({2^{|sk|}})\) and computes \(sk^3\) as a field element. Then, for each bit of \((sk||sk^3)\), it computes an m-out-of-m secret sharing of the bit, for some parameter m (we note that elements in \({\textbf{GF}}(2^{|sk|})\) can be interpreted as bit strings). Then, by combining the inner encoding with the shuffling technique, we get an encoding scheme whose security follows from the observations that we briefly present below:

  • For any tampering function which does not have access to all m shares of a single bit of \((sk||sk^3)\), the tampering effect on the secret key can be expressed essentially as a linear shift, i.e., as \((( sk+ \delta ) || (sk^3 + \eta ))\) for some \((\delta , \eta ) \in {\textbf{GF}}(2^{|sk|}) \times {\textbf{GF}}(2^{|sk|})\), independent of \(sk\).

  • By permuting the locations of the inner encoding and the ciphertext bits, we have that, with overwhelming probability, any tampering function that reads/writes a \((1-o(1))\) fraction of the codeword bits will not learn any single bit of \((sk||sk^3)\).

  • With overwhelming probability over the randomness of \(sk\) and the CRS, for non-zero \(\eta \) and \(\delta \), \((sk+ \delta )^3 \ne sk^3 + \eta \), and this property enables us to design a consistency check mechanism whose output is simulatable, without accessing \(sk\).

  • The security of the final encoding scheme follows by composing the security of the inner encoding scheme with the authenticity property of the encryption scheme.
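The third observation can be checked exhaustively in a small field. The sketch below is only an illustration under stated assumptions: it works in \({\textbf{GF}}(2^8)\) with the AES reduction polynomial (the construction uses \({\textbf{GF}}(2^{|sk|})\)), and the helper names `gf_mul`, `gf_cube` and `max_consistent_keys` are hypothetical. For every non-zero shift \(\delta \), it counts how many keys \(sk\) satisfy \((sk+\delta )^3 = sk^3+\eta \) for the same \(\eta \).

```python
def gf_mul(a, b, n=8, poly=0x11B):
    """Multiply in GF(2^n) = GF(2)[x]/(poly); the defaults give the AES field GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:  # degree reached n: reduce modulo poly
            a ^= poly
    return r


def gf_cube(a):
    return gf_mul(gf_mul(a, a), a)


CUBE = [gf_cube(x) for x in range(256)]


def max_consistent_keys(delta):
    """Largest number of keys sk that share one eta with (sk + delta)^3 = sk^3 + eta."""
    counts = {}
    for sk in range(256):
        eta = CUBE[sk ^ delta] ^ CUBE[sk]  # addition in GF(2^n) is XOR
        counts[eta] = counts.get(eta, 0) + 1
    return max(counts.values())


# Worst case over all non-zero shifts delta.
worst = max(max_consistent_keys(delta) for delta in range(1, 256))
```

In this toy field no pair \((\delta ,\eta )\) with \(\delta \ne 0\) is consistent with more than two keys, which matches the fact that \((sk+\delta )^3 - sk^3 - \eta \) is a quadratic polynomial in \(sk\) with leading coefficient \(\delta \).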

Intuitively, the properties that we require from the inner encoding scheme (after the shuffling operation) employed by our construction are similar to those provided by a robust secret sharing scheme [58], which guarantees tamper detection during the reconstruction phase. In our work, we additionally require simulatability of whether the reconstructed message will be the same or \(\bot \).

Below we present the formal security proof of the above ideas.

Theorem 3.2

Let k, \(m \in {\mathbb {N}}\) and \(\alpha \in [0,1)\). Assuming \(({\textsf{SS}}_{m},{\textsf{Rec}}_{m})\) is an m-out-of-m secret sharing scheme and \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is a 1-IND-CPA secure (cf. Definition 2.9),Footnote 12 authenticated encryption scheme, the code of Construction 3.1 is an MD-NMC against \({\mathcal {F}}^{\alpha }\) (cf. Definition 2.5), for any \(\alpha \), m, such that \((1-\alpha )m=\omega (\log (k))\).

Fig. 4. The hybrid experiments for the proof of Theorem 3.2. The gray part signifies the portion of the code of an experiment that differs from the previous one

Proof

Let I be the set of indices chosen by the attacker and \(I^{{\textsf{c}}}= [\nu ] \backslash I\), where \(\nu = 2\,m|sk| + |e|\). The tampered components of the codeword will be denoted using the symbol “~” on top of the original symbol, i.e., we have \({\tilde{c}}\leftarrow f(c)\), and the tampered value of \(sk\) (resp. \(sk^3\)) that we get after executing \({\textsf{Rec}}_{m}({\tilde{z}})\) will be denoted by \(\tilde{sk}\) (resp. \(\tilde{sk}'\)). Also, the tampered ciphertext will be \({\tilde{e}}\). We prove the theorem using a series of hybrid experiments that are depicted in Fig. 4. Below, we describe the hybrids.

  • \({\textsf{Exp}}_{0}^{\varSigma ,f,s}\): We prove security of our code using Lemma 2.6, i.e., by showing that (i) for any \(s_{0}\), \(s_1\), \({\textsf{Tamper}}_{s_0}^{f} \approx {\textsf{Tamper}}_{s_1}^{f}\), and (ii) for any \(s\), \(\Pr \left[ {\textsf{Tamper}}^f_{s} \notin \{\bot , s\} \right] \le {\textsf{negl}}(k)\), where \({\textsf{Tamper}}_{s}^{f}\) is defined in Lemma 2.6. For any \(f\), \(s\), the first experiment, \({\textsf{Exp}}_{0}^{\varSigma ,f,s}\), matches the experiment \({\textsf{Tamper}}_{s}^{f}\) in the CRS model, where \(\varSigma \) is sampled by \({\textsf{Tamper}}_{s}^{f}\).

  • \({\textsf{Exp}}_{1}^{\varSigma ,f,s}\): In the second experiment we define \(Z_i\), \(i \in [2|sk|]\), to be the set of codeword indices in which the secret sharing \(z_i\) is stored, \(|Z_i|=m\). The main difference from the previous experiment is that the current one outputs \(\bot \), if there exists a bit of \(sk\) or \(sk^3\) for which the tampering function reads all the shares of it, while accessing at most \(\alpha \nu \) bits of the codeword. Intuitively, and as we prove in Claim 3.3, by permuting the location indices of \(z||e\), this event happens with probability negligible in k, and the attacker does not learn any bit of \(sk\) and \(sk^3\), even if it is given access to \((1-o(1)) \nu \) bits of the codeword.

  • \({\textsf{Exp}}_{2}^{\varSigma ,f,s}\): By the previous hybrid, we have that for all \(i \in [2|sk|]\), the tampering function will not access all bits of \(z_i\), with overwhelming probability. In the third experiment, we unfold the encoding procedure, and in addition, we substitute the secret sharing procedure \({\textsf{SS}}_{m}\) with \(\bar{{\textsf{SS}}}_{m}^{f}\) that computes shares \(z_i^*\) that reveal no information about \(sk||sk^3\); for each i, \(\bar{{\textsf{SS}}}_{m}^{f}\) simply “drops” the bit of \(z_i\) with the largest index that is not being accessed by \(f\). We formally define \(\bar{{\textsf{SS}}}_{m}^{f}\) below. \(\bar{{\textsf{SS}}}_{m}^{f}(\varSigma ,sk)\):

    1. Sample \(\left( z_1,\ldots ,z_{2|sk|} \right) \leftarrow {\textsf{SS}}_{m}\left( sk||sk^3 \right) \) and set \(z_i^* \leftarrow z_i\), \(i \in [2|sk|]\).

    2. For \(i \in [2|sk|]\), let \(l_i:= \max \left\{ d \in [m] : {\textsf{Ind}}\left( z_i[d] \right) \notin I \right\} \), where \({\textsf{Ind}}\) returns the index of \(z_i[d]\) in \(c\), i.e., \(l_i\) is the largest index in [m] such that \(z_i[l_i]\) is not accessed by \(f\).

    3. (Output): For all i set \(z_i^*[l_i]=*\), and output \(z^*:= \parallel _{i=1}^{2|sk|} z^*_i\).

    In \({\textsf{Exp}}_{1}^{\varSigma ,f,s}\), \(z = \parallel _{i=1}^{2|sk|} z_i\), and each \(z_i\) is an m-out-of-m secret sharing for a bit of \(sk\) or \(sk^3\). From Claim 3.3, we have that for all i, \(|I \cap Z_i|<m\) with overwhelming probability, and we can observe that the current experiment is identical to the previous one up to the point of computing \(f(c_{|_{I}})\), as \(c_{|_{I}}\) and \(f(c_{|_{I}})\) depend only on \(z^*\), that carries no information about \(sk\) and \(sk^3\).

    Another difference between the two experiments is in the external “Else” branch: \({\textsf{Exp}}_{1}^{\varSigma ,f,s}\) makes a call to the decoder while \({\textsf{Exp}}_{2}^{\varSigma ,f,s}\), before calling \({{\textsf{D}}}_{sk}({\tilde{e}})\), checks if the tampering function has modified the shares in a way such that the reconstruction procedure \(((\tilde{sk},\tilde{sk}')\leftarrow {\textsf{Rec}}_{m}({\tilde{z}}))\) will give \(\tilde{sk}\ne sk\) or \(\tilde{sk}' \ne sk'\). This check is done by the statement “If \(\exists i: \bigoplus _{j \in (I \cap Z_i)} c[j] \ne \bigoplus _{j \in (I \cap Z_i)} {\tilde{c}}[j]\)”, without touching \(sk\) or \(sk^3\).Footnote 13 If a modification is detected, the current experiment outputs \(\bot \). The intuition is that an attacker that partially modifies the shares of \(sk\) and \(sk^3\) creates shares of \(\tilde{sk}\) and \(\tilde{sk}'\) such that \(\tilde{sk}^3=\tilde{sk}'\) only with probability negligible in k. We prove this by a reduction to the 1-IND-CPA security of the encryption scheme: any valid modification over the inner encoding of the secret key gives us a method to compute the original secret key \(sk\), with non-negligible probability. The ideas are presented formally in Claim 3.4.

  • \({\textsf{Exp}}_{3}^{\varSigma ,f,s}\): The difference between the current experiment and the previous one is that instead of executing the decryption, \({{\textsf{D}}}_{sk}({\tilde{e}})\), we first check if the attacker has modified the ciphertext, in which case the current experiment outputs \(\bot \); otherwise it outputs \({\textsf{same}}^*\). By the previous hybrid, we reach this newly introduced “Else” branch of \({\textsf{Exp}}_{3}^{\varSigma ,f,s}\) only if the tampering function did not modify the secret key. Thus, the indistinguishability between the two experiments follows from the authenticity property of the encryption scheme in the presence of \(z^*\): given that \(\tilde{sk}= sk\) and \(\tilde{sk}' = sk'\), we have that if the attacker modifies the ciphertext, then with overwhelming probability \({{\textsf{D}}}_{sk}({\tilde{e}})=\bot \), otherwise, \({{\textsf{D}}}_{sk}({\tilde{e}})=s\), and the current experiment correctly outputs \(\bot \) or \({\textsf{same}}^*\) (cf. Claim 3.5).

  • Finally, we prove that for any \(f\in {\mathcal {F}}^{\alpha }\), and message \(s\), \({\textsf{Exp}}_3^{\varSigma ,f,s}\) is indistinguishable from \({\textsf{Exp}}_3^{\varSigma ,f,{{\textsf{0}}}}\), where \({{\textsf{0}}}\) denotes the zero message. This follows by the semantic security of the encryption scheme, and gives us the indistinguishability property required by Lemma 2.6. The manipulation detection property is derived by the indistinguishability between the hybrids and the fact that the output of \({\textsf{Exp}}_{3}^{\varSigma ,f,s}\) is in the set \(\{{\textsf{same}}^*, \bot \}\).
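The parity check introduced in \({\textsf{Exp}}_{2}^{\varSigma ,f,s}\) relies on the fact that, for XOR-based m-out-of-m sharing, the reconstructed bit changes exactly when the parity of the accessed shares changes. The following Python sketch illustrates this for a single share block. It assumes the secret sharing is plain XOR sharing of a bit, and the function names are hypothetical.

```python
import secrets
from functools import reduce
from operator import xor


def share_bit(bit, m):
    """m-out-of-m XOR sharing: m - 1 random bits plus one bit that fixes the parity."""
    shares = [secrets.randbelow(2) for _ in range(m - 1)]
    shares.append(reduce(xor, shares, bit))
    return shares


def reconstruct(shares):
    return reduce(xor, shares, 0)


def parity_changed(old, new, accessed):
    """Compare the parity of old and new shares over the accessed positions only."""
    before = reduce(xor, (old[j] for j in accessed), 0)
    after = reduce(xor, (new[j] for j in accessed), 0)
    return before != after
```

As long as at least one share of the block is outside the accessed set and therefore unchanged, `parity_changed` returns `True` exactly when the reconstructed bit differs from the original one, and it never needs the value of the unaccessed share.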

In what follows, we prove indistinguishability between the hybrids using a series of claims.

Claim 3.3

For k, \(m \in {\mathbb {N}}\), assume \((1-\alpha )m = \omega (\log (k))\). Then, for any \(f\in {\mathcal {F}}^{\alpha }\) and any message \(s\), we have \({\textsf{Exp}}_{0}^{\varSigma ,f,s} \approx {\textsf{Exp}}_{1}^{\varSigma ,f,s}\), where the probability runs over the randomness used by \({\textsf{Init}}\), \({\textsf{Enc}}\).

Proof

The difference between the two experiments is that \({\textsf{Exp}}_{1}^{\varSigma ,f,s}\) outputs \(\bot \) when the attacker learns all shares of some bit of \(sk\) or \(sk^3\), otherwise it produces output as \({\textsf{Exp}}_{0}^{\varSigma ,f,s}\) does. Let E be the event “\(\exists i: |(I \cap Z_i)| = m\)”. Clearly, \({\textsf{Exp}}_{0}^{\varSigma ,f,s} = {\textsf{Exp}}_{1}^{\varSigma ,f,s}\) conditioned on \(\lnot E\), thus the statistical distance between the two experiments is bounded by \(\Pr [E]\). In the following, we show that \(\Pr [E] \le {\textsf{negl}}(k)\). We define by \(E_i\) the event in which \(f\) learns the entire \(z_i\). Assuming the attacker reads n bits of the codeword, we have that for all \(i \in [2|sk|]\),

$$\begin{aligned} \mathop {\Pr }\limits _{\varSigma }[E_i] = \mathop {\Pr }\limits _{\varSigma }\left[ \ |I \cap Z_i| = m \ \right] = \prod _{j=0}^{m-1} \frac{n-j}{\nu -j} \le \left( \frac{n}{\nu } \right) ^m. \end{aligned}$$

We have \(n=\alpha \nu \) and assuming \(\alpha = 1-\epsilon \) for \(\epsilon \in (0,1]\), we have

$$\begin{aligned} \Pr [E_i] \le (1-\epsilon )^m \le 1/e^{m\epsilon }, \end{aligned}$$

and

$$\begin{aligned} \Pr [E]=\mathop {\Pr }\limits _{\varSigma } \left[ \bigcup _{i=1}^{2|sk|} E_i \right] \le \frac{2|sk|}{e^{m\epsilon }}, \end{aligned}$$

which is negligible when \((1-\alpha )m = \omega (\log (k))\), and the proof of the claim is complete. \(\square \)
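To get a feeling for the bound, the quantities in the proof can be evaluated numerically. The sketch below is only illustrative: the function names and the concrete parameters in the usage note are assumptions chosen for the example and do not come from the paper.

```python
import math


def prob_block_fully_read(n, nu, m):
    """Exact Pr[E_i]: all m positions of one share block fall among n accessed positions out of nu."""
    if n < m:
        return 0.0
    p = 1.0
    for j in range(m):
        p *= (n - j) / (nu - j)
    return p


def union_bound(sk_bits, m, eps):
    """Upper bound 2|sk| / e^(m * eps) on Pr[E], where eps = 1 - alpha."""
    return 2 * sk_bits * math.exp(-m * eps)
```

For example, with \(\nu = 10000\), \(\alpha = 0.9\) (so \(n = 9000\) and \(\epsilon = 0.1\)) and \(m = 200\), `prob_block_fully_read(9000, 10000, 200)` is below \(0.9^{200}\), which is in turn below \(e^{-20}\), and `union_bound(128, 200, 0.1)` evaluates \(256\cdot e^{-20}\) for a 128-bit key.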

Claim 3.4

Assuming \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is secure against the 1-query key recovery attack as in Definition 2.9, for any \(f\in {\mathcal {F}}^{\alpha }\) and any message \(s\), \({\textsf{Exp}}_{1}^{\varSigma ,f,s} \approx {\textsf{Exp}}_{2}^{\varSigma ,f,s}\), where the probability runs over the randomness used by \({\textsf{Init}}\), \({\textsf{Enc}}\).

Proof

In \({\textsf{Exp}}_{2}^{\varSigma ,f,s}\), we unfold the encoding procedure; however, instead of calling \({\textsf{SS}}_{m}\), we make a call to \(\bar{{\textsf{SS}}}_{m}^{f}\). As we have already stated above, this modification does not induce any difference between the output of \({\textsf{Exp}}_{2}^{\varSigma ,f,s}\) and \({\textsf{Exp}}_{1}^{\varSigma ,f,s}\), with overwhelming probability, as \(z^*\) is indistinguishable from \(z\) in the eyes of \(f\). Another difference between the two experiments is in the external “Else” branch: \({\textsf{Exp}}_{1}^{\varSigma ,f,s}\) makes a call to the decoder while \({\textsf{Exp}}_{2}^{\varSigma ,f,s}\), before calling \({{\textsf{D}}}_{sk}({\tilde{e}})\), checks if the tampering function has modified the shares in a way such that the reconstruction procedure will give \(\tilde{sk}\ne sk\) or \(\tilde{sk}' \ne sk'\). This check is done by the statement “If \(\exists i: \bigoplus _{j \in (I \cap Z_i)} c[j] \ne \bigoplus _{j \in (I \cap Z_i)} {\tilde{c}}[j]\)”, without touching \(sk\) or \(sk^3\) (cf. Claim 3.3).Footnote 14 We define the events E, \(E'\) as follows

$$\begin{aligned} E: {\textsf{Dec}}({\tilde{c}}) \ne \bot , E': \exists i: \mathop {\bigoplus }\nolimits _{j \in (I \cap Z_i)} c[j] \ne \mathop {\bigoplus }\nolimits _{j \in (I \cap Z_i)} {\tilde{c}}[j]. \end{aligned}$$

Clearly, conditioned on \(\lnot E'\) the two experiments are identical, since we have \(\tilde{sk}=sk\) and \(\tilde{sk}'=sk'\), and the decoding process will output \({{\textsf{D}}}_{sk}({\tilde{e}})\) in both experiments. Thus, the statistical distance is bounded by \(\Pr [E']\). Now, conditioned on \(E' \wedge \lnot E\), both experiments output \(\bot \). Thus, we need to bound \(\Pr [E \wedge E']\). Assuming \(\Pr [E \wedge E']>p\), for \(p=1/\textrm{poly}(k)\), we define an attacker \({\mathcal {A}}\) that simulates \({\textsf{Exp}}_{2}^{\varSigma ,f,s}\), and uses \(f\), \(s\) to break the 1-query key recovery security of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) in the presence of \(z^*\), with probability at least p/2.

First we prove that any encryption scheme that is secure against 1-query key recovery attacks remains secure even if the attacker receives \(z^* \leftarrow \bar{{\textsf{SS}}}_{m}^{f}(\varSigma ,sk)\). This is because \(z^*\) can be simulated without knowing \(sk\), by using random shares on positions that the tampering function can see and \(*\)’s otherwise. The fact then follows by a simple reduction argument.

Now we prove our claim. Assuming \(\Pr [E \wedge E']>p\), for \(p=1/\textrm{poly}(k)\), we define an attacker \({\mathcal {A}}\) that breaks the 1-query key recovery security of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) in the presence of \(z^*\), with non-negligible probability. \({\mathcal {A}}\) receives the encryption of \(s\), which corresponds to the oracle query right before receiving the challenge ciphertext, the challenge ciphertext \(e\leftarrow {{\textsf{E}}}_{sk}(s_b)\), for uniform \(b \in \{0,1\}\) and uniform messages \(s_0\), \(s_1\), as well as \(z^*\). \({\mathcal {A}}\) is defined below.

\({\mathcal {A}}\left( z^* \leftarrow \bar{{\textsf{SS}}}_{m}^{f}(\varSigma ,sk),e' \leftarrow {{\textsf{E}}}_{sk}(s), e\leftarrow {{\textsf{E}}}_{sk}(s_b) \right) \):

  1. (Define the shares that will be accessed by \(f\)): For \(i \in [2|sk|]\), define \(w_i:=(z_i^*)_{|_{[m]\backslash \{l_i\}}}\) and for \(i \in [m-1]\) define \(C_i:= \parallel _{j=1}^{|sk|} w_j[i]\), \(D_i:= \parallel _{j=|sk|+1}^{2|sk|} w_j[i]\).

  2. (Apply \(f\)) Set \(c \leftarrow P_{\varSigma }(z^*||e)\), compute \({\tilde{c}}[I] \leftarrow f_{\varSigma }(c_{|_{I}})\) and let \({{\tilde{C}}}_i\), \({{\tilde{D}}}_i\), \(i \in [m]\), be the tampered shares resulting after the application of \(f\) to \(c_{|_{I}}\).

  3. (Searching the secret key) Let \(U=\sum _{i=1}^{m-1} C_i\), \(V=\sum _{i=1}^{m-1} D_i\), i.e., U, V denote the sum of the shares that are being accessed by the attacker (maybe partially), and let \({{\tilde{U}}} = \sum _{i=1}^{m-1} {{\tilde{C}}}_i\), \({{\tilde{V}}} = \sum _{i=1}^{m-1} {{\tilde{D}}}_i\) be the corresponding tampered values after applying \(f\) on U, V. Define

    $$\begin{aligned} p(X):= (U-{{\tilde{U}}})X^2 + (U^2-{{\tilde{U}}}^2)X + (U^3 - {{\tilde{U}}}^3-V+{{\tilde{V}}}), \end{aligned}$$

    and compute the set of roots of p(X), denoted as \({\mathcal {X}}\), which are at most two. Then set

    $$\begin{aligned} \hat{\mathcal{S}\mathcal{K}}:= \left\{ x+U | x \in {\mathcal {X}} \right\} . \end{aligned}$$
    (2)

  4. (Output) Output a random element of \(\hat{\mathcal{S}\mathcal{K}}\).

In the first step, \({\mathcal {A}}\) removes the dummy symbol “\(*\)” and computes the shares that will be partially accessed by \(f\), denoted as \(C_i\) for \(sk\) and as \(D_i\) for \(sk^3\). In the second step, it simulates the codeword partially, applies the tampering function on it, and defines the tampered shares, \({{\tilde{C}}}_i\), \({{\tilde{D}}}_i\). Conditioned on \(E'\), it is not hard to see that \({\mathcal {A}}\) simulates perfectly \({\textsf{Exp}}_{2}^{\varSigma ,f,s}\). In particular, it simulates perfectly the input to \(f\) as it receives \(e\leftarrow {{\textsf{E}}}_{sk}(s)\) and all but \(2|sk|\) of the actual bit-shares of \(sk\), \(sk^3\). Part of those shares will be accessed by \(f\). Since for all i, \(|I \cap Z_i| < m\), the attacker is not accessing any single bit of \(sk\), \(sk^3\). Let \(C_m\), \(D_m\), be the shares (not provided by the encryption oracle) that completely define \(sk\) and \(sk^3\), respectively. By the definition of the encoding scheme and the fact that \(sk\), \(sk^3 \in {\textbf{GF}}(2^{\textrm{poly}(k)})\), we have \(\sum _{i=1}^{m} C_i=sk\), \(\sum _{i=1}^{m} D_i=sk^3\), and

$$\begin{aligned} \left( U + C_m \right) ^3 = V + D_m. \end{aligned}$$
(3)

In order for the decoder to output a non-bottom value, the shares created by the attacker must decode to \(\tilde{sk}\), \(\tilde{sk}'\), such that \(\tilde{sk}^3 = \tilde{sk}'\), or in other words, if

$$\begin{aligned} \left( {{\tilde{U}}} + C_m \right) ^3 = {{\tilde{V}}} + D_m. \end{aligned}$$
(4)

From (3) and (4) we obtain

$$\begin{aligned} (U-{{\tilde{U}}})C_m^2 + (U^2-{{\tilde{U}}}^2)C_m + (U^3 - {{\tilde{U}}}^3)=V - {{\tilde{V}}}. \end{aligned}$$
(5)
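As a quick check of how (5) follows, and assuming (as stated above) that all arithmetic takes place in a field of characteristic 2, where the binomial coefficient 3 equals 1, the two cubes expand as

$$\begin{aligned} \left( U + C_m \right) ^3&= U^3 + U^2 C_m + U C_m^2 + C_m^3, \\ \left( {{\tilde{U}}} + C_m \right) ^3&= {{\tilde{U}}}^3 + {{\tilde{U}}}^2 C_m + {{\tilde{U}}} C_m^2 + C_m^3. \end{aligned}$$

Subtracting the second line from the first cancels \(C_m^3\), and subtracting the right-hand sides of (3) and (4) cancels \(D_m\), which leaves \((U-{{\tilde{U}}})C_m^2 + (U^2-{{\tilde{U}}}^2)C_m + (U^3 - {{\tilde{U}}}^3) = V - {{\tilde{V}}}\). In other words, \(C_m\) is a root of the polynomial \(p(X)\) defined by \({\mathcal {A}}\).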

Clearly, \(\Pr [E \wedge E' \wedge (U= {{\tilde{U}}})]=0\). Thus, assuming \(\Pr [E \wedge E']>p\), for \(p > 1/\textrm{poly}(k)\), we obtain

$$\begin{aligned} p = \Pr \left[ E \wedge E' \wedge (U\ne {{\tilde{U}}}) \right]\le & {} \Pr \left[ {\textsf{Dec}}({\tilde{c}}) \ne \bot \wedge E' \wedge U \ne {{\tilde{U}}} \right] \nonumber \\\le & {} \Pr \left[ \tilde{sk}^3 = \tilde{sk}' \wedge E' \wedge (U \ne {{\tilde{U}}}) \right] \nonumber \\{} & {} {\mathop {=}\limits ^{(\text {5},\text {2})}} \Pr \left[ C_m \in {\mathcal {X}} \right] {\mathop {\le }\limits ^{(\text {2})}} \Pr \left[ sk\in \hat{\mathcal{S}\mathcal{K}} \right] , \end{aligned}$$
(6)

and \({\mathcal {A}}\) manages to recover \(C_m\), and thus the set \(\hat{\mathcal{S}\mathcal{K}} \) that contains \(sk\), with non-negligible probability at least p. As \(\hat{\mathcal{S}\mathcal{K}} \) is derived from solving a quadratic equation, the cardinality, i.e., \(|\hat{\mathcal{S}\mathcal{K}} |\) is at most 2. Thus, a random guess would hit the \(sk\) with probability at least p/2. This is a contradiction to the 1-query key recovery security of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\).

Thus, we have \(\Pr [E \wedge E'] \le {\textsf{negl}}(k)\), and both experiments output \(\bot \) with overwhelming probability. \(\square \)
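The key-recovery step of \({\mathcal {A}}\) can be illustrated in a toy field. The sketch below makes several assumptions that are not part of the paper: it uses \({\textbf{GF}}(2^8)\) with the AES reduction polynomial instead of \({\textbf{GF}}(2^{\textrm{poly}(k)})\), it finds the roots of \(p(X)\) by exhaustive search instead of solving the quadratic, and the names `gf_mul`, `gf_cube` and `candidate_keys` are hypothetical.

```python
def gf_mul(a, b, n=8, poly=0x11B):
    """Multiply in GF(2^n) = GF(2)[x]/(poly); the defaults give the AES field GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:  # degree reached n: reduce modulo poly
            a ^= poly
    return r


def gf_cube(a):
    return gf_mul(gf_mul(a, a), a)


def candidate_keys(U, V, U_t, V_t):
    """Return the set {x + U : p(x) = 0} for the polynomial p(X) built from U, V and their tampered values."""
    a = U ^ U_t                                   # U - U~ (subtraction is XOR)
    b = gf_mul(U, U) ^ gf_mul(U_t, U_t)           # U^2 - U~^2
    c = gf_cube(U) ^ gf_cube(U_t) ^ V ^ V_t       # U^3 - U~^3 - V + V~
    roots = [x for x in range(256)
             if (gf_mul(a, gf_mul(x, x)) ^ gf_mul(b, x) ^ c) == 0]
    return {x ^ U for x in roots}
```

As a usage example, pick any \(sk\), any unseen shares \(C_m\), \(D_m\), set \(U = sk + C_m\) and \(V = sk^3 + D_m\), and choose tampered values with \({{\tilde{U}}} \ne U\) that pass the decoder's check, i.e., \({{\tilde{V}}} = ({{\tilde{U}}}+C_m)^3 + D_m\). Then `candidate_keys(U, V, U_t, V_t)` contains \(sk\) and has at most two elements, so a uniform guess from it succeeds with probability at least 1/2.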

Claim 3.5

Assuming the authenticity property of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\), for any \(f\in {\mathcal {F}}^{\alpha }\) and any message \(s\), \({\textsf{Exp}}_{2}^{\varSigma ,f,s} \approx {\textsf{Exp}}_{3}^{\varSigma ,f,s}\), where the probability runs over the randomness used by \({\textsf{Init}}\), \({\textsf{KGen}}\) and \({{\textsf{E}}}\).

Proof

Before proving the claim, recall that the authenticity property of the encryption scheme is preserved under the presence of \(z^*\) (cf. Claim 3.4). Let E be the event \(\tilde{sk}=sk\wedge \tilde{sk}'=sk^3\) and \(E'\) be the event \({\tilde{e}}\ne e\). Conditioned on \(\lnot E\), the two experiments are identical, as they both output \(\bot \). Also, conditioned on \(E \wedge \lnot E'\), both experiments output \({\textsf{same}}^*\). Thus, the statistical distance between the two experiments is bounded by \(\Pr [E \wedge E']\). Let B be the event \({{\textsf{D}}}_{sk}({\tilde{e}})\ne \bot \). Conditioned on \(E \wedge E' \wedge \lnot B\) both experiments output \(\bot \). Thus, we need to bound \(\Pr [E \wedge E' \wedge B]\).

Assuming there exist \(s\), \(f\), for which \(\Pr [E \wedge E' \wedge B] > p\), where \(p = 1/\textrm{poly}(k)\), we define an attacker \({\mathcal {A}}=({\mathcal {A}}_1,{\mathcal {A}}_2)\) that simulates \({\textsf{Exp}}_{3}^{\varSigma ,f,s}\) and breaks the authenticity property of the encryption scheme in the presence of \(z^*\), with non-negligible probability. \({\mathcal {A}}\) is defined as follows: sample \((s,st) \leftarrow {\mathcal {A}}_1(1^k)\), and then, on input \((z^*,e,st)\), where \(e\leftarrow {{\textsf{E}}}_{sk}(s)\), \({\mathcal {A}}_2\), samples \(\varSigma \leftarrow {\textsf{Init}}(1^k)\), sets \({\tilde{c}}\leftarrow 0^{\nu }\), \(c\leftarrow P_{\varSigma }(z^*||e)\), computes \({\tilde{c}}[I] \leftarrow f(c_{|_{I}})\), \({\tilde{c}}[I^{{\textsf{c}}}] \leftarrow c_{|_{I^{{\textsf{c}}}}}\), \(({{\tilde{z}}} ^*||{\tilde{e}}) \leftarrow P^{-1}_{\varSigma }({\tilde{c}})\), and outputs \({{\tilde{e}}}\). Assuming \(\Pr [E \wedge E' \wedge B] > p\), we have that \({{\textsf{D}}}_{sk}({\tilde{e}}) \ne \bot \) and \({\tilde{e}}\ne e\), with non-negligible probability and the authenticity property of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) breaks. \(\square \)

Claim 3.6

Assuming \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is semantically secure, for any \(f\in {\mathcal {F}}^{\alpha }\) and any message \(s\), \({\textsf{Exp}}_{3}^{\varSigma ,f,s} \approx {\textsf{Exp}}_{3}^{\varSigma ,f,{{\textsf{0}}}}\), where the probability runs over the randomness used by \({\textsf{Init}}\), \({\textsf{KGen}}\), \({{\textsf{E}}}\). “\(\approx \)” may refer to statistical or computational indistinguishability, and \({{\textsf{0}}}\) denotes the zero message.

Proof

Recall that \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is semantically secure even in the presence of \(z^* \leftarrow \bar{{\textsf{SS}}}_{m}^{f}(\varSigma ,sk)\) (cf. Claim 3.4), and toward contradiction, assume there exist \(f\in {\mathcal {F}}^{\alpha }\), message \(s\), and \(\textrm{PPT} \) distinguisher \({\mathcal {D}}\) such that

$$\begin{aligned} \left| \Pr \left[ {\mathcal {D}}\left( \varSigma ,{\textsf{Exp}}_{3}^{\varSigma ,f,s} \right) =1 \right] - \Pr \left[ {\mathcal {D}}\left( \varSigma ,{\textsf{Exp}}_{3}^{\varSigma ,f,{{\textsf{0}}}} \right) =1 \right] \right| > p, \end{aligned}$$

for \(p = 1/\textrm{poly}(k)\). We are going to define an attacker \({\mathcal {A}}\) that breaks the semantic security of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) in the presence of \(z^*\), using \(s_0:=s\), \(s_1:={{\textsf{0}}}\). \({\mathcal {A}}\), given \(z^*\), \(e\), executes \({\textsf{Program}}\).

$$\begin{aligned} \begin{array}{l} {\textsf{Program}}(z^*,e):\\ c\leftarrow P_{\varSigma }(z^*||e), {\tilde{c}}\leftarrow 0^{\nu }, {\tilde{c}}[I] \leftarrow f(c_{|_{I}})\\ \text {If }\exists i: |(I \cap Z_i)| = m:\ \tilde{s}\leftarrow \bot \\ \text {Else:}\\ \hspace{0.5cm}\text {If } \exists i: \bigoplus _{j \in (I \cap Z_i)} c[j] \ne \bigoplus _{j \in (I \cap Z_i)} {\tilde{c}}[j]: \\ \hspace{1cm} \tilde{s}\leftarrow \bot \\ \hspace{0.5cm}\text {Else: }\tilde{s}\leftarrow \bot \\ \hspace{1cm} \text {If }{\tilde{e}}=e:\\ \hspace{1.5cm} \tilde{s}\leftarrow {\textsf{same}}^*\\ \text {Output }\tilde{s}. \end{array} \end{aligned}$$

It is not hard to see that \({\mathcal {A}}\) simulates \({\textsf{Exp}}_{3}^{\varSigma ,f,s_b}\); thus, the advantage of \({\mathcal {A}}\) against the semantic security of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is the same as the advantage of \({\mathcal {D}}\) in distinguishing between \({\textsf{Exp}}_{3}^{\varSigma ,f,s_0}\), \({\textsf{Exp}}_{3}^{\varSigma ,f,s_1}\), which by assumption is non-negligible. We have reached a contradiction, and the proof of the claim is complete. \(\square \)
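The branching structure of \({\textsf{Program}}\) can be rendered in Python roughly as follows. This is a sketch under assumptions: codewords are lists of bits, `c_tilde` is the full tampered codeword (positions outside the accessed set copied from `c`), `share_blocks` lists the codeword positions \(Z_i\) of each share block, and the names `program`, `SAME` and `BOT` are hypothetical stand-ins for \({\textsf{Program}}\), \({\textsf{same}}^*\) and \(\bot \).

```python
from functools import reduce
from operator import xor

SAME, BOT = "same*", "bot"


def parity(bits, positions):
    return reduce(xor, (bits[j] for j in positions), 0)


def program(c, c_tilde, accessed, share_blocks):
    """Output of the last hybrid: decided from accessed positions only, without using sk."""
    accessed = set(accessed)
    # Some share block is read in full: output bot.
    if any(set(block) <= accessed for block in share_blocks):
        return BOT
    # The parity of the accessed shares of some block changed: output bot.
    for block in share_blocks:
        seen = [j for j in block if j in accessed]
        if parity(c, seen) != parity(c_tilde, seen):
            return BOT
    # The key is untouched: output same* only if the ciphertext is untouched as well.
    sensitive = {j for block in share_blocks for j in block}
    e = [c[j] for j in range(len(c)) if j not in sensitive]
    e_tilde = [c_tilde[j] for j in range(len(c_tilde)) if j not in sensitive]
    return SAME if e_tilde == e else BOT
```

The function never reconstructs the key from the shares, which mirrors the point of the hybrid argument: the output depends only on what the tampering function could see and change.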

From the above claims, we have that for any \(f\in {\mathcal {F}}^{\alpha }\) and any \(s\), \({\textsf{Exp}}_{0}^{\varSigma ,f,s} \approx {\textsf{Exp}}_{3}^{\varSigma ,f,{{\textsf{0}}}}\), thus for any \(f\in {\mathcal {F}}^{\alpha }\) and any \(s_0\), \(s_1\), \({\textsf{Exp}}_{0}^{\varSigma ,f,s_0} \approx {\textsf{Exp}}_{0}^{\varSigma ,f,s_1}\). Also, by the indistinguishability between \({\textsf{Exp}}_{0}^{\varSigma ,f,s}\) and \({\textsf{Exp}}_{3}^{\varSigma ,f,{{\textsf{0}}}}\), the second property of Lemma 2.6 has been proven as the output of \({\textsf{Exp}}_{3}^{\varSigma ,f,{{\textsf{0}}}}\) is in \(\{s,\bot \}\), with overwhelming probability, and non-malleability with manipulation detection of our code follows by Lemma 2.6, since \({\textsf{Exp}}_{0}^{\varSigma ,f,s}\) is identical to \({\textsf{Tamper}}^{f}_{s}\) of Lemma 2.6. \(\square \)

Instantiations and rates. By instantiating Construction 3.1 with the authenticated encryption scheme 2.11, Theorem 3.2, for \(m=k\log k\), \(\alpha = 1-1/\varOmega (\log k)\), yields a rate \(1\) MD-NMC, with access rate \(1-1/\varOmega (\log k)\) and codewords of length \(|s|+O(k^2 \log k)\), assuming one-way functions. Furthermore, by instantiating Construction 3.1 with 2.12, Theorem 3.2, for \(m=|s|\log |s|\), \(\alpha = 1-1/O(\log (|s|))\), yields an unconditionally secure MD-NMC in the CRS model, with concrete information rate \(1/O(|s|\log (|s|))\), access rate \(1-1/\varOmega (\log (|s|))\) and codewords of length \(O(|s|^2 \log |s|)\).

On the CRS. In the above, the tampering function, and consequently the codeword locations that the function is given access to, are fixed before sampling the CRS and this is critical for achieving security. However, by the proof of Theorem 3.2, we observe that proving security in this setting is highly non-trivial. In addition, the tampering function receives full access to the CRS when tampering with the codeword, which is in contrast to the work by Faust et al. [45] in the information-theoretic setting, where the (internal) tampering function receives partial information over the CRS.

In addition, the proposed scheme tolerates adaptive selection of the codeword locations, with respect to the CRS, in the following way: each time the attacker requests access to a location, he also learns if it corresponds to a bit of \(z\) or \(e\), together with the index of that bit in the original string. In this way, the CRS is gradually disclosed to the adversary while picking codeword locations.

Finally, our CRS sustains a substantial amount of tampering that depends on the codeword locations chosen by the attacker: an attacker that gets access to a sensitive codeword bit is allowed to modify the part of the CRS that defines the location of that bit in the codeword. The attacker is allowed to modify all but \(O(k \log (|s| + k))\) bits of the CRS, which has length \(O(k^2 \log k \log (|s|+k))\). To our knowledge, this is the first construction that tolerates even partial modification of the CRS. In contrast, existing constructions in the CRS model either use NIZKs [35, 41, 43, 55] or are based on the knowledge of exponent assumption [51]; thus, tampering access to the CRS would compromise security.

4 Removing the CRS

In the present section, we show how to construct an MD-NMC for partial functions, in the standard model.

A first approach would be to store the CRS of Construction 3.1 inside the codeword together with \(P_{\varSigma }(z||e)\), and give the attacker read/write access to it. However, besides getting direct (partial) access to the encoding of \(sk\), the tampering function also gets indirect access to it by (partially) controlling the CRS. It can then modify the CRS in a way such that, during decoding, ciphertext locations of its choice will be treated as bits of the inner encoding, \(z\), increasing the tampering rate against \(z\) significantly. This makes the task of protecting \(sk\) hard, if not impossible (unless we restrict the access rate significantly).

To handle this challenge, we embed a structure recovering mechanism inside the codeword and we emulate the CRS effect by increasing the size of the alphabet, giving rise to a block-wise structure.Footnote 15 Notice that, non-malleable codes with large alphabet size (i.e., \(\textrm{poly}(k)+|s|\) bits) might be easy to construct, as we can embed in each codeword block the verification key of a signature scheme together with a secret share of the message, as well as a signature over the share. In this way, partial access over the codeword does not compromise the security of the signature scheme while the message remains private, and the simulation is straightforward. This approach, however, comes with a large overhead, decreasing the information rate and access rate of the scheme significantly. In general, and similar to error correcting codes, we prefer smaller alphabet sizes—the larger the size is, the more coarse access structure is required, i.e., in order to access individual bits we need to access the blocks that contain them. The present work aims at minimizing this restriction by using small alphabets, as described below.

Our approach to the problem is the following. We increase the alphabet size to \(O(\log k)\) bits, and we consider two types of blocks: (i) sensitive blocks, in which we store the inner encoding, \(z\), of the secret key, \(sk\), and (ii) non-sensitive blocks, in which we store the ciphertext, \(e\), that is fragmented into blocks of size \(O(\log k)\). The first bit of each block indicates whether it is a sensitive block, i.e., we set it to 1 for sensitive blocks and to 0, otherwise. Our encoder works as follows: on input message \(s\), it computes \(z\), \(e\), as in the previous scheme and then uses rejection sampling to sample the indices, \(\rho _1,\ldots ,\rho _{|z|}\), for the sensitive blocks. Then, for every \(i \in \{1,\ldots ,|z|\}\), \(C_{\rho _i}\) is a sensitive block, with contents \((1||i||z[i])\), while the remaining blocks keep ciphertext pieces of size \(O(\log k)\). Decoding proceeds as follows: on input codeword \(C=(C_1,\ldots ,C_{{\textsf{bn}}})\), for each \(i \in [{\textsf{bn}}]\), if \(C_i\) is a non-sensitive block, its data will be part of \(e\), otherwise, the last bit of \(C_i\) will be part of \(z\), as dictated by the index stored in \(C_i\). If the number of sensitive blocks is not the expected one, the decoder outputs \(\bot \), otherwise, \(z\), \(e\), have been fully recovered and decoding proceeds as in the previous scheme. The proposed scheme is depicted in Fig. 5.

The security of our construction is based on the fact that, due to our shuffling technique, the position mapping will not be completely overwritten by the attacker, and, as we prove later in this section, this suffices for protecting the inner encoding of \(sk\). We prove security of the current scheme (cf. Theorem 4.8) by a reduction to the security of the scheme in the CRS model. Our instantiation yields a rate \(1-1/\varOmega (\log k)\) MD-NMC in the standard model, with access rate \(1-1/\varOmega (\log k)\) and codewords of length \(|s|(1+1/O(\log k))+O(k^2 \log ^2 k)\), assuming one-way functions.

It is worth pointing out that the idea of permuting blocks containing sensitive and non-sensitive data was also considered by [61] in the context of list-decodable codes; however, the similarity is only in the fact that a permutation is being used at some point in the encoding process, and our objective, construction and proof are different.

Fig. 5: Description of the scheme in the standard model

In what follows, we consider alphabets of size \(O(\log (k))\) and we provide a computationally secure, rate \(1-1/\varOmega (\log k)\) encoding scheme in the standard model, tolerating modification of \((1-o(1))\nu \) blocks, where \(\nu \) is the total number of blocks in the codeword. The projection operation will be also used with respect to bigger alphabets, enabling the projection of blocks.

Our construction is defined below.

Construction 4.1

Let k, \(m \in {\mathbb {N}}\), let \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) be a symmetric encryption scheme and \(({\textsf{SS}}_{m},{\textsf{Rec}}_{m})\) be an m-out-of-m secret sharing scheme. We define an encoding scheme \(({\textsf{Enc}}^*,{\textsf{Dec}}^*)\), as follows:

  • \({\textsf{Enc}}^*(1^k,\cdot )\): for input message \(s\), sample \(sk\leftarrow {\textsf{KGen}}\left( 1^k \right) \), \(e\leftarrow {{\textsf{E}}}_{sk}(s)\).

    • (Secret share) Sample \(z\leftarrow {\textsf{SS}}_{m}(sk||sk^3)\), where \(z \in \{0,1\}^{2\,m|sk|}\), and for \(i \in \left[ |sk| \right] \), \(z_i\) (resp. \(z_{|sk|+i}\)) is an m-out-of-m secret sharing of \(sk[i]\) (resp. \(sk^3[i]\)).

    • (Construct blocks & permute) Set \(l \leftarrow 2\,m|sk|\), \({\textsf{bs}}\leftarrow \log l+2\), \(d \leftarrow |e|/{\textsf{bs}}\), \({\textsf{bn}}\leftarrow l+d\), sample \(\rho :=(\rho _1,\ldots ,\rho _l) {\mathop {\leftarrow }\limits ^{{\textsf{rs}}}}\{0,1\}^{\log ({\textsf{bn}})} \) and compute \(C\leftarrow \varPi _{\rho }(z||e)\) as follows:

      1. Set \(t \leftarrow 1\), \(C_i \leftarrow 0^{{\textsf{bs}}}\), \(i \in [{\textsf{bn}}]\).

      2. (Sensitive blocks) For \(i \in [l]\), set \(C_{\rho _i} \leftarrow \left( 1||i||z[i] \right) \).

      3. (Ciphertext blocks) For \(i \in [{\textsf{bn}}]\), if \(i \ne \rho _j\), \(j \in [l]\), \(C_i \leftarrow (0||e[t:t+({\textsf{bs}}-1)])\), \(t \leftarrow t + ({\textsf{bs}}-1) \) (cf. Footnote 16).

    Output \(C:=(C_1||\ldots ||C_{{\textsf{bn}}})\).

  • \({\textsf{Dec}}^*(1^k,\cdot )\): on input \(C\), parse it as \((C_1||\ldots ||C_{{\textsf{bn}}})\), set \(t \leftarrow 1\), \(l \leftarrow 2\,m|sk|\), \(z \leftarrow 0^l\), \(e\leftarrow {{\textsf{0}}}\), \({\mathcal {L}} = \emptyset \) and compute \((z||e) \leftarrow \varPi ^{-1}(C)\) as follows:

    • For \(i \in [{\textsf{bn}}]\),

      • (Sensitive block) If \(C_i[1] = 1\), set \(j \leftarrow C_i[2:{\textsf{bs}}-1]\), \(z \left[ j \right] \leftarrow C_i[{\textsf{bs}}]\), \({\mathcal {L}} \leftarrow {\mathcal {L}} \cup \{j \}\).

      • (Ciphertext block) Otherwise, set \(e[t:t+{\textsf{bs}}-1]= C_i[2:{\textsf{bs}}]\), \(t \leftarrow t+ {\textsf{bs}}-1\).

    • If \(|{\mathcal {L}}| \ne l\), output \(\bot \), otherwise output \((z|| e)\).

    If \( \varPi ^{-1}(C)= \bot \), output \(\bot \), otherwise, compute \((sk|| sk' )\leftarrow {\textsf{Rec}}_{m}(z)\), and if \(sk^3 = sk'\), output \({{\textsf{D}}}_{sk}(e)\), otherwise output \(\bot \).

The set of indices of the blocks in which \(z_i\) is stored will be denoted by \(Z_i\).
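To make the block layout of \(\varPi _{\rho }\) and \(\varPi ^{-1}\) concrete, the following Python sketch mirrors the block construction and recovery steps of Construction 4.1 on lists of bits. It is an illustration only, under simplifying assumptions that are not part of the construction: indices are 0-based, \(\rho \) is drawn with `random.sample` rather than by rejection sampling, the ciphertext is zero-padded to a whole number of blocks, the decoder is told the ciphertext length, and the names `pi_encode` and `pi_decode` are hypothetical. Encryption and secret sharing are left out.

```python
import math
import random


def pi_encode(z, e, rng):
    """Place each bit of z in a randomly located sensitive block; pack e into the rest."""
    l = len(z)
    idx_bits = max(1, math.ceil(math.log2(l)))  # bits that store the (0-based) index
    bs = idx_bits + 2                           # flag bit + index + one bit of z
    payload = bs - 1                            # ciphertext bits per non-sensitive block
    d = math.ceil(len(e) / payload)             # number of ciphertext blocks
    bn = l + d
    # Assumption: l distinct positions via sample(), standing in for rejection sampling.
    rho = rng.sample(range(bn), l)
    blocks = [None] * bn
    for i, pos in enumerate(rho):
        index = [int(b) for b in format(i, f"0{idx_bits}b")]
        blocks[pos] = [1] + index + [z[i]]      # sensitive block (1 || i || z[i])
    padded = list(e) + [0] * (d * payload - len(e))
    t = 0
    for pos in range(bn):
        if blocks[pos] is None:                 # position not used by any rho_i
            blocks[pos] = [0] + padded[t:t + payload]
            t += payload
    return blocks, rho


def pi_decode(blocks, l, e_len):
    """Recover (z, e); return None (standing for the symbol ⊥) on inconsistent labels."""
    z = [0] * l
    labels = set()
    e = []
    for block in blocks:
        if block[0] == 1:                       # sensitive block
            j = int("".join(map(str, block[1:-1])), 2)
            if j >= l:                          # extra range check added in this sketch
                return None
            z[j] = block[-1]
            labels.add(j)
        else:                                   # ciphertext block
            e.extend(block[1:])
    if len(labels) != l:
        return None
    return z, e[:e_len]
```

As in the construction, overwriting a sensitive block so that its label disappears makes the label set incomplete, and the sketch then reports \(\bot \).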

We prove security for the above construction by a reduction to the security of Construction 3.1. We note that our reduction is non-black box with respect to the coding scheme to which security is reduced; a generic reduction, i.e., a non-malleable reduction [2], from the standard model to the CRS model is an interesting open problem and thus out of the scope of the present work.

In the following, we consider \(\varGamma =\{0,1\}^{O(\log (k))}\) (cf. Footnote 17). The straightforward way to prove that \(({\textsf{Enc}}^*,{\textsf{Dec}}^*)\) is secure against \( {\mathcal {F}}^{\alpha }_{\varGamma }\) by a reduction to the security of the bit-wise code of Sect. 3, would be as follows: for any \(\alpha \in [0,1)\), \(f\in {\mathcal {F}}^{\alpha }_{\varGamma }\) and any message \(s\), we have to define \(\alpha '\), \(g\in {\mathcal {F}}^{\alpha '}\), such that the output of the tampered execution with respect to \(({\textsf{Enc}}^*,{\textsf{Dec}}^*)\), \(f\), \(s\), is indistinguishable from the tampered execution with respect to \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\), \(g\), \(s\), and \(g\) is an admissible function for \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\). However, this approach might be tricky as it requires the establishment of a relation between \(\alpha \) and \(\alpha '\) such that the sensitive blocks that \(f\) will receive access to, will be simulated using the sensitive bits accessed by \(g\). Our approach is cleaner: for the needs of the current proof we leverage the power of Construction 3.1, by allowing the attacker to choose adaptively the codeword locations, as long as it does not request to read all shares of the secret key. Then, for every block that is accessed by the block-wise attacker \(f\), the bit-wise attacker \(g\) requests access to the locations of the bit-wise code that enable it to fully simulate the input to \(f\). We formally present our ideas in the following sections. In Sect. 4.1 we introduce the function class \({\mathcal {F}}_{{\textsf{ad}}}\) that considers adaptive adversaries with respect to the CRS and we prove security of Construction 3.1 in Corollary 4.3 against a subclass of \({\mathcal {F}}_{{\textsf{ad}}}\), and then, we reduce the security of the block-wise code \(({\textsf{Enc}}^*,{\textsf{Dec}}^*)\) against \( {\mathcal {F}}^{\alpha }_{\varGamma }\) to the security of Construction 3.1 against \({\mathcal {F}}_{{\textsf{ad}}}\) (cf. Sect. 4.2).

4.1 Security Against Adaptive Adversaries

In the current section, we prove that Construction 3.1 is secure against the class of functions that request access to the codeword adaptively, i.e., depending on the CRS, as long as they access a bounded number of sensitive bits. Below, we formally define the function class \({\mathcal {F}}_{{\textsf{ad}}}\), in which the tampering function picks the codeword locations depending on the CRS, and we consider \(\varGamma =\{0,1\}\).

Definition 4.2

(The function class \({\mathcal {F}}_{{\textsf{ad}}}^{\nu }\) (or \({\mathcal {F}}_{{\textsf{ad}}}\))) Let \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\) be a \((\kappa ,\nu )\)-coding scheme and let \(\mathbb {\varSigma }\) be the range of \({\textsf{Init}}(1^k)\). For any \(g=(g_1,g_2) \in {\mathcal {F}}_{{\textsf{ad}}}^{\nu }\), we have \(g_1: \mathbb {\varSigma }\rightarrow {\mathcal {P}}\left( [\nu ] \right) \), \(g_2^{\varSigma }: \{0,1\}^{|{\textsf{range}}(g_1)|} \rightarrow \{0,1\}^{|{\textsf{range}}(g_1)|} \cup \{\bot \}\), and for any \(c\in \{0,1\}^{\nu }\), \(g^{\varSigma }\left( c\right) =g_2^{\varSigma } \left( c_{|_{g_1(\varSigma )}} \right) \). For brevity, the function class will be denoted as \({\mathcal {F}}_{{\textsf{ad}}}\).

Construction 3.1 remains secure against functions that receive full access to the ciphertext, as long as they read at most \(m-1\) of the m shares of each bit of \(sk\) and \(sk^3\). The result is formally presented in the following corollary.

Corollary 4.3

Let k, \(m \in {\mathbb {N}}\). Assuming \(({\textsf{SS}}_{m},{\textsf{Rec}}_{m})\) is an m-out-of-m secret sharing scheme and \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is a 1-IND-CPA secure authenticated encryption scheme, the code of Construction 3.1 is an MD-NMC against any \(g=(g_1,g_2) \in {\mathcal {F}}_{{\textsf{ad}}}\), assuming that for all \(i \in [2|sk|]\), \(\left| Z_i \cap g_1(\varSigma ) \right| < m\), where \(sk\leftarrow {\textsf{KGen}}(1^k)\) and \(\varSigma \leftarrow {\textsf{Init}}(1^k)\).
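The shape of a function in \({\mathcal {F}}_{{\textsf{ad}}}\) and the admissibility condition of the corollary can be illustrated with a short Python sketch. It is not taken from the paper: the helper names are hypothetical, the CRS is an opaque value, the sets \(Z_i\) are given as Python sets of 0-based positions, and we assume that positions outside \(g_1(\varSigma )\) are left untouched in the tampered codeword.

```python
def project(c, I):
    """Projection c|_I: the bits of c at the positions in I, in increasing order."""
    return [c[i] for i in sorted(I)]


def is_admissible(I, Z, m):
    """Condition of the corollary: |Z_i ∩ I| < m for every share set Z_i."""
    return all(len(Z_i & I) < m for Z_i in Z)


def apply_adaptive(g1, g2, crs, c):
    """Apply g = (g1, g2): g1 picks positions from the CRS, g2 rewrites them."""
    I = g1(crs)
    out = g2(crs, project(c, I))
    if out is None:                 # None stands for the symbol ⊥
        return None
    tampered = list(c)              # assumption: unread positions keep their value
    for pos, bit in zip(sorted(I), out):
        tampered[pos] = bit
    return tampered
```

For instance, with \(m=3\) and share sets \(\{0,1,2\}\) and \(\{3,4,5\}\), a function that reads positions \(\{0,1,3\}\) is admissible, while one that reads \(\{0,1,2\}\) is not.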

Proof

Let \(g=(g_1,g_2)\) be as stated above. For any message \(s\), the tampered execution with respect to \(g\) and \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\), is defined as follows.

$$\begin{aligned} {\textsf{Tamper}}^{g}_{s}:= \left\{ \begin{array}{l} \varSigma \leftarrow {\textsf{Init}}(1^k), c\leftarrow {\textsf{Enc}}(\varSigma ,s), I \leftarrow g_1(\varSigma )\\ {\tilde{c}}\leftarrow g_2^{\varSigma }(c_{|_{I}}), \tilde{s}\leftarrow {\textsf{Dec}}(\varSigma ,{\tilde{c}}) \\ \\ \text {If }\tilde{s}=s\text {, output }{\textsf{same}}^*\text {, otherwise, output }\tilde{s}. \end{array} \right\} \end{aligned}$$

The proof is along the lines of the proof of Theorem 3.2, i.e., we prove that for any \(g\) having the properties stated above, and any pair of messages \(s_0\), \(s_1\), \({\textsf{Tamper}}^{g}_{s_0} \approx {\textsf{Tamper}}^{g}_{s_1}\), and the output of the tampered execution is either the original message, or \(\bot \), with overwhelming probability. Below, we revisit the hybrids of Theorem 3.2 and we prove that the indistinguishability between adjacent hybrids, holds with respect to \(g\).

  • \({\textsf{Exp}}_{0}^{\varSigma ,g,s}\): For any \(g\), \(s\), the first experiment, \({\textsf{Exp}}_{0}^{\varSigma ,g,s}\), is identical to the experiment \({\textsf{Tamper}}_{s}^{g}\).

  • \({\textsf{Exp}}_{1}^{\varSigma ,g,s}\): In the second experiment, we have \(Z_i\), \(i \in [2|sk|]\), to be the set of indices in which \(z_i\) is stored, \(|Z_i|=m\). The main difference from the previous experiment is that the current one outputs \(\bot \), if there exists a bit of \(sk\) or \(sk^3\) for which the tampering function reads all shares of it. However, by the definition of \(g\) we know that this happens with zero probability; thus, we have that the following claim holds,

Claim 4.4

Let k, \(m \in {\mathbb {N}}\). For any \(g=(g_1,g_2) \in {\mathcal {F}}_{{\textsf{ad}}}\), assuming that for all \(i \in [2|sk|]\), \(\left| Z_i \cap g_1(\varSigma ) \right| < m\), and any message \(s\), we have \({\textsf{Exp}}_{0}^{\varSigma ,g,s} = {\textsf{Exp}}_{1}^{\varSigma ,g,s}\), where \(sk\leftarrow {\textsf{KGen}}(1^k)\), \(\varSigma \leftarrow {\textsf{Init}}(1^k)\).

  • \({\textsf{Exp}}_{2}^{\varSigma ,g,s}\): In the current experiment, we unfold the encoding procedure, and in addition, we substitute the secret sharing procedure \({\textsf{SS}}_{m}\) with \(\bar{{\textsf{SS}}}_{m}^{g}\), where \(\bar{{\textsf{SS}}}_{m}^{g}\) is defined with respect to \(g\) in the same way that \(\bar{{\textsf{SS}}}_{m}^{f}\) is defined with respect to \(f\) in Claim 3.4 of Theorem 3.2. From the above claim, we have that for all i, \(|I \cap Z_i|<m\), and we observe that the current experiment is identical to the previous one up to the point of computing \(g(c_{|_{I}})\), as \(c_{|_{I}}\) carries no information about \(sk\) and \(sk^3\). Thus, the transition between the current experiment and the previous one is identical to that of Theorem 3.2: an attacker that partially modifies the shares of \(sk\) and \(sk^3\) creates shares of \(\tilde{sk}\) and \(\tilde{sk}'\), such that \(\tilde{sk}^3=\tilde{sk}'\), only with negligible probability in k, which is proved by a reduction to the 1-IND-CPA security of the encryption scheme in the presence of \(z^*\). Thus, we have the following claim.

Claim 4.5

Assuming \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is 1-IND-CPA secure (cf. Definition 2.9), for any \(g\in {\mathcal {F}}_{{\textsf{ad}}}\) and any message \(s\), \({\textsf{Exp}}_{1}^{\varSigma ,g,s} \approx {\textsf{Exp}}_{2}^{\varSigma ,g,s}\), where the probability runs over the randomness used by \({\textsf{Init}}\), \({\textsf{Enc}}\).

  • \({\textsf{Exp}}_{3}^{\varSigma ,g,s}\): As in Theorem 3.2, the indistinguishability between the two experiments follows from the authenticity property of the encryption scheme in the presence of \(z^*\). Thus, the following holds.

Claim 4.6

Assuming the authenticity property of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\), for any \(g\in {\mathcal {F}}_{{\textsf{ad}}}\) and any message \(s\), \( {\textsf{Exp}}_{2}^{\varSigma ,g,s} \approx {\textsf{Exp}}_{3}^{\varSigma ,g,s} \), where the probability runs over the randomness used by \({\textsf{Init}}\), \({\textsf{KGen}}\) and \({{\textsf{E}}}\).

  • Finally, since \(g\) learns nothing about \(sk\), we have that for any \(g\in {\mathcal {F}}_{{\textsf{ad}}}\), and message \(s\), \({\textsf{Exp}}_3^{g,s}\) is indistinguishable from \({\textsf{Exp}}_3^{g,{{\textsf{0}}}}\), where \({{\textsf{0}}}\) denotes the zero message. This follows by the semantic security of the encryption scheme (Definition 2.9). Formally, we prove the following claim.

Claim 4.7

Assuming \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is semantically secure, for any \(g\in {\mathcal {F}}_{{\textsf{ad}}}\) and any message \(s\), \({\textsf{Exp}}_3^{g,s} \approx {\textsf{Exp}}_3^{g,{{\textsf{0}}}} \), where the probability runs over the randomness used by \({\textsf{Init}}\), \({\textsf{KGen}}\), \({{\textsf{E}}}\), “\(\approx \)” may refer to statistical or computational indistinguishability, and \({{\textsf{0}}}\) is the zero message.

From the above claims we have that for any \(g\in {\mathcal {F}}_{{\textsf{ad}}}\) and any \(s_0\), \(s_1\), assuming that for all \(i \in [2|sk|]\), \(\left| Z_i \cap g_1(\varSigma ) \right| < m\), \({\textsf{Exp}}_{0}^{g,s_0} \approx {\textsf{Exp}}_{0}^{g,s_1}\). Non-malleability with manipulation detection then follows by Lemma 2.6: \({\textsf{Exp}}_{0}^{g,s}\) is identical to \({\textsf{Tamper}}^{g}_{s}\) of Lemma 2.6 and, by the indistinguishability between \({\textsf{Exp}}_{0}^{\varSigma ,g,s}\) and \({\textsf{Exp}}_{3}^{\varSigma ,g,s}\), the second property of Lemma 2.6 holds, as the output of \({\textsf{Exp}}_{3}^{\varSigma ,g,s}\) is in \(\{s,\bot \}\) with overwhelming probability. \(\square \)

4.2 MD-NMC Security of the Block-Wise Code

In the current section, we prove security of Construction 4.1 against \({\mathcal {F}}^{\alpha }_{\varGamma }\), for \(\varGamma =\{0,1\}^{O(\log (k))}\).

Theorem 4.8

Let k, \(m \in {\mathbb {N}}\), \(\varGamma =\{0,1\}^{O(\log (k))}\) and \(\alpha \in [0,1)\). Assuming \(({\textsf{SS}}_{m},{\textsf{Rec}}_{m})\) is an m-out-of-m secret sharing scheme and \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is a 1-IND-CPA secure authenticated encryption scheme, the code of Construction 4.1 is an MD-NMC against \({\mathcal {F}}^{\alpha }_{\varGamma }\), for any \(\alpha \), m, such that \((1-\alpha )m=\omega (\log (k))\).

Proof

Following Lemma 2.6, we prove that for any \(f\in {\mathcal {F}}^{\alpha }_{\varGamma }\), and any pair of messages \(s_0\), \(s_1\), \({\textsf{Tamper}}_{s_0}^{f} \approx {\textsf{Tamper}}_{s_1}^{f}\), and for any \(s\), \(\Pr \left[ {\textsf{Tamper}}^f_{s} \notin \{\bot , s\} \right] \le {\textsf{negl}}(k)\), where \({\textsf{Tamper}}\) denotes the experiment defined in Lemma 2.6 with respect to the encoding scheme of Construction 4.1, \(({\textsf{Enc}}^*,{\textsf{Dec}}^*)\). Our proof is given by a series of hybrids depicted in Fig. 9. We reduce the security of \(({\textsf{Enc}}^*,{\textsf{Dec}}^*)\) to the security of Construction 3.1, \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\), against \({\mathcal {F}}_{{\textsf{ad}}}\) (cf. Corollary 4.3). The idea is to move from the tampered execution with respect to \(({\textsf{Enc}}^*,{\textsf{Dec}}^*)\), \(f\), to a tampered execution with respect to \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\), \(g\), such that the two executions are indistinguishable and \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\) is secure against \(g\).

Let \(I_{{{\textsf{b}}}}\) be the set of indices of the blocks that \(f\) chooses to tamper with, where \(|I_{{{\textsf{b}}}}| \le \alpha \nu \), and let \(l \leftarrow 2\,m|sk|\), \({\textsf{bs}}\leftarrow \log l+2\), \({\textsf{bn}}\leftarrow l+|e|/{\textsf{bs}}\). Below we describe the hybrids of Fig. 9.

  • \({\textsf{Exp}}_{0}^{f,s}\): The current experiment is the experiment \({\textsf{Tamper}}_{s}^{f}\), of Lemma 2.6, with respect to \(({\textsf{Enc}}^*,{\textsf{Dec}}^*)\), \(f\), \(s\).

  • \({\textsf{Exp}}_{1}^{(g_1,g_2),s}\): The main difference between \({\textsf{Exp}}_{0}^{f,s}\) and \({\textsf{Exp}}_{1}^{(g_1,g_2),s}\) is that in the latter one, we introduce the tampering function \((g_1,g_2)\), that operates over codewords of \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\) and we modify the encoding steps so that the experiment creates codewords of the bit-wise code \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\). \((g_1,g_2)\) simulates partially the block-wise codeword \(C\), while given partial access to the bit-wise codeword \(c\leftarrow {\textsf{Enc}}(s)\). As we prove in Claim 4.9, it simulates perfectly the tampering effect of \(f\) against \(C\leftarrow {\textsf{Enc}}^*(s)\).

    \(g_1\) operates as follows (cf. Figure 6): it simulates perfectly the randomness for the permutation of the block-wise code, denoted as \(\rho \), and constructs a set of indices I, such that \(g_2\) will receive access to, and tamper with, \(c_{|_{I}}\).

    The set I is constructed with respect to the set of blocks \(I_{{{\textsf{b}}}}\), that \(f\) chooses to access, as well as \(\varSigma \), that reveals the original bit positions, i.e., the ones before permuting \((z||e)\). \(g_2\) receives \(c_{|_{I}}\), reconstructs I, simulates partially the blocks of the block-wise codeword, \(C\), and applies \(f\) on the simulated codeword. The program of \(g_2\) is given in Fig. 7.

    In Claim 4.9, we show that \(g_2\), given \(c_{|_{I}}\), simulates perfectly \(C_{|_{I_{{{\textsf{b}}}}}}\), which implies that \(g_2^{\varSigma }(c_{|_{I}}) = f(C_{|_{I_{{{\textsf{b}}}}}})\), and the two executions are identical.

  • \({\textsf{Exp}}_{2}^{(g_1,g_3),s}\): In the current experiment, we substitute the function \(g_2\) with \(g_3\), and \({\textsf{Dec}}^*\) with \({\textsf{Dec}}\), respectively. By inspecting the code of \(g_2\) and \(g_3\) (cf. Figs. 7 and 8, respectively), we observe that the latter function executes the code of the former, plus the “Check labels and simulate \({\tilde{c}}[I]\)” step. Thus, the two experiments are identical up to the point of computing \(f(C_{|_{I_{{{\textsf{b}}}}}}^*)\).

    The main idea here is that we want the current execution to be with respect to \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\) against \((g_1,g_3)\). Thus, we substitute \({\textsf{Dec}}^*\) with \({\textsf{Dec}}\), and we expand the function \(g_2\) with some extra instructions/checks that are missing from \({\textsf{Dec}}\). We name the resulting function as \(g_3\) and we prove that the two executions are identical.

  • Finally, we prove that for any \(f\) and any \(s\),

    $$\begin{aligned} \Pr \left[ {\textsf{Exp}}_{2}^{(g_1,g_3),s} \notin \{\bot , s\} \right] \le {\textsf{negl}}(k). \end{aligned}$$

    We do so by proving that \((g_1,g_3)\) is admissible for \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\), i.e., \((g_1,g_3) \in {\mathcal {F}}_{{\textsf{ad}}}\), and \(g_1\) will not request access to more than \(m-1\) shares of each bit of \(sk\), \(sk^3\) (cf. Corollary 4.3). This implies security according to Lemma 2.6.

Fig. 6: The function \(g_1\) that appears in the hybrid experiments of Fig. 9

Fig. 7: The function \(g_2\) that appears in the hybrid experiments of Fig. 9

Fig. 8: The function \(g_3\) that appears in the hybrid experiments of Fig. 9

Fig. 9: The hybrid experiments for the proof of Theorem 4.8

In what follows, we prove indistinguishability between the hybrids.

Claim 4.9

For any \(f\in {\mathcal {F}}^{\alpha }_{\varGamma }\) and any \(s\), \({\textsf{Exp}}_{0}^{f,s} = {\textsf{Exp}}_{1}^{(g_1,g_2),s}\) (cf. Footnote 18).

Proof

The main difference between \({\textsf{Exp}}_0\) and \({\textsf{Exp}}_1\) is that in \({\textsf{Exp}}_1\), we introduce the tampering function \(g=(g_1,g_2)\) that operates over codewords of \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\), and simulates partially the block-wise code. We observe that \(g_1\) simulates perfectly the randomness of the permutation for the block-wise code, denoted as \(\rho \). Thus, the computation \(C\leftarrow \varPi _{\rho }(z||e)\) does not induce any statistical difference between the two experiments. By the definition of \(g_1\), we have that \(c_{|_{I}}\) consists of all ciphertext bits, as well as the indices \(r_i\), for which \(\rho _i \in I_{{{\textsf{b}}}}\), \(i \in [l]\), i.e., if \(f\) requests access to the sensitive block with index \(\rho _i\), containing \(z[i]\), \(g_1\) will request access to the \(r_i\)-th bit of \(c\), which is \(z[i]\). Thus, \(g_2\) will receive as input the entire ciphertext and all the sensitive bits that \(f\) will request access to, with respect to \(I_{{{\textsf{b}}}}\), thus it can fully simulate \(C_{|_{I_{{{\textsf{b}}}}}}\) while being consistent with the distribution of blocks in \(C_{|_{I_{{{\textsf{b}}}}^{{\textsf{c}}}}}\), as \(\rho \) is generated by \(g_1\). Thus, we have that \(g_2^{\varSigma }(c_{|_{I}})\) is identical to \(f( C_{|_{I_{{{\textsf{b}}}}}} )\), and the proof of the claim is complete. \(\square \)

Claim 4.10

For any \(f\in {\mathcal {F}}^{\alpha }_{\varGamma }\) and any \(s\), \({\textsf{Exp}}_{1}^{(g_1,g_2),s} = {\textsf{Exp}}_{2}^{(g_1,g_3),s}\).

Fig. 10: The unfolded code of \({\textsf{Exp}}_1\) and \({\textsf{Exp}}_2\)

Proof

In \({\textsf{Exp}}_2\) we substitute the function \(g_2\) with \(g_3\), and \({\textsf{Dec}}^*\) with \({\textsf{Dec}}\), respectively. By inspecting the code of \(g_2\) and \(g_3\), we observe that the latter function executes the code of the former, plus the “Check labels and simulate \({\tilde{c}}[I]\)” step. Thus, the two experiments are identical up to the point of computing \(f(C_{|_{I_{{{\textsf{b}}}}}})\). We unfold the code of the two experiments from that point of the computation onward (cf. Figure 10). The idea is that the consistency check on the labels of the block-wise code is transferred from \({\textsf{Dec}}^*\) in \({\textsf{Exp}}_1\) to \(g_3\) in \({\textsf{Exp}}_2\), and \({\textsf{Dec}}^*\) is substituted by \({\textsf{Dec}}\), so that \({\textsf{Exp}}_{2}^{(g_1,g_3),s}\) is the tampering experiment of Lemma 2.6 with respect to \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\) and \((g_1,g_3)\).

In order to show that \({\textsf{Exp}}_{1}^{(g_1,g_2),s} = {\textsf{Exp}}_{2}^{(g_1,g_3),s}\), it suffices to prove that \({\textsf{Dec}}^*({\tilde{C}})={\textsf{Dec}}({\tilde{c}})\). By inspecting \({\textsf{Exp}}_{2}^{(g_1,g_3),s}\), we have that \({\tilde{c}}= \bot \) if and only if \(\varPi ^{-1}({\tilde{C}}^*) = \bot \). By the definition of \(\varPi ^{-1}\) (cf. Construction 4.1), \(\varPi ^{-1}({\tilde{C}}^*) = \bot \) if and only if the tampering function creates an inconsistent set of labels, an effect that can be decided by \(g_3\) by only partially accessing \(C\), since it fully simulates the labels for the block-wise code. By Claim 4.9, \(C_{|_{I_{{{\textsf{b}}}}}} = C^*_{|_{I_{{{\textsf{b}}}}}}\) and thus \({\tilde{C}}_{|_{I_{{{\textsf{b}}}}}} = {\tilde{C}}^*_{|_{I_{{{\textsf{b}}}}}}\), which implies that \(\varPi ^{-1}({\tilde{C}}^*) = \bot \) if and only if \(\varPi ^{-1}({\tilde{C}}) = \bot \). We conclude that \({\tilde{c}}= \bot \) if and only if \(\varPi ^{-1}({\tilde{C}}) = \bot \). Let E be the event in which \({\tilde{c}}\ne \bot \). Clearly, conditioned on \(\lnot E\) the two experiments are identical, as both output \(\bot \). It remains to prove the same conditioned on E.

By inspecting the two experiments, and conditioned on E, we have

$$\begin{aligned} {\tilde{c}}_{|_{I}} = {\tilde{c}}^*_{|_{I}} = \left[ P_{\varSigma }\left( \varPi ^{-1} ({\tilde{C}}^*) \right) \right] _{|_{I}} = \left[ P_{\varSigma }\left( \varPi ^{-1} ({\tilde{C}}) \right) \right] _{|_{I}}, \end{aligned}$$
(7)

where the last equality follows from the fact that \(\left[ P_{\varSigma }\left( \varPi ^{-1} ({\tilde{C}}^*) \right) \right] _{|_{I}}\) is independent of the blocks of \({\tilde{C}}\) that \({\textsf{Exp}}_2\) does not have access to. Moreover,

$$\begin{aligned} {\tilde{c}}_{|_{I^{{\textsf{c}}}}} = c_{|_{I^{{\textsf{c}}}}} = \left[ P_{\varSigma }\left( \varPi ^{-1} (C) \right) \right] _{|_{I^{{\textsf{c}}}}} = \left[ P_{\varSigma }\left( \varPi ^{-1} ({\tilde{C}}) \right) \right] _{|_{I^{{\textsf{c}}}}}, \end{aligned}$$
(8)

where the last equality follows from the fact that \({\tilde{c}}_{|_{I^{{\textsf{c}}}}}\) is not being accessed by the tampering function. From the above relations, we have that \({\tilde{c}}= P_{\varSigma }\left( \varPi ^{-1} ({\tilde{C}}) \right) \), thus \(P^{-1}_{\varSigma } ({\tilde{c}}) = \varPi ^{-1} ({\tilde{C}}) \), and the two executions are identical conditioned on E. \(\square \)

Claim 4.11

Assuming \((1-\alpha )m = \omega (\log (k))\), for any \(f\in {\mathcal {F}}^{\alpha }_{\varGamma }\) and any \(s\),

$$\begin{aligned} \Pr \left[ {\textsf{Exp}}_{2}^{(g_1,g_3),s} \notin \{\bot , s\} \right] \le {\textsf{negl}}(k), \end{aligned}$$

over the randomness of \({\textsf{Exp}}_2\).

Proof

Assuming \((1-\alpha )m = \omega (\log (k))\), it suffices to prove that for any \(f\in {\mathcal {F}}^{\alpha }_{\varGamma }\), the function \((g_1,g_3) \in {\mathcal {F}}_{{\textsf{ad}}}\) is admissible for \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\), i.e., \(g_1\) will not request access to more than \(m-1\) shares of each bit of \(sk\), \(sk^3\), and the proof of the claim will follow by Corollary 4.3 and Lemma 2.6. We prove that for any \(f\in {\mathcal {F}}^{\alpha }_{\varGamma }\), the corresponding \((g_1,g_3)\) will not access the entire \(z_i\), for any \(i \in [2|sk|]\), with overwhelming probability. The bad event, in which some \(z_i\) is fully accessed, takes place if and only if \(\exists i: |I_{{{\textsf{b}}}}\cap Z_i| = m\). We define by \(E_i\) the event in which \(f\) requests access to all blocks in which \(z_i\) is stored. Assuming \(f\) reads n blocks, we have that for all \(i \in [2|sk|]\),

$$\begin{aligned} \Pr _{\rho }[E_i] = \Pr _{\rho }\left[ \ |I_{{{\textsf{b}}}}\cap Z_i| = m \ \right] = \prod _{j=0}^{m-1} \frac{n-j}{\nu -j} \le \left( \frac{n}{\nu } \right) ^m. \end{aligned}$$

We have \(n=\alpha \nu \) and assuming \(\alpha = 1-\epsilon \) for \(\epsilon \in (0,1]\), we have \(\Pr [E_i] \le (1-\epsilon )^m \le 1/e^{m\epsilon }\) and

$$\begin{aligned} \Pr [E]=\Pr _{\rho } \left[ \bigcup _{i=1}^{2|sk|} E_i \right] \le \frac{2|sk|}{e^{m\epsilon }}, \end{aligned}$$

which is negligible when \((1-\alpha )m = \omega (\log (k))\). \(\square \)
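The chain of inequalities in the proof can be checked numerically. The Python sketch below evaluates the exact product \(\prod _{j=0}^{m-1} \frac{n-j}{\nu -j}\), compares it with \((n/\nu )^m\) and \(e^{-m\epsilon }\), and computes the union bound over the \(2|sk|\) bits. The function names and the concrete parameter values used in any call are illustrative assumptions, not values from the paper.

```python
import math


def prob_all_shares_read(n, nu, m):
    """Exact Pr[E_i]: all m blocks holding z_i fall inside the n blocks read by f."""
    p = 1.0
    for j in range(m):
        p *= max(n - j, 0) / (nu - j)
    return p


def union_bound(num_bits, m, eps):
    """Upper bound num_bits / e^(m * eps) on Pr[E], with num_bits playing the role of 2|sk|."""
    return num_bits * math.exp(-m * eps)


def bound_chain(n, nu, m):
    """Return (exact, (n/nu)^m, e^(-m*eps)) with eps = 1 - n/nu; the values are non-decreasing."""
    eps = 1.0 - n / nu
    return prob_all_shares_read(n, nu, m), (n / nu) ** m, math.exp(-m * eps)
```

The exact product coincides with the ratio \(\left( {\begin{array}{c}n\\ m\end{array}}\right) / \left( {\begin{array}{c}\nu \\ m\end{array}}\right) \), which gives an independent way to validate the first function.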

The security of the block-wise code follows from the above claims and the MD-NMC security of \(({\textsf{Init}},{\textsf{Enc}},{\textsf{Dec}})\). \(\square \)

Instantiations and rates. By instantiating Construction 4.1, with 2.11, Theorem 4.8, for \(m=k\log k\), \(\alpha = 1-1/\varOmega (\log k)\), yields rate \(1-1/\varOmega (\log k)\) MD-NMC, with access rate \(1-1/\varOmega (\log k)\) and codewords of length \(|s|(1+1/O(\log k))+O(k^2 \log ^2 k)\), assuming one-way functions. In addition, by instantiating Construction 4.1, with 2.12, Theorem 4.8, for \(m=|s|\log |s|\), \(\alpha = 1-1/O(\log (|s|))\), yields unconditionally secure, rate \(1/O(|s|\log ^2(|s|))\) MD-NMC in the standard model, with access rate \(\varOmega (1-1/\log (|s|))\) and codewords of length \(O(|s|^2 \log ^2 |s|)\).

5 Continuous MD-NMC with Light Updates

In this section, we enhance the block-wise scheme of Sect. 4 with an update mechanism that uses only shuffling and refreshing operations. The resulting code is secure against continuous attacks, for a notion of security that is weaker than the original one [45], as we need to update the codeword after each round of execution. However, our update mechanism uses cheap operations, avoiding the full decoding and re-encoding of the message, which is the standard way to achieve continuous security [39, 55] using a one-time NMC. In addition, our solution avoids the usage of a self-destruction mechanism that produces \(\bot \) in all subsequent rounds after the first round in which the attacker creates an invalid codeword. Avoiding the self-destruction mechanism was originally proposed by [41], and it is an important step toward practicality, as (i) the mechanism is subject to denial of service attacks, and (ii) it renders the device useless in the presence of non-adversarial hardware faults. Our solution enables normal use of the device in the presence of such faults and provides security against malicious attacks (cf. Footnote 19).

The update mechanism of the proposed scheme works as follows: in each round, it randomly shuffles the blocks and refreshes the randomness of the inner encoding of \(sk\). The idea here is that, due to the continual shuffling and refreshing of the inner encoding scheme, in each round the attacker learns nothing about the secret key, and every attempt to modify the inner encoding results in an invalid key, with overwhelming probability. Our update mechanism can be made deterministic if we further encode the seed of a PRG together with the secret key, which is similar to the technique presented in [55].

Below we define the update mechanism, which is denoted as \({\textsf{Update}}^*\).

Construction 5.1

Let k, \(m \in {\mathbb {N}}\), and let \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\), \(({\textsf{SS}}_{m},{\textsf{Rec}}_{m})\), \({\textsf{Enc}}^*\), \({\textsf{Dec}}^*\), be as in Construction 4.1. We define the update procedure, \({\textsf{Update}}^*\), for the encoding scheme of Construction 4.1, as follows:

  • \({\textsf{Update}}^*(1^k,\cdot )\): on input \(C\), parse it as \((C_1||\ldots ||C_{{\textsf{bn}}})\), set \(l \leftarrow 2\,m|sk|\), \(\hat{{\mathcal {L}}}= \emptyset \), and set \({\hat{C}}:= ({\hat{C}}_1 || \ldots || {\hat{C}}_{{\textsf{bn}}})\) to zeros.

    • (Secret share \(0^{2|sk|}\)): Sample \(z\leftarrow {\textsf{SS}}_{m}\left( 0^{2|sk|} \right) \), where \(z \in \{0,1\}^{2\,m|sk|}\), and for \(i \in \left[ 2|sk| \right] \), \(z_i\) is an m-out-of-m secret sharing of the 0 bit.

    • (Shuffle & Refresh): Sample \(\rho :=(\rho _1,\ldots ,\rho _l) {\mathop {\leftarrow }\limits ^{{\textsf{rs}}}}\{0,1\}^{\log ({\textsf{bn}})} \). For \(i \in [{\textsf{bn}}]\),

      • (Sensitive block) If \(C_i[1] = 1\),

        • (Shuffle): Set \(j \leftarrow C_i[2:{\textsf{bs}}-1]\), \({\hat{C}}_{\rho _j} \leftarrow C_i\).

        • (Refresh): Set \({\hat{C}}_{\rho _j}[{\textsf{bs}}] \leftarrow {\hat{C}}_{\rho _j}[{\textsf{bs}}] \oplus z[j]\).

      • (Ciphertext block) If \(C_i[1] = 0\), set \(j \leftarrow \min _n \left\{ n \in [{\textsf{bn}}] \big | n \notin \hat{{\mathcal {L}}}, n \ne \rho _i, i \in [l] \right\} \), and \({\hat{C}}_{j} \leftarrow C_i\), \(\hat{{\mathcal {L}}}\leftarrow \hat{{\mathcal {L}}}\cup \{j \}\).

    Output \({\hat{C}}\).
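The shuffle and refresh steps of \({\textsf{Update}}^*\) can be sketched in Python as follows. The sketch makes several assumptions that are not fixed by the construction: the secret sharing is instantiated with XOR-based m-out-of-m sharing, the m shares of the i-th shared bit occupy consecutive positions of \(z\), indices are 0-based, \(\rho \) is drawn with `random.sample` instead of rejection sampling, and the input is a valid (untampered) codeword given as a list of bit lists. The names `share_zero` and `update` are hypothetical.

```python
import random
from functools import reduce


def share_zero(m, rng):
    """XOR-based m-out-of-m sharing of the bit 0 (assumed instantiation of SS_m)."""
    shares = [rng.randrange(2) for _ in range(m - 1)]
    shares.append(reduce(lambda a, b: a ^ b, shares, 0))  # forces the XOR of all shares to 0
    return shares


def update(blocks, l, m, rng):
    """Move sensitive blocks to fresh random positions and re-randomize their share bits."""
    bn = len(blocks)
    zero = [bit for _ in range(l // m) for bit in share_zero(m, rng)]
    rho = rng.sample(range(bn), l)          # new positions of the l sensitive blocks
    new_blocks = [None] * bn
    for block in blocks:
        if block[0] == 1:                   # sensitive block: shuffle, then refresh
            j = int("".join(map(str, block[1:-1])), 2)
            new_blocks[rho[j]] = block[:-1] + [block[-1] ^ zero[j]]
    free = [pos for pos in range(bn) if new_blocks[pos] is None]
    ciphertext = [block for block in blocks if block[0] == 0]
    for pos, block in zip(free, ciphertext):  # ciphertext blocks keep their relative order
        new_blocks[pos] = list(block)
    return new_blocks
```

Because each group of m shares is XORed with a sharing of 0, the value reconstructed from every group is unchanged, while the positions and the individual share bits are re-randomized without decoding the message.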

The following definition of security is along the lines of the one given in [45], adapted to the notion of non-malleability with manipulation detection. Also, after each invocation the codewords are updated, where in our case the update mechanism is only using shuffling and refreshing operations. In addition, there is no need for self-destruct after detecting an invalid codeword [41].

Definition 5.2

(Continuous MD-NMC with light updates) Let \({\textsf{CS}}=({\textsf{Enc}},{\textsf{Dec}})\) be an encoding scheme, \(\mathcal {F}\) be a function class and \(k,q \in {\mathbb {N}}\). Then, \({\textsf{CS}}\) is a q-continuously non-malleable code with manipulation detection (q-MD-CNMC) with light updates, if for every sufficiently large \(k \in {\mathbb {N}}\), any pair of messages \(s_0\), \(s_1 \in \{0,1\}^{\textrm{poly}(k)}\), and any algorithm \({\mathcal {A}}\),

$$\begin{aligned} \left\{ {\textsf{Tamper}}^{{\mathcal {A}}}_{s_0}(k) \right\} _{k \in {\mathbb {N}}} \approx \left\{ {\textsf{Tamper}}^{{\mathcal {A}}}_{s_1}(k) \right\} _{k \in {\mathbb {N}}}, \end{aligned}$$

where,

$$\begin{aligned} \begin{array}{l} {\textsf{Tamper}}^{{\mathcal {A}}}_{s}(k):\\ C\leftarrow {\textsf{Enc}}(1^k,s),\tilde{s}\leftarrow {{\textsf{0}}}\\ \text {For }\tau \in [q]:\\ \hspace{0.5cm} f\leftarrow {\mathcal {A}}(\tilde{s}), {\tilde{C}}\leftarrow f(C), \tilde{s}\leftarrow {\textsf{Dec}}({\tilde{C}}) \\ \hspace{0.5cm} \text {If } \tilde{s}= s: \ \tilde{s}\leftarrow {\textsf{same}}^*\\ \hspace{0.5cm} C\leftarrow {\textsf{Update}}^*(1^k,C)\\ out \leftarrow {\mathcal {A}}(\tilde{s})\\ {\textbf {Return}}: out \end{array} \end{aligned}$$

and for each round the output of the decoder is not in \(\{s, \bot \}\) with negligible probability in k, over the randomness of \({\textsf{Tamper}}^{{\mathcal {A}}}_{s}\).
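
The control flow of \({\textsf{Tamper}}^{{\mathcal {A}}}_{s}\) can be sketched as a small Python harness. The sketch only mirrors the order of operations in Definition 5.2; the adversary interface (`next_function`, `output`) and the checksum-based `toy_enc`/`toy_dec` are hypothetical and chosen solely to make the harness runnable. The toy code is not non-malleable, since anyone can recompute an unkeyed hash.

```python
import hashlib

SAME = "same*"  # the special symbol same^*
BOT = None      # stands in for the decoder's ⊥


def tamper_experiment(enc, dec, update, adversary, s, q):
    """Run q rounds: tamper, decode, report, then update the untampered codeword."""
    C = enc(s)
    view = 0  # s~ is initialised to the zero message
    for _ in range(q):
        f = adversary.next_function(view)
        s_tilde = dec(f(C))  # f must return a new codeword, not mutate C
        view = SAME if s_tilde == s else s_tilde
        C = update(C)  # the original codeword is refreshed, not the tampered one
    return adversary.output(view)


def toy_enc(s):
    """Toy encoder (assumption): message followed by an unkeyed SHA-256 checksum."""
    return s + hashlib.sha256(s).digest()


def toy_dec(c):
    """Toy decoder: returns the message, or BOT if the checksum does not match."""
    s, tag = c[:-32], c[-32:]
    return s if hashlib.sha256(s).digest() == tag else BOT


class FixedAdversary:
    """Hypothetical adversary: submits the same f each round, outputs the answers seen."""

    def __init__(self, f):
        self.f, self.answers = f, []

    def next_function(self, view):
        self.answers.append(view)
        return self.f

    def output(self, view):
        return self.answers[1:] + [view]
```

With the identity function the adversary only ever sees \({\textsf{same}}^*\); with a function that flips a bit of the message part, the toy decoder answers \(\bot \) in every round. No self-destruct is triggered, so the experiment continues after an invalid codeword.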

Below we prove that the scheme of Construction 5.1 is continuously non-malleable with manipulation detection and light updates.

Theorem 5.3

Let q, k, \(m \in {\mathbb {N}}\), \(\varGamma =\{0,1\}^{O(\log (k))}\) and \(\alpha \in [0,1)\). Assuming \(({\textsf{SS}}_{m},{\textsf{Rec}}_{m})\) is an m-out-of-m secret sharing scheme and \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is a 1-IND-CPA secure, authenticated encryption scheme, the scheme of Construction 5.1 is an MD-CNMC with light updates, against \({\mathcal {F}}^{\alpha }_{\varGamma }\), for any \(\alpha \), m, such that \((1-\alpha )m=\omega (\log (k))\).

Fig. 11 Hybrids for the proof of Theorem 5.3. The gray part signifies the portion of the code of an experiment that differs from the previous one

Proof

Let \({\mathcal {A}}\) be any adversary playing against \({\textsf{Tamper}}_{s}^{{\mathcal {A}}}\), for any \(s\). Let \(I_{{{\textsf{b}}}}\) be the set of indices chosen by the attacker in each round and \(I^{{\textsf{c}}}= [\nu ] \backslash I\). The tampered components of the codeword will be denoted using the symbol “~” on top of the original symbol. Our proof follows the strategy of the proof of Theorem 3.2, using a series of hybrid experiments that are depicted in Fig. 11. Below, we describe the hybrids.

  • \({\textsf{Exp}}_{0}^{{\mathcal {A}},s,q}\): For any \({\mathcal {A}}\), \(s\), q, the experiment \({\textsf{Exp}}_{0}^{{\mathcal {A}},s,q}\) is the experiment \({\textsf{Tamper}}_{s}^{{\mathcal {A}}}\) of Definition 5.2.

  • \({\textsf{Exp}}_{1}^{{\mathcal {A}},s,q}\): In the second experiment, for each round of the execution, we define \(Z_i\), \(i \in [2|sk|]\), to be the set of indices in which \(z_i\) is stored, \(|Z_i|=m\). Intuitively, each call to the \({\textsf{Update}}^*\) procedure permutes the blocks using a fresh permutation key and updates the shares of \(sk\) and \(sk^3\). As a result, in each round the attacker finds all shares for a bit of \(sk\) or \(sk^3\) only with negligible probability in k, so the tampering function does not access any bit of \(sk\) and \(sk^3\), even if it is given access to \((1-o(1)) \nu \) blocks of the codeword. The indistinguishability between the current experiment and the previous one then follows from a claim analogous to Claim 3.3, made in the proof of Theorem 3.2. In particular, we have the following claim.

Claim 5.4

For k, q, \(m \in {\mathbb {N}}\), assume \((1-\alpha )m = \omega (\log (k))\). Then, for any \({\mathcal {A}}\) that chooses its tampering strategy from \({\mathcal {F}}^{\alpha }_{\varGamma }\), and any message \(s\), we have \({\textsf{Exp}}_{0}^{{\mathcal {A}},s,q} \approx {\textsf{Exp}}_{1}^{{\mathcal {A}},s,q}\), where the probability runs over the randomness used by \({\textsf{Enc}}^*\), \({\textsf{Update}}^*\).
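
A back-of-the-envelope version of the bound behind this claim may be useful. It is only a sketch under simplifying assumptions that are not taken from the proof of Claim 3.3: in a given round the m blocks holding the shares \(z_i\) occupy uniformly random distinct positions among the \(\nu \) blocks, independently of the set \(I_{{{\textsf{b}}}}\) read by the tampering function, and \(m \le |I_{{{\textsf{b}}}}| = \alpha \nu \) (for \(m > \alpha \nu \) the probability below is zero). Then, for a fixed i,

$$\begin{aligned} \Pr \left[ Z_i \subseteq I_{{{\textsf{b}}}} \right] = \prod _{j=0}^{m-1} \frac{\alpha \nu -j}{\nu -j} \le \alpha ^{m} = \left( 1-(1-\alpha ) \right) ^{m} \le e^{-(1-\alpha )m} = e^{-\omega (\log (k))} = k^{-\omega (1)}, \end{aligned}$$

where the first inequality uses \(\frac{\alpha \nu -j}{\nu -j} \le \alpha \) for \(\alpha \le 1\) and \(0 \le j < \nu \). A union bound over the \(2|sk|\) shared bits, and over polynomially many rounds, keeps the probability negligible in k.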

  • \({\textsf{Exp}}_{2}^{{\mathcal {A}},s,q}\): In the third experiment, we define \({{\tilde{Z}}}_i\) to be the set of indices in which \({\tilde{z}}_i\) is stored, \(|{{\tilde{Z}}}_i|=m\). The main difference from the previous experiment is that we unfold the encoding procedure and, in addition, substitute the secret sharing procedure \({\textsf{SS}}_{m}\) with \(\bar{{\textsf{SS}}}_{m}^{f,\rho }\), defined as follows:

    \(\bar{{\textsf{SS}}}_{m}^{f,\rho }(sk)\):

    1. Sample \(\left( z_1^*,\ldots ,z_{2|sk|}^* \right) \leftarrow {\textsf{SS}}_{m}\left( sk||sk^3 \right) \).

    2. For \(i \in [2|sk|]\):

      $$\begin{aligned} l_i:= \max _{d} \left\{ d \in [m] \wedge {\textsf{Ind}}\left( z_i[d] \right) \notin I_{{{\textsf{b}}}} \right\} , \end{aligned}$$

      where \({\textsf{Ind}}\) returns the index of \(z_i[d]\) in \(C\), i.e., \(l_i\) is the largest index in [m] such that the codeword block containing \(z_i[l_i]\) is not accessed by \(f\).

    3. (Output): For all i set \(z_i^*[l_i]=*\), and output \(z^*:= \parallel _{i=1}^{2|sk|} z^*_i\).

    In \({\textsf{Exp}}_{1}^{{\mathcal {A}},s,q}\), we have \(z = \parallel _{i=1}^{2|sk|} z_i\), and each \(z_i\) is an m-out-of-m secret sharing for a bit of \(sk\) or \(sk^3\). From the first transition we have that for all i, \(|I_{{{\textsf{b}}}}\cap Z_i|<m\) with overwhelming probability, and the current experiment is identical to the previous one up to the point of computing \(f(C_{|_{I_{{{\textsf{b}}}}}})\), as \(C_{|_{I_{{{\textsf{b}}}}}}\) and \(f(C_{|_{I_{{{\textsf{b}}}}}})\) depend only on \(z^*\), which gives no information about \(sk\) and \(sk^3\).

    Another difference between the two experiments is that, after applying the tampering function, \({\textsf{Exp}}_{1}^{{\mathcal {A}},s,q}\) makes a call to the decoder, while \({\textsf{Exp}}_{2}^{{\mathcal {A}},s,q}\) checks whether the tampering function has modified the shares in such a way that the reconstruction procedure will give \(\tilde{sk}\ne sk\) or \(\tilde{sk}' \ne sk'\). If a modification is detected, the current experiment sends \(\bot \) to the attacker. The main idea here is that a tampering function that modifies the shares of \(sk\) and \(sk^3\) creates shares of \(\tilde{sk}\) and \(\tilde{sk}'\) such that \(\tilde{sk}^3=\tilde{sk}'\) only with negligible probability in k. We prove this by a reduction to the 1-IND-CPA security of the encryption scheme in the presence of \(z^*\), which, as already stated, gives no information about the secret key. The indistinguishability between the two experiments follows from the following claim, whose proof is similar to the one given in Claim 3.4.

Claim 5.5

Assuming \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is 1-IND-CPA secure (cf. Definition 2.9), for any \({\mathcal {A}}\) choosing its tampering strategy from \({\mathcal {F}}^{\alpha }_{\varGamma }\), and any message \(s\), \({\textsf{Exp}}_{1}^{{\mathcal {A}},s,q} \approx {\textsf{Exp}}_{2}^{{\mathcal {A}},s,q}\), where the probability runs over the randomness used by \({\textsf{Enc}}^*\), \({\textsf{Update}}^*\).
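
The masked sharing \(\bar{{\textsf{SS}}}_{m}^{f,\rho }\) used in \({\textsf{Exp}}_{2}\) can be sketched in Python as follows. The sketch assumes XOR sharing for \({\textsf{SS}}_{m}\), takes the bits to be shared as a plain list (the text shares \(sk||sk^3\)), uses 0-based share indices instead of \([m]\), and models \({\textsf{Ind}}\) by a hypothetical callable `ind(i, d)` that returns the block position of share d of bit i. The function `visible_distribution` is an illustrative check, not part of the proof.

```python
import random
from collections import Counter
from functools import reduce
from itertools import product
from operator import xor

MASK = "*"


def xor_shares(bit, coins):
    """m-out-of-m XOR sharing built from m-1 random coins."""
    return list(coins) + [reduce(xor, coins, bit)]


def masked_sharing(bits, m, ind, accessed, rng):
    """Share each bit, then mask the highest-index share stored outside `accessed`."""
    out = []
    for i, bit in enumerate(bits):
        shares = xor_shares(bit, [rng.randrange(2) for _ in range(m - 1)])
        # max() raises ValueError if every share of bit i is accessed, the
        # event that the first hybrid transition shows to be negligible.
        l_i = max(d for d in range(m) if ind(i, d) not in accessed)
        shares[l_i] = MASK
        out.append(shares)
    return out


def visible_distribution(bit, m, masked_index):
    """Exact distribution of the m-1 unmasked shares over all random coins."""
    dist = Counter()
    for coins in product((0, 1), repeat=m - 1):
        shares = xor_shares(bit, coins)
        shares[masked_index] = MASK
        dist[tuple(shares)] += 1
    return dist
```

Enumerating all coins shows that the unmasked shares have the same distribution whether the shared bit is 0 or 1, for example `visible_distribution(0, 4, 1) == visible_distribution(1, 4, 1)`, which is the sense in which \(z^*\) carries no information about the shared value.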

  • \({\textsf{Exp}}_{3}^{{\mathcal {A}},s,q}\): In the final experiment, in each round of the execution, instead of calling the decryption \({{\textsf{D}}}_{sk}({\tilde{e}})\), we first check if the attacker has modified the ciphertext, in which case the current experiment outputs \(\bot \), otherwise it outputs \({\textsf{same}}^*\). This part of the program is reached only if the tampering function does not modify the secret key. Thus, the indistinguishability between the two experiments follows from the authenticity property of the encryption scheme in the presence of \(z^*\), which is updated in each round depending on the set \(I_{{{\textsf{b}}}}\). Clearly, requesting \(z^*\) adaptively in each round does not compromise the security of the encryption scheme, as \(z^*\) carries no information about \(sk\). Thus, in each round, given that \(\tilde{sk}= sk\) and \(\tilde{sk}' = sk'\), we have that if the attacker modifies the ciphertext, then with overwhelming probability \({{\textsf{D}}}_{sk}({\tilde{e}})=\bot \), otherwise, \({{\textsf{D}}}_{sk}({\tilde{e}})=s\), and the current experiment correctly sends \(\tilde{s}= {\textsf{same}}^*\) to the attacker. Thus, we have the following claim.

Claim 5.6

Assuming the authenticity property of \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\), for any \({\mathcal {A}}\) choosing its tampering strategy from \({\mathcal {F}}^{\alpha }_{\varGamma }\), and any message \(s\), \( {\textsf{Exp}}_{2}^{{\mathcal {A}},s,q} \approx {\textsf{Exp}}_{3}^{{\mathcal {A}},s,q}\), where the probability runs over the randomness used by \({\textsf{KGen}}\), \({{\textsf{E}}}\) and \({\textsf{Update}}^*\).

  • Finally, we have that for any \({\mathcal {A}}\) choosing its tampering strategy from \({\mathcal {F}}^{\alpha }_{\varGamma }\), and any message \(s\), \({\textsf{Exp}}_3^{{\mathcal {A}},s,q}\) is indistinguishable from \({\textsf{Exp}}_3^{{\mathcal {A}},{{\textsf{0}}},q}\), where \({{\textsf{0}}}\) denotes the zero message. This follows from the semantic security of the encryption scheme in the presence of \(z^*\), for the multi-round case.

Claim 5.7

Assuming \(({\textsf{KGen}},{{\textsf{E}}},{{\textsf{D}}})\) is semantically secure, for any \({\mathcal {A}}\) choosing its tampering strategy from \({\mathcal {F}}^{\alpha }_{\varGamma }\), \({\textsf{Exp}}_3^{{\mathcal {A}},s,q} \approx {\textsf{Exp}}_3^{{\mathcal {A}},{{\textsf{0}}},q}\), where the probability runs over the randomness used by \({\textsf{KGen}}\), \({{\textsf{E}}}\) and \({\textsf{Update}}^*\), “\(\approx \)” may refer to statistical or computational indistinguishability, and \({{\textsf{0}}}\) denotes the zero message.

The above claims conclude our proof. Clearly, the manipulation detection property follows from the fact that the output of \({\textsf{Exp}}_3\) is in \(\{{\textsf{same}}^*,\bot \}\), with overwhelming probability. \(\square \)

In the above theorem, q can be polynomial (resp. exponential) in k, assuming the underlying encryption scheme is computationally (resp. unconditionally) secure.

Instantiations and rates. Instantiating Construction 5.1 with 2.11, Theorem 5.3 yields, for \(m=k\log k\) and \(\alpha = 1-1/\varOmega (\log k)\), a rate \(1-1/\varOmega (\log k)\) MD-NMC with access rate \(1-1/\varOmega (\log k)\) and codewords of length \(|s|(1+1/O(\log k))+O(k^2 \log ^2 k)\), assuming one-way functions. In addition, instantiating Construction 5.1 with 2.12, Theorem 5.3 yields, for \(m=|s|\log |s|\) and \(\alpha = 1-1/O(\log (|s|))\), an unconditionally secure, rate \(1/O(|s|\log ^2(|s|))\) MD-NMC in the standard model, with access rate \(\varOmega (1-1/\log (|s|))\) and codewords of length \(O(|s|^2 \log ^2 |s|)\).