1 Introduction

Security Under Selective Opening Attacks. Consider a scenario where many parties \(1,\ldots ,n\) send messages to one common receiver. To transmit a message \(\mathbf {m}{}_i\), party i samples fresh randomness \(\mathbf {r}{}_i\) and sends the ciphertext \(\mathbf {c}{}_i={\mathsf {Enc}}_ pk (\mathbf {m}{}_i;\mathbf {r}{}_i)\) to the receiver. Consider an adversary \(\mathcal {A}\) that does not only eavesdrop on the sent ciphertexts \((\mathbf {c}{}_1,\ldots ,\mathbf {c}{}_n)\), but corrupts a set \(\mathcal {I}\subseteq [n]\) of the senders’ systems, thus learning, for each \(i\in \mathcal {I}\), the encrypted message \(\mathbf {m}{}_i\) and the randomness \(\mathbf {r}{}_i\) used to encrypt it. The natural question to ask is whether the messages of the uncorrupted parties remain confidential. Such attacks are referred to as selective opening (\({\mathsf {SO}}\)) attacks (under sender corruption).

Selective opening attacks naturally occur in multi-party computation, where we assume secure channels between parties. Since a party might become corrupted, we need the encryption on these channels to be selective opening secure. In practice, the same argument applies to a server that establishes secure connections which should remain secure even if some users are corrupted.

Difficulty of Proving Security Under Selective Opening Attacks. The widely accepted standard notion for public-key encryption schemes is indistinguishability under chosen-plaintext attacks (\({\mathsf {IND\text {-}CPA}}\) security). At first sight one might expect a straightforward hybrid argument to show that \({\mathsf {IND\text {-}CPA}}\) security already implies security against selective opening attacks, since every party samples fresh randomness independently. However, so far nobody has been able to make such a hybrid argument work in general. Notice that revealing the randomness \(\mathbf {r}{}_i\) allows a selective opening adversary to verify that a corrupted ciphertext \(\mathbf {c}{}_i\) is an encryption of \(\mathbf {m}{}_i\). The adversary’s ability to corrupt parties introduces a difficulty in proving that standard (\({\mathsf {IND\text {-}CPA}}\)) security already implies selective opening security: it seems that the reduction has to know (i.e., guess) the complete set \(\mathcal {I}\) of corruptions \(\mathcal {A}\) is going to make, in order to simulate \(\mathcal {A}\)’s security game, before \(\mathcal {A}\) actually announces the senders it wishes to corrupt. Since \(\mathcal {I}\) might be any subset of \(\{1,\ldots ,n\}\), a direct approach would lead to an exponential loss in the reduction. A main technical obstacle is that the encrypted messages may depend on each other. If, for example, they are encrypted and sent sequentially, message \(\mathbf {m}{}_i\) may depend on \(\mathbf {m}{}_{i-1}\) and all previous messages. Thus, corrupting some parties might already leak some information on messages sent by parties that have not been corrupted.

To date, the only result in the standard model, given in [3, 8], shows that \({\mathsf {IND\text {-}CPA}}\) implies selective opening security for the special case of a product distribution, i.e., when all messages \(\mathbf {m}{}_{1}, \ldots , \mathbf {m}{}_{n}\) are sampled independently of each other. Intuitively, this holds since opening some ciphertexts cannot reveal information on related messages if there are no related messages at all, and the hybrid argument one might expect goes through. This leaves the following open question:

Does standard security imply selective opening security for any non-trivial message distribution?

1.1 Our Contributions

We present the first non-trivial positive results in the standard model, namely we show that \({\mathsf {IND\text {-}CPA}}\) security implies \({\mathsf {IND\text {-}SO\text {-}CPA}}\) security for a class of message distributions with few dependencies. Here \({\mathsf {IND\text {-}SO\text {-}CPA}}\) security refers to the indistinguishability-based definition of selective opening security sometimes referred to as weak \({\mathsf {IND\text {-}SO\text {-}CPA}}\) security [4].

\({\mathsf {IND\text {-}SO\text {-}CPA}}\) security requires that a passive adversary that obtains a vector of ciphertexts \((\mathbf {c}_1,\ldots ,\mathbf {c}_n)\) and has access to a ciphertext opening oracle, which reveals the underlying message \(\mathbf {m}_i\) of some ciphertext \(\mathbf {c}_i\) together with the randomness used to encrypt \(\mathbf {m}_i\), cannot distinguish the originally encrypted messages from freshly resampled messages that are as likely as the original messages given the messages of opened ciphertexts.

We consider graph-induced distributions where dependencies among messages correspond to edges in a graph and show that \({\mathsf {IND\text {-}CPA}}\) implies \({\mathsf {IND\text {-}SO\text {-}CPA}}\) security for all graph-induced distributions that satisfy a certain low connectivity property.

In particular, our result holds for the class of Markov distributions, i.e. distributions on message vectors \((\mathbf {m}{}_1,\ldots ,\mathbf {m}{}_n)\) where all information relevant for the distribution of \(\mathbf {m}{}_i\) is present in \(\mathbf {m}{}_{i-1}\). We prove that any \({\mathsf {IND\text {-}CPA}}\) secure public-key encryption scheme is \({\mathsf {IND\text {-}SO\text {-}CPA}}\) secure if the messages are sampled from a Markov distribution. Our results cover for instance distributions where message \(\mathbf {m}{}_i\) contains all previous messages (e.g. email conversations) or distributions where messages are increasing, i.e., \(\mathbf{m}_1 \! \le \mathbf{m}_2 \le \! \ldots \! \le \mathbf{m}_n\).

Note that a positive result on “weak” \({\mathsf {IND\text {-}SO\text {-}CPA}}\) security for all \({\mathsf {IND\text {-}CPA}}\)-secure encryption schemes for certain distributions is the best we can hope for due to the negative result of Bellare et al. [1] ruling out such an implication for \({\mathsf {SIM\text {-}SO\text {-}CPA}}\) security.

Details. Think of a vector of n messages sampled from some distribution \(\mathfrak {D}\) as a graph G on n vertices \(\{1,\ldots ,n\}\) where we have an edge from message \(\mathbf {m}_i\) to message \(\mathbf {m}_j\) if the distribution of \(\mathbf {m}_j\) depends on \(\mathbf {m}_i\). Further, fix any subset \(\mathcal {I}\subseteq \{1,\ldots ,n\} \) of opening queries made by some adversary. The main observation is that after removing \(\mathcal {I}\) and all incident edges, G decomposes into connected components \(C_1,\ldots ,C_{n'}\) that can be resampled independently, since the distribution of the messages on \(C_k\) solely depends on the messages in the neighborhood of \(C_k\) and on \(\mathfrak {D}\).
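To make the decomposition concrete, the following Python sketch (our own illustration, not from the paper; it assumes 0-based vertex labels and an explicit edge list, and treats edges as undirected, matching that connectivity is defined via \(G^{\leftrightarrow }\)) computes the connected components of \(G_{\overline{\mathcal {I}}}\) by breadth-first search.

```python
from collections import deque

def components_after_opening(n, edges, opened):
    """Connected components of G restricted to the unopened vertices.

    n      -- number of vertices, labelled 0..n-1 (the paper uses 1..n)
    edges  -- iterable of pairs (u, v); direction is ignored here
    opened -- the set of opened/corrupted indices (the set I)
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(range(n)) - set(opened)
    components = []
    while remaining:
        start = remaining.pop()
        comp, queue = {start}, deque([start])
        while queue:  # BFS inside the induced subgraph on unopened vertices
            for w in adj[queue.popleft()]:
                if w in remaining:
                    remaining.remove(w)
                    comp.add(w)
                    queue.append(w)
        components.append(comp)
    return components

# a chain 0-1-2-3-4 with vertex 2 opened splits into {0,1} and {3,4}
chain = [(i, i + 1) for i in range(4)]
assert sorted(map(sorted, components_after_opening(5, chain, {2}))) == [[0, 1], [3, 4]]
```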

To argue that there is no efficient adversary \({\mathcal {A}_{\mathsf {SO}}}\) that distinguishes sampled and resampled messages in the selective opening experiment, we proceed in a sequence of hybrid games, starting in a game where, after receiving encryptions of sampled messages and replies to opening queries, \({\mathcal {A}_{\mathsf {SO}}}\) obtains the sampled messages. In each hybrid step we use \({\mathsf {IND\text {-}CPA}}\) security to replace the sampled messages on a connected component \(C_k\) with resampled messages without \({\mathcal {A}_{\mathsf {SO}}}\) noticing. To this end, the reduction from \({\mathsf {IND\text {-}CPA}}\) to the indistinguishability of two consecutive hybrids has to identify \(C_k\), to embed its own challenge, before \({\mathcal {A}_{\mathsf {SO}}}\) makes any opening query.

We consider two approaches for guessing \(C_k\). The first considers graphs that have only polynomially many connected subgraphs; here the reduction can guess \(C_k\) right away. The second considers graphs for which every connected subgraph has a neighborhood of constant size; this allows the reduction to guess \(C_k\) by guessing its neighborhood. We show that the first approach ensures a reduction with polynomial loss for a strictly larger class of graphs than the second one.

Additionally, when the distribution is induced by an acyclic graph, we give a more sophisticated hybrid argument for the second approach, where in each hybrid transition only a single sampled message is replaced by a resampled message, allowing for a tighter reduction. Due to the definition of the hybrids, it will suffice to guess on fewer vertices of \(C_k\)’s neighborhood.

1.2 Previous Work

There are three definitions of \({\mathsf {SO}}\)-secure encryption that are not polynomially equivalent [4]. Since the messages in the \({\mathsf {IND\text {-}SO}}\) experiment have to be resampled conditioned on the opened messages, there are two notions based on indistinguishability. Weak \({\mathsf {IND\text {-}SO}}\) restricts to distributions that support efficient conditional resampling; Bellare et al. [2] gave such an indistinguishability-based notion for passive adversaries, usually referred to as \({\mathsf {IND\text {-}SO\text {-}CPA}}\). Full \({\mathsf {IND\text {-}SO}}\) allows for arbitrary distributions on the messages and is due to Böhl et al. [4], who adapted a notion for commitment schemes from [2] to encryption.

\({\mathsf {SIM\text {-}SO}}\) captures semantic security and demands that everything an adversary can output can be computed by a simulator that only sees the messages of corrupted parties, whereas it does not see the public key, any ciphertext or any randomness. The notion dates back to Dwork et al. [8], who studied the selective decommitment problem, and does not suffer from a distribution restriction like weak \({\mathsf {IND\text {-}SO}}\), since it does not involve resampling.

The first \({\mathsf {IND\text {-}SO\text {-}CPA}}\)-secure encryption scheme in the standard model was given in [2] based on lossy encryption. Selective opening secure encryption can be constructed from deniable encryption [6] as well as non-committing encryption [7]. Bellare et al. [1, 3] separated \({\mathsf {SIM\text {-}SO\text {-}CPA}}\) from \({\mathsf {IND\text {-}CPA}}\) security and showed that \({\mathsf {IND\text {-}CPA}}\) security implies weak \({\mathsf {IND\text {-}SO\text {-}CPA}}\) security if the messages are (basically) sampled independently. The same result was already established for commitment schemes in [8].

To date, this is the only positive result that shows that \({\mathsf {IND\text {-}CPA}}\) implies weak \({\mathsf {IND\text {-}SO\text {-}CPA}}\) in the standard model. Full \({\mathsf {IND\text {-}SO\text {-}CPA}}\) and \({\mathsf {SIM\text {-}SO\text {-}CPA}}\) security were separated in [4]; neither of them implies the other. Hofheinz et al. [10] proved that \({\mathsf {IND\text {-}CPA}}\) implies weak \({\mathsf {IND\text {-}SO\text {-}CPA}}\) in the generic group model for a certain class of encryption schemes and separated \({\mathsf {IND\text {-}CCA}}\) from weak \({\mathsf {IND\text {-}SO\text {-}CCA}}\) security.

Recently, Hofheinz et al. [9] constructed the first (even \({\mathsf {IND\text {-}CCA}}\)-secure) \({\mathsf {PKE}}\) that is not weakly \({\mathsf {IND\text {-}SO\text {-}CPA}}\) secure. Their result relies on the existence of public-coin differing-inputs obfuscation and certain correlation-intractable hash functions. Their scheme employs “secret-sharing message distributions” whose messages are evaluations of some polynomial. It is easily seen that such distributions have too many dependencies to be covered by our positive result. A gap remains between their result and ours, that is, there are distributions for which it is still open whether \({\mathsf {IND\text {-}CPA}}\) implies \({\mathsf {IND\text {-}SO\text {-}CPA}}\).

2 Preliminaries

We denote by \(\lambda \) the security parameter. A function f is polynomial in n, \(f(n)={\mathsf {poly}}(n)\), if \(f(n)=\mathcal {O}(n^c)\) for some \(c>0\). Let \(0<n:=n(\lambda )={\mathsf {poly}}(\lambda )\). A function f(n) is negligible in n, \(f(n)={\mathsf {negl}}(n)\), if \(f(n)=\mathcal {O}(n^{-c})\) for all \(c>0\). Any algorithm receives the unary representation \(1^\lambda \) of the security parameter as first input. We say that an algorithm is a \({\mathsf {PPT}}\) algorithm if it runs in probabilistic polynomial time (in \(\lambda \)). For a finite set \(\mathcal S\) we denote the sampling of a uniform random element a by \(a{\,{\leftarrow {{\scriptscriptstyle \$}}}\,}\mathcal S\), and the sampling according to some distribution \(\mathfrak {D}\) by \(a\leftarrow \mathfrak {D}\). For \(a,b\in {{\mathbb N}}\), \(a\le b\), let \([a,b]:=\{a,a+1,\ldots ,b\}\) and \([a]:=[1,a]\). For \(a<b\) let \([b,a]:=\emptyset \). For \(\mathcal {I}\subseteq [n]\) let \(\overline{\mathcal {I}}:=[n]\setminus \mathcal {I}\). We use boldface letters to denote vectors, which are of length n if not indicated otherwise. For a vector \(\mathbf {m}\) and \({i\in [n]}\) let \(\mathbf {m}{}_i\) denote the i-th entry of \(\mathbf {m}\) and \(|\mathbf {m}|\) the number of entries in \(\mathbf {m}\). For a set \(\mathcal {I}=\{i_1,\ldots ,i_{|\mathcal {I}|}\}\), \(i_1<\ldots <i_{|\mathcal {I}|}\) let \(\mathbf {m}{}_\mathcal {I}\) denote the projection of \(\mathbf {m}\) to its \(\mathcal {I}\)-entries: \(\mathbf {m}{}_\mathcal {I}:=(\mathbf {m}{}_{i_1},\ldots ,\mathbf {m}{}_{i_{|\mathcal {I}|}})\). For an event \({\texttt {E}}\) let \(\overline{{\texttt {E}}}\) denote the complementary event.

2.1 Games

A game \({\mathsf {G}}\) is a collection of procedures or oracles \(\{\textsc {Initialize},\textsc {P}_1,\textsc {P}_2,\ldots ,\textsc {P}_t,\textsc {Finalize}\}\) for \(t\ge 0\). Procedures \(\textsc {P}_1\) to \(\textsc {P}_t\) and \(\textsc {Finalize}\) might require some input parameters. We implicitly assume that boolean flags are initialized to false, numerical types are initialized to 0, sets are initialized to \(\emptyset \), while strings are initialized to the empty string \(\epsilon \). An adversary \(\mathcal {A}\) is run in game \({\mathsf {G}}\) if \(\mathcal {A}\) calls \(\textsc {Initialize}\). During the game \(\mathcal {A}\) may run some procedure \(\textsc {P}_i\) as often as allowed by the game.

For each game in this paper, the “\(\textsc {Open}\)” procedure may be called an arbitrary number of times, while every other procedure is called once during the execution.

The interface of the game is provided by the challenger. If \(\mathcal {A}\) calls \(\textsc {P}\), the output of \(\textsc {P}\) is returned to \(\mathcal {A}\), except for the \(\textsc {Finalize}\) procedure. On \(\mathcal {A}\)’s call of \(\textsc {Finalize}\) the game ends and outputs whatever \(\textsc {Finalize}\) returns. Let \({\mathsf {G}}^\mathcal {A}\Rightarrow {\mathsf {out}}\) denote the event that \({\mathsf {G}}\) runs \(\mathcal {A}\) and outputs \({\mathsf {out}}\). The advantage \(\mathbf {Adv}({\mathsf {G}}^\mathcal {A},{\mathsf {H}}^\mathcal {A})\) of \(\mathcal {A}\) in distinguishing games \({\mathsf {G}}\) and \({\mathsf {H}}\) is defined as \( \left| \Pr [{\mathsf {G}}^\mathcal {A}\Rightarrow 1]-\Pr [{\mathsf {H}}^\mathcal {A}\Rightarrow 1]\right| \). We let \({\texttt {Bad}}\) denote the event that a boolean flag \({\texttt {Bad}}\) was set to true during the execution of some game.

2.2 Public-Key Encryption Schemes

A public-key encryption scheme consists of three \({\mathsf {PPT}}\) algorithms. \({\mathsf {Gen}}\) generates a key pair \((pk,sk)\leftarrow {\mathsf {Gen}}(1^\lambda )\) on input \(1^\lambda \). The public key \( pk \) implicitly contains \(1^\lambda \) and defines three finite sets: the message space \(\mathcal {M}\), the randomness space \(\mathcal {R}\), and the ciphertext space \(\mathcal {C}\). Given pk, a message \(m \in \mathcal {M}\) and randomness \(r\in \mathcal {R}\), \({\mathsf {Enc}}\) outputs an encryption \(c= {\mathsf {Enc}}_ pk (m;r) \in \mathcal {C}\) of m under pk. The decryption algorithm \({\mathsf {Dec}}\) takes a secret key \( sk \) and a ciphertext \(c\in \mathcal {C}\) as input and outputs a message \(m = {\mathsf {Dec}}_ sk (c) \in \mathcal {M}\), or a special symbol \(\perp \,\not \in \mathcal {M}\) indicating that c is not a valid ciphertext. In the following we let \({\mathsf {PKE}}=({\mathsf {Gen}},{\mathsf {Enc}},{\mathsf {Dec}})\) denote a public-key encryption scheme.

We require \({\mathsf {PKE}}\) to be correct: for all security parameters \(\lambda \), for all \(( pk , sk )\leftarrow {\mathsf {Gen}}(1^\lambda )\), and for all \(m\in \mathcal {M}\) we have \(\Pr [{\mathsf {Dec}}_ sk ({\mathsf {Enc}}_ pk (m;r))=m]=1\) where the probability is taken over the choice of r. We apply \({\mathsf {Enc}}\) and \({\mathsf {Dec}}\) to message vectors \(\mathbf {m}=(\mathbf {m}{}_1,\ldots , \mathbf {m}{}_n)\) and randomness \(\mathbf {r}=( \mathbf {r}{}_1,\ldots , \mathbf {r}{}_n)\) as \({\mathsf {Enc}}(\mathbf {m};\mathbf {r}):=({\mathsf {Enc}}(\mathbf {m}{}_1; \mathbf {r}{}_1),\ldots ,{\mathsf {Enc}}( \mathbf {m}{}_n;\mathbf {r}{}_n))\).

2.3 IND-CPA and Mult-IND-CPA Security

We recall the standard notion of \({\mathsf {IND\text {-}CPA}}\) security and give a definition of indistinguishability of ciphertext vectors under chosen-plaintext attacks that will allow for cleaner proofs of our results.

Definition 1

( \({\mathsf {mult\text {-}IND\text {-}CPA}}\) security). For \({\mathsf {PKE}}\), an adversary \({\mathcal {B}_{\mathsf {mult}}}\), \(s\in {{\mathbb N}}\) and a bit b we consider game \({{\mathsf {mult\text {-}IND\text {-}CPA}}}_{{\mathsf {PKE}},b}^{\mathcal {B}_{\mathsf {mult}}}\) as given in Fig. 1. \({\mathcal {B}_{\mathsf {mult}}}\) may only submit message vectors \(\mathbf {m}^0{}{}\), \( \mathbf {m}^1{}{}\in \mathcal {M}^s\). To \({\mathsf {PKE}}\), \({\mathcal {B}_{\mathsf {mult}}}\) and \(\lambda \) we associate the following advantage function

$$ \mathbf {Adv}_{\mathsf {PKE}}^{{{\mathsf {mult\text {-}IND\text {-}CPA}}}}({\mathcal {B}_{\mathsf {mult}}},\lambda ):= \mathbf {Adv}\big ({\mathsf {mult\text {-}IND\text {-}CPA}}^{\mathcal {B}_{\mathsf {mult}}}_{{\mathsf {PKE}},0},{\mathsf {mult\text {-}IND\text {-}CPA}}^{\mathcal {B}_{\mathsf {mult}}}_{{\mathsf {PKE}},1}\big ). $$

\({\mathsf {PKE}}\) is \({{\mathsf {mult\text {-}IND\text {-}CPA}}}\) secure if \(\mathbf {Adv}_{\mathsf {PKE}}^{{{\mathsf {mult\text {-}IND\text {-}CPA}}}}({\mathcal {B}_{\mathsf {mult}}},\lambda )\) is negligible for all \({\mathsf {PPT}}\) adversaries \({\mathcal {B}_{\mathsf {mult}}}\).

Fig. 1. Game \({{\mathsf {mult\text {-}IND\text {-}CPA}}}_{{\mathsf {PKE}},b}\); \({\mathcal {B}_{\mathsf {mult}}}\) must submit \(\mathbf {m}^0,\mathbf {m}^1\in \mathcal {M}^s\)

For an adversary \({\mathcal {B}_{\mathsf {CPA}}}\), we obtain the definition of \({\mathsf {IND\text {-}CPA}}\) security by letting \(s:=1\) and write \(\mathbf {Adv}_{\mathsf {PKE}}^{{\mathsf {IND\text {-}CPA}}}({\mathcal {B}_{\mathsf {CPA}}},\lambda )\) instead of \(\mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {mult\text {-}IND\text {-}CPA}}({\mathcal {B}_{\mathsf {CPA}}},\lambda )\). A standard hybrid argument proves the following lemma.

Lemma 2

For any adversary \({\mathcal {B}_{\mathsf {mult}}}\) sending message vectors from \(\mathcal {M}^s\) to the \({\mathsf {mult\text {-}IND\text {-}CPA}}\) game there exists an \({\mathsf {IND\text {-}CPA}}\) adversary \({\mathcal {B}_{\mathsf {CPA}}}\) with roughly the same running time as \({\mathcal {B}_{\mathsf {mult}}}\) such that

$$ \mathbf {Adv}_{\mathsf {PKE}}^{{\mathsf {mult\text {-}IND\text {-}CPA}}}({\mathcal {B}_{\mathsf {mult}}},\lambda )\le s\cdot \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}CPA}}({\mathcal {B}_{\mathsf {CPA}}},\lambda ). $$
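For completeness, a sketch of that argument (our own write-up): define hybrids \({\mathsf {G}}_0,\ldots ,{\mathsf {G}}_s\), where in \({\mathsf {G}}_i\) the first i positions of the challenge vector encrypt \(\mathbf {m}^1\) and the remaining \(s-i\) positions encrypt \(\mathbf {m}^0\). Then \({\mathsf {G}}_0\) and \({\mathsf {G}}_s\) are the games \({\mathsf {mult\text {-}IND\text {-}CPA}}_{{\mathsf {PKE}},0}\) and \({\mathsf {mult\text {-}IND\text {-}CPA}}_{{\mathsf {PKE}},1}\), and by the triangle inequality

$$ \mathbf {Adv}_{\mathsf {PKE}}^{{\mathsf {mult\text {-}IND\text {-}CPA}}}({\mathcal {B}_{\mathsf {mult}}},\lambda ) \le \sum _{i=1}^{s}\left| \Pr [{\mathsf {G}}_{i-1}^{{\mathcal {B}_{\mathsf {mult}}}}\Rightarrow 1]-\Pr [{\mathsf {G}}_{i}^{{\mathcal {B}_{\mathsf {mult}}}}\Rightarrow 1]\right| \le s\cdot \mathbf {Adv}_{\mathsf {PKE}}^{{\mathsf {IND\text {-}CPA}}}({\mathcal {B}_{\mathsf {CPA}}},\lambda ), $$

since any two consecutive hybrids differ in a single ciphertext, into which \({\mathcal {B}_{\mathsf {CPA}}}\) embeds its own \({\mathsf {IND\text {-}CPA}}\) challenge while encrypting all other positions itself.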

2.4 IND-SO-CPA Security

In this section we recall an indistinguishability-based definition for selective opening security under chosen-plaintext attacks and discuss the existing notions of \({\mathsf {SO}}\) security.

Definition 3

(Efficiently resamplable distribution). Let \(\mathcal {M}\) be a finite set. A family of distributions \({\{\mathfrak {D}_\lambda \}}_{\lambda \in {{\mathbb N}}}\) over \(\mathcal {M}^n=\mathcal {M}^{n(\lambda )}\) is efficiently resamplable if the following properties hold for every \(\lambda \in {{\mathbb N}}\):

  • Length consistency. For every \({i\in [n]}\) : \(\Pr _{ \mathbf {m}^1{}{}, \mathbf {m}^2{}{}\leftarrow \mathfrak {D}_\lambda }\big [|\mathbf {m}^{1}_{i}|=|\mathbf {m}^{2}_{i}|\big ]=1.\)

  • Resamplability. There exists a \({\mathsf {PPT}}\) resampling algorithm \({\mathsf {Resamp}}_{\mathfrak {D}_\lambda }(\cdot ,\cdot )\) that runs on \((\mathbf {m},\mathcal {I})\) for \(\mathbf {m}\in \mathcal {M}^n\), \(\mathcal {I}\subseteq [n]\) and outputs a \(\mathfrak {D}_\lambda \)-distributed vector \(\mathbf {m}'\in \mathcal {M}^n\) conditioned on \({\mathbf {m}'}{}_\mathcal {I}=\mathbf {m}{}_\mathcal {I}\).

A class of families of distributions \(\mathcal {D}\) is efficiently resamplable if every family \({\{\mathfrak {D}_\lambda \}}_{\lambda \in {{\mathbb N}}}\in \mathcal {D}\) is efficiently resamplable.

Since the security parameter uniquely specifies an element of a family \(\mathfrak {D}_\lambda \) we write \(\mathfrak {D}\) instead of \(\mathfrak {D}_\lambda \) whenever the security parameter is already fixed.
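For intuition, a product distribution is trivially efficiently resamplable: conditioning on \(\mathbf {m}{}_\mathcal {I}\) just fixes those entries, and every other coordinate is drawn afresh from its marginal. A toy Python sketch (our own example; 0-based indices, uniform bytes as an assumed marginal):

```python
import random

def resamp_product(m, opened, sample_one, n):
    """Resamp for a product distribution (cf. Definition 3): keep the
    entries at the opened positions and redraw all others independently."""
    return [m[i] if i in opened else sample_one(i) for i in range(n)]

sample_one = lambda i: random.randrange(256)      # toy marginal: uniform bytes
m = [sample_one(i) for i in range(5)]
m_prime = resamp_product(m, {1, 3}, sample_one, 5)
assert all(m_prime[i] == m[i] for i in (1, 3))    # opened entries stay fixed
```

For distributions with dependencies, \({\mathsf {Resamp}}\) must sample from the true conditional distribution, which is exactly what the “weak” notion requires to be efficient.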

Definition 4

For \({\mathsf {PKE}}\), a bit b, an adversary \({\mathcal {A}_{\mathsf {SO}}}\) and a class of families of distributions \(\mathcal {D}\) over \(\mathcal {M}^n\) we consider game \({\mathsf {IND\text {-}SO\text {-}CPA}}^{\mathcal {A}_{\mathsf {SO}}}_{{\mathsf {PKE}},b}\) in Fig. 2. Run in the game, \({\mathcal {A}_{\mathsf {SO}}}\) calls \(\textsc {Enc}\) once right after \(\textsc {Initialize}\) and has to submit \(\mathfrak {D}\in \mathcal {D}\) along with a \({\mathsf {PPT}}\) resampling algorithm \({\mathsf {Resamp}}_{\mathfrak {D}}\). \({\mathcal {A}_{\mathsf {SO}}}\) may call \(\textsc {Open}\) multiple times and invokes \(\textsc {Challenge}\) once after its last \(\textsc {Open}\) query before calling \(\textsc {Finalize}\). We define the advantage of \({\mathcal {A}_{\mathsf {SO}}}\) run in game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},b}\) as

$$ \mathbf {Adv}^{\mathsf {IND\text {-}SO\text {-}CPA}}_{\mathsf {PKE}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda ):= \mathbf {Adv}\big ({\mathsf {IND\text {-}SO\text {-}CPA}}^{\mathcal {A}_{\mathsf {SO}}}_{{\mathsf {PKE}},0},{\mathsf {IND\text {-}SO\text {-}CPA}}^{\mathcal {A}_{\mathsf {SO}}}_{{\mathsf {PKE}},1}\big ). $$

\({\mathsf {PKE}}\) is \({\mathsf {IND\text {-}SO\text {-}CPA}}\) secure w.r.t. \(\mathcal {D}\) if \(\mathbf {Adv}^{\mathsf {IND\text {-}SO\text {-}CPA}}_{\mathsf {PKE}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda )\) is negligible for all \({\mathsf {PPT}}\) \({\mathcal {A}_{\mathsf {SO}}}\).

Fig. 2. Game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},b}\)

Notions of Selective Opening Security. Definition 4 is in the spirit of [2], but we allow for adaptive corruptions and let the adversary choose the distribution, as done by Böhl et al. [4]. The latter renamed \({\mathsf {IND\text {-}SO\text {-}CPA}}\) to weak \({\mathsf {IND\text {-}SO\text {-}CPA}}\) and introduced a strictly stronger notion, called full \({\mathsf {IND\text {-}SO\text {-}CPA}}\), where \({\mathcal {A}_{\mathsf {SO}}}\) may submit any distribution (even one that is not efficiently resamplable) and need not provide a resampling algorithm. We consider the name weak \({\mathsf {IND\text {-}SO\text {-}CPA}}\) unfortunate and simply refer to the security notion in Definition 4 as \({\mathsf {IND\text {-}SO\text {-}CPA}}\) security.

3 Selective Opening for Graph-Induced Distributions

This section considers graph-induced distributions and identifies connectivity properties so that \({\mathsf {IND\text {-}CPA}}\) entails \({\mathsf {IND\text {-}SO\text {-}CPA}}\) security. We introduce some notation in Sect. 3.1. Sections 3.2 and 3.3 discuss a hybrid argument that considers the connected components of \(G_{\overline{\mathcal {I}}}\), switching one of them from sampled to resampled in each transition. Section 3.4 discusses a different hybrid argument that will allow for tighter proofs if the distribution-inducing graph is acyclic.

3.1 Graphs

A directed graph G consists of a set of vertices V, identified with [n] for \(n>0\), and a set of edges \(E\subseteq V^2\setminus \{(v,v):v\in V\}\), i.e., we do not allow loops. G is undirected if \((v_2,v_1)\in E\) for each \((v_1,v_2) \in E\). For \(V'\subseteq V\) let \(G_{V'}:=(V',E')\) denote the induced subgraph of G, where \(E':=E\cap ({V'})^2\). For \(G=(V,E)\) we obtain its undirected version \(G^{\leftrightarrow }=(V,E^{\leftrightarrow })\), where \(E^{\leftrightarrow }\supseteq E\) is obtained by adding the minimum number of edges to E so that the graph becomes undirected. For \(V'\subseteq V\) let \( N(V'):=\left\{ v\in V\setminus V':\exists v'\in V'\ s.t.\ (v,v')\in E^{\leftrightarrow }\right\} \) denote the (open) neighborhood of \(V'\) in G. For a vertex v, we denote by \(P(v)=\{j :(j,v)\in E\}\) the set of its parents.

A path from \(v_1\) to \(v_\ell \) in G is a list of at least two vertices \((v_1,\ldots ,v_\ell )\) where \(v_i\in V\) for \(i\in [\ell ]\) and \((v_i,v_{i+1})\in E\) for all \(i\in [\ell -1]\). If there is a path from u to v then u is a predecessor of v. Let \({\mathsf {pred}}(v)\) denote the set of all predecessors of v. A cycle is a path where \(v_\ell =v_1\). If G contains no cycles, it is acyclic. A directed acyclic graph is called a \({\mathsf {DAG}}\).

A non-empty subset \(V'\subseteq V\) is connected in G if for every pair of distinct vertices \(v_1,v_2\in V'\) there exists a path from \(v_1\) to \(v_2\) in \((G_{V'})^{\leftrightarrow }\). G is connected if V is connected in G, and disconnected otherwise. We assume G to be connected if not stated otherwise. A (set-)maximal connected set of vertices of G is called a connected component.

Notational Convention. We do not distinguish between the i-th message of an n-message vector and vertex i in a graph on n vertices.

We start by defining Markov distributions, i.e., distributions on vectors of random variables that reflect processes: variables with higher indices depend on those with lower indices. A distribution is Markov if it is memoryless in the sense that all information relevant for the distribution of a value \(\mathbf {M}_i\) is already present in \(\mathbf {M}_{i-1}\), even though the latter in turn depends on its own predecessor.

Definition 5

Let \({\{\mathfrak {D}_\lambda \}}_{\lambda \in {{\mathbb N}}}\) be a family of distributions over \(\mathcal {M}^n\). Let \(\mathbf {M}=(\mathbf {M}{}_1,\ldots , \mathbf {M}{}_n)\) denote a vector of \(\mathcal {M}\)-valued random variables. We say \({\{\mathfrak {D}_\lambda \}}_{\lambda \in {{\mathbb N}}}\) is Markov if the following holds for all \(\lambda \in {{\mathbb N}}\) and all \(\mathbf {m}\in \mathcal {M}^n\):

$$ \mathop {\Pr }\limits _{\mathbf {M}\leftarrow \mathfrak {D}_\lambda }\Big [\mathbf {M}{}_i=\mathbf {m}{}_i\,\Big \vert \bigwedge _{j=1}^{i-1} \mathbf {M}{}_j= \mathbf {m}{}_j\Big ]=\mathop {\Pr }\limits _{\mathbf {M}\leftarrow \mathfrak {D}_\lambda }\Big [\mathbf {M}{}_i=\mathbf {m}{}_i\,\Big \vert \, \mathbf {M}{}_{i-1}= \mathbf {m}{}_{i-1}\Big ]. $$
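Equivalently, by the chain rule, a Markov distribution factors as

$$ \mathop {\Pr }\limits _{\mathbf {M}\leftarrow \mathfrak {D}_\lambda }\big [\mathbf {M}=\mathbf {m}\big ] = \mathop {\Pr }\limits _{\mathbf {M}\leftarrow \mathfrak {D}_\lambda }\big [\mathbf {M}{}_1=\mathbf {m}{}_1\big ]\cdot \prod _{i=2}^{n}\mathop {\Pr }\limits _{\mathbf {M}\leftarrow \mathfrak {D}_\lambda }\big [\mathbf {M}{}_i=\mathbf {m}{}_i\,\big \vert \, \mathbf {M}{}_{i-1}=\mathbf {m}{}_{i-1}\big ], $$

so each message depends on its immediate predecessor only.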

A Markov distribution can be seen as “induced” by a chain graph \(\mathbf {M}_1\rightarrow \mathbf {M}_2\rightarrow \ldots \rightarrow \mathbf {M}_n\), where edges represent dependencies. We will now generalize this to arbitrary graphs and still require (a generalization of) “memorylessness”. We say that a graph G induces a distribution \(\mathfrak {D}\) if, whenever the distribution of \(\mathbf {M}_j\) depends on \(\mathbf {M}_i\), there is a path from i to j in G. As for Markov distributions, we require that the distribution of a message only depends on its parents; in particular, for all \(\lambda \in {{\mathbb N}}\), all \(j\in [n]\) and \(\mathbf {M}=(\mathbf {M}_1,\ldots ,\mathbf {M}_n)\leftarrow \mathfrak {D}_\lambda \), the distribution of \(\mathbf {M}_j\) only depends on its parents in \(G_\lambda \), i.e., the set P(j), rather than all of its predecessors \({\mathsf {pred}}(j)\).

Definition 6

(Graph-induced distribution). Let \({\{\mathfrak {D}_\lambda \}}_{\lambda \in {{\mathbb N}}}\) be a family of distributions over \(\mathcal {M}^n\) and let \({\{G_\lambda \}}_{\lambda \in {{\mathbb N}}}\) be a family of graphs on n vertices. We say that \({\{\mathfrak {D}_\lambda \}}_{\lambda \in {{\mathbb N}}}\) is \({\{G_\lambda \}}_{\lambda \in {{\mathbb N}}}\) -induced if the following holds for all \(\lambda \in {{\mathbb N}}\):

  • For all \(i\ne j\in [n]\) if for \(\mathfrak {D}_\lambda \) the distribution of \(\mathbf {M}_j\) depends on \(\mathbf {M}_i\) then there is a path from i to j in \(G_\lambda \).

  • For all \(j\in [n]\) and all \(\mathbf {m}\in \mathcal {M}^n\) we have

    $$ \mathop {\Pr }\limits _{\mathbf {M}\leftarrow \mathfrak {D}_\lambda }\!\Big [ \mathbf {M}_j=\mathbf {m}_j\,\Big \vert \bigwedge _{i\in {\mathsf {pred}}(j)}\!\!\mathbf {M}_i=\mathbf {m}_i\Big ] = \mathop {\Pr }\limits _{\mathbf {M}\leftarrow \mathfrak {D}_\lambda }\!\Big [ \mathbf {M}_j=\mathbf {m}_j\,\Big \vert \bigwedge _{i\in P(j)}\!\!\mathbf {M}_i=\mathbf {m}_i\Big ]. $$

We demand that for any \(\lambda \in {{\mathbb N}}\) one can efficiently reconstruct \(G_\lambda \) from \(\mathfrak {D}_\lambda \).

As with a family of distributions, we drop the security parameter and say that \(\mathfrak {D}\) is G-induced whenever \(\lambda \) is already fixed. Note that G may contain cycles and may be undirected. Further note that Markov distributions can be seen as graph-induced distributions where the graph \(G=(V,E)\) is a chain on n vertices, that is, \(V=[n]\) and \(E=\{(i-1,i): i\in [2,n]\}\).

Although our proof ideas can be applied to disconnected graphs directly, Sects. 3.2, 3.3, and 3.4 consider connected graphs for simplicity. A hybrid argument over the connected components of a graph as given in Sect. 3.5 extends all our results to disconnected graphs.

3.2 A Bound Using Connected Subgraphs

Definition 7

(Number of connected subgraphs). Let \(G=(V,E)\). We define the number of connected subgraphs of G:

$$ S(G):=\left| \{V'\subseteq V :V' \text { connected}\}\right| . $$

For example, for a chain graph on n vertices we have \(S(G)=\frac{1}{2}\cdot n\cdot (n+1)\), and for the complete graph \(K_n\) on n vertices we have \(S(K_n)=2^n-1\).
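Both values are easy to sanity-check by brute force. The following Python sketch (our own illustration; exponential in n and 0-based labels, so only for small examples) enumerates all connected vertex subsets:

```python
from itertools import combinations

def S(n, edges):
    """Brute-force count of connected vertex subsets (Definition 7)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def connected(vs):
        vs = set(vs)
        seen, stack = {min(vs)}, [min(vs)]
        while stack:                       # DFS restricted to vs
            for w in adj[stack.pop()] & vs:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen == vs

    return sum(1 for k in range(1, n + 1)
               for vs in combinations(range(n), k) if connected(vs))

chain = [(i, i + 1) for i in range(4)]                   # chain on 5 vertices
assert S(5, chain) == 5 * 6 // 2                         # n(n+1)/2 = 15
complete = [(i, j) for i in range(5) for j in range(i)]  # K_5
assert S(5, complete) == 2**5 - 1                        # 31
```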

Theorem 8

Let \({\mathsf {PKE}}\) be \({\mathsf {IND\text {-}CPA}}\) secure. Then \({\mathsf {PKE}}\) is \({\mathsf {IND\text {-}SO\text {-}CPA}}\) secure w.r.t. the class of efficiently resamplable and G-induced distribution families over \(\mathcal {M}^n\) where \(S(G)={\mathsf {poly}}(n)\) and G is connected.

Precisely, for any adversary \({\mathcal {A}_{\mathsf {SO}}}\) run in game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{\mathsf {PKE}}\) there exists an \({\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}\) adversary \({\mathcal {B}_{\mathsf {CPA}}}\) with roughly the running time of \({\mathcal {A}_{\mathsf {SO}}}\) plus two executions of \({\mathsf {Resamp}}\) such that

$$ \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}SO\text {-}CPA}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda )\le \textstyle n \cdot (n-1) \cdot S(G_\lambda ) \cdot \mathbf {Adv}^{\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}({\mathcal {B}_{\mathsf {CPA}}},\lambda ). $$

Proof Idea. Recall game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},b}\) given in Fig. 2. During \(\textsc {Challenge}\) the game sends \(\mathbf {m}^b\), where \(\mathbf {m}^0_{{\overline{\mathcal {I}}}}\) consists of messages sampled at the beginning, while \(\mathbf {m}^1_{\overline{\mathcal {I}}}\) is resampled (conditioned on \(\mathbf {m}^1_\mathcal {I}=\mathbf {m}^0_\mathcal {I}\)). We will define hybrid games \({\mathsf {H}}_{0},{\mathsf {H}}_{1},\ldots ,{\mathsf {H}}_{n}\). For this, let \(\mathcal S\subseteq 2^{V}\) denote all the connected subgraphs of G. We have \(|\mathcal S|=S(G)\).

Note that \(G_{\overline{\mathcal {I}}}\) consists of connected components \(C_1,\ldots ,C_{n'}\in \mathcal S\) for some \(n'\le n-1\). (This upper bound is attained by the star graph when \(\mathcal {I}\) consists of the center vertex.) We assume those components to be ordered, e.g., by the smallest vertex contained in each.

Thus, if \(b=1\) in game \({\mathsf {IND\text {-}SO\text {-}CPA}}\) then the challenger can resample \(\mathbf {m}^1_{\overline{\mathcal {I}}}\) in \(n'\) batches \(\mathbf {m}^1_{C_1},\ldots ,\mathbf {m}^1_{C_{n'}}\) (as \({\overline{\mathcal {I}}}=\bigcup _{i=1}^{n'}C_i\)). Moreover, each batch \(\mathbf {m}^1_{C_i}\) can be resampled independently, i.e., as a function of \(\mathbf {m}^0_\mathcal {I}\) and \(\mathfrak {D}\), but not \(\mathbf {m}^1_{C_j}\), \(j\ne i\).

Fig. 3. \(\textsc {Challenge}\) procedure of hybrid game \({\mathsf {H}}_{k}\). \(C_i\) denotes the i-th connected component of \(G_{{\overline{\mathcal {I}}}}\). The challenge vector contains resampled messages in the first k batches \(C_1,\ldots ,C_k\) while the other messages remain sampled.

Proof

(Theorem 8 ). For \(k=0,\ldots ,n\) we define hybrid game \({\mathsf {H}}_{k}\) as a modified game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}}}\), in which the messages of the first k batches \(C_1,\ldots ,C_k\) are resampled during \(\textsc {Challenge}\) while the remaining batches stay sampled.

Every procedure except \(\textsc {Challenge}\) remains as in Definition 4, and \(\textsc {Challenge}\) is given in Fig. 3. Clearly, \({\mathsf {H}}_{0}\) is the (real) game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},0}\) and \({\mathsf {H}}_{n'}\) for some \(n'\le n-1\) is the (random) game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},1}\). Note that for \(k,j\in [n',n]\) hybrids \({\mathsf {H}}_{k}\) and \({\mathsf {H}}_{j}\) are identical. We have

$$ \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}SO\text {-}CPA}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda ) =\mathbf {Adv}\big ({\mathsf {H}}_{0}^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{n'}^{\mathcal {A}_{\mathsf {SO}}}\big ) \le \sum _{k=0}^{n'-1}\mathbf {Adv}\big ({\mathsf {H}}_{k} ^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\big ). $$

We now upper-bound the distance between two consecutive hybrids using the following lemma.

Lemma 9

For every adversary \({\mathcal {A}_{\mathsf {SO}}}\) that distinguishes hybrids \({\mathsf {H}}_{k}\) and \({\mathsf {H}}_{k+1}\) there exists a \({\mathsf {mult\text {-}IND\text {-}CPA}}\) adversary \({\mathcal {B}_{\mathsf {mult}}}\) with roughly the running time of \({\mathcal {A}_{\mathsf {SO}}}\) plus two executions of \({\mathsf {Resamp}}\) such that

$$\begin{aligned} \mathbf {Adv}\big ({\mathsf {H}}_{k} ^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\big )\le S(G)\cdot \mathbf {Adv}_{\mathsf {PKE}}^{{\mathsf {mult\text {-}IND\text {-}CPA}}}({\mathcal {B}_{\mathsf {mult}}},\lambda ). \end{aligned}$$
Fig. 4. \({\mathcal {A}_{\mathsf {SO}}}\)’s game interface as provided by \({\mathcal {B}_{\mathsf {mult}}}\) run in game \({\mathsf {mult\text {-}IND\text {-}CPA}}\). \({\mathcal {B}_{\mathsf {mult}}}\) interpolates between hybrids \({\mathsf {H}}_{k}\), \({\mathsf {H}}_{k+1}\) for \(k\in [0,n-1]\).

Proof

We construct adversary \({\mathcal {B}_{\mathsf {mult}}}\) as follows (cf. Fig. 4):

\({\mathcal {B}_{\mathsf {mult}}}\) forwards \( pk \) to \({\mathcal {A}_{\mathsf {SO}}}\) and, after receiving \((\mathfrak {D},{\mathsf {Resamp}}_\mathfrak {D})\), picks \(C_{k+1}^*{\,{\leftarrow {{\scriptscriptstyle \$}}}\,}\mathcal S\) uniformly at random, trying to guess \(C_{k+1}\). \({\mathcal {B}_{\mathsf {mult}}}\) samples \(\mathbf {m}^0\leftarrow \mathfrak {D}\) and resamples \(\mathbf {m}^1\), keeping the neighborhood of \(C_{k+1}^*\) fixed. It submits \((\mathbf {m}^0_{C_{k+1}^*},\mathbf {m}^1_{C_{k+1}^*})\) to its \({\mathsf {mult\text {-}IND\text {-}CPA}}\) challenger, obtains ciphertexts for the positions in \(C_{k+1}^*\), and picks fresh randomness to encrypt each message in \(\overline{C_{k+1}^*}\) itself. \({\mathcal {B}_{\mathsf {mult}}}\) sends \((\mathbf {c}_1,\ldots ,\mathbf {c}_n)\) to \({\mathcal {A}_{\mathsf {SO}}}\), thereby embedding its challenge at the positions \(C_{k+1}^*\), and answers opening queries honestly as long as they do not occur on \(C_{k+1}^*\). If \({\mathcal {A}_{\mathsf {SO}}}\) issues such a query, \({\mathcal {B}_{\mathsf {mult}}}\) cannot answer and sets \({\texttt {Bad}}:=true\), since it guessed \(C_{k+1}\) wrong. During \(\textsc {Challenge}\), \({\mathcal {B}_{\mathsf {mult}}}\) verifies that it guessed \(C_{k+1}\) correctly and sets \({\texttt {Bad}}:=true\) if not. \({\mathcal {B}_{\mathsf {mult}}}\) resamples messages \(\widetilde{\mathbf {m}}^1\), which are sent in the first k batches, while messages from \(\mathbf {m}^0\) are sent at every other position. Finally, \({\mathcal {B}_{\mathsf {mult}}}\) outputs \({\mathcal {A}_{\mathsf {SO}}}\)’s output.

In the following we use \(\mathbf {m}\equiv \mathbf {m}'\) if \(\mathbf {m}\) and \(\mathbf {m}'\), interpreted as random variables, are identically distributed where the probability is taken over all choices in the computation of \(\mathbf {m}\), \(\mathbf {m}'\), respectively.

Assume \({\mathcal {B}_{\mathsf {mult}}}\) guessed correctly, i.e., \(C_{k+1}^*=C_{k+1}\). Clearly, \({\mathcal {B}_{\mathsf {mult}}}\) perfectly simulates hybrids \({\mathsf {H}}_{k}\) and \({\mathsf {H}}_{k+1}\) for messages and ciphertexts at positions in \(\overline{C_{k+1}}\). Run in \({\mathsf {mult\text {-}IND\text {-}CPA}}_{{\mathsf {PKE}},0}\), \({\mathcal {B}_{\mathsf {mult}}}\) obtains \({\mathsf {Enc}}_ pk (\mathbf {m}^0_{C_{k+1}})\) and \({\mathcal {A}_{\mathsf {SO}}}\) therefore receives encryptions of sampled messages. During \(\textsc {Challenge}\) the \((k+1)\)-th batch contains the sampled messages \(\mathbf {m}^0_{C_{k+1}}\); thus \({\mathcal {B}_{\mathsf {mult}}}\) perfectly simulates hybrid \({\mathsf {H}}_{k}\).

When \({\mathcal {B}_{\mathsf {mult}}}\) is run in \({\mathsf {mult\text {-}IND\text {-}CPA}}_{{\mathsf {PKE}},1}\), \({\mathcal {A}_{\mathsf {SO}}}\) obtains encryptions of resampled messages \({\mathsf {Enc}}_ pk (\mathbf {m}^1_{C_{k+1}})\) while it expects encryptions of sampled messages \({\mathsf {Enc}}_ pk (\mathbf {m}^0_{C_{k+1}})\). During \(\textsc {Challenge}\), \({\mathcal {A}_{\mathsf {SO}}}\) expects resampled messages \(\widetilde{\mathbf {m}}^1_{C_{k+1}}\) but obtains the sampled \(\mathbf {m}^0_{C_{k+1}}\). Thus, the sampled and resampled messages change roles on \(C_{k+1}\).

However, they are identically distributed, i.e., \(\mathbf {m}^0_{C_{k+1}}\equiv \mathbf {m}^1_{C_{k+1}}\), since the messages in \(N(C_{k+1})\) were fixed when resampling \(\mathbf {m}^1\) and the distribution of the messages in \(C_{k+1}\) depends only on \(\mathfrak {D}\) and the messages at positions \(N(C_{k+1})\). Likewise, \(\mathbf {m}^1_{C_{k+1}}\equiv \widetilde{\mathbf {m}}^1_{C_{k+1}}\) for \(\mathbf {m}^1\leftarrow {\mathsf {Resamp}}_\mathfrak {D}(\mathbf {m}^0,N(C_{k+1}))\) and \(\widetilde{\mathbf {m}}^1\leftarrow {\mathsf {Resamp}}_\mathfrak {D}(\mathbf {m}^0,\mathcal {I})\), since the distribution of the messages in \(C_{k+1}\) solely depends on \(\mathfrak {D}\) and the messages in \(N(C_{k+1})\subseteq \mathcal {I}\). Hence \({\mathcal {A}_{\mathsf {SO}}}\)’s view is identical to hybrid \({\mathsf {H}}_{k+1}\). We have

$$\begin{aligned} \Pr [{\mathsf {mult\text {-}IND\text {-}CPA}}^{\mathcal {B}_{\mathsf {mult}}}_{{\mathsf {PKE}},0}\Rightarrow 1]&=\Pr [{\mathsf {H}}_{k}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge \overline{{\texttt {Bad}}}] ~\text {and}\\ \Pr [{\mathsf {mult\text {-}IND\text {-}CPA}}^{\mathcal {B}_{\mathsf {mult}}}_{{\mathsf {PKE}},1}\Rightarrow 1]&=\Pr [{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge \overline{{\texttt {Bad}}}]. \end{aligned}$$

Observe that \({\texttt {Bad}}\) does not occur when \({\mathcal {B}_{\mathsf {mult}}}\) guessed \(C_{k+1}\) correctly, which happens with probability \(1/|\mathcal S|\). Since \(\overline{{\texttt {Bad}}}\) is independent of \({\mathcal {A}_{\mathsf {SO}}}\)’s output in a hybrid and \(|\mathcal S|=S(G)\), we have

$$ \mathbf {Adv}_{\mathsf {PKE}}^{{\mathsf {mult\text {-}IND\text {-}CPA}}}({\mathcal {B}_{\mathsf {mult}}},\lambda )\ge \frac{1}{S(G)}\cdot \mathbf {Adv}\big ({\mathsf {H}}_{k} ^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\big ), $$

which concludes the proof.    \(\square \)

We proceed with the proof of Theorem 8. Using Lemma 9 we have

$$\begin{aligned} \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}SO\text {-}CPA}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda )&\le \sum _{k=0}^{n'-1}\mathbf {Adv}\big ({\mathsf {H}}_{k}^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\big )\\&\le \sum _{k=0}^{n'-1}S(G_\lambda )\cdot \mathbf {Adv}_{\mathsf {PKE}}^{{\mathsf {mult\text {-}IND\text {-}CPA}}}({\mathcal {B}_{\mathsf {mult}}},\lambda ). \end{aligned}$$

\({\mathcal {B}_{\mathsf {mult}}}\) sends message vectors of length \(|C_{k+1}^*|\le n\) to its \({\mathsf {mult\text {-}IND\text {-}CPA}}\) challenger. Using Lemma 2, we have

$$ \le \sum _{k=0}^{n'-1}n\cdot S(G_\lambda )\cdot \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}CPA}}({\mathcal {B}_{\mathsf {CPA}}},\lambda ) \le n \cdot (n-1) \cdot S(G_\lambda )\cdot \mathbf {Adv}^{\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}({\mathcal {B}_{\mathsf {CPA}}},\lambda ), $$

since \(n'\le n-1\), which completes the proof of Theorem 8.    \(\square \)

Markov Distributions. Markov distributions (Definition 5) are induced by the chain graph \((V=[n],E=\{(i-1,i): i\in [2,n]\})\), for which \(S(G)= \frac{1}{2} \cdot n\cdot (n+1)\). We thus immediately obtain the following corollary from Theorem 8.
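The constant in the corollary below is obtained by plugging this into the bound of Theorem 8:

$$ n\cdot (n-1)\cdot S(G_\lambda ) = n\cdot (n-1)\cdot \tfrac{1}{2}\cdot n\cdot (n+1) = \tfrac{1}{2}\cdot n^2\cdot (n^2-1). $$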

Corollary 10

Let \({\mathsf {PKE}}\) be \({\mathsf {IND\text {-}CPA}}\) secure. Then \({\mathsf {PKE}}\) is \({\mathsf {IND\text {-}SO\text {-}CPA}}\) secure w.r.t. efficiently resamplable Markov distributions over \(\mathcal {M}^n\).

Precisely, for any adversary \({\mathcal {A}_{\mathsf {SO}}}\) run in game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{\mathsf {PKE}}\) there exists an \({\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}\) adversary \({\mathcal {B}_{\mathsf {CPA}}}\) with roughly the running time of \({\mathcal {A}_{\mathsf {SO}}}\) plus two executions of \({\mathsf {Resamp}}\) such that

$$ \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}SO\text {-}CPA}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda ) \le \textstyle \frac{1}{2} \cdot n^2 \cdot (n^2-1) \cdot \mathbf {Adv}^{\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}({\mathcal {B}_{\mathsf {CPA}}},\lambda ). $$

3.3 A Bound Using the Maximum Border

Definition 11

(Maximum border). Let \(G=(V,E)\). We define the maximum border of G as the maximal size of the neighborhood of any connected subgraph in G.

$$ B(G):=\max \big \{\left| N(V')\right| \ :\ V'\subseteq V\text { connected}\big \}. $$

For example, if G is an n-path for \(n\ge 3\) then \(B(G)=2\). For the complete graph or star graph on n vertices we have \(B(G)=n-1\). Notice that \(B(G)<n\).
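As with S(G), the maximum border can be brute-forced for small examples. A Python sketch (our own illustration; exponential in n, with the same 0-based conventions and connectivity helper as in the S(G) sketch above):

```python
from itertools import combinations

def B(n, edges):
    """Brute-force maximum border (Definition 11)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def connected(vs):
        seen, stack = {min(vs)}, [min(vs)]
        while stack:
            for w in adj[stack.pop()] & vs:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen == vs

    def border(vs):                       # the open neighborhood N(vs)
        return {w for v in vs for w in adj[v]} - vs

    return max(len(border(set(vs)))
               for k in range(1, n + 1)
               for vs in combinations(range(n), k) if connected(set(vs)))

path = [(i, i + 1) for i in range(4)]     # path on 5 vertices
assert B(5, path) == 2
star = [(0, i) for i in range(1, 5)]      # star on 5 vertices
assert B(5, star) == 4                    # n - 1
```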

In the reduction in Sect. 3.2 we guessed the connected component of \(G_{\overline{\mathcal {I}}}\) that is switched from sampled to resampled in a hybrid transition. Alternatively, we can guess a connected component of \(G_{\overline{\mathcal {I}}}\) via its neighborhood. The following theorem bounds S(G) in terms of B(G).

Theorem 12

Let G be a connected graph. Then the following bound on S(G) holds:

$$ S(G)\le \frac{2}{(B(G)-1)!}\cdot n^{B(G)}\quad \text { for all }\quad 0 < B(G)\le \frac{n-2}{3}. $$

We begin with a simple observation before proving the theorem.

Lemma 13

Let \(G=(V,E)\) and let \(V_1\ne V_2\) both be connected in G with \(N(V_1)=N(V_2)\). Then \(V_1\cap V_2=\emptyset \).

Proof

Assume \(V_1\cap V_2\ne \emptyset \). As \(V_1 \ne V_2\) we have \(V_1\setminus V_2\ne \emptyset \) without loss of generality. Because \(V_1\) is connected, there exist vertices \(v_\cap \in V_1\cap V_2\) and \(v_1\in V_1\setminus V_2\) such that \((v_1,v_\cap )\in E^{\leftrightarrow }\). Since \(v_1\notin V_2\), \(v_\cap \in V_2\) and \((v_1,v_\cap )\in E^{\leftrightarrow }\), we see that \(v_1\in N(V_2)\). As \(N(V_2)=N(V_1)\) it follows that \(v_1\in N(V_1)\); a contradiction since \(v_1\in V_1\).    \(\square \)

Proof

(Theorem 12 ). Let \(B:=B(G)\). We have

$$\begin{aligned} S(G)&=\sum _{i=0}^B\left| \big \{V'\subseteq V:V'\text { connected}\wedge |N(V')|=i\big \}\right| . \end{aligned}$$

For \(i=0\) we count the connected components of G, of which there is exactly one since G is assumed connected.

$$\begin{aligned}&= 1+\sum _{i=1}^B\left| \big \{V'\subseteq V:V'\text { connected}\wedge |N(V')|=i\big \}\right| \\&=1+\sum _{i=1}^B\sum _{\begin{array}{c} V_i\subseteq V\\ |V_i|=i \end{array}}\left| \big \{V'\subseteq V:V'\text { connected}\wedge N(V')=V_i\big \}\right| . \end{aligned}$$

Let \(V_i\subseteq V\) be non-empty and \(\{V'\subseteq V:V'\text { connected}\wedge N(V')=V_i\}=\{V'_1,\ldots ,V'_k\}\) for appropriate k. Applying Lemma 13 to \(V'_1,\ldots ,V'_k\), we see that these sets are pairwise disjoint. Fix any vertex \(v_i\in V_i\). Since \(N(V'_j)=V_i\) for all \(j\in [k]\) and the \(V'_j\) are pairwise disjoint, each \(V'_j\) contains at least one vertex \(v'_j\) with \((v'_j,v_i)\in E^{\leftrightarrow }\). Thus \(|N(\{v_i\})|\ge k\), and hence \(k\le B\). This bounds the number of possible sets \(V'\) by B for each fixed \(V_i\). It follows

$$\begin{aligned} S(G)&\le 1+\sum _{i=1}^B\sum _{\begin{array}{c} V_i\subseteq V\\ |V_i|=i \end{array}}B = 1+B\cdot \sum _{i=1}^B\left( {\begin{array}{c}n\\ i\end{array}}\right) \le B\cdot \sum _{i=0}^B\left( {\begin{array}{c}n\\ i\end{array}}\right) . \end{aligned}$$
(1)

To bound the sum in (1) we use the geometric series and upper-bound the quotient of two consecutive binomial coefficients by \(\frac{1}{2}\):

$$ \frac{\left( {\begin{array}{c}n\\ i\end{array}}\right) }{\left( {\begin{array}{c}n\\ i+1\end{array}}\right) }=\frac{i+1}{n-i}\le \frac{1}{2}\Leftrightarrow i\le \frac{n-2}{3}. $$

Hence

$$ \quad B\cdot \sum _{i=0}^B\left( {\begin{array}{c}n\\ i\end{array}}\right) \le B\cdot \sum _{i=0}^B{\frac{1}{2^i}}\left( {\begin{array}{c}n\\ B\end{array}}\right) \le B\cdot \left( {\begin{array}{c}n\\ B\end{array}}\right) \cdot \sum _{i=0}^\infty \frac{1}{2^i}\le 2\cdot B\cdot \frac{n^B}{B!}=\frac{2}{(B-1)!}\cdot n^B $$

for \(B(G)\le \frac{n-2}{3}\), which concludes the proof.    \(\square \)

Theorems 8 and 12 together now yield the following corollary.

Corollary 14

Let \({\mathsf {PKE}}\) be \({\mathsf {IND\text {-}CPA}}\) secure. Then \({\mathsf {PKE}}\) is \({\mathsf {IND\text {-}SO\text {-}CPA}}\) secure w.r.t. the class of efficiently resamplable and G-induced distribution families over \(\mathcal {M}^n\) where \(B(G)={\mathsf {const}}\), \(n\ge 3\cdot B(G)+2\) and G is connected.

Concretely, for any adversary \({\mathcal {A}_{\mathsf {SO}}}\) in game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{\mathsf {PKE}}\) there exists an \({\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}\) adversary \({\mathcal {B}_{\mathsf {CPA}}}\) with roughly the running time of \({\mathcal {A}_{\mathsf {SO}}}\) plus two executions of \({\mathsf {Resamp}}\) such that

$$ \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}SO\text {-}CPA}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda ) \le \frac{2\cdot (n-1)}{(B(G_\lambda )-1)!}\cdot n^{B(G_\lambda )+1}\cdot \mathbf {Adv}^{\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}({\mathcal {B}_{\mathsf {CPA}}},\lambda ). $$
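The loss follows by substituting the bound of Theorem 12 into Theorem 8:

$$ n\cdot (n-1)\cdot S(G_\lambda )\le n\cdot (n-1)\cdot \frac{2}{(B(G_\lambda )-1)!}\cdot n^{B(G_\lambda )} = \frac{2\cdot (n-1)}{(B(G_\lambda )-1)!}\cdot n^{B(G_\lambda )+1}. $$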

Since Corollary 14 ensures a polynomial loss in the reduction for \(B(G)={\mathsf {const}}\) and we are interested in asymptotic statements, we do not consider the restriction to \(n\ge 3 \cdot B(G)+2\) a serious limitation. One can easily obtain a version of Theorem 12 that is weaker by a factor of roughly B(G) but holds for all \(B(G)<n\). To this end one bounds the sum of binomial coefficients in (1) in terms of the incomplete upper gamma function \(\varGamma \) to get

$$ \sum _{i=1}^B\left( {\begin{array}{c}n\\ i\end{array}}\right) \le \sum _{i=1}^B\frac{n^i}{i!}=\frac{e^n \varGamma (B+1,n)}{B!}-1. $$

Using a nice bound on \(\varGamma \) due to [11] that can be found in [5] we obtain a bound for \(B(G)<n\).

One may think of a direct reduction proving Corollary 14 as implicitly guessing \(C_{k+1}\): it guesses \(N(C_{k+1})\) by picking up to B(G) vertices of G and then guesses one of the at most B(G) connected subgraphs that have the guessed neighborhood.

Note that Corollary 14 cannot provide a tighter bound on the loss than Theorem 8. In particular, there are (even connected) graphs for which Theorem 8 ensures an at most polynomial loss, while Corollary 14 does not. For instance, let G be the star graph on \(\log n\) vertices attached to a chain graph on \(n-\log n\) vertices; then \(S(G)={\mathsf {poly}}(n)\), but \(B(G)\ge \log n-1\) is not constant.

3.4 A Tighter Reduction for Acyclic Graphs

While we considered graph-induced distributions for arbitrary graphs in Sects. 3.2 and 3.3, we now consider \({\mathsf {DAG}}\)-induced distributions for which we obtain a tighter reduction than what is guaranteed by Corollary 14.

For a \({\mathsf {DAG}}\) G we require the vertices to be semi-ordered in such a way that there is no directed path from i to j for \(i<j\). Such an ordering always exists since G has no cycles. Note that the dependencies now go the other way than for Markov distributions, but this will allow us to replace sampled messages by resampled ones from left to right as in the previous hybrids. We will traverse dependencies backwards, that is, if message \(\mathbf {m}_i\) depends on \(\mathbf {m}_j\) then \(\mathbf {m}_i\) is switched from sampled to resampled before \(\mathbf {m}_j\) is switched. So, as in the previous proofs, messages \(\mathbf {m}_1,\ldots ,\mathbf {m}_i\) will be resampled in the i-th hybrid.
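Such a semi-order is simply a reversed topological order. A Python sketch (our own illustration; Kahn’s algorithm on 0-based vertices, returning labels in 1..n) that computes one:

```python
from collections import deque

def semi_order(n, edges):
    """Relabel the vertices of a DAG so that every edge points from a
    higher label to a lower one; then no directed path runs from i to j
    for i < j. Returns a dict mapping old vertex -> new label in 1..n."""
    succ = {v: [] for v in range(n)}
    indeg = {v: 0 for v in range(n)}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    queue = deque(v for v in range(n) if indeg[v] == 0)
    topo = []
    while queue:                          # Kahn's algorithm
        u = queue.popleft()
        topo.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(topo) != n:
        raise ValueError("graph contains a cycle")
    return {u: n - k for k, u in enumerate(topo)}  # k-th in order gets n-k

# chain 0 -> 1 -> 2: the source gets the highest label
assert semi_order(3, [(0, 1), (1, 2)]) == {0: 3, 1: 2, 2: 1}
```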

Theorem 15

Let \({\mathsf {PKE}}\) be \({\mathsf {IND\text {-}CPA}}\) secure. Then \({\mathsf {PKE}}\) is \({\mathsf {IND\text {-}SO\text {-}CPA}}\) secure w.r.t. the class of efficiently resamplable and G-induced distribution families over \(\mathcal {M}^n\) where \(B(G)={\mathsf {const}}\) and G is a connected \({\mathsf {DAG}}\).

Precisely, for any adversary \({\mathcal {A}_{\mathsf {SO}}}\) run in game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{\mathsf {PKE}}\) there exists an \({\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}\) adversary \({\mathcal {B}_{\mathsf {CPA}}}\) with roughly the running time of \({\mathcal {A}_{\mathsf {SO}}}\) plus three executions of \({\mathsf {Resamp}}\) such that

$$ \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}SO\text {-}CPA}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda ) \le 3\cdot n^{B(G_\lambda )+1} \cdot \mathbf {Adv}^{\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}({\mathcal {B}_{\mathsf {CPA}}},\lambda ). $$

Proof

We proceed in a sequence of hybrid games \({\mathsf {H}}_{0},{\mathsf {H}}_{1},\ldots ,{\mathsf {H}}_{n}\) and switch message \(\mathbf {m}_{k+1}\) from sampled to resampled in the hybrid transition from \({\mathsf {H}}_{k}\) to \({\mathsf {H}}_{k+1}\). Hybrid \({\mathsf {H}}_{k}\) will return the sampled messages at all positions \([k+1, n]\cup \mathcal {I}\), but resampled messages at all positions \([k]\setminus \mathcal {I}\), where the resampling is conditioned on every message in \([k+1,n]\cup \mathcal {I}\). The code for \(\textsc {Challenge}\) is given in Fig. 5; every other procedure stays as in Fig. 2.

Fig. 5. \(\textsc {Challenge}\) procedure of hybrid game \({\mathsf {H}}_{k}\). For \(k=n\) we have \([n+1,n]=\emptyset \).

Hybrid \({\mathsf {H}}_{0}\) is identical to game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},0}\), and \({\mathsf {H}}_{n}\) is identical to \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},1}\), hence

$$\begin{aligned} \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}SO\text {-}CPA}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda ) =\mathbf {Adv}\big ({\mathsf {H}}_{0}^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{n}^{\mathcal {A}_{\mathsf {SO}}}\big ) \le \sum _{k=0}^{n-1}\mathbf {Adv}\big ({\mathsf {H}}_{k} ^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\big )\;. \end{aligned}$$
(2)

We bound the distance between two consecutive hybrids \({\mathsf {H}}_{k}\), \({\mathsf {H}}_{k+1}\) and proceed with the following lemma.

Lemma 16

For every adversary \({\mathcal {A}_{\mathsf {SO}}}\) that distinguishes hybrids \({\mathsf {H}}_{k}\) and \({\mathsf {H}}_{k+1}\) there exists a \({\mathsf {mult\text {-}IND\text {-}CPA}}\) adversary \({\mathcal {B}_{\mathsf {mult}}}\) with roughly the running time of \({\mathcal {A}_{\mathsf {SO}}}\) plus three executions of \({\mathsf {Resamp}}\) such that

$$\begin{aligned} \mathbf {Adv}\big ({\mathsf {H}}_{k} ^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\big ) \le {\Pr [\overline{{\texttt {Bad}}}_k]}^{-1}\cdot \mathbf {Adv}_{\mathsf {PKE}}^{{\mathsf {mult\text {-}IND\text {-}CPA}}}({\mathcal {B}_{\mathsf {mult}}},\lambda ), \end{aligned}$$

where \(\Pr [\overline{{\texttt {Bad}}}_k]^{-1}=\sum _{i=0}^{B(G_\lambda )-1}\left( {\begin{array}{c}k\\ i\end{array}}\right) \) for \(k<n-1\) and \(\Pr [\overline{{\texttt {Bad}}}_k]^{-1}=\sum _{i=0}^{B(G_\lambda )}\left( {\begin{array}{c}k\\ i\end{array}}\right) \) for \(k=n-1\).

Proof Idea: We construct a \({\mathsf {mult\text {-}IND\text {-}CPA}}\) adversary \({\mathcal {B}_{\mathsf {mult}}}\) that interpolates between hybrids \({\mathsf {H}}_{k}\) and \({\mathsf {H}}_{k+1}\). Ideally, \({\mathcal {B}_{\mathsf {mult}}}\) embeds its own challenge at position \(k+1\), but it might have to resample some already resampled messages in \(\mathbf {m}_{[k]}\) to avoid inconsistencies. Let \({\mathsf {middle}}\) denote the connected component in \(G_{[k+1]\setminus \mathcal {I}}\) that contains \(\mathbf {m}_{k+1}\), let \({\mathsf {right}}:=[k+2,n]\), and let \({\mathsf {left}}:=\overline{\left( {\mathsf {middle}}\cup {\mathsf {right}}\right) }\). Observe that it suffices to resample \({\mathsf {middle}}\) again to obtain consistent resampled messages. In particular, there is no need to resample any \({\mathsf {right}}\) message due to the semi-order imposed on the vertices, as a message in \({\mathsf {right}}\) does not depend on any message in \(\overline{{\mathsf {right}}}\) (cf. Fig. 6). The reduction will guess \({\mathsf {middle}}\) to embed its \({\mathsf {mult\text {-}IND\text {-}CPA}}\) challenge, while it waits for all opening queries to happen before resampling the \({\mathsf {left}}\) messages. Note that \({\mathsf {middle}}\) and \({\mathsf {left}}\) are disconnected in \(G_{\overline{\mathcal {I}}}\) and can thus be resampled independently of each other, each depending only on its respective neighborhood. Since the \({\mathsf {right}}\) messages are fixed while resampling, it suffices to guess \(N({\mathsf {middle}})\cap [k]\). Further, G is connected, so \(N({\mathsf {middle}})\) contains at least one vertex from \({\mathsf {right}}=[k+2,n]\) as long as \(k<n-1\). Hence, for \(k<n-1\), we have \(|N({\mathsf {middle}})\cap [k]|\le B(G)-1\).
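To fix ideas, a Python sketch of this partition (our own illustration; vertices 1..n already in the semi-order, \(\mathcal {I}\) given as the argument "opened", and \(k+1\notin \mathcal {I}\) assumed):

```python
def hybrid_partition(n, edges, opened, k):
    """The sets middle/right/left from the proof idea of Lemma 16:
    middle = component of k+1 in G restricted to [k+1] \\ I,
    right  = [k+2, n], left = the rest."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:                    # undirected view, as for N(.)
        adj[u].add(v)
        adj[v].add(u)
    allowed = set(range(1, k + 2)) - set(opened)
    middle, stack = {k + 1}, [k + 1]
    while stack:
        for w in adj[stack.pop()] & allowed:
            if w not in middle:
                middle.add(w)
                stack.append(w)
    right = set(range(k + 2, n + 1))
    left = set(range(1, n + 1)) - middle - right
    return middle, right, left

# chain 1-2-3-4-5 with vertex 2 opened, k = 2: middle = {3}
chain = [(i, i + 1) for i in range(1, 5)]
assert hybrid_partition(5, chain, {2}, 2) == ({3}, {4, 5}, {1, 2})
```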

Fig. 6. Structure of G. Edges between particular sets cannot exist if no arrow is depicted. If \({\mathsf {right}}\ne \emptyset \), there is at least one edge from \({\mathsf {right}}\) to \({\mathsf {middle}}\) since G is connected. \({\mathsf {left}}\) and \({\mathsf {middle}}\) are disconnected in \(G_{\overline{\mathcal {I}}}\).

Fig. 7. \({\mathcal {A}_{\mathsf {SO}}}\)’s game interface as provided by \({\mathcal {B}_{\mathsf {mult}}}\) run in game \({\mathsf {mult\text {-}IND\text {-}CPA}}\). \({\mathcal {B}_{\mathsf {mult}}}\) interpolates between hybrids \({\mathsf {H}}_{k}\), \({\mathsf {H}}_{k+1}\) for \(k\in [0,n-1]\).

Proof

(Lemma 16 ). For \(k\in [0,n]\) and \(i\in [n]\) let \({\texttt {Open}}_k(i)\) denote the event that \({\mathcal {A}_{\mathsf {SO}}}\) calls \(\textsc {Open}(i)\) in hybrid \({\mathsf {H}}_{k}\). Two arbitrary hybrids differ only in the \(\textsc {Challenge}\) procedure, hence \(\Pr [{\texttt {Open}}_s(i)]=\Pr [{\texttt {Open}}_t(i)]\) for all \(s,t\in [0,n]\) and all \({i\in [n]}\). Additionally, two consecutive hybrids \({\mathsf {H}}_{k}\), \({\mathsf {H}}_{k+1}\) differ only in the \((k{+}1)\)-th message returned during \(\textsc {Challenge}\), and they are identical if \({\mathcal {A}_{\mathsf {SO}}}\) calls \(\textsc {Open}(k+1)\) in game \({\mathsf {H}}_{k+1}\). Thus, we have

$$\begin{aligned} \Pr [{\mathsf {H}}_{k}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge {\texttt {Open}}_k(k+1)]=\Pr [{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge {\texttt {Open}}_{k+1}(k+1)] \end{aligned}$$

and obtain

$$\begin{aligned}&\mathbf {Adv}\big ({\mathsf {H}}_{k} ^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\big )= \nonumber \\&\quad \left| \Pr [{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge \overline{{\texttt {Open}}_{k+1}(k+1)}]-\Pr [{\mathsf {H}}_{k}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge \overline{{\texttt {Open}}_k(k+1)}]\right| . \end{aligned}$$
(3)

We describe \({\mathcal {B}_{\mathsf {mult}}}\) (cf. Fig. 7): It passes \( pk \) on to \({\mathcal {A}_{\mathsf {SO}}}\); obtaining \((\mathfrak {D},{\mathsf {Resamp}}_\mathfrak {D})\), \({\mathcal {B}_{\mathsf {mult}}}\) guesses \({\mathsf {middle}}\) (labeled \({\mathsf {middle}}^*\)) by guessing \({\mathsf {middle}}\)’s neighborhood in \(G_{[k+1]}\) (labeled \(N^*\)) and samples \(\mathbf {m}^0\leftarrow \mathfrak {D}\). \({\mathcal {B}_{\mathsf {mult}}}\) resamples \(\mathbf {m}^{1,0}\) fixing \(N^*\cup {\mathsf {right}}\) and resamples \(\mathbf {m}^{1,1}\) fixing \(N^*\cup {\mathsf {right}}\cup \{k+1\}\). \({\mathcal {B}_{\mathsf {mult}}}\) sends \((\mathbf {m}^{1,0}_{{\mathsf {middle}}^*},\mathbf {m}^{1,1}_{{\mathsf {middle}}^*})\) to its \({\mathsf {mult\text {-}IND\text {-}CPA}}\) challenger, receives \(\mathbf {c}_{{\mathsf {middle}}^*}\), samples fresh randomness to encrypt the messages in \(\overline{{\mathsf {middle}}^*}\) on its own, and forwards \((\mathbf {c}_1,\ldots ,\mathbf {c}_n)\) to \({\mathcal {A}_{\mathsf {SO}}}\). \({\mathcal {B}_{\mathsf {mult}}}\) sets \({\texttt {Bad}}:=true\) if \({\mathcal {A}_{\mathsf {SO}}}\) calls \(\textsc {Open}(i)\) for some \(i\in {\mathsf {middle}}^*\setminus \{k+1\}\), since it cannot answer those queries (see Footnote 2). All other opening queries are answered honestly. On \({\mathcal {A}_{\mathsf {SO}}}\)’s call of \(\textsc {Challenge}\), \({\mathcal {B}_{\mathsf {mult}}}\) checks whether \(N^*\subseteq \mathcal {I}\). If not, \({\mathcal {B}_{\mathsf {mult}}}\) guessed \({\mathsf {middle}}\) incorrectly and sets \({\texttt {Bad}}\) to true. Otherwise, \({\mathcal {B}_{\mathsf {mult}}}\) resamples messages fixing those at positions \(\mathcal {I}\cup {\mathsf {right}}\) to obtain resampled messages \(\mathbf {m}^1\), and sends \(\mathbf {m}^1_i\) for all \({\mathsf {left}}\) positions and \(\mathbf {m}^0_i\) for all remaining positions to \({\mathcal {A}_{\mathsf {SO}}}\). \({\mathcal {B}_{\mathsf {mult}}}\) outputs whatever \({\mathcal {A}_{\mathsf {SO}}}\) outputs.
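Since Fig. 7 is only referenced here, the following Python sketch compresses \({\mathcal {B}_{\mathsf {mult}}}\)’s control flow. It is a schematic under stated assumptions, not the paper’s pseudocode: all capabilities (`sample_D`, `resamp`, `enc_own`, `mult_challenge`, `adversary`) are hypothetical callables injected as parameters, and the online handling of opening queries is compressed into a single opening phase.

```python
def run_B_mult(k, n, right, N_star, middle_star,
               sample_D, resamp, enc_own, mult_challenge, adversary):
    """Schematic control flow of B_mult between hybrids H_k and H_{k+1}.

    All cryptographic capabilities are passed in as callables; this is a
    sketch of the reduction's logic, not a concrete implementation.
    """
    m0 = sample_D()                                    # m^0 <- D
    m10 = resamp(m0, fixed=N_star | right)             # m^{1,0}
    m11 = resamp(m0, fixed=N_star | right | {k + 1})   # m^{1,1}
    # Challenge ciphertexts for the guessed middle component; all other
    # positions are encrypted with fresh, self-chosen randomness.
    c = enc_own(m0, except_positions=middle_star)
    c.update(mult_challenge({i: (m10[i], m11[i]) for i in middle_star}))
    opened = adversary.opening_phase(c)                # corruption set I
    if opened & (middle_star - {k + 1}):               # unanswerable query
        return None                                    # Bad := true
    if not N_star <= opened:                           # wrong guess of middle
        return None                                    # Bad := true
    m1 = resamp(m0, fixed=opened | right)              # final resampling
    left = set(range(1, n + 1)) - middle_star - right
    reply = {i: (m1[i] if i in left else m0[i]) for i in range(1, n + 1)}
    return adversary.finalize(reply)                   # forward A_SO's output
```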

Assume that \({\mathcal {B}_{\mathsf {mult}}}\) guessed correctly, i.e. \(N^*\) is the neighborhood of \({\mathsf {middle}}\) in \(G_{[k]}\). Then \({\mathsf {middle}}^*={\mathsf {middle}}\) holds and by definition of \({\mathsf {middle}}\), \({\texttt {Bad}}\) cannot happen.

Clearly, \({\mathcal {B}_{\mathsf {mult}}}\) correctly simulates \({\mathcal {A}_{\mathsf {SO}}}\)’s hybrid view in all \({\mathsf {left}}\) and \({\mathsf {right}}\) positions. Note, however, that \({\mathcal {A}_{\mathsf {SO}}}\) obtains resampled encryptions \({\mathsf {Enc}}_ pk (\mathbf {m}^{1,b}_{\mathsf {middle}})\) during \(\textsc {Enc}\) where it expects sampled encryptions \({\mathsf {Enc}}_ pk (\mathbf {m}^0_{\mathsf {middle}})\), and receives sampled messages \(\mathbf {m}^0_{\mathsf {middle}}\) when calling \(\textsc {Challenge}\) where it expects resampled messages \(\mathbf {m}_{\mathsf {middle}}\). Thus, from \({\mathcal {A}_{\mathsf {SO}}}\)’s view, sampled \({\mathsf {middle}}\) messages become resampled ones and vice versa.

However, we have \(\mathbf {m}_{\mathsf {middle}}\equiv \mathbf {m}^0_{\mathsf {middle}}\) since \(N({\mathsf {middle}})\subseteq \mathcal {I}\cup {\mathsf {right}}\) and \(\mathcal {I}\cup {\mathsf {right}}\) is fixed when resampling \(\mathbf {m}_{\mathsf {middle}}\).

For \({\mathcal {B}_{\mathsf {mult}}}\) run in game \({\mathsf {mult\text {-}IND\text {-}CPA}}_{{\mathsf {PKE}},0}\), \({\mathcal {A}_{\mathsf {SO}}}\) receives \({\mathsf {Enc}}_ pk (\mathbf {m}^{1,0}_{\mathsf {middle}})\) where \(\mathbf {m}^{1,0}_{\mathsf {middle}}\equiv \mathbf {m}^0_{\mathsf {middle}}\) since \(N^*\cup {\mathsf {right}}=N({\mathsf {middle}})\cup {\mathsf {right}}\) is fixed when \(\mathbf {m}^{1,0}\) is resampled. Hence, all \({\mathsf {middle}}\) messages sent during \(\textsc {Challenge}\) look resampled and \({\mathcal {A}_{\mathsf {SO}}}\)’s view is identical to hybrid \({\mathsf {H}}_{k+1}\).

When \({\mathcal {B}_{\mathsf {mult}}}\) is run in \({\mathsf {mult\text {-}IND\text {-}CPA}}_{{\mathsf {PKE}},1}\), it forwards \({\mathsf {Enc}}_ pk (\mathbf {m}^{1,1}_{\mathsf {middle}})\) to \({\mathcal {A}_{\mathsf {SO}}}\) where \(\mathbf {m}^{1,1}_{\mathsf {middle}}\equiv \mathbf {m}^1_{\mathsf {middle}}\) for the same reason as for \(b=0\). In particular, we have \(\mathbf {m}^0_{k+1}=\mathbf {m}^{1,1}_{k+1}\) since \(\mathbf {m}^0_{k+1}\) is fixed while resampling. Consequently, each message in \({\mathsf {middle}}\) except the \((k+1)\)-th looks resampled during \(\textsc {Challenge}\) and \({\mathcal {A}_{\mathsf {SO}}}\)’s view is identical to hybrid \({\mathsf {H}}_{k}\).

\({\mathcal {B}_{\mathsf {mult}}}\) outputs 1 in its game \({\mathsf {mult\text {-}IND\text {-}CPA}}\) if \({\mathcal {A}_{\mathsf {SO}}}\) outputs 1 in its respective hybrid, does not open ciphertext \(\mathbf {c}{}_{k+1}\), and \({\texttt {Bad}}\) does not occur. We thus have

$$\begin{aligned} \mathbf {Adv}_{\mathsf {PKE}}^{{\mathsf {mult\text {-}IND\text {-}CPA}}}({\mathcal {B}_{\mathsf {mult}}},\lambda )&\ge \left| \Pr [{\mathsf {mult\text {-}IND\text {-}CPA}}_{{\mathsf {PKE}},0}^{{\mathcal {B}_{\mathsf {mult}}}}\Rightarrow 1] - \Pr [{\mathsf {mult\text {-}IND\text {-}CPA}}_{{\mathsf {PKE}},1}^{{\mathcal {B}_{\mathsf {mult}}}}\Rightarrow 1] \right| \\&=\left| \Pr [{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge \overline{{\texttt {Open}}_{k+1}(k{+}1)}\wedge \overline{{\texttt {Bad}}}] - \Pr [{\mathsf {H}}_{k}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge \overline{{\texttt {Open}}_k(k{+}1)}\wedge \overline{{\texttt {Bad}}}] \right| . \end{aligned}$$

Since \(\overline{{\texttt {Bad}}}\) is independent of the event \({\mathsf {H}}_{i}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge {\texttt {Open}}_i(k+1)\) for \(i\in \{k,k+1\}\), we have

$$\begin{aligned}&= \Pr [\overline{{\texttt {Bad}}}]\cdot \left| \Pr [{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge \overline{{\texttt {Open}}_{k+1}(k+1)}] - \Pr [{\mathsf {H}}_{k}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1\wedge \overline{{\texttt {Open}}_k(k+1)}] \right| \\&= \Pr [\overline{{\texttt {Bad}}}]\cdot \mathbf {Adv}\big ({\mathsf {H}}_{k} ^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{k+1}^{\mathcal {A}_{\mathsf {SO}}}\big ), \end{aligned}$$

by Eq. (3). \({\mathcal {B}_{\mathsf {mult}}}\) picks \(N^*\) from a set of size \(\sum _{i=0}^{B(G_\lambda )-1}\left( {\begin{array}{c}k\\ i\end{array}}\right) \) for \(k<n-1\), and of size \(\sum _{i=0}^{B(G_\lambda )}\left( {\begin{array}{c}k\\ i\end{array}}\right) \) for \(k=n-1\), respectively, which proves Lemma 16.    \(\square \)

The remaining proof consists of tedious computations. From Eq. (2) and Lemma 16 we have

$$ \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}SO\text {-}CPA}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda ) \le \sum _{k=0}^{n-1}{\Pr [\overline{{\texttt {Bad}}}_k]}^{-1}\cdot \mathbf {Adv}_{\mathsf {PKE}}^{{\mathsf {mult\text {-}IND\text {-}CPA}}}({\mathcal {B}_{\mathsf {mult}}},\lambda ). $$

Let \(B:=B(G)\). Since \({\mathcal {B}_{\mathsf {mult}}}\) submits message vectors of length \(|{\mathsf {middle}}^*|\le k+1\) to its \({\mathsf {mult\text {-}IND\text {-}CPA}}\) challenger, Lemma 2 yields:

$$\begin{aligned} \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}SO\text {-}CPA}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda )&\!\le \!\nonumber \\ \left( \sum _{k=0}^{n-2}(k \! + \! 1)\cdot \sum _{i=0}^{B-1}\left( {\begin{array}{c}k\\ i\end{array}}\right) \right.&\left. + n \cdot \sum _{i=0}^B\left( {\begin{array}{c}n \! - \! 1\\ i\end{array}}\right) \! \right) \cdot \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}CPA}}({\mathcal {B}_{\mathsf {mult}}},\lambda ). \end{aligned}$$
(4)

We upper-bound the loss in (4). Let \(2\le B<n\).

$$\begin{aligned}&\!\!\!\! \sum _{k=0}^{n-2}(k+1)\cdot \sum _{i=0}^{B-1}\left( {\begin{array}{c}k\\ i\end{array}}\right) + n \cdot \sum _{i=0}^{B}\left( {\begin{array}{c}n-1\\ i\end{array}}\right) \\&= \sum _{i=0}^{B-1}\left( {\begin{array}{c}0\\ i\end{array}}\right) +2\cdot \sum _{i=0}^{B-1}\left( {\begin{array}{c}1\\ i\end{array}}\right) +\sum _{k=2}^{n-2}(k+1)\cdot \sum _{i=0}^{B-1}\left( {\begin{array}{c}k\\ i\end{array}}\right) + n \cdot \sum _{i=0}^{B}\left( {\begin{array}{c}n-1\\ i\end{array}}\right) \\&\le 5 + \sum _{k=2}^{n-2}(k+1)\cdot \sum _{i=0}^{B-1}k^i + n \cdot \sum _{i=0}^{B}\left( {\begin{array}{c}n-1\\ i\end{array}}\right) \\&= 5+ \sum _{k=2}^{n-2}(k+1)\cdot \frac{k^B-1}{k-1} + n \cdot \sum _{i=0}^{B}\left( {\begin{array}{c}n-1\\ i\end{array}}\right) \\&= 5+ \sum _{k=2}^{n-2}\underbrace{\frac{k+1}{k-1}}_{\le 3}\cdot (k^B-1) + n \cdot \sum _{i=0}^{B}\left( {\begin{array}{c}n-1\\ i\end{array}}\right) \\&\le 5+ 3 \cdot \sum _{k=2}^{n-2}(k^B-1) + n \cdot \sum _{i=0}^{B}\left( {\begin{array}{c}n-1\\ i\end{array}}\right) \\&= 5+ 3 \cdot \sum _{k=2}^{n-2}k^B-3\cdot (n-3) + n \cdot \sum _{i=0}^{B}\left( {\begin{array}{c}n-1\\ i\end{array}}\right) \\&= 14-3n+3 \cdot \sum _{k=2}^{n-2}k^B+n \cdot \sum _{i=0}^{B}\left( {\begin{array}{c}n-1\\ i\end{array}}\right) \\&=11-3n+3 \cdot \sum _{k=0}^{n-2}k^B+n \cdot \sum _{i=0}^{B}\left( {\begin{array}{c}n-1\\ i\end{array}}\right) \text { since }B\ge 1\\&\le 11 - 3n + 3 \cdot \sum _{k=0}^{n-2}k^B + n \cdot \sum _{i=0}^{B}n^i = 11 - 3n + 3 \cdot \sum _{k=0}^{n-2}k^B + n \cdot \frac{n^{B+1}-1}{n-1}\\&= 11 - 3n + 3 \cdot \sum _{k=0}^{n-2}k^B + \underbrace{\frac{n}{n-1}}_{\le 2} \cdot (n^{B+1}-1)\text { since }n\ge 2\\&\le 9 -3n + 3 \cdot \sum _{k=0}^{n-2}k^B + 2\cdot n^{B+1} \le 9-3n+3\cdot \int \limits _0^n k^B\mathrm {d}k+ 2\cdot n^{B+1}\\&= 9-3n+3\cdot \frac{n^{B+1}}{B+1}+2\cdot n^{B+1} = 9-3n + \left( 2+\frac{3}{B+1}\right) \cdot n^{B+1}\\&\le 9-3n + 3 \cdot n^{B+1}\text { since }B\ge 2\\&\le 3 \cdot n^{B+1}\text { since }n\ge 3. \end{aligned}$$

Since G is connected, we have \(B=0\Leftrightarrow n=1\) and \(B=1\Leftrightarrow n=2\). Thus, it is easily verified that the bound also holds for \((B,n)\in \{(0,1),(1,2)\}\).    \(\square \)
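As a sanity check on the computation above, a short Python script (ours, not part of the proof) can compare the exact loss of Eq. (4) against the claimed bound \(3\cdot n^{B+1}\) for small parameters:

```python
from math import comb

def exact_loss(n, B):
    """Exact loss factor from Eq. (4)."""
    return (sum((k + 1) * sum(comb(k, i) for i in range(B))
                for k in range(n - 1))
            + n * sum(comb(n - 1, i) for i in range(B + 1)))

# Spot-check the bound 3 * n^(B+1) for small parameters with 2 <= B < n.
for n in range(3, 40):
    for B in range(2, n):
        assert exact_loss(n, B) <= 3 * n ** (B + 1), (n, B)
```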

Because Markov distributions are \({\mathsf {DAG}}\)-induced by chain graphs and the maximum border of a chain graph is at most 2, we immediately obtain a tighter version of Corollary 10, whose proof follows directly from Theorem 15.

Corollary 17

Let \({\mathsf {PKE}}\) be \({\mathsf {IND\text {-}CPA}}\) secure. Then \({\mathsf {PKE}}\) is \({\mathsf {IND\text {-}SO\text {-}CPA}}\) secure with respect to efficiently resamplable Markov distributions over \(\mathcal {M}^n\).

In particular, for any adversary \({\mathcal {A}_{\mathsf {SO}}}\) run in game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{\mathsf {PKE}}\) there exists an \({\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}\) adversary \({\mathcal {B}_{\mathsf {CPA}}}\) with roughly the running time of \({\mathcal {A}_{\mathsf {SO}}}\) plus three executions of \({\mathsf {Resamp}}\) such that

$$ \mathbf {Adv}_{\mathsf {PKE}}^{\mathsf {IND\text {-}SO\text {-}CPA}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda ) \le 3\cdot n^3\cdot \mathbf {Adv}^{\mathsf {IND\text {-}CPA}}_{\mathsf {PKE}}({\mathcal {B}_{\mathsf {CPA}}},\lambda ). $$

Applying the proof of Theorem 15 directly to the Markov case gives a slightly better bound on the loss, namely \(n\cdot (n+1)\cdot (2n+1)/6\), since \(|N({\mathsf {middle}})\cap [n-1]|=1\) even for the last transition from \({\mathsf {H}}_{n-1}\) to \({\mathsf {H}}_{n}\). Hence, the loss in Eq. (4) decreases to \(\sum _{k=0}^{n-1}{(k+1)}^2\).
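For illustration, the closed form and its relation to the \(3\cdot n^3\) bound of Corollary 17 are easily verified numerically:

```python
n = 25
loss = sum((k + 1) ** 2 for k in range(n))     # sum_{k=0}^{n-1} (k+1)^2
assert loss == n * (n + 1) * (2 * n + 1) // 6  # closed form stated above
assert loss <= 3 * n ** 3                      # bound of Corollary 17
```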

Recall that the hybrids in the proof of Theorem 15 saved us a factor of n because, for \(k<n-1\), it suffices to guess a set of size at most \(B(G)-1\) instead of B(G), as at least one vertex of the neighborhood of \({\mathsf {middle}}\) is contained in \({\mathsf {right}}\).

The same hybrids can be used to strengthen Theorem 8, as it suffices to guess a connected subgraph in \([k+1]\) (instead of in [n]) containing vertex \(k+1\).

Since G is connected, there is at least one path in \(\{k+1\}\cup {\mathsf {right}}\) that contains \(k+1\), yielding at least \(n-k\) connected subgraphs in \({\mathsf {right}}\cup \{k+1\}\). Thus, there exist at least \(n-k\) connected subgraphs in G that contain vertex \(k+1\) and are identical when restricted to \([k+1]\). Hence the probability that the reduction guesses \(C_{k+1}\) correctly can be increased from \(1/S(G)\) to \((n-k)/S(G)\), bringing the loss from \(\mathcal O(n^2)\cdot S(G)\) down to \(\mathcal O(n\cdot \log n)\cdot S(G)\).
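Numerically, the improved guessing probability translates into a loss of \(\sum _{k=0}^{n-1}(k+1)/(n-k)\) (with S(G) factored out), which indeed grows like \(n\log n\); the following snippet (our own check) illustrates this:

```python
from math import log

def improved_loss_factor(n):
    """Sum of per-hybrid losses (k+1)/(n-k) for k = 0, ..., n-1."""
    return sum((k + 1) / (n - k) for k in range(n))

# The ratio to n*log(n) stays bounded, confirming O(n log n) growth.
for n in (10, 100, 1000, 10000):
    print(n, improved_loss_factor(n) / (n * log(n)))
```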

3.5 A Hybrid Argument for Disconnected Graphs

Let G be a graph with \(z'\) connected components. Fix any semi-order on them, e.g., ordered by the smallest vertex in each component, and let \(V_1,\ldots ,V_{z'}\) denote the vertex sets of the connected components of G. For \(j\in [z'+1,n]\) let \(V_j:=\emptyset \). We define a security game in which an adversary plays the \({\mathsf {IND\text {-}SO\text {-}CPA}}\) game on a single connected component of the graph that induces the distribution chosen by the adversary.
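As a minimal sketch of this bookkeeping (plain Python; the function name and graph encoding are our own), the components \(V_1,\ldots ,V_{z'}\) and their semi-order by smallest vertex can be computed as follows:

```python
def components_by_smallest_vertex(n, edges):
    """Return the vertex sets V_1, ..., V_{z'} of the connected components
    of the graph ([n], edges), ordered by their smallest vertex (the
    semi-order fixed above). Vertices are 1..n; plain iterative search."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for v in range(1, n + 1):      # ascending start vertices, so components
        if v in seen:              # appear ordered by their smallest vertex
            continue
        comp, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Example: components {1, 3, 4} and {2, 5}.
print(components_by_smallest_vertex(5, [(1, 3), (3, 4), (2, 5)]))
```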

Definition 18

For a public-key encryption scheme \({\mathsf {PKE}}:=({\mathsf {Gen}},{\mathsf {Enc}},{\mathsf {Dec}})\), a bit b, a family \(\mathcal {F}\) of efficiently resamplable, G-induced distributions over \(\mathcal {M}^n\), \(z\in [n]\) and an adversary \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) we consider game \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}^{\mathcal {B}_{\mathsf {G\text {-}SO}}}_{{\mathsf {PKE}},b,z}\) given in Fig. 8. Run in the game, \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) calls \(\textsc {Enc}\) once right after \(\textsc {Initialize}\) and submits \(\mathfrak {D}\in \mathcal {F}\) along with a \({\mathsf {PPT}}\) resampling algorithm \({\mathsf {Resamp}}_{\mathfrak {D}}\). \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) may call \(\textsc {Open}\) multiple times, but only for \(i\in V_z\), and invokes \(\textsc {Challenge}\) once after its last \(\textsc {Open}\) query and before calling \(\textsc {Finalize}\). We define the advantage of \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) run in \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},b,z}\) as

$$\begin{aligned} \mathbf {Adv}^{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},z}({\mathcal {B}_{\mathsf {G\text {-}SO}}},&\mathfrak {D}_\lambda ,\lambda ):=\\&\mathbf {Adv}\big ({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}^{\mathcal {B}_{\mathsf {G\text {-}SO}}}_{{\mathsf {PKE}},0,z},{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}^{\mathcal {B}_{\mathsf {G\text {-}SO}}}_{{\mathsf {PKE}},1,z}\big ). \end{aligned}$$

\({\mathsf {PKE}}\) is \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_z\) secure w.r.t. \(\mathcal {F}\) if \(\mathbf {Adv}^{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},z}({\mathcal {B}_{\mathsf {G\text {-}SO}}},\mathfrak {D}_\lambda ,\lambda )\) is negligible for all \({\mathsf {PPT}}\) adversaries \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\). \({\mathsf {PKE}}\) is \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}\) secure w.r.t. \(\mathcal {F}\) if \({\mathsf {PKE}}\) is \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_z\) secure w.r.t. \(\mathcal {F}\) for all \(z\in [n]\).

We have \(\mathbf {Adv}_{{\mathsf {PKE}},z}^{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}({\mathcal {B}_{\mathsf {G\text {-}SO}}},\mathfrak {D}_\lambda ,\lambda )=0\) for \(z\in [z'+1,n]\) since \(V_z=\emptyset \) for such z.

Fig. 8. \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\)’s interface in game \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},b,z}\).

Theorem 19

Let \({\mathsf {PKE}}\) be \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}\) secure w.r.t. a family \(\mathcal {F}\) of efficiently resamplable and G-induced distributions over \(\mathcal {M}^n\). Then \({\mathsf {PKE}}\) is \({\mathsf {IND\text {-}SO\text {-}CPA}}\) secure w.r.t. \(\mathcal {F}\).

Proof

Again, the main idea is that connected components can be dealt with independently. We give a hybrid argument over the connected components of \(G_\lambda \) using \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_z\) security for switching connected component z from sampled to resampled. See Fig. 9 for code of \(\textsc {Challenge}\) in hybrid \({\mathsf {H}}_{z}\); every other procedure stays as in \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},b}\) (cf. Fig. 2).

Fig. 9. Hybrid \({\mathsf {H}}_{z}\). The first z connected components are already resampled conditioned on opening queries, while the rest remain sampled.

Note that \({\mathsf {H}}_{0}\) is identical to game \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},0}\) and \({\mathsf {H}}_{z'}\) is identical to \({\mathsf {IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},1}\). Thus

$$ \mathbf {Adv}^{\mathsf {IND\text {-}SO\text {-}CPA}}_{\mathsf {PKE}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda ) = \mathbf {Adv}\big ({\mathsf {H}}_{0}^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{z'}^{\mathcal {A}_{\mathsf {SO}}}\big ) \le \sum _{z=0}^{z'-1}\mathbf {Adv}\big ({\mathsf {H}}_{z}^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{z+1}^{\mathcal {A}_{\mathsf {SO}}}\big ). $$

We proceed with the following lemma.

Lemma 20

For every adversary \({\mathcal {A}_{\mathsf {SO}}}\) distinguishing hybrids \({\mathsf {H}}_{z}\) and \({\mathsf {H}}_{z+1}\) there exists an adversary \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) run in game \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},z+1}\) with roughly the running time of \({\mathcal {A}_{\mathsf {SO}}}\) plus one execution of \({\mathsf {Resamp}}\) such that

$$\begin{aligned} \mathbf {Adv}\big ({\mathsf {H}}_{z}^{\mathcal {A}_{\mathsf {SO}}},{\mathsf {H}}_{z+1}^{\mathcal {A}_{\mathsf {SO}}}\big ) \le \mathbf {Adv}^{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},z+1}({\mathcal {B}_{\mathsf {G\text {-}SO}}},\mathfrak {D}_\lambda ,\lambda ). \end{aligned}$$

Proof

We construct an adversary \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) that interpolates between hybrids \({\mathsf {H}}_{z}\) and \({\mathsf {H}}_{z+1}\) for \({\mathcal {A}_{\mathsf {SO}}}\). \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) proceeds as follows (cf. Fig. 10).

\({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) forwards \( pk \) to \({\mathcal {A}_{\mathsf {SO}}}\). On \({\mathcal {A}_{\mathsf {SO}}}\)’s call of \(\textsc {Enc}\), \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) calls \(\textsc {Enc}_{{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{z+1}}\) to obtain an encryption \(\mathbf {c}_{V_{z+1}}\) of the messages in component \(V_{z+1}\). \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) samples messages \(\mathbf {m}^0\leftarrow \mathfrak {D}\) on its own and encrypts the messages in \(\overline{V_{z+1}}\). \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) sends \(\mathbf {c}=(\mathbf {c}_1,\ldots ,\mathbf {c}_n)\) to \({\mathcal {A}_{\mathsf {SO}}}\). \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) answers opening queries on its own unless they occur on \(V_{z+1}\), where it invokes its \(\textsc {Open}_{{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{z+1}}\) oracle to answer. On \(\textsc {Challenge}\), \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) receives a challenge message vector \(\mathbf {m}_{V_{z+1}}\) by calling \(\textsc {Challenge}_{{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{z+1}}\) and resamples \(\mathbf {m}^1\) conditioned on \(\mathcal {I}\). \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) returns resampled messages \(\mathbf {m}^1\) for \(\bigcup _{j=1}^z V_j\), its challenge messages \(\mathbf {m}_{V_{z+1}}\), and sampled messages \(\mathbf {m}^0\) for \(\bigcup _{j=z+2}^nV_j\) to \({\mathcal {A}_{\mathsf {SO}}}\). \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) outputs whatever \({\mathcal {A}_{\mathsf {SO}}}\) outputs.
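As with \({\mathcal {B}_{\mathsf {mult}}}\), a schematic Python sketch may clarify the flow (cf. Fig. 10); all names are hypothetical placeholders, `components` holds \(V_1,\ldots ,V_{z'}\) as a 0-indexed list, and self-answered opening queries are elided:

```python
def run_B_G_SO(z, components, sample_D, resamp, enc_own,
               game_enc, game_open, game_challenge, adversary):
    """Schematic flow of B_G-SO interpolating between H_z and H_{z+1}.

    Everything B_G-SO needs is injected as a callable; opening queries on
    V_{z+1} are delegated to the game's Open oracle, all others would be
    answered with self-chosen randomness (elided here).
    """
    V_target = components[z]                  # V_{z+1} in the paper's count
    m0 = sample_D()                           # own sample m^0 <- D
    c = enc_own(m0, except_positions=V_target)
    c.update(game_enc())                      # ciphertexts for V_{z+1}
    I = adversary.opening_phase(c, delegate=game_open, where=V_target)
    m_target = game_challenge()               # sampled (b=0) / resampled (b=1)
    m1 = resamp(m0, fixed=I)                  # resample conditioned on I
    reply = {}
    for j, V in enumerate(components):
        src = m1 if j < z else (m_target if j == z else m0)
        reply.update({i: src[i] for i in V})
    return adversary.finalize(reply)          # forward A_SO's output
```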

Obviously, \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) simulates the hybrids correctly during \(\textsc {Enc}\) since it always returns encryptions of sampled messages. On \({\mathcal {A}_{\mathsf {SO}}}\)’s call of \(\textsc {Challenge}\), the messages in the first z connected components are already resampled while the messages in the last \(n-z-1\) connected components are sampled, exactly as in hybrids \({\mathsf {H}}_{z}\) and \({\mathsf {H}}_{z+1}\). When \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) is run in game \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},0,z+1}\), it obtains sampled messages for the \((z+1)\)-th connected component; thus it runs \({\mathcal {A}_{\mathsf {SO}}}\) in hybrid \({\mathsf {H}}_{z}\). When run in \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},1,z+1}\), \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) receives resampled messages for \(V_{z+1}\) and hence runs \({\mathcal {A}_{\mathsf {SO}}}\) in hybrid \({\mathsf {H}}_{z+1}\). Thus

$$\begin{aligned} \Pr [{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}^{\mathcal {B}_{\mathsf {G\text {-}SO}}}_{{\mathsf {PKE}},0,z+1}\Rightarrow 1]&=\Pr [{\mathsf {H}}_{z}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1] ~\text {and}\\ \Pr [{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}^{\mathcal {B}_{\mathsf {G\text {-}SO}}}_{{\mathsf {PKE}},1,z+1}\Rightarrow 1]&=\Pr [{\mathsf {H}}_{z+1}^{\mathcal {A}_{\mathsf {SO}}}\Rightarrow 1]. \end{aligned}$$

Lemma 20 follows.    \(\square \)

Fig. 10. Reduction run by \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) to simulate \({\mathsf {H}}_{z}\) (or \({\mathsf {H}}_{z+1}\)) when \({\mathcal {B}_{\mathsf {G\text {-}SO}}}\) is run in \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},0,z+1}\) (or \({\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},1,z+1}\)).

We obtain

$$ \mathbf {Adv}^{\mathsf {IND\text {-}SO\text {-}CPA}}_{\mathsf {PKE}}({\mathcal {A}_{\mathsf {SO}}},\mathfrak {D}_\lambda ,\lambda ) \le \sum _{z=1}^{z'}\mathbf {Adv}^{\mathsf {G\text {-}IND\text {-}SO\text {-}CPA}}_{{\mathsf {PKE}},z}({\mathcal {B}_{\mathsf {G\text {-}SO}}},\mathfrak {D}_\lambda ,\lambda ) $$

and Theorem 19 follows immediately since \(z'\le n\).    \(\square \)

In particular, we achieve versions of Theorem 8, Corollary 14 and Theorem 15 for disconnected graphs, where

$$ S(G)=\sum _{i=1}^{z'}S(C_i)\quad \text {and}\quad B(G)=\max _{i\in [z']}\{B(C_i)\} $$

for a graph G consisting of connected components \(C_1,\ldots , C_{z'}\).

Moreover, for \(G=([n],\emptyset )\), G-induced distributions become product distributions, i.e. the messages are sampled independently. Hence, the positive result of [3] can be seen as a special case of Theorem 19.