Public Bayesian Persuasion: Being Almost Optimal and Almost Persuasive

We study algorithmic Bayesian persuasion problems in which the principal (a.k.a. the sender) has to persuade multiple agents (a.k.a. receivers) by using public communication channels. Specifically, our model follows the multi-receiver model with no inter-agent externalities introduced by Arieli and Babichenko (J Econ Theory 182:185–217, 2019). It is known that the problem of computing a sender-optimal public persuasive signaling scheme is not approximable even in simple settings. Therefore, prior works usually focus on determining restricted classes of the problem for which efficient approximation is possible. Typically, positive results in this space amount to finding bi-criteria approximation algorithms yielding an almost optimal and almost persuasive solution in polynomial time. In this paper, we take a different perspective and study the persuasion problem in the general setting where the space of the states of nature, the action space of the receivers, and the utility function of the sender can be arbitrary. We fully characterize the computational complexity of computing a bi-criteria approximation of an optimal public signaling scheme in such settings. In particular, we show that, assuming the Exponential Time Hypothesis, solving this problem requires at least a quasi-polynomial number of steps even in instances with simple utility functions and binary action spaces such as an election with the k-voting rule. In doing so, we prove that a relaxed version of the Maximum Feasible Subsystem of Linear Inequalities problem requires at least quasi-polynomial time to be solved. Finally, we close the gap by providing a quasi-polynomial time bi-criteria approximation algorithm for arbitrary public persuasion problems that, under mild assumptions, yields a QPTAS.


Introduction
In many real-life strategic interactions, agents rely on information revealed by an exogenous entity to make decisions. The latter acts as an informed principal whose goal is to shape the agents' beliefs so as to achieve a desired outcome. In this context, deciding what information to reveal amounts to an information structure design problem. When information is incomplete, the information structure determines "which agent gets to know what" about the current state of the environment (i.e., the parameters determining payoff functions). There has been a recent surge of interest in the study of how an informed principal may steer agents' collective behavior towards a favorable outcome. The study of these problems has been largely driven by their application in various domains such as auctions and online advertisement [1–3], voting [4,5], traffic routing [6,7], recommendation systems [8], security [9–11], and marketing [12,13].
Persuasion is the task faced by an informed principal, which we call the sender, who tries to influence the behavior of the self-interested agent(s) (i.e., the receivers) taking part in a strategic interaction. The sender faces the algorithmic problem of determining the optimal information structure to achieve her objectives. A solution to this problem is described through the notion of signaling scheme, which is a mapping from the sender's observations to the space of probability distributions over the set of available signals. A foundational model describing the persuasion problem is the Bayesian persuasion framework (BP) introduced by Kamenica and Gentzkow in [14]. That model describes a setting with a sender and a single receiver. There is a set of parameters influencing the payoff functions of the sender and of the receiver. These parameters are collectively called the state of nature, and model exogenous stochasticity in the environment. The sender and the receiver share a common prior over the possible states of nature. However, only the sender gets to observe the realized state of nature, which is drawn according to the shared prior probability distribution. This creates a fundamental asymmetry in the information available to the two agents. The sender can exploit this additional knowledge to steer the receiver's actions towards a favorable outcome. Specifically, the action selected by the receiver is the best action available under her current posterior distribution, which is updated in a classical Bayesian fashion after observing the sender's signal. Therefore, the prior distribution together with the sender's signaling scheme determines the receiver's equilibrium behavior. We observe that the BP framework assumes the sender's commitment power, which is a natural assumption in many settings (see, e.g., the arguments by [14,15]). One argument to that effect is that reputation and credibility may be a key factor for the long-term utility of the sender [16].
In many practical scenarios, the sender may need to persuade multiple receivers, revealing information to each one of them. In the multi-receiver setting, it is useful to make a distinction between private and public signaling schemes. In the former setting, the sender may reveal different information to each receiver through private communication channels. In the latter, which is the focus of this paper, the sender has to reveal the same information to all receivers. Public persuasion is well suited for settings where private communication channels are either too costly or impractical. This is the case in scenarios with a large population of receivers, such as elections, and scenarios where receivers may share private information, which are frequent in practice.
In our paper, we adopt and generalize the multi-agent persuasion model introduced by Arieli and Babichenko in [18], which rules out the possibility of inter-agent externalities. Specifically, each receiver's utility depends only on her action and on the realized state of nature, but not on the actions of other receivers. This assumption allows one to focus on the key problem of coordinating the receivers' behavior, without the additional complexity arising from externalities, which have been shown to make the problem largely intractable [6,19]. Previous works on Arieli and Babichenko's model either address the private persuasion setting [18,20,21] or make some structural assumptions which render them special cases of our model [22]. To the best of our knowledge, this is the first work which generalizes the model by Arieli and Babichenko to settings with arbitrary spaces of states of nature, arbitrary receivers' action spaces, and arbitrary sender's utility functions. The generality of our setting raises a number of technical difficulties with respect to previous works on the same model. Our solution to these challenges is a first step towards actionable persuasion models that can be applied to real-world multi-receiver problems without structural restrictions.

Context: Persuasion with Multiple Receivers
Dughmi and Xu [23] were the first to analyze Bayesian persuasion from a computational perspective, focusing on the single-receiver case. In [18], Arieli and Babichenko introduce the model of persuasion with multiple receivers and without inter-agent externalities, with a focus on private Bayesian persuasion. In particular, they study the setting with a binary action space for the receivers and a binary space of states of nature. They provide a characterization of the optimal signaling scheme in the case of supermodular, anonymous submodular, and supermajority sender's utility functions. In [20], Babichenko and Barman extend the work by Arieli and Babichenko, providing a tight (1 − 1/e)-approximate signaling scheme for monotone submodular sender's utilities and showing that an optimal private scheme for anonymous utility functions can be found efficiently. In [21], Dughmi and Xu generalize the previous model to settings with an arbitrary number of states of nature.
When considering the problem of designing public persuasive signaling schemes, some previous works study scenarios with inter-agent externalities by making some structural assumptions on the nature of the strategic interaction. For instance, Bhaskar et al. [6] and Rubinstein [19] study public signaling problems in which two receivers play a zero-sum game. In particular, Bhaskar et al. rule out an additive PTAS assuming hardness of the planted clique problem. The setting studied by Bhaskar et al. [6] and Rubinstein [19] is fundamentally different from ours. In their setting the game can be compactly represented through its normal-form representation, and the complexity of the problem lies in handling externalities among players. On the other hand, in our setting, a compact normal-form representation is not possible since it is exponential in the (arbitrary) number of receivers.
Moreover, Rubinstein proves that the problem of computing an ε-optimal signaling scheme requires at least quasi-polynomial time assuming the Exponential Time Hypothesis (ETH). This result is tight due to the quasi-polynomial approximation scheme proposed by Cheng et al. [5].
A number of previous works focus on the public signaling problem in the no inter-agent externalities framework of Arieli and Babichenko. In particular, Dughmi and Xu [21] rule out the existence of a PTAS even when receivers have binary action spaces and objectives are linear, unless P = NP. For this reason, most of the following works focus on the computation of bi- or tri-criteria approximations in which the persuasion constraints can be violated by a small amount. In [5], Cheng et al. describe a polynomial-time tri-criteria approximation algorithm for k-voting scenarios. The work of [5] on k-voting is related to the voting problem that we study in this paper. However, while we relax the problem by allowing approximately optimal and approximately persuasive signaling schemes, [5] also considers a third type of relaxation. In particular, they consider a relaxed sender's utility function in which fewer than k votes are sufficient to win the election. This third relaxation is necessary to provide a PTAS for the problem, while we show that without relaxing the utility function the problem requires at least quasi-polynomial time. In [22], Xu studies public persuasion with binary action spaces and an arbitrary number of states of nature, showing that no bi-criteria FPTAS is possible, unless P = NP. Furthermore, the author proposes a bi-criteria PTAS for monotone submodular sender's utility functions and shows that, when the number of states of nature is fixed and a non-degeneracy assumption holds, an optimal signaling scheme can be computed in polynomial time.

Our Results and Techniques
We provide a tight characterization of the complexity of computing bi-criteria approximations of optimal public signaling schemes in arbitrary persuasion problems with n receivers and no inter-agent externalities.

Impossibility result
Previous works studying the same model (i.e., one with no inter-agent externalities and public signaling schemes) exploit specific structures of the sender's utility functions to provide optimal or approximate polynomial-time algorithms. We show that the complexity of the approximation problem shifts from poly-time to quasi-poly-time when the utility function of the sender can be arbitrary. Indeed, we show that the positive results by Xu [22], which assume noise-stability of the sender's utility function, cannot be extended to the case of arbitrary sender's utility functions. Specifically, we argue that no polynomial-time bi-criteria approximation algorithm is possible in general settings. This is shown by reasoning over simple k-voting instances, which are per se an interesting application scenario of Bayesian persuasion (see, e.g., the work by Castiglioni et al. [24]), and are sufficient to extend the result to the general case of arbitrary sender's utility functions. In addressing this result, we follow a different approach from that used in [22]. Specifically, we cannot hope for a 'standard' NP-hardness result because there exist quasi-polynomial time bi-criteria approximation algorithms (Theorem 2). Therefore, by assuming ETH, we show that it is unlikely that there exists a bi-criteria polynomial-time approximation algorithm even in instances with simple utility functions and a fixed space of actions. Let n be the size of the instance in input. Our main impossibility result reads as follows.
Theorem 1 Assuming ETH, there exists a constant $\epsilon^* > 0$ such that, for any $0 < \epsilon \le \epsilon^*$, finding a signaling scheme that is ε-persuasive and α-approximate requires time $n^{\tilde{\Omega}(\log n)}$ for any multiplicative or additive factor α ∈ (0, 1), even with binary action spaces.
The proof of this result requires an intermediate step that is of independent interest and of general applicability. Specifically, we study a slight variation of the Maximum Feasible Subsystem of Linear Inequalities problem (ε-MFS) [5], where, given a linear system $A x \ge 0$ with $A \in [-1, 1]^{n_{row} \times n_{col}}$, we look for the vector $x \in \Delta_{n_{col}}$ almost (i.e., except for an additive factor ε) satisfying the highest number of inequalities (Definition 4). This is a constrained variant of the Max FLS problem previously studied by Amaldi and Kann in [25], and it is commonly used in scheduling [26], signaling, and mechanism design [5]. In Sect. 5, we prove that solving ε-MFS requires at least a quasi-polynomial number of steps assuming ETH. The proof is based on a reduction from two-prover games [27,28]. Then, equipped with the result on ε-MFS, we focus on a simple public persuasion problem where the receivers are voters, and they have a binary action space since they must choose one of two candidates. In Sect. 6, we prove a hardness result (Theorem 8) for this setting which directly implies Theorem 1. We show that the ε-MFS problem is deeply connected to the problem of computing 'good' posteriors, as finding an optimal x in ε-MFS maps to the problem of finding an ε-persuasive posterior, which is equivalent to determining an ε-persuasive signaling scheme.

Positive result
In order to design an approximation algorithm (in the multiplicative sense), we resort to the assumption of α-approximable utility functions for the sender, as previously defined by Xu in [22]. An α-approximable sender's utility function is such that it is possible to obtain in polynomial time a tie-breaking for the receivers guaranteeing to the sender an α-approximation of the optimal objective value. The α-approximability condition is a natural minimal requirement since, otherwise, even the problem of evaluating the sender's objective function for a given posterior over the states of nature would not be tractable. When the sender's utility function is α-approximable, there is no hope for a better approximation than an α-approximate signaling scheme. The following theorem, presented in Sect. 7, shows that it is possible to compute, in quasi-polynomial time, a bi-criteria approximation with a factor arbitrarily close to α.
Therefore, our approximation algorithm guarantees the best possible factor on the objective value, and an arbitrarily small loss in persuasiveness. For 1-approximable functions, Theorem 2 yields a bi-criteria QPTAS. In the setting of Xu [22] (i.e., binary action spaces and state-independent sender's utility function), our result directly yields a QPTAS for any monotone sender's utility function. In order to prove the result, we show that any posterior can be represented as a convex combination of k-uniform posteriors with only a small loss in the objective value. By restricting our attention to the set of k-uniform posteriors, which has a quasi-polynomial size, the problem can be solved via a linear program of quasi-polynomial size.

Preliminaries
This section describes the instantiation of the Bayesian persuasion framework which is the focus of this work (Sect. 2.1), public signaling problems (Sect. 2.2), and the notion of bi-criteria approximation which we adopt (Sect. 2.3). For a comprehensive overview of the Bayesian persuasion framework we refer the reader to [15], [29], and [30].

Basic Model
Our model is a generalization of the framework introduced by Arieli and Babichenko in [18], that is, multi-agent persuasion with no inter-agent externalities. We adopt the perspective of a sender facing a finite set of receivers $R := [n]$. Each receiver $r \in R$ has a finite set of $\varrho^r$ actions $A^r := \{a_i\}_{i=1}^{\varrho^r}$. Each receiver's payoff depends only on the action she takes and on a (random) state of nature θ, drawn from a finite set $\Theta := \{\theta_i\}_{i=1}^{d}$ of cardinality d. In particular, receiver r's utility is given by the function $u^r : A^r \times \Theta \to [0, 1]$. The utility of each receiver does not depend on other receivers' actions because of the no inter-agent externalities assumption. We denote by $u^r_\theta(a^r) \in [0, 1]$ the utility observed by receiver r when the state of nature is θ and she plays $a^r$. Let $A := \times_{r \in R} A^r$ be the set of joint receivers' actions. An action profile (i.e., a tuple specifying an action for each receiver) is denoted by $a = (a^r)_{r=1}^{n} \in A$. The sender's utility, when the state of nature is θ, is given by the function $f_\theta : A \to [0, 1]$. We write $f_\theta(a)$ to denote the sender's payoff when the receivers behave according to action profile a and the state of nature is θ. As is customary in Bayesian persuasion, we assume $f_\theta$ can be represented succinctly, that is, without explicitly describing the function through its (exponentially many) input-output pairs. As an example, the reader can refer to Eq. 3, where it is possible to compute the sender's payoff by reasoning on the structure of the action profile at hand.
The state of nature θ is drawn from a common prior distribution $\mu \in \mathrm{int}(\Delta_\Theta)$, which is explicitly known to the sender and the receivers. Moreover, the sender can publicly commit to a policy φ (i.e., a signaling scheme, see Sect. 2.2) which maps states of nature to signals for the receivers. A generic signal for receiver r is denoted by $s^r$, while the set of available signals to each receiver r is denoted by $S^r$. The interaction between the sender and the receivers goes as follows:
1. The sender commits to a publicly known signaling scheme φ;
2. The sender observes the realized state of nature θ ∼ μ;
3. The sender draws a signal $s^r$ for each receiver according to the signaling scheme $\phi_\theta$, and communicates to each receiver r the signal $s^r$;
4. Each receiver r observes $s^r$ and updates her prior beliefs over Θ following Bayes rule. Then, each receiver r selects an action $a^r \in A^r$ maximizing her expected reward.
Let $a = (a^1, \ldots, a^n) \in A$ be the tuple of the receivers' choices. Then each receiver r gets payoff $u^r_\theta(a^r)$, and the sender observes payoff $f_\theta(a)$. This work focuses on the specific setting in which φ is a public signaling scheme. We give more details on the structure of public signaling schemes in the following section.

Public Signaling Schemes
A signal profile is a tuple $s = (s^r)_{r=1}^{n} \in S$ specifying a signal for each receiver, where $S := \times_{r \in R} S^r$. A public signaling scheme is a function $\phi : \Theta \to \Delta_S$ mapping states of nature to probability distributions over signal profiles, with the constraint that each receiver has to receive the same signal, that is, for any θ and $s \sim \phi_\theta$, it holds $s^r = s^{r'}$ for each pair of receivers $r, r'$. With an overload of notation we write s ∈ S to denote the public signal received by all receivers. The probability with which the sender selects s after observing θ is denoted by $\phi_\theta(s)$. Thus, it holds $\sum_{s \in S} \phi_\theta(s) = 1$ for each θ ∈ Θ.
After observing s ∈ S, receiver r performs a Bayesian update and infers a posterior belief $p \in \Delta_\Theta$ over the states of nature. Specifically, the realized state of nature is θ with probability
$$p_\theta := \frac{\mu_\theta \, \phi_\theta(s)}{\sum_{\theta' \in \Theta} \mu_{\theta'} \, \phi_{\theta'}(s)}.$$
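As a minimal sketch of the Bayesian update performed by the receivers, the following Python snippet computes the common posterior induced by a public signal. The dictionary-based encoding of the prior and of the scheme, the function name `posterior`, and the two-state instance are illustrative assumptions and are not part of the model.

```python
from fractions import Fraction


def posterior(prior, phi, signal):
    """Posterior over states after observing a public signal.

    prior: dict state -> prior probability mu_theta
    phi:   dict state -> dict signal -> probability phi_theta(signal)
    """
    # Joint probability of (state, signal) for every state.
    joint = {t: prior[t] * phi[t].get(signal, 0) for t in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("signal has zero probability under the scheme")
    return {t: joint[t] / total for t in prior}


# Hypothetical two-state instance (assumption, for illustration only).
prior = {"t1": Fraction(1, 2), "t2": Fraction(1, 2)}
phi = {
    "t1": {"s": Fraction(3, 4), "s2": Fraction(1, 4)},
    "t2": {"s": Fraction(1, 4), "s2": Fraction(3, 4)},
}
p = posterior(prior, phi, "s")
```

Exact rationals are used so that the posterior sums to one without rounding error.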
Since the prior is common knowledge and all receivers observe the same s, they all perform the same Bayesian update and have the same posterior belief regarding the realized state of nature. After computing p, since the problem is without inter-agent externalities, each receiver solves a disjoint single-agent decision problem to find the action maximizing her expected utility. A signaling scheme is direct when signals can be mapped to actions of the receivers, and interpreted as action recommendations. Each receiver is sent the same signal s ∈ A specifying a (possibly different) action for each receiver, that is, the set of possible signals is S = A. Moreover, a signaling scheme is persuasive if following the recommendations is an equilibrium of the underlying Bayesian game [31,32]. A direct signaling scheme is persuasive if the sender's action recommendation to each receiver r belongs to $\arg\max_{a \in A^r} \sum_{\theta \in \Theta} p_\theta \, u^r_\theta(a)$. A simple revelation-principle style argument shows that there always exists an optimal public signaling scheme which is both direct and persuasive [17,22]. A signal in a direct signaling scheme can be equivalently expressed as an action profile a ∈ A. It is easy to see that there is an exponential number of such signals. We write $\phi_\theta(a)$ to denote the probability with which the sender selects s = a when the realized state of nature is θ. The problem of determining an optimal public signaling scheme which is direct and persuasive can be formulated with the following (exponentially sized) linear program (LP):
$$\max_{\phi \ge 0} \ \sum_{\theta \in \Theta} \mu_\theta \sum_{a \in A} \phi_\theta(a) \, f_\theta(a) \tag{1a}$$
$$\text{s.t.} \ \sum_{\theta \in \Theta} \mu_\theta \, \phi_\theta(a) \left( u^r_\theta(a^r) - u^r_\theta(a') \right) \ge 0 \quad \forall r \in R, \ \forall a \in A, \ \forall a' \in A^r \tag{1b}$$
$$\sum_{a \in A} \phi_\theta(a) = 1 \quad \forall \theta \in \Theta \tag{1c}$$
where $a = (a^r)_{r=1}^{n} \in A$. The sender's goal is computing the signaling scheme maximizing her expected utility (Objective Function (1a)). Constraints (1b) force the public signaling scheme to be persuasive.

Bi-criteria Approximation
Let ε ∈ [0, 1]. We say that a public signaling scheme is ε-persuasive if the following holds for any $r \in R$, $a \in A$, and $a' \in A^r$:
$$\sum_{\theta \in \Theta} \mu_\theta \, \phi_\theta(a) \left( u^r_\theta(a^r) - u^r_\theta(a') \right) \ge -\epsilon \sum_{\theta \in \Theta} \mu_\theta \, \phi_\theta(a). \tag{2}$$
Throughout the paper, we focus on the computation of approximately optimal signaling schemes. Let Opt be the optimal value of LP (1), i.e., the best expected revenue that the sender can reach under public persuasion constraints. For each state of nature θ, $f_\theta$ is a non-negative function, and we have that Opt ≥ 0. When a signaling scheme yields an expected sender utility of at least α Opt, with α ∈ (0, 1], we say that the signaling scheme is α-approximate (that is, approximate in the multiplicative sense). When a signaling scheme yields an expected sender utility of at least Opt − α, with α ∈ [0, 1), we say that the scheme is α-optimal (that is, approximate in the additive sense).
Finally, we consider approximations which relax both the optimality and the persuasiveness constraints. When a signaling scheme is both ε-persuasive and α-approximate (or α-optimal), we say it is a bi-criteria approximation. We say that one such signaling scheme is (α, ε)-persuasive.
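The two criteria can be evaluated directly for any explicitly given direct public scheme. The sketch below returns the sender's expected utility together with the smallest ε for which the scheme is ε-persuasive, under the assumption that the violation is measured on the posterior induced by each recommended profile. The function name `evaluate_scheme`, the data layout, and the single-receiver instance are hypothetical and only meant to illustrate the trade-off between optimality and persuasiveness.

```python
from fractions import Fraction


def evaluate_scheme(prior, phi, u, f):
    """Return (sender_value, eps) for a direct public scheme.

    prior: dict state -> probability
    phi:   dict state -> dict action_profile (tuple) -> probability
    u:     list (one per receiver) of dict state -> dict action -> utility
    f:     function (state, action_profile) -> sender utility
    eps is the largest expected gain a receiver obtains by deviating from
    her recommendation, under the posterior induced by that recommendation
    (assumed normalization of the persuasiveness violation).
    """
    value = sum(prior[t] * q * f(t, a)
                for t in prior for a, q in phi[t].items())
    eps = Fraction(0)
    for a in {a for t in prior for a in phi[t]}:
        mass = sum(prior[t] * phi[t].get(a, 0) for t in prior)
        if mass == 0:
            continue  # profile never recommended
        post = {t: prior[t] * phi[t].get(a, 0) / mass for t in prior}
        for r, recommended in enumerate(a):
            def expected(action):
                return sum(post[t] * u[r][t][action] for t in prior)
            actions = next(iter(u[r].values())).keys()
            best = max(expected(action) for action in actions)
            eps = max(eps, best - expected(recommended))
    return value, eps


# Hypothetical instance: one receiver, two states, sender wants "a0".
prior = {"g": Fraction(1, 3), "b": Fraction(2, 3)}
u = [{"g": {"a0": 1, "a1": 0}, "b": {"a0": 0, "a1": 1}}]
f = lambda state, profile: 1 if profile[0] == "a0" else 0

persuasive = {"g": {("a0",): Fraction(1)},
              "b": {("a0",): Fraction(1, 2), ("a1",): Fraction(1, 2)}}
aggressive = {"g": {("a0",): Fraction(1)},
              "b": {("a0",): Fraction(3, 4), ("a1",): Fraction(1, 4)}}
```

In this toy instance the second scheme recommends the sender's preferred action more often, which raises her utility from 2/3 to 5/6 at the cost of a persuasiveness violation of 1/5.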

An Application: Persuasion in Voting Problems
In order to clarify the framework we just described, we present a simple example of a possible application of public Bayesian persuasion with no inter-agent externalities. This example is going to be useful in the remainder of the paper.
In an election with the k-voting rule, candidates are elected if they receive at least k ∈ [n] votes (see Brandt et al. [33] for further details). In this setting, a sender (e.g., a politician or a lobbyist) may send signals to the voters on the basis of her private information, which is hidden from them. After observing the sender's signal, each voter (i.e., one of the receivers) chooses one among the set of candidates.
In the following, we will employ instances of k-voting in which receivers have to choose one of two candidates. Then, they have a binary action space with actions $a_0$ and $a_1$ corresponding to choosing the first or the second candidate, respectively. Each receiver r has utility $u^r_\theta(a) \in [0, 1]$ for each $a \in \{a_0, a_1\}$, where θ ∈ Θ. The sender's preferred candidate is the one corresponding to action $a_0$. Therefore, her objective is maximizing the probability that $a_0$ receives at least k votes. Formally, the sender's utility function is such that $f_\theta = f$ for each θ, and
$$f(a) := \begin{cases} 1 & \text{if } |\{ r \in R : a^r = a_0 \}| \ge k \\ 0 & \text{otherwise.} \end{cases} \tag{3}$$
Moreover, let $W : \Delta_\Theta \to \mathbb{N}_0^+$ be a function returning, for a given posterior distribution $p \in \Delta_\Theta$, the number of receivers such that $\sum_{\theta} p_\theta \left( u^r_\theta(a_0) - u^r_\theta(a_1) \right) \ge 0$, i.e., the number of voters that will vote for $a_0$ with a persuasive signaling scheme. Analogously, $W_\epsilon(p)$ is the number of receivers for which $\sum_{\theta} p_\theta \left( u^r_\theta(a_0) - u^r_\theta(a_1) \right) \ge -\epsilon$, i.e., the number of voters that will vote for $a_0$ with an ε-persuasive signaling scheme. In the above voting setting, we refer to the problem of finding an ε-persuasive signaling scheme which is also α-approximate (or α-optimal) as (α, ε)-k-voting. To further clarify this election scenario, we provide the following simple example, adapted from Castiglioni et al. [24].
Example 1 There are three voters R = {1, 2, 3} who must select one of two candidates $\{a_0, a_1\}$. The sender (e.g., a politician or a lobbyist) observes the realized state of nature, drawn from the uniform probability distribution (1/3, 1/3, 1/3) over Θ = {A, B, C}, and exploits this information to support the election of $a_0$. The state of nature describes the position of $a_0$ on a matter of particular interest to the voters. Moreover, all the voters have a slightly negative opinion of candidate $a_1$, independently of the state of nature, while the opinion on candidate $a_0$ can be better or worse than the opinion on $a_1$ depending on the state of nature. Table 1 describes the utility of the three voters.
We consider a k-voting rule with k = 2. Without any form of signaling, all the voters would vote for $a_1$ because it provides an expected utility of −1/4, against −1/3, and the sender would get a utility of 0. If the sender discloses all the information regarding the state of nature (i.e., with a fully informative signal), the sender would still get a utility of 0, since two out of three receivers would pick $a_1$ in each of the possible states. However, the sender can design a public signaling scheme guaranteeing herself a utility of 1 for each state of nature. Table 2 describes one such scheme with arbitrary signals. Suppose the observed state is A, and that the signal sent by the sender is "not B". Then, the posterior distribution over the states of nature is (1/2, 0, 1/2). Therefore, receiver 1 and receiver 3 would vote for $a_0$ since their expected utility would be 0 against −1/4. Similarly, for any other signal, two receivers vote for $a_0$. Then, the sender's expected payoff is 1. We can recover an equivalent direct signaling scheme by sending a tuple with a candidate recommendation for each voter. For example, "not A" would become $(a_1, a_0, a_0)$, and each voter would observe the recommendations given to the others.
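The numbers quoted in Example 1 can be checked with a short script. Since Tables 1 and 2 are not reproduced here, the utility table and the signaling scheme below are assumptions chosen to match the figures in the text: each voter is assumed to get utility 1 from $a_0$ in exactly one state and −1 in the other two, utility −1/4 from $a_1$ in every state, and the sender is assumed to announce one of the two non-realized states with equal probability.

```python
from fractions import Fraction as F

STATES = ["A", "B", "C"]
PRIOR = {t: F(1, 3) for t in STATES}
# Assumed stand-in for Table 1: voter -> the only state in which she likes a0.
FAVOURITE = {1: "A", 2: "B", 3: "C"}
K = 2  # k-voting threshold


def u(voter, state, action):
    """Assumed voter utilities, consistent with the values in the text."""
    if action == "a1":
        return F(-1, 4)
    return F(1) if FAVOURITE[voter] == state else F(-1)


def votes_for_a0(post):
    """Voters weakly preferring a0 under posterior `post` (the set behind W(p))."""
    return [v for v in FAVOURITE
            if sum(post[t] * (u(v, t, "a0") - u(v, t, "a1")) for t in STATES) >= 0]


def sender_value(phi):
    """Probability that a0 collects at least K votes under public scheme phi."""
    total = F(0)
    for s in {s for t in STATES for s in phi[t]}:
        mass = sum(PRIOR[t] * phi[t].get(s, 0) for t in STATES)
        post = {t: PRIOR[t] * phi[t].get(s, 0) / mass for t in STATES}
        if len(votes_for_a0(post)) >= K:
            total += mass
    return total


no_info = {t: {"none": F(1)} for t in STATES}
full_info = {t: {t: F(1)} for t in STATES}
# Assumed stand-in for Table 2: announce "not X" for a non-realized state X.
partial = {t: {"not " + o: F(1, 2) for o in STATES if o != t} for t in STATES}
```

Under these assumptions, `sender_value` is 0 for both the uninformative and the fully informative scheme and 1 for the partially informative one, in line with the discussion in the example.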

Technical Toolkit
In this section, we summarize some key results previously studied in the literature that we will extensively use in the remainder of the paper. In particular, we describe some of the results on two-prover games by Babichenko et al. [34] and Deligkas et al. [28] (Sect. 4.1), and we describe a useful theorem on error-correcting codes due to Gilbert [35] (Sect. 4.2).

Two-Prover Games
A two-prover game G is a co-operative game played by two players (Merlin$_1$ and Merlin$_2$, respectively), and an adjudicator (verifier) called Arthur. At the beginning of the game, Arthur draws a pair of questions $(x, y) \in X \times Y$ according to a probability distribution D over the joint set of questions (i.e., $D \in \Delta_{X \times Y}$). Merlin$_1$ (resp., Merlin$_2$) observes x (resp., y) and chooses an answer $\xi_1$ (resp., $\xi_2$) from her finite set of answers $\Xi_1$ (resp., $\Xi_2$). Then, Arthur declares the Merlins to have won with a probability equal to the value of a verification function $V(x, y, \xi_1, \xi_2)$. A strategy for Merlin$_1$ is a function $\eta_1 : X \to \Xi_1$ mapping each possible question to an answer. Analogously, $\eta_2 : Y \to \Xi_2$ is a strategy of Merlin$_2$. Before the beginning of the game, Merlin$_1$ and Merlin$_2$ can agree on their pair of (possibly mixed) strategies $(\eta_1, \eta_2)$, but no communication is allowed during the game. The payoff of a game G to the Merlins under $(\eta_1, \eta_2)$ is defined as
$$u(G, \eta_1, \eta_2) := \mathbb{E}_{(x, y) \sim D} \left[ V(x, y, \eta_1(x), \eta_2(y)) \right].$$
The value of a two-prover game G, denoted by ω(G), is the maximum expected payoff to the Merlins when they play optimally:
$$\omega(G) := \max_{\eta_1, \eta_2} u(G, \eta_1, \eta_2).$$
A two-prover game is called a free game if D is a uniform probability distribution over $X \times Y$. This implies that there is no correlation between the questions sent to Merlin$_1$ and Merlin$_2$. It is possible to build a family of free games mapping to 3SAT formulas arising from Dinur's PCP theorem. We say that the size n of a formula ϕ is the number of variables plus the number of clauses in the formula. Moreover, SAT(ϕ) ∈ [0, 1] is the maximum fraction of clauses that can be satisfied in ϕ. With this notation, Dinur's PCP Theorem reads as follows:
Theorem 3 [Dinur's PCP Theorem [36]] Given any 3SAT instance ϕ of size n, and a constant ρ ∈ (0, 1/8), we can produce in polynomial time a 3SAT instance ϕ′ such that:

Each clause of ϕ′ contains exactly 3 variables, and every variable is contained in at most d clauses.
A 3SAT formula can be seen as a bipartite graph in which the left vertices are the variables, the right vertices are the clauses, and there is an edge between a variable and a clause whenever that variable appears in that clause. Such a bipartite graph has constant degree, because each clause has at most 3 variables and each variable is contained in at most d clauses. A useful result on bipartite graphs is the following.
Lemma 1 (Lemma 1 of Deligkas et al. [28]) Let (V, E) be a bipartite graph with |V| = n, where U and W are the two disjoint and independent sets such that V = U ∪ W (i.e., U and W are the two sides of the graph), and where each vertex has a degree at most ν. Suppose that U and W both have a constant fraction of the vertices, i.e., $|U| = c\,n$ and $|W| = (1 - c)\,n$ for some c ∈ [0, 1]. Then, we can efficiently find a partition $\{S_i\}_{i=1}^{\sqrt{n}}$ of U and a partition $\{T_j\}_{j=1}^{\sqrt{n}}$ of W, such that each set has a size of at most $2\sqrt{n}$, and for all i and j we have $|(S_i \times T_j) \cap E| \le 2\nu^2$.
Lemma 1 can be used to build the following free game.
Definition 1 (Definition 2 of Deligkas et al. [28]) Given a 3SAT formula ϕ of size n, we define a free game $F_\phi$ as follows:
1. Arthur applies Theorem 3 to obtain formula ϕ′ of size n · polylog(n);
2. let $m = \sqrt{n \cdot \mathrm{polylog}(n)}$. Arthur applies Lemma 1 to partition the variables of ϕ′ in sets $\{S_i\}_{i=1}^{m}$, and the clauses in sets $\{T_j\}_{j=1}^{m}$;
3. Arthur draws an index i uniformly at random from [m], and draws independently an index j uniformly at random from [m]. Then, he sends $S_i$ to Merlin$_1$ and $T_j$ to Merlin$_2$;
4. Merlin$_1$ responds by choosing a truth assignment for each variable in $S_i$, and Merlin$_2$ responds by choosing a truth assignment to every variable that is involved with a clause in $T_j$;
5. Arthur awards the Merlins payoff 1 if and only if the following conditions are both satisfied:
• Merlin$_2$'s assignment satisfies all clauses in $T_j$;
• the two Merlins' assignments are compatible, i.e., for each variable v appearing in $S_i$ and each clause in $T_j$ that contains v, Merlin$_1$'s assignment to v agrees with Merlin$_2$'s assignment to v;
Arthur awards payoff 0 otherwise.
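As a toy illustration of how the value of a free game is computed, the sketch below enumerates all pure strategy pairs of a very small game with uniformly distributed questions. Restricting to pure strategies is without loss here, since the expected payoff is linear in each Merlin's mixed strategy. The CHSH-style verification function is a standard textbook example and an assumption of this sketch; it is not the game $F_\phi$ of Definition 1, whose question and answer sets are far too large for brute force.

```python
from itertools import product


def game_value(X, Y, answers1, answers2, V):
    """Brute-force value of a free game (questions uniform over X x Y).

    A pure strategy is encoded as a tuple holding one answer per question.
    """
    best = 0.0
    for eta1 in product(answers1, repeat=len(X)):
        for eta2 in product(answers2, repeat=len(Y)):
            wins = sum(V(x, y, eta1[i], eta2[j])
                       for i, x in enumerate(X)
                       for j, y in enumerate(Y))
            best = max(best, wins / (len(X) * len(Y)))
    return best


# Toy verifier (assumption): the Merlins win iff the XOR of their answers
# equals the AND of the two questions.
def chsh(x, y, xi1, xi2):
    return 1 if (xi1 ^ xi2) == (x & y) else 0


value = game_value([0, 1], [0, 1], [0, 1], [0, 1], chsh)
```

The enumeration is exponential in the number of questions, which is why it only serves as an illustration of the definition.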
When computing the Merlins' rewards, the second condition is always satisfied when $S_i$ and $T_j$ share no variables. Moreover, when Merlin$_1$'s and Merlin$_2$'s assignments are not compatible, we say that they are in conflict. The following lemma shows that, if ϕ is unsatisfiable, then the value of the corresponding free game $F_\phi$ is bounded away from 1.
• OUTPUT: Yes-instances:
Finally, we will need to assume the Exponential Time Hypothesis (ETH), which conjectures that any deterministic algorithm solving 3SAT requires $2^{\Omega(n)}$ time.

Error-Correcting Codes
A message of length $k \in \mathbb{N}^+$ is encoded as a block of length $n \in \mathbb{N}^+$, with n ≥ k. A code is a mapping $e : \{0, 1\}^k \to \{0, 1\}^n$. Moreover, let dist(e(x), e(y)) be the relative Hamming distance between e(x) and e(y), which is defined as the Hamming distance weighted by 1/n. The rate of a code is defined as R := k/n. Finally, the relative distance dist(e) of a code e is the maximum value $d_{rel}$ such that dist(e(x), e(y)) ≥ $d_{rel}$ for each pair of distinct messages $x, y \in \{0, 1\}^k$.
In the following, we will need an infinite sequence of codes $E := \{e_k : \{0, 1\}^k \to \{0, 1\}^n\}_{k \in \mathbb{N}^+}$ containing one code $e_k$ for each possible message length k. The following result, due to Gilbert [35], can be used to construct an infinite sequence of codes with constant rate and distance.
Theorem 5 (Gilbert-Varshamov Bound) For every $k \in \mathbb{N}^+$, $0 \le d_{rel} < \frac{1}{2}$ and $n \ge k / (1 - H_2(d_{rel}))$, there exists a code $e : \{0, 1\}^k \to \{0, 1\}^n$ with dist(e) = $d_{rel}$, where
$$H_2(d_{rel}) := d_{rel} \log_2 \frac{1}{d_{rel}} + (1 - d_{rel}) \log_2 \frac{1}{1 - d_{rel}}.$$
Moreover, such a code can be computed in time $2^{O(n)}$.
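To give a feel for the quantities in the Gilbert-Varshamov bound, the sketch below evaluates the binary entropy function, the smallest block length allowed by the bound, and the relative distance of an explicitly given code by brute force. The helper names and the 3-fold repetition code are illustrative assumptions; the repetition code is not the code whose existence the theorem asserts.

```python
from itertools import product
from math import ceil, log2


def binary_entropy(d):
    """H_2(d) = d log2(1/d) + (1 - d) log2(1/(1 - d)), with H_2(0) = H_2(1) = 0."""
    if d in (0, 1):
        return 0.0
    return d * log2(1 / d) + (1 - d) * log2(1 / (1 - d))


def min_block_length(k, d_rel):
    """Smallest integer n with n >= k / (1 - H_2(d_rel))."""
    return ceil(k / (1 - binary_entropy(d_rel)))


def relative_distance(code, k):
    """Minimum relative Hamming distance over pairs of distinct messages."""
    words = [code(x) for x in product((0, 1), repeat=k)]
    n = len(words[0])
    return min(sum(a != b for a, b in zip(w1, w2)) / n
               for i, w1 in enumerate(words) for w2 in words[i + 1:])


def repetition(x):
    """Toy code (assumption): repeat every message bit three times."""
    return tuple(bit for bit in x for _ in range(3))


n_needed = min_block_length(2, 0.25)
d_repetition = relative_distance(repetition, 2)
```

The brute-force distance computation enumerates all $2^k$ messages, so it is only practical for very short messages.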

Maximum ε-Feasible Subsystem of Linear Inequalities
First, we prove the following auxiliary results that follow from Lemma 2 and will be useful in the remainder of the paper. Omitted proofs can be found in the Appendix.

Lemma 3 Given a 3SAT formula $\varphi$, if $\varphi$ is unsatisfiable, then for each (possibly randomized) Merlin$_2$'s strategy $\eta_2$ there exists a set $S_i$ such that each Merlin$_1$'s assignment to variables in $S_i$ is in conflict with Merlin$_2$'s assignment with a probability of at least $\rho/(2\nu)$.

Now, we introduce the maximum $\varepsilon$-feasible subsystem of linear inequalities problem. Given a system of linear inequalities $A x \ge 0$ with $A \in [-1, 1]^{n_{row} \times n_{col}}$ and $x \in \Delta_{n_{col}}$, we study the problem of finding the largest subsystem of linear inequalities that violate the constraints by at most $\varepsilon$. As we will show in Sect. 6, this problem presents some deep analogies with the problem of determining good posteriors in Bayesian persuasion problems.

Definition 3 (MFS) Given a matrix $A \in [-1, 1]^{n_{row} \times n_{col}}$, the problem of finding the maximum feasible subsystem of linear inequalities (MFS) reads as follows:
$$\max_{x \in \Delta_{n_{col}}} \sum_{i \in [n_{row}]} I[w_i \ge 0] \quad \text{s.t.} \quad w = A x.$$

We are interested in the problem of finding a vector $x$ which yields at least the same number of feasible inequalities of MFS under a relaxation of the constraints with respect to Definition 3.

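Definition 4 ($\varepsilon$-MFS) Given a matrix $A \in [-1, 1]^{n_{row} \times n_{col}}$, let $k^*$ be the optimal value of MFS on $A$. Then, the problem of finding the maximum $\varepsilon$-feasible subsystem of linear inequalities ($\varepsilon$-MFS) amounts to finding a probability vector $x \in \Delta_{n_{col}}$ such that, by letting $w = A x$, it holds:
$$\sum_{i \in [n_{row}]} I[w_i \ge -\varepsilon] \ge k^*.$$

This problem was previously studied by Cheng et al. [5]. In particular, they design a PTAS for the $\varepsilon$-MFS problem guaranteeing the satisfaction of at least $k^* - \varepsilon \cdot n_{row}$ inequalities. This yields a bi-criteria PTAS for the MFS problem.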
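The objective of the relaxed problem can be evaluated directly for any candidate vector. The following sketch assumes that a row $i$ counts as $\varepsilon$-feasible when $w_i \ge -\varepsilon$ with $w = A x$ (the condition used later in the reduction); the function name and the small matrix are hypothetical.

```python
def count_feasible_rows(A, x, eps=0.0):
    """Number of rows i with w_i >= -eps, where w = A x.

    A is a list of rows with entries in [-1, 1]; x is a probability vector.
    With eps = 0 this is the MFS objective; with eps > 0 it is the relaxed one.
    """
    w = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    return sum(1 for w_i in w if w_i >= -eps)


# Hypothetical 3 x 2 instance.
A = [[1.0, -1.0],
     [-1.0, 1.0],
     [-0.5, 0.5]]

exact_uniform = count_feasible_rows(A, [0.5, 0.5])            # all rows have w_i = 0
exact_skewed = count_feasible_rows(A, [0.6, 0.4])             # only the first row
relaxed_skewed = count_feasible_rows(A, [0.6, 0.4], eps=0.15)  # first and third rows
```

The relaxation can only increase the count: every row that is feasible for $\varepsilon = 0$ remains feasible for any $\varepsilon > 0$.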

Upper-bound on ε-MFS
First, we show that $\varepsilon$-MFS can be exactly solved in $n^{O(\log n)}$ steps for every fixed $\varepsilon > 0$. We introduce the following auxiliary definition.
Definition 5 (k-uniform distribution) A probability distribution $x \in \Delta_X$ is k-uniform if and only if it is the average of a multiset of $k$ basis vectors in an $|X|$-dimensional space.
Equivalently, each entry $x_i$ of a k-uniform distribution has to be a multiple of $1/k$. Then, the following result holds.
Thus, setting $k = \log n_{row} / \varepsilon^2$ ensures the existence of a vector $\tilde{x}$ guaranteeing that, if $w^*_i \ge 0$, then $\tilde{w}_i \ge -\varepsilon$. Since $\tilde{x}$ is k-uniform by construction, we can find it by enumerating over all the $O((n_{col})^k)$ k-uniform probability vectors, where $k = \log n_{row} / \varepsilon^2$. Trivially, this task can be performed in $n^{\log n_{row} / \varepsilon^2}$ steps and, therefore, in $n^{O(\log n)}$ steps.
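The enumeration argument can be sketched as a brute-force search over k-uniform vectors. This is only a minimal illustration under the assumption that a row is $\varepsilon$-feasible when $w_i \ge -\varepsilon$; the function names and the tiny instance are hypothetical, and the search is practical only for very small $n_{col}$ and $k$.

```python
from collections import Counter
from itertools import combinations_with_replacement


def k_uniform_distributions(n_col, k):
    """Yield every k-uniform probability vector of dimension n_col.

    Each vector is the average of a multiset of k basis vectors, so every
    entry is a multiple of 1/k.
    """
    for multiset in combinations_with_replacement(range(n_col), k):
        counts = Counter(multiset)
        yield [counts[j] / k for j in range(n_col)]


def best_k_uniform(A, k, eps):
    """Return the k-uniform vector maximizing the number of rows with w_i >= -eps."""
    best_x, best_count = None, -1
    for x in k_uniform_distributions(len(A[0]), k):
        count = sum(
            1 for row in A
            if sum(a * x_j for a, x_j in zip(row, x)) >= -eps
        )
        if count > best_count:
            best_x, best_count = x, count
    return best_x, best_count


# Hypothetical 2 x 2 instance: only the uniform vector satisfies both rows.
A = [[1.0, -1.0], [-1.0, 1.0]]
x_best, satisfied = best_k_uniform(A, k=2, eps=0.0)
```

The number of candidates is the number of multisets of size $k$ over $n_{col}$ elements, which is at most $(n_{col})^k$, in line with the bound used in the text.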

Lower-bound on ε-MFS
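Now we show that $\varepsilon$-MFS requires at least $n^{\tilde{\Omega}(\log n)}$ steps. In doing so, we close the gap with the upper bound stated by Proposition 6 except for polylogarithmic factors of $\log n$ in the denominator of the exponent.

Theorem 7 Assuming ETH, there exists a constant $\varepsilon > 0$ such that solving $\varepsilon$-MFS requires time $n^{\tilde{\Omega}(\log n)}$.

Proof Overview. We provide a polynomial-time reduction from FreeGame$_\delta$ (Definition 1) to $\varepsilon$-MFS, where $\varepsilon = \delta / 26 = \rho / (52\nu)$ (see Sect. 4.1 for the definition of parameters $\delta$, $\rho$, $\nu$). We show that, given a free game instance $F_\varphi$, it is possible to build a matrix $A$ such that, for a certain value $k^*$, the following holds:
(i) If $\omega(F_\varphi) = 1$, then there exists a vector $x$ such that $\sum_{i} I[w_i \ge 0] \ge k^*$;
(ii) If $\omega(F_\varphi) \le 1 - \delta$, then every vector $x$ is such that $\sum_{i} I[w_i \ge -\varepsilon] < k^*$,
where $w = A x$.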
Construction. In the free game $F_\varphi$, Arthur sends a set of variables $S_i$ to Merlin$_1$ and a set of clauses $T_j$ to Merlin$_2$, where $i, j \in [m]$, $m = n\,\mathrm{polylog}(n)$. Then, Merlin$_1$'s (resp., Merlin$_2$'s) answer is denoted by $\xi_1 \in \Xi_1$ (resp., $\xi_2 \in \Xi_2$). The system of linear inequalities used in the reduction has a vector of variables $x$ structured as follows.
2. Variables corresponding to Merlin$_1$'s answers. We need to introduce some further machinery to augment the dimensionality of $\Xi_1$ through a viable mapping. Let $e : \{0,1\}^{2m} \to \{0,1\}^{8m}$ be the code defined in Theorem 5 with rate $1/4$ and relative distance $\mathrm{dist}(e) \ge 1/5$. We can safely assume that each answer $\xi_1$ has length $2m$ (if needed, we extend $\xi_1$ with a sufficient number of extra bits). Then, $e(\xi_1)$ is the $8m$-dimensional encoding of answer $\xi_1$ via code $e$. Let $e(\xi_1)_j$ be the $j$-th bit of vector $e(\xi_1)$. We have a variable $x_{i,\ell}$ for each index $i \in [8m]$ and $\ell := (\ell_j)_{j \in [m]} \in \{0,1\}^m$. These $x_{i,\ell}$ variables can be interpreted as follows. Suppose we have an encoding of an answer for each of the possible sets $S_j$. There are $m$ such encodings, each of them having $8m$ bits. Then, it holds $x_{i,\ell} > 0$ if and only if the $i$-th bit of the encoding corresponding to $S_j$ is $\ell_j$.
There is a total of $m \cdot 2^m \cdot (2^{5m} + 8)$ variables. Matrix $A$ has a number of columns equal to the number of variables. We denote with $A_{\cdot,(T_j,\xi_2)}$ the entry in row $(\cdot)$ and column corresponding to variable $x_{T_j,\xi_2}$. Analogously, $A_{\cdot,(i,\ell)}$ is the entry in row $(\cdot)$ and column corresponding to variable $x_{i,\ell}$. Rows are grouped in four types, denoted by $\{t_i\}_{i=1}^{4}$. We write $A_{t_i,\cdot}$ when referring to an entry of any row of type $t_i$. Further arguments may be added as a subscript to identify specific entries of $A$. Rows are structured as follows.
1. Rows of type $t_1$: there are $q$ rows of type $t_1$ such that $A_{t_1,(T_j,\xi_2)} = 1$ for each $j \in [m]$, $\xi_2 \in \Xi_2$, and $-1$ otherwise (the value of $q$ is specified later in the proof).
2. Rows of type $t_2$: there are $q$ rows for each subset $T \subseteq \{T_j\}_{j \in [m]}$ with cardinality $m/2$ (i.e., there is a total of $q \cdot \binom{m}{m/2}$ rows of type $t_2$). Then, the following holds for each $T$:
3. Rows of type $t_3$: there are $q$ rows of type $t_3$ for each subset of $4m$ indices $I$ drawn from $[8m]$, for a total of $q \cdot \binom{8m}{4m}$ rows. For each subset of indices $I$ we have:
4. Rows of type $t_4$: there is a row of type $t_4$ for each $S_i$ and $\xi_1$. Each of these rows is such that:

Finally, we set $k^* = \left(1 + \binom{m}{m/2} + \binom{8m}{4m}\right) q + m$ and $q \gg m$ (for example, $q = 2^{10m}$). We say that row $i$ satisfies the $\varepsilon$-MFS condition for a certain $x$ if $w_i \ge -\varepsilon$, where $w = A x$ (in the following, we will also consider $w_i \ge 0$ as an alternative condition). We require at least $k^*$ rows to satisfy the $\varepsilon$-MFS condition. Then, all rows of types $t_1$, $t_2$, $t_3$ and at least $m$ rows of type $t_4$ must be such that $w_i$ satisfies the $\varepsilon$-MFS condition.
Completeness. Given a satisfying assignment $\zeta$ to the variables of $\varphi$, we build vector $x$ as follows. Let $\zeta_{T_j}$ be the partial assignment obtained by restricting $\zeta$ to the variables in the clauses of $T_j$ (if $|T_j| < 2m$ we pad $\zeta_{T_j}$ with bits 0 until $\zeta_{T_j}$ has length $6m$). Then, we set $x_{T_j,\zeta_{T_j}} = 1/(2m)$. Moreover, for each $i \in [8m]$ and $\ell^i = (e(\zeta_{S_1})_i, \ldots, e(\zeta_{S_m})_i)$, we set $x_{i,\ell^i} = 1/(16m)$.

We show that $x$ is such that there are at least $k^*$ rows $i$ with $w_i \ge 0$ (Condition (4)). First, each row $i$ of type $t_1$ is such that $w_i = 0$ since $\sum_{(T_j,\xi_2)} x_{T_j,\xi_2} = \sum_{(i,\ell)} x_{i,\ell} = 1/2$. Moreover, for each subset $T$ of cardinality $m/2$, we have $\sum_{\xi_2, T_j \in T} x_{T_j,\xi_2} = 1/4$. This implies that each row $i$ of type $t_2$ is such that $w_i = 0$. A similar argument holds for rows of type $t_3$.

Finally, we show that for each $S_i$ there is at least a row $i$ of type $t_4$ such that $w_i \ge 0$. Take the row corresponding to $(S_i, \zeta_{S_i})$. For each $x_{b,\ell} > 0$ where $b \in [8m]$ and $\ell \in \{0,1\}^m$, it holds $e(\zeta_{S_i})_b = \ell_i$. Then, there are $8m$ columns played with probability $1/(16m)$ with value $1/2$, i.e., $\sum_{b,\ell} A_{(t_4,S_i,\zeta_{S_i}),(b,\ell)}\, x_{b,\ell} = 1/4$. Moreover, for each

This concludes the proof of completeness.
Soundness. We show that, if $\omega(F_\varphi) \le 1 - \delta$, there is no probability distribution $x$ such that
$$\sum_{i} I[w_i \ge -\varepsilon] \ge k^*$$
with $w = A x$. Assume, by contradiction, that one such vector $x$ exists. For the sake of clarity, we summarize the structure of the proof:
(i) We show that the probability assigned by $x$ to columns with index $(T_j, \xi_2)$ has to be close to $1/2$, and the same has to hold for columns of type $(i, \ell)$.
(ii) We show that $x$ has to assign probability almost uniformly among the $T_j$'s and the indices $i$ of the encoding of $\Xi_1$ (resp., Lemmas 5 and 6 below). Intuitively, this resembles the fact that, in $F_\varphi$, Arthur draws questions $T_j$ according to a uniform probability distribution.
(iii) For each $S_i$, there is at most one row $(t_4, S_i, \xi_1)$ such that $w_{(t_4,S_i,\xi_1)} \ge -\varepsilon$ (Lemma 7). Together with the hypothesis that at least $m$ rows of type $t_4$ satisfy the $\varepsilon$-MFS condition, this implies that there exists exactly one such row for each $S_i$.
(iv) Finally, we show that the above construction leads to a contradiction with Lemma 3 for a suitable free game.
Before providing the details of the four above steps, we introduce the following result, due to Babichenko et al. [34].
Lemma 4 (Lemma 2 of Babichenko et al. [34]) Let $v \in \Delta_n$ be a probability vector, and $u$ be the $n$-dimensional uniform probability vector.
If $||v - u|| > c$, then there exists a subset of indices

Then, we proceed with the following steps (the proofs of the auxiliary results can be found in Appendix A.2):

(i) Equation 6 requires all rows $i$ of type $t_1$, $t_2$, $t_3$ to be such that $w_i \ge -\varepsilon$. This implies that, for rows of type $t_1$, it holds

If, by contradiction, this inequality did not hold, each row $i$ of type $t_1$ would be such that

and $\tilde{v}$ be a uniform probability vector of suitable dimension. The following result shows that having a bounded element-wise difference between $v_1$ and $\tilde{v}$ is a necessary condition for Eq. 6 to be satisfied.
Let $v_2 \in \Delta_{8m}$ be the probability vector defined as

and $\tilde{v}$ be a suitable uniform probability vector. Moreover, the following holds.
Notice that, if this condition did not hold, by Step (i) we would obtain which would go against the satisfiability of Eq. 6.
(iv) Finally, let $F^*_\varphi$ be a free game in which Arthur (i.e., the verifier) chooses question $T_j$ with probability $v_{1,j}$ as defined in Step (ii), and Merlin$_2$ (i.e., the second prover) answers $\xi_2$ with probability $x_{T_j,\xi_2} / v_{1,j}$. In this setting (i.e., $F^*_\varphi$), given question $S_i$ to Merlin$_1$, the two provers will provide compatible answers with probability

where the first inequality holds for Eq. 7 at Step (i). In a canonical (as in Definition 1) free game $F_\varphi$, Arthur picks questions according to a uniform probability distribution. Therefore, the main difference between $F_\varphi$ and $F^*_\varphi$ is that, in the latter, Arthur draws questions for Merlin$_2$ from $v_1$, which may not be a uniform probability distribution. However, we know that differences between $v_1$ and a uniform probability vector must be limited. Specifically, by Lemma 5, we have $||v_1 - \tilde{v}||_1 \le 16\varepsilon$. Then, if Merlin$_1$ and Merlin$_2$ applied in $F_\varphi$ the strategies we described for $F^*_\varphi$, their answers would be compatible with probability at least

for each $S_i$. Finally, by picking $\varepsilon = \rho/(52\nu)$, we reach a contradiction with Lemma 3. This concludes the proof.

Hardness of (α, ε)-Persuasion
We show that a public signaling scheme approximating the value of the optimal one cannot be computed in polynomial time even if we allow it to be $\varepsilon$-persuasive (see Eq. 2). Specifically, assuming ETH, computing an $(\alpha, \varepsilon)$-persuasive signaling scheme requires at least $n^{\tilde{\Omega}(\log n)}$ steps, where the dimension of the instance is $n = O(\bar{n} d)$. We prove this result for the specific case of the k-voting problem, as introduced in Sect. 3. Besides its practical applicability, this problem is particularly instructive in highlighting the strong connection between the problem of finding suitable posteriors and the $\varepsilon$-MFS problem, as discussed in the following lemma. An analogous observation was also highlighted by Cheng et al. in [5].

Lemma 8 Given a k-voting instance, the problem of finding a posterior $p \in \Delta_\Theta$ such that $W_\varepsilon(p) \ge k$ is equivalent to finding an $\varepsilon$-feasible subsystem of $k$ linear inequalities over the simplex when

Proof By setting $x = p$, it directly follows that

The above lemma shows that deciding if there exists a posterior $p$ such that $W_\varepsilon(p) \ge k$ or if all the posteriors have $W(p) < k$ (i.e., deciding if the utility of the sender can be greater than zero) is as hard as solving the $\varepsilon$-MFS problem. More precisely, if an $\varepsilon$-MFS instance does not admit any solution, then there does not exist a posterior guaranteeing a strictly positive winning probability for the sender's preferred candidate. On the other hand, if an $\varepsilon$-MFS instance admits a solution, there exists a signaling scheme where at least one of the induced posteriors guarantees strictly positive winning probability to the sender's preferred candidate. However, the above connection between the $\varepsilon$-MFS problem and the k-voting problem is not sufficient to prove the inapproximability of the k-voting problem, as the probability whereby this posterior is reached may be arbitrarily small.
Luckily enough, the next theorem shows that it is possible to strengthen the inapproximability result by constructing an instance in which, when the 3SAT formula is satisfiable, there exists a signaling scheme such that all the induced posteriors satisfy $W(p) \ge k$ (i.e., the sender's preferred candidate wins with a probability of 1). The main idea is to suitably extend the construction of Theorem 7 with an additional set of states $\{\theta_d\}_{d \in \{0,1\}^{7m}}$, where we can see each vector $d$ as the concatenation of a subvector $d_S \in \{0,1\}^m$ and a subvector $d_T \in \{0,1\}^{6m}$. Moreover, we need to extend the set of receivers. In particular, we replace each receiver relative to a set $S_i$ and an answer $\xi_1$ with a set including a receiver for each $d$. The new receivers' payoffs are defined as follows. Let $\oplus$ be the XOR operator. The payoff of the receiver relative to $S_i$, $\xi_1$, and $d$ in a state $\theta_{(T_j, \xi_2 \oplus d_T)}$ is equal to the payoff of the original receiver in the state $\theta_{(T_j, \xi_2)}$, and we use a similar procedure for the payoffs in the states $\theta_{(i,\ell)}$. Then, the signaling scheme employs a signal $s_d$ for each $d$. Each signal $s_d$ defines which of the $2^{7m}$ games we are playing. All these games are equivalent since we are simply changing the meaning of the states: for example, a state $\theta_{(T_j, \xi_2 \oplus d_T)}$ is equivalent to the original state $\theta_{(T_j, \xi_2)}$. Using this construction, we have that all the signals induce a posterior in which at least $k$ voters vote for $c_0$, while in the original game only one signal induces a posterior that satisfies this condition.
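The relabeling of states through the XOR operator can be illustrated with a small sketch. It only shows why shifting every answer by a fixed vector $d$ yields an equivalent copy of the game: the shift is a bijection on bit vectors and is undone by applying the same $d$ again. The function names and the 3-bit answers are assumptions made for the example.

```python
from itertools import product


def xor_bits(a, b):
    """Bitwise XOR of two equal-length tuples of bits."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    return tuple(x ^ y for x, y in zip(a, b))


def relabel_answers(answers, d):
    """Map every answer xi to the shifted answer xi XOR d."""
    return {xi: xor_bits(xi, d) for xi in answers}


# Hypothetical toy example with 3-bit answers and a fixed shift d.
answers = list(product((0, 1), repeat=3))
d = (1, 0, 1)
shifted = relabel_answers(answers, d)

# The shift permutes the answers, and applying d twice restores the original.
is_permutation = sorted(shifted.values()) == sorted(answers)
is_involution = all(xor_bits(shifted[xi], d) == xi for xi in answers)
```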
Proof Overview. By following the proof of Theorem 7, we can provide a polynomial-time reduction from FreeGame$_\delta$ to the problem of finding an $\varepsilon$-persuasive signaling scheme in k-voting, with $\varepsilon = \delta/780 = \rho/(1560\nu)$. Specifically, if $\omega(F_\varphi) = 1$, there exists a signaling scheme guaranteeing the sender an expected value of 1. Otherwise, if $\omega(F_\varphi) \le 1 - \delta$, then all posteriors are such that $W_\varepsilon(p) < k$ (i.e., the sender cannot obtain more than 0).

Construction. The k-voting instance has the following possible states of nature.
It is useful to see vector $d$ as the union of the subvector $d_S \in \{0,1\}^m$ and the subvector $d_T \in \{0,1\}^{6m}$.
The common prior $\mu$ is such that:

To simplify the notation, in the remainder of the proof, let $u^r_\theta := u^r_\theta(a_0) - u^r_\theta(a_1)$. The k-voting instance comprises the following receivers.
1. Receivers of type $t_1$: there are $q$ receivers of type $t_1$ (the value of $q$ is specified later in the proof), which are such that $u^{t_1}_{\theta_{(T_j,\xi_2)}} = 1$ for each $(T_j, \xi_2)$, and $-1/3$ otherwise.
2. Receivers of type $t_2$: there are $q$ receivers of type $t_2$ such that $u^{t_2}_{\theta_{(i,\ell)}} = 1$ for each $(i, \ell)$, and $-1/3$ otherwise.
3. Receivers of type $t_3$: there are $q$ receivers of type $t_3$ for each subset $T \subseteq \{T_j\}_{j \in [m]}$ of cardinality $m/2$. Each receiver corresponding to the subset $T$ is such that: $= 0$ for every other $\theta$.
4. Receivers of type $t_4$: we have $q$ receivers of type $t_4$ for each subset $I$ of $4m$ indices selected from $[8m]$. Each receiver corresponding to subset $I$ is such that: $= 0$ for every other $\theta$.
5. Receivers of type $t_5$: there is a receiver of type $t_5$ for each $S_i$, $\xi_1 \in \Xi_1$ and $d \in \{0,1\}^{7m}$. Then, for each receiver of type $t_5$ the following holds:

Finally, we set $k = \left(2 + \binom{m}{m/2} + \binom{8m}{4m}\right) q + m$. By setting $q \gg m$ (e.g., $q = 2^{10m}$), candidate $a_0$ can get at least $k$ votes only if all receivers of type $t_1$, $t_2$, $t_3$, $t_4$ vote for her.
Completeness. Given a satisfying assignment $\zeta$ to the variables in $\varphi$, let $[\zeta]_{T_j} \in \{0,1\}^{6m}$ be the vector specifying the variables assignment of each clause in $T_j$, and $[\zeta]_{S_i} \in \{0,1\}^{2m}$ be the vector specifying the assignment of each variable belonging to $S_i$. The sender has a signal for each $d \in \{0,1\}^{7m}$. The set of signals is denoted by $S$, where $|S| = 2^{7m}$, and a signal is denoted by $s_d \in S$. We define a signaling scheme $\phi$ as follows. First, we set

First, we prove that the signaling scheme is well-formed. For each state $\theta_{(T_j,\xi_2)}$, it holds that

and, for each $\theta_{(i,\ell)}$, the following holds:

Now, we show that there exist at least $k$ voters that will choose $a_0$. Let $p \in \Delta_\Theta$ be the posterior induced by a signal $s_d$. All receivers of type $t_1$ choose $a_0$ since it holds:

Analogously, all receivers of type $t_2$ select $a_0$. Furthermore, for each $T_j$, it holds $\sum_{\xi_2} p_{\theta_{(T_j,\xi_2)}} = 1/(4m)$. Then, for each subset $T \subseteq \{T_j\}_{j \in [m]}$ of cardinality $m/2$, it holds $\sum_{T_j \in T, \xi_2} p_{\theta_{(T_j,\xi_2)}} = (m/2) \cdot 1/(4m) = 1/8$. Therefore, each receiver of type $t_3$ chooses $a_0$. An analogous argument holds for receivers of type $t_4$. Finally, we show that, for each $S_i$, the receiver $(t_5, S_i, [\zeta]_{S_i}, d)$ chooses $a_0$. In particular, receiver $(t_5, S_i, [\zeta]_{S_i}, d)$ has the following expected utility:

since, for each $p_{(T_j,\xi_2)} > 0$, the following holds

This concludes the proof of completeness.

Soundness. We prove that, if $\omega(F_\varphi) \le 1 - \delta$, there is no posterior in which $a_0$ is chosen by at least $k$ receivers, thus implying that the sender's utility is equal to 0.
Now, suppose, towards a contradiction, that there exists a posterior $p$ such that at least $k$ receivers select $a_0$. Let $\gamma := \sum_{(T_j,\xi_2)} p_{\theta_{(T_j,\xi_2)}} + \sum_{(i,\ell)} p_{\theta_{(i,\ell)}}$. Since all voters of types $t_1$ and $t_2$ vote for $a_0$, it holds that $\sum_{(T_j,\xi_2)} p_{\theta_{(T_j,\xi_2)}} \ge \frac{1}{4} - \varepsilon$ and $\sum_{(i,\ell)} p_{\theta_{(i,\ell)}} \ge \frac{1}{4} - \varepsilon$. Moreover, since at least a receiver $(t_5, S_i, \xi_1, d)$ must play $a_0$, there exists a $d \in \{0,1\}^{7m}$ and a state $\theta_d$ with $p_{\theta_d} \ge \frac{1}{2} - \varepsilon$. This implies that $\frac{1}{2} - 2\varepsilon \le \gamma \le \frac{1}{2} + \varepsilon$.

Consider the reduction to $\varepsilon'$-MFS, with $\varepsilon' = \rho/(52\nu)$ (Theorem 7). Let $x_{(T_j,\xi_2)} = p_{\theta_{(T_j,\xi_2 \oplus d_T)}} / \gamma$, $x_{(i,\ell)} = p_{\theta_{(i,\ell \oplus d_S)}} / \gamma$, and $\varepsilon = \varepsilon'/30$. All rows of type $t_1$ of $\varepsilon'$-MFS are such that

All voters of type $t_3$ choose $a_0$. Then, for all $T \subseteq \{T_j\}_{j \in [m]}$ of cardinality $m/2$, it holds:

Then, all rows of type $t_2$ of $\varepsilon'$-MFS are such that:

A similar argument proves that all rows of type $t_3$ of the instance of $\varepsilon'$-MFS have $w_{(t_3,I)} \ge -\varepsilon'$.
Theorem 8 shows that, assuming the ETH, computing an $(\alpha, \varepsilon)$-persuasive signaling scheme requires at least a quasi-polynomial number of steps in the specific scenario of a k-voting instance. Therefore, the same holds in the general setting of arbitrary public persuasion problems with binary action spaces, which is precisely the claim of Theorem 1.

A Quasi-Polynomial Time Algorithm
In this section, we prove that our hardness result (Theorem 8) is tight by devising a bi-criteria approximation algorithm. Our result extends the results by Cheng et al. [5] and Xu [22], which deal with signaling problems with binary action spaces and sender's utility functions which are independent of the state of nature. This is arguably a restrictive assumption, and even the original Bayesian persuasion framework by Kamenica and Gentzkow [14] describes state-dependent sender's utility functions. Our results generalize those by Cheng et al. [5] to the case of state-dependent sender's utility functions, and arbitrary discrete action spaces.
In order to prove our result, we need some further machinery. Let $\mathcal{Z}^r := 2^{A^r}$ be the power set of $A^r$. Then, $\mathcal{Z} := \times_{r \in R} \mathcal{Z}^r$ is the set of tuples specifying a subset of $A^r$ for each receiver $r$. For a given probability distribution over the states of nature, we are interested in determining the set of best responses of each receiver $r$, i.e., the subset of $A^r$ maximizing her expected utility. Formally, we have the following.

Definition 6 (BR-set) Given a probability distribution over states of nature $p \in \Delta_\Theta$, the best-response set (BR-set) $M_p := (Z^1, \ldots, Z^{\bar{n}}) \in \mathcal{Z}$ is such that $Z^r = \arg\max_{a \in A^r} \sum_{\theta \in \Theta} p_\theta\, u^r_\theta(a)$ for each $r \in R$.
Similarly, we define a notion of $\varepsilon$-BR-set which comprises $\varepsilon$-approximate best responses to a given distribution over the states of nature.
Definition 7 ($\varepsilon$-BR-set) Given a probability distribution over states of nature $p \in \Delta_\Theta$, the $\varepsilon$-best-response set ($\varepsilon$-BR-set) $M_{p,\varepsilon} := (Z^1, \ldots, Z^{\bar{n}}) \in \mathcal{Z}$ is such that, for each $r \in R$, action $a$ belongs to $Z^r$ if and only if
$$\sum_{\theta \in \Theta} p_\theta\, u^r_\theta(a) \ge \sum_{\theta \in \Theta} p_\theta\, u^r_\theta(a') - \varepsilon \quad \text{for each } a' \in A^r.$$

We introduce a suitable notion of approximability of the sender's objective function. Our notion of α-approximable function is a generalization of the one proposed by Xu [22, Definition 4.5] to the setting of arbitrary action spaces and state-dependent sender's utility functions.
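The two notions of best-response set can be computed directly from a posterior. The sketch below assumes that an action is an ε-approximate best response when its expected utility is within ε of the best achievable one; the nested-list layout `utilities[r][theta][a]` and the function names are hypothetical choices made for the example.

```python
def expected_utilities(p, u_r):
    """Expected utility of each action of one receiver under posterior p.

    u_r[theta][a] is the receiver's utility for action a in state theta.
    """
    n_actions = len(u_r[0])
    return [
        sum(p[theta] * u_r[theta][a] for theta in range(len(p)))
        for a in range(n_actions)
    ]


def eps_br_set(p, utilities, eps=0.0):
    """Tuple with one set of (eps-approximate) best responses per receiver.

    With eps = 0 this returns the exact best responses of every receiver.
    """
    result = []
    for u_r in utilities:
        values = expected_utilities(p, u_r)
        best = max(values)
        result.append(frozenset(a for a, v in enumerate(values) if v >= best - eps))
    return tuple(result)


# Hypothetical instance: one receiver, two states, two actions.
utilities = [[[1.0, 0.0],   # state 0: action 0 pays 1, action 1 pays 0
              [0.0, 1.0]]]  # state 1: action 0 pays 0, action 1 pays 1
posterior = [0.6, 0.4]

exact = eps_br_set(posterior, utilities)              # only action 0 is optimal
relaxed = eps_br_set(posterior, utilities, eps=0.25)  # action 1 is within 0.25
```

By construction the exact set is always contained in the relaxed one, which is the kind of inclusion the decomposition argument relies on.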
We say that $f$ is α-approximable if there exists a function $g : \Delta_\Theta \times \mathcal{Z} \to A$ computable in polynomial time such that, for all $p \in \Delta_\Theta$ and $Z \in \mathcal{Z}$, it holds: $a = g(p, Z)$, $a \in Z$ and

The voting function $f$ defined in Sect. 3 is 1-approximable, while, for example, when the action space is binary a non-monotone submodular function is 1/2-approximable. The α-approximability assumption is a natural requirement since, otherwise, even evaluating the sender's objective value would result in an intractable problem. When $f$ is α-approximable, it is possible to find an approximation of the optimal receivers' actions profile when they are constrained to select actions profiles in $Z$.
We now provide an algorithm which computes in quasi-polynomial time, for any α-approximable $f$, a bi-criteria approximation of the optimal solution with an approximation on the objective value arbitrarily close to α. When $f$ is 1-approximable our result yields a bi-criteria QPTAS for the problem. The key idea is showing that an optimal signaling scheme can be approximated by a convex combination of suitable k-uniform posteriors. As in previous works [5,22], the key part of the proof is a decomposition lemma that proves that all the posteriors can be decomposed in a convex combination of k-uniform posteriors with a small loss in utility. However, the assumption of state-dependent sender's utility functions makes previous approaches ineffective in our setting. In particular, we observe that previous decomposition lemmas are based on a direct application of Hoeffding's inequality and the union bound. In our case, such a direct derivation is not possible, and we need to introduce some technical intermediate results (Lemmas 9-12). In particular, we need to develop a new probabilistic analysis of the decomposition lemma. Let $\varrho := \max_{r \in R} \varrho^r$, $\bar{n} := |R|$, and $d := |\Theta|$. The proof of our main positive result, as stated in Theorem 2, goes as follows.

Proof of Theorem 2
We show that there exists a $\mathrm{poly}\big(d^{\log(\bar{n}\varrho/\delta)/\varepsilon^2}\big)$ algorithm that computes the given approximation. Let $k = 32 \log(4\bar{n}\varrho/\delta)/\varepsilon^2$ and $K \subset \Delta_\Theta$ be the set of k-uniform distributions over $\Theta$ (Def. 5). We prove that all posteriors $p^* \in \Delta_\Theta$ can be decomposed as a convex combination of k-uniform posteriors without lowering too much the sender's expected utility. Formally, each posterior $p^* \in \Delta_\Theta$ can be written as $p^* = \sum_{p \in K} \gamma_p\, p$, with $\gamma \in \Delta_K$ such that

Let $\tilde{\gamma} \in K$ be the empirical distribution of $k$ i.i.d. samples from $p^*$, where each $\theta$ has probability $p^*_\theta$ of being sampled. Therefore, the vector $\tilde{\gamma}$ is a random variable supported on k-uniform posteriors with expectation $p^*$. Moreover, let $\gamma \in \Delta_K$ be a probability distribution such that, for each $p \in K$, $\gamma_p := \Pr(\tilde{\gamma} = p)$. For each $\gamma \in \Delta_K$ and $p \in K$, we denote by $\gamma^{(\theta,i)}_p$ the conditional probability of having observed posterior $p$, given that the posterior must assign probability $i/k$ to state $\theta$. Formally, for each $p \in K$, if $p_\theta = i/k$, we have

Finally, let $P \subseteq K$ be the set of posteriors such that

Now, we prove the following intermediate results (the proofs of the auxiliary results are provided in Appendix A.3). The following lemma shows that, given a posterior $p^*$ and a state $\theta$, if we take $k$ i.i.d. samples from $p^*$ and we consider only the induced posteriors $p$ in which $p_\theta$ is close to $p^*_\theta$, then the probability that the utility of all the receivers in $p$ is close to their utility in $p^*$ is close to 1. The following result combines Lemmas 9 and 10. In particular, if we consider the distribution of $k$ i.i.d. samples from a posterior $p^*$, we have that, for each $\theta$, the probability that in state $\theta$ the utility of all the receivers is close to their utility in $p^*$ is close to 1. Equivalently, the induced posterior belongs to $P$ as defined in (9).
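The sampling step at the core of the decomposition can be sketched as follows: drawing $k$ i.i.d. states from a posterior and taking their empirical distribution always produces a k-uniform posterior. The function name, the seed, and the three-state posterior are assumptions made for this illustration.

```python
import random
from collections import Counter


def empirical_posterior(p_star, k, rng):
    """Empirical distribution of k i.i.d. states sampled from posterior p_star.

    The result is k-uniform: every entry is a multiple of 1/k.
    """
    states = list(range(len(p_star)))
    samples = rng.choices(states, weights=p_star, k=k)
    counts = Counter(samples)
    return [counts[theta] / k for theta in states]


# Hypothetical posterior over three states and a small number of samples.
rng = random.Random(0)  # fixed seed only to make the sketch reproducible
p_star = [0.5, 0.3, 0.2]
k = 20
p_tilde = empirical_posterior(p_star, k, rng)
```

Repeating the draw many times and averaging the resulting vectors approaches `p_star`, which mirrors the fact that the random k-uniform posterior has expectation $p^*$.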

Lemma 9
Lemma 11 Given a $p^* \in \Delta_\Theta$, for each $\theta \in \Theta$, it holds:

where $\gamma$ is the distribution of $k$ i.i.d. samples from $p^*$.

Now, we need to prove that all the posteriors in $P$ guarantee to the sender at least the same expected utility as $p^*$. Formally, we prove that the $\varepsilon$-BR-set of each $p \in P$ contains the BR-set of $p^*$. This is shown via the following lemma.

Lemma 12
Given $p^* \in \Delta_\Theta$, for each $p \in P$, it holds:

Finally, we prove that we can represent each posterior $p^*$ as a convex combination of k-uniform posteriors with a small loss in the sender's expected utility. For $p \in K$ and $Z \in \mathcal{Z}$, let $g^* : \Delta_\Theta \times \mathcal{Z} \to [0, 1]$ be a function such that

Given $p^* \in \Delta_\Theta$, we are interested in bounding the difference in the sender's expected utility when $p^*$ is approximated as a convex combination $\gamma$ of k-uniform posteriors, the sender exploits an α-approximation of $f$, and receivers play $\varepsilon$-persuasive best-responses. Formally,

Lemma 13 Given a $p^* \in \Delta_\Theta$, it holds:

where $\gamma$ is the distribution of $k$ i.i.d. samples from $p^*$.

This shows that such signaling scheme $\phi$ is $\alpha(1 - \delta)$-approximate and $\varepsilon$-persuasive, which are precisely our desiderata, thus concluding the proof.

Assume, by contradiction, that for a given $S_i$ there exist two assignments $\xi_1'$ and $\xi_1''$ such that $w_{(t_4,S_i,\xi_1)} \ge -\varepsilon$ for each $\xi_1 \in \{\xi_1', \xi_1''\}$. Then, $f(x, \xi_1) \ge 1/2 - \varepsilon$, for each $\xi_1 \in \{\xi_1', \xi_1''\}$. Otherwise, we would get $w_{(t_4,S_i,\xi_1)} < \frac{1}{2}\left(\frac{1}{2} - \varepsilon\right) - \frac{1}{2}\left(\frac{1}{2} + \varepsilon\right) = -\varepsilon$ for at least one $\xi_1 \in \{\xi_1', \xi_1''\}$. Let $\tilde{x}$ be the vector such that $\tilde{x}_{i,\ell} := x_{i,\ell} / \sum_{i,\ell} x_{i,\ell}$.

Table 1 Voters' payoffs from voting different candidates

Table 2
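Proposition 6 $\varepsilon$-MFS can be solved in $n^{O(\log n)}$ steps.

Proof Denote by $x^*$ the optimal solution of $\varepsilon$-MFS. Let $\tilde{x} \in \Delta_{n_{col}}$ be the empirical distribution of $k$ i.i.d. samples drawn from probability distribution $x^*$. Moreover, let $w^* := A x^*$ and $\tilde{w} := A \tilde{x}$. By Hoeffding's inequality, we have $\Pr(w^*_i - \tilde{w}_i \ge \varepsilon) \le e^{-2k\varepsilon^2}$ for each $i \in [n_{row}]$. Then, by the union bound, we get $\Pr(\exists i \text{ s.t. } w^*_i - \tilde{w}_i \ge \varepsilon) \le n_{row} \cdot e^{-2k\varepsilon^2}$.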
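A quick numerical check makes the role of the sample size concrete. The sketch below evaluates the union bound $n_{row} \cdot e^{-2k\varepsilon^2}$ stated above for $k = \lceil \log n_{row} / \varepsilon^2 \rceil$; it assumes the logarithm is natural, and the function names are hypothetical.

```python
import math


def sample_size(n_row: int, eps: float) -> int:
    """Number of samples k = ceil(ln(n_row) / eps^2) (natural logarithm assumed)."""
    return math.ceil(math.log(n_row) / eps ** 2)


def union_bound(n_row: int, k: int, eps: float) -> float:
    """Upper bound n_row * exp(-2 k eps^2) on the probability that some row deviates by eps."""
    return n_row * math.exp(-2 * k * eps ** 2)


# Hypothetical instance size and accuracy.
n_row, eps = 1000, 0.1
k = sample_size(n_row, eps)
failure_probability = union_bound(n_row, k, eps)
```

With this choice of $k$ the bound is at most $1/n_{row}$, hence strictly below 1, so at least one k-uniform vector must keep every row within $\varepsilon$ of its value under $x^*$.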

Lemma 6 If $||v_2 - \tilde{v}||_1 > 16\varepsilon$, there exists a row $i$ of type $t_3$ such that $w_i < -\varepsilon$.

In order to satisfy Eq. 6, all rows $i$ of type $t_2$ and $t_3$ have to be such that $w_i \ge -\varepsilon$. Then, by Lemmas 5 and 6, it holds that $||v_1 - \tilde{v}||_1 \le 16\varepsilon$ and $||v_2 - \tilde{v}||_1 \le 16\varepsilon$.

(iii) We show that, for each $S_i$, there exists at most one row $(t_4, S_i, \xi_1)$ for which $w_{(t_4,S_i,\xi_1)} \ge -\varepsilon$.

Lemma 7 For each $S_i$, $i \in [m]$, there exists at most one row $(t_4, S_i, \xi_1)$ such that $w_{(t_4,S_i,\xi_1)} \ge -\varepsilon$.

Then, there are at least $m$ rows $(t_4, S_i, \xi_1)$ such that $w_{(t_4,S_i,\xi_1)} \ge -\varepsilon$ and, by Lemma 7, we get that there exists exactly one such row for each $S_i$, $i \in [m]$. Therefore, for each $S_i$, there exists $\xi^i_1 \in \Xi_1$ such that
Then, we show that the condition in the previous lemma is satisfied with high probability. In particular, we show that given a posterior $p^*$ and a state $\theta$, if we take $k$ i.i.d. samples from $p^*$, then with probability close to 1 the induced posterior $p$ is such that $p_\theta$ is close to $p^*_\theta$. Formally, we prove the following lemma.
Given p * ∈ , for each θ ∈ , and for each i ∈ [k] such that * .i:|i/k−p * θ |≥ /4 p∈K: p θ =i/k where γ is the distribution of k i.i.d.samples from p * .
Funding Open access funding provided by Politecnico di Milano within the CRUI-CARE Agreement. This paper is supported by the ALGADIMAR project funded by PRIN2017 program. This paper is supported by FAIR (Future Artificial Intelligence Research) project, funded by the NextGenerationEU program within the PNRR-PE-AI scheme (M4C2, Investment 1.3, Line on Artificial Intelligence). The authors have no conflicts of interest to declare that are relevant to the content of this article.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Lemma 6 If $||v_2 - \tilde{v}||_1 > 16\varepsilon$, there exists a row $i$ of type $t_3$ such that $w_i < -\varepsilon$.

Proof Lemma 4 implies that, if $||v_2 - \tilde{v}||_1 > 16\varepsilon$, then there exists a set $I \subseteq [8m]$ such that

It follows that there exists a row $(t_3, I)$ such that $w_{t_3,I} < -1/4 - \varepsilon + 1/4 - \varepsilon/2 < -\varepsilon$.

Lemma 7 For each $S_i$, $i \in [m]$, there exists at most one row $(t_4, S_i, \xi_1)$ such that $w_{(t_4,S_i,\xi_1)} \ge -\varepsilon$.