Opinion Dynamics with Limited Information

We study opinion formation games based on the well-known model proposed by Friedkin and Johnsen (FJ model). In today's huge social networks the assumption that in each round agents update their opinions by taking into account the opinions of all their friends is unrealistic. So, we are interested in the convergence properties of simple and natural variants of the FJ model that use limited information exchange in each round and converge to the same stable point. As in the FJ model, we assume that each agent i has an intrinsic opinion s_i ∈ [0, 1] and maintains an expressed opinion x_i(t) ∈ [0, 1] in each round t. To model limited information exchange, we consider an opinion formation process where each agent i meets with one random friend j at each round t and learns only her current opinion x_j(t). The amount of influence j imposes on i is reflected by the probability p_ij with which i meets j.
Then, agent i suffers a disagreement cost that is a convex combination of (x_i(t) − s_i)² and (x_i(t) − x_j(t))². An important class of dynamics in this setting are no regret dynamics, i.e. dynamics that ensure vanishing regret against the disagreement cost experienced by the agents. We show an exponential gap between the convergence rate of no regret dynamics and that of more general dynamics that do not ensure no regret. We prove that no regret dynamics require roughly Ω(1/ε) rounds to be within distance ε from the stable point x* of the FJ model.
On the other hand, we provide an opinion update rule that does not ensure no regret and converges to x* in Õ(log²(1/ε)) rounds. Finally, in our variant of the FJ model, we show that the agents can adopt a simple opinion update rule that ensures no regret against the experienced disagreement cost and results in an opinion vector that converges to the stable point x* of the FJ model within distance ε in poly(1/ε) rounds. In view of our lower bound for no regret dynamics, this rate of convergence is close to best possible.


Introduction
The study of Opinion Formation has a long history (see e.g. [29]). Opinion Formation is a dynamic process in the sense that socially connected people (e.g. family, friends, colleagues) exchange information and this leads to changes in their expressed opinions over time. Today, the advent of the internet and social media makes the study of opinion formation in large social networks even more important; realistic models of how people form their opinions by interacting with each other are of great practical interest for prediction, advertisement etc. In an attempt to formalize the process of opinion formation, several models have been proposed over the years (see e.g. [13,14,18,28]). The common assumption underlying all these models, which dates back to DeGroot [13], is that opinions evolve through a form of repeated averaging of information collected from the agents' social neighborhoods.
Our work builds on the model proposed by Friedkin and Johnsen [18]. The FJ model is a variation of the DeGroot model capturing the fact that consensus on the opinions is rarely reached. According to the FJ model, each person i has a public opinion x_i ∈ [0, 1] and an internal opinion s_i ∈ [0, 1], which is private and invariant over time. There also exists a weighted graph G(V, E) representing a social network, where V stands for the persons (|V| = n) and E for their social relations. Initially, all nodes start with their internal opinion and at each round t, each node i updates her public opinion to

x_i(t + 1) = (Σ_{j∈N_i} w_ij x_j(t) + w_ii s_i) / (Σ_{j∈N_i} w_ij + w_ii),   (1)

a weighted average of the public opinions of her neighbors and her internal opinion, where N_i = {j ∈ V : (i, j) ∈ E} is the set of i's neighbors, the weight w_ij associated with the edge (i, j) ∈ E measures the extent of the influence that j poses on i, and the weight w_ii > 0 quantifies how susceptible i is in adopting opinions that differ from her internal opinion s_i.
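To make the repeated-averaging rule concrete, here is a minimal sketch on a small hypothetical three-agent network; the weights and opinions below are illustrative, not taken from the paper. Iterating the rule drives the public opinions to the stable point of the FJ model:

```python
import numpy as np

def fj_step(x, s, W):
    """One synchronous FJ update, rule (1): each agent i moves to the
    weighted average of her neighbors' public opinions and her own
    internal opinion s_i.  W[i, j] is the influence weight w_ij and
    W[i, i] is the self-confidence weight w_ii > 0."""
    x_new = np.empty_like(x)
    for i in range(len(s)):
        # numerator: sum over j != i of w_ij * x_j, plus w_ii * s_i
        num = W[i] @ x - W[i, i] * x[i] + W[i, i] * s[i]
        x_new[i] = num / W[i].sum()   # denominator: sum_j w_ij + w_ii
    return x_new

# Illustrative 3-agent instance.
s = np.array([0.1, 0.9, 0.5])           # internal opinions
W = np.array([[1.0, 2.0, 0.0],          # row i: weights w_ij, with w_ii on the diagonal
              [2.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])

x = s.copy()                            # all nodes start at their internal opinion
for _ in range(200):                    # linear convergence: 200 rounds suffice here
    x = fj_step(x, s, W)
```

At the end of the loop, x is numerically a fixed point of the update, i.e. the stable point x* that the paper studies.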
The FJ model is one of the most influential models for opinion formation. It has a very simple update rule, making it plausible for modeling natural behavior, and its basic assumptions are aligned with empirical findings on the way opinions are formed [1,32]. At the same time, it admits a unique stable point x* ∈ [0, 1]^n to which it converges with a linear rate [23]. The FJ model has also been studied from a game theoretic viewpoint. Bindel et al. considered its update rule as the minimizer of a quadratic disagreement cost function and, based on it, defined the following opinion formation game [5]. Each node i is a selfish agent whose strategy is the public opinion x_i that she expresses, incurring her a disagreement cost

C_i(x_i, x_{−i}) = Σ_{j∈N_i} w_ij (x_i − x_j)² + w_ii (x_i − s_i)².   (2)

Note that the FJ model is the simultaneous best response dynamics and its stable point x* is the unique Nash equilibrium of the above game. In [5] they quantified its inefficiency with respect to the total disagreement cost. They proved that the Price of Anarchy (PoA) is 9/8 in case G is undirected and w_ij = w_ji. They also provided PoA bounds in the case of unweighted Eulerian directed graphs. We remark that in [5] an alternative framework for studying the way opinions evolve was introduced.
The opinion formation process can be described as the dynamics of an opinion formation game. This framework is much more comprehensive, since different aspects of the opinion formation process can be easily captured by defining suitable games. Subsequent works [3,4,15] considered variants of the above game and studied the convergence properties of the best response dynamics.

Motivation and Our Setting
Many recent works study the Nash equilibrium x* of the opinion formation game defined in [5] from various perspectives. In [10] the PoA bounds were extended to more general classes of directed graphs, while many recently introduced influence maximization problems [2,24,33] are defined with respect to x*. The reason for this scientific interest is evident: the equilibrium x* is considered an appropriate way to model the final opinions formed in a social network, since the well established FJ model converges to it. Our work is motivated by the fact that there are notable cases in which the FJ model is not an appropriate model for the dynamics of the opinions, due to the large amount of information exchange that it implies. More precisely, at each round its update rule (1) requires that every agent learns the opinions of all her social neighbors. In today's large social networks, where users usually have several hundreds of friends, it is highly unlikely that, each day, they learn the opinions of all their social neighbors. In such environments it is far more reasonable to assume that individuals randomly meet a small subset of their acquaintances and that these are the only opinions they learn. Such information exchange constraints render the FJ model unsuitable for modeling the opinion formation process in such large networks and, therefore, it is not clear whether x* captures the limiting behavior of the opinions. In this work we ask:

Question 1 Is the equilibrium x* an adequate way to model the final formed opinions in large social networks? Namely, are there simple variants of the FJ model that require limited information exchange and converge fast to x*? Can they be justified as natural behavior for selfish agents under a game-theoretic solution concept?
To address these questions, one could define precise dynamical processes whose update rules require limited information exchange between the agents and study their convergence properties. Instead of doing so, we describe the opinion formation process in such large networks as the dynamics of a suitable opinion formation game that captures these information exchange constraints. This way we can precisely define which dynamics are natural and, more importantly, study general classes of dynamics (e.g. no regret dynamics) without explicitly defining their update rule. The opinion formation game that we consider is a variant of the game in [5], based on interpreting the weight w_ij as a measure of how frequently i meets j.
Definition 1 For a given opinion vector x ∈ [0, 1]^n, the disagreement cost of agent i is the random variable C_i(x_i, x_{−i}) defined as follows:

- Agent i meets one of her neighbors j with probability p_ij = w_ij / Σ_{j'∈N_i} w_ij'.
- Agent i suffers cost C_i(x_i, x_{−i}) = (1 − α_i)(x_i − x_j)² + α_i (x_i − s_i)², where α_i = w_ii / (Σ_{j∈N_i} w_ij + w_ii).

Note that the expected disagreement cost of each agent in the above game is the same as the disagreement cost in [5] (scaled by Σ_{j∈N_i} w_ij + w_ii). Moreover, its Nash equilibrium, with respect to the expected disagreement cost, is x*. This game provides us with a general template for all the dynamics examined in this paper. At round t, each agent i selects an opinion x_i(t) and suffers a disagreement cost based on the opinion of the neighbor that she randomly met. At the end of round t, she is informed only about the opinion and the index of this neighbor and may use this information to update her opinion in the next round. Obviously, different update rules lead to different dynamics; however, all of them respect the information exchange constraints: at every round each agent learns the opinion of just one of her neighbors. Question 1 now takes the following more concrete form.

Question 2
Can the agents update their opinions according to the limited information that they receive so that the produced opinion vector x(t) converges to the equilibrium x*? How is the convergence rate affected by the limited information exchange? Are there dynamics that ensure that the cost the agents experience is minimal?
In what follows, we are mostly concerned with the dependence of the rate of convergence on the distance ε from the equilibrium x*. Thus, we suppress the dependence on other parameters, such as the size n of the graph; we remark that the dependence of our dynamics on these parameters is in fact rather good (see Sect. 2), and we do this only for clarity of exposition.
Definition 2 (Informal) We say that a dynamics converges slowly (resp. fast) to the equilibrium x* if it requires poly(1/ε) (resp. poly(log(1/ε))) rounds to be within (expected) error ε of x*.

Contribution
The major contribution of the paper is proving an exponential separation between the convergence rate of no regret dynamics and the convergence rate of more general dynamics produced by update rules that do not ensure no regret.
No regret dynamics are produced by update rules that ensure no regret to any agent that adopts them. Namely, the total disagreement cost of an agent that follows such a rule is close to the total disagreement cost that she would experience by selecting the best fixed opinion in hindsight. The latter must hold regardless of the way the other agents update their opinions and of the neighbors that the agent gets to meet. This powerful property renders no regret dynamics natural dynamics for describing the behavior of agents [8,16,31,37]. We prove that if all the agents adopt an update rule that ensures no regret, then there exists an instance of the game such that the produced opinion vector x(t) requires roughly Ω(1/ε) rounds to be ε-close to x*. No regret comes at the price of slow convergence because it provides robust guarantees. Agents who adopt no regret update rules suffer minimal total disagreement cost even if the other agents play irrationally or adversarially. In order to provide such strong guarantees, no regret rules must depend only on the opinions that the agent observes and must not take into account the weights w_ij of the outgoing edges (see Sect. 5). We call update rules with the latter property graph oblivious. In Sect. 5 we use a novel information theoretic argument to prove the aforementioned lower bound for this more general class.
In Sect. 6, we present a simple update rule whose resulting dynamics converges fast, i.e. the opinion vector x(t) is ε-close to x* in O(log²(1/ε)) rounds. The reason that the previous lower bound doesn't apply is that this rule does not ensure no regret to the agents that adopt it. In fact, there is a very simple example with two agents, in which the first follows the rule while the second selects her opinions adversarially, where the first agent experiences regret (see Example 1 in Sect. 6).
We introduce an intuitive no regret update rule and we show that if all agents adopt it, the resulting opinion vector x(t) converges to x*. Our rule is a Follow the Leader algorithm, meaning that at round t, each agent updates her opinion to the minimizer of the total disagreement cost that she experienced until round t − 1. It also has a very simple form: it is roughly the time average of the opinions that the agent observes. In Sect. 3, we bound its convergence rate and show that in order to achieve distance ε from x*, poly(1/ε) rounds are sufficient. In view of our lower bound, this rate is close to best possible. In Sect. 4, we prove its no regret property. This can be derived from the more general results in [25]; however, we give a short and simple proof that may be of independent interest.
In conclusion, our results reveal that the equilibrium x* is a robust choice for modeling the limiting behavior of the opinions of agents since, even in our limited information setting, there exist simple and natural dynamics that converge to it. The convergence rate crucially depends on whether the agents act selfishly, i.e. are only concerned with their individual disagreement cost. We present an update rule that selfish agents can adopt (a no regret update rule) and show that the resulting opinion vector converges to x*, but at a slow rate, while, for non-selfish agents, the update rule in Sect. 6 leads to dynamics with a fast convergence rate.

Related Work
There exists a large amount of literature concerning the FJ model. Many recent works [3,4,12,15] bound the inefficiency of equilibria in variants of the opinion formation game defined in [5]. In [23] the convergence time of the FJ model is bounded for special graph topologies. In [3], a variant of the opinion formation game, in which social relations depend on the expressed opinions, is studied. They prove that the discretized version of the above game admits a potential function and thus best-response dynamics converges to the Nash equilibrium. Convergence results for other discretized variants of the FJ model can be found in [17,40]. In [19] convergence results are provided for limited information variants of the Hegselmann–Krause model [28] and the FJ model. Although the considered limited information variant of the FJ model is very similar to ours, their convergence results are much weaker, since they concern only the expected value of the opinion vector.
Other works that relate to ours concern the convergence properties of dynamics based on no regret learning algorithms. In [20,21,36,37] it is proved that in a finite n-person game, if each agent updates her mixed strategy according to a no regret algorithm, the resulting time-averaged strategy vector converges to a Coarse Correlated Equilibrium. The convergence properties of no regret dynamics for games with infinite strategy spaces were considered in [16]. They proved that for a large class of games with concave utility functions (socially concave games), the time-averaged strategy vector converges to a Pure Nash Equilibrium (PNE). More recent work investigates a stronger notion of convergence of no regret dynamics. In [11] it is shown that, in n-person finite generic games that admit a unique Nash equilibrium, the strategy vector converges locally and fast to it. They also provide conditions for global convergence. Our results fit in this line of research, since we show that for a game with an infinite strategy space, the strategy vector (and not just its time average) converges to the Nash equilibrium x*.
No regret dynamics in limited information settings have recently received substantial attention from the scientific community, since they provide realistic models for the practical applications of game theory. Perfect payoff information is rare in practice; agents act based on random or noisy past-payoff observations. Kleinberg et al. [30] treated load-balancing in distributed systems as a repeated game and analyzed the convergence properties of no regret learning algorithms under the full information assumption that each agent learns the load of every machine. In a subsequent work [31], the same authors consider the same problem in a limited information setting (the "bulletin board model"), in which each agent learns the load of just the machine that served her. Most relevant to ours are the works [6,11,27,35], which examine the convergence properties of no regret learning algorithms when the agents observe their payoffs with some additive zero-mean random noise. In our limited information setting the agents experience a random disagreement cost whose expected value equals the actual cost. The main difference is that our noise is not additive but due to a sampling process.

Our Results and Techniques
We have adopted the convention of using ln to denote the natural logarithm. We also use log freely without specifying a base inside big-O notation or when an arbitrary constant C is involved. As previously mentioned, an instance of the game in [5] is also an instance of the game of Definition 1. Following the notation introduced earlier, we have that p_ij = w_ij / Σ_{j'∈N_i} w_ij' if j ∈ N_i and p_ij = 0 otherwise. Moreover, α_i = w_ii / (Σ_{j∈N_i} w_ij + w_ii) > 0, since w_ii > 0 by the definition of the game in [5]. If an agent i does not have outgoing edges (N_i = ∅), then p_ij = 0 for all j. Therefore, Σ_{j=1}^n p_ij = 0 and α_i = 1 if N_i = ∅, while Σ_{j=1}^n p_ij = 1 and α_i ∈ (0, 1) otherwise. For simplicity we adopt the following notation for an instance of the game of Definition 1.
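The relation between the weights w_ij and the pair (p_ij, α_i) can be checked numerically. The sketch below uses a small hypothetical instance (all weights and opinions are illustrative, not from the paper) and verifies that the expected sampled cost of Definition 1 equals the full-information cost of [5] scaled by Σ_{j∈N_i} w_ij + w_ii:

```python
import numpy as np

# Illustrative weights for a single agent 0 with neighbors 1, 2, 3.
w = np.array([0.0, 2.0, 1.0, 1.0])    # w_0j for j = 0..3 (w_00 kept separately)
w_self = 1.0                          # w_00 > 0, the self-confidence weight
x = np.array([0.4, 0.1, 0.8, 0.6])    # current public opinions
s0 = 0.3                              # agent 0's internal opinion

p = w / w.sum()                       # meeting probabilities p_0j
alpha = w_self / (w.sum() + w_self)   # self-confidence alpha_0

def sampled_cost(j):
    """Cost agent 0 suffers in a round where she happens to meet neighbor j."""
    return (1 - alpha) * (x[0] - x[j]) ** 2 + alpha * (x[0] - s0) ** 2

# Expectation over the random meeting ...
expected = sum(p[j] * sampled_cost(j) for j in range(1, 4))

# ... equals the cost of [5], scaled by sum_j w_0j + w_00.
full_info = (w @ (x[0] - x) ** 2 + w_self * (x[0] - s0) ** 2) / (w.sum() + w_self)
```

This equality is exactly why the Nash equilibrium of the sampled game, with respect to expected cost, coincides with x*.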

Definition 3
We denote an instance of the opinion formation game of Definition 1 as I = (P, s, α), where P is an n × n matrix with non-negative elements p_ij, with p_ii = 0 and Σ_{j=1}^n p_ij either 0 or 1, s ∈ [0, 1]^n is the internal opinion vector, and α ∈ (0, 1]^n is the self confidence vector. An instance I = (P, s, α) is also an instance of the FJ model via the update rule (1). It also defines the opinion vector x* ∈ [0, 1]^n, which is the stable point of the FJ model and the Nash equilibrium of the game in [5].
Definition 4 For a given instance I = (P, s, α), the equilibrium x* ∈ [0, 1]^n is the unique solution of the following linear system: for every i ∈ V,

x*_i = (1 − α_i) Σ_{j∈N_i} p_ij x*_j + α_i s_i.

The fact that the above linear system always admits a unique solution follows from matrix norm properties. Throughout the paper we study dynamics of the game of Definition 1. We denote by W_i^t the neighbor that agent i met at round t, which is a random variable whose probability distribution is determined by the instance I = (P, s, α) of the game, P[W_i^t = j] = p_ij. Another parameter of an instance I that we often use is ρ = min_{i∈V} α_i.
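Since x* satisfies the FJ fixed-point equations x*_i = (1 − α_i) Σ_{j∈N_i} p_ij x*_j + α_i s_i, it can be computed directly: in matrix form the system reads (I − (I − diag(α))P) x* = diag(α) s. A minimal sketch on a hypothetical 3-agent instance (P, s, α are illustrative values):

```python
import numpy as np

P = np.array([[0.0, 1.0, 0.0],        # row-stochastic meeting probabilities p_ij
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
s = np.array([0.2, 0.9, 0.4])         # internal opinions
alpha = np.array([0.3, 0.5, 0.2])     # self-confidence vector

# (I - (I - diag(alpha)) P) x* = diag(alpha) s.  The matrix is strictly
# diagonally dominant (each row of (1 - alpha_i) * p_ij sums to 1 - alpha_i < 1),
# so the system always has a unique solution, matching Definition 4.
n = len(s)
A = np.eye(n) - np.diag(1 - alpha) @ P
x_star = np.linalg.solve(A, alpha * s)
```

The solution can be verified to satisfy the fixed-point equations and to lie in [0, 1]^n.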
In Sect. 3, we examine the convergence properties of the opinion vector x(t) when all agents update their opinions according to the Follow the Leader principle. Since each agent i must select x_i(t) before knowing which of her neighbors she will meet and what opinion her neighbor will express, this update rule says "play the best according to what you have observed". For a given instance (P, s, α) of the game, the Follow the Leader dynamics x(t) is defined in Dynamics 1 and Theorem 1 shows its convergence rate to x*.

Theorem 1 Let I = (P, s, α) be an instance of the opinion formation game of Definition 1 with equilibrium x* ∈ [0, 1]^n. The opinion vector x(t) ∈ [0, 1]^n produced by the Follow the Leader dynamics after t ≥ 6 rounds satisfies

E[‖x(t) − x*‖_∞] ≤ C (ln(nt))^{3/2} / t^{min(ρ, 1/2)},

where ρ = min_{i∈V} α_i and C is a universal constant.

Dynamics 1 Follow the Leader dynamics
1: Initially x_i(0) = s_i for all agents i.
2: At round t ≥ 0, each agent i:
3: Expresses her opinion x_i(t).
4: Meets the random neighbor W_i^t and learns her opinion x_{W_i^t}(t), suffering cost (1 − α_i)(x_i(t) − x_{W_i^t}(t))² + α_i(x_i(t) − s_i)².
5: Updates her opinion to

x_i(t + 1) = (1 − α_i) (1/(t + 1)) Σ_{τ=0}^{t} x_{W_i^τ}(τ) + α_i s_i.   (3)
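The FTL dynamics is straightforward to simulate, since the update (each agent's opinion being a convex combination of the running average of observed opinions and her internal opinion) only needs one running sum per agent. The sketch below, on an illustrative 3-agent instance with a fixed random seed, runs the dynamics and compares the outcome to the equilibrium x* obtained by solving the fixed-point system:

```python
import numpy as np

rng = np.random.default_rng(1)

P = np.array([[0.0, 0.5, 0.5],        # illustrative meeting probabilities
              [1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
s = np.array([0.1, 0.8, 0.5])         # internal opinions
alpha = np.array([0.5, 0.6, 0.5])     # self-confidence vector
n = len(s)

# Equilibrium of the FJ model, for reference.
x_star = np.linalg.solve(np.eye(n) - np.diag(1 - alpha) @ P, alpha * s)

x = s.copy()                  # x_i(0) = s_i
obs_sum = np.zeros(n)         # running sum of observed opinions, one per agent
T = 20000
for t in range(1, T + 1):
    meet = [rng.choice(n, p=P[i]) for i in range(n)]   # one random neighbor each
    obs_sum += x[meet]                                 # learn that neighbor's opinion
    x = (1 - alpha) * obs_sum / t + alpha * s          # FTL update, rule (3)

err = np.abs(x - x_star).max()
```

With this many rounds the empirical averages concentrate and `err` is small, illustrating the poly(1/ε) convergence of Theorem 1.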
In Sect. 4 we argue that, apart from its simplicity, update rule (3) ensures no regret to any agent that adopts it and, therefore, the FTL dynamics can be considered natural dynamics for selfish agents. Since each agent i selfishly wants to minimize the disagreement cost that she experiences, it is natural to assume that she selects x_i(t) according to a no regret algorithm for the online convex optimization problem in which the adversary chooses a function of the form f(x) = (1 − α)(x − b)² + α(x − s)². In Theorem 2 we prove that Follow the Leader is a no regret algorithm for this OCO problem. We remark that this does not hold if the adversary can pick functions from a different class (see e.g. Chapter 5 in [26]).

Theorem 2 (Informal) Follow the Leader is a no regret algorithm for the above OCO problem.
On the positive side, the FTL dynamics converges to x* and its update rule is simple and ensures no regret to the agents. On the negative side, its convergence rate is outperformed by the rate of the FJ model. For a fixed instance I = (P, s, α), the FTL dynamics converges at rate O(1/t^{min(ρ,1/2)}), while the FJ model converges at rate O(e^{−ρt}) [23].
Question 3 Can the agents adopt other no regret update rules such that the resulting dynamics converges fast to x * ?
The answer is no. In Sect. 5, we prove that fast convergence cannot be established for any no regret dynamics. The reason that the FTL dynamics converges slowly is that rule (3) depends only on the opinions of the neighbors that agent i meets, α_i, and s_i. This is also true for any update rule that ensures no regret to the agents (see Sect. 5). As already mentioned, we call this larger class of update rules graph oblivious, and we prove that fast convergence cannot be established for graph oblivious dynamics.

Definition 5 (graph oblivious update rule) A graph oblivious update rule A is a sequence of functions (A_t)_{t=0}^∞, where A_t maps the opinions that an agent has observed during the first t rounds, together with her parameters s_i and α_i, to her opinion in round t.

Definition 6 (graph oblivious dynamics) Let A be a graph oblivious update rule. For a given instance I = (P, s, α), the rule A produces a graph oblivious dynamics x^A(t) defined as follows:
- Initially, each agent i selects her opinion x_i^A(0) = A_0(s_i, α_i).
- At round t ≥ 1, each agent i updates her opinion to x_i^A(t) = A_t(x_{W_i^0}(0), …, x_{W_i^{t−1}}(t − 1), s_i, α_i), where W_i^t is the neighbor that i meets at round t.
Theorem 3 states that for any graph oblivious dynamics there exists an instance I = (P, s, α) for which roughly Ω(1/ε) rounds are required to achieve convergence within error ε.
Theorem 3 Let A be a graph oblivious update rule that all agents use to update their opinions. For any c > 0, there exists an instance I = (P, s, α) such that the opinion vector x^A(t) produced by A for the instance I requires Ω(1/ε^{1−c}) rounds to satisfy E[‖x^A(t) − x*‖_∞] ≤ ε.
To prove Theorem 3, we show that graph oblivious rules whose dynamics converge fast imply the existence of estimators for Bernoulli distributions with "small" sample complexity.The key part of the proof lies in Lemma 6, in which it is proven that such estimators cannot exist.We also briefly discuss two well-known sample complexity lower bounds from the statistics literature and explain why they do not work in our case.
In Sect. 6, we present a simple update rule that achieves error rate e^{−Ω(√t)}. This update rule is a function of the opinions and the indices of the neighbors that i met, s_i, α_i, and the i-th row of the matrix P. Obviously this rule is not graph oblivious, due to its dependence on the i-th row and the indices, and thus it does not ensure no regret to an agent that adopts it (see Example 1 in Sect. 6). However, it reveals that slow convergence is not a generic property of limited information dynamics, but comes with the assumption that agents act selfishly.

Convergence Rate of FTL Dynamics
In this section we prove Theorem 1, which bounds the convergence time of the FTL dynamics to the unique equilibrium point x*. Notice that for an instance I = (P, s, α), the opinion vector x(t) ∈ [0, 1]^n of the FTL dynamics (see Dynamics 1) can be written equivalently as follows:
- Initially, all agents adopt their internal opinion, x_i(0) = s_i.
- At round t ≥ 1, each agent i updates her opinion to

x_i(t) = (1 − α_i) (1/t) Σ_{τ=0}^{t−1} x_{W_i^τ}(τ) + α_i s_i,

where W_i^τ is the neighbor that i met at round τ. Since the opinion vector x(t) is a random vector, the convergence metric used in Theorem 1 is E[‖x(t) − x*‖_∞], where the expectation is taken over the random meetings of the agents. At first we present a high level idea of the proof. We remind that the unique equilibrium x* ∈ [0, 1]^n of the instance I = (P, s, α) satisfies, for each agent i ∈ V,

x*_i = (1 − α_i) Σ_{j∈N_i} p_ij x*_j + α_i s_i.

Since our metric is E[‖x(t) − x*‖_∞], we can use the above equations to bound it. If the empirical average (1/t) Σ_{τ=0}^{t−1} x_{W_i^τ}(τ) matched its expectation for all t ≥ 1, then with simple algebraic manipulations one could prove that ‖x(t) − x*‖_∞ ≤ e(t), where e(t) satisfies the recursive equation

e(t) = (1 − ρ) (1/t) Σ_{τ=0}^{t−1} e(τ),

where ρ = min_{i∈V} α_i. It follows that ‖x(t) − x*‖_∞ ≤ 1/t^ρ, meaning that x(t) converges to x*. Obviously, the latter assumption does not hold; however, since the W_i^τ are independent random variables with P[W_i^τ = j] = p_ij, the deviation of the empirical average from its expectation tends to 0 with probability 1. In Lemma 1 we use this fact to obtain a similar recursive equation for e(t) and then in Lemma 2 we upper bound its solution.
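The behavior of the idealized recursion can be checked numerically: if e(t) equals (1 − ρ) times the average of the past errors, then e(t) decays like t^{−ρ}, so doubling t should shrink e(t) by a factor of roughly 2^{−ρ}. A small sketch with the illustrative value ρ = 0.5:

```python
rho = 0.5
e = [1.0]            # e(0) = 1
S = 1.0              # running sum e(0) + ... + e(t-1)
for t in range(1, 4001):
    e.append((1 - rho) * S / t)   # e(t) = (1 - rho) * average of past errors
    S += e[-1]

# If e(t) ~ c * t**(-rho), then e(4000)/e(2000) should be close to 2**(-rho).
ratio = e[4000] / e[2000]
```

The measured ratio matches 2^{−ρ} ≈ 0.707 closely, consistent with the claimed 1/t^ρ decay.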

Lemma 1 Let e(t) be the solution of the recursion

e(t) = (1 − ρ) ((1/t) Σ_{τ=0}^{t−1} e(τ) + δ(t)),

where e(0) = ‖x(0) − x*‖_∞, δ(t) = √(ln(π²nt²/(6p))/t) and ρ = min_{i∈V} α_i. Then, with probability at least 1 − p, ‖x(t) − x*‖_∞ ≤ e(t) for all t ≥ 1.

Proof At first we prove that, with probability at least 1 − p, for all t ≥ 1 and all agents i,

|(1/t) Σ_{τ=0}^{t−1} x*_{W_i^τ} − Σ_{j∈N_i} p_ij x*_j| ≤ δ(t).   (4)

The W_i^τ are independent random variables with P[W_i^τ = j] = p_ij and E[x*_{W_i^τ}] = Σ_{j∈N_i} p_ij x*_j. By Hoeffding's inequality, the probability that (4) fails for a fixed agent i and round t is at most 6p/(π²nt²). To bound the probability of error for all rounds t ≥ 1 and all agents i, we apply the union bound; since Σ_{t≥1} 1/t² = π²/6, the total error probability is at most p. As a result, with probability at least 1 − p, inequality (4) holds for all t ≥ 1 and all agents i. We now prove our claim by induction. Let ‖x(τ) − x*‖_∞ ≤ e(τ) for all τ ≤ t − 1. Then

x_i(t) − x*_i = (1 − α_i) ((1/t) Σ_{τ=0}^{t−1} x_{W_i^τ}(τ) − Σ_{j∈N_i} p_ij x*_j) ≤ (1 − ρ) ((1/t) Σ_{τ=0}^{t−1} e(τ) + δ(t)) = e(t),

where we bound each x_{W_i^τ}(τ) by x*_{W_i^τ} + e(τ) using the induction hypothesis and then apply inequality (4). Similarly, we can prove that x*_i − x_i(t) ≤ e(t). As a result, ‖x(t) − x*‖_∞ ≤ e(t) and the induction is complete. Therefore, with probability at least 1 − p, ‖x(t) − x*‖_∞ ≤ e(t) for all t ≥ 1.
Now that we have obtained the recursive equation for the error, we can solve it using straightforward computation.The idea is to express the term e(t + 1) in terms of the previous term e(t) and to apply this expression repeatedly to obtain a formula for e(t).The main technical difficulty is upper bounding the sums that arise during this computation.This is done in the following lemma.
Lemma 2 Let e(t) be as in Lemma 1 and let D be a constant depending on n and p. Then, for all t ≥ 1,

e(t) ≤ 2√5 (ln(Dt))^{3/2} / t^{min(ρ, 1/2)}.

Proof Observe that for all t ≥ 0 the function e(t) satisfies the recursive relation of Lemma 1; for t = 0 the claim can be checked directly. Observe that for D > e^{2.5}, δ(t) is decreasing for all t ≥ 1. Also note that (t + 1)/t ≤ 2 for all t ≥ 1 and ln t ≤ ln(t + 1). Thus, from equations (7) and (8) we get that for all t ≥ 0,

e(t + 1) ≤ e(t)(1 − ρ/(t + 1)) + δ(t)/(t + 1).   (9)

Now that we have expressed e(t + 1) in terms of e(t), we can apply this expression repeatedly to obtain a formula for e(t). We denote by H_t the t-th partial sum of the harmonic series.
To simplify notation, we introduce shorthand for the products that arise. In the following, we will make heavy use of the elementary inequality 1 − x ≤ e^{−x}, which holds for all x. By "unrolling" the recurrence of Eq. (9) we obtain a formula for e(t), and by using (10) we can bound the products that appear in it. We now use the following well known upper and lower bounds for H_t, which hold for all t ≥ 1 and can be found on page 2 of [22]:

γ + ln t ≤ H_t ≤ γ + ln t + 1/(2t),

where γ is the Euler–Mascheroni constant. Applying the lower bound for H_t and the upper bound for H_{t+1} and putting everything together, we obtain, for all t and all τ ≤ t, a comparison of the corresponding products up to constant factors. Thus, we obtain a bound on e(t) whose right hand side is a sum of decreasing terms. A standard way of bounding a sum of decreasing terms is with the corresponding Riemann integral. To see that the relevant function is decreasing, notice that its derivative is negative for all τ ≥ 1, where we use the fact that ρ < 1. To bound the integral in (12), we have to distinguish cases for ρ. Intuitively, if ρ is small, then the fraction decays faster than 1/t, which translates to the overall integral being polylogarithmic. If ρ is large, then a polynomial term with a small exponent might arise in the calculation.
Since ln(Dt) > ln t for all t ≥ 1, and using the facts that e(0) ≤ 1 and ln D > 0, combining inequalities (13) and (14) yields that for all t ≥ 1,

e(t) ≤ 2√5 (ln(Dt))^{3/2} / t^{min(ρ, 1/2)},

which proves the first claim of the lemma. For the second claim, we would like a further inequality to be satisfied: for t ≥ 6, the right hand side is at most 1 + 2/3 and, numerically, ln t ≥ 1 + 2/3 for all t ≥ 6. Thus, the second inequality of the lemma follows for t ≥ 6.
An interesting consequence of Lemma 2 is that the rate of convergence is never better than 1/ √ t regardless of the value of ρ.In Sect. 5 we provide evidence that no reasonable protocol can achieve a better convergence rate.
We are now ready to prove Theorem 1.
Theorem 1 Let I = (P, s, α) be an instance of the opinion formation game of Definition 1 with equilibrium x* ∈ [0, 1]^n. For all t ≥ 6, the opinion vector x(t) ∈ [0, 1]^n produced by update rule (3) after t rounds satisfies the stated bound, where ρ = min_{i∈V} α_i and C is a universal constant.
Proof By Lemma 1 we have that for all t ≥ 1 and p ∈ [0, 1], the stated bound holds for some universal constant C and all t ≥ 6. Hence, FTL dynamics converges to the same equilibrium point as the original FJ model, albeit more slowly. In the next section we justify why this strategy is a natural one for the players to adopt, given that they operate in an adversarial environment.
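The slow convergence of Theorem 1 can be illustrated on a toy instance. The sketch below *assumes* that update rule (3) is the Follow-the-Leader rule in which each agent best-responds to her intrinsic opinion and the running average of the opinions she has observed; this is a hypothetical reconstruction of the rule, with illustrative parameters. Two agents always meet each other (p_12 = p_21 = 1), so the dynamics is deterministic:

```python
# Two agents, each always meeting the other. Assumed FTL form of rule (3):
#   x_i(t) = (1 - a_i) * s_i + a_i * (average of opinions observed so far)
s = [0.0, 1.0]      # intrinsic opinions s_i
a = [0.5, 0.5]      # weights alpha_i on the social term
x = s[:]            # expressed opinions, x_i(0) = s_i
sums = [0.0, 0.0]   # running sums of observed neighbor opinions

T = 5000
for t in range(1, T + 1):
    # both agents observe each other's current opinion simultaneously
    sums[0] += x[1]
    sums[1] += x[0]
    x = [(1 - a[i]) * s[i] + a[i] * sums[i] / t for i in range(2)]

# FJ equilibrium of this instance: x* = (1/3, 2/3)
assert abs(x[0] - 1/3) < 0.05 and abs(x[1] - 2/3) < 0.05
```

The iterates approach the FJ stable point (1/3, 2/3), but only polynomially fast, in line with the theorem.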

Follow the Leader Ensures No Regret
In this section we provide rigorous definitions of no regret algorithms and explain why update rule (3) ensures no regret to any agent that repeatedly plays the game of Definition 1. Based on the cost that the agents experience, we consider an appropriate Online Convex Optimization (OCO) problem. This problem can be viewed as a "game" played between an adversary and a player. At round t ≥ 0:

1. the player selects a value x_t ∈ [0, 1];
2. the adversary observes x_t and selects a b_t ∈ [0, 1];
3. the player suffers a cost that is a convex combination of (x_t − s)^2 and (x_t − b_t)^2, where s, α are constants in [0, 1].

The goal of the player is to pick x_t based on the history (b_0, ..., b_{t−1}) in a way that minimizes her total cost. In general, different OCO problems are defined by a set of functions F from which the adversary chooses and a feasibility set K from which the player picks her value (see [26] for an introduction to the OCO framework). In our case, the feasibility set is K = [0, 1] and the set of functions is F_{s,α}. As a result, each selection of the constants s, α leads to a different OCO problem.

Definition 7
An algorithm A for the OCO problem with F_{s,α} and K = [0, 1] is a sequence of functions (A_t)_{t=0}^∞, where A_t : [0, 1]^t → [0, 1].

Definition 8 An algorithm A is no regret for the OCO problem with F_{s,α} and K = [0, 1] if and only if, for all sequences (b_t)_{t=0}^∞ that the adversary may choose, the regret condition below holds. Informally, if the player selects the value x_t according to a no regret algorithm, then she does not regret not having played any fixed value, no matter what the choices of the adversary are. Theorem 2 states that Follow the Leader, i.e. the update rule below, is a no regret algorithm for all the OCO problems with F_{s,α}.
Returning to the dynamics of the game in Definition 1, it is reasonable to assume that each agent i selects x_i(t) according to a no regret algorithm A_i for the OCO problem with F_{s_i,α_i}, since by Definition 8 her regret vanishes. The latter means that the time-averaged total disagreement cost that she suffers is close to the time-averaged cost of expressing the best fixed opinion, and this holds regardless of the opinions of the neighbors that i meets. In other words, even if the other agents selected their opinions maliciously, her total experienced cost would still be, in this sense, minimal. From this perspective, update rule (3) is a rational choice for selfish agents, and as a result FTL dynamics is a natural limited information variant of the FJ model. We would like to prove the following.
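The OCO game and the vanishing-regret property can be sketched numerically. The code below *assumes* the per-round cost has the form f(x, b) = (1 − α)(x − s)^2 + α(x − b)^2 (our reconstruction of the convex combination described above, with illustrative constants), and plays Follow the Leader against a deterministic alternating adversary:

```python
# FTL against an adversarial sequence b_t in the assumed OCO problem
s, a = 0.3, 0.7                      # illustrative constants s, alpha
T = 2000
bs = [t % 2 for t in range(T)]       # adversary alternates 0, 1, 0, 1, ...

total_ftl = 0.0
seen_sum = 0.0
x = s                                # x_0: play the intrinsic opinion first
for t, b in enumerate(bs):
    total_ftl += (1 - a) * (x - s) ** 2 + a * (x - b) ** 2
    seen_sum += b
    # FTL: exact minimizer of the cumulative quadratic cost seen so far
    x = (1 - a) * s + a * seen_sum / (t + 1)

# best fixed value in hindsight (closed form, since costs are quadratic)
x_star = (1 - a) * s + a * sum(bs) / T
total_best = sum((1 - a) * (x_star - s) ** 2 + a * (x_star - b) ** 2
                 for b in bs)

avg_regret = (total_ftl - total_best) / T
assert avg_regret < 0.01             # time-averaged regret vanishes
```

Even against this adversary, the average regret of FTL shrinks to zero, which is exactly the guarantee Definition 8 asks for.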

Theorem 2 Consider the function f
We now present the key steps of the proof of Theorem 2. We first prove that a similar strategy, which also takes into account the value b_t, admits no regret (Lemma 3). Obviously, knowing the value b_t before selecting x_t is in direct contrast with the OCO framework; however, the no regret property of this auxiliary algorithm extends easily to establish the no regret property of Follow the Leader. Theorem 2 then follows by direct application of Lemma 4.

Lemma 3 Let (b t ) ∞ t=0 be an arbitrary sequence with b
The last inequality follows from the preceding fact. We can now see why Follow the Leader admits no regret. Since the cost incurred by the sequence y_t is at most that of the best fixed value, we can compare the cost incurred by x_t with that of y_t. Since the functions in F_{s,α} are quadratic, the extra term f(x, b_t) that y_t takes into account does not dramatically change the minimizer of the total sum; namely, x_t and y_t are relatively close. Hence, the costs incurred by the two sequences are not very different.
Proof We first prove that the two sequences are close; namely, the displayed bound holds for all t. By definition, the stated identities hold, and the last inequality follows from the fact that b_τ ∈ [0, 1]. We now use inequality (15) to bound the difference f(x_t, b_t) − f(y_t, b_t). Since f is a quadratic function, the bound follows from straightforward calculations.
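The closeness of the two sequences can be checked directly. Under the assumed quadratic cost f(x, b) = (1 − α)(x − s)^2 + α(x − b)^2 (our reconstruction, with illustrative parameters), both minimizers have closed forms, and their gap is at most α/(t + 1):

```python
# x_t minimizes sum_{tau < t} f(., b_tau); y_t additionally sees b_t
s, a = 0.2, 0.8
bs = [((7 * t) % 5) / 4 for t in range(200)]   # arbitrary values in [0, 1]

prefix = 0.0
for t in range(1, len(bs)):
    prefix += bs[t - 1]
    x_t = (1 - a) * s + a * prefix / t                   # FTL iterate
    y_t = (1 - a) * s + a * (prefix + bs[t]) / (t + 1)   # "be the leader"
    # difference is a * (t*b_t - prefix) / (t*(t+1)), at most a/(t+1)
    assert abs(x_t - y_t) <= a / (t + 1) + 1e-12
```

The O(1/t) gap is the inequality (15) used above: one extra quadratic term barely moves the minimizer of a sum of t quadratics.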
We are now ready to prove that FTL-dynamics has the no regret property.
Theorem 2 (restated) Consider the function f defined above. Proof Theorem 2 follows easily from Lemma 3. In the next section, we prove that FTL dynamics is essentially the fastest possible no regret protocol for the problem of opinion formation.
Definition 9 (no regret dynamics) Consider a collection of no regret algorithms such that for each (s, α) ∈ [0, 1]^2, a no regret algorithm A_{s,α} for the OCO problem with F_{s,α} and K = [0, 1] is selected. For a given instance I = (P, s, α), this selection produces the no regret dynamics x(t) defined as follows:

- Initially, each agent i selects her opinion x_i(0) = A_0^{s_i,α_i}(s_i, α_i).
- At round t ≥ 1, each agent i selects her opinion according to A^{s_i,α_i}_t, where W_i^t is the neighbor that i meets at round t.

Such a selection of no regret algorithms can be encoded as a graph oblivious update rule. Theorem 3 then applies and establishes the existence of an instance I = (P, s, α) on which the produced x(t) converges only slowly to x*.
The rest of the section is dedicated to proving Theorem 3. In Lemma 5 we show that any graph oblivious update rule A can be used as an estimator of the parameter p ∈ [0, 1] of a Bernoulli random variable. Since we prove Theorem 3 by a reduction to an estimation problem, we first briefly introduce some definitions and notation. For simplicity, we restrict the definitions of estimators and risk to the case of estimating the parameter p of a Bernoulli random variable. Given t independent samples from a Bernoulli random variable B(p), an estimator is an algorithm that takes these samples as input and outputs an answer in [0, 1].

Definition 10 An estimator θ = (θ_t)_{t=1}^∞ is a sequence of functions θ_t : {0, 1}^t → [0, 1].

Perhaps the first estimator that comes to mind is the sample mean, θ_t = (1/t) Σ_{i=1}^t X_i. To measure the efficiency of an estimator we define its risk, which corresponds to the expected error of the estimator.

Definition 11 Let P be a Bernoulli distribution with mean p and let P^t be the corresponding t-fold product distribution. The risk of an estimator θ is defined accordingly; we write E_p[|θ_t − p|] for brevity. The risk E_p[|θ_t − p|] quantifies the error of the estimated value p̂ = θ_t(Y_1, ..., Y_t) relative to the real parameter p as the number of samples t grows. Since p is unknown, any meaningful estimator θ = (θ_t)_{t=1}^∞ must guarantee that lim_{t→∞} E_p[|θ_t − p|] = 0 for all p. For example, the sample mean has error rate O(1/√t).

Fig. 1 An instance of an opinion formation game in which any algorithm for approximating the equilibrium can be used to construct an estimator for the mean of a Bernoulli random variable

Lemma 5 Let A be a graph oblivious update rule such that the stated convergence guarantee holds for all instances I = (P, s, α). Then there exists an estimator with the corresponding error rate.

Proof We construct an estimator θ^A = (θ^A_t)_{t=1}^∞ using the update rule A. Consider the instance I_p described in Fig.
1. By straightforward computation, the equilibrium point of this instance is x*_c = p/3, x*_1 = p/6 + 1/2, x*_0 = p/6. Now consider the opinion vector x^A(t) produced by the update rule A on the instance I_p, and note the relation that holds for t ≥ 1. The key observation is that the opinion vector x^A(t) is a deterministic function of the index sequence W_c^0, ..., W_c^{t−1} and does not depend on p. Thus, we can construct the estimator θ^A with θ^A_t(W_c^0, ..., W_c^{t−1}) = 3 x^A_c(t). For a given instance I_p, the choice of neighbor W_c^t is given by the value of a Bernoulli random variable with parameter p (P[W_c^t = 1] = p). As a result, since the convergence guarantee holds for any instance I_p, the risk of θ^A inherits the corresponding rate. In order to prove Theorem 3, we just need to prove the following claim.
The above claim states that for any estimator θ = (θ_t)_{t=1}^∞, we can inspect the functions θ_t : {0, 1}^t → [0, 1] and then choose a p ∈ [0, 1] such that E_p[|θ_t − p|] = Ω(1/t^{1+c}). As a result, we have reduced the construction of a lower bound on the round complexity of a dynamical process to a lower bound on the sample complexity of estimating the parameter p of a Bernoulli distribution. The claim follows from Lemma 6, which we present at the end of the section.
At this point we should mention that it is known that Ω(1/ε^2) samples are needed to estimate the parameter p of a Bernoulli random variable within additive error ε. Another well-known result is that taking the average of the samples is the best way to estimate the mean of a Bernoulli random variable. These results would suggest that the best possible rate of convergence for a graph oblivious dynamics is O(1/√t). However, there is some fine print in these results that does not allow us to use them directly. In order to explain their various limitations, we briefly discuss some of them; we remark that this discussion is not needed to understand the proof of Lemma 6.
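The O(1/√t) error rate of the sample mean can be computed exactly rather than sampled, since the number of ones among t Bernoulli draws is binomially distributed. A small sketch (the parameter values are illustrative):

```python
import math

def risk(p, t):
    """E_p | mean_t - p | for t i.i.d. Bernoulli(p) samples, computed
    exactly by summing over the binomial distribution of the count."""
    return sum(math.comb(t, k) * p**k * (1 - p)**(t - k) * abs(k / t - p)
               for k in range(t + 1))

p = 0.3
r100, r400 = risk(p, 100), risk(p, 400)
# quadrupling the sample size roughly halves the error: rate ~ 1/sqrt(t)
assert 1.8 < r100 / r400 < 2.2
```

The factor-of-two drop when t is quadrupled is the numerical signature of the 1/√t rate discussed above.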
The oldest sample complexity lower bound for estimation problems is the well-known Cramér–Rao inequality. Let θ_t : {0, 1}^t → [0, 1] be a function satisfying the unbiasedness condition. Since E_p[|θ_t − p|] can be lower bounded by E_p[(θ_t − p)^2], we can apply the Cramér–Rao inequality and prove our claim in the case of unbiased estimators, i.e. those with E_p[θ_t] = p for all t. Of course, we need to prove the claim for arbitrary estimators θ, but this is a first indication that it holds. Sample complexity lower bounds without assumptions on the estimator are usually given as lower bounds for the minimax risk, which was defined by Wald in [39]. The minimax risk captures the idea that after we pick the best possible algorithm, an adversary inspects it and picks the worst possible p ∈ [0, 1] from which to generate the samples that our algorithm receives as input. The methods of Le Cam, Fano, and Assouad are well-known information-theoretic methods for establishing lower bounds on the minimax risk; for more on these methods see [38,41]. As stated before, it is well known that the minimax risk for estimating the mean of a Bernoulli is lower bounded by Ω(1/√t), and this lower bound can be established by Le Cam's method. To show why such results do not serve our purposes, we sketch how one would apply Le Cam's method to obtain this bound. To apply Le Cam's method, one typically chooses two Bernoulli distributions whose means are far apart but which are statistically close. Le Cam showed that when two distributions are close in total variation, then given a sequence of samples X_1, ..., X_t it is hard to tell whether these samples were produced by P_1 or P_2. The hardness of this testing problem implies the hardness of estimating the parameters of a family of distributions. For our problem, the two distributions would be B(1/2 − 1/√t) and B(1/2 + 1/√t). It is not hard to see that their per-sample statistical distance is small enough that the corresponding t-fold product distributions remain at total variation distance bounded away from 1, which implies a lower bound of Ω(1/√t) for the minimax risk. The problem is that the parameters of the two distributions depend on the number of samples t: the more samples the algorithm gets to see, the closer the adversary takes the two distributions to be. For our purposes, we would like to fix an instance and then argue about the rate of convergence of any algorithm on that instance. Namely, an instance that depends on t does not work for us.
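The statistical closeness behind this two-point argument can be computed exactly, since the number of ones is a sufficient statistic for a product of Bernoullis and is binomially distributed. A sketch for the pair B(1/2 ± 1/√t):

```python
import math

def tv_product(t):
    """Total variation distance between the t-fold products of
    B(1/2 - 1/sqrt(t)) and B(1/2 + 1/sqrt(t)), computed via the
    sufficient statistic (the number of ones), which is Binomial."""
    e = 1 / math.sqrt(t)
    p, q = 0.5 - e, 0.5 + e
    def pmf(r, k):
        return math.comb(t, k) * r**k * (1 - r)**(t - k)
    return 0.5 * sum(abs(pmf(p, k) - pmf(q, k)) for k in range(t + 1))

tv100, tv400 = tv_product(100), tv_product(400)
# the distance is a constant strictly below 1, essentially independent of t
assert tv100 < 1 and tv400 < 1 and abs(tv100 - tv400) < 0.05
```

Because the product distance stays bounded away from 1 for every t, no test distinguishes the two hypotheses with certainty, and Le Cam's method converts this into an Ω(1/√t) minimax lower bound, but only for a pair of parameters that moves with t.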
Aiming for a lower bound that makes no assumptions about the estimator, while respecting our need for a fixed p (independent of t), we prove Lemma 6. In fact, we show something stronger: for almost all p ∈ [0, 1], no estimator θ can achieve rate o(1/t^{1+c}). Proof Since θ_t is a function from {0, 1}^t to [0, 1], θ_t can take at most 2^t different values. Without loss of generality, we assume that θ_t takes the same value on all x ∈ {0, 1}^t with the same number of 1's; for example, θ_3(1, 0, 0) = θ_3(0, 1, 0) = θ_3(0, 0, 1). This is because, for any p ∈ [0, 1], averaging the estimator over permutations of its input can, by Jensen's inequality, only decrease the risk. Therefore, for any estimator θ with error rate E_p[|θ_t − p|], there exists another estimator θ' that satisfies the above symmetry property and whose risk is no larger, for all p ∈ [0, 1]. Thus, we can assume that θ_t takes at most t + 1 different values.

Let A denote the set of p for which the estimator has error rate o(1/t^{1+c}). We show that if we select p uniformly at random in [0, 1], then P[p ∈ A] = 0. We also define the sets A_k. Observe that if p ∈ A, then there exists t_p such that p ∈ A_{t_p}, meaning that A ⊆ ⋃_{k=1}^∞ A_k. As a result, to complete the proof it suffices to show that P[p ∈ A_k] = 0 for all k. Notice that p ∈ A_k implies that for all t ≥ k, the estimator θ must have some value θ_t(i) close to p. Using this intuition we define the sets B_k. Each value θ_t(i) "covers" length 1/t^{1+c} to its left and right, as shown in Fig. 2, and since there are at most t + 1 such values, the union bound gives P[p ∈ B_k] ≤ 2(t + 1)/t^{1+c} for all t ≥ k. More formally, for a fixed i we obtain the displayed bound. Letting t → ∞, we conclude that P[p ∈ B_k] = 0.
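The counting step of this argument is simple enough to check numerically; a sketch of the covered-measure bound (c is illustrative):

```python
# At time t the symmetrized estimator takes at most t + 1 values, each
# covering an interval of length 2 / t^(1+c); the total covered measure
# is at most 2*(t + 1) / t^(1+c), which vanishes as t grows (for c > 0).
c = 0.5
bounds = {t: 2 * (t + 1) / t ** (1 + c) for t in [10, 10**3, 10**6]}

# the covered measure shrinks to zero, so a uniformly random p
# eventually escapes every cover
assert bounds[10] > bounds[10**3] > bounds[10**6]
assert bounds[10**6] < 1e-2
```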
Lemma 6 essentially shows that we cannot construct a protocol that is graph oblivious and converges exponentially fast to the equilibrium, as the dynamics of the original FJ model does. However, as we show in the next section, even a small amount of information about the topology of the graph yields faster protocols.

Limited Information Dynamics with Fast Convergence
We already discussed that the reason graph oblivious dynamics converge slowly is that the update rule depends only on the observed opinions. Based on work on asynchronous distributed minimization algorithms [7,9], we provide an update rule showing that information about the graph G, combined with agents that do not act selfishly, can restore a fast convergence rate. Our update rule depends not only on the expressed opinions of the neighbors that agent i meets, but also on the i-th row of the matrix P.
In update rule (6), each agent stores the most recent opinions of the random neighbors that she meets in an array and then updates her opinion according to their weighted sum (each agent i knows row i of P). For a given instance I = (P, s, α), we call the produced dynamics Row Dependent dynamics (Dynamics 2). We have already mentioned that while this update rule guarantees fast convergence, it does not guarantee no regret to the agents. To make this concrete, we include a simple example.
Example 1 The purpose of this example is to illustrate that update rule (6) does not ensure the no regret property: if some agents, for whatever reason, exhibit irrational or adversarial behavior, agents that adopt update rule (6) may experience regret. This is also why Row Dependent dynamics can converge exponentially faster than any no regret dynamics, including FTL dynamics.
Consider the instance of the game of Definition 1 consisting of two agents. Agent 1 adopts update rule (6) and has s_1 = 0, α_1 = 1/2, p_12 = 1, while agent 2 plays adversarially; thus s_2, α_2, p_21 need not be specified. By update rule (6), x_1(t) = x_2(t − 1)/2. Since agent 2 plays adversarially, she selects x_2(t) = 0 if t is even and 1 otherwise. As a result, the total disagreement cost that agent 1 experiences until round t is roughly 3t/8. Agent 1 now regrets not having adopted the fixed opinion 1/3 during the whole game play: selecting x_1(t) = 1/3 for all t would incur a total disagreement cost which is less than 3t/8.
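The arithmetic of this example is easy to verify by simulation. With s_1 = 0 and α_1 = 1/2, agent 1's per-round cost is (1/2)x_1(t)^2 + (1/2)(x_1(t) − x_2(t))^2; a direct check:

```python
# Numerical check of Example 1: agent 1 follows rule (6), agent 2 is adversarial
T = 10_000
x2_prev = 0.0
cost_rule = 0.0
cost_fixed = 0.0                        # cost of the fixed opinion 1/3
for t in range(1, T + 1):
    x1 = x2_prev / 2                    # rule (6): x_1(t) = x_2(t-1)/2
    x2 = 0.0 if t % 2 == 0 else 1.0     # adversarial opinions of agent 2
    cost_rule += 0.5 * x1**2 + 0.5 * (x1 - x2)**2
    cost_fixed += 0.5 * (1/3)**2 + 0.5 * (1/3 - x2)**2
    x2_prev = x2

# the rule pays ~3T/8, strictly more than the fixed opinion 1/3 would pay
assert abs(cost_rule / T - 3/8) < 0.01
assert cost_fixed < cost_rule
```

Agent 1's average cost settles at 3/8 per round, while the fixed opinion 1/3 pays strictly less, so agent 1 indeed experiences regret.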
The problem with the approach of Row Dependent dynamics is that the opinions of the neighbors that an agent keeps in her array may be outdated, i.e., the opinion of a neighbor of agent i may have changed since their last meeting. The good news is that as long as this outdatedness is bounded, we can still achieve fast convergence to the equilibrium. By bounded outdatedness we mean that there exists a number of rounds B such that every agent has met each of her neighbors at least once between rounds t − B and t. This is formalized in Lemma 7, which states that if such a B exists, then the protocol converges exponentially fast to x*. For convenience, we call such a sequence of B rounds an epoch.
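The effect of bounded outdatedness can be illustrated with a small deterministic simulation. The sketch below *assumes* rule (6) has the form x_i(t) = (1 − α_i)s_i + α_i Σ_j p_ij · (last observed opinion of j) — a hypothetical reconstruction — and uses a round-robin meeting schedule, so every neighbor is refreshed within an epoch of B = 2 rounds:

```python
# Delayed FJ-style updates with bounded outdatedness (round-robin meetings)
n = 3
s = [0.0, 0.5, 1.0]                     # illustrative intrinsic opinions
a = [0.5, 0.5, 0.5]
p = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]

# FJ equilibrium via synchronous fixed-point iteration x <- (1-a)s + a P x
xstar = s[:]
for _ in range(10_000):
    xstar = [(1 - a[i]) * s[i]
             + a[i] * sum(p[i][j] * xstar[j] for j in range(n))
             for i in range(n)]

# delayed dynamics: each round, agent i refreshes one cached neighbor opinion
last = [[s[j] for j in range(n)] for _ in range(n)]   # last observed opinions
x = s[:]
for t in range(500):
    for i in range(n):
        nbrs = [j for j in range(n) if j != i]
        j = nbrs[t % len(nbrs)]          # deterministic round-robin meeting
        last[i][j] = x[j]
    x = [(1 - a[i]) * s[i]
         + a[i] * sum(p[i][j] * last[i][j] for j in range(n))
         for i in range(n)]

assert max(abs(x[i] - xstar[i]) for i in range(n)) < 1e-6
```

Despite working with cached, slightly stale opinions, the dynamics contracts to x* geometrically, which is the content of Lemma 7.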

Remark 1
Update rule (6), apart from the opinions and the indices of the neighbors that an agent meets, also depends on the exact values of the weights p_ij; this is why Row Dependent dynamics converge fast. We mention that the lower bound of Sect. 5 still holds even if the agents also use the indices of the neighbors that they meet to update their opinions, since Lemma 5 can easily be modified to cover this case. The latter implies that any update rule ensuring fast convergence requires each agent i to be aware of the i-th row of the matrix P.
The idea behind the proof of Lemma 7 is simple: if during each epoch of B rounds an agent meets all her neighbors at least once, then by the end of the epoch a full step of the original FJ dynamics has certainly been computed. This means that the running time is slower than that of the FJ model by a factor of B.

Lemma 7 Let ρ = min_i α_i, and let π_ij(t) be the most recent round before round t at which agent i met her neighbor j. If t − B ≤ π_ij(t) for all t ≥ B, then the stated bound holds for all t ≥ kB.

In Row Dependent dynamics there does not exist a fixed window length B that satisfies the requirements of Lemma 7; however, we can select a length such that the requirements hold with high probability. To see this, observe that agent i must collect the opinions of all of her neighbors, which resembles the coupon collector's problem. We first state a useful fact about this problem, whose proof uses only elementary probability.

Lemma 8 (see e.g. [34]) Suppose that the collector picks coupons with different probabilities, where n is the number of distinct coupons, and let w be the minimum of these probabilities. If the collector draws ln n/w + c/w coupons, then P[collector has not seen all coupons] ≤ e^{−c}.

It is now clear that each agent i simply needs to wait long enough to meet the neighbor j with the smallest weight p_ij. Hence, after roughly (ln n + ln(1/δ))/min_{j∈N_i} p_ij rounds, agent i has met all her neighbors at least once with probability at least 1 − δ. Since we want this to hold for all agents, we shall roughly take B = 1/min_{i∈V, j∈N_i} p_ij, up to logarithmic factors. These calculations are made precise in the following lemma.
Lemma 9 Let π_ij(t) be the most recent round before round t at which agent i met agent j, and let B = 2 ln(nt/δ)/w, where w = min_{i∈V} min_{j∈N_i} p_ij. Then with probability at least 1 − δ, for all τ ≥ B, all i ∈ V, and all j ∈ N_i, we have τ − B ≤ π_ij(τ) ≤ τ − 1.
Proof Consider an agent i at round τ ≥ B, where B = 2 ln(nt/δ)/w, and assume that there exists an agent j ∈ N_i such that π_ij(τ) < τ − B. Agent i can be viewed as a coupon collector who has bought B coupons but has not found the coupon corresponding to agent j. Since |N_i| < n and min_{j∈N_i} p_ij ≥ w, Lemma 8 gives P[there exists j ∈ N_i s.t. π_ij(τ) < τ − B] ≤ δ/(nt). The proof follows by a union bound over all agents i and all rounds B ≤ τ ≤ t.
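The union-bound calculation behind Lemma 8 and this proof can be checked numerically for a concrete weight vector (the probabilities below are illustrative, standing in for one agent's meeting probabilities p_ij):

```python
import math

# Weighted coupon collector: after m = (ln n + c)/w draws,
#   P[some coupon unseen] <= sum_j (1 - p_j)^m <= n*(1 - w)^m <= e^(-c)
probs = [0.5, 0.3, 0.15, 0.05]          # meeting probabilities of one agent
n, w = len(probs), min(probs)
for c in [1.0, 2.0, 5.0]:
    m = math.ceil((math.log(n) + c) / w)
    union_bound = sum((1 - q) ** m for q in probs)
    assert union_bound <= math.exp(-c)
```

The neighbor with the smallest weight w dominates the bound, which is why B in Lemma 9 scales with 1/w.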
Our goal is to prove Theorem 4, which shows that the convergence rate of update rule (6) is exponentially fast in expectation (although not as fast as the original FJ dynamics), where x(t) ∈ [0, 1]^n is the opinion vector produced by update rule (6), ρ = min_{i∈V} α_i, and w = min_{i∈V} min_{j∈N_i} p_ij.
By direct application of Lemma 7 and Lemma 9, we obtain the following corollary that will be useful in proving Theorem 4.
Corollary 1 Let x(t) be the opinion vector produced by update rule (6) for the instance I = (P, s, α). Then with probability at least 1 − δ, the stated bound holds for all t ≥ 2 ln(nt/δ)/w, where ρ = min_{i∈V} α_i and w = min_{i∈V, j∈N_i} p_ij.
Proof Let B = 2 ln(nt/δ)/w. By Lemma 9, with probability at least 1 − δ, for all i ∈ V, j ∈ N_i, and all τ ≥ B, we have τ − B ≤ π_ij(τ).

Fig. 2 Estimator output at time t