Truthful Matching with Online Items and Offline Agents

We study truthful mechanisms for welfare maximization in online bipartite matching. In our (multi-parameter) setting, every buyer is associated with a (possibly private) desired set of items, and has a private value for being assigned an item in her desired set. Unlike most online matching settings, where agents arrive online, in our setting the items arrive online in an adversarial order while the buyers are present for the entire duration of the process. This poses a significant challenge to the design of truthful mechanisms, due to the ability of buyers to strategize over future rounds. We provide an almost full picture of the competitive ratios in different scenarios, including myopic vs. non-myopic agents, tardy vs. prompt payments, and private vs. public desired sets. Among other results, we identify the frontier for which the celebrated $e/(e-1)$ competitive ratio for the vertex-weighted online matching of Karp, Vazirani and Vazirani extends to truthful agents and online items.


Introduction
Matching in bipartite graphs is a fundamental model that has found numerous applications with the growth of the Internet. Some examples include items and buyers in e-commerce, drivers and passengers in ride-sharing platforms, ad slots and advertisers in online ad auctions, and jobs and workers in online labor markets. In these applications, it is common that vertices on one side are known from the outset, while vertices from the other side arrive one-by-one in an online fashion. Upon the arrival of an online vertex, its information is revealed (containing, e.g., its set of adjacent edges, and their weights), and the algorithm has to immediately and irrevocably decide either to match it with an available offline partner or leave it unmatched forever. The goal is to maximize the sum of the weights along the matched edges.
A celebrated result in online matching by Karp, Vazirani, and Vazirani [16] shows that in the unweighted setting, a simple randomized strategy, called Ranking, achieves a competitive ratio of e/(e − 1), and this is optimal. This result extends to the setting where the vertices on the offline side are weighted and the objective is to maximize the sum of the weights of the matched vertices. Although the original algorithm for this problem, Perturbed-Greedy [1], was designed for non-strategic settings, online matching problems have also been studied in the presence of strategic agents [e.g., 19,23,9]. This is not a mere theoretical exercise: online matching is used in many situations where the parties involved have an interest in misreporting their true valuations to obtain a better outcome, e.g., combinatorial and ad auctions, kidney exchange, school-student matching, and house allocation. In the presence of strategic agents, an agent's value is her private information, and is not directly available to the mechanism designer. The main challenge here is to design incentive-compatible or truthful mechanisms which, besides finding a good matching, also ensure that it is in the agents' best interest to report their true values. In addition to making decisions regarding the matching itself, such mechanisms can also charge payments to the agents in order to incentivize them to truthfully report their values. Here, each agent strives to maximize her quasi-linear utility, which is defined as the value she obtains from her assigned item, minus the payment she has to make.
In almost all previous studies, the agents are represented by the vertices on the online side, while the items they are competing over are available offline. In many natural Internet applications, e.g., selling advertising opportunities via repeated auctions, the agents are fixed and observe a stream of items arriving online. This motivates the study of a reversed online matching problem, where the offline side is strategic about both her value and her set of desired items, which arrive online. This particular variant has been considered thus far only in very restricted settings [6,7]. This is not a coincidence: when agents are present throughout the entire matching process, many new manipulation opportunities arise, and incentivizing truthful behavior is significantly more challenging. Indeed, the online nature of the problem forces any mechanism to repeatedly make irrevocable decisions upon the arrival of goods, lacking knowledge about future opportunities that might arise to the participating agents. The agents, possibly aware of those future opportunities, may strategize to gain benefits in the future, challenging standard tools that have been applied in cases where agents arrive online.
Table 1: Summary of our results, with ν = min(m, n), where n is the number of agents and m the number of items.
Our work provides a systematic analysis of this scenario, and gives (almost) tight competitive ratios under a rich combination of natural assumptions. We study this problem along different dimensions, as follows. First, we consider two types of agents, myopic and non-myopic, that are characterized by the different information they have on the instance. Myopic agents make strategic considerations that are limited to the current time step, without looking forward into the future (see, e.g., Deng, Panigrahi and Zhang [7]), whereas non-myopic agents optimize across multiple time steps, using up-front knowledge of the underlying (online) graph. The assumption of myopic agents removes some of the difficulties of designing (almost tight) online mechanisms with offline strategic agents, thus allowing one to derive efficient mechanisms from known online matching algorithms, e.g., from Aggarwal et al. [1]. Second, we consider two types of private information. In the first scenario, an agent's private information consists of her private value for her desired items, but the set of desired items is publicly known. In the second scenario, both the value and the set of desired items are private information. Notably, in both cases the graph structure is revealed to the mechanism step-by-step, upon the arrival of every item. Finally, we distinguish between prompt and tardy mechanisms. Both types of mechanisms make allocation decisions immediately. However, they differ in the time at which they make payment decisions. Prompt mechanisms make payment decisions immediately upon allocation, while tardy mechanisms may delay payment decisions to the end of the entire process.

Our Results and Techniques
We conduct a systematic study of online bipartite matching with online items and offline agents, in a variety of scenarios and we provide (almost) tight bounds for the settings of interest, as summarized in Table 1.
Myopic agents. We start by investigating the simpler setting of myopic agents. These agents care only about their instantaneous utility, and do not strategize over the future. As such, we only consider prompt mechanisms for this type of agents. Exploiting the myopic nature of the agents, it is not difficult to turn the best (non-truthful) algorithms into (truthful) mechanisms. In particular, we construct a deterministic prompt mechanism based on the greedy matching algorithm that is guaranteed to achieve at least half of the optimal welfare. We also give a randomized prompt mechanism based on the algorithm for weighted online matching [1], which is e/(e − 1)-competitive. This shows that the transition from non-strategic agents to strategic myopic agents does not lead to a deterioration in efficiency guarantees. Notably, for the special case we study, our bounds for myopic online matching improve vastly over those obtained by Deng, Panigrahi and Zhang [7] for general XOS valuations. The results for myopic agents are presented in Table 1(a).
Non-myopic agents with public graph edges. Next we consider non-myopic agents who can strategize about their values but not about their desired items: upon the arrival of an item, the set of agents interested in it is revealed (no strategizing involved), but the agent values are reported by the agents themselves. This variant is single-parameter, for which Myerson's lemma applies [22]. We prove that, if the mechanism is allowed to wait until the end of the online phase to set prices (i.e., a tardy mechanism), then it is possible to achieve the same bounds as in the myopic case, subject to showing that the Greedy and Perturbed-Greedy algorithms induce a certain form of global monotonicity. For prompt mechanisms, in contrast, we establish strong impossibility results. For deterministic mechanisms, we prove a lower bound of ν = min(m, n) on the competitive ratio, where n and m denote the number of agents and items, respectively. This sharp deterioration from tardy to prompt mechanisms occurs because, in order to prevent buyers from strategizing over future rounds, the prices must be non-decreasing. Tardy mechanisms circumvent this by assigning payments to agents at the end of the entire process. A matching ν upper bound is inherited from the more general case of private graph edges, presented below. For randomized prompt mechanisms we establish an Ω(log ν/ log log ν) lower bound, using Yao's Minimax principle. Starting from a carefully designed distribution of problem instances with exponentially increasing agent valuations, we employ a primal-dual approach together with our previous observations on the behavior of deterministic truthful mechanisms to bound the achievable competitive ratio. An almost matching upper bound is inherited from randomized non-myopic prompt mechanisms with private graph edges. The results for non-myopic agents with public graph edges are presented in Table 1(b).
Non-myopic agents with private graph edges. We finally consider non-myopic agents when both valuations and the set of desired items are private information. For deterministic prompt mechanisms, the ν lower bound from the case of public graph edges applies. Moreover, we show that in the case of private edges, every deterministic truthful mechanism is essentially prompt. Thus, tardy mechanisms for this case retain the ν lower bound, exhibiting a large gap between tardy mechanisms for public vs. private edges. We then provide a prompt truthful deterministic mechanism that is ν-competitive, matching the lower bound. For randomized prompt truthful mechanisms, the Ω(log ν/ log log ν) lower bound from the case of public edges applies. This lower bound extends to tardy randomized mechanisms as well, since these are probability distributions over deterministic mechanisms and, as stated above, all deterministic truthful mechanisms for private edges are prompt. On the positive side, we provide a randomized prompt truthful mechanism that gives an almost matching competitive ratio of O(log ν). This algorithm is based on an explore-exploit approach specifically tailored to our case.
Ex-post vs. ex-ante truthfulness. Finally, we explore the notion of ex-ante truthfulness, as opposed to ex-post truthfulness: under ex-ante truthfulness, agents' truthful declarations maximize their expected utility, rather than their utility in every realization of the random choices of the mechanism. Clearly, ex-post truthfulness implies ex-ante truthfulness. In the setting with myopic buyers, we only need to consider ex-post truthfulness, since we obtain tight bounds in this stronger model, which settles the ex-ante analogue as well. In the setting of non-myopic buyers, we show that the additional hardness introduced by truthfulness cannot be fully attributed to the fact that we require ex-post truthfulness. Specifically, we establish a lower bound of 2 for the competitive ratio of ex-ante truthful mechanisms for this setting (even with respect to randomized tardy ones), exhibiting a gap from the corresponding e/(e − 1) upper bound for myopic buyers. Our proof utilizes an instance for which we establish lower bounds on the expected utility of various types of agents. We then employ these to show a contradiction to the mechanism's correctness.
Remark. Throughout the paper, we assume that weights are assigned to vertices (agents) rather than edges. Indeed, it is well known that for the more general case of edge weights, even the algorithmic problem is hopeless (see, e.g., Appendix G of [1]). In addition, one may wonder why we do not study the case of non-myopic agents with public valuations but private edges. The reason is that in the case of public valuations, it is easy to see that agents cannot benefit from misreporting their edges, implying that Greedy and Perturbed-Greedy are inherently truthful.

Related Work
Karp, Vazirani and Vazirani [16] introduced the online matching problem, and studied it under one-sided bipartite arrivals. They observe that the trivial 1/2-competitive greedy algorithm (which matches any arriving vertex to an arbitrary unmatched neighbor, if one exists) is optimal among deterministic algorithms for this problem. They also provide a groundbreaking and elegant randomized algorithm for this problem, called Ranking, which achieves an optimal e/(e − 1) competitive ratio. The work of Karp, Vazirani and Vazirani [16] was extended to vertex-weighted settings by Aggarwal et al. [1], who give an optimal e/(e − 1)-competitive, randomized algorithm using random perturbations of weights by appropriate multiplicative factors. The same bound has been re-proven over the years [5,8,12,10]. Various extensions of one-sided online matching and its economic applications (e.g., display ads) have been widely studied over the years; see, e.g., the excellent survey of Mehta [20] for further reference. Online matching has also been studied under edge and general vertex arrivals, as well as in different stochastic settings (see, e.g., [17,18,11,15,13,14]).
An important generalization of assignment problems in the form of matchings are combinatorial auctions, where buyers can obtain a subset of the available items, instead of just one. Combinatorial auctions with offline strategic buyers and online items have recently been studied by Deng, Panigrahi and Zhang [7] for submodular and XOS valuations in the case of myopic buyers (considered also in this work) and in the less constrained setting of items that need not be irrevocably assigned at the time of arrival. Deng, Panigrahi and Zhang [7] show (for myopic buyers) a sharp separation between submodular valuations, which admit a logarithmic competitive ratio, and XOS valuations, for which a polynomial lower bound is proven. In our work, we prove tight constant bounds for myopic buyers in the important special case of a unit-demand matching setting.
Cole, Dobzinski and Fleischer [6] formally introduced the notions of prompt and tardy mechanisms, after observing the severe negative aspects of many existing (tardy) methods. They study prompt truthful mechanisms for an online problem that is related to ours, but with some restrictions: while agents can be thought of as being on the offline side of the graph, their items of interest are restricted to form an interval over the online steps (which corresponds to the interval during which buyers are present). Further, agents report their departure time (which can be public or private) once they arrive, and their arrival time is public knowledge. Their work is probably closest to ours in spirit, presenting, e.g., a logarithmic-competitive, prompt mechanism for the above, less general variant of our problem with private departures. The notions of tardy and prompt mechanisms have since been adopted in the literature; see, e.g., [3,26]. The model of offline agents and online items has been the subject of extensive investigation in economic theory in dynamic mechanism design. Despite this obvious relation to our setting, there are fundamental differences (see, for example, [21,2,4]). In dynamic mechanism design, a strategic buyer learns her valuations at the time of arrival of each item. In contrast to our setting, priors on agents' valuations for each online item are usually known beforehand. Finally, in our matching setting the agents' valuations can assume only two values, $v_i$ and 0, and we consider unit-demand buyers instead of additive-valuation agents, as is customary in dynamic mechanism design.

Preliminaries
We are given a bipartite graph G = (B, I; E), where B is a set of n vertices, corresponding to buyers, I is a set of m vertices, corresponding to items, and E ⊆ B × I is the set of edges. As already mentioned, we denote by ν the smaller of the number of buyers n and the number of items m. The set of buyers is known beforehand, while the items arrive one by one in some unknown, possibly adversarial order. Without loss of generality assume that item j arrives at time j. Each buyer i has two pieces of private information: the set of items she is interested in, and her value $v_i$ if she gets at least one of them (the value for other items is 0). Upon the arrival of a new item, every buyer declares whether she is interested in the current item and, if so, her value. Let $b_{i,j}$ denote the bid of buyer i for item j (with the convention that $b_{i,j} = 0$ if buyer i is not interested in item j). Without loss of generality, we may assume that buyers cannot change their declared valuation after they have declared it once, i.e., every nonzero bid of the same buyer has the same value $b_i$, and that every buyer is assigned at most one item.
A mechanism M is composed of an allocation scheme and a payment scheme. Upon the arrival of every item, and based on buyer bids, the mechanism decides immediately and irrevocably to either assign the new item to some buyer who has not been assigned an item yet, or leave it unassigned forever. Thus, the resulting allocation is a matching in G: every buyer receives at most one item, and every item is allocated to at most one buyer. We denote by µ the induced matching, so that $\mu_j$ denotes the buyer to whom item j is assigned (we assume that an item j can only be assigned to a buyer who declares interest in j). If j is unassigned, we write $\mu_j = \emptyset$. We also write $\mu^{-1}_i$ to denote the item assigned to buyer i, with the convention that $\mu^{-1}_i = \emptyset$ if i is left unassigned. The allocation is computed online; i.e., $\mu_j$ is determined using only the bids on items up to j. In addition to the allocation, the mechanism decides how much each buyer should pay. A payment scheme is denoted by p, where $p_i$ denotes the non-negative payment of buyer i. We distinguish between two types of payment schemes, according to the time at which the mechanism determines the payment. A tardy mechanism is one where the payment vector p is computed at the end of the process. A prompt mechanism is one where the payment $p_i$ of every buyer i is determined upon the assignment of buyer i (i.e., upon the arrival of item $\mu^{-1}_i$). The mechanism's objective is to maximize the social welfare of the allocation µ, which is the sum of the buyer values for their assigned items. The social welfare is given by $\mathrm{SW}(\mu) = \sum_{i \in B} v_i \cdot \mathbb{I}(\mu^{-1}_i \neq \emptyset,\ (i, \mu^{-1}_i) \in E)$. Note that a mechanism can also be randomized, so that its allocation is a distribution over matchings. In the case of a randomized mechanism, we measure its efficiency by the expected social welfare. We say that a mechanism gives an α-approximation, or is α-competitive (where α ≥ 1), if its (expected) social welfare is at least a 1/α fraction of the welfare of a maximum weight matching. That is, $\alpha \cdot \mathbb{E}[\mathrm{SW}(\mu)] \ge \mathrm{SW}(\mu^\star)$, where $\mu^\star$ is the maximum weight matching in G.
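As a small illustration of these definitions, the following Python sketch computes the social welfare of a given matching and, by brute force, the welfare of a maximum weight matching, from which the ratio α can be read off. The function names and the data layout (buyers and items as 0-based integers, `edges` as a set of (buyer, item) pairs) are assumptions made for this example only, and the exhaustive search is meant for tiny instances.

```python
def social_welfare(values, edges, mu):
    """Sum of v_i over items j assigned to buyer i = mu[j] with (i, j) in E."""
    return sum(values[i] for j, i in mu.items()
               if i is not None and (i, j) in edges)


def optimal_welfare(values, edges, num_items):
    """Welfare of a maximum weight matching, by exhaustive search."""
    def best(j, used):
        if j == num_items:
            return 0.0
        result = best(j + 1, used)  # option: leave item j unassigned
        for i in range(len(values)):
            if i not in used and (i, j) in edges:
                result = max(result, values[i] + best(j + 1, used | {i}))
        return result
    return best(0, frozenset())


# Two buyers, two items: buyer 0 wants both items, buyer 1 only item 0.
values = [1.0, 0.9]
edges = {(0, 0), (0, 1), (1, 0)}
mu = {0: 0, 1: None}          # item 0 goes to buyer 0, item 1 stays unassigned
sw = social_welfare(values, edges, mu)
opt = optimal_welfare(values, edges, num_items=2)
print(sw, opt, opt / sw)      # welfare, optimum, and the ratio alpha on this instance
```

On this instance the matching µ collects welfare 1 while the optimum assigns item 0 to buyer 1 and item 1 to buyer 0.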
A bidding strategy $B_i$ for buyer i is a sequence of bids $b_{i,j}$ that specifies, every time a new item j arrives, whether to declare interest in it and which value to report. The bid $B_i$ might depend on the bids of the other agents, the actions of the mechanism, and the knowledge the buyers have on the sequence of items. Recall that once an agent declares a positive valuation $b_{i,j} = b_i > 0$ for some item j, she cannot change her value thereafter; namely, all bids for future items $j'$ can take the value of either $b_i$ or 0. Let B denote the profile of buyer bidding strategies, and $B_{-i}$ denote the profile of all buyer strategies excluding buyer i. We assume that every buyer has a quasi-linear utility function: $u_i = v_i \cdot \mathbb{I}(\mu^{-1}_i \neq \emptyset,\ (i, \mu^{-1}_i) \in E) - p_i$. A buyer is called myopic if upon the arrival of every item j, she cares only about maximizing her utility in that round, without considering its effect on future rounds. That is, upon the arrival of item j, she maximizes the utility function $u_{i,j} = v_i \cdot \mathbb{I}(\mu_j = i,\ (i,j) \in E) - p_i$. We consider myopic agents only in the context of prompt mechanisms, where the price $p_i$ is determined immediately. We study the following ex-post notion of truthfulness: (i) A mechanism for myopic agents is truthful if it is always in the best interest of a myopic buyer to declare her value truthfully. (ii) A mechanism for non-myopic agents is truthful if an agent maximizes her utility for every realization of the mechanism by declaring her value truthfully. Finally, we only consider mechanisms that are ex-post individually rational, meaning that all agents (myopic or not) have non-negative utility, for every realization of the mechanism.

Truthful Mechanisms for Myopic Buyers
In this section we study myopic buyers and we show that for this class of agents it is possible to make the best (non-truthful) algorithms [16] strategy-proof. In particular, we construct a deterministic prompt mechanism that is guaranteed to achieve at least half of the welfare of the best offline matching, and a randomized prompt mechanism that is (in expectation) e/(e − 1)-competitive with the best offline matching.
We start by describing our deterministic mechanism HonestGreedy, which mimics the classical Greedy algorithm for online weighted matching in a way that is robust to strategic bidding. Every time a new item arrives, HonestGreedy runs a second price auction [25] to allocate it among the remaining (interested) buyers. Since the buyers are myopic, every time a new item arrives, they behave as if it were the last: clearly there is no point in lying about being interested in an item. Moreover, the truthfulness in each step (as well as the individual rationality) is guaranteed by the well-known properties of the second price auction. Note that the mechanism sets the price for item j immediately, so it is prompt. The analysis of the approximation guarantee is also quite simple: the allocation output by HonestGreedy is the same one that the standard Greedy algorithm would have computed. It is well known that Greedy is 2-competitive with respect to the best offline matching (see, e.g., Appendix B of [1]), and that this approximation is tight in the class of deterministic algorithms [16]. We summarize these observations in the following theorem.

Theorem 1. The deterministic prompt mechanism HonestGreedy is truthful for myopic agents and guarantees a 2 approximation to the best offline matching. The approximation is tight even for (non-truthful) deterministic algorithms.
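The round-by-round second price auction described above can be sketched as follows. This is a minimal Python illustration rather than the paper's formal definition: the function name, the input layout (one dictionary of buyer-to-bid per arriving item, containing only interested buyers) and the tie-breaking by buyer index are assumptions of this example.

```python
def honest_greedy(bids_per_item):
    """Run one second price auction per arriving item.

    bids_per_item: list over items in arrival order; each entry maps
    a buyer index to her bid for that item (interested buyers only).
    Returns one (winner, price) pair per item; winner is None if the
    item stays unassigned.
    """
    matched = set()
    outcome = []
    for bids in bids_per_item:
        # Only buyers that are still unmatched and bid a positive value compete.
        live = {i: b for i, b in bids.items() if i not in matched and b > 0}
        if not live:
            outcome.append((None, 0.0))
            continue
        # Highest bid first; ties go to the smaller buyer index (assumption).
        ranked = sorted(live.items(), key=lambda kv: (-kv[1], kv[0]))
        winner = ranked[0][0]
        # Second price: the runner-up bid, or 0 if the winner is unopposed.
        price = ranked[1][1] if len(ranked) > 1 else 0.0
        matched.add(winner)          # the price is fixed now, so the rule is prompt
        outcome.append((winner, price))
    return outcome


# Buyer 0 (bid 1.0) wants both items, buyer 1 (bid 0.9) only the first.
print(honest_greedy([{0: 1.0, 1: 0.9}, {0: 1.0}]))
# → [(0, 0.9), (None, 0.0)]
```

The allocation coincides with what plain Greedy would output on the same bids; only the per-round payment is added.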
We complement this simple, deterministic 2-competitive mechanism with an optimal, randomized e/(e − 1)-competitive alternative, HonestPerturbedGreedy, based on Perturbed-Greedy of Aggarwal et al. [1]. There, each offline vertex is associated with a random multiplier; then, every time one of the online vertices arrives, it is matched to the free neighbor with the largest multiplier-value product. To protect against the strategic behavior of agents, HonestPerturbedGreedy publicly declares all random multipliers before the beginning of the online phase, and then implements Myerson's payment rule [22] in every round. For a formal description we refer to the pseudocode of HonestPerturbedGreedy, where we maintain the convention that the max of an empty set is 0 and thus if N(j) is empty in line 7, then j is discarded and the mechanism passes to the next item. The properties of HonestPerturbedGreedy are summarized in the following theorem, whose formal proof is deferred to the Appendix.
Algorithm HonestPerturbedGreedy (surviving steps; [...] marks text missing from the source)
[...]
3: [...] Let $y_i = 1 - e^{x_i - 1}$
4: Reveal publicly all $x_i$ and $y_i$
5: For item j arriving online, do
6:     Receive bids for j and let N(j) be the set of agents interested in j
7:     [...] Discard for further consideration $i^\star$
Theorem 2. The randomized prompt mechanism HonestPerturbedGreedy is truthful for myopic agents and achieves (in expectation) an e/(e − 1) approximation to the best offline matching. The approximation is tight even for (non-truthful) randomized algorithms.
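A minimal Python sketch of the mechanism as described in the prose above may help fix ideas. It is an illustration under stated assumptions, not the paper's pseudocode: the uniform draw of $x_i$ on [0, 1) follows Perturbed-Greedy of Aggarwal et al. [1], the per-round Myerson payment is taken to be the critical bid (the runner-up's multiplier-bid product divided by the winner's multiplier), and all function names, the input layout and the tie-breaking rule are hypothetical.

```python
import math
import random


def draw_multipliers(n, seed=None):
    """Draw x_i uniformly in [0, 1) and set y_i = 1 - e^(x_i - 1)."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [1.0 - math.exp(x - 1.0) for x in xs]
    return xs, ys  # both lists are announced publicly before the first item


def honest_perturbed_greedy(bids_per_item, ys):
    """Allocate each item to the free interested buyer maximizing bid * y_i.

    Returns one (winner, price) pair per item; the price is the smallest
    bid with which the winner would still have won this round.
    """
    matched = set()
    outcome = []
    for bids in bids_per_item:
        scores = {i: b * ys[i] for i, b in bids.items()
                  if i not in matched and b > 0 and ys[i] > 0}
        if not scores:                      # max over an empty set: discard the item
            outcome.append((None, 0.0))
            continue
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        winner = ranked[0][0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
        price = runner_up / ys[winner]      # critical bid of the winner in this round
        matched.add(winner)
        outcome.append((winner, price))
    return outcome


# With fixed multipliers y = (0.5, 0.25): buyer 1 wins item 0 since 3.0 * 0.25 > 1.0 * 0.5.
print(honest_perturbed_greedy([{0: 1.0, 1: 3.0}, {0: 1.0}], ys=[0.5, 0.25]))
# → [(1, 2.0), (0, 0.0)]
```

Passing the multipliers as an argument keeps the example deterministic; in an actual run they would come from `draw_multipliers`.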

Non-myopic buyers with public graph edges
We now move our attention to a more demanding notion of truthfulness, where agents are assumed to know, and strategize about, the whole sequence of items arriving. Note that this is a strong information asymmetry between agents and mechanism, as the latter only discovers the items as they are revealed online and has no information on the future. As a first step in this challenging model, in this section we study the case where agents may only lie about their valuations. Our main focus here is on establishing lower bounds, which will naturally extend to the case where the edges of the graph are private information.

Tardy truthful mechanisms
When the graph edges are public knowledge, we can turn once again to the algorithmic approaches outlined in the previous section, i.e., Greedy and Perturbed-Greedy. Now that agents cannot strategically withhold or misreport the existence of edges, a tardy truthful mechanism can use the whole graph structure (but of course still not the reported value $b_i$) when computing the price charged to any buyer i.
The prompt, round-wise payment rules from our considerations on myopic buyers, however, do not guarantee non-myopic truthfulness. What remains to prove therefore is that these algorithms can be augmented by a different (tardy) payment rule to be made truthful. This is formally done in two steps in the Appendix: first, it is established that the allocations produced are monotone, and then Myerson's Lemma is employed on the whole algorithm. All in all, we obtain the following theorem.
Theorem 3. There exists a deterministic, respectively randomized, tardy mechanism that is truthful for non-myopic agents with public graph edges and guarantees a 2, resp. e/(e − 1), approximation to the best offline matching. The approximation is tight even for (non-truthful) deterministic, resp. randomized, algorithms.
A last observation: while the allocations computed by the mechanisms we just described are analogous to the ones computed by HonestGreedy and HonestPerturbedGreedy, the payments are different! We are still using Myerson's Lemma, but the critical prices are clearly different, as they are computed considering the whole run of the algorithm. To see this, consider the following example. There are two buyers, $b_1$ and $b_2$, and two items $i_1$ and $i_2$. Buyer $b_1$ is interested in both items and has a value of 1, while $b_2$ only cares about $i_1$, with a value of 0.9. Assume also for the sake of simplicity that the perturbations $y_1$ and $y_2$ of Perturbed-Greedy are both 1. Both versions of Perturbed-Greedy would only allocate $i_1$ to $b_1$, but at two different prices: the mechanism for myopic agents would charge 0.9, while the tardy one for non-myopic agents would wait until the end of the second round and charge 0. Indeed, had $b_1$ bid below 0.9, she would have lost $i_1$ to $b_2$ but still received $i_2$, so she is allocated an item at every positive bid.
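The tardy critical price in this example can be recomputed with a short Python sketch. It assumes the simplification used in the example (all perturbations equal to 1, so Perturbed-Greedy reduces to Greedy), takes the monotonicity of the allocation from the text, and uses hypothetical names and a 0-based encoding (buyer 0 is $b_1$, item 0 is $i_1$). Since Greedy only compares bids, whether a buyer is allocated is constant between two consecutive bids of the other buyers, so it suffices to probe one bid per such interval.

```python
def greedy_allocation(bids, interest):
    """Run Greedy over the whole item sequence; returns {buyer: item}.

    bids: one bid per buyer; interest: list over items in arrival order,
    each entry the set of buyers interested in that item (public edges).
    """
    assigned = {}
    for j, interested in enumerate(interest):
        free = [i for i in interested if i not in assigned and bids[i] > 0]
        if free:
            # Highest bid wins; ties go to the smaller index (assumption).
            winner = max(free, key=lambda i: (bids[i], -i))
            assigned[winner] = j
    return assigned


def tardy_critical_price(i, bids, interest):
    """Smallest threshold above which buyer i ends the whole run with an item."""
    others = sorted({b for k, b in enumerate(bids) if k != i and b > 0})
    thresholds = [0.0] + others
    for idx, low in enumerate(thresholds):
        high = thresholds[idx + 1] if idx + 1 < len(thresholds) else low + 1.0
        trial = list(bids)
        trial[i] = (low + high) / 2.0      # one probe strictly inside the interval
        if i in greedy_allocation(trial, interest):
            return low
    return None                            # buyer i is never allocated


bids = [1.0, 0.9]                # b_1 and b_2
interest = [{0, 1}, {0}]         # i_1 is wanted by both, i_2 only by b_1
print(tardy_critical_price(0, bids, interest))   # → 0.0
print(tardy_critical_price(1, bids, interest))   # → 1.0
```

The first value matches the tardy price of 0 from the example, in contrast with the round-wise second price of 0.9 that a myopic mechanism would charge for $i_1$.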

Prompt deterministic truthful mechanisms
When mechanisms are required to be prompt, the problem becomes much harder despite the fact that each agent's private information is just a single value. This is due to the online nature of the problem versus the possibly universal knowledge of the buyers, as outlined in the introduction. We first concentrate on deterministic prompt truthful mechanisms, and prove that the scope of these is indeed quite limited.

Definition 1 (critical item property).
We say that a deterministic mechanism satisfies the critical item property if and only if for every buyer i, there exists some j ∈ I such that for any reported value $b_i$ of i, the mechanism either assigns item j to i or assigns her nothing at all. Note that j may depend on the edges of the graph, and on the values of other buyers.

Lemma 1. Prompt deterministic truthful mechanisms for the problem with public graph edges satisfy the critical item property.
Proof. For the sake of contradiction, assume that there is a buyer i who gets item $j_1$ at price $p_1$ if she reports a value $\beta_1$ and gets item $j_2$ at price $p_2$ if she reports a value $\beta_2$. Without loss of generality, let $j_1 < j_2$. By truthfulness, the mechanism must give item $j_1$ to buyer i if she reports a value ≥ $p_1$ (as far as the mechanism knows, i might not like items after $j_1$, and she would have an incentive to lie and report $\beta_1$ if she is not given $j_1$). Thus, we have $p_2 \le \beta_2 < p_1$, where the first inequality comes from individual rationality. But now, buyer i has an incentive to report $\beta_2$, in order to get $j_2$ and pay $p_2$, which is less than $p_1$.
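To see what the critical item property rules out, the following Python sketch probes an allocation rule with several bids of one buyer and collects the items she receives; under the property this set has at most one element. As an illustration it is applied to the plain Greedy allocation rule on a two-buyer instance (buyer 0 wants both items, buyer 1 only the first), where buyer 0 receives different items for different bids. The helper names and the 0-based encoding are assumptions of this example.

```python
def greedy_allocation(bids, interest):
    """Greedy over the item sequence with public edges; returns {buyer: item}."""
    assigned = {}
    for j, interested in enumerate(interest):
        free = [i for i in interested if i not in assigned and bids[i] > 0]
        if free:
            winner = max(free, key=lambda i: (bids[i], -i))  # ties to smaller index
            assigned[winner] = j
    return assigned


def items_received(i, probe_bids, bids, interest, allocate):
    """Set of items buyer i can end up with as her reported value varies."""
    received = set()
    for b in probe_bids:
        trial = list(bids)
        trial[i] = b
        item = allocate(trial, interest).get(i)
        if item is not None:
            received.add(item)
    return received


bids = [1.0, 0.9]
interest = [{0, 1}, {0}]
got = items_received(0, [0.45, 1.0], bids, interest, greedy_allocation)
print(got)   # buyer 0 gets item 1 with bid 0.45 and item 0 with bid 1.0
```

Since two different items appear, the Greedy allocation rule violates the critical item property on this instance, which by Lemma 1 means that no prompt payment rule can make it truthful for non-myopic agents.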

Theorem 4. Any prompt deterministic truthful mechanism for the problem with public graph edges has competitive ratio of at least ν = min(m, n).
Proof. Consider an instance with n buyers with value 1 that are all interested in the first item. If there is a buyer i who will never get item 1 no matter what she reports, then we change the instance so that i has an arbitrarily large value and is only interested in item 1, in which case i will get nothing and the mechanism does not even approximate the optimal social welfare. Conversely, if there is no such buyer, then the critical item property states that no other item can be allocated, which gives an approximation ratio of min(m, n).

Prompt randomized truthful mechanisms
Somewhat surprisingly, the previous section has revealed a very large gap between tardy and prompt deterministic mechanisms, when the topology of the graph is public knowledge: while tardy mechanisms can be implemented for free, i.e., maintaining the efficiency guarantees of (non-strategic) combinatorial algorithms, for prompt mechanisms the story is different. After showing that deterministic mechanisms cannot achieve anything better than ν, we now turn our focus towards impossibility results for randomized mechanisms. We utilize a well-known property of randomized truthful mechanisms, which (by definition) make truthful reports utility-maximizing for any outcome of a mechanism's random decisions, even in hindsight: this implies that they are in fact lotteries over deterministic truthful mechanisms, which in turn satisfy the characterizing properties shown in the previous section.

Theorem 5. Any prompt randomized truthful mechanism for the problem with public graph edges has competitive ratio of at least Ω(log ν/ log log ν).
Proof. Fix any prompt randomized ex-post truthful mechanism for public graph edges. We are going to argue by Yao's principle [27] that its competitive ratio is at least Ω(log ν/ log log ν). This holds due to the upcoming Lemma 2, which shows that there exists a distribution over instances with ν = n such that the optimal solutions have expected welfare Ω(n log n), whereas a best-possible deterministic truthful mechanism M, which satisfies the critical item property by Lemma 1, outputs solutions with expected value O(n log log n).
Lemma 2. There is a distribution over instances with n buyers and n items, for which optimal solutions have expected value Ω(n log n), whereas any deterministic mechanism satisfying the critical item property outputs solutions whose expected value is O(n log log n).
Proof. Let $k \ge 1$ be a parameter, which corresponds to the number of types of buyers, and let $\beta_1, \dots, \beta_k$ denote the type probabilities, with $\sum_{t=1}^{k} \beta_t = 1$. Consider the following distribution over instances with $n$ buyers and $n$ items. Each buyer $i$ independently draws a type $t(i) \in \{1, \dots, k\}$, where type $t$ is drawn with probability $\beta_t$, and we set her value to $v_i = 1/\beta_{t(i)}$. Then, we sort buyers by decreasing $t(i)$, breaking ties using indices, and call $\sigma(i) \in \{1, \dots, n\}$ the rank of buyer $i$ in this ordering. Buyer $i$ is interested in all items up to the $\sigma(i)$-th one. Figure 1 illustrates this procedure.
It is easy to find the optimal allocation: it assigns the buyer of rank $\sigma(i)$ the $\sigma(i)$-th item, which yields a perfect matching. Thus the expected optimal social welfare is equal to
$$\sum_{i=1}^{n} \mathbb{E}[v_i] = n \sum_{t=1}^{k} \beta_t \cdot \frac{1}{\beta_t} = n k.$$
We now define the type $s(j) = t(\sigma^{-1}(j))$ of an item $j$ as the type of the $j$-th buyer in the ordering $\sigma$, that is, the type of the buyer matched to $j$ in the optimal matching above. Observe that, for each type, there are as many items as buyers, and that buyer $i$ cannot be allocated an item $j$ of type $s(j) < t(i)$. For each buyer $i$ and all types $t \le s$, let $x^i_{s,t}$ be the probability (over the randomness of the types of all buyers except $i$) that $i$ gets an item of type $s$, conditioned on $i$ having type $t$. Let $x_{s,t} = \sum_i x^i_{s,t}/n$, that is, the average probability that a type-$t$ buyer is assigned a type-$s$ item. The expected social welfare of our deterministic mechanism is equal to
$$\sum_{i=1}^{n} \sum_{t=1}^{k} \beta_t \cdot \frac{1}{\beta_t} \sum_{s=t}^{k} x^i_{s,t} = n \sum_{1 \le t \le s \le k} x_{s,t}.$$
In expectation, the mechanism sells $\sum_i \sum_{t=1}^{s} \beta_t \cdot x^i_{s,t}$ items of type $s$. Because there are equally many items and buyers of each type, the expected number of items of type $s$ is $\beta_s \cdot n$. Thus, we have the linear constraint
$$\sum_{t=1}^{s} \beta_t \cdot x_{s,t} \le \beta_s \quad \text{for all } 1 \le s \le k.$$
We are now going to use the critical item property. Fix a buyer $i$, and condition on the types of all buyers except her. We show that there exists an item $j(i) \in \{1, \dots, n\}$ such that, for every type $t(i)$, either $i$ gets item $j(i)$ or she gets nothing. Denote by $I_t$ the instance given by the fixed types of all buyers except $i$, together with buyer $i$ having type $t$. Applying the critical item property to instance $I_1$, in which $i$ has type 1 (meaning that $i$ is interested in as many items as possible), there is an item $j(i)$ such that buyer $i$ either gets $j(i)$ or nothing. From the perspective of the mechanism, any other instance $I_t$ is identical to instance $I_1$ up to the point when $i$ stops being interested in items. At this point, if buyer $i$ has already been allocated an item, then it must be $j(i)$. Otherwise, she will not get anything.
Now that $j(i)$ is well-defined (and only depends on the types of the other buyers), let $y^i_s$ be the probability (over the randomness of the types of all buyers except $i$) that there exists some type $t$ such that, if $t$ is the type of $i$, then item $j(i)$ has type $s$. Let $y_s = \sum_i y^i_s/n$. Because buyer $i$ can only get item $j(i)$, and because $j(i)$ is independent of $t(i)$, we have $x^i_{s,t} \le y^i_s$. Thus, summing over all buyers, we have the linear constraint $x_{s,t} \le y_s$ for all $1 \le t \le s \le k$.
Finally, conditioning on the types of all buyers except $i$, we show that there is only a small number of types that $j(i)$ can take. Recall that $s(j(i)) = t(\sigma^{-1}(j(i)))$, that is, the type of item $j(i)$ is by definition the type of the $j(i)$-th buyer in the ordering $\sigma$, where $\sigma$ was obtained by sorting buyers in decreasing order of type. Consider the ordering induced by $\sigma$ after excluding buyer $i$, and denote by $i_1$ and $i_2$ the buyers of rank $j(i) - 1$ and $j(i)$ in it. In the original ordering $\sigma$, either $i$ comes before $i_1$ (in which case $s(j(i)) = t(i_1)$), or $i$ comes after $i_2$ (in which case $s(j(i)) = t(i_2)$), or $i$ comes between $i_1$ and $i_2$ (in which case $s(j(i)) = t(i)$). In any case, $t(i_1) \ge s(j(i)) \ge t(i_2)$. This shows that there are at most $2 + z$ possible values for $s(j(i))$, where $z$ denotes the number of types not seen among the other buyers. By a standard computation, the expected value of $z$ is at most $\sum_{t=1}^{k} (1 - \beta_t)^{n-1}$.
Recall that $y_s$ denotes the average probability over $i$ that there exists a type for $i$ which can make $j(i)$ have type $s$, where the randomness is over the instance without $i$. Since for every fixed such instance, $j(i)$ can only take two of the types seen among the buyers other than $i$, for any fixed $i$ it holds that
$$\sum_{s=1}^{k} y^i_s \le 2 + \sum_{t=1}^{k} (1 - \beta_t)^{n-1},$$
and therefore the same holds on average, i.e., for the $y_s$. Thus, averaging over the possible types of the other buyers, and summing over $i$, we have the linear constraint
$$\sum_{s=1}^{k} y_s \le \alpha, \quad \text{where } \alpha = 2 + \sum_{t=1}^{k} (1 - \beta_t)^{n-1},$$
and thus $\alpha \le 3$.
To conclude the proof, we use the linear constraints obtained to define a linear program (P) whose objective function is the expected value of the social welfare obtained by a deterministic truthful mechanism. We want to show that the objective value of our linear program is at most $O(n \log k)$. To this end, Lemma 3 builds a solution for the dual linear program (D), whose value is an upper bound on the value of the primal linear program (for convenience, the objective function is divided by $n$).
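The random instance used in the proof of Lemma 2 can be sampled with a few lines of code. The following is a minimal sketch, not part of the original construction: the function names and the concrete probabilities passed as `beta` are hypothetical, and the caller is assumed to supply probabilities that sum to 1.

```python
import random

def sample_instance(n, beta, rng):
    """Sample one instance of the distribution from the proof of Lemma 2.

    beta[t - 1] is the probability of type t (types are 1..k). The concrete
    values of beta are an assumption supplied by the caller.
    Returns (types, values, rank, interest), where interest[i] is the set of
    items (numbered 1..n in arrival order) that buyer i is interested in.
    """
    k = len(beta)
    # Each buyer independently draws type t with probability beta[t - 1].
    types = rng.choices(range(1, k + 1), weights=beta, k=n)
    # A buyer of type t has value 1 / beta_t.
    values = [1.0 / beta[t - 1] for t in types]
    # Sort buyers by decreasing type, breaking ties by index.
    order = sorted(range(n), key=lambda i: (-types[i], i))
    rank = [0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = position
    # Buyer i is interested in all items up to the rank(i)-th one.
    interest = [set(range(1, rank[i] + 1)) for i in range(n)]
    return types, values, rank, interest

def optimal_welfare(values):
    # Matching the buyer of rank r to item r is a perfect matching,
    # so every buyer is served and the optimum is the sum of all values.
    return sum(values)

# Hypothetical parameters: k = 3 types and n = 9 buyers, as in Figure 1.
rng = random.Random(0)
types, values, rank, interest = sample_instance(9, [0.5, 0.25, 0.25], rng)
print(optimal_welfare(values))
```

Since every buyer has expected value $\sum_t \beta_t \cdot (1/\beta_t) = k$, averaging `optimal_welfare` over many samples should approach $n k$.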

then the dual (B) has a feasible solution of value O(α log k).
Proof. Set $\delta = \lceil \log_2 k \rceil$. Then the following solution of the dual is feasible and yields the desired objective value: $w = \delta$, $v_s = 0$ if $s < \delta$ and $v_s = 2^{s-\delta}$ otherwise, while the $u_{s,t}$ are defined as:

Non-myopic buyers with private graph edges
Next, we move on to the (harder) case where the graph edges are private information of the agents. Interestingly, the additional hardness severely affects the competitive guarantees only for deterministic truthful mechanisms. As before, we begin by characterizing these, and then move on to results for randomized mechanisms.

Deterministic truthful mechanisms
In the previous section we assumed that the agents could not misreport their interest in items, thus reducing the problem to a single-parameter one. We now lift this assumption, and investigate the effect on the competitive ratio of deterministic truthful mechanisms. We show that deterministic truthful mechanisms can always be implemented in a prompt manner. Then, we give matching upper and lower bounds on the best approximation ratio for the social welfare.

Lemma 4. Tardy deterministic truthful mechanisms for the problem with private graph edges satisfy the critical item property (see Definition 1).
Proof. For the sake of contradiction, assume that there is a buyer $i$ who gets item $j_1$ at price $p_1$ if she reports a value $\beta_1$, and gets item $j_2$ at price $p_2$ if she reports a value $\beta_2$. Without loss of generality, we assume that $j_1 < j_2$. First, we argue that $p_1 = p_2$. Indeed, if $p_1 > p_2$ then $i$ with value $\beta_1$ has an incentive to lie and report $\beta_2$; whereas if $p_1 < p_2$ then $i$ with value $\beta_2$ has an incentive to lie and report $\beta_1$. Second, we slightly change the instance, such that buyer $i$ has value $\beta_2$ and is not interested in items after $j_1$. When allocating $j_1$, the mechanism has not seen any difference with the original instance, hence $i$ has an incentive to lie and report $\beta_1$ to get $j_1$, and then to lie again and pretend she is interested in subsequent items to make sure she is charged $p_1$.

Lemma 5. Tardy deterministic truthful mechanisms for the problem with private graph edges are prompt.
Proof. Assume that our mechanism assigns an item $j$ to a buyer $i$, who reports a value $b_i$. By Lemma 4, the mechanism satisfies the critical item property, and $j$ is the only item which can be assigned to $i$. Let $\pi$ be the minimum value that $i$ could have reported and still be assigned $j$. By truthfulness, $i$ must be charged exactly $\pi$. Indeed, if she is charged $p > \pi$ then $i$ with value $b_i$ has an incentive to lie and report $\pi$; whereas if she is charged $p < \pi$ then $i$ with a value strictly between $p$ and $\pi$ would have an incentive to lie and report $b_i$. Now, observe that when the mechanism assigns $j$ to $i$, it can retrospectively compute $\pi$, which proves that the mechanism is prompt.
Theorem 6. There exists a deterministic truthful mechanism that achieves a $\nu = \min(m, n)$ approximation of the offline optimum. This result is tight in the class of deterministic truthful mechanisms, when graph edges are private.
Proof. We start by presenting the positive result. Consider the simple mechanism which only assigns an item to a buyer if she has the highest value seen so far (breaking ties arbitrarily), charging her the second highest value seen so far. It is immediate to verify that this is a deterministic truthful mechanism with an approximation ratio of $\nu = \min(m, n)$. For the tightness of the result, Lemma 5 shows that deterministic tardy mechanisms are in fact prompt, thus the lower bound from Theorem 5 (where graph edges are public) applies to this setting.
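To make the allocation and payment rule of this simple mechanism concrete, here is a minimal sketch under one possible reading of "seen so far": a buyer counts as seen once she has reported interest in some item. The function name, the input format, and the tie-breaking by buyer identifier are assumptions made for illustration; the sketch does not verify truthfulness.

```python
def highest_so_far_mechanism(bids, arrivals):
    """Assign each arriving item only to the highest bidder seen so far.

    bids: dict mapping buyer id -> reported value.
    arrivals: list of sets; arrivals[j] holds the buyers who report
        interest in item j (items arrive in list order).
    Returns a dict mapping item index -> (buyer, price).
    """
    seen = set()      # buyers who have reported interest in some item so far
    matched = set()   # buyers who already hold an item
    assignment = {}
    for j, interested in enumerate(arrivals):
        seen |= interested
        if not seen:
            continue
        # Rank seen buyers by decreasing bid; ties broken by buyer id
        # (an arbitrary choice, as allowed by the description).
        ranked = sorted(seen, key=lambda i: (-bids[i], i))
        top = ranked[0]
        if top in interested and top not in matched:
            # Charge the second highest value seen so far (0 if none).
            price = bids[ranked[1]] if len(ranked) > 1 else 0.0
            assignment[j] = (top, price)
            matched.add(top)
    return assignment

bids = {0: 3, 1: 5, 2: 4}
arrivals = [{0}, {0, 1}, {2}, {1, 2}]
print(highest_so_far_mechanism(bids, arrivals))
```

In this example buyer 0 receives item 0 for free and buyer 1 receives item 1 at price 3; items 2 and 3 stay unassigned because the highest bidder seen so far is either not interested or already matched. The buyer with the overall highest value among interested buyers is always served, which is where the factor $\nu$ comes from.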

Randomized truthful mechanisms
Recall that randomized (ex-post) truthful mechanisms are lotteries over deterministic truthful mechanisms, which in turn satisfy the characterizing properties we obtain for the deterministic case. The proof of our lower bound in Theorem 5 was based on this fact. First, we give a short argument why it also applies to mechanisms for private edges, even when they are tardy. Then, we provide an (almost) matching upper bound, namely a prompt randomized truthful logarithmic approximation.

Corollary 1. The $\Omega(\log n / \log\log n)$ lower bound of Theorem 5 also holds for the case of private edges, even for tardy mechanisms.
Proof. Fix all random decisions of an ex-post truthful randomized mechanism. This yields a deterministic algorithm which, together with the original mechanism's payment scheme, yields a (tardy) mechanism. This mechanism is deterministic, and truthful by the definition of ex-post truthfulness. Also, such a mechanism fulfills the critical item property (Lemma 4), and can even be made prompt (Lemma 5). With this, we can follow the original proof of the lower bound.
We now state our prompt mechanism for the problem with private edges, and prove that its ratio almost matches our lower bound.
4: When an item arrives: Buyers report if they are interested in the item.
6: For each buyer i of type t_i = Explore who is interested in the item, do
7: Sell the item at price p to a buyer i of type t_i = Exploit, who is interested in the item and does not yet have an item, chosen arbitrarily (e.g., lowest index).
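The following runnable sketch shows one way the Explore-Exploit Mechanism could be completed. Only the per-item steps are taken from the listing; everything else is an assumption inferred from the analysis and labeled as such in the comments: that $k$ is drawn uniformly from $\{0, \dots, \lceil \log_2 n \rceil\}$, that each buyer is Explore or Exploit with probability 1/2, that Explore buyers raise the posted price to their value divided by $2^k$, and that an Exploit buyer accepts exactly when her value is at least the price. The function name and input format are hypothetical.

```python
import math
import random

def explore_exploit(values, arrivals, rng):
    """Sketch of an explore/exploit posted-price mechanism.

    values: list of reported values (index = buyer id).
    arrivals: list of sets; arrivals[j] = buyers interested in item j.
    Returns (k, types, sales), where sales lists (item, buyer, price).
    """
    n = len(values)
    # ASSUMPTION: k is uniform in {0, ..., ceil(log2 n)}.
    k = rng.randint(0, math.ceil(math.log2(max(n, 1))))
    # ASSUMPTION: each buyer is Explore or Exploit with probability 1/2.
    types = [rng.choice(["Explore", "Exploit"]) for _ in range(n)]
    price = 0.0
    matched = set()
    sales = []
    for j, interested in enumerate(arrivals):
        # ASSUMPTION: Explore buyers only raise the posted price.
        for i in interested:
            if types[i] == "Explore":
                price = max(price, values[i] / 2 ** k)
        # Sell at the posted price to an unmatched, interested Exploit
        # buyer who is willing to pay it (lowest index first).
        takers = [i for i in sorted(interested)
                  if types[i] == "Exploit"
                  and i not in matched
                  and values[i] >= price]
        if takers:
            buyer = takers[0]
            matched.add(buyer)
            sales.append((j, buyer, price))
    return k, types, sales

values = [8.0, 1.0, 4.0, 2.0, 6.0, 3.0]
arrivals = [{0, 1}, {1, 2, 3}, {2, 4}, {0, 3, 5}, {4, 5}]
print(explore_exploit(values, arrivals, random.Random(1)))
```

In this sketch the posted price never decreases and Explore buyers never receive an item, which are the two properties the truthfulness argument relies on.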
Theorem 7. The Explore-Exploit Mechanism is truthful, and computes an $O(\log n)$-approximation to the optimal social welfare.
Proof. Buyers of type Explore will not get any item, and thus have no incentive to lie. Buyers of type Exploit only need to say whether they are interested in buying an item at a given price. Because prices are non-decreasing, they have no incentive to misreport their value or their interest in an item. For each item $j$, we define $x_j$ as the maximum value seen among buyers interested in items up to $j$.
For the sake of analysis, we look at a maximum weight matching $\mu \subseteq E$, having a total value of $\mathrm{OPT}$. Each edge $(i, j) \in \mu$ from the optimal solution is assigned to a bucket $\ell_{(i,j)} = \lceil \log_2 (x_j / v_i) \rceil \in \mathbb{N}$. Then for each $\ell \in \mathbb{N}$ we define $\mathrm{OPT}_\ell$ as the total weight of the restriction of the optimal solution to bucket $\ell$.
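The bucketing used in the analysis can be computed directly. The sketch below is only an illustration: the function name and input format are hypothetical, the maximum weight matching is assumed to be given as input rather than computed, and all values are assumed to be strictly positive.

```python
import math

def bucket_weights(values, arrivals, matching):
    """Group the edges of a given matching into the buckets of the analysis.

    values[i]: value of buyer i (assumed > 0).
    arrivals[j]: set of buyers interested in item j (arrival order).
    matching: list of (buyer, item) edges, assumed to be a maximum
        weight matching supplied by the caller.
    Returns a dict mapping bucket index l -> total value OPT_l.
    """
    # x[j] = maximum value among buyers interested in items 0..j.
    x, best = [], 0.0
    for interested in arrivals:
        best = max([best] + [values[i] for i in interested])
        x.append(best)
    opt = {}
    for i, j in matching:
        # Buyer i is interested in item j, so x[j] >= values[i]
        # and the bucket index is non-negative.
        level = math.ceil(math.log2(x[j] / values[i]))
        opt[level] = opt.get(level, 0.0) + values[i]
    return opt

values = [8.0, 1.0, 4.0]
arrivals = [{1}, {0, 1}, {2}]
matching = [(1, 0), (0, 1), (2, 2)]
print(bucket_weights(values, arrivals, matching))
```

Here the edges of buyers 1 and 0 land in bucket 0 (each buyer is the best seen when her item arrives), while buyer 2 has half the best value seen so far and lands in bucket 1.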
Let $V$ be the maximum value among buyers who are interested in at least one item. By optimality of $\mu$, the corresponding buyer must be given an item, and thus $\mathrm{OPT}_0 \ge V$. Now observe that for each $(i, j) \in \mu$ such that $\ell_{(i,j)} > \lceil \log_2 n \rceil$, we have $v_i < x_j/n \le V/n \le \mathrm{OPT}_0/n$. Thus, the sum of $\mathrm{OPT}_\ell$ for $\ell > \lceil \log_2 n \rceil$ is smaller than $\mathrm{OPT}_0$. Therefore, buckets $0, 1, \dots, \lceil \log_2 n \rceil$ contain at least half of $\mathrm{OPT}$, that is,
$$\sum_{\ell=0}^{\lceil \log_2 n \rceil} \mathrm{OPT}_\ell \ge \mathrm{OPT}/2.$$
For all $\ell \in \{0, 1, \dots, \lceil \log_2 n \rceil\}$, we will now show that if $k = \ell$ then the Explore-Exploit Mechanism gives a solution of expected value at least $\Omega(\mathrm{OPT}_\ell)$. Then we will conclude the proof using the law of total probability: summing over $k$ shows that the Explore-Exploit Mechanism computes a solution of expected value at least $\Omega(\mathrm{OPT}/\log n)$.
First, assume that $k = 0$. For each edge $(i, j) \in \mu$ in bucket $\ell_{(i,j)} = 0$, buyer $i$ is the best buyer seen so far. With probability 1/4, buyer $i$ has type Exploit and the second best buyer has type Explore. In that case, the Explore-Exploit Mechanism gives buyer $i$ an item (either $j$ or one of the previous items). Using linearity of expectation, the Explore-Exploit Mechanism outputs a solution of expected value at least $\mathrm{OPT}_0/4$.
Second, assume that $k = \ell$ with $\ell \in \{1, \dots, \lceil \log_2 n \rceil\}$. This case requires an amortized analysis: for each buyer $i$, denote by $X_i$ the random variable equal to $v_i$ if $i$ gets an item and 0 otherwise; and for each item $j$, denote by $Y_j$ the random variable equal to the value of the buyer to whom $j$ is assigned, and 0 if $j$ is unassigned. Notice that the Explore-Exploit Mechanism outputs a solution of value $\sum_{i \in B} X_i = \sum_{j \in I} Y_j$. Let $(i, j) \in \mu$ be an edge from bucket $\ell_{(i,j)} = \ell$. We are going to lower-bound $\mathbb{E}[X_i + Y_j \mid k = \ell]$ by a constant fraction of $v_i$. We condition on the fact that $k = \ell$ and $t_i = \text{Exploit}$. If buyer $i$ already has an item when item $j$ arrives, then $X_i = v_i$. Otherwise, the best buyer seen so far has type Explore with probability 1/2, in which case the Explore-Exploit Mechanism gives item $j$ to a buyer of value $\ge x_j/2^\ell \ge v_i/2$. Buyer $i$ has type Exploit with probability 1/2. Summing the resulting inequality over edges from bucket $\ell$ shows that the Explore-Exploit Mechanism outputs a solution of expected value at least $\mathrm{OPT}_\ell/10$.

Ex-ante truthfulness
One might wonder if the hardness of truthful mechanisms for our problem is mainly due to the very restrictive notion of ex-post truthfulness. We show here that, even for the much looser notion of ex-ante truthfulness, the setting of non-myopic buyers separates clearly from the myopic case. The proof can be found in Appendix B.
Theorem 8. There exists no randomized ex-ante truthful mechanism that yields an $\alpha$-approximation to the optimal social welfare, for the problem with private edges and any $\alpha < 2$. This is true even for tardy mechanisms.

Conclusions
We have studied vertex-weighted bipartite online matching with offline agents in various settings, obtaining an almost complete picture of the competitive ratios achievable by mechanisms under different truthfulness notions. Our results show that, for myopic truthfulness, the bounds of Karp, Vazirani and Vazirani [16] and Aggarwal et al. [1] transfer to the online agents setting. This showcases that the very general myopic bounds of Deng, Panigrahi and Zhang [7] are far from tight for restricted settings like ours. On the other hand, we also show that equally near-optimal approximations are impossible under the assumption of classic truthfulness, even ex-ante; and for ex-post truthfulness our seemingly simple problem already exhibits lower bounds almost matching the myopic, logarithmic competitive ratio for submodular combinatorial auctions in Deng, Panigrahi and Zhang [7]. We leave open to what extent this additional hardness (moving from a tight e/(e − 1) myopic bound to an Ω(log n/ log log n) truthful one) already arises when imposing ex-ante truthfulness. This is an interesting subject of investigation, also for scenarios other than the one of our lower bound of 2 (private edges). Obtaining corresponding positive or negative results for other variants of online problems with offline agents is another natural direction. Besides this, note that our work considers only the (especially hard) case of adversarial arrival order, which raises the question of which improved bounds can be obtained, e.g., for random-order models. We suspect that non-trivial approximations via (ex-post) truthful mechanisms quickly cease to exist when considering online problems with offline agents that are more general and challenging than ours. On the other hand, under the myopic assumption, these could exhibit interesting bounds situated between our e/(e − 1) and the logarithmic mechanism for submodular combinatorial auctions [7].

A Proofs of Theorems 2 and 3
In this section we prove the properties of HonestPerturbedGreedy and of the tardy versions of Greedy and Perturbed-Greedy presented in the main body. Starting from the guarantees of their non-strategic counterparts, it is immediate to see that the claimed approximation factors are indeed correct. The only property left to show is incentive compatibility. A crucial ingredient for proving incentive compatibility is Myerson's Lemma, which we recall here for the sake of completeness. The lemma was proved in Myerson's seminal paper [22]; here we follow the more modern approach by Roughgarden [24]. Since in this paper we study unit-demand agents, we restrict attention to this type of agents. We start by introducing the notion of single-parameter environments. In such environments, there are $n$ agents and a set $X$ of feasible allocations of items to agents. Each agent is characterized by a private valuation for getting an item and strives to maximize her quasi-linear utility. To become familiar with this notion, consider the model of non-myopic buyers with public graph edges studied in the paper: those agents are indeed single-parameter, as their valuation is their only private information. At the same time, note that the "edge compatibility" is implicitly modeled by the following set of feasible allocations of items to agents: an allocation $x \in \{0, 1\}^n$ is feasible if and only if it corresponds to a matching in the underlying buyers-items bipartite graph.
As already mentioned in the main body, a mechanism $M$ is characterized by two features: an allocation rule $x$ (with outcomes in $X$) and a payment rule $p$. While the allocation rule specifies who gets what, the payment rule defines how much each agent pays. Allocation and payments are functions of the bids; in particular, we use the notation $x_i(b_i, b_{-i}) \in \{0, 1\}$ to specify whether the $i$-th agent is allocated an item, given her bid $b_i$ and the $n - 1$ bids $b_{-i}$ of the other agents. We are ready for the following crucial definition:
Definition 2 (Monotone allocation). An allocation rule $x$ for a single-parameter environment is monotone if for every bidder $i$ and bids $b_{-i}$ by the other bidders, the allocation $x_i(z, b_{-i})$ to $i$ is nondecreasing in her bid $z$.

Definition 3 (Critical prices). Fix an agent $i$ and bids $b_{-i}$ of the other agents. Then the critical price for $i$ is defined as the smallest bid $z_i$ such that $i$ is allocated an item, if any. Formally, if we use the convention that the inf of an empty set is 0, we have
$$z_i = \inf\{ z \ge 0 : x_i(z, b_{-i}) = 1 \}.$$
Clearly, the critical prices enforce ex-post individual rationality. Myerson showed that they also induce (ex-post) truthfulness; we report here a version of Lemma 2 of Myerson [22] that is tailored to our problem.
Theorem 9 (Myerson's Lemma). Fix a single-parameter environment. Given any monotone allocation rule $x$, it is possible to compute a payment scheme $p$ such that the resulting mechanism is truthful and individually rational. In particular, under $p$, each agent that receives an item pays her critical price, and every other agent pays 0.
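As an illustration of critical prices, consider a single item that is awarded to the buyer with the largest weighted bid $y_i \cdot b_i$, in the spirit of the perturbed allocation rule discussed in this appendix. The sketch below is not the paper's pseudocode: the function names are hypothetical, the weights are assumed to be public and strictly positive, and ties are broken towards the lowest index.

```python
def weighted_winner(bids, weights):
    """Index with the largest weights[i] * bids[i] (lowest index on ties)."""
    return max(range(len(bids)), key=lambda i: (weights[i] * bids[i], -i))

def critical_price(i, bids, weights):
    """Infimum of the bids with which buyer i wins, other bids fixed.

    Buyer i wins once weights[i] * z exceeds the best weighted bid among
    the other buyers, so the threshold is that best weighted bid divided
    by weights[i] (assumed > 0).
    """
    best_other = max(
        (weights[h] * bids[h] for h in range(len(bids)) if h != i),
        default=0.0,
    )
    return best_other / weights[i]

bids = [10.0, 6.0, 8.0]
weights = [0.5, 1.0, 0.5]
winner = weighted_winner(bids, weights)
print(winner, critical_price(winner, bids, weights))
```

In this example buyer 1 wins with weighted bid 6 and pays 5, the smallest bid that still beats the best competing weighted bid. The allocation is monotone in each bid, so charging the winner this threshold is the payment rule prescribed by Myerson's Lemma for this rule.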
We are now ready to show the two Theorems.
Theorem 2. The randomized prompt mechanism HonestPerturbedGreedy is truthful for myopic agents and achieves (in expectation) an e/(e − 1) approximation to the best offline matching. The approximation is tight even for (non-truthful) randomized algorithms.
Proof. We start the proof by arguing that HonestPerturbedGreedy is truthful and individually rational for myopic agents. First, note that when any item $j$ arrives, the buyers who are still unallocated have no reason to lie about their interest in it: if they are not interested and they bid, they risk getting $j$ and losing future opportunities to be allocated something they are interested in, while if they are interested they do not want to lose the opportunity (since they have no information about the future, and the prices charged never exceed their valuations). If we restrict attention to the buyers $N(j)$ interested in item $j$, we see that the problem reduces to a single-parameter auction: the agents are myopic and just want to maximize their utility by getting $j$ at a small price. All $y_i$ are public knowledge and non-negative, so our allocation rule (line 7 of HonestPerturbedGreedy), for these fixed values, is clearly monotone (the more an agent $i$ bids, the more likely she is to exhibit the largest $y_i \cdot b_i$). The allocation is therefore implementable using the Myerson payment rule (line 8 of HonestPerturbedGreedy). We can conclude, by Myerson's Lemma, that our mechanism is truthful for myopic buyers. Moreover, it is easy to verify that the payment rule enforces individual rationality. Having settled truthfulness, we can assume that all buyers declare their true bids, and thus the allocation output by HonestPerturbedGreedy is the same as that of Perturbed-Greedy for any realization of the perturbations $x_i$, and it inherits the same approximation guarantee: HonestPerturbedGreedy is e/(e − 1)-competitive in expectation.
Theorem 3. There exists a deterministic, respectively randomized, tardy mechanism that is truthful for non-myopic agents with public graph edges and guarantees a 2, resp. e/(e − 1), approximation to the best offline matching. The approximation is tight even for (non-truthful) deterministic, resp. randomized, algorithms.
Proof. It is easy to see that the allocation rules of the two mechanisms are monotone, thus it is possible to apply Myerson's Lemma directly, as the problem is single-parameter (i.e., the only private information of buyer $i$ is the single value $v_i$). Therefore, Greedy or Perturbed-Greedy (with fixed perturbation factors) together with the critical payments defined in Myerson's Lemma results in a truthful mechanism. Note that the greedy algorithm clearly respects our ex-post notion of truthfulness, since no randomization is involved. For the Perturbed-Greedy algorithm, this is also true since we fix all random decisions (perturbations) up front, and choose the payment rule accordingly.

B Proof of Theorem 8
Theorem 8. There exists no randomized ex-ante truthful mechanism that yields an $\alpha$-approximation to the optimal social welfare, for the problem with private edges and any $\alpha < 2$. This is true even for tardy mechanisms.
Proof. Fix $\alpha < 2$ and assume that mechanism $M$ guarantees an expected approximation ratio of $\alpha$. Consider the following problem instance: there are $n'$ buyers and $m = n' + 1$ items. Every item $j$ has exactly one interested buyer, $i_j$, and all $i_j$ have some small value $v_{i_j} = \varepsilon > 0$. There exist some additional buyers $B_1 \subseteq B$ with different values who are interested only in item 1, and one buyer, $i$, whom we fix for our considerations. Note that $|B| = n' + n_1$, with $n_1 = |B_1|$. For $n'$ large enough, clearly, $n' \varepsilon > \max_{i' \in B_1} v_{i'}$ and the contribution of item 1 to the optimum becomes negligible with growing $n'$. Therefore, for $M$ to guarantee an $\alpha$-approximation, there must exist $j \in \{2, \dots, n' + 1\}$ such that $i_j$ is assigned the corresponding item with probability at least $1/\alpha$, or, in case item 1 is worth more than $\varepsilon$, with probability at least $1/\alpha - \Delta_1$, where $\Delta_1$ is arbitrarily small for large $n'$. Now, if we choose $i = i_j$, then $M$ will assign item $j$ to $i_j$ with probability at least $1/\alpha - \Delta_1$, and charge an expected price of at most $\varepsilon$. The latter holds because the price cannot depend on $i$'s bid due to incentive compatibility, and it needs to be below $i$'s value.
Assume we replace $i$'s valuation by some $v > \varepsilon$, and call this new buyer $i^{(1)}$. Since $M$ is ex-ante truthful, the expected utility $u_{i^{(1)}}$ achieved with a truthful report must still be at least as large as when reporting $\varepsilon$ instead of $v$, i.e., at least $(v - \varepsilon)(1/\alpha - \Delta_1) > v/2$, where the inequality holds because $\alpha < 2$ and $\varepsilon, \Delta_1$ can be chosen arbitrarily small.
We replace $i^{(1)}$ again by a different buyer $i = i^{(2)}$. She still has valuation $v$; however, she is now interested in items 1 and $j$. We consider the first step of $M$, i.e., the assignment decision made for item 1. Assuming that $v$ is the largest value bid on item 1, and given that $M$ cannot know whether any additional value will present itself in later steps, the probability that $M$ assigns item 1 to $i^{(2)}$ is at least $1/\alpha - \Delta_2$, where $\Delta_2$ approaches 0 since the other bids on item 1 might be, in comparison, too small to matter. Note again that the assignment decision cannot depend on $v$ itself, but only on the fact that it is the largest value bid on item 1.
We know that $i^{(2)}$ can get utility larger than $v/2$ by simply reporting type $i^{(1)}$ instead. We also know that, since she is assigned item 1 with probability greater than $1/2$, she is assigned item $j$ with probability less than $1/2$. Intuitively, this means that not all of the guaranteed utility is generated by item $j$, not even if the price of $j$ is always 0; some of it must come from item 1, that is, her expected price paid when item 1 is assigned is bounded away from $v$: $p_{i^{(2)}}(1) = v - \Delta_3$. In fact, the expected price $M$ charges $i^{(2)}$ when assigning item 1 cannot be smaller if $i^{(2)}$ later reports interest in item $j$, since this would give a buyer of type $i^{(1)}$ an incentive to also report interest in $j$. Also, the price charged to $i^{(2)}$ when assigning item $j$ cannot be less than 0, and when no item is assigned, $i^{(2)}$ is not charged anything (see preliminaries). This implies that, with $P_k(i)$ denoting the probability that item $k$ is assigned to buyer $i$,
$$u_{i^{(2)}} = (v - p_{i^{(2)}}(1)) \cdot P_1(i^{(2)}) + (v - p_{i^{(2)}}(j)) \cdot P_j(i^{(2)}) = \Delta_3 \cdot P_1(i^{(2)}) + (v - p_{i^{(2)}}(j)) \cdot P_j(i^{(2)}) > v/2.$$
Otherwise, we would have a contradiction to the utility being larger than $v/2$, i.e., it would be beneficial for $i^{(2)}$ to only report interest in item $j$. In consequence, it also holds that
$$u_{i^{(2)}} = \Delta_3 \cdot P_1(i^{(2)}) + (v - p_{i^{(2)}}(j)) \cdot P_j(i^{(2)}) \ge \Delta_3 \cdot P_1(i^{(2)}) + (v - v) \cdot P_j(i^{(2)}) > 0.$$
This is true because the expected price when receiving item $j$ can be no more than $v$, and $P_j(i^{(2)}) < 1/2$. Therefore, there exists some $v^- < v$ for which it holds that $u_{i^-}(1) > 0$. Here, $u_{i^-}(1)$ denotes the utility that some buyer $i^-$ with valuation $v^-$ for item 1, and 0 otherwise, obtains from being assigned item 1 when she reports $i^{(2)}$ as her type. Note that if buyer $i^-$ reports value $v$ for item 1 and 0 for all others, she will also obtain $u_{i^-}(1)$ from being assigned the first item: the assignment decision is made before the algorithm can know the difference, and the expected price paid cannot depend on the buyer's later reports due to truthfulness.
We use this to show a contradiction to the approximation ratio of $M$. Assume there exists, in the absence of $i^{(2)}$, such a buyer $i^-$ with smaller value $v^-$ and utility $u_{i^-}(1) > 0$ when reporting value $v$, who is interested in purchasing item 1, i.e., $i^- \in B_1$. Since $M$ is ex-ante truthful, a truthful report will also give her a positive expected utility of at least $u_{i^-}(1)$. As a direct consequence, the probability $P_1(i^-)$ of assigning item 1 to $i^-$ (when she reports truthfully) is lower bounded, in order to achieve the above expected utility, as follows: $P_1(i^-) \ge u_{i^-}(1)/v^-$. Finally, we copy buyer $i^-$ at least $v^-/u_{i^-}(1) + 1$ times. If necessary for tie-breaking, we distort their values a bit. Our conclusions about the utility of $i^{(2)}$ hold once $i^{(2)}$ reports the largest value for item 1, regardless of the other values. This means that if any of the copies with value $v^-$ deviates and reports the valuation of $i^{(2)}$ instead, she can recover utility $u_{i^-}(1)$. As a result, each of the copies, when reporting truthfully, has at least the same utility, and therefore an assignment probability of at least $u_{i^-}(1)/v^-$. In sum, this results in a probability of more than 1 for assigning item 1, a contradiction.

Fig. 1: The instance from Lemma 2 with k = 3 and n = 9. Items are ordered (from top to bottom) according to their arrival times, and buyers are ordered (from top to bottom) according to σ (sorted by decreasing types, breaking ties with indices). Preferences of buyers are given by the edges of the graph.