Heavy Traffic Analysis of Multi-Class Bipartite Queueing Systems Under FCFS

This paper examines the performance of multi-class multi-server bipartite queueing systems under a FCFS-ALIS service discipline, where each arriving customer is only compatible with a subset of servers. We analyze the system under conventional heavy-traffic conditions, where the traffic intensity approaches one from below. Building upon the formulation and results of Afeche et al. 2021, we generalize the model by allowing the vector of arrival rates to approach the heavy-traffic limit from an arbitrary direction. We characterize the steady-state waiting times of the various customer classes and demonstrate that a much wider range of waiting time outcomes is achievable. Furthermore, we establish that the matching probabilities, i.e., the probabilities of different customer classes being served by different servers, do not depend on the direction along which the system approaches heavy traffic. We also investigate the design of compatibility between customer classes and servers, finding that a service provider who has complete control over the matching can design a delay-minimizing menu by considering only the limiting arrival rates. When some constraints on the compatibility structure exist, the direction of convergence to heavy-traffic affects which menu minimizes delay. Additionally, we discover that the bipartite matching queueing system exhibits a form of Braess's paradox, where adding more connectivity to an existing system can lead to higher average waiting times, despite the fact that neither customers nor servers act strategically.


Introduction
In this paper, we analyse the performance of multi-class bipartite queueing systems under an FCFS-ALIS service discipline. Multi-class bipartite queueing systems are used for modelling a variety of important applications, such as public housing, health-care, and manufacturing. However, these models can be both analytically and computationally intractable, making questions of performance analysis and system design difficult to answer. Heavy-traffic scaling can be used to provide approximations of these systems that are much simpler to analyse and that reveal fundamental properties of the system.
The specific model we look at has n customer classes and m distinct servers. Customers arrive to each class according to independent Poisson processes. Service times are exponentially distributed, with service rates depending only on the server, and not on the customer class. Each customer class has a particular subset of servers they can be served by. Each server may potentially be compatible with multiple customer classes. Servers serve the customer classes they are compatible with according to an FCFS discipline. That is, when a server finishes serving a customer, they consider all of the customers that belong to classes they are compatible with, and serve the customer that has been waiting the longest.
We analyse two aspects of the performance of this model: the expected waiting time delays of the different customer classes, and the matching probabilities of the different customer classes, that is, the probability with which a customer of a given class is served by a particular server. This paper is an extension of Afèche et al. (2022), who study a similar model to ours. Their model uses a specific heavy-traffic scaling, which limits the range of outcomes that the model can produce. By using a more general heavy-traffic scaling, we increase the range of outcomes the model produces, allowing for accurate approximations of a wider range of scenarios. Additionally, we allow for some queues to have no arrivals at the heavy-traffic limit, and are able to calculate the expected delay should a customer join such a queue. Our main motivation for considering this generalisation is the study of systems with strategic customers, i.e., customers who can choose their class type upon arrival based on waiting time delays and matching probabilities. For example, consider a system with two independent M/M/1 queues, both being served at rate 1. Further suppose that arriving customers would prefer to be served by a particular server, but also incur some waiting cost. If customers are able to choose which queue to join, and make their decisions by trading off the cost of waiting against the value of being served by the preferred server, then we would expect the average waiting time at the queue served by the preferred server to be higher than the average waiting time at the queue served by the less preferred server.
Using a conventional heavy traffic scaling, in which the number of servers and the service rates remain fixed, and the traffic intensity approaches 1 from below, the limiting arrival rates of both queues will be 1. The heavy traffic scaling in Afèche et al. (2022) has the proportion of customers arriving into the different queues remaining constant while taking the limit. If we do this in our simple M/M/1 example, we are limited to concluding that the heavy traffic delays of both queues are equal. However, if we generalise the approach to heavy-traffic, allowing the arrival rates into the different queues to approach their limits at different rates, we are able to increase the range of outcomes we can model. We can interpret the different rates of approach in the real world as the different queues having arrival rates closer to or further from their predicted limiting values. This application of strategic arrivals also motivates us to allow for queues with no arrivals. This can be important for developing a coherent model when including strategic behaviour. In this case, it is possible to offer queues that no customers will choose to join, but we still need to calculate expected delays for those queues in order to justify why customers are not choosing to join them.
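The two-queue M/M/1 example can be checked numerically. The sketch below is only an illustration: it assumes that queue i has arrival rate 1 - eps * gamma_i (a linear approach to the limit, with hypothetical rates gamma_i chosen here for the example) and uses the standard M/M/1 formula for the mean waiting time in queue. The function names are not from the paper.

```python
def mm1_mean_wait(lam: float, mu: float = 1.0) -> float:
    """Mean waiting time in queue (excluding service) of a stable M/M/1 queue."""
    if not 0 <= lam < mu:
        raise ValueError("need 0 <= lam < mu for stability")
    return lam / (mu * (mu - lam))


def scaled_waits(gammas, eps, mu=1.0):
    """Scaled delays eps * W_q when queue i has arrival rate mu - eps * gamma_i.

    The linear approach to the limit is an illustrative assumption.
    """
    return [eps * mm1_mean_wait(mu - eps * g, mu) for g in gammas]


# Equal rates of approach (proportional scaling) versus unequal rates.
for eps in (0.1, 0.01, 0.001):
    print(eps, scaled_waits([1.0, 1.0], eps), scaled_waits([0.5, 2.0], eps))
```

Under this assumption the scaled delay of queue i equals (1 - eps * gamma_i) / gamma_i, which tends to 1 / gamma_i. Equal rates of approach therefore force equal limiting delays, while unequal rates produce different limiting delays even though both limiting arrival rates equal 1.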
In this paper, we calculate the expected delays of the different customer classes using our more general scaling. We show that different approaches to the heavy-traffic limit produce different waiting time outcomes. In Section 5, we use this to show that very minor perturbations in arrival rates can produce significant improvements in waiting time outcomes in the pre-limit. Additionally, we show that the limiting matching probabilities do not depend on the scaling used, but only depend on the limiting arrival rates. Finally, we look at some simple questions regarding the design of the compatibility between customer classes and servers. We find that when the service provider has complete control over the compatibility structure, they only need to consider the limiting arrival rates in order to design a delay-minimising menu. When there are some constraints on the compatibility structure, the particular approach to heavy-traffic does affect which menu minimises delay.
Related Literature. Heavy-traffic approximations have long been used to simplify the study of intractable queueing systems. Early works in this area include Kingman (1962) and Whitt (1974). These papers look at a so-called "conventional" approach to heavy-traffic, in which the number of servers and their service capacities remain fixed, and the arrival rate grows large in such a way that the traffic intensity of the system converges to one from below. An alternative class of "many-server" heavy-traffic limits has also been considered in the literature by carefully letting the number of servers and arrival rate grow unboundedly, e.g., Halfin and Whitt (1981) or Atar (2012). Motivated by mathematical tractability as well as by the fact that many real-world service systems operate under high levels of congestion†, we will study the performance of our multi-class multi-server bipartite queueing system operating under conventional heavy traffic conditions.
A range of questions can be answered using heavy-traffic approximations. In the context of parallel service systems, Harrison and Lopez (1999) study the question of optimal control of parallel service systems, that is, which servers should be used to serve which customer classes, and in which order the different customer classes should be served. Harrison and Lopez (1999) solve an approximating Brownian control problem, and conjecture that a discrete review policy will minimise holding costs for the original queueing system. This approach of using an approximating Brownian control problem to develop an optimal policy was originally suggested by Harrison (1988). Williams (2000) and Bell and Williams (2001) go on to prove the asymptotic optimality of a continuous review policy for a two-server system. Following this work, Mandelbaum and Stolyar (2004) prove the asymptotic optimality of the cµ-rule for convex holding costs. A distinctive feature in all of these papers is that they impose a complete resource pooling condition on the connectivity and/or compatibility between customer classes and servers (see Harrison and Lopez, 1999). Roughly speaking, this condition boils down to assuming that the servers' capacities can be pooled together so that the servers can essentially act as a single "super-server". This assumption significantly simplifies the analysis as it allows us to obtain a single-dimensional state-space description of the workload of the system in the heavy traffic limit.
The complete resource pooling assumption is quite restrictive, however, and can be shown not to hold when strategic customer behaviour is allowed as in Caldentey et al. (2022). There has already been some work moving beyond the complete resource pooling assumption. Kushner and Chen (2000) prove the convergence to the heavy-traffic limit of a particular class of systems that do not satisfy the complete resource pooling assumption under quite general conditions. Pesic and Williams (2016) generalise Harrison and Lopez (1999) beyond the complete resource pooling assumption. Other works analysing multi-class multi-server queueing systems with no complete resource pooling assumption include Shah and de Veciana (2016) and Hurtado Lange and Maguluri (2022). Shah and de Veciana (2016) look at a system in which servers simultaneously work to process the same job, while Hurtado Lange and Maguluri (2022) analyse a generalised switch problem under a MaxWeight service policy.
† For example, the Chicago Housing Authority reported more than 170,000 families waiting for public housing in 2021. Similarly, in the same year, about 113,589 children in the United States were waiting to be adopted. In the healthcare system, more than 100,000 people are waiting for an organ transplant at any given moment in time, with average waiting times that can be as long as 5 years for a kidney transplant according to the National Kidney Foundation.
In addition to studying the problem of optimal control, questions regarding the performance of parallel service systems have been studied using heavy-traffic approximations, or fluid approximations more generally. Talreja and Whitt (2008) look at the problem of calculating matching rates for a parallel service system operating under FCFS, that is, with what probability each customer class is served by each server, although the authors looked at this question for an overloaded system with abandonments. Matching rates were calculated for specific classes of networks. Various approximation methods have been developed for calculating matching rates, including the dissipative algorithm proposed by Caldentey and Kaplan (2002), a related approximation based on Ohm's law proposed by Fazel-Zarandi and Kaplan (2018), and a quadratic programming formulation proposed by Afèche et al. (2022). Of these papers looking at the performance of parallel service systems under FCFS, Afèche et al. (2022) is the only one to also look at calculating waiting times as we do here. Another contribution of Afèche et al. (2022) is to study the question of the design of matching topologies for a fixed scheduling policy. While Afèche et al. (2022) study this design question for an FCFS service discipline, Varma and Maguluri (2021) study the same question of the design of matching topologies under a MaxWeight service discipline.
The specific model we look at here is a generalisation of Afèche et al. (2022), which itself developed out of a long history of papers studying bipartite queueing systems and bipartite matching models under an FCFS service discipline. Early papers in this area include Schwartz (2004) and Green (1985), who look at the steady-state performance of these systems given a particular hierarchical compatibility structure between customer classes and service classes, and Kaplan (1984, 1988), who similarly analysed the steady-state performance of parallel queueing systems, but for more general compatibility structures. Following Kaplan (1984, 1988), Kaplan's multi-class multi-server queueing model was adapted by Caldentey and Kaplan (2002), who introduced an infinite bipartite matching model to analyse matching probabilities under an FCFS service discipline. The model of Caldentey and Kaplan (2002) was further developed by Caldentey et al. (2009) and then adapted by Adan and Weiss (2014) to that of a multi-class multi-server parallel queueing system, which is the model we use here.
Since the development of the infinite matching model and the queueing model, different authors have looked at different aspects of the problem. Bušić et al. (2013), Mairesse and Moyal (2017), and Moyal and Perry (2017) look at stability conditions of such systems, and find that the system will be stable so long as a set of Hall-type conditions are satisfied. Also of interest are the steady-state matching probabilities. Caldentey et al. (2009) were able to use a particular Markov chain representation to calculate the steady-state distribution of the matching system for particular classes of matching topologies. Adan and Weiss (2012) came up with an alternative Markov chain representation to derive the steady-state distribution of the matching system for general matching topologies, while Adan and Weiss (2014) used a similar approach to look at the multi-class multi-server queueing problem, and showed the equivalence of the steady-state outcomes for the matching and the overloaded queueing system. However, the combinatorial structure of the state space description of the Markov chain limits the size of the systems that can be studied both analytically and computationally. Afèche et al. (2022) use heavy traffic analysis to unveil a number of structural properties embedded in the infinite matching model and its corresponding multi-class bipartite matching queueing system (see also the survey by Gardner and Righter, 2020, for a comprehensive review of related papers and models).
The rest of the paper is organized as follows. In Section 2 we provide a detailed mathematical description of the bipartite queueing model, review some related results in the literature and introduce the heavy traffic regime that we will use to analyze the performance of the system. Section 3 is devoted to the derivation of the limiting steady-state waiting times of the different service classes. Our main result in this section is Theorem 1, which provides a complete characterization of these limiting waiting times in terms of an underlying set of complete resource pooling components and their connectivity that emerge under heavy traffic. In Section 4 we study the steady-state matching probabilities between customer classes and servers and show in Theorem 2 that these probabilities do not depend on the particular direction along which the system reaches heavy traffic. This is in direct contrast to the behaviour of the steady-state waiting times, which are particularly sensitive to the direction of convergence. In Section 5 we discuss a number of insights that emerge from our theoretical results. For instance, we discuss which vectors of delays are implementable and how to design the connectivity between service classes and servers to achieve them. We also show that adding more connectivity to an existing bipartite queueing system can lead to longer average delays (i.e., some form of Braess's paradox). Section 6 contains the proofs and additional discussion of our main results, Theorems 1 and 2. Some concluding remarks and possible directions in which our work can be extended are presented in Section 7. Finally, the Appendix contains additional proofs of various intermediate results.

Model Description
In this section, we provide a detailed mathematical description of the model and basic definitions. To simplify our notation, we will adopt the following conventions throughout the paper. For a positive integer $k$, $[k] := \{1, 2, \ldots, k\}$. All vectors are column vectors.
We consider a service system as follows. We have a set of $m$ servers organised into a set of $n$ customer classes. Each customer class is served by a particular subset of servers. This information is encoded in a compatibility matrix $M \in \{0, 1\}^{n \times m}$, where customer class $i$ can be served by server $j$ if and only if $m_{ij} = 1$. Customers arrive to the customer classes according to independent Poisson processes. We let $\lambda = (\lambda_1, \ldots, \lambda_n)$ be the arrival rates into the different customer classes. Service times are exponentially distributed, and depend only on the server. The vector of service rates will be denoted by $\mu = (\mu_1, \ldots, \mu_m)$. Servers will serve customers they are compatible with according to an FCFS-ALIS service discipline.
To illustrate, Figure 1 depicts an example with four servers (m = 4), and four service classes (n = 4).
In this example, the menu $M$ is given by
$$M = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{pmatrix},$$
that is, class 1 is compatible with server 1; class 2 is compatible with server 2; class 3 is compatible with server 3; and class 4 is compatible with all servers. Note that a server may belong to multiple service classes.
We are only interested in systems which operate with stable queue lengths. The following result, from Adan and Weiss (2014), tells us exactly which triplets $(\lambda, \mu, M)$ produce stable steady-state outcomes.
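Proposition 1. (Adan and Weiss, 2014, Theorem 2.1) For a menu $M$ with arrival rates $\lambda$ and service rates $\mu$, define the slack of a set of servers $S \subseteq [m]$ as
$$\Delta_S := \sum_{j \in S} \mu_j - \sum_{i \in U(S)} \lambda_i,$$
where $U(S) := \{ i \in [n] : m_{ij} = 0 \text{ for all } j \notin S \}$ is the subset of service classes that can only be served by servers in $S$.
The menu $M$ admits a steady state under a FCFS-ALIS service discipline if and only if $\Delta_S > 0$ for every non-empty $S \subseteq [m]$.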
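The stability condition can be checked by brute force for small systems. The following sketch assumes that the slack of a server set S is the total service rate of S minus the total arrival rate of the classes that can only be served within S, and that stability requires a strictly positive slack for every non-empty S. The function names, the 0-based indexing, and the sample arrival rates are choices made for this illustration, and every class is assumed to have at least one compatible server.

```python
from itertools import combinations


def unique_classes(M, S):
    """Classes (0-based) whose compatible servers all lie in the server set S."""
    S = set(S)
    return [i for i, row in enumerate(M)
            if all(j in S for j, m_ij in enumerate(row) if m_ij)]


def slack(M, lam, mu, S):
    """Capacity of S minus the arrival rate of classes served only within S."""
    return sum(mu[j] for j in S) - sum(lam[i] for i in unique_classes(M, S))


def is_stable(M, lam, mu):
    """True if every non-empty subset of servers has strictly positive slack."""
    m = len(mu)
    return all(slack(M, lam, mu, S) > 0
               for r in range(1, m + 1)
               for S in combinations(range(m), r))


# Menu of the four-class, four-server example (class 4 is compatible with all servers).
M = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [1, 1, 1, 1]]
mu = [2, 1, 2, 1]                                 # illustrative service rates
print(is_stable(M, [1.8, 0.9, 0.9, 1.8], mu))     # arrivals strictly below capacity
print(is_stable(M, [2, 1, 1, 2], mu))             # critically loaded arrivals
```

The enumeration visits all 2^m - 1 non-empty server subsets, so it is only practical for small m; it is meant as a sanity check rather than an efficient algorithm.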

Steady state results for fixed arrival rates
Our results build on the steady state analysis of Adan and Weiss (2014), which we briefly review for completeness. The authors derive their results based on a Markov chain representation of the system defined on a carefully crafted state space $X$. A state in this state space is described by three components: (i) a permutation of servers $s = (s_1, \ldots, s_m)$, (ii) an integer $b \in \{0, \ldots, m\}$ indicating the number of busy servers, and (iii) a vector $(n_1, \ldots, n_b)$ that indicates the composition of customers waiting for service in the different service classes. It is helpful to denote a generic state $x \in X$ by the tuple
$$x = (s_1, n_1, s_2, n_2, \ldots, s_b, n_b, s_{b+1}, \ldots, s_m).$$
The first $b$ components $(s_1, \ldots, s_b)$ of the server permutation $s$ denote the $b$ busy servers ranked according to the arrival time of the customer they are serving, with server $s_1$ serving the oldest arrival and server $s_b$ serving the youngest arrival. The remaining servers $(s_{b+1}, \ldots, s_m)$ are all idle and ranked in the order they became idle, with $s_{b+1}$ the server that has been idle the longest. Finally, $n_\ell$, for $\ell = 1, \ldots, b$, represents the number of customers in the system who arrived after the job currently being served by $s_\ell$ but before the job currently being served by $s_{\ell+1}$. Due to the FCFS-ALIS service discipline, we know these customers can only be served by some server in $(s_1, \ldots, s_\ell)$. That is, each of these $n_\ell$ customers must belong to some service class in $U(s_1, \ldots, s_\ell)$.
According to Adan and Weiss (2014, Theorem 2.1), the steady-state probability of state $x$ admits a product form in which $B$ denotes an appropriate normalizing constant. Additionally, each of the $n_\ell$ customers 'between' server $s_\ell$ and server $s_{\ell+1}$ belongs to service class $i \in U(s_1, \ldots, s_\ell)$ independently with probability $\lambda_i / \sum_{i' \in U(s_1, \ldots, s_\ell)} \lambda_{i'}$. These steady-state probabilities can be used to calculate the expected number of customers of each type in the system. Little's Law can then be applied to calculate steady-state mean waiting times. However, if we consider the process for calculating expected waiting times even for our relatively simple example in Figure 1, we see that while these calculations are possible, the process is laborious and the resulting expressions are unwieldy. For example, let us consider how we would calculate the expected number of class 4 customers. We first observe that class 4 customers are compatible with all servers. This means that class 4 customers wait in the system only if all servers are busy when they arrive. Thus if we want to calculate the expected number of class 4 customers waiting for service in the system, we can restrict ourselves to considering only the states in which all 4 servers are busy.
Fixing the permutation of servers, and the number of busy servers, the values of $n_i$ are geometrically distributed, and hence the expected values have closed form expressions. For example, if we condition on being in the subset $X_{(s_1, s_2, s_3, s_4)}$ of states with $b = 4$ and server permutation $(s_1, s_2, s_3, s_4)$, i.e. $x = (s_1, n_1, s_2, n_2, s_3, n_3, s_4, n_4)$, then the expected value of $n_4$ has a closed form in terms of $|\lambda| := \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4$, $|\mu| := \mu_1 + \mu_2 + \mu_3 + \mu_4$ and an appropriate normalizing constant $B$. Note that $n_4$ is not the number of class 4 customers; instead $n_4$ is the number of customers who arrived to the system after the customer that server $s_4$ is currently serving. Since each of these customers belongs to class 4 with probability $\lambda_4 / |\lambda|$, the expected number of class 4 customers conditional on being in the subset of states $X_{(s_1, s_2, s_3, s_4)}$ is the fraction $\lambda_4 / |\lambda|$ of the expected value of $n_4$. To fully calculate the expected number of class 4 customers, we would need to repeat this process for every permutation of servers. Since there are four servers, there are 24 possible permutations of servers to sum over, with different combinations of terms appearing in the denominator for each permutation. This gives us very complicated expressions for the expected number of customers. If we were instead looking at the number of class 1 customers, we would also need to consider states in which only some servers are busy, giving us even more server combinations that we need to consider.
It is this underlying computational complexity, which grows combinatorially fast in the size of the system, that motivates our move to heavy traffic. As the system approaches heavy traffic, the probability of being in a state with an idle server approaches 0, letting us restrict our attention only to states in which all servers are busy. Additionally, we show in Proposition 7 that in heavy-traffic, only certain server permutations have positive probability, which is a fact that simplifies the problem even further.

Heavy traffic scaling
The last part of the model is the heavy-traffic scaling.As mentioned in the Introduction, our formulation extends Afèche et al. (2022), who consider a specific direction of convergence to heavy traffic to derive their results.Specifically, they assume that the proportions of customers of different types remain constant as the system approaches heavy traffic.In this paper, we allow a general direction of convergence.
We consider a conventional heavy traffic regime in which the arrival rates approach the capacity of the service system, while the number of customer classes and servers, and the service menu remain constant. We parameterize our systems by $\epsilon$, and let the service system approach heavy traffic as $\epsilon \downarrow 0$. Specifically, we assume that there is a sequence of arrival rates $\lambda^{(\epsilon)}$, where
$$\lambda^{(\epsilon)} := \Lambda - \epsilon \gamma, \qquad 0 < \epsilon < \epsilon^+, \tag{6}$$
for some vector $\Lambda \in \mathbb{R}^n_+$, some vector $\gamma \in \mathbb{R}^n$, and some $\epsilon^+ > 0$. We make the following additional assumptions on $\lambda^{(\epsilon)}$ and $\mu$.
Assumption 1. The arrival rates $\lambda^{(\epsilon)}$ given by (6) and the service rates $\mu$ satisfy conditions (i)-(iii).
Parts (i) and (ii) ensure that we are approaching heavy traffic from below. Part (iii) is implied by $\lambda^{(\epsilon)}_i > 0$ for all $i \in [n]$ and $0 < \epsilon < \epsilon^+$, but we include it in Assumption 1 for clarity. Note that for $i \in [n]$ such that $\Lambda_i > 0$, we allow $\gamma_i$ to be positive, negative, or zero. This is more general than the scaling used in Afèche et al. (2022), where the authors assume that $\gamma = \Lambda$. Additionally, Afèche et al. (2022) require that $\Lambda_i > 0$ for all $i \in [n]$. We relax that assumption here, as it is useful to allow for no arrivals to particular customer classes when considering strategic customer behaviour.
We are only interested in studying systems which produce stable outcomes.This leads us to restrict our attention to a set of admissible menus.
Definition 1. (Admissible Menus) For a given menu $M$, arrival rates $\lambda^{(\epsilon)}$, and service rates $\mu$, define for any subset of servers $S \subseteq [m]$ and $\epsilon > 0$ the quantity $\Delta^{(\epsilon)}_S$, the slack of $S$ under arrival rates $\lambda^{(\epsilon)}$. A menu $M$ is admissible under arrival rates $\lambda^{(\epsilon)}$ and service rates $\mu$ if these slacks satisfy a condition which, in words, ensures that the menu $M$ and arrival rates $\lambda^{(\epsilon)}$ admit a steady state under a FCFS-ALIS service discipline, and that the slack in the system is converging slowly enough so that the average delays of the different customer classes converge when scaled by $\epsilon$.
We let $\mathcal{M}(\lambda^{(\epsilon)}, \mu)$ denote the set of all menus $M$ that are admissible for arrival rates $\lambda^{(\epsilon)}$ and service rates $\mu$. The set $\mathcal{M}(\lambda^{(\epsilon)}, \mu)$ will be non-empty for all pairs $(\lambda^{(\epsilon)}, \mu)$ satisfying Assumption 1. To see this, observe that the complete menu $M$ such that $m_{ij} = 1$ for all $i \in [n]$ and $j \in [m]$ will be admissible for all $(\lambda^{(\epsilon)}, \mu)$ satisfying Assumption 1. The complete menu will operate like a single queue with arrival rate $|\lambda^{(\epsilon)}|$ that is served by all servers.

Mean Waiting Times in Heavy Traffic
We are interested in being able to calculate the mean waiting times of the different service classes. Because we are looking at a conventional heavy traffic setting, the waiting times themselves will grow without bound as $\epsilon \downarrow 0$. We will instead look at the mean waiting time scaled by $\epsilon$, which will remain bounded in heavy traffic.
In what follows we show how to find the limiting expected waiting times by building upon and extending the methods and results in Afèche et al. (2022).

Feasible flows and CRP components
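We begin by identifying the feasible flows of customers between customer classes and servers. For arrival rates $\lambda^{(\epsilon)}$ and service rates $\mu$ satisfying Assumption 1, and an admissible menu $M \in \mathcal{M}(\lambda^{(\epsilon)}, \mu)$, for $0 \le \epsilon < \epsilon_0$ we define the set of feasible flows $\mathcal{F}(\epsilon, \lambda^{(\epsilon)}, M)$, where $\epsilon_0 \in \mathbb{R}$ is such that $\lambda^{(\epsilon)} > 0$ for all $0 < \epsilon < \epsilon_0$. We know from the admissibility of $M$ that such an $\epsilon_0$ exists, and that $\mathcal{F}(\epsilon, \lambda^{(\epsilon)}, M)$ is non-empty for all $0 < \epsilon < \epsilon_0$. The following lemma shows that $\mathcal{F}(0, \Lambda, M)$ is also non-empty. The proof relies on $\mathcal{F}(\epsilon, \lambda^{(\epsilon)}, M)$ being a subset of a compact set.
Lemma 1. For a given $\lambda^{(\epsilon)}$ and $\mu$ satisfying Assumption 1, and $M \in \mathcal{M}(\lambda^{(\epsilon)}, \mu)$, the set $\mathcal{F}(0, \Lambda, M)$ is non-empty. Furthermore, every sequence of flows $f^{(\epsilon)}$ such that $f^{(\epsilon)} \in \mathcal{F}(\epsilon, \lambda^{(\epsilon)}, M)$ has a sub-sequence that converges to some $f \in \mathcal{F}(0, \Lambda, M)$.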
Proof: See Appendix A.
As this lemma suggests, the set F(0, Λ, M ) contains information about what sort of flows it is possible to observe in heavy traffic.We will use the set of feasible limiting flows to determine which servers have a positive probability of serving which service classes in the limit.To do this, we will first define the residual matching of the menu M .
Intuitively, for a service class $i$ and server $j$ with $m_{ij} = 1$ in the menu $M$ but $\tilde{m}_{ij} = 0$ in the residual matching $\tilde{M}$, the flow of customers from service class $i$ to server $j$ must vanish in the heavy-traffic limit. Afèche et al. (2022) provide an algorithm for finding the residual matching. However, for small, simple systems the residual matching can be found by inspection. To see this, consider again the simple example in Figure 1, specifying the service rates to be $\mu = [2, 1, 2, 1]$. We will consider two example vectors of arrival rates, $\Lambda^a = [2, 1, 1, 2]$ and $\Lambda^b = [2, 1, 0, 3]$. In each case, there is only one feasible flow in $\mathcal{F}(0, \Lambda^a, M)$ and $\mathcal{F}(0, \Lambda^b, M)$, given by
$$f^a = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}, \qquad f^b = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 \end{pmatrix},$$
where entry $(i, j)$ is the flow from class $i$ to server $j$.
In example (a), the arcs in the compatibility network with $m_{ij} = 1$ and $\tilde{m}_{ij} = 0$ are (4,1) and (4,2). While service class 4 is compatible with servers 1 and 2, there will be zero flow between class 4 and servers 1 and 2 in the limit. All the service capacity of servers 1 and 2 will be allocated to serving classes 1 and 2. We can see this visually in panel (a) in Figure 2, where the arcs with $m_{ij} = 1$ and $\tilde{m}_{ij} = 1$ are represented with solid lines, and the arcs with $m_{ij} = 1$ and $\tilde{m}_{ij} = 0$ are represented with dashed lines. Example (b) is similar, but we now additionally have arc (3,3) with $m_{33} = 1$ and $\tilde{m}_{33} = 0$. In panel (b) of Figure 2 we can see that class 3 only has one dashed arc connecting it to any servers, representing that no servers are allocating any capacity to class 3 in the limit, even though class 3 is compatible with server 3.
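The two examples can be verified mechanically. The sketch below assumes that a limiting feasible flow is a non-negative matrix supported on the menu whose row sums equal the limiting arrival rates and whose column sums equal the service rates (all capacity is used because total demand equals total capacity in the limit). The function names and the flow matrices are written out for this illustration. Reading the residual matching off the support of a single flow is only valid here because each example has a unique feasible flow.

```python
def is_feasible_limit_flow(f, Lam, mu, M, tol=1e-9):
    """Check non-negativity, support within the menu M, and flow conservation."""
    n, m = len(Lam), len(mu)
    for i in range(n):
        for j in range(m):
            if f[i][j] < -tol or (M[i][j] == 0 and abs(f[i][j]) > tol):
                return False
    rows_ok = all(abs(sum(f[i]) - Lam[i]) <= tol for i in range(n))
    cols_ok = all(abs(sum(f[i][j] for i in range(n)) - mu[j]) <= tol
                  for j in range(m))
    return rows_ok and cols_ok


def support(f, tol=1e-9):
    """0/1 matrix marking the arcs that carry strictly positive flow."""
    return [[1 if x > tol else 0 for x in row] for row in f]


M = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [1, 1, 1, 1]]
mu = [2, 1, 2, 1]

# Example (a): classes 1-3 each use their only server; class 4 takes what is left.
flow_a = [[2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 1, 1]]
# Example (b): class 3 has no arrivals, so class 4 uses all of server 3.
flow_b = [[2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 2, 1]]

print(is_feasible_limit_flow(flow_a, [2, 1, 1, 2], mu, M), support(flow_a))
print(is_feasible_limit_flow(flow_b, [2, 1, 0, 3], mu, M), support(flow_b))
```

In both cases the rows for class 4 show no flow to servers 1 and 2, and in example (b) the row for class 3 is identically zero, matching the dashed arcs described for Figure 2.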
Knowing the residual matching allows us to decompose the initial bipartite matching system into a partition of independent components, which Afèche et al. (2022) refer to as complete resource pooling (CRP) components.
Definition 3. (CRP Component) For a given $(\lambda^{(\epsilon)}, \mu, M)$ such that $\lambda^{(\epsilon)}$ and $\mu$ satisfy Assumption 1 and $M \in \mathcal{M}(\lambda^{(\epsilon)}, \mu)$, let the induced residual matching be denoted $\tilde{M}$. We say that the subset $\mathcal{C} = (C, S) \in 2^{[n]} \times 2^{[m]}$ of customer classes and servers forms a CRP component if for any pair of nodes $k_1, k_2 \in C \cup S$ there exists a path between $k_1$ and $k_2$ in $\tilde{M}$, and $\mathcal{C}$ is maximal in the sense that the condition is violated for any strict superset of $\mathcal{C}$.
We let $\{\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_K\}$ denote the collection of CRP components induced by the residual matching $\tilde{M}$, where $K$ is the number of components. Each $\mathcal{C}_k = (C_k, S_k)$ is defined by the subset of customer classes $C_k$ and the subset of servers $S_k$ that belong to $\mathcal{C}_k$. Since we allow for service classes with no arrivals, that is $\Lambda_i = 0$, some CRP components will have an empty server set. Each service class with $\Lambda_i = 0$ forms a separate CRP component with an empty server set. We denote the subset of such CRP components by $I_0$. We let $\bar{K} := K - |I_0|$ be the number of CRP components with non-empty sets of servers, and will assume that the CRP components are indexed so that the components in $[K] \setminus I_0$ have indices $1, 2, \ldots, \bar{K}$. We will use $k(i)$ and $k(j)$ to denote the component that service class $i$ or server $j$ is part of, where the use should be clear from context.
To make these ideas more concrete, let us return to our examples in Figure 2. In example (a), service class 1 and server 1 make up a CRP component, as they are not connected to any other service classes or servers with solid arcs. Similarly, service class 2 and server 2 make up a CRP component. We can see a path between classes 3 and 4 through server 3, so these classes along with servers 3 and 4 make up a single CRP component. This means the CRP components for example (a) can be written as
$$\mathcal{C}_1 = (\{1\}, \{1\}), \quad \mathcal{C}_2 = (\{2\}, \{2\}), \quad \mathcal{C}_3 = (\{3, 4\}, \{3, 4\}).$$
Example (b) is similar, the difference being that now service class 3 is not connected to any server or service class with a solid arc, and therefore is in a CRP component by itself with an empty server set, i.e. $I_0 = \{3\}$. So the CRP components for example (b) are
$$\mathcal{C}_1 = (\{1\}, \{1\}), \quad \mathcal{C}_2 = (\{2\}, \{2\}), \quad \mathcal{C}_3 = (\{4\}, \{3, 4\}), \quad \mathcal{C}_4 = (\{3\}, \emptyset).$$
Abusing notation, we denote the aggregate arrival and service rates for the CRP components under $\lambda^{(\epsilon)}$ as $\lambda^{(\epsilon)}_k := \sum_{i \in C_k} \lambda^{(\epsilon)}_i$ and $\mu_k := \sum_{j \in S_k} \mu_j$, where $\Lambda_k = \sum_{i \in C_k} \Lambda_i$ and $\gamma_k = \sum_{i \in C_k} \gamma_i$ denote the corresponding aggregates of $\Lambda$ and $\gamma$. We will later show that each CRP component must satisfy $\Lambda_k = \mu_k$ so that the slack between demand and capacity within a CRP component in heavy-traffic goes to zero with $\epsilon$. While each CRP component is critically loaded, the "well-connectedness" within a CRP component allows shifting load from one service class to another on short time scales. In particular, we will show in Theorem 1 that under an FCFS-ALIS policy, waiting times are balanced in such a way that service classes that belong to the same CRP component have the same limiting scaled mean waiting time in the heavy traffic limit.
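Since CRP components are the connected components of the residual matching viewed as a bipartite graph, they can be found with a standard graph search. The sketch below is an illustration only: the residual matrices are the ones read off by inspection for examples (a) and (b), labels are 1-based to match the text, and the function name is hypothetical. It assumes every server carries flow from at least one class, so that starting the search from classes reaches all servers.

```python
def crp_components(residual):
    """Connected components of the bipartite graph of a 0/1 residual matrix.

    Returns (classes, servers) pairs with 1-based labels; components with an
    empty server set are listed last.
    """
    n, m = len(residual), len(residual[0])
    seen_classes, seen_servers = set(), set()
    comps = []
    for start in range(n):
        if start in seen_classes:
            continue
        classes, servers = {start}, set()
        seen_classes.add(start)
        stack = [("class", start)]
        while stack:
            kind, k = stack.pop()
            if kind == "class":
                for j in range(m):
                    if residual[k][j] and j not in seen_servers:
                        seen_servers.add(j)
                        servers.add(j)
                        stack.append(("server", j))
            else:
                for i in range(n):
                    if residual[i][k] and i not in seen_classes:
                        seen_classes.add(i)
                        classes.add(i)
                        stack.append(("class", i))
        comps.append((sorted(c + 1 for c in classes),
                      sorted(s + 1 for s in servers)))
    comps.sort(key=lambda comp: not comp[1])  # stable: empty server sets go last
    return comps


residual_a = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 1, 1]]
residual_b = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1]]
print(crp_components(residual_a))
print(crp_components(residual_b))
```

For example (a) the search groups classes 3 and 4 with servers 3 and 4; for example (b) class 3 ends up alone with an empty server set and is placed last, in line with the indexing convention for components without servers.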

Directed Acyclic Graph of CRP components
The menu $M$ and the residual matching $\widetilde{M}$ uniquely induce a directed acyclic graph (DAG) on the collection of CRP components defined in the previous step. This is useful as the DAG defines a precedence relation among service classes: if component $k_1$ has a directed arc to component $k_2$, then there is a service class in $k_1$ that can be served by a server in $k_2$. This means $k_1$ can "off-load" its customers to the servers of component $k_2$, and so the instantaneous waiting time in component $k_1$ cannot exceed that in component $k_2$ under FCFS-ALIS. This intuition is made precise in the proof of Theorem 1.
The following is a formal statement of how the DAG is induced.
Afèche et al. (2022, Lemma 2) formally prove that the directed graph defined above is in fact acyclic.
Returning to our examples in Figure 2, the DAGs are given below. In both cases, service class 4 can be served by servers 1 and 2 in the original menu, i.e. $m_{41} = m_{42} = 1$, and so there are directed arcs from $\mathbb{C}_3$ to $\mathbb{C}_1$ and $\mathbb{C}_2$. In example (b), $\mathbb{C}_4$ contains service class 3 but no servers, since service class 3 has an arrival rate of 0. Therefore $\mathbb{C}_4$ has a directed arc to $\mathbb{C}_3$, as this is the CRP component containing the server that customer class 3 is compatible with.
As we mentioned earlier, our computations for the heavy-traffic waiting times build on the work of Adan and Weiss (2014). The crucial component of their analysis is a state-space representation for the FCFS-ALIS matching model, which involves ranking the busy servers in order of the waiting time of the customers they are serving. As was proved in Afèche et al. (2022) for the less general scaling, in heavy traffic this entails restricting attention to only certain permutations of the CRP components which have asymptotically non-zero steady-state probability. We show in Proposition 7 below that this also holds for our more general scaling. The topological orders of the DAG $D$ are precisely these permutations. The definition we give next differs slightly from Afèche et al. (2022) due to the potential presence of CRP components with $\Lambda_k = 0$.

Definition 5. (Topological Orders on CRP Components) Let $\{\mathbb{C}_1, \mathbb{C}_2, \ldots, \mathbb{C}_{K'}\}$ be the CRP components with $\Lambda_k > 0$. Given the DAG $D = ([K], A)$, we say that a permutation $\sigma = (\sigma(1), \sigma(2), \ldots, \sigma(K'))$ of $[K']$ induces a topological order $(\mathbb{C}_{\sigma(1)}, \mathbb{C}_{\sigma(2)}, \ldots, \mathbb{C}_{\sigma(K')})$ of these CRP components if for every pair

In other words, sink components of $D$ precede source components. We let $\mathcal{T}(D, K')$ denote the set of all permutations $\sigma$ of $[K']$ that induce a topological order on components $\{\mathbb{C}_1, \ldots, \mathbb{C}_{K'}\}$.
Further, for each $\sigma \in \mathcal{T}(D, K')$, we partition the CRP components $[K]$ by associating a subset $\mathrm{comps}(\sigma, k)$ with each $k \in [K']$ as follows:

The interpretation of this is that with each index $k \in [K']$ we associate the CRP component corresponding to $\sigma(k)$, as well as all CRP components $\kappa$ with $\Lambda_\kappa = 0$ (i.e., server-less components) for which the component $\sigma(k)$ is the last component in the topological order $\sigma$ that is reachable from $\kappa$ via a directed path.
To highlight the difference with Afèche et al. (2022): under the heavy-traffic regime considered there, all CRP components have a non-empty server set $\mathcal{S}_k$. In contrast, in our model, we have customer classes that are in CRP components by themselves. These CRP components are special in that they have no incoming arc in the DAG $D$, and can only have a directed arc to CRP components with non-empty server sets. The topological orders $\mathcal{T}(D, K')$ can thus be thought of as preprocessing $D$ to remove the server-less CRP components $\{\mathbb{C}_{K'+1}, \ldots, \mathbb{C}_K\}$ which are "hanging off" $D$, and finding topological orders on the remaining components. Since the topological order has sink components of $D$ preceding source components and, as we mentioned earlier, the DAG defines a precedence relation among service classes, we can then interpret $\mathrm{comps}^{-1}(\sigma, k)$ as associating each server-less CRP component with the CRP component reachable from it that has the shortest steady-state wait.
Returning to our examples in Figure 3: since $\mathbb{C}_3$ is the last element of the topological order for both permutations $\sigma_a$ and $\sigma_b$, we have that $\mathrm{comps}(\sigma_a, 3) = \mathrm{comps}(\sigma_b, 3) = \{3, 4\}$.
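The construction of topological orders and of the sets comps(σ, k) can be sketched in a few lines of Python. This is a brute-force illustration for small DAGs, not the authors' procedure: the function names are hypothetical, and the arc set below encodes example (b) as described in the text (arcs from component 3 to components 1 and 2, and from the server-less component 4 to component 3).

```python
from itertools import permutations


def topological_orders(nodes, arcs):
    """Orderings of `nodes` in which the head of every arc precedes its tail,
    i.e. sink components come before source components."""
    orders = []
    for perm in permutations(sorted(nodes)):
        pos = {k: idx for idx, k in enumerate(perm)}
        # Only arcs between the listed nodes constrain the ordering.
        if all(pos[v] < pos[u] for (u, v) in arcs if u in pos and v in pos):
            orders.append(perm)
    return orders


def reachable(start, arcs):
    """All components reachable from `start` along directed paths."""
    seen, stack = set(), [start]
    while stack:
        u = stack.pop()
        for (a, b) in arcs:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def comps(order, position, serverless, arcs):
    """Component at 1-indexed `position` of `order`, plus every server-less
    component whose last reachable component in `order` is that component."""
    target = order[position - 1]
    pos = {k: idx for idx, k in enumerate(order)}
    result = {target}
    for kappa in serverless:
        candidates = [k for k in reachable(kappa, arcs) if k in pos]
        if candidates and max(candidates, key=pos.get) == target:
            result.add(kappa)
    return result


# Example (b): components 1-3 have servers, component 4 is server-less.
arcs_b = {(3, 1), (3, 2), (4, 3)}
orders_b = topological_orders({1, 2, 3}, arcs_b)
comps_last = [comps(order, 3, {4}, arcs_b) for order in orders_b]
```

Under these assumptions the two orders found are (1, 2, 3) and (2, 1, 3), and the server-less component 4 is attached to component 3 in both, in line with the sets stated above.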

Calculating waiting times
Let $\mathcal{T}(D, K') = (\sigma_1, \ldots, \sigma_T)$ be the collection of topological orders on $\{\mathbb{C}_1, \ldots, \mathbb{C}_{K'}\}$ (the components with $\Lambda_k > 0$). For a topological order $\sigma_t \in \mathcal{T}(D, K')$ with the associated function $\mathrm{comps}(\sigma_t, \cdot)$ defined in (13), we define the unnormalized probability of being in a state associated with the topological order $\sigma_t$ as:

where we use the shorthand

For a permutation $\sigma_t \in \mathcal{T}(D, K')$ and any CRP component $\mathbb{C}_k$, we define the waiting time conditioned on the topological order $\sigma_t$ as:

The following Lemma 2 proves that the expressions above are well-defined.
With the expressions for the unnormalized probabilities and conditional waiting times of topological orders in place, we are ready to state our main theorem regarding the mean scaled steady-state waiting times of different service classes.
Theorem 1. For a given $(\lambda^{(\epsilon)}, \mu, M)$ such that $\lambda^{(\epsilon)}$ and $\mu$ satisfy Assumption 1, and an admissible menu $M \in \mathcal{M}(\lambda^{(\epsilon)}, \mu)$, let $\widetilde{M}$ be the residual matching and $\{\mathbb{C}_1, \ldots, \mathbb{C}_{K'}, \mathbb{C}_{K'+1}, \ldots, \mathbb{C}_K\}$ be the collection of CRP components induced by $\widetilde{M}$. Then, customer classes that belong to the same CRP component experience the same scaled steady-state mean waiting time in heavy traffic. Furthermore, the scaled steady-state mean waiting time of CRP component $\mathbb{C}_k$ is equal to

The proof of Theorem 1 can be found in Section 6.1.

Matching Probabilities in Heavy Traffic
Another performance metric of interest is the matching probabilities, that is, for each customer class $i$ and server $j$, the probability that a customer who joins class $i$ is served by server $j$. For any menu $M$ that is admissible with arrival rates $\lambda^{(\epsilon)}$ and service rates $\mu$, we let $p^{(\epsilon)}(M, \lambda^{(\epsilon)}, \mu)$ be the matrix of matching probabilities, so $p^{(\epsilon)}_{ij}(M, \lambda^{(\epsilon)}, \mu)$ is the steady-state probability with which a customer who joins class $i \in [n]$ is served by server $j \in [m]$. While exact matching probabilities are difficult to calculate, and remain difficult to calculate even in heavy traffic, we are able to provide two results regarding how matching rate calculations simplify as we move to heavy traffic. Before stating our results, it will be useful to describe the combinations of limiting arrival rates $\Lambda$, service rates $\mu$, and menus $M$ such that there is some sequence $\lambda^{(\epsilon)}$ converging to $\Lambda$ that makes $M$ admissible. The following proposition will help us understand these combinations.
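Proposition 2. Take any sequence of arrival rates $\lambda^{(\epsilon)}$ and service rates $\mu$ such that $\lambda^{(\epsilon)}$ and $\mu$ satisfy Assumption 1, and let $M$ be such that $M \in \mathcal{M}(\lambda^{(\epsilon)}, \mu)$. Let $\Lambda = \lim_{\epsilon \to 0} \lambda^{(\epsilon)}$. Then $M$ is admissible with service rates $\mu$ and arrival rates $\lambda^{(\epsilon)} = \Lambda - \epsilon\Lambda$. Furthermore, if $M$ is admissible with $\lambda^{(\epsilon)} = \Lambda - \epsilon\Lambda$ and $\mu$, then the menu $\widetilde{M}$ given by the residual matching of $M$ is also admissible with $\lambda^{(\epsilon)} = \Lambda - \epsilon\Lambda$ and $\mu$.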
Proof: See Appendix B.

This lets us talk about menus that are admissible for limiting arrival rates $\Lambda$ and service rates $\mu$. We will define the set $\mathcal{M}^+(\Lambda, \mu)$ to be the set of all menus $M$ such that $M$ is admissible for arrival rates $\lambda^{(\epsilon)} = \Lambda(1 - \epsilon)$ and service rates $\mu$. This provides us with a more convenient way to express our results regarding matching probabilities, the first of which is stated formally in Theorem 2. This tells us that while the limiting expected delays depend on the particular sequence of arrival rates $\lambda^{(\epsilon)}$, and in particular depend on the slacks $\gamma$, the matching probabilities depend only on the limiting arrival rates.
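Theorem 2. Take any limiting arrival rates $\Lambda$ and service rates $\mu$ such that $|\Lambda| = |\mu|$. Consider any menu $M \in \mathcal{M}^+(\Lambda, \mu)$. Take any two sequences of arrival rates with limit $\Lambda$ such that both sequences satisfy Assumption 1 with $\mu$, and $M$ is admissible for both sequences of arrival rates with $\mu$. Then the limiting matching probabilities $\lim_{\epsilon \to 0} p^{(\epsilon)}$ are the same under both sequences.

The proofs of Theorem 2 and Corollary 1 can be found in Section 6.2.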
Theorem 2 lets us talk about the matching probabilities of a menu $M$ just in terms of the limiting arrival rates $\Lambda$ and service rates $\mu$. In light of this, for the rest of this paper we will refer to matching probabilities in terms of the limiting arrival rates.

The second result we have relating to matching probabilities, stated formally in Corollary 1, tells us that matching probabilities within a CRP component are independent of all other CRP components.
Corollary 1. Take any limiting arrival rates $\Lambda$ and service rates $\mu$ such that $|\Lambda| = |\mu|$, and take any $M \in \mathcal{M}^+(\Lambda, \mu)$. Let $\widetilde{M}$ be the residual matching, and let $\{\mathbb{C}_1, \ldots, \mathbb{C}_{K'}, \mathbb{C}_{K'+1}, \ldots, \mathbb{C}_K\}$ be the collection of CRP components induced by $\widetilde{M}$. Then for any customer class $i \in \mathcal{C}_k$ and server $j \in \mathcal{S}_k$,

Corollary 1 implies that when calculating the matching rates, we can look at each CRP component individually. Additionally, it tells us that the DAG structure does not affect the matching probabilities. We will see in Section 5 that two menus $M$ and $M'$ with the same residual matching $\widetilde{M}$ can have significantly different expected waiting times in heavy traffic if the two menus induce different DAGs. Corollary 1 tells us that despite this, the limiting matching probabilities of menus $M$ and $M'$ are the same.

Discussion
Before getting into the proofs of our main results, we discuss some of their implications, while highlighting the differences between the behaviours of our model and the model in Afèche et al. (2022).We also explore some simple questions regarding the design of menus of service classes.

Implementable outcomes
Our motivation for the heavy-traffic scaling used in this paper is that it allows for a wider range of outcomes than the proportional scaling used in Afèche et al. (2022). The following definition will help formalise what we mean by this.

Definition 6. (Implementable Waiting Times) Take limiting arrival rates $\Lambda$, service rates $\mu$, and a menu $M$ such that a collection of CRP components $\{\mathbb{C}_1, \mathbb{C}_2, \ldots, \mathbb{C}_K\}$ is induced. We say a vector of limiting scaled waiting times $W = (W_1, W_2, \ldots, W_K)$ is implementable if there exists $\gamma \in \mathbb{R}^n$ such that the menu $M$ is admissible for the pair $(\lambda^{(\epsilon)}, \mu)$, where $\lambda^{(\epsilon)} = \Lambda - \epsilon\gamma$, and the resulting limiting waiting times of the CRP components given by (16) are equal to $W$.

If we only look at the scaling in Afèche et al. (2022), in which $\gamma = \Lambda$, then each combination of limiting arrival rates $\Lambda$, service rates $\mu$, and menu $M$ can produce one specific vector of waiting times. By allowing $\gamma$ to change, we increase the set of implementable outcomes.
As we alluded to in Section 3, the DAG provides information about which vectors of waiting times are implementable.The following statement, which is a corollary of Theorem 1, formalises this idea.
Proof: See Appendix C.

Corollary 2 provides a necessary condition for waiting times to be implementable. While completely characterising the set of implementable waiting times for a particular $\Lambda$, $\mu$, and $M$ is difficult in general, we are able to provide a sufficient condition for waiting times to be implementable for menus such that the DAG satisfies the following property.

For menus such that the DAG is chained, the following result regarding which vectors of waiting times are implementable applies.
Figure: example DAGs on CRP components; panel (a) Chained DAG.

The vector $W$ is implementable if the following both hold:
Proof: See Appendix C.

This tells us that we greatly increase the set of implementable outcomes by using a more general heavy-traffic scaling.

Menu Design
We now turn our attention to some simple questions regarding the design of menus of customer classes. We will consider two objectives: (1) minimising the total average delay across all customer classes, and (2) minimising the maximum expected delay of any customer class. We will assume that the arrival rates into the customer classes $\lambda^{(\epsilon)}$ and the service rates $\mu$ are fixed, and that the service provider is designing the menu $M$, that is, the compatibility between the customer classes and servers.
When the service provider has complete flexibility over how to design the menu, the service provider can minimise both the average delay and the maximum delay faced by any customer class simultaneously.The following proposition shows that this can be achieved with a menu that has a single CRP component.
Proposition 4. Given arrival rates $\lambda^{(\epsilon)}$ and service rates $\mu$ satisfying Assumption 1, for any admissible menu $M \in \mathcal{M}(\lambda^{(\epsilon)}, \mu)$,

Proof: See Appendix C.

Therefore any menu that induces a single CRP component will ensure that all customer classes achieve the minimum possible expected delay, hence minimising both the average delay across all customer classes and the maximum delay faced by any customer class. The following proposition is helpful in designing such a menu.
Proposition 5. Consider a system with limiting arrival rates $\Lambda$ and service rates $\mu$. Any menu $M$ such that

will be admissible for any vector of slacks $\Gamma \in \mathbb{R}^n$ such that $|\Gamma| > 0$. Furthermore, such a menu will induce a single CRP component.
Proof: See Appendix C.

A complete menu, in which every customer class is compatible with every server, will always satisfy this condition. The complete menu will operate like a single queue served by all servers according to an FCFS service discipline. Proposition 5 also tells us that we do not need to know the values of the slacks $\Gamma$ to design a delay-minimising menu, making it easier to implement in practice.
While a menu that induces a single CRP component minimises delays, it may not be desirable or even feasible to offer such a menu due to real-world compatibility constraints on which servers can serve which customer types.Motivated by these sorts of constraints, we consider the question of how to design the DAG on a collection of CRP components to minimise expected delays for customers.
It will be useful first to understand the expression for average expected delays across all customer classes. In Equation (16) we defined the delay of each CRP component conditional on being in a particular topological order. We can similarly define $w_\sigma$, the average delay across all customer classes conditional on being in a particular topological order $\sigma$, as

This then lets us express the average expected delay for a particular menu $M$ as

Here we can also see the differences with Afèche et al. (2022), in which the authors find that the average delays depend only on the number of CRP components. With our more general scaling, the average delays depend on the values of the slacks themselves, as well as the structure of the DAG and the set of topological orders that are induced.
Introducing additional arcs into the DAG reduces the number of topological orders. If we can introduce or remove arcs from a DAG in such a way that the system spends more time in states associated with topological orders that have lower conditional average delays $w_\sigma$, then the total average delay will be reduced. However, the values of the slacks $\gamma$ of the different CRP components limit how we are able to adjust the DAG and still have an admissible menu. This leads us to the following definition of an admissible topological order.
Definition 8. A topological order $\sigma$ is admissible for arrival rates $\lambda^{(\epsilon)}$ and service rates $\mu$ satisfying Assumption 1, and a collection of CRP components $\{\mathbb{C}_1, \ldots,$

The following lemma tells us how admissible topological orders relate to admissible menus.

Lemma 3. Take any arrival rates $\lambda^{(\epsilon)}$ and service rates $\mu$ satisfying Assumption 1, and any collection of CRP components $\{\mathbb{C}_1, \ldots, \mathbb{C}_{K'}, \mathbb{C}_{K'+1}, \ldots, \mathbb{C}_K\}$. For any admissible topological order $\sigma$, we can construct an admissible menu $M \in \mathcal{M}(\lambda^{(\epsilon)}, \mu)$ such that the DAG induced by $M$ with $\lambda^{(\epsilon)}$ and $\mu$ only admits the topological order $\sigma$. Furthermore, if $\sigma$ is not admissible, then there are no admissible menus $M$ that admit the topological order $\sigma$.
Proof: See Appendix C.

The set of admissible topological orders tells us which DAGs are feasible given a particular collection of CRP components. We can then minimise average delays by identifying the topological order with the lowest conditional delay.

Proposition 6. Given limiting arrival rates $\Lambda$, service rates $\mu$, slacks $\Gamma$, and CRP components $\{\mathbb{C}_1, \ldots, \mathbb{C}_{K'}, \mathbb{C}_{K'+1}, \ldots, \mathbb{C}_K\}$, there will be a permutation of CRP components $\sigma$ that minimises the average expected delay across all implementable topological orders.
The DAG or menu that will minimise delays is one that only allows for this topological order.
Proof: See Appendix C.

Given that adding arcs to a DAG is achieved by adding flexibility to a service system, one might think that adding an arc to a DAG will always reduce expected delays. However, we find that adding arcs to the DAG may increase, decrease, or not affect the average delays. This can be shown through the following two-server example.
Consider the case of two independent M/M/1 queues. We will use $M_a$ to denote this menu. Let the arrival rates be $\lambda^{(\epsilon)}$

If we were to consider the alternative menu $M_b$

then using Theorem 1 we find that $W_1 = 1/(\gamma_1 + \gamma_2)$ and $W_2 = 1/(\gamma_1 + \gamma_2) + 1/\gamma_2$. The average delay across both customer classes is then

Therefore the difference in average delays is

When $\gamma_1 = \gamma_2$, $\Delta_{ab} = 0$ and menus $M_a$ and $M_b$ have the same average delays. When $\gamma_1 > \gamma_2$, $\Delta_{ab}$ is positive, and menu $M_b$ has higher average delays than $M_a$, despite the additional flexibility.
Otherwise, ∆ ab is negative, and menu M b has lower average delays than M a .
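The sign pattern of the delay difference can be checked with a short Python sketch. It rests on several assumptions that are not taken from the text: both classes have the same limiting arrival rate (normalised to 1), the scaled delay of each independent M/M/1 queue under the first menu is 1/γ_i (the standard heavy-traffic approximation), and the average is weighted by arrival rates. Only the expressions for W_1 and W_2 under the alternative menu come from the example itself; the function names are hypothetical.

```python
def avg_delay_separate(g1, g2, rate1=1.0, rate2=1.0):
    """Average scaled delay for two independent M/M/1 queues.
    Assumes the scaled delay of queue i is 1 / gamma_i."""
    return (rate1 / g1 + rate2 / g2) / (rate1 + rate2)


def avg_delay_flexible(g1, g2, rate1=1.0, rate2=1.0):
    """Average scaled delay for the alternative menu, using the values
    W_1 = 1/(g1+g2) and W_2 = 1/(g1+g2) + 1/g2 stated in the example."""
    w1 = 1.0 / (g1 + g2)
    w2 = 1.0 / (g1 + g2) + 1.0 / g2
    return (rate1 * w1 + rate2 * w2) / (rate1 + rate2)


def delta_ab(g1, g2):
    """Delay of the flexible menu minus delay of the separate menu."""
    return avg_delay_flexible(g1, g2) - avg_delay_separate(g1, g2)


equal_slacks = delta_ab(1.0, 1.0)      # slacks equal
larger_first = delta_ab(2.0, 1.0)      # gamma_1 > gamma_2
larger_second = delta_ab(1.0, 2.0)     # gamma_1 < gamma_2
```

Under these assumptions the difference vanishes for equal slacks, is positive when the first slack is larger, and is negative otherwise, which is the pattern described in the example.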
This simple example demonstrates that adding flexibility to the design of the menu does not necessarily reduce the average delay (i.e., some form of Braess's paradox). Therefore, if a service provider is considering adding flexibility to a system, it is important to carefully consider the way in which flexibility is being added.

Numerical example
We will end this section by returning to our example in Figure 2 (a) to make some of the ideas discussed in the section more concrete. Recall the menu $M$ is given by

The limiting arrival rates are $\Lambda = (2, 1, 1, 2)$, and service rates are $\mu = (2, 1, 2, 1)$. We will let the sequence of arrival rates be $\lambda^{(\epsilon)}$

We have three CRP components: $\mathbb{C}_1$ consisting of class 1 and server 1, $\mathbb{C}_2$ consisting of class 2 and server 2, and $\mathbb{C}_3$ consisting of classes 3 and 4 and servers 3 and 4.
We will begin by considering the question of implementability. We can see that the DAG induced by $M$ is a chained DAG, with $\mathbb{C}_1$ and $\mathbb{C}_2$ belonging to one partition in the chain, and $\mathbb{C}_3$ belonging to the other partition in the chain. Then Proposition 3 tells us that we can implement any waiting times

In this simple case, we can see which delays are implementable more directly, by looking at the exact expressions for the delays. Using Theorem 1, we can calculate the delays as
By looking at these expressions, we can see that we can implement any delays $W_1$, $W_2$, and $W_3$ such that $W_3 > 0$, $W_1 > W_3$ and $W_2 > W_3$. To do this we would let

This also suggests that in a congested system, a service provider is able to produce significant improvements in delay if they can make small changes to the arrival rates into the different service classes.
Suppose arrival rates are initially such that the slacks are proportional to arrival rates, i.e. $\gamma = \Lambda$, as in Afèche et al. (2022). The following table shows us the improvements in delay obtained by adjusting the slacks so that $\gamma' = (9, 9, -3, -9)$, for different values of $\epsilon$. Note that $|\Lambda| = |\gamma'|$, so this adjustment does not alter the total arrival rate of customers into the system. We also show the percentage difference in average delays, denoted $\delta W\%$, as well as the percentage of customers who are joining a different customer class across the two scenarios, denoted $\delta\lambda\%$. As we can see, significant improvements in scaled delays are achieved while only changing the arrivals of a relatively small fraction of customers, with the improvements in comparison to the change required increasing as congestion increases.
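The claim that the adjusted slacks leave the total arrival rate unchanged can be verified with a small Python sketch. It assumes that the arrival rates take the form Λ − εγ (consistent with the proportional case Λ(1 − ε) used earlier); the values of ε and the function name `arrival_rates` are hypothetical, since the values used in the table are not reproduced here.

```python
Lambda = (2, 1, 1, 2)            # limiting arrival rates from the example
gamma_prop = Lambda              # proportional slacks, gamma = Lambda
gamma_adj = (9, 9, -3, -9)       # adjusted slacks from the example


def arrival_rates(Lambda, gamma, eps):
    """Arrival rates under the assumed form lambda = Lambda - eps * gamma."""
    return tuple(L - eps * g for L, g in zip(Lambda, gamma))


# Hypothetical congestion levels, chosen only for illustration.
eps_values = (0.1, 0.05, 0.01)

totals = [
    (sum(arrival_rates(Lambda, gamma_prop, eps)),
     sum(arrival_rates(Lambda, gamma_adj, eps)))
    for eps in eps_values
]
# The two scenarios carry the same total load for every eps.
same_total = all(abs(a - b) < 1e-12 for a, b in totals)
```

Because both slack vectors sum to 6, the total arrival rate is 6(1 − ε) in either scenario; only the split of customers across classes differs.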
Finally, we look at the question of menu design. In particular, we look at how we can change a menu to improve delays given a fixed CRP component structure and fixed arrival rates. The residual matching for the menu $M$ in Equation (24) with limiting arrival rates $\Lambda = (2, 1, 2, 1)$ and service rates

There are 6 possible permutations of CRP components when the menu is just the residual matching $\widetilde{M}$, these being all the permutations of $(1, 2, 3)$. We can use Equation (17) to calculate the expected delay conditional on a particular permutation of CRP components. In this case, we will assume the slacks are $\gamma = (4, 3, 1, 1)$. The following table uses Equation (17) to calculate the conditional delays for all possible permutations of CRP components. We can see from this table that the permutation of CRP components that minimises delay is $(2, 1, 3)$. We can then design a menu such that the DAG only admits this specific topological order. The DAG that achieves this is shown below.
This DAG can be achieved by having the customer class in $\mathbb{C}_1$ served by the server in $\mathbb{C}_2$, and either of the customer classes in $\mathbb{C}_3$ served by the server in $\mathbb{C}_1$. The following menu is one example of a menu that achieves this.
In comparison, the menu $M$ in Equation (24) with $\gamma = (4, 2, 1, 1)$ has average delays of 1.5, which as expected is higher than the average delays of our newly designed menu.
6 Proof of Main Results

Proof of Theorem 1
The key observation needed to prove Theorem 1 is that only a relatively small subset of states have positive probability in heavy traffic, and the information about which states have positive probability is captured by the CRP components and the DAG on the CRP components. However, before we go into more detail, it will be useful to introduce some notation. In Equation (12), we defined the aggregate arrival rate for a CRP component $\mathbb{C}_k$ to be $\lambda^{(\epsilon)}_k$.

For a subset of servers $S \subseteq [m]$, we define the slack for $S$ by:

where $U_S(M)$ is defined in Proposition 1 as the subset of service classes that can only be served (or, uniquely served) by servers in $S$ under the menu $M$. For succinctness, we will suppress the dependence on $M$ in this section and use the notation $U(S)$ for $U_S(M)$.
It will also be useful to further aggregate the state space described in Section 2.1 so that the state depends only on the server permutation $s$ and the number of busy servers $b$, and not the number of customers. Specifically, for a server permutation $s = (s_1, \ldots, s_m)$ and $b \in \{0, 1, \ldots, m\}$ define:

as the set of all states where $s$ is the ranking of servers in terms of the age of the customer for busy servers and the time since idleness for idle servers, and where exactly the first $b$ servers in $s$ are busy.
We then have the following expression for the probability of the aggregate state $P(s; b)$:

As a last step before developing the proof of Theorem 1, in Lemma 4 we state some properties of CRP components and topological orders that will be useful. This lemma has been slightly modified from Afèche et al. (2022, Lemma 6).

(ii) For any strict subset of servers $S \subset \mathcal{S}_k$, the set of service classes in the residual matching $\widetilde{M}$ served only by $S$ is a strict subset of $\mathcal{C}_k$, and $S$ exhibits strictly positive slack as $\epsilon \to 0$, that is,

Further, since $U_S(M) \subseteq U_S(\widetilde{M})$, the positive slack condition also holds for $U_S(M)$.
(Recall that $U_S(M)$ is the subset of service classes that can only be served by servers in $S$.)

(iii) Let $\sigma \in \mathcal{T}(D, K')$ be a topological order of the CRP components with non-empty server sets. Define

(iv) The capacity slack of the set of servers $\mathcal{S}_k$ converges to zero as $\epsilon \to 0$, in particular,

Proof: See Appendix D.
We can now begin calculating the expected waits. Using the aggregated states from Equation (26), the following lemma (rephrased) from Afèche et al. (2022) gives an expression for the mean waiting time for each service class in terms of the probabilities $\pi(P(s; b))$.
Lemma 5. (Afèche et al., 2022, Lemma 6) The steady-state mean waiting time of service class $i$ is equal to

where $\Sigma_m$ denotes the set of all the permutations of $[m]$, and $\pi(P(s; b))$ is given by (26).
We are able to simplify these expressions further by showing that only a relatively small subset of aggregate states $(s, b)$ have asymptotically non-zero probabilities in heavy traffic. These states are exactly those that are consistent with $\mathcal{T}(D, K') = (\sigma_1, \ldots, \sigma_T)$, the collection of topological orders on $\{\mathbb{C}_1, \ldots, \mathbb{C}_{K'}\}$, a notion we will formalize in Definition 9. Our first step to showing this is to consider the slacks $\Delta(s_1, \ldots, s_\ell)$, which the preceding lemma suggests will be an important part of the analysis. Lemma 6 below, which is an extension of Afèche et al. (2022, Lemma 4), shows that only certain subsets of servers have "interesting" slacks under a given sequence of arrival rates $\lambda^{(\epsilon)}$.
Lemma 6. Let $D$ be the DAG for the CRP decomposition $\{\mathbb{C}_1, \ldots, \mathbb{C}_{K'}, \mathbb{C}_{K'+1}, \ldots, \mathbb{C}_K\}$ under some menu $M$ and a given heavy-traffic equilibrium strategy profile. Then, a subset of servers $\{s_1, \ldots$

Proof: See Appendix D.

As implied in the previous paragraph, we can use Lemma 6 to prove Proposition 7 below, which states that a relatively small number of aggregate states have positive steady-state probability in heavy traffic; these are the aggregate states $P(s; m)$ in which $s$ is a permutation of the servers induced by a topological order $\sigma \in \mathcal{T}(D, K')$ and such that all servers are busy.

Definition 9. (Server Permutations Induced by Topological Orders) We say that a permutation of the servers $s = (s_1, s_2, \ldots, s_m) \in \Sigma_m$ is induced by the topological order $\sigma \in \mathcal{T}(D, K')$ if $s$ can be expressed as a concatenation of sub-permutations:

$$s = (s_{\sigma(1)}, s_{\sigma(2)}, \ldots, s_{\sigma(K')}),$$

with $s_k \in \Sigma_{\mathcal{S}_k}$ denoting a permutation of the servers $\mathcal{S}_k$ of CRP component $\mathbb{C}_k$. In other words, the servers of a CRP component are contiguous in the permutation $s$, and the order of the CRP components obeys the topological order $\sigma$.
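Definition 9 can be illustrated by enumerating the induced server permutations directly. The following Python sketch is an illustration rather than part of the proof; the function name is hypothetical, and the server sets are those of example (a), where components 1 and 2 hold servers 1 and 2 and component 3 holds servers 3 and 4.

```python
from itertools import permutations, product


def induced_server_permutations(order, servers_of):
    """All server permutations induced by the topological order `order`:
    the servers of each component appear contiguously (in any internal
    order), and the components appear in the sequence given by `order`."""
    blocks = [list(permutations(sorted(servers_of[k]))) for k in order]
    return [
        tuple(server for block in choice for server in block)
        for choice in product(*blocks)
    ]


# Server sets of the CRP components in example (a).
servers_of = {1: {1}, 2: {2}, 3: {3, 4}}

induced_123 = induced_server_permutations((1, 2, 3), servers_of)
induced_213 = induced_server_permutations((2, 1, 3), servers_of)
```

Each topological order here induces two server permutations, one for each internal ordering of servers 3 and 4, so only four of the 24 permutations of the servers are induced by a topological order in this example.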
Returning to our four-server example in Figure 3a, the CRP components were

where $B$ is a normalization constant, $Q(\sigma)$ was defined in (14), and $\{\theta_k : \Sigma_{\mathcal{S}_k} \to \mathbb{R}_+\}_{k \in [K']}$ is a fixed collection of functions mapping the sub-permutation of servers of CRP components to positive reals.
Finally, we provide a lemma giving expressions for the scaled $W_i(s; b)$ when $s$ is a server permutation induced by a topological order $\sigma$, and $b = m$, as these are the only permutations that will be important in arriving at the result. A somewhat remarkable fact is that the limiting scaled $W_i(s; m)$ depends only on the topological order $\sigma$ and not the full server permutation $s$.
Proof: See Appendix D.

Combining Proposition 7 with Lemmas 5-7, the limiting scaled mean waiting time for service class $i \in \mathcal{C}_k$ is:

Using the product rule of limits, we can reduce the above sum to a sum over server permutations induced by topological orders, and where all servers are busy.
as in the theorem statement.

Proof of Theorem 2
Throughout this section, we will take the menu $M$, limiting arrival rates $\Lambda$, service rates $\mu$, and slacks $\Gamma$ to be given, and largely suppress any dependence on $M$ in the notation. We will let $\widetilde{M}$ be the residual matching of the menu $M$ with arrival rates $\Lambda$ and service rates $\mu$.
Instead of directly working with the matching rates $p^{(\epsilon)}_{ij}(M)$, we will look at the service probabilities $q^{(\epsilon)}_{ij}$. For all $i \in [n]$ and $j \in [m]$, $q^{(\epsilon)}_{ij}(x)$ is the probability with which server $j$ serves customer class $i$ given that the system is in state $x$ and server $j$ has become idle. We prove Theorem 2 by deriving and simplifying expressions for the limiting service probabilities $q_{ij}$ for the menu $M$, and find that the limiting service probabilities depend only on the service rates $\mu$, limiting arrival rates $\Lambda$, and the connectivity within each CRP component. To do this, we will make use of a new state space aggregation which we will introduce here.
In Section 6.1, we introduced the aggregate states $P(s, b)$ for every $s \in \Sigma_m$ and $b \in [m]$. Recall that $P(s, b)$ is the set of all states where $s$ is the ranking of servers in terms of the age of the customers they are serving for busy servers, and the time since becoming idle for the idle servers, and $b$ is the number of busy servers. In this section, we further aggregate the state space, so that we can consider together all of the states in which we observe a particular subpermutation of servers within a CRP component. Specifically, for some $k \in [K']$ and some subpermutation $s_k \in \Sigma_{\mathcal{S}_k}$, we define

Note that while the set of aggregated states $P(s, b)$ does not depend on the menu being offered, $P_k(s_k)$ depends on the set of topological orders, and hence does depend on the menu.
The first main step of our derivation will be to calculate the limiting service probabilities for our new further aggregated state space. That is, for each pair of customer class $i \in [n]$ and server $j \in [m]$ in the same CRP component, and for any subpermutation of servers within that CRP component $s_{k(j)} \in \Sigma_{\mathcal{S}_{k(j)}}$, we would like to calculate $q_{ij}(P_{k(j)}(s_{k(j)}))$, the limiting service probability of customer class $i$ by server $j$ given the system is in a state in $P_{k(j)}(s_{k(j)})$. Recall that $k(j)$ denotes the index of the CRP component that server $j$ belongs to. We do not consider $i$ and $j$ that are not in the same CRP component, as we know the limiting service probabilities of customer classes and servers that are not in the same CRP component converge to zero. Similarly, we do not consider the service probabilities in any states $x$ not in $P_k(s_k)$ for some $k \in [K']$ and $s_k \in \Sigma_{\mathcal{S}_k}$, as those states have idle servers, and hence have probabilities converging to zero.
We will begin by writing the state dependent matching probability q ( ) ij (x) for an arbitrary state x ∈ P k(j) (s k(j) ).We will let j(x) denote the position in the server permutation of server j in the state x and similarly will let j(s) denote the position of server j in the server permutation s.We can look at q ( ) ij (x) by conditioning on the position in the queuing network of the potential customer of type i that j serves.This lets us express q It will be useful to decompose this expression into two parts, q + ij (x), the part of the expression representing a transition within the CRP component, and q 0 ij (x), the part of the expression representing a transition outside of the CRP component.We suppress the dependence on to reduce clutter in the notation.So |S |, that is, m κ is the number of servers in the first κ CRP components in the topological order.
As an intermediate step to looking at the aggregate matching probabilities $q^{(\epsilon)}_{ij}(P_k(s_k))$, we first look at the partially aggregated matching probabilities [...] However, the second term represents transitions from a state where the permutation of servers is induced by a topological order to a state where the permutation of servers is not induced by a topological order, and hence has a limiting probability of zero. This means we expect the second term in this expression to converge to zero, which we prove in the following lemma.
Lemma 8. For a given admissible service menu $M$ with limiting arrival rates $\Lambda$, service rates $\mu$, and slacks $\Gamma$, let $\{C_1, \ldots, C_K, C_{K+1}, \ldots, C_K\}$ be the set of CRP components, and let $T(D, K)$ be the set of topological orders on the CRP components. Then for any permutation of servers $s$ induced by some topological order $\sigma \in T(D, K)$, $\lim$ [...]
Proof: See Appendix E.
We now fix a topological order $\sigma \in T(D, K)$ and a server permutation $s \in \Sigma_m$ that is induced by $\sigma$. To reduce notational clutter, we assume without loss of generality that the CRP components are labelled in order of their position in the topological order, that is, $\sigma(k) = k$ for all $k \in [K]$. Using Lemma 8, we can write [...] or, written another way, [...] The following notation will be useful in simplifying this expression. Recall from Equation (25) that [...] It will also be useful to define $\Delta_j(S)$ as [...] We can then write Equation (30) as [...] where, as before, $m_\kappa = \sum_{\ell \in [\kappa]} |S_\ell|$. That is, $m_\kappa$ is the number of servers in the first $\kappa$ CRP components in the topological order.
We saw in Section 6.1 that the limiting values of $\Delta(s_1, \ldots, s_\ell)$ depend on the value of $\ell$. If $\ell = m_\kappa$ for some $\kappa \in [K]$, then we know from Lemma 6 that $\lim$ [...].
The same reasoning implies that for all $j \le \ell \le m_{k(j)}$, $\lim_{\epsilon \to 0} \Delta_j(s_1, \ldots, s_\ell)$ is a real number greater than zero that depends only on the permutation of servers in $C_k$.
We can use these observations to prove the following lemma.
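Lemma 9. We can find functions $\{\theta_\kappa : \Sigma_{S_\kappa} \to \mathbb{R}_+\}_{\kappa \in [K]}$ [...] $q_{ij}(P(s, m))$ can be written as [...] where $\theta_\kappa$ and $H_{ij}$ only depend on $M$, $\Lambda$, and $\mu$.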
Proof: See Appendix E.
We provide exact definitions of $\{\theta_k : \Sigma_{S_k} \to \mathbb{R}_+\}_{k \in [K]}$, $H_{ij} : \Sigma_{S_k} \to \mathbb{R}_+$, and $G_{ij} : \Sigma_{S_k} \to \mathbb{R}_+$ in the proof of Lemma 9 in Appendix E. Notice that the first line in Equation (33) has an $\epsilon^{-K}$ term, and the second line has an $\epsilon^{-(K-1)}$ term. Since the $q_{ij}$ are probabilities and therefore must be between 0 and 1, we know that $\lim_{\epsilon \to 0} B\,\epsilon^{-K}$ is bounded. This implies that $\lim_{\epsilon \to 0} B\,\epsilon^{-(K-1)} = 0$, and so only the first line in Equation (33) will be non-zero. Thus [...] Because the $q_{ij}$ are matching probabilities, we also know that
$$q_{ij}(P(s, m)) = \frac{q_{ij}(P(s, m))}{\sum_{i' \in C_{k(j)}} q_{i'j}(P(s, m))}.$$
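The boundedness step used above (a bounded $B\,\epsilon^{-K}$ forces $B\,\epsilon^{-(K-1)}$ to vanish) can be spelled out in one line. This is only a sketch: it assumes that $B \ge 0$ depends on $\epsilon$, and it writes $c$ for a hypothetical finite bound on $B\,\epsilon^{-K}$ for small $\epsilon$. Neither assumption is notation taken from the paper.

```latex
% Assumption (for illustration): 0 <= B * eps^{-K} <= c for all sufficiently small eps > 0.
\[
0 \;\le\; B\,\epsilon^{-(K-1)}
  \;=\; \epsilon \cdot \bigl(B\,\epsilon^{-K}\bigr)
  \;\le\; c\,\epsilon
  \;\xrightarrow[\;\epsilon \to 0\;]{}\; 0 .
\]
```

The extra factor of $\epsilon$ is what separates the two lines of Equation (33): any term carrying one fewer power of $\epsilon^{-1}$ than the leading term is dominated in the limit.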
Since the only term in Equation (35) that depends on $j$ is the $H_{ij}(s_{k(j)})$ term, we can write $q_{ij}(P(s, m))$ as [...] Since Equation (36) holds for any server permutation $s \in \Sigma_m$, and depends only on $s_k$ and not on the rest of the server permutation, this implies that [...] Since, as Lemma 9 states, $H_{ij}(s_{k(j)})$ does not depend on $\Gamma$, the remaining step needed to prove Theorem 2 is to show that $\pi(P_{k(j)}(s_{k(j)}))$ also does not depend on $\Gamma$. This is captured in the following lemma.
Lemma 10. For an admissible service menu $M$ with limiting arrival rates $\Lambda$, service rates $\mu$, and slacks $\Gamma$, the limiting probability of being in a state with the sub-permutation of servers $s_k \in \Sigma_{S_k}$ for $k \in [K]$ is equal to [...] where $\{\theta_\kappa : \Sigma_{S_\kappa} \to \mathbb{R}_+\}_{\kappa \in [K]}$ is a collection of functions that depend only on $M$, $\Lambda$, and $\mu$.
Proof: See Appendix E.
Combining Lemma 10 with Equation (36), we have that the limiting service probabilities $\lim_{\epsilon \to 0} q^{(\epsilon)}_{ij}$ do not depend on the exact values of the slacks $\Gamma$; they only require that $M$ is an admissible menu for the slacks $\Gamma$.

Concluding Remarks
In this paper, we have studied the performance of multi-class multi-server bipartite queueing systems under an FCFS-ALIS service discipline by extending the heavy-traffic analysis introduced in Afèche et al. (2022) for a similar class of systems. In Theorem 1, we have provided a general characterization of the mean steady-state waiting time for each customer class. Our characterization relies on decomposing the queueing system into a collection of complete resource pooling (CRP) components and identifying the connectivity among these CRP components in the form of a directed acyclic graph (DAG). Interestingly, knowledge of this DAG, together with the capacity slack in each CRP component, is enough to derive the mean steady-state waiting time for all customer classes. We have also studied the steady-state matching probabilities among customer classes and servers, and showed in Theorem 2 that only the limiting values of arrival and service rates influence these matching probabilities. This is in direct contrast to the behaviour of the mean steady-state waiting times, which are also affected by the direction of convergence to heavy traffic. To illustrate this point, we have provided a numerical example showing that small changes to the arrival rates in a heavily congested system can have large impacts on the average delays. We use our results regarding steady-state outcomes to explore some questions regarding the design of queueing systems. In doing so, we find that when service providers are looking to minimise expected delays and have complete control over the design of the menu, they should implement a menu that induces a single CRP component.
Our work points towards several promising research directions. Firstly, we suggest exploring the problem of menu design, which involves determining the service classes to offer when customers can select which queue to join upon arrival. Caldentey et al. (2022) have made some preliminary progress in this area. Another area that deserves further investigation is the relationship between delays and the underlying matching topology in our bipartite queueing system. In Section 5.3, we demonstrate that adding more connectivity to the system can lead to a deterioration in the average waiting time of customers, exhibiting a form of Braess's paradox, despite neither customers nor servers acting strategically. Mathematically, this negative effect happens when adding an additional arc to the menu increases the probability of a topological order with higher conditional delays. Theorem 1 characterizes waiting time delays and can be used to identify an optimal flexibility structure as a combinatorial optimization problem over the collection of directed acyclic graphs (DAGs) associated with a particular set of CRP components.
In addition, there are alternative modelling choices that could be worth exploring. For example, while we have focused on conventional heavy-traffic scaling in this paper, a many-server scaling may be more appropriate for certain application settings, such as public housing and healthcare, where many identical servers are available. Furthermore, we have primarily examined steady-state outcomes, but in real-world scenarios conditions often change frequently, making it unclear whether a steady state will be achieved. Therefore, studying the transient behaviour of bipartite queueing systems could also be of interest.
for $S = \cup_{\kappa=1}^{k} S_\kappa$, which contradicts $M$ being admissible. This holds even if we were to consider the scenario in which $\Lambda_k = 0$ for some $k \in [K]$, as this would only decrease the values of $\gamma_{\mathrm{comps}(\sigma, k)}$, making it more difficult to satisfy the condition $\lim_{\epsilon \to 0} \Delta^{(\epsilon)}_S(M) > 0$. Finally, we mention how we can extend the construction of $M$ to account for CRP components $k$ with $\Lambda_k = 0$. Recall that these CRP components do not influence the topological orders themselves, only the slacks of the elements $\mathrm{comps}(\sigma, k)$. We require for the admissibility of $M$ that $\sum_{\ell=1}^{k} \gamma_{\mathrm{comps}(\sigma, \ell)} > 0$ for all $k \in [K]$. This can potentially be achieved in many ways, one of which will always be to let $m_{ij} = 1$ for some $j$ in $C_K$ and for all $i \in [n]$ such that $\Lambda_i = 0$. This construction means that $\gamma_{\mathrm{comps}(\sigma, k)} = \gamma_k$ for all $k \in [K - 1]$, and $\gamma_{\mathrm{comps}(\sigma, K)} = \gamma_K + \sum_{i : \Lambda_i = 0} \gamma_i$. Thus $\sum_{\ell=1}^{k} \gamma_{\mathrm{comps}(\sigma, \ell)} = \sum_{\ell=1}^{k} \gamma_\ell > 0$ for all $k \in [K - 1]$, and $\sum_{\ell=1}^{K} \gamma_{\mathrm{comps}(\sigma, \ell)} = |\gamma| > 0$, as required.
Proof of Proposition 6: Because the total delays are weighted averages of conditional delays, we know that if the only conditional delay we are averaging over is the minimum possible conditional delay, we achieve the minimum total delay. From Lemma 3, we know that for any admissible menu $M$, the only topological orders with positive probability are those that are admissible.
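The weighted-average argument in the first sentence of this proof can be written compactly. In the sketch below, $p_\sigma(M)$ (the probability of topological order $\sigma$ under menu $M$), $W_\sigma$ (the conditional delay given $\sigma$), and $W(M)$ (the total delay) are shorthand introduced here for illustration, not the paper's notation, and the sketch assumes the conditional delay depends on the menu only through $\sigma$.

```latex
% Illustrative shorthand: the sum runs over admissible topological orders sigma,
% with p_sigma(M) >= 0 and sum_sigma p_sigma(M) = 1.
\[
W(M) \;=\; \sum_{\sigma} p_\sigma(M)\, W_\sigma
      \;\ge\; \min_{\sigma} W_\sigma ,
\qquad\text{with equality if } p_{\sigma^\ast}(M) = 1
\text{ for some } \sigma^\ast \in \arg\min_{\sigma} W_\sigma .
\]
```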
Because the set of all permutations of CRP components is finite, the set of admissible topological orders is finite. Thus there will be some implementable topological order that achieves the minimum conditional delay. (If there are some $i \in [n]$ such that $\Lambda_i = 0$, for each topological order we would also need to consider the assignment of customer classes with zero arrivals to servers that minimises delay for that topological order.)
Therefore we will be able to minimise the total average delay by choosing an admissible menu $M$ that only allows for the admissible topological order that achieves the minimum conditional delay. We know that such a menu exists from Lemma 3.
Proof of Lemma 6: The first part follows from the proof of (Afèche et al., 2022, Lemma 4), where it is argued that if the subset $S = \{s_1, \ldots, s_\ell\}$ does not obey the condition mentioned, then $\pi(x)\, q^{0}_{ij}(x) = 0$.
Proof of Lemma 9: Recall from Definition 9 that since the permutation of servers $s$ is induced by the topological order $\sigma$, we can express $s$ as the concatenation of sub-permutations: [...] Finally, also recall the definition of $Q(\sigma)$ from Equation (14) as [...].

Figure 1: Example with four service classes and four servers.

Definition 4 (DAG). Given the menu $M = [m_{ij}]$, and the CRP components $\{C_k = (C_k, S_k) : k = 1, \ldots, K\}$ induced by the residual matching $M$, we define $D = ([K], A)$ associated to $M$ as the directed acyclic graph whose nodes correspond to the CRP components, and there is a directed arc

Figure 5 illustrates an example of a chained DAG in panel (a) and one unchained DAG (i.e., a DAG that is not chained) in panel (b), both over a collection of seven CRP components. For the chained DAG in panel (a), $L = 4$ and $C_1 = \{C_2, C_3\}$, $C_2 = \{C_4\}$, $C_3 = \{C_1, C_6, C_7\}$ and $C_4 = \{C_5\}$. On the other hand, to see that the DAG in panel (b) is not chained, note that we cannot satisfy the requirement in Definition 7 if we consider the three CRP components $C_1$, $C_2$ and $C_4$. Indeed, the arcs connecting $C_2$ and $C_4$ to $C_1$ would require that $C_2$ and $C_4$ belong to the same class $C_\ell$ in the partition $C$ for some $\ell$, but then the arc connecting $C_2$ to $C_4$ would require these two CRP components to be in different classes in $C$.
Figure 4: Examples of chained (panel a) and unchained (panel b) DAGs over seven CRP components.

Lemma 4. Let $M$ be a service menu and $\{C_1, \ldots, C_K, C_{K+1}, \ldots, C_K\}$ be its CRP components under a given heavy-traffic equilibrium strategy profile. For a CRP component $C_k = (C_k, S_k)$ with non-empty $S_k$ (i.e., $k \in [K]$): (i) The aggregate demand of service classes converges to the aggregate service rate as $\epsilon \to 0$, that is, $\Lambda_k := \Lambda_{C_k} = \mu_{S_k} =: \mu_k$ (see (12) for definitions).
$\lim_{\epsilon \to 0} B\,\epsilon^{-K}$.
Proof of Lemma 10: From Proposition 7, we know that
$$\lim_{\epsilon \to 0} \pi(P(s, m)) = B \cdot Q(\sigma) \prod_{k=1}^{K} \theta_k(s_k), \qquad \text{(A2)}$$
[...] both example (a) and example (b) have the same set of CRP components with positive limiting arrival rates, the set $\{C_1, C_2, C_3\}$. Both examples also have the same connectivity between these components: $C_3$ has directed arcs to $C_1$ and $C_2$, but there are no arcs between $C_1$ and $C_2$. Hence in any topological order on these CRP components, we know that $C_1$ and $C_2$ come before $C_3$, but $C_1$ can come either before or after $C_2$. Thus the possible permutations are $\sigma_1 = (1, 2, 3)$ and $\sigma_2 = (2, 1, 3)$, and the associated topological orders are $(C_1, C_2, C_3)$ and $(C_2, C_1, C_3)$. As example (a) has no CRP components with limiting arrival rates of 0, for each $\sigma$ and each $k$, $\mathrm{comps}(\sigma, k)$ is simply the set containing the index of the CRP component at position $k$ of the topological order $\sigma$. In example (b), $C_4$ has $\lambda_4 = 0$, so for each topological order $\sigma$, we need to determine for which $k$ we have $4 \in \mathrm{comps}(\sigma, k)$. The only directed arc from $C_4$ to any other CRP component is to $C_3$. Hence for each $\sigma$, we have that $4 \in \mathrm{comps}(\sigma, k)$ if and only if $3 \in \mathrm{comps}(\sigma, k)$.
[...] if and only if there exists a directed path from W C_k to any other CRP component $C_\kappa$ with $\kappa \in [K] \setminus \{k\}$. This condition is trivially satisfied if there is only one CRP component.
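The enumeration of topological orders in the three-component example above (where $C_1$ and $C_2$ must both precede $C_3$) can be checked by brute force. The following is a minimal sketch, not the paper's procedure: the function name `topological_orders` and the encoding of the precedence constraints as pairs `(a, b)` meaning "a comes before b" are assumptions made here for illustration.

```python
from itertools import permutations


def topological_orders(nodes, must_precede):
    """Return every ordering of `nodes` in which a comes before b
    for each pair (a, b) in `must_precede` (brute-force enumeration)."""
    orders = []
    for perm in permutations(nodes):
        # Map each node to its position in the candidate ordering.
        position = {node: idx for idx, node in enumerate(perm)}
        if all(position[a] < position[b] for a, b in must_precede):
            orders.append(perm)
    return orders


# Hypothetical encoding of the example: components 1 and 2 precede component 3,
# and there is no constraint between components 1 and 2.
precedence = [(1, 3), (2, 3)]
orders = topological_orders([1, 2, 3], precedence)
print(orders)  # [(1, 2, 3), (2, 1, 3)]
```

Brute force over all permutations is factorial in the number of components, so this is only suitable for small examples such as the ones in the figures.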