Large deviations for acyclic networks of queues with correlated Gaussian inputs

We consider an acyclic network of single-server queues with heterogeneous processing rates. It is assumed that each queue is fed by the superposition of a large number of i.i.d. Gaussian processes with stationary increments and positive drifts, which can be correlated across different queues. The flow of work departing from each server is split deterministically and routed to its neighbors according to a fixed routing matrix, with a fraction of it leaving the network altogether. We study the exponential decay rate of the probability that the steady-state queue length at any given node in the network is above any fixed threshold, also referred to as the ‘overflow probability’. In particular, we first leverage Schilder’s sample-path large deviations theorem to obtain a general lower bound for the limit of this exponential decay rate, as the number of Gaussian processes goes to infinity. Then, we show that this lower bound is tight under additional technical conditions. Finally, we show that if the input processes to the different queues are nonnegatively correlated, non-short-range dependent fractional Brownian motions, and if the processing rates are large enough, then the asymptotic exponential decay rates of the queues coincide with the ones of isolated queues with appropriate Gaussian inputs.


Introduction
Modern communication networks are complex, and handle huge amounts of data. This is especially true closer to the backbone of the networks, where large numbers of connections share the same resources. The design and operation of these networks greatly benefit from tractable theoretical models that are able to describe and predict the performance of the system. In order to obtain such tractable models, a common practice is to represent the network's nodes as single-server queues with an appropriate service discipline. Moreover, given the high level of traffic aggregation, it is appealing to approximate the incoming traffic to the network by Gaussian processes [1,2]. Since these networks are often operated in a regime where the packet loss probabilities are very small, there is a need for understanding the large-deviations behavior of these networks.
While a queueing network with Gaussian inputs is a rather streamlined model, the analysis of its large-deviations behavior is notoriously difficult outside the case of an isolated queue, which has been thoroughly studied [3,4,5,6]. The main reason for this is that, after the (initially Gaussian) incoming traffic goes through the first queue, it is no longer Gaussian. Then, when it is fed to a different queue, the analysis of this queue is significantly harder. For the special case of two queues in tandem, with work arriving only to the first queue and all the departing work of the first queue going into the second one, a useful trick yields a tractable analysis of the second queue in the tandem [7], even though its input is not Gaussian: one subtracts the first queue (which has Gaussian input) from the sum of both queues (which behaves exactly as a single-server queue with a Gaussian input); see also the more refined approach in [8] based on the delicate busy-period analysis developed in [9]. However, this trick does not work for more complex networks (not even for two queues in tandem with inputs to both queues, or when not all departures from the first queue join the second one [10]). Another factor that further complicates the analysis of complex networks is the fact that the input processes to the different queues can be correlated. This becomes a problem when the outputs of queues with correlated inputs are merged into another queue.
In this paper we consider acyclic networks of single-server queues, where work arrives to the queues as (possibly correlated) Gaussian processes, and where the work departing from each queue is deterministically split among its neighbors, with a fraction of it leaving the system altogether. This deterministic split of the departing work was also considered in, e.g., [11], and it is particularly suitable for modeling single-class networks (where all work is essentially exchangeable), or for modeling networks where all work needs to be routed to the same node (and thus where the splitting of departure streams is only performed to load balance the network).
In terms of our approach, this paper fits in the framework of the analysis of a single Gaussian queue [5], and the subsequent analysis of tandem, priority, and generalized processor sharing queues [7,12]; we refer to [13] for a textbook account on Gaussian queues. In terms of our scope, this paper is perhaps most similar to [11], where the authors obtained large-deviations results for acyclic networks of G/G/1 queues. However, in that paper there were certain limitations regarding the correlation structure of the input processes (in that they have to be independent across different queues), and regarding the structure of the network (in that any two directed paths cannot meet in more than one node).
1.1. Our contribution. In this paper we generalize the analysis of a pair of queues in tandem, fed by a single Gaussian process [7], to acyclic networks of single-server queues, fed by (possibly correlated) Gaussian processes. As in [7], we assume that the arrival processes are the superposition of $n$ i.i.d. (multi-dimensional) Gaussian processes, and scale the processing rates of the servers by a factor of $n$, which corresponds to the so-called 'many-sources regime'. In this regime, for any given node $i$, we work toward characterizing the asymptotic exponential decay rate of its 'overflow probability', that is, the limit $-\lim_{n\to\infty} \frac{1}{n} \log \mathbb{P}\big(Q_i^{(n)} > nb\big)$, where $Q_i^{(n)}$ is the steady-state queue length at the $i$-th node, and $b$ is any positive threshold. In particular:
(i) We obtain a general lower bound on the asymptotic exponential decay rate by leveraging the power of a generalized version of Schilder's theorem (Theorem 3).
(ii) Under additional technical conditions, we prove the tightness of the lower bound by finding the most likely sample paths, and showing that the corresponding asymptotic exponential decay rates coincide with the lower bound (Theorems 4, 5, and 6).
(iii) We show that, if the input processes to the different queues are non-negatively correlated, non short-range dependent fractional Brownian motions, and if the processing rates are large enough, then the asymptotic exponential decay rates of the queues coincide with the ones of isolated queues with appropriate Gaussian inputs (Theorem 7).
1.2. Organization of the paper. The paper is organized as follows. In Section 2 we introduce some notation, the network model, and a few preliminaries on large-deviations theory. In Section 3 we present our main results. In Section 4 we introduce an interesting example where the large-deviations behavior of any queue in the network coincides with the behavior of a single-server queue with Gaussian input. Finally, we conclude in Section 5.

Model and preliminaries
In this section we introduce some notation, the queueing network model that we analyze, and present a few preliminaries on sample-path large deviations theory.
2.1. Notation for underlying graph. Given a directed graph $G = (V, E)$, and a node $i \in V$, we introduce the following notation. Let $N_{\mathrm{in}}(i)$ be the set of all inbound neighbors of $i$. Let $P_m(i)$ be the set of all directed paths that contain at least $m$ nodes, and end at node $i$. In particular, note that the trivial path $(i)$ is only in $P_1(i)$. For any path $r \in P_2(i)$, let $r^+ \in P_1(i)$ be the path that results from removing the node $r_1$ from the path $r$. Finally, for any path $r \in P_1(i)$, let $|r|$ be the number of nodes that it contains.
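The path notation above can be made concrete with a short sketch. The function names and the diamond-shaped example graph below are hypothetical and only serve as illustration; the recursion relies on the graph being acyclic.

```python
from typing import List, Tuple

Edge = Tuple[int, int]
Path = Tuple[int, ...]


def inbound_neighbors(edges: List[Edge], i: int) -> List[int]:
    """N_in(i): all nodes j with a directed edge (j, i)."""
    return sorted({u for (u, v) in edges if v == i})


def paths_ending_at(edges: List[Edge], i: int, m: int = 1) -> List[Path]:
    """P_m(i): all directed paths with at least m nodes that end at node i.

    Assumes the graph is acyclic, so that the recursion terminates.
    """
    found: List[Path] = []

    def extend(path: Path) -> None:
        if len(path) >= m:
            found.append(path)
        # Prepend every inbound neighbor of the current first node.
        for j in inbound_neighbors(edges, path[0]):
            extend((j,) + path)

    extend((i,))
    return found


def drop_first(path: Path) -> Path:
    """r^+: the path obtained by removing the first node r_1 from r."""
    return path[1:]


# Hypothetical diamond-shaped network: 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4.
edges = [(1, 2), (1, 3), (2, 4), (3, 4)]
p1 = paths_ending_at(edges, 4, m=1)
p2 = paths_ending_at(edges, 4, m=2)
```

In this example the trivial path `(4,)` belongs to `p1` but not to `p2`, and removing the first node of any path in `p2` gives a path in `p1`.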
2.2. Queueing network. In this subsection we introduce the basic structure of our queueing network. Consider a directed acyclic graph with $k$ nodes, and a scaling parameter $n \in \mathbb{Z}_+$. Each node $i$ of the graph is equipped with a single server with rate $n\mu_i$, and a queue with infinite capacity. Work arrives to the network in a number of stochastic processes, $A_1^{(n)}(\cdot), \dots, A_k^{(n)}(\cdot)$, with stationary increments and positive rates $n\lambda_1, \dots, n\lambda_k$, respectively (more details about these processes are given in Section 2.3). In particular, $A_i^{(n)}(\cdot)$ is the stream of work that enters the network at node $i$. Work departing from node $i$ is split deterministically so that, for each edge $(i, j)$ with $i \neq j$, a fraction $p_{i,j} \in [0, 1]$ is routed to node $j$. The remaining fraction of the work departing from node $i$, denoted by $p_{i,i} \in [0, 1]$, leaves the network; evidently, $\sum_{j=1}^{k} p_{i,j} = 1$. In order to simplify notation, for any directed path $r$, let us denote In particular, we have Π i (s) as the amount of exogenous work that arrived to the $i$-th node during the time interval $(s, t]$. Let $D_i^{(n)}(s, t)$ be the amount of work that departed the $i$-th node during $(s, t]$. Then, the total amount of work arriving to the $i$-th node during $(s, t]$ is recalling that $N_{\mathrm{in}}(i)$ is the set of inbound neighbors of $i$. Furthermore, for $t \in \mathbb{R}$, Reich's formula states that the amount of remaining work in the $i$-th queue at time $t$ (also called 'queue length') is given by Moreover, we evidently have Since we are interested in the steady state of the queue lengths, we need to ensure that the service rate of each server is strictly larger than the total arrival rate to its node. This is enforced by imposing the following assumption.
Assumption 1. For each $i \in \{1, \dots, k\}$, we have
Note that, even under Assumption 1, the existence and uniqueness of $k$-dimensional processes satisfying (2), (3), and (4) is not immediate. This will be established in Section 3.1, by expressing them as functionals of the exogenous arrival processes $A_1^{(n)}(\cdot), \dots, A_k^{(n)}(\cdot)$.
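To make the dynamics of the network concrete, the following is a minimal discrete-time sketch of the model, not the continuous-time construction used in the paper. It assumes slotted time, an initially empty network, and nonnegative arrivals (clipped Gaussians chosen purely for illustration); the names `simulate` and `reich`, the three-node example, and all parameter values are hypothetical. Each queue follows a Lindley step, and the helper `reich` evaluates the discrete analogue of the supremum representation given by Reich's formula.

```python
import random
from typing import Dict, List


def simulate(order: List[int], mu: Dict[int, float],
             p: Dict[int, Dict[int, float]],
             arrivals: Dict[int, List[float]]) -> Dict[int, List[float]]:
    """Queue lengths at the end of every slot, for every node.

    order    : nodes in topological order (the graph is acyclic)
    mu[i]    : work that server i can process per slot
    p[i][j]  : fraction of the departures of node i routed to node j != i
    arrivals : exogenous work arriving to each node in each slot
    """
    horizon = len(arrivals[order[0]])
    queue = {i: 0.0 for i in order}
    history: Dict[int, List[float]] = {i: [] for i in order}
    for t in range(horizon):
        routed = {i: 0.0 for i in order}  # work received from upstream in slot t
        for i in order:  # upstream nodes are processed first
            total_in = arrivals[i][t] + routed[i]
            new_queue = max(queue[i] + total_in - mu[i], 0.0)  # Lindley step
            departed = queue[i] + total_in - new_queue
            for j, frac in p.get(i, {}).items():
                routed[j] += frac * departed  # deterministic split
            queue[i] = new_queue
            history[i].append(new_queue)
    return history


def reich(inputs: List[float], mu_i: float) -> float:
    """Discrete analogue of Reich's formula: sup over starting slots s of
    (work that arrived in slots s..t) - mu_i * (number of slots)."""
    best, partial = 0.0, 0.0
    for a in reversed(inputs):
        partial += a - mu_i
        best = max(best, partial)
    return best


random.seed(1)
order = [1, 2, 3]
mu = {1: 1.0, 2: 1.0, 3: 1.5}
p = {1: {2: 0.5, 3: 0.3}, 2: {3: 1.0}}  # remaining fractions leave the network
arrivals = {i: [max(random.gauss(0.4, 0.5), 0.0) for _ in range(500)]
            for i in order}
history = simulate(order, mu, p, arrivals)
```

For node 1, which has no inbound neighbors, the recursion and the supremum representation agree in every slot. For downstream nodes the input also contains the (non-Gaussian) departures of upstream queues, which is the difficulty addressed in Section 3.1.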
Remark 1. Equation (5) corresponds to the setting where the arrival processes are a superposition of individual streams, which is also called the 'many-sources regime' [14].
Finally, the following assumption is in place. It is required for a generalized version of Schilder's theorem to hold, which is introduced in the following subsection.
Assumption 2. (i) The covariance matrix $\Sigma$ is differentiable. (ii) For every $i, j \in \{1, \dots, k\}$, we have
2.4. Sample-path large deviations. In this paper, our aim is to study the limit $-\lim_{n\to\infty} \frac{1}{n} \log \mathbb{P}\big(Q_i^{(n)} > nb\big) = I_i(b)$, where $Q_i^{(n)}$ is the steady-state queue length of the $i$-th node, and $I_i : \mathbb{R}_+ \to \mathbb{R}_+$ is a function that only depends on the server rates $\mu = (\mu_1, \dots, \mu_k)$, on the drift vector $\lambda$, and on the covariance matrix $\Sigma$. In order to do this, we rely on a sample-path large deviations principle for centered Gaussian processes, based on the generalized Schilder's theorem. Before stating this theorem, we introduce its framework.
First, we introduce the sample-path space equipped with the norm which is a separable Banach space [15]. Next, we introduce the Reproducing Kernel Hilbert Space (rkhs) $R^k \subset \Omega^k$ (see [16] for more details) induced by using the covariance matrix $\Sigma(\cdot, \cdot)$ as the kernel. In order to define it, we start from the smaller space for all $t, s \in \mathbb{R}$ and $u, v \in \mathbb{R}^k$. The closure of $R^k_*$ with respect to the topology induced by its inner product is the rkhs $R^k$. Using this inner product and its corresponding norm $\|\cdot\|_{R^k}$, we define a rate function by otherwise.
Remark 2. In [15,12], the authors defined an appropriate multi-dimensional rkhs as the product of single-dimensional spaces that use the individual variance functions as kernels. There, this could be done because the different coordinates of the multi-dimensional Gaussian process of interest were assumed independent.
In our case, since the coordinates of our Gaussian process of interest need not be independent, we needed to define the multi-dimensional space directly, using the whole covariance matrix as the kernel.When the coordinates are indeed independent, both definitions are equivalent.
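As an illustration of how a matrix-valued kernel induces a norm, the sketch below evaluates the squared norm of a finite combination $f(\cdot) = \sum_j \Sigma(\cdot, t_j) u_j$, assuming the standard reproducing-kernel inner product $\langle \Sigma(\cdot, t) u, \Sigma(\cdot, s) v \rangle = u^\top \Sigma(t, s) v$ and a rate function equal to half the squared norm. Both are assumptions made here for illustration. The one-dimensional Brownian kernel `bm_kernel` and all names are hypothetical choices, picked because the result can be checked against the classical expression $\frac{1}{2}\int f'(x)^2 dx$.

```python
import numpy as np


def bm_kernel(s: float, t: float) -> np.ndarray:
    """1x1 covariance 'matrix' of a standard Brownian motion on [0, inf)."""
    return np.array([[min(s, t)]])


def rkhs_sq_norm(kernel, times, weights) -> float:
    """Squared norm of f(.) = sum_j kernel(., t_j) u_j,
    computed as sum_{j,l} u_j' kernel(t_j, t_l) u_l."""
    total = 0.0
    for tj, uj in zip(times, weights):
        for tl, ul in zip(times, weights):
            total += float(uj @ kernel(tj, tl) @ ul)
    return total


def rate(kernel, times, weights) -> float:
    """Rate function of a path in the span of the kernel sections."""
    return 0.5 * rkhs_sq_norm(kernel, times, weights)


# f(x) = 2 * min(x, 1) - 1 * min(x, 3): slope 1 on [0, 1], slope -1 on (1, 3].
times = [1.0, 3.0]
weights = [np.array([2.0]), np.array([-1.0])]
sq_norm = rkhs_sq_norm(bm_kernel, times, weights)
energy = 1.0 ** 2 * 1.0 + (-1.0) ** 2 * 2.0  # integral of f'(x)^2 over [0, 3]
```

Here `sq_norm` and `energy` coincide, as expected for Brownian motion. For correlated coordinates the same function applies with a $k \times k$ kernel, which is the reason for working with the whole covariance matrix rather than with a product of one-dimensional spaces.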
Under the framework defined above, the following sample-path large deviations principle holds.
Schilder's theorem typically only gives implicit results, as it is often hard to explicitly compute the infimum over the set of sample paths. However, as in [7,12,15], we will leverage the properties of our rkhs to obtain explicit results.

Main results
In this section we establish large-deviations results for the steady-state queue-length distributions. In particular, we will use Theorem 1 to show that, for any $i \in \{1, \dots, k\}$, and for every $b > 0$, the limit exists, and to find (tight) bounds for it. The first step is to express this probability as a function of the Gaussian arrival processes (Section 3.1), and to show that the limit exists (Section 3.2). Second, we obtain a general lower bound for this limit (Section 3.3), and prove that it is tight under additional technical assumptions (Section 3.4). The arguments largely follow the same structure as the arguments for the analysis of the second queue in a tandem [7], but without the simplifications that come from having only two queues in tandem, with arrivals only to the first one.
3.1. Overflow probability as a function of the arrival processes. In this subsection we obtain a set $E_i(b)$ of sample paths such that By Reich's formula, we have where is the total amount of work that arrived to the $i$-th queue in the time interval $(t, 0]$. If $i$ is a node with no inbound neighbors, i.e., if $N_{\mathrm{in}}(i) = \emptyset$, we have that In this case, a large-deviations analysis can be performed through a straightforward application of Schilder's theorem (this is exactly the same as in the case of an isolated Gaussian queue [5]). However, in general the input process is the sum of the local Gaussian arrival process, and the departure processes of its inbound neighbors, which are not Gaussian. In the following lemma we obtain the input process as a functional of the exogenous arrival processes of all the upstream nodes.
Lemma 1. For each $i \in \{1, \dots, k\}$, and for all $t < 0$, we have where
The proof is given in Appendix A, and consists of solving a recursive equation on the input processes by using induction on the maximum length of paths that end in node $i$.
Remark 3. Let $t^*$ and $s^*$ be finite optimizers of the two suprema in (8) over the closure of their domains. These have the following interpretation: for each path $r \in P_2(i)$, the time $t^*_r$ (respectively, $s^*_r$) is the starting point of the busy period of the $r_1$-th queue that contains the time $t^*_{r^+}$ (respectively, $s^*_{r^+}$). Then, since $t_i = t < 0$ and $s_i = 0$, it follows that $t^*_r \leq s^*_r$, for all $r \in P_1(i)$. Combining this with (8), and using the continuity of $A^{(n)}(\cdot)$, we obtain where Note that the continuity of is what allows us to have the condition $t_r < s_r$ instead of $t_r \leq s_r$. This distinction will be convenient later.
We now state the main result of this subsection.
Theorem 2. For each i ∈ {1, . . ., k}, and for every b > 0, we have where The proof follows immediately from Reich's formula and Lemma 1, and it is given in Appendix B.

3.2. Decay rate of the overflow probability. In this subsection we establish the existence of the limit for all $b > 0$. Recall that Theorem 2 states that $\mathbb{P}\big(Q_i^{(n)} > bn\big)$ satisfies (9), where $E_i(b)$ is an open set of the path space $\Omega^k$. Then, by Schilder's theorem (Theorem 1), we have Then, the existence of the limit is equivalent to showing that $E_i(b)$ is an $I$-continuity set, which is stated in the following proposition. The proof follows the lines of the proof of [7, Thm. 3.1], and it is thus omitted.
Proposition 1. For each $i \in \{1, \dots, k\}$, and for every $b > 0$, we have
Since the existence of the decay rate of interest given in (6) has now been established, in the following subsections we focus on finding lower and upper bounds on it.

3.3. Lower bound on the decay rate. In this subsection we present a general lower bound for the asymptotic exponential decay rate of the overflow probability in steady state. We start by introducing some notation. Given a vector $v$ and a scalar $a$, we denote $v - (a, \dots, a)$ as $v - a$. For each node $i \in \{1, \dots, k\}$, we denote Note that $A(\cdot)$ is a $k$-dimensional Gaussian process with zero mean, and covariance matrix $\Sigma$. For each node $i \in \{1, \dots, k\}$, Moreover, let us define the functions , where Using the above notation, we now state our lower bound.
Theorem 3. Under Assumptions 1 and 2, for each $i \in \{1, \dots, k\}$ and for every $b > 0$, where
The proof is given in Appendix C, and it essentially consists of two steps. First, we decompose the event $E_i(b)$ given in Theorem 2 as a union of intersections of simpler events that only involve the sample paths at fixed times, and we upper bound the probability of the intersection by the probability of the least likely one. Then, we use Cramér's theorem to obtain the decay rate of the least likely of these simpler events by solving the additional quadratic optimization problem that arises by its application.
Remark 4. As part of the proof of Theorem 3, it is established that the conditions $k^i_b(t, s) < c^i(t, s)$ or $s = t - t_i$, and $h^i_b(t, s) > c^i(t, s)$, cannot be satisfied at the same time. As a result, the three cases in the definition of $I^i_b(t, s)$ are mutually exclusive.
Remark 5. The lower bound in Theorem 3 generalizes the lower bound given in [7, Corollary 3.5], not only by generalizing the network structure from a set of tandem queues to any acyclic network of queues, but also by removing a concavity assumption on the square root of the variance of the input processes. However, the removal of this assumption makes the expression of the lower bound more convoluted, even if we restrict it to the case of a pair of queues in tandem.
Remark 6. It is worth highlighting that, even if the bound of Theorem 3 is not tight, a lower bound on the asymptotic exponential decay rate translates into an asymptotic upper bound on the overflow probability, which can be used as a performance guarantee in applications.
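The role of Cramér's theorem in this subsection can be illustrated in the simplest Gaussian case. For the sample mean of $n$ i.i.d. $N(0, v)$ random variables, the exponential decay rate of the probability of exceeding a level $a > 0$ is $a^2/(2v)$. The sketch below compares this with the exact tail probability. The function names are hypothetical, and $n$ is kept moderate so that the tail does not underflow in double precision.

```python
import math


def gaussian_tail_rate(a: float, v: float, n: int) -> float:
    """-(1/n) log P(sample mean of n i.i.d. N(0, v) >= a), computed exactly."""
    z = a * math.sqrt(n / v)                    # standardized threshold
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(N(0, 1) >= z)
    return -math.log(tail) / n


def cramer_rate(a: float, v: float) -> float:
    """Limiting decay rate given by Cramer's theorem for Gaussian summands."""
    return a * a / (2.0 * v)


rate_100 = gaussian_tail_rate(1.0, 1.0, 100)
rate_1000 = gaussian_tail_rate(1.0, 1.0, 1000)
limit = cramer_rate(1.0, 1.0)
```

The finite-$n$ values approach the limit from above, since the Gaussian tail carries a polynomial prefactor that vanishes on the exponential scale.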
3.4. Tightness of the lower bound. In this subsection we obtain conditions under which the lower bound in Theorem 3 is tight. We present three results, one for each of the cases in the definition of $I^i_b(t, s)$ in (11), with different technical conditions for each case. Let $(t^*, s^*)$ be an optimizer of (11) over the closure of its domain. We first establish that, if the optimum of (11) is achieved in the first case, then the lower bound of Theorem 3 is tight under an additional technical condition. This is formalized in the following theorem.
Theorem 4. Under Assumptions 1 and 2, the following holds. If
The proof is given in Appendix D, and it essentially consists of two steps. First, we identify a most likely sample path in the least likely event of the intersection given in the decomposition of the event $E_i(b)$ that was used in the proof of Theorem 3. Then, we show that, under the assumptions imposed, this most likely sample path is in all the sets featuring in the intersection, thus implying optimality in $E_i(b)$.
Since the condition in (12) requires an optimizer $t^*$ of (11), it is generally hard to verify. In the following lemma we present a sufficient condition that is easier to verify.
Lemma 2. A sufficient condition for (12) to hold is that for all s ∈ S i t such that s ≠ t − ti , where
The proof is given in Appendix E.
Remark 7. Although the condition of (13) looks almost the same as the original one of (12), the key simplification is that for (13) we only need t instead of $t^*$, which is an optimizer of an easier optimization problem.
We now present the second result of this subsection. It asserts that, if the optimum of (11) is achieved in the second case, then the lower bound of Theorem 3 is tight under an additional technical condition.
Theorem 5. Under Assumptions 1 and 2, the following holds. Suppose that
The proof is analogous to the proof of Theorem 4, and it is thus omitted.
Remark 8.Note that the second condition in Theorem 5 is satisfied if the first one is satisfied with strict inequality for s = t * − t * i .
Finally, we show that if the optimum of (11) is achieved in the third case, then the lower bound of Theorem 3 is tight under a different additional technical condition.
Theorem 6. Under Assumptions 1 and 2, the following holds. Suppose that
The structure of the proof is the same as the proof of Theorem 4, and it is given in Appendix F.

Example: equivalence to a single server queue
In this section we show that, if the input process is a multivariate fractional Brownian motion with non short-range dependence and non-negative correlation between its coordinates, and if the service rates are sufficiently large, then the large-deviations behavior of any fixed queue in the network is the same as if all inputs to upstream queues were inputs to the queue itself. This phenomenon was also observed in [7] for the second queue in a tandem, and here we generalize the conditions under which it occurs.
4.1. Preliminaries on multivariate fractional Brownian motions. Consider the case where the exogenous arrival process $A^{(n)}(\cdot)$ is a multivariate fractional Brownian motion (mfBm). Since each coordinate is a real-valued fBm, for each $i \in \{1, \dots, k\}$, and for every $t < s < 0$, we have where $H_i \in (0, 1)$ is its Hurst index, and is its variance. Furthermore, it is known [18] that, for each $i, j \in \{1, \dots, k\}$, and for every $t < s < 0$, we have j (1) are their covariances, and $\eta_{i,j} = -\eta_{j,i} \in \mathbb{R}$ represents the inter-correlation in time between the two coordinates. Note that, contrary to single-dimensional fBms, mfBms need not be time-reversible. In particular, a mfBm is time-reversible if and only if $\eta_{i,j} = 0$ for all $i, j$ [19, Prop. 6]. Moreover, the parameters $\eta_{i,j}$ have the following interpretation [19]:
(i) If the one-dimensional fBms are short-range dependent (i.e., if $H_i, H_j < 1/2$), then they are either short-range interdependent if $\rho_{i,j} \neq 0$ or $\eta_{i,j} \neq 0$, or independent if $\rho_{i,j} = \eta_{i,j} = 0$. This also holds when $H_i + H_j < 1$, even if one of them is larger than or equal to $1/2$.
(ii) If the one-dimensional fBms are long-range dependent (i.e., if $H_i, H_j > 1/2$), then they are either long-range interdependent if $\rho_{i,j} \neq 0$ or $\eta_{i,j} \neq 0$, or independent if $\rho_{i,j} = \eta_{i,j} = 0$. This also holds when $H_i + H_j > 1$, even if one of them is smaller than or equal to $1/2$.
(iii) If the one-dimensional fBms are Brownian motions (i.e., if $H_i = H_j = 1/2$), then they are either long-range interdependent if $\eta_{i,j} \neq 0$, or independent if $\eta_{i,j} = 0$. This also holds whenever $H_i + H_j = 1$, even if neither of them is equal to $1/2$.
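For the time-reversible case ($\eta_{i,j} = 0$), the covariance structure of an mfBm can be sketched numerically. The cross-covariance used below, $\frac{1}{2}\rho_{i,j}\sigma_i\sigma_j\big(|t|^{H_i+H_j} + |s|^{H_i+H_j} - |t-s|^{H_i+H_j}\big)$, is the form commonly used in the mfBm literature for $H_i + H_j \neq 1$, and should be read as an assumption here rather than as a formula taken from this paper. The function names and parameter values are hypothetical.

```python
import numpy as np


def mfbm_cov(t, s, i, j, H, sigma, rho):
    """Cov(A_i(t), A_j(s)) for a time-reversible mfBm (eta = 0, assumed form)."""
    h = H[i] + H[j]
    scale = 0.5 * rho[i][j] * sigma[i] * sigma[j]
    return scale * (abs(t) ** h + abs(s) ** h - abs(t - s) ** h)


def increment_var(t, s, i, H, sigma, rho):
    """Var(A_i(s) - A_i(t)), which should equal sigma_i^2 |s - t|^(2 H_i)."""
    return (mfbm_cov(s, s, i, i, H, sigma, rho)
            - 2.0 * mfbm_cov(t, s, i, i, H, sigma, rho)
            + mfbm_cov(t, t, i, i, H, sigma, rho))


# Two non-negatively correlated, long-range dependent coordinates.
H = [0.7, 0.7]
sigma = [1.0, 2.0]
rho = [[1.0, 0.5], [0.5, 1.0]]
times = [-3.0, -2.0, -1.0]

# Joint covariance matrix of (A_1(t), A_2(t)) over the chosen times.
index = [(i, t) for i in range(2) for t in times]
C = np.array([[mfbm_cov(t, s, i, j, H, sigma, rho) for (j, s) in index]
              for (i, t) in index])
smallest_eigenvalue = np.linalg.eigvalsh(C).min()
```

With equal Hurst indices, the joint covariance matrix is a Kronecker product of two positive semidefinite matrices, so `smallest_eigenvalue` is nonnegative up to rounding. With different Hurst indices, the admissible values of $\rho_{i,j}$ are more restricted.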

4.2. Non-negatively correlated, non short-range dependent inputs. We now present the main result of this section.
Remark 9. Note that this decay rate is the same as the one that we would obtain in a single-server queue with processing rate µ i and input This means that, under the assumptions of Theorem 7, in this regime the queues upstream of node i are 'transparent'.In particular, this implies that the most likely overflow path is the one where all upstream queues are empty.
Remark 10. In the case of a pair of queues in tandem with arrivals only to the first queue, the condition in (15) is the same as the one obtained in the analysis of the tandem queues done in [7].
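As a point of reference for the comparison with an isolated queue, the sketch below evaluates the decay rate of a single queue with service rate $\mu$, fed by fBm sources with drift $\lambda$, variance parameter $\sigma^2$, and Hurst index $H$. It uses the standard many-sources expression $\inf_{t > 0} (b + (\mu - \lambda) t)^2 / (2 \sigma^2 t^{2H})$ from the literature on isolated Gaussian queues, which is assumed here and not restated in this paper. The function names and parameter values are hypothetical.

```python
import math


def isolated_rate_numeric(b, mu, lam, sigma, H, points=20001):
    """Minimize (b + (mu - lam) t)^2 / (2 sigma^2 t^(2H)) over a log-spaced grid."""
    best = float("inf")
    for k in range(points):
        t = 10.0 ** (-4.0 + 8.0 * k / (points - 1))  # grid on [1e-4, 1e4]
        value = (b + (mu - lam) * t) ** 2 / (2.0 * sigma ** 2 * t ** (2.0 * H))
        best = min(best, value)
    return best


def isolated_rate_closed_form(b, mu, lam, sigma, H):
    """Value at the minimizer t* = b H / ((mu - lam)(1 - H))."""
    kappa = mu - lam
    return ((kappa / H) ** (2.0 * H) * (b / (1.0 - H)) ** (2.0 - 2.0 * H)
            / (2.0 * sigma ** 2))


numeric = isolated_rate_numeric(1.0, 2.0, 1.0, 1.0, 0.7)
closed = isolated_rate_closed_form(1.0, 2.0, 1.0, 1.0, 0.7)
```

The closed form follows from setting the derivative of the logarithm of the objective to zero, which gives $(\mu - \lambda) t^* (1 - H) = H b$. The grid search is only a numerical cross-check of that computation.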

Conclusions
We have considered an acyclic network of queues with (possibly correlated) Gaussian inputs and static routing, and characterized the large-deviations behavior of the steady-state queue length in each queue of the network. We achieved this by defining an appropriate multi-dimensional Reproducing Kernel Hilbert Space, and using Schilder's theorem to obtain lower and upper bounds for the asymptotic exponential decay rate. This generalizes previous results, which focused on isolated queues and two-queue tandem systems (with arrivals only to the first queue).
While the results that we obtain are quite general both in terms of the network structure and in terms of the correlation structure among the arrival processes to the different nodes, there are still interesting open problems. For instance:
(i) While we considered essentially only single-class traffic with a deterministic split of the work departing from each server, it would be interesting to extend our results to multi-class networks, where the servers are shared by using, for example, the Generalized Processor Sharing discipline [12].
(ii) While we only obtained large-deviations results for each queue separately, it would be interesting to obtain similar results for the joint queue lengths.
Appendix A. Proof of Lemma 1
We prove this by induction on the maximum length of paths that end in node $i$. Suppose that the maximum length is one. Then, $P_2(i) = \emptyset$ and thus Now suppose that (8) holds for all nodes $j$ such that the maximum length of paths that end in $j$ is at most one less than the maximum length of paths that end in node $i$. Recall that Combining the last two equations, we obtain that Since all $j$ are inbound neighbors of $i$, and the graph is acyclic, the maximum lengths of paths that end in nodes $j$ are at most one less than the maximum length of paths that end in node $i$. Then, using the inductive hypothesis on the input processes After renaming the variables for ease of exposition, we obtain

Appendix B. Proof of Theorem 2
By Reich's formula, we have By Lemma 1, we obtain Since the centered Gaussian processes are symmetric, we have where Finally, rearranging terms, and using that $s_i = 0$, we obtain

Appendix C. Proof of Theorem 3
The proof consists of two steps. First, we decompose the event $E_i(b)$ given in Theorem 2 as a union of intersections of simpler events that only involve the sample paths at fixed times, and we majorize the probability of the intersection by the probability of the least likely one (Lemma 3). Then, we use Cramér's theorem to obtain the decay rate of the least likely of these simpler events by solving the additional quadratic optimization problem that arises by its application (Lemma 4).
Lemma 3. We have inf where
Remark 11. Note that the first condition in the definition of the set $U_{t,s}$ is the same as the second one, but with $s_r = t_r - t_i$, for all $r \in P_2(i)$. This generalizes Theorem 3.2 in [7], where an appropriate $U_{t,s}$ is defined by having the first condition being the same as the second one but with $s_r = 0$, for all $r \in P_2(i)$. In the case of a tandem with arrivals only to the first queue, both definitions are equivalent.

Proof. Recall that where Then, we have inf Et,s Now fix $t \in T_i$, and consider the innermost infimum. Since $f$ is continuous, then for all $s \in S_i(t)$ implies Et,s Et,s Combining this with (16) completes the proof.
Remark 12. Note that, by taking the supremum over all $s \in S_i(t)$ at the end of the proof, we are essentially upper bounding the probability of an intersection with the probability of the least likely event.
While we have made progress towards obtaining the desired expression for the limiting overflow probability, the expression in Lemma 3 still depends on the rate function $I$. We now proceed to compute a simpler expression.
Lemma 4. Under Assumption 2, for $t \in T_i$, and $s \in S_i(t)$, we have
Proof. Recall that can be rewritten as Since this probability only depends on the state of the trajectories at fixed points in time, that is, only depends on a finite set of Gaussian random variables, it follows that $U_{t,s}$ is an $I$-continuity set, and thus Schilder's theorem implies that We now proceed to compute the left-hand side.
First, consider the exceptional case where $s = t - t_i$. Substituting this in (17), we get Moreover, by Cramér's theorem, we have that Combining this with (18) and (19), we obtain inf f ∈Ut,s Now consider the case when $s \neq t - t_i$. By the multivariate version of Cramér's theorem, we have that Combining this with (17) and (18), we get that inf f ∈Ut,s Since $\Lambda_{t,s}$ is quadratic and the constraints are linear, it follows by standard calculus that the optimal values of $y$ and $z$ are and respectively. Although this gives four possible combinations for $(y^*, z^*)$, the following claim states that one of them is not possible.
Claim 1. For all $t \in T_i$ and $s \in S_i(t)$ such that $s \neq t - t_i$, we have that
Proof. Suppose that and that Then, we have which is impossible because the Cauchy-Schwarz inequality implies that for all $t$, and $s$ such that $s \neq t - t_i$.
Combining Claim 1 with (24) and (23), we conclude that In that case, substituting the optimal values Combining this with (21) we get that, if On the other hand, combining Claim 1 with equations (24) and (23), we also get that In that case, substituting the optimal values in (20), we obtain that $\Lambda_{t,s}(y^*, z^*)$ equals Combining this with (21) we get that, if then inf f ∈Ut,s 2 Var Āi (s, t) .
Finally, if neither (26) nor (27) holds, Claim 1 implies that Combining this with (21) we get that, if
Combining Lemmas 3 and 4 concludes the proof of Theorem 3.
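The case analysis in the proof of Lemma 4 comes from minimizing a quadratic form under linear constraints. The sketch below works through a generic two-dimensional instance: minimize $\frac{1}{2} x^\top \Sigma^{-1} x$ subject to $x_1 \geq a$ and $x_2 \leq c$, with $a > 0$. The constraint set, the function name, and the numerical values are assumptions made for illustration, and are not the exact constraints of the paper. Since the problem is convex, the optimum is the best feasible point among the minimizers over each combination of active constraints.

```python
import numpy as np


def constrained_gaussian_rate(cov, a, c):
    """Minimize 0.5 * x' cov^{-1} x subject to x[0] >= a and x[1] <= c."""
    cov = np.asarray(cov, dtype=float)
    prec = np.linalg.inv(cov)

    def value(x):
        return 0.5 * float(x @ prec @ x)

    candidates = [
        np.zeros(2),                               # no active constraint
        np.array([a, cov[0, 1] / cov[0, 0] * a]),  # only x[0] = a is active
        np.array([cov[0, 1] / cov[1, 1] * c, c]),  # only x[1] = c is active
        np.array([a, c]),                          # both constraints active
    ]
    feasible = [x for x in candidates
                if x[0] >= a - 1e-12 and x[1] <= c + 1e-12]
    best = min(feasible, key=value)
    return value(best), best


positive = [[1.0, 0.8], [0.8, 1.0]]
negative = [[1.0, -0.8], [-0.8, 1.0]]
case_one, x_one = constrained_gaussian_rate(positive, 1.0, 2.0)     # a^2 / (2 Var)
case_both, x_both = constrained_gaussian_rate(positive, 1.0, 0.5)   # both active
case_two, x_two = constrained_gaussian_rate(negative, 1.0, -2.0)    # c^2 / (2 Var)
```

When only one constraint is active, the free coordinate sits at its conditional mean and the rate reduces to a one-dimensional Gaussian rate. When both are active, the full quadratic form is evaluated at $(a, c)$. This mirrors the three-case structure of the lower bound.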

Appendix D. Proof of Theorem 4
Given Theorem 3, it is enough to show that, if for all $s \in S_i(t^*)$ such that $s \neq t^* - t^*_i$, then In the proof of Theorem 3, the lower bound on the decay rate was obtained by replacing the decay rate of an intersection of events by the decay rate of the least likely of these. Therefore, if the optimum path in this least likely set happens to be in all the sets in the intersection, then the bound is tight. In particular, if $t^*$ and $s^*$ are optimizers in the lower bound of Theorem 3, then we need to show that the most probable path in $U_{t^*,s^*}$ is in $E_i(b)$. Furthermore, since Proposition 1 states that $E_i(b)$ is an $I$-continuity set, it is enough to show that the most probable path in for $j \in \{1, \dots, k\}$.
Proof. For $j \in \{1, \dots, k\}$, we have Then, we can write , and thus $f^*$ is in the rkhs $R^k$. Then, we have .
Since $k^i_b(t^*, s) < c^i(t^*, s)$ for all $s \in S_i(t^*)$ such that $s \neq t^* - t^*_i$, the expression above is equal to the lower bound in Theorem 3. It follows that $f^*$ is a most probable path in the set $U_{t^*,s^*}$.
To complete the proof, we just need to show that $f^* \in E_i(b)$, i.e., we need to show that there exists $t \in T_i$ such that for all $s \in S_i(t)$. For $t = t^*$, we have Finally, combining this with (28) and the fact that for all $s \in S_i(t^*)$, which concludes the proof.
Appendix E. Proof of Lemma 2
for all $t \in T_i$. Therefore, we have sup s∈Si(t) On the other hand, since k i b t, s < c i t, s for all s ∈ S i ( t) such that s ≠ t − ti , we have In particular, this means that we can pick t = $t^*$, and thus

Appendix F. Proof of Theorem 6
Similarly to the proof of Theorem 4, if $t^*$ and $s^*$ are optimizers in the lower bound of Theorem 3, we need to show that the most probable path in for $j \in \{1, \dots, k\}$.
Proof. Using standard properties of conditional multivariate Normal random variables, we get that for all $j \in \{1, \dots, k\}$, where Then, we can write and thus $f^*$ is in the rkhs $R^k$. After tedious but straightforward computations we obtain ), the equation above is equal to the lower bound in Theorem 3. It follows that $f^*$ is a most probable path in $U_{t^*,s^*}$.
To complete the proof, we just need to show that $f^* \in E_i(b)$, i.e., we need to show that there exists $t \in T_i$ such that for all $s \in S_i(t)$. In order to simplify notation, we denote Combining this with (14), we obtain for all $s \in S_i(t^*)$, which concludes the proof.
Appendix G. Proof of Theorem 7
We start with a technical lemma.
Lemma 5. There exists such that $t^*_r = t^*_i$, for all $r \in P_2(i)$.
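Proof. Note that the numerator of the function being minimized in (31) only depends on $t_i$. As a result, we can focus on the structure of the maximizers of its denominator when we keep $t_i$ fixed. Using that $A(\cdot)$ is a time-reversible mfBm, we obtain that $\operatorname{Var}\,\bar{A}_i(t - t_i, t)$ equals Taking the derivative with respect to $t_r$, and using that $t_r \leq t_i \leq 0$ for all $t \in T_i$, we obtain Moreover, for all $t_r \leq \min\{t_{r'} : r' \in P_1(i),\ r' \neq r\}$, we have If $t_r - t_{r'} - t_i \leq 0$, we have where in the last inequality we used that $H \geq 1/2$. On the other hand, if $t_r - t_{r'} - t_i > 0$, we have where in the last inequality we used that $H \geq 1/2$. Combining (32), (33), and (34) with $\rho_{r_1, r'_1} \geq 0$, for all $r, r' \in P_1(i)$, it follows that $\operatorname{Var}\,\bar{A}_i(t - t_i, t)$ is maximized when $t_r = t_i$, for all $r \in P_2(i)$.
Lemma 5 implies that we can pick such that $t^*_r = t^*_i$, for all $r \in P_2(i)$. In that case, we have An elementary computation yields that 2σ r1 σ i ρ r1,i + r ′ ∈P2(i) Then, a sufficient condition for (36) to hold is that min µ l p l,j : Lemma 2 and Theorem 4 finish the proof.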
Lemma 5 implies that we can pick such that t * r = t * i , for all r ∈ P 2 (i).In that case, we have An elementary computation yields that 2σ r1 σ i ρ r1,i + r ′ ∈P2(i) Then, a sufficient condition for (36) to hold is that min µ l p l,j : Lemma 2 and Theorem 4 finish the proof.