A Note on Wiener–Hopf Factorization for Markov Additive Processes

We prove the Wiener–Hopf factorization for Markov additive processes. We also derive the Spitzer–Rogozin theorem for this class of processes, which is then used to obtain Kendall's formula and the Fristedt representation of the cumulant matrix of the ladder epoch process. Finally, we obtain the so-called ballot theorem.


Introduction
The classical Wiener-Hopf factorization of a probability measure was given by Spitzer [29] and Feller [13], and has a strong connection to random walks. This result was generalized by Rogozin [28], Fristedt [16], and other authors using approximation based on discrete-time skeletons. Greenwood and Pitman [17] used a direct approach which relies on excursion theory for the reflected process; for details, see [7,20]. Another approach is presented in [14], where the link with scattering theory is also made. Presman [26] and Arjas and Speed [5] generalized Spitzer's identity in a different direction, to the class of Markov additive processes (MAPs) in discrete time (see also [2,25]). Later, Kaspi [19] proved the Wiener-Hopf factorization for a continuous-time Markov additive process whose Markovian component is ergodic and has a finite state space. The fluctuation identity given by Kaspi [19] involves the distribution of the inverse local time. Dieker and Mandjes [12] investigate discrete-time Markov additive processes and use an embedding to relate these to a continuous-time setting (see also [9,27]).
The use of MAPs is widespread, making them a classical model in applied probability with a variety of application areas, such as queues, insurance risk, inventories, data communication, finance, environmental problems and so forth; see, e.g., [2, Chap. XI], [4,9,10,15], [25, Chap. 7]. The motivation comes from the need to model seasonality of prices, recurring everyday patterns of activity, burst arrivals, occurrence of events in phases, and so on. This leads to regime-switching models, where the process of interest is modulated by a background process. The so-called phase-type distributions also fit naturally into the framework of MAPs. A MAP with positive phase-type jumps can be reduced to a MAP with no positive jumps without losing any information. This procedure is called fluid embedding. Informally, it involves enlarging the state space of the background process and replacing the jumps by linear stretches of unit slope. Apart from the above, a MAP is a natural generalization of a Lévy process, with many analogous properties and characteristics, although various new mathematical objects appear in the theory of MAPs, posing new challenges.
This paper presents the Wiener-Hopf factorization for a special, but nonetheless quite general, class of Markov additive processes. For this class of processes, we give a short proof of the Wiener-Hopf factorization based on the Markov property and additivity. We also express the terms of the Wiener-Hopf factorization directly in terms of the basic data of the process. Finally, we derive the Spitzer-Rogozin theorem for this class of processes, which is then used to obtain Kendall's formula and the Fristedt representation of the cumulant matrix of the ladder epoch process. We also present the ballot theorem.
The paper is organized as follows. Section 2 introduces basic definitions, facts and properties related to MAPs. In Sect. 3, we give the main results of this paper. Finally, in Sect. 4, we prove all the theorems.

Markov Additive Processes
Before presenting our main results, we define the class of processes we intend to work with and recall their properties. Following [3], we consider a process $X(t)$, where $X(t) = X^{(1)}(t) + X^{(2)}(t)$, and the independent processes $X^{(1)}(t)$ and $X^{(2)}(t)$ are specified by the characteristics $q_{ij}$, $G_i$, $\sigma_i$, $a_i$, $\nu_i(dx)$, which we now define. Let $J(t)$ be a right-continuous, ergodic, continuous-time Markov chain with finite state space $I = \{1, \ldots, N\}$ and intensity matrix $Q = (q_{ij})$. We denote the jumps of the process $J(t)$ by $\{T_n\}$. Define the jump process by […]. For each $j \in I$, let $X_j(t)$ be a Lévy process with the Lévy-Khinchine exponent: […] (1). By $X^{(2)}(t)$ we denote the process which behaves in law like $X_i(t)$ when $J(t) = i$. We shall assume that the aforementioned class of MAPs is defined on a probability space with probabilities $\{P_i : i \in I\}$, where $P_i(\cdot) = P(\cdot \mid J(0) = i)$, and right-continuous natural filtration $\mathcal{F} = \{\mathcal{F}_t : t \ge 0\}$. In fact, we can consider a more general MAP in which the additional jumps $U^{(i)}_n$ appearing at the changes of state of $J(t)$ could also depend on the state $J(T_{n+1})$ (a so-called anticipative MAP). This could be done by considering the vector state space $I^2$ and the modified governing Markov process $J$ on it. If each of the measures $\nu_i$, as well as the distribution of each $U^{(i)}_n$, is supported on $(-\infty, 0]$, then we say that $X$ is a spectrally negative MAP. These definitions, and further details on the basic characterization of MAPs, can be found in [2, Chap. XI].
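To make the construction concrete, the following Python sketch simulates one path of a simple MAP, recorded at the switching epochs of $J$ and at a final horizon. It is only an illustration under stated assumptions: each $X_i$ is taken to be a Brownian motion with drift $a_i$ and volatility $\sigma_i$ (no Lévy jumps), the extra jump at a switching epoch is drawn from a sampler indexed by the state being left, and the name `simulate_map` and its arguments are hypothetical rather than notation from [3].

```python
import numpy as np


def simulate_map(Q, a, sigma, jump_samplers, T, rng, j0=0):
    """Return a list of (t, J(t), X(t)) at the switching epochs of J and at time T.

    Illustrative assumptions: X_i is a Brownian motion with drift a[i] and
    volatility sigma[i]; when J leaves state i, the extra jump
    jump_samplers[i](rng) is added to X.
    """
    Q = np.asarray(Q, dtype=float)
    t, j, x = 0.0, j0, 0.0
    path = [(t, j, x)]
    while True:
        rate = -Q[j, j]                      # total rate of leaving state j
        hold = rng.exponential(1.0 / rate) if rate > 0 else np.inf
        step = min(hold, T - t)              # time spent in state j before T
        # Exact Gaussian increment of the Levy component over `step`.
        x += a[j] * step + sigma[j] * np.sqrt(step) * rng.standard_normal()
        if hold >= T - t:                    # horizon reached before the next switch
            path.append((T, j, x))
            return path
        t += hold
        x += jump_samplers[j](rng)           # jump attached to the change of state
        probs = Q[j].copy()
        probs[j] = 0.0
        probs /= rate                        # embedded jump-chain probabilities
        j = int(rng.choice(len(probs), p=probs))
        path.append((t, j, x))


# Example: two states, negative exponential jumps at switching epochs.
rng = np.random.default_rng(0)
neg_exp = lambda r: -r.exponential(0.5)
example_path = simulate_map([[-1.0, 1.0], [2.0, -2.0]], a=[1.0, -1.0],
                            sigma=[0.3, 0.0], jump_samplers=[neg_exp, neg_exp],
                            T=10.0, rng=rng)
```

With negative jump samplers and no positive Lévy jumps, the simulated path has no upward jumps, which matches the spectrally negative case described above.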

Time Reversal
Predominant in the forthcoming analysis will be the use of the bivariate process $(\hat J, \hat X)$, representing the process $(J, X)$ time reversed from a fixed moment in the future when $J(0)$ has the stationary distribution $\pi$. For definiteness, we mean under $P_\pi = \sum_{i \in I} \pi_i P_i$. Note that $\hat X$ is also a Markov additive process. The characteristics of $(\hat J, \hat X)$ will be indicated by using a hat over the existing notation for the characteristics of $(J, X)$. Instead of talking about the process $(\hat J, \hat X)$, we shall also talk about the process $(J, X)$ under the probabilities $\{\hat P_i : i \in I\}$. Note also for future use, following classical time-reversed path analysis, that for $y \ge 0$ and $s \le t$, […] where $I(t) = \inf_{0 \le s \le t} X(s)$, $S(t) = \sup_{0 \le s \le t} X(s)$ and $\overline{G}(t) = \sup\{s < t : X(s) = S(s)\}$, $\underline{G}(t) = \sup\{s < t : X(s) = I(s)\}$. (A diagram may help to explain the last identity.) From now on, we assume that at least one of the processes $X_i$ is not a downward subordinator and compound Poisson process. To include a compound Poisson process $X^{(i)}(t)$ in the main Theorem 1(i), it is necessary to work with the new definition $\underline{G}(t) = \inf\{s < t : X(s) = I(s)\}$ instead of the previous one. Under the above assumption, we also have $\overline{G}(t) = \sup\{s \le t : X(s) = S(s)\}$ and $\underline{G}(t) = \sup\{s \le t : X(s) = I(s)\}$.

Ladder Height Process
We start by recalling the representation of the local time given in [19, formula (3.21)]. For a MAP, we say that a state $i \in \{1, \ldots, N\}$ is regular when […]. Denote by $\{U_n\}$ the stopping times at which $R(t-) = 0$ and $R(t) > 0$ for the […]. By Kaspi [19, Theorem 3.28] (see also [22]), for the MAP we can define the ladder height process […], choosing the local time […] (3), where $L^c(t)$ is a continuous additive process that increases only on $M$ and $e^{(n)}_1$ are independent exponential random variables with intensity 1. Obviously, to make the functional (3) measurable, we enlarge the probability space to include the exponential random variables. One can easily verify that $(L^{-1}(t), H(t), J(L^{-1}(t)))$ is again a (bivariate) MAP (see [19, p. 185]). For each moment of time, we can define the excursion: […]. From (3) it follows that the excursion process $\{(t, \epsilon_t), t \ge 0\}$ is a marked Cox point process (possibly stopped at the first excursion with infinite length) with the intensity $n(J(L^{-1}(t-)), d\epsilon)$ depending on the state process $J(L^{-1}(t-))$.
Denote by $\mathcal{E}$ the σ-field on the excursion state space.

Spectrally Negative Markov Additive Process
[…], for a spectrally negative MAP we can define the cumulant generating matrix (cgm) of the MAP $X(t)$: […], where $\psi_j(\alpha) = -\Psi_j(-i\alpha)$ for $\Psi_j$ defined in (1). Perron-Frobenius theory identifies $F(\alpha)$ as having a real-valued eigenvalue with maximal real part, which we shall label $\kappa(\alpha)$. The corresponding left and right $1 \times N$ eigenvectors we label $v(\alpha)$ and $h(\alpha)$, respectively. In this text, we shall always write vectors in their horizontal form and use the usual $T$ to mean transpose. Since $v(\alpha)$ and $h(\alpha)$ are given up to multiplicative constants, we are free to normalize them such that $v(\alpha)h(\alpha)^T = 1$ and $\pi h(\alpha)^T = 1$.
Note also that $h(0) = e$, the $1 \times N$ vector consisting of a row of ones. We shall write $h_i(\alpha)$ for the $i$th element of $h(\alpha)$. The eigenvalue $\kappa(\alpha)$ is a convex function (this can also be easily verified) such that $\kappa(0) = 0$ and $\kappa'(0)$ is the asymptotic drift of $X$ in the sense that for each $i \in I$ we have $\lim_{t \to \infty} X(t)/t = \kappa'(0)$ $P_i$-almost surely. For the right inverse of $\kappa$, we shall write $\Phi$.
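The properties of $\kappa$ listed above can be checked numerically in a simple special case. The sketch below assumes Markov-modulated Brownian motion with no jumps at switching epochs, for which the cumulant generating matrix reduces to $F(\alpha) = Q + \mathrm{diag}(\psi_i(\alpha))$ with $\psi_i(\alpha) = a_i\alpha + \sigma_i^2\alpha^2/2$; this special form, the example parameters and the helper names (`cgm`, `kappa`, `stationary`, `Phi`) are assumptions made for illustration only. In this case the asymptotic drift is $\kappa'(0) = \sum_i \pi_i a_i$.

```python
import numpy as np
from scipy.optimize import brentq


def cgm(alpha, Q, a, sigma):
    """F(alpha) = Q + diag(a_i*alpha + sigma_i^2*alpha^2/2).

    Assumption: Brownian components, no jumps at switching epochs.
    """
    Q, a, sigma = (np.asarray(v, dtype=float) for v in (Q, a, sigma))
    return Q + np.diag(a * alpha + 0.5 * sigma**2 * alpha**2)


def kappa(alpha, Q, a, sigma):
    """Eigenvalue of F(alpha) with maximal real part (real by Perron-Frobenius)."""
    return float(np.max(np.linalg.eigvals(cgm(alpha, Q, a, sigma)).real))


def stationary(Q):
    """Stationary row vector pi of J: pi Q = 0 with entries summing to one."""
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    lhs = np.vstack([Q.T, np.ones(n)])
    rhs = np.append(np.zeros(n), 1.0)
    return np.linalg.lstsq(lhs, rhs, rcond=None)[0]


def Phi(q, Q, a, sigma):
    """Right inverse of kappa at q > 0: the positive root of kappa(alpha) = q.

    Assumes kappa(alpha) -> infinity as alpha -> infinity (true here because
    at least one sigma_i is positive).
    """
    hi = 1.0
    while kappa(hi, Q, a, sigma) <= q:
        hi *= 2.0
    return brentq(lambda al: kappa(al, Q, a, sigma) - q, 0.0, hi)


Q = [[-1.0, 1.0], [2.0, -2.0]]
a = [1.0, -1.0]
sigma = [1.0, 0.5]
pi = stationary(Q)
drift = float(pi @ np.asarray(a))     # asymptotic drift kappa'(0) in this model
phi_2 = Phi(2.0, Q, a, sigma)         # kappa(phi_2) equals 2 up to solver tolerance
```

Because $\kappa$ is convex with $\kappa(0) = 0$, the equation $\kappa(\alpha) = q$ has a single positive root for $q > 0$, which is why a bracketing root finder is sufficient here.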
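It can be checked that under the following Girsanov change of measure,
$$\left.\frac{dP^\gamma_i}{dP_i}\right|_{\mathcal{F}_t} = e^{\gamma X(t) - \kappa(\gamma)t}\,\frac{h_{J(t)}(\gamma)}{h_i(\gamma)}, \qquad (5)$$
the process $(X, P^\gamma_i)$ is again a spectrally negative MAP whose intensity matrix $F_\gamma(\alpha)$ is well defined and finite for $\alpha \ge -\gamma$. Generally, for all quantities calculated for $P^\gamma$, we will add the subscript $\gamma$. Further, if $F_\gamma(\alpha)$ has largest eigenvalue $\kappa_\gamma(\alpha)$ and associated right eigenvector $h_\gamma(\alpha)$, then $\kappa_\gamma(\alpha) = \kappa(\alpha + \gamma) - \kappa(\gamma)$ and
$$F_\gamma(\alpha) = \Delta_h(\gamma)^{-1} F(\alpha + \gamma)\,\Delta_h(\gamma) - \kappa(\gamma) I, \qquad (6)$$
where $I$ is the $N \times N$ identity matrix and $\Delta_h(\gamma) = \mathrm{diag}(h_1(\gamma), \ldots, h_N(\gamma))$. Similarly, the time-reversed process $\hat X(t)$ is a spectrally negative MAP with the characteristics $\hat F$, $\hat h$, $\hat\kappa$. To relate them to the original ones, recall that the intensity matrix of $\hat J$ must satisfy $\hat Q = \Delta_\pi^{-1} Q^T \Delta_\pi$, where $\Delta_\pi$ is the diagonal matrix whose entries are given by the vector $\pi$. Hence according to (4) we find that $\hat F(\alpha) = \Delta_\pi^{-1} F(\alpha)^T \Delta_\pi$. Moreover, $\hat\kappa(\alpha) = \kappa(\alpha)$ and $\Delta_\pi \hat h(\alpha)^T = v(\alpha)^T$ (see [21] for details).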
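The effect of the change of measure and of time reversal on the triple $(F, \kappa, h)$ can be illustrated numerically. The sketch below assumes the usual exponential-tilting relation $F_\gamma(\alpha) = \Delta_h(\gamma)^{-1}F(\alpha+\gamma)\Delta_h(\gamma) - \kappa(\gamma)I$ and the reversal relation $\hat F(\alpha) = \Delta_\pi^{-1}F(\alpha)^T\Delta_\pi$, which is our reading of the relations referred to above, together with a Markov-modulated Brownian motion without switching jumps, so that $F(\alpha) = Q + \mathrm{diag}(a_i\alpha + \sigma_i^2\alpha^2/2)$. All function names and parameter values are hypothetical.

```python
import numpy as np


def cgm(alpha, Q, a, sigma):
    """F(alpha) = Q + diag(a_i*alpha + sigma_i^2*alpha^2/2) (assumed special case)."""
    Q, a, sigma = (np.asarray(v, dtype=float) for v in (Q, a, sigma))
    return Q + np.diag(a * alpha + 0.5 * sigma**2 * alpha**2)


def perron(M):
    """Eigenvalue of M with maximal real part and a positive right eigenvector.

    The eigenvector is scaled to sum to one; the tilted matrix below does not
    depend on this scaling.
    """
    w, V = np.linalg.eig(M)
    k = int(np.argmax(w.real))
    h = V[:, k].real
    return float(w[k].real), h / h.sum()


def tilted_cgm(alpha, gamma, Q, a, sigma):
    """F_gamma(alpha) under the assumed exponential-tilting relation."""
    kap, h = perron(cgm(gamma, Q, a, sigma))
    D = np.diag(h)
    return np.linalg.inv(D) @ cgm(alpha + gamma, Q, a, sigma) @ D - kap * np.eye(len(h))


def reversed_cgm(alpha, Q, a, sigma, pi):
    """Cumulant generating matrix of the time-reversed process (assumed relation)."""
    D = np.diag(np.asarray(pi, dtype=float))
    return np.linalg.inv(D) @ cgm(alpha, Q, a, sigma).T @ D


# Cyclic three-state chain: the stationary law is uniform and J is not reversible.
Q = np.array([[-1.0, 1.0, 0.0], [0.0, -1.0, 1.0], [1.0, 0.0, -1.0]])
a = [1.0, -1.0, 0.5]
sigma = [1.0, 0.5, 0.0]
pi = np.full(3, 1.0 / 3.0)

Q_gamma = tilted_cgm(0.0, 0.7, Q, a, sigma)   # intensity matrix of J under the tilted law
Q_hat = reversed_cgm(0.0, Q, a, sigma, pi)    # intensity matrix of the reversed chain
```

In this example `Q_gamma` has zero row sums and non-negative off-diagonal entries, so it is again an intensity matrix, and `Q_hat` equals the transpose of `Q` because $\pi$ is uniform.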
Here and throughout, we work with the definition that $e_q$ is a random variable which is exponentially distributed with mean $1/q$ and independent of $(J, X)$.
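As a small illustration of sampling at the independent exponential time $e_q$, the sketch below computes the matrix with entries $P_i(J(e_q) = j)$ for a finite-state chain with intensity matrix $Q$. It uses the standard resolvent identity $\int_0^\infty q e^{-qt} e^{Qt}\,dt = q(qI - Q)^{-1}$ and compares it with a truncated numerical integral; the function names, the example matrix and the truncation level are assumptions made for this example.

```python
import numpy as np
from scipy.integrate import quad
from scipy.linalg import expm


def law_at_exp_time(Q, q):
    """Matrix with entries P_i(J(e_q) = j), via the resolvent q (qI - Q)^{-1}."""
    Q = np.asarray(Q, dtype=float)
    return q * np.linalg.inv(q * np.eye(Q.shape[0]) - Q)


def law_by_quadrature(Q, q, cutoff=40.0):
    """Same matrix from the integral of q e^{-qt} e^{Qt}, truncated at cutoff/q."""
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            f = lambda t, i=i, j=j: q * np.exp(-q * t) * expm(Q * t)[i, j]
            out[i, j] = quad(f, 0.0, cutoff / q)[0]
    return out


Q = [[-1.0, 1.0], [2.0, -2.0]]
closed_form = law_at_exp_time(Q, q=1.5)
numerical = law_by_quadrature(Q, q=1.5)
```

Each row of the resulting matrix is a probability distribution; for large $q$ it approaches the identity matrix, and for small $q$ its rows approach the stationary distribution of $J$.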
As much as possible, from now on, we shall also prefer to work with matrix notation. For a random variable $Y$ and (random) time $\tau$, we shall understand $E(Y; J(\tau))$ as the matrix with the $(i, j)$th element $E_i(Y; J(\tau) = j)$. For an event $A$, $P(A; J(\tau))$ will be understood in a similar sense. Let $I_{ij}(q) = P_{i,0}(J(e_q) = j)$; in other words, $I(q) = q\,(qI - Q)^{-1}$, where $I$ without an argument is the $N \times N$ identity matrix. The spectrally negative MAP is easier to analyze since its ladder height process […], where $a \ge 0$. Denote the generator of the Markov process $\{J(\tau^+_a), a \ge 0\}$ by $\Lambda(q)$ on $P^{\Phi(q)}$. Pistorius [24] and Ivanovs et al. [18] show that it solves the following equation: […], where the above equation is understood as a matrix equation by putting $-\Lambda(q)$ in the place of $\alpha$ in (6) with $\gamma = \Phi(q)$ and the obvious meaning of an exponential matrix.
In other words, we should understand the latter as […]. For details, check [21] and [23, Prop. 5.6]. D'Auria et al. [6] give a numerical algorithm for calculating $\Lambda(q)$ based on the theory of Jordan chains. Note that the ladder height process can be identified as $\{(\tau^+_a, X(\tau^+_a) = a, J(\tau^+_a)), a \ge 0\}$. It is a bivariate Markov additive process with the cumulant generating matrix: […] for $\alpha, q > 0$. The above could be deduced from the equalities […] and Theorem 1 of Kyprianou and Palmowski [21] stating that […]
Remark 3 It is hard to give an explicit formula for the expression appearing in Theorem 1(ii), which depends on the matrix $\Xi$ defined in (10) and hence requires solving the matrix equation (8). There are only a few known examples; see, e.g., the examples given in [10]. There are, however, two numerical methods in the literature. The first one uses an iteration scheme, and the second one uses the theory of generalized Jordan chains; see, e.g., [1,6].
We also prove the following counterpart of the Spitzer-Rogozin version of the Wiener-Hopf factorization and the Fristedt theorem: […] Then $E\big(e^{-\alpha S(e_q) - \xi\overline{G}(e_q)}; J(e_q)\big)$ […] $e^{-qt}\,P(X(t) \in dx; J(t))$ […] $I(q)$ and $E\big(e^{\alpha I(e_q) - \xi\underline{G}(e_q)}; J(e_q)\big)$ […].

By the Markov property, assumption (20) heuristically means that, for the MAP $(X, J)$ on the time interval $t + s$, going above 0 at time $t$ and then below the present value is statistically equivalent to going first below 0 at time $s$ and then above the present value. This assumption is satisfied, for example, for the two-state Markov process $J(t)$ with $X(t) = 0$ when $J(t) = 1$ and $X(t) = B(t)$, a Brownian motion, when $J(t) = 2$. It is not satisfied, for example, for the two-state Markov process $J(t)$ with $q_{12} = q_{21} = \lambda$ and with $X(t) = t$ when $J(t) = 1$ and $X(t) = -t + B(t)$ when $J(t) = 2$. Indeed, take $\alpha = 0$ and note that for $s \downarrow 0$ we have […]. The following generalizations of Kendall's identity and the ballot theorem also hold.

Theorem 4 Let […], where $\{\sigma(t), t \ge 0\}$ is a Markov additive subordinator without the drift component and $c > 0$. Under the assumptions of Theorem 2, the following identity holds: […].

In summary, the theorems given here might be seen as a foundation of the fluctuation theory for (spectrally negative) MAPs and might serve for deriving counterparts of the well-known identities for Lévy processes.

Proof of Theorem 1
(i) Sampling the MAP $(X(t), J(t))$ up to an exponential random time $e_q$ corresponds to sampling the marked Cox point process (doubly stochastic Poisson point process) of the excursions up to time $L(e_q)$. Moreover, since, conditionally on a realization of the process $J(t)$, the point process $(t, \epsilon_t)$ is a non-homogeneous marked Poisson process, we know that, conditionally on $J(L^{-1}(\sigma_A-))$, for […] the point process $\{(t, \epsilon_t), t < \sigma_A\}$ is independent of $\epsilon_{\sigma_A}$. Indeed, for Borel sets $B_1$ and $B_2$ and $k_0 = \max\{k :$ […] Hence […] Consider now […] Note that $\sigma_2$ is $\sigma_A$ for $A = \{\zeta(\epsilon) > e_q\}$ and possibly $\sigma_1 = \infty$ (e.g., when the set of maxima has Lebesgue measure 0). If $\sigma_2 < \sigma_1$, then, conditionally on $J(L^{-1}((\sigma_1 \wedge \sigma_2)-)) = J(L^{-1}(\sigma_2-))$, the process $(t, \epsilon_t)$, $t < \sigma_1 \wedge \sigma_2$ and $\epsilon_t = \partial$, is independent of $\epsilon_{\sigma_2} = \epsilon_{\sigma_1 \wedge \sigma_2}$. If $\sigma_1 < \sigma_2$, then $\epsilon_{\sigma_1} = \epsilon_{\sigma_1 \wedge \sigma_2} = \partial$, which is also independent of the process (21). Hence, conditionally on $J(L^{-1}((\sigma_1 \wedge \sigma_2)-))$, the excursion $\epsilon_{\sigma_1 \wedge \sigma_2}$ is independent of the process (21). Note also that […] and the last excursion $\epsilon_{\sigma_1 \wedge \sigma_2}$ occupies the final $e_q - \overline{G}(e_q)$ units of time in the interval $[0, e_q]$ and reaches the depth $X(e_q) - S(e_q)$. The proof of the identities (22) follows completely the arguments given in [17, Sect. 4]. These identities complete the proof of the first part of Theorem 1(i). Note that $(e_q - \overline{G}(e_q), X(e_q) - S(e_q))$ has the same law as $(\underline{G}(e_q), I(e_q))$. The second part of Theorem 1(i) now follows from the first part applied to the reversed process.
Taking $n \to \infty$ we have $i^+_n \to S(e_q)$. Moreover, $S(e_q) \ge X(\tau^+_{i^+_n}) \ge i^+_n$ and $X(\overline{G}(e_q)) = S(e_q)$.
[…] where we use the fact that, by excluding compound Poisson processes $X_j$, the supremum $S(e_q)$ is uniquely attained. Hence the left-hand side of the above equation converges by the dominated convergence theorem. Thus the right-hand side also converges. Note that for any matrix $A$, […]. Thus $E\big(e^{-\xi\overline{G}(e_q) - \alpha S(e_q)}; J(e_q)\big) = \Xi(q + \xi, \alpha)^{-1} B$ for some matrix $B$, using the fact that the matrix $(\Phi(q + \xi) + \alpha)I - \Lambda(q + \xi)$ is invertible for $q > 0$. Taking $\xi = \alpha = 0$, we obtain […], which completes the proof of (16). Similarly, from (24), $E_i\big(e^{-\xi\tau^+_{k/n} - \alpha k/n}; \tau^+_{k/n} \le e_q; J(\tau^+_{k/n}) = j\big)\,P_j(e_q < \tau^+$ […] for $C = \lim_{n \to \infty} \mathrm{diag}\big(P_i(e_q < \tau^+_{1/n})\big)^{-1} P\big(e_q < \tau^+_{1/n}; J(e_q)\big) = \lim_{n \to \infty} P\big(J(e_q) \mid S(e_q) < 1/n\big)$.

Proof of Theorem 2
For a general matrix $A$ with distinct eigenvalues $\lambda_i$ (hence with independent eigenvectors $s_i$) such that $\mathrm{Re}\,\lambda_i > 0$, using the Frullani integral and the representation $A = S\,\mathrm{diag}\{\lambda_i\}\,S^{-1}$ with $S = (s_1, \ldots, s_N)$, for $q > 0$ we can derive the following identities: […] and […].

Lemma 1 Under assumption (20), for $\xi$ strictly larger than the largest real part of an eigenvalue of $F(\alpha)$, […]

Proof By additivity of the process $X(t)$, there exists a matrix $F$ such that $E \exp\{i\alpha X(t)\} = \exp\{F(i\alpha)t\}$ (see […]). […] Note that, by identity (20), the matrices […] commute. This gives the assertion of the lemma by the factorization.
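The kind of Frullani-type identity invoked at the start of this proof can be checked numerically. The scalar Frullani integral gives $\int_0^\infty (1 - e^{-\lambda t})\,t^{-1}e^{-qt}\,dt = \log((q + \lambda)/q)$ for $\mathrm{Re}\,\lambda > 0$, and applying it along the eigen-decomposition of $A$ yields $q(qI + A)^{-1} = \exp\{-\int_0^\infty (I - e^{-At})\,t^{-1}e^{-qt}\,dt\}$. The sketch below verifies this matrix form for one example; it is offered as an illustration of the technique and is not claimed to be the exact pair of identities used in the proof. The function name, the example matrix and the truncation level are assumptions.

```python
import numpy as np
from scipy.integrate import quad
from scipy.linalg import expm


def frullani_integral(A, q, cutoff=40.0):
    """Entrywise integral of (I - e^{-At}) e^{-qt} / t over (0, cutoff/q].

    The integrand tends to A as t -> 0, so there is no singularity at the
    origin; the tail beyond cutoff/q is of order e^{-cutoff} and is dropped.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    eye = np.eye(n)
    M = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            f = lambda t, i=i, j=j: (eye - expm(-A * t))[i, j] * np.exp(-q * t) / t
            M[i, j] = quad(f, 0.0, cutoff / q)[0]
    return M


# Example with distinct eigenvalues 1 and 2.5, both with positive real part.
A = np.array([[2.0, -1.0], [-0.5, 1.5]])
q = 1.0
lhs = q * np.linalg.inv(q * np.eye(2) + A)
rhs = expm(-frullani_integral(A, q))
```

For $N = 1$ and $A = \lambda$ this reduces to the Laplace transform $E e^{-\lambda e_q} = q/(q + \lambda)$ of the exponential time written in exponential (Frullani) form.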
From Lemma 1 and Theorem 1, using classical extension arguments for $\xi \ge 0$, we have […], where $H(\alpha, \xi) = E\big(e^{i\alpha S(e_q) - \xi\overline{G}(e_q)}; J(e_q)\big)\,I(q)^{-1}$ and […]. From Theorem 1(i), it follows that the matrices $H(\alpha, \xi)$ and $T(\alpha, \xi)$ are invertible. Thus, […]. Moreover, each entry of the matrix $H(\alpha, \xi)$ is analytic in the upper half of the complex plane. The same applies to the matrix $H^{-1}(\alpha, \xi)$. Thus each entry of the LHS of (34) extends analytically to the lower half of the complex plane in $\alpha$, and similarly each entry of the matrix on the RHS of (34) extends analytically to the upper half of the complex plane in $\alpha$. Hence, by Morera's theorem, the matrices on both sides of (34) can be defined in the whole $\alpha$-plane. Observe that each entry of these matrices is a continuous and bounded function. Indeed, from definitions (32) and (33), by Jensen's inequality it follows that each entry of the matrices $H(\alpha, \xi)$ and $T(\alpha, \xi)$ is bounded in the respective regions. Note that the reciprocal of the determinant of $H(\alpha, \xi)$ is also bounded. Indeed, by (10), (16), and (32), we have […] as $|\alpha| \to \infty$ in the upper half of the complex plane, where $f(\alpha) \sim g(\alpha)$ means that $f(\alpha)/g(\alpha) \to 1$. Now the fact that each entry of $H^{-1}(\alpha, \xi)$ is bounded follows from the Phragmén-Lindelöf theorem (see [11, Corr. 4.4]) and the asymptotics […], which is a consequence of (35). Similarly, one can prove that each entry of the second factors of the RHS and LHS of (34) is bounded. Thus, by Liouville's theorem, each entry of (34) must be a constant. Putting $\alpha = \xi = 0$ gives the assertion of the theorem.
In view of Theorem 2, this completes the proof.

Proof of Theorem 4
If there is an atom at $x = ct$ in $P(X(t) \in dx; J(t))$, then the assertion of the theorem remains true. Assume now that $P(X(t) \in dx; J(t))$ is absolutely continuous. By Kendall's identity given in Theorem 3, it then suffices to prove that $P(X(t) \in dx, I(t) = 0; J(t))\,dt = \frac{1}{c}\,P(\tau^+_x \in dt; J(t))\,dx$, or that for all $q > 0$ and sufficiently large $s > 0$ […] (37). We prove (37) by passing from its left-hand side to its right-hand side. Let $\tilde q = q - \kappa(s)$. The change of measure (5) […], where $\pi_s$ is a stationary measure of $X$ under $P^s_i$. Note that […]. Hence $\lim_{\alpha \to \infty} E\big(e^{sX(e_q) + \alpha I(e_q)}; J(e_q)\big) = \frac{q}{c}\,\Delta_h(s)\,\Delta_{\pi_s}^{-1}\,\big(\Xi_s(q, 0)^{-1}\big)^T\,\Delta_{\pi_s}\,\Delta_h(s)^{-1}$.
Using classical arguments for the reversed process, note that under $P^s_{\pi_s}$ we have […]. We can now proceed as follows: […], where in the first equality we use (11) and in the third one we apply (38). The last equality gives the right-hand side of (37), which completes the proof.