Consider the general setup of Sect. 2.1 where the Markov chain \(\{ X(t); t \ge 0\}\) moves among m transient states before it is absorbed in state \(m + 1\). Suppose now instead that there are \(K > 1\) absorbing states, named \(m+1,m+2,\ldots ,m+K\), say. Let T be the time of absorption in any one of the absorbing states, and let the cause C represent the state where absorption occurs, defining \(C=j\) if \(X(T) = m+j; \;j=1,2,\ldots ,K\). Then the pair (T, C) can be viewed as an observation from a classical competing risks case with possible causes \(1,\ldots ,K\).
The ordinary Coxian phase-type model can now be extended to the competing risks case by allowing transitions to any of the K absorbing states \(m+1,\ldots ,m+K\) from each of the transient states. The case \(K=2\) is illustrated in Fig. 3.
Fig. 3 Representation of phase-type distributions for competing risks.
By extending the matrix (1) to include K absorbing states, we obtain the infinitesimal transition matrix of the modified Markov process to be the \((m+K) \times (m+K)\) matrix given in block form as
$$\begin{aligned} {{\mathbf {A}}}= \left[ \begin{array}{cc} {{\mathbf {Q}}}&{} \quad {{\mathbf {L}}}\\ {{\mathbf {0}}}_1 &{} \quad {{\mathbf {0}}}_2 \end{array} \right] . \end{aligned}$$
(15)
Here \({{\mathbf {Q}}}\) is the \(m \times m\) matrix corresponding to transitions between the transient states, while the m-vector \({\varvec{\ell }}\) is replaced by the \(m \times K\) matrix \({{\mathbf {L}}}\) of transition intensities from the transient states to the absorbing states. Further, \({{\mathbf {0}}}_1\) and \({{\mathbf {0}}}_2\) are, respectively, \(K \times m\) and \(K \times K\) matrices of zeros.
Similarly to the derivation of (2), we obtain the matrix of transition probabilities \(P_{ij}(t)\) given by
$$\begin{aligned} {{\mathbf {P}}}(t) = \left[ \begin{array}{cc} e^{{{\mathbf {Q}}}t} &{} \quad {{\mathbf {Q}}}^{-1}(e^{{{\mathbf {Q}}}t} - {{\mathbf {I}}}){{\mathbf {L}}}\\ {{\mathbf {0}}}_1 &{} \quad {{\mathbf {I}}}\end{array} \right] . \end{aligned}$$
(16)
Let the m-vector \({{\mathbf {p}}}\) be the initial distribution of the Markov chain. The triple \(({{\mathbf {p}}},{{\mathbf {Q}}},{{\mathbf {L}}})\) thus determines a competing risks distribution.
Observe that \({\varvec{\ell }}\equiv {{\mathbf {L}}}{{\mathbf {1}}}\) is the vector of transition rates from the transient states to the lumped set of absorbing states, and hence corresponds to the vector \({\varvec{\ell }}\) of the ordinary phase-type model considered in Sect. 2.1. Since each row of \({{\mathbf {A}}}\) sums to zero, it follows that \({{\mathbf {L}}}{{\mathbf {1}}}= - {{\mathbf {Q}}}{{\mathbf {1}}}\). Thus the \(m\times K\) matrix \({{\mathbf {L}}}\) satisfies m linear constraints determined by \({{\mathbf {Q}}}\) and hence has \(Km-m=(K-1)m\) free parameters. Adding these to the \(2m-1\) parameters of \(({{\mathbf {p}}},{{\mathbf {Q}}})\), we have \((K+1)m-1\) parameters in the representation \(({{\mathbf {p}}},{{\mathbf {Q}}},{{\mathbf {L}}})\).
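The constraint \({{\mathbf {L}}}{{\mathbf {1}}}= - {{\mathbf {Q}}}{{\mathbf {1}}}\) and the parameter count can be checked mechanically. The following is a minimal sketch; the function names and the numerical values of \({{\mathbf {Q}}}\) and \({{\mathbf {L}}}\) are illustrative assumptions, not part of the model definition.

```python
from fractions import Fraction

def check_exit_rates(Q, L):
    """Return True if L 1 = -Q 1, i.e. every row of [Q L] sums to zero."""
    return all(sum(q_row) + sum(l_row) == 0 for q_row, l_row in zip(Q, L))

def n_parameters(m, K):
    """Count (K+1)m - 1 parameters: 2m-1 for (p, Q) plus (K-1)m for L."""
    return (2 * m - 1) + (K - 1) * m

# Illustrative values with m = 2 transient states and K = 2 causes.
Q = [[Fraction(-4), Fraction(1)], [Fraction(0), Fraction(-5)]]
L = [[Fraction(2), Fraction(1)], [Fraction(3), Fraction(2)]]

print(check_exit_rates(Q, L))  # True
print(n_parameters(2, 2))      # 5
```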
From (16) we obtain expressions for the subdistribution functions, given by
$$\begin{aligned} F_j(t) = P(T \le t, C=j) = P(X(t)=m+j) = {{\mathbf {p}}}' {{\mathbf {Q}}}^{-1}(e^{{{\mathbf {Q}}}t} - {{\mathbf {I}}}){\varvec{\ell }}_j \end{aligned}$$
for \(j=1,\ldots ,K\), where \({{\mathbf {p}}}\) is the m-vector defining the initial distribution of the Markov chain and \({\varvec{\ell }}_j\) is the jth column of \({{\mathbf {L}}}\). By differentiation we get the subdensities
$$\begin{aligned} f_j(t)=F_j'(t) ={{\mathbf {p}}}'e^{{\mathbf {Q}}t}{\varvec{\ell }}_j, \end{aligned}$$
(17)
while the cause-specific hazard rates are given by
$$\begin{aligned} \lambda _j(t)= \lim _{\varDelta t \rightarrow 0} \frac{P(T \le t+\varDelta t,C=j|T>t)}{\varDelta t} = \frac{f_j(t)}{S(t)} = \frac{{{\mathbf {p}}}' e^{{{\mathbf {Q}}}t} {\varvec{\ell }}_j}{{{\mathbf {p}}}' e^{{{\mathbf {Q}}}t} {{\mathbf {1}}}} . \end{aligned}$$
(18)
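Formulas (17) and (18) are straightforward to evaluate numerically. The sketch below uses only the Python standard library; the helper names, the simple scaling-and-squaring Taylor approximation of the matrix exponential, and the numerical values of \(({{\mathbf {p}}},{{\mathbf {Q}}},{{\mathbf {L}}})\) are assumptions made for illustration. In practice a library routine for the matrix exponential would normally be preferred.

```python
import math

def mat_mul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_exp(M, terms=30):
    """exp(M) by scaling and squaring with a truncated Taylor series."""
    n = len(M)
    norm = max(sum(abs(x) for x in row) for row in M)
    squarings = max(0, math.ceil(math.log2(norm))) + 1 if norm > 0 else 0
    scaled = [[x / 2 ** squarings for x in row] for row in M]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + s for r, s in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def subdensities_and_hazards(p, Q, L, t):
    """Return ([f_j(t)], [lambda_j(t)]) following (17) and (18)."""
    m, K = len(Q), len(L[0])
    E = mat_exp([[q * t for q in row] for row in Q])
    pE = [sum(p[i] * E[i][k] for i in range(m)) for k in range(m)]  # p' e^{Qt}
    S = sum(pE)                                                     # p' e^{Qt} 1
    f = [sum(pE[k] * L[k][j] for k in range(m)) for j in range(K)]  # subdensities
    return f, [fj / S for fj in f]                                  # hazards

# Illustrative model with m = 2 and K = 2.
p = [1.0, 0.0]
Q = [[-4.0, 1.0], [0.0, -5.0]]
L = [[2.0, 1.0], [3.0, 2.0]]
f, hazards = subdensities_and_hazards(p, Q, L, t=0.3)
print(f, hazards)
```

For these values the subdensities have the closed forms \(5e^{-4t}-3e^{-5t}\) and \(3e^{-4t}-2e^{-5t}\), which gives a simple check of the numerical output.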
We shall below also need the Laplace transforms of the subdensities \(f_j(t)\). Similarly to (6) we get
$$\begin{aligned} f_j^*(s) = {{\mathbf {p}}}' (s{{\mathbf {I}}}-{{\mathbf {Q}}})^{-1}{\varvec{\ell }}_j. \end{aligned}$$
(19)
Identifiability of phase-type models for competing risks
Let \(({{\mathbf {p}}},{{\mathbf {Q}}},{{\mathbf {L}}})\) be a phase-type model for competing risks. We shall call it nonredundant if \(({{\mathbf {p}}},{{\mathbf {Q}}})\) is a nonredundant phase-type model as defined in Sect. 2.1.
The next theorem extends the result of Theorem 1 to the competing risks case. The proof is given in Appendix A3.
Theorem 4
Let \(({{\mathbf {p}}}^{(a)},{{\mathbf {Q}}}^{(a)},{{\mathbf {L}}}^{(a)})\) and \(({{\mathbf {p}}}^{(b)},{{\mathbf {Q}}}^{(b)},{{\mathbf {L}}}^{(b)})\) be two nonredundant phase-type representations for competing risks, having subdistribution functions \(F_j^{(a)}(t)\) and \(F_j^{(b)}(t)\), respectively. Then \(F_j^{(a)}(t)=F_j^{(b)}(t)\) for all t and j if and only if there exists a nonsingular \(m \times m\) matrix \({{\mathbf {B}}}\) with \({{\mathbf {B}}}{{\mathbf {1}}}= {{\mathbf {1}}}\) such that \({{{\mathbf {p}}}^{(b)}}' = {{{\mathbf {p}}}^{(a)}}'{{\mathbf {B}}}\), \({{\mathbf {Q}}}^{(b)} = {{\mathbf {B}}}^{-1} {{\mathbf {Q}}}^{(a)}{{\mathbf {B}}}\) and \({{\mathbf {L}}}^{(b)} = {{\mathbf {B}}}^{-1} {{\mathbf {L}}}^{(a)}\).
Coxian phase-type models for competing risks
In the present subsection we specialize to the case of Coxian phase-type distributions for competing risks. These will be defined as the triple \(({{\mathbf {p}}},{{\mathbf {Q}}},{{\mathbf {L}}})\) where \({{\mathbf {p}}}=(1,0,\ldots ,0)\), \({{\mathbf {Q}}}\) is given by (10), and \({{\mathbf {L}}}\) is an \(m \times K\) matrix defined in the same way as in Sect. 3.1. Recalling that \({{\mathbf {Q}}}\) has \(2m-1\) parameters, it follows from the reasoning of the cited subsection that the Coxian model has \((K+1)m-1\) parameters.
We can now prove the following result which extends Theorem 3 and is a simple consequence of Theorem 4. As for Theorem 3, we build upon the reasoning in Rizk et al. (2019).
Theorem 5
Consider two nonredundant Coxian phase-type distributions for competing risks, given by \(({{\mathbf {p}}}^{(a)},{{\mathbf {Q}}}^{(a)},{{\mathbf {L}}}^{(a)})\) and \(({{\mathbf {p}}}^{(b)},{{\mathbf {Q}}}^{(b)},{{\mathbf {L}}}^{(b)})\), where \({{\mathbf {p}}}^{(a)} = {{\mathbf {p}}}^{(b)}=(1,0,\ldots ,0)^\prime \). Assume further that the diagonals of \({{\mathbf {Q}}}^{(a)}\) and \({{\mathbf {Q}}}^{(b)}\) are ordered in the same way. Then if \(F_j^{(a)}(t) = F_j^{(b)}(t)\) for all \(t>0\) and \(j=1,2,\ldots ,K\), we have
$$\begin{aligned} {{\mathbf {Q}}}^{(b)}= & {} {{\mathbf {Q}}}^{(a)} \\{{\mathbf {L}}}^{(b)}= & {} {{\mathbf {L}}}^{(a)} \end{aligned}$$
Proof
Let \({{\mathbf {B}}}\) be the invertible matrix obtained by applying Theorem 4 to the present situation. Corollary 1 of Rizk et al. (2019) shows that if \({{\mathbf {p}}}^{(a)} = {{\mathbf {p}}}^{(b)}=(1,0,\ldots ,0)^\prime \), then the matrix \({{\mathbf {B}}}\) is lower triangular. Moreover, they show that if the ordering of the diagonal elements of \({{\mathbf {Q}}}^{(a)}\) and \({{\mathbf {Q}}}^{(b)}\) is the same, then \({{\mathbf {B}}}\) is the identity matrix and hence \({{\mathbf {Q}}}^{(a)} = {{\mathbf {Q}}}^{(b)}\). Since \({{\mathbf {B}}}\) is the identity matrix, we conclude from Theorem 4 that \({{\mathbf {L}}}^{(b)} = {{\mathbf {L}}}^{(a)}\) and we are done. \(\square \)
Example 5
(An example of non-uniqueness) Lindqvist and Kjølen (2018) considered the following situation. Let \(m=K=2\) and consider the two Coxian models with \({{\mathbf {p}}}^{(a)}={{\mathbf {p}}}^{(b)}=(1,0)'\) and, respectively,
$$\begin{aligned} {{\mathbf {Q}}}^{(a)} = \left( \begin{array}{rrr} -4 &{} \quad ~ &{} 1 \\ 0 &{} \quad ~ &{} -5 \end{array} \right) , \; \; {{\mathbf {L}}}^{(a)} = \left( \begin{array}{ccc} 2 &{} \quad ~ &{} \quad 1 \\ 3 &{} \quad ~ &{} \quad 2 \end{array} \right) \end{aligned}$$
(20)
and
$$\begin{aligned} {{\mathbf {Q}}}^{(b)} = \left( \begin{array}{rrr} -5 &{} \quad ~ &{} \quad 2 \\ 0 &{} \quad ~ &{} \quad -4 \end{array} \right) , \; \; {{\mathbf {L}}}^{(b)} = \left( \begin{array}{ccc} 2 &{} \quad ~ &{} \quad 1 \\ 5/2 &{} \quad ~ &{} \quad 3/2 \end{array} \right) \end{aligned}$$
(21)
It was shown that these different representations lead to the same subdensities, namely
$$\begin{aligned} f_1(t)= & {} 5e^{-4t} - 3e^{-5t} \end{aligned}$$
(22)
$$\begin{aligned} f_2(t)= & {} 3e^{-4t} - 2e^{-5t} \end{aligned}$$
(23)
Since \({{\mathbf {Q}}}^{(a)} \ne {{\mathbf {Q}}}^{(b)}\) this shows that the condition of equally ordered diagonals in Theorem 5 cannot be removed. Note also that Theorem 4 can be used to show that the two representations of (20) and (21) give rise to the same competing risks model, using the matrix
$$\begin{aligned} {{\mathbf {B}}}= \left( \begin{array}{rcc} 1 &{} \quad ~ &{} \quad 0 \\ -1 &{} \quad ~ &{} \quad 2 \end{array} \right) . \end{aligned}$$
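The conditions of Theorem 4 can be verified for this \({{\mathbf {B}}}\) in exact arithmetic. The following sketch uses Python's `fractions` module; the helper names `mat_mul` and `inv2` are introduced here only for illustration.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Representations (20) and (21), and the matrix B of Example 5.
Qa = [[F(-4), F(1)], [F(0), F(-5)]]
La = [[F(2), F(1)], [F(3), F(2)]]
Qb = [[F(-5), F(2)], [F(0), F(-4)]]
Lb = [[F(2), F(1)], [F(5, 2), F(3, 2)]]
B = [[F(1), F(0)], [F(-1), F(2)]]
p = [[F(1), F(0)]]  # the row vector p'

Binv = inv2(B)
checks = {
    "B 1 = 1": [sum(row) for row in B] == [1, 1],
    "p' B = p'": mat_mul(p, B) == p,
    "B^-1 Qa B = Qb": mat_mul(mat_mul(Binv, Qa), B) == Qb,
    "B^-1 La = Lb": mat_mul(Binv, La) == Lb,
}
for name, ok in checks.items():
    print(name, ok)  # each line ends with True
```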
Canonical representation of Coxian competing risks distributions
The recent article by Rizk et al. (2021) motivates an extension of the canonical model \(({{\tilde{{{\mathbf {p}}}}}},{{\tilde{{{\mathbf {Q}}}}}})\) of Sect. 2.3 to the case of multiple absorbing states. These authors model an emergency department with patients moving through a series of service stations, where each station is modeled by a Coxian phase-type distribution. To model the movement between stations, they include an additional absorbing state in each station. Hence one absorbing state represents patients leaving the hospital, while the other represents movement to the next station. Their key idea is to model the case with two absorbing states using a mixture of two series models like the one considered in Fig. 2. As they note, this approach facilitates statistical inference and the inclusion of covariates. For example, matrix exponentials can then be avoided in the likelihood function.
In the following we use their idea to extend the canonical model of Sect. 2.3 to involve an arbitrary number K of absorbing states, thus obtaining a canonical model for the Coxian competing risks situation. We also demonstrate below that properties of the single absorbing state case (Sect. 2.3) imply that this is a valid representation for any competing risks representation \(({{\mathbf {p}}},{{\mathbf {Q}}},{{\mathbf {L}}})\) with upper triangular \({{\mathbf {Q}}}\).
As indicated above, the idea is essentially to involve one canonical model of the form \(({{\tilde{{{\mathbf {p}}}}}},{{\tilde{{{\mathbf {Q}}}}}})\) (or see Fig. 2) for each absorbing state, and then consider a mixture of them to represent the full competing risks situation. This is illustrated in Fig. 4 for the case \(K=2\). Let \(p_{ij}\) be the probability of entering the system in state \(m-i+1\) \((i=1,2,\ldots ,m)\) and being absorbed in state \(m+j\) (\(j=1,2,\ldots ,K\)). Once entered, the transition rates leading to the absorbing states are identical for each subchain. The parameters of the model are now the \(p_{ij}\), which sum to 1 and hence contribute \(Km-1\) parameters, in addition to the m transition rates \(\lambda _i\). Altogether, this gives \((K+1)m-1\) parameters.
As in the single absorbing state case, the Laplace transforms of the subdensities \(f_j(t)\) of the above representation can be given in the form [recall (12)]
$$\begin{aligned} f_j^*(s) = \sum _{i=1}^m p_{ij} g^*_{(i)}(s) \end{aligned}$$
(24)
where the \(g^*_{(i)}(s)\) are as given by (13).
Consider then an arbitrary representation \(({{\mathbf {p}}},{{\mathbf {Q}}},{{\mathbf {L}}})\) where \({{\mathbf {Q}}}\) is upper triangular. The corresponding Laplace transforms, defined in (19), are of the form
$$\begin{aligned} {{\hat{f}}}_j^*(s) = \frac{{{\hat{N}}}_j(s)}{{{\hat{D}}}(s)} \end{aligned}$$
with \({{\hat{D}}}(s) = \prod _{i=1}^m(s+\lambda _i)\). O’Cinneide (1989) pointed out that the canonical representation \(({{\tilde{{{\mathbf {p}}}}}},\tilde{{\mathbf {Q}}})\) of Cumani (1982) also holds for subdensities (actually, O’Cinneide in his papers has included the possibility of a positive probability for \(T=0\)). This proves that there are nonnegative \(p_{ij}\) such that
$$\begin{aligned} {{\hat{f}}}_j^*(s) = \sum _{i=1}^m p_{ij} g^*_{(i)}(s), \end{aligned}$$
which by (24) implies that the representation presented above is equivalent to the representation \(({{\mathbf {p}}},{{\mathbf {Q}}},{{\mathbf {L}}})\). Indeed, the nonnegativity of the \(p_{ij}\) is guaranteed by the result of Cumani (1982) and is a consequence of requiring \(\lambda _1 \ge \lambda _2 \ge \ldots \ge \lambda _m\). That the \(p_{ij}\) sum to 1 follows since \(\sum _{j=1}^K {{\hat{f}}}_j^*(s)\) is the Laplace transform of the absorption time T, which is given in (12). Validity and uniqueness of the new representation can now be proven in the same manner as for the single absorbing state case, as shown in Cumani (1982). Note the requirement that \(\lambda _1 \ge \lambda _2 \ge \ldots \ge \lambda _m\).
In a way similar to the single absorbing state case, an equivalent version of the above canonical model can be given in the form of a Coxian competing risks model with \({{\mathbf {Q}}}\) as given in (10) and \(\lambda _1 \ge \lambda _2 \ge \ldots \ge \lambda _m\). Rizk et al. (2021) present formulas for going from the above canonical representation to the Coxian model representation for the case \(K=2\).
Example 5 (cont.) We show how to derive the canonical representation of the competing risks model considered in the first part of Example 5. First, calculate the Laplace transforms corresponding to \(f_1(t)\) and \(f_2(t)\) in (22) and (23),
$$\begin{aligned} f^*_1(s)= & {} \frac{5}{s+4}-\frac{3}{s+5} = \frac{2s + 13}{(s+4)(s+5)} \\ f^*_2(s)= & {} \frac{3}{s+4}-\frac{2}{s+5} = \frac{s+7}{(s+4)(s+5)} \end{aligned}$$
With notation as above, we further get by letting \(\lambda _1=5, \lambda _2=4\),
$$\begin{aligned} g^*_{(1)}(s)= & {} \frac{\lambda _1}{s+\lambda _1} = \frac{5}{s+5} \\ g^*_{(2)}(s)= & {} \frac{\lambda _1 \lambda _2}{(s+\lambda _1)(s+\lambda _2)} = \frac{20}{(s+5)(s+4)} \end{aligned}$$
By equating coefficients on each side of (24) with the above functions, we arrive at
$$\begin{aligned} p_{11} = 0.40, \; p_{21} = 0.25, \;p_{12} = 0.20, \;p_{22} = 0.15. \end{aligned}$$
The alternative canonical Coxian model turns out to be simply \(({{\mathbf {Q}}}^{(b)},{{\mathbf {L}}}^{(b)})\), given in the beginning of Example 5.
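The coefficient matching in this example can be written out explicitly. For \(m=2\), putting the right-hand side of (24) over the common denominator \((s+\lambda _1)(s+\lambda _2)\) gives the numerator \(p_{1j}\lambda _1(s+\lambda _2)+p_{2j}\lambda _1\lambda _2\), which is then equated to the numerator \(a_js+b_j\) of \(f_j^*(s)\). A minimal sketch in exact arithmetic follows; the function name and the symbols \(a_j, b_j\) are introduced here for illustration only.

```python
from fractions import Fraction as F

def canonical_weights_m2(a, b, lam1, lam2):
    """Solve a*s + b = p1*lam1*(s + lam2) + p2*lam1*lam2 for (p1, p2)."""
    p1 = F(a) / lam1                             # coefficient of s
    p2 = (F(b) - F(a) * lam2) / (lam1 * lam2)    # constant term
    return p1, p2

lam1, lam2 = 5, 4
p11, p21 = canonical_weights_m2(2, 13, lam1, lam2)  # numerator 2s + 13 of f_1^*
p12, p22 = canonical_weights_m2(1, 7, lam1, lam2)   # numerator s + 7 of f_2^*

print(p11, p21, p12, p22)       # 2/5 1/4 1/5 3/20
print(p11 + p21 + p12 + p22)    # 1
```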
Maximum likelihood estimation in the canonical model
As noted by Rizk et al. (2021), the representation for Coxian competing risks models given in the previous subsection is very well suited for statistical inference. For simplicity, we consider below the case of right-censored competing risks data without covariates. A note on the inclusion of covariates, following the suggestion of Rizk et al. (2021), is given at the end.
Let the observed survival time for individual k (\(k=1,2,\ldots ,N\), say) be \(T_k\) and let \(D_k\) be the corresponding status variable, where
$$\begin{aligned} D_k = \left\{ \begin{array}{cl} 0 &{} \quad \text{ if }\ T_k\ \text{ is } \text{ a } \text{(right) } \text{ censoring } \text{ time } \\ j &{} \quad \text{ if }\ T_k\ \text{ is } \text{ the } \text{ time } \text{ of } \text{ absorption } \text{ in } \text{ state }\ m+j, j=1,2,\ldots ,K \end{array} \right. . \end{aligned}$$
Consider the modeling of these data using the model of Sect. 3.4. The parameters are hence \(p_{ij}\) for \(i=1,\ldots ,m; \;j=1,2,\ldots ,K\), noting that \(p_{mK} = 1-\sum _{(i,j) \ne (m,K)}p_{ij}\), and \(\lambda _1,\ldots ,\lambda _m\). For the latter parameters, we shall assume strict inequalities in \(\lambda _1> \ldots > \lambda _m\), which simplifies the likelihood and which seems reasonable when there is no a priori modeled connection between the \(\lambda _i\).
Several authors have suggested using the EM-algorithm for maximum likelihood estimation in phase-type models (see, e.g., Asmussen et al. 1996). The key idea is to let the states visited by the Markov chain on the way to absorption define the latent observations. This simplifies the full likelihood and its maximization in the M-step, while the E-step is usually also rather straightforward to perform. As we shall see, the canonical representation considered here allows a very simple and transparent use of the EM-algorithm.
Let then the (latent) starting state for individual k be defined by the vector
$$\begin{aligned} {{\mathbf {X}}}_k = (X_{k,ij}; i=1,\ldots ,m, \; j=1,\ldots ,K), \end{aligned}$$
where \(X_{k,ij} = 1\) if the kth individual starts in state \(m-i+1\) in the jth subchain, while the other entries in \({{\mathbf {X}}}_k\) equal 0.
The likelihood contribution for an item that starts in state \(m-i+1\) with \(i \in \{1,2,\ldots ,m\}\) and is absorbed in state \(m+j\) for \(j \in \{1,2,\ldots ,K\}\) after a time t is
$$\begin{aligned} p_{ij} g_{(i)}(t;{\varvec{\lambda }}), \end{aligned}$$
where \(p_{ij}\) is defined in Sect. 3.4, \(g_{(i)}(t;{\varvec{\lambda }})\) is the density of \(U_1+\ldots +U_i\) when \(U_{\ell }\) for \(\ell =1,2,\ldots ,m\) is the waiting time in state \(\ell \), which is exponentially distributed with rate \(\lambda _\ell \), and \({\varvec{\lambda }}=(\lambda _1,\ldots ,\lambda _m)\). The density \(g_{(i)}\) is given by
$$\begin{aligned} g_{(i)}(t;{\varvec{\lambda }}) = \sum _{\ell =1}^i\frac{\prod _{u=1}^i \lambda _u}{\prod _{u \ne \ell }^i (\lambda _u-\lambda _\ell )}e^{-\lambda _\ell t}. \end{aligned}$$
For right censored observations, the likelihood contribution for an item that starts in state \(m-i+1\) in the jth subchain, and is right censored at time t, is
$$\begin{aligned} p_{ij} S_{(i)}(t;{\varvec{\lambda }}). \end{aligned}$$
Here, \(S_{(i)}(t;{\varvec{\lambda }})\) is the survival function of \(U_1+\ldots +U_i\), given by
$$\begin{aligned} S_{(i)}(t;{\varvec{\lambda }}) = \sum _{\ell =1}^i\frac{\prod _{u=1,u\ne \ell }^i \lambda _u}{\prod _{u \ne \ell }^i (\lambda _u-\lambda _\ell )}e^{-\lambda _\ell t}. \end{aligned}$$
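The two closed-form expressions for \(g_{(i)}\) and \(S_{(i)}\) translate directly into code. The sketch below assumes distinct rates, as in the text; the function names and the numerical rates are illustrative assumptions, and indices are zero-based so that `lam[0]` holds \(\lambda _1\).

```python
import math

def hypoexp_density(i, t, lam):
    """g_(i)(t; lam): density of U_1 + ... + U_i for distinct rates lam[0..i-1]."""
    total = 0.0
    for l in range(i):
        num = math.prod(lam[:i])
        den = math.prod(lam[u] - lam[l] for u in range(i) if u != l)
        total += num / den * math.exp(-lam[l] * t)
    return total

def hypoexp_survival(i, t, lam):
    """S_(i)(t; lam): survival function of U_1 + ... + U_i."""
    total = 0.0
    for l in range(i):
        num = math.prod(lam[u] for u in range(i) if u != l)
        den = math.prod(lam[u] - lam[l] for u in range(i) if u != l)
        total += num / den * math.exp(-lam[l] * t)
    return total

lam = [5.0, 4.0, 1.0]  # illustrative rates with lam_1 > lam_2 > lam_3
for i in (1, 2, 3):
    print(i, hypoexp_density(i, 0.3, lam), hypoexp_survival(i, 0.3, lam))
```

A simple sanity check is that \(S_{(i)}(0)=1\) and that \(-S_{(i)}'(t)\) agrees with \(g_{(i)}(t)\), for instance by a finite difference.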
The contribution to the log-likelihood function from a single individual k can then be written
$$\begin{aligned} L_k= & {} \sum _{j=1}^K \sum _{i=1}^m X_{k,ij}I(D_k=j)(\log (p_{ij})+\log (g_{(i)}(T_k;{\varvec{\lambda }})))\\+ & {} \sum _{j=1}^K \sum _{i=1}^m X_{k,ij}I(D_k=0)(\log (p_{ij})+\log (S_{(i)}(T_k;{\varvec{\lambda }}))), \end{aligned}$$
so the full log-likelihood function for the data plus latent variables is \(L = \sum _{k=1}^N L_k\).
For the E-step of the algorithm we get for \(j=1,\ldots ,K\),
$$\begin{aligned} E(X_{k,ij}|D_k=j,T_k=t)= \frac{g_{(i)}(t;{\varvec{\lambda }})p_{ij}}{\sum _{\ell =1}^m g_{(\ell )}(t;{\varvec{\lambda }})p_{\ell j}} \end{aligned}$$
and
$$\begin{aligned} E(X_{k,ij}|D_k=0,T_k=t)= \frac{S_{(i)}(t;{\varvec{\lambda }})p_{ij}}{\sum _{\ell =1}^m \sum _{r=1}^K S_{(\ell )}(t;{\varvec{\lambda }})p_{\ell r}}. \end{aligned}$$
The M-step is a maximization of the log-likelihood function L with respect to the parameters \(p_{ij}\) and \({\varvec{\lambda }}\), with the restriction \(\lambda _1> \lambda _2> \ldots > \lambda _m\).
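The E-step for a single observation can be sketched as follows. Each latent start state \(i\) in subchain \(j\) receives the weight \(p_{ij}g_{(i)}(t;{\varvec{\lambda }})\) for an observed absorption in state \(m+j\), or \(p_{ij}S_{(i)}(t;{\varvec{\lambda }})\) for a censored observation, and the weights are then normalized. The function names, the zero-based array layout and the numerical values are assumptions made for illustration, and distinct rates are assumed.

```python
import math

def _hypoexp(i, t, lam, density):
    """g_(i)(t) if density is True, otherwise S_(i)(t), for distinct rates."""
    total = 0.0
    for l in range(i):
        num = math.prod(lam[u] for u in range(i) if density or u != l)
        den = math.prod(lam[u] - lam[l] for u in range(i) if u != l)
        total += num / den * math.exp(-lam[l] * t)
    return total

def e_step(t, d, p, lam):
    """Return the m x K list of E(X_ij | D = d, T = t).

    p[i][j] holds p_{i+1,j+1}; d = 0 means right censored and
    d = j (1 <= j <= K) means absorption in state m + j.
    """
    m, K = len(p), len(p[0])
    w = [[0.0] * K for _ in range(m)]
    for i in range(m):
        for j in range(K):
            if d == 0:
                w[i][j] = p[i][j] * _hypoexp(i + 1, t, lam, density=False)
            elif d == j + 1:
                w[i][j] = p[i][j] * _hypoexp(i + 1, t, lam, density=True)
    total = sum(map(sum, w))
    return [[x / total for x in row] for row in w]

# Illustrative parameters with m = 2 and K = 2.
p = [[0.40, 0.20], [0.25, 0.15]]
lam = [5.0, 4.0]
w_event = e_step(0.3, 1, p, lam)     # absorbed by cause 1 at t = 0.3
w_censored = e_step(0.3, 0, p, lam)  # right censored at t = 0.3
print(w_event)
print(w_censored)
```

Since the \(p_{ij}\) enter L only through terms of the form \(X_{k,ij}\log (p_{ij})\), one would expect the M-step update of \(p_{ij}\) to be the average of these weights over the N individuals, while the update of \({\varvec{\lambda }}\) generally requires numerical maximization.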
Following Rizk et al. (2021), if \({{\mathbf {z}}}_k\) is a covariate vector for individual k, then we may let the covariates influence the rates \(\lambda _i\) in the way \(\lambda _i \exp \{{\varvec{\beta }}'{{\mathbf {z}}}_k\}\). In particular, this will maintain the inequalities between the \(\lambda \)-parameters for each individual.
Example 6
(Real data case) As an illustration we fitted competing risks data from Beyersmann et al. (2012, Ch. 1). The data are observations from an intensive care unit, collected with the purpose of examining the effect of hospital-acquired infections. We analyzed the data for patients without pneumonia at admission to the unit. The time variable was length of stay at the intensive care unit, with two outcomes of interest, either alive discharge (\(D=1\)) or hospital death (\(D=2\)). There were 650 patients in these data, with 589 being discharged alive, 55 dead in hospital, and 6 right censored.
Using the model of Fig. 4 with \(m=3\), we obtained the estimates \({{\hat{p}}}_{31}=0.8712\), \({{\hat{p}}}_{21}=0.0410\), \({{\hat{p}}}_{32}=0.0878\), \({{\hat{p}}}_{11}={{\hat{p}}}_{22}={{\hat{p}}}_{12}=0.0000\), \(\hat{\lambda }_1 = 1.5147\), \({{\hat{\lambda }}}_2 = 0.6938\), \({{\hat{\lambda }}}_3 = 0.0946\).
Figure 5 shows the estimated cumulative incidence functions obtained from the phase-type model, together with nonparametric estimates found by using the Aalen–Johansen estimators (see, e.g., Borgan 1998). With the nonparametric curve serving as a “benchmark”, the fit seems very good for the outcome ‘hospital death’, while the model with \(m=3\) is seemingly not able to pick up completely the steepness of the first part of the cumulative incidence function for the outcome ‘alive discharge’. A seemingly better fit was obtained, on the other hand, using a model with \(m=4\) in Lindqvist and Kjølen (2018).
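Under the canonical model, (24) implies that the cumulative incidence function for cause j is \(F_j(t)=\sum _{i=1}^m p_{ij}(1-S_{(i)}(t;{\varvec{\lambda }}))\). The sketch below evaluates this expression at the estimates reported above; the function names are illustrative, and the code does not reproduce the fitting procedure or the nonparametric estimates.

```python
import math

def hypoexp_survival(i, t, lam):
    """S_(i)(t; lam) for distinct rates lam[0], ..., lam[i-1]."""
    total = 0.0
    for l in range(i):
        others = [lam[u] for u in range(i) if u != l]
        coef = math.prod(x / (x - lam[l]) for x in others)
        total += coef * math.exp(-lam[l] * t)
    return total

# Reported estimates for m = 3; the remaining p_ij are 0.0000.
p_hat = {(3, 1): 0.8712, (2, 1): 0.0410, (3, 2): 0.0878}
lam_hat = [1.5147, 0.6938, 0.0946]

def cumulative_incidence(j, t):
    """Fitted F_j(t) as a mixture of hypoexponential distribution functions."""
    return sum(pij * (1.0 - hypoexp_survival(i, t, lam_hat))
               for (i, cause), pij in p_hat.items() if cause == j)

for t in (1, 5, 10, 30):
    print(t, cumulative_incidence(1, t), cumulative_incidence(2, t))
```

As t grows, the two curves approach \({{\hat{p}}}_{21}+{{\hat{p}}}_{31}=0.9122\) and \({{\hat{p}}}_{32}=0.0878\), the fitted probabilities of alive discharge and hospital death.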