Gamma Process-Based Models for Disease Progression

Classic chronic diseases progression models are built by gauging the movement from the disease free state, to the preclinical (asymptomatic) one, in which the disease is there but has not manifested itself through clinical symptoms, after spending an amount of time the case then progresses to the symptomatic state. The progression is modelled by assuming that the time spent in the disease free and the asymptomatic states are random variables following specified distributions. Estimating the parameters of these random variables leads to better planning of screening programs as well as allowing the correction of the lead time bias (apparent increase in survival observed purely due to early detection). However, as classical approaches have shown to be sensitive to the chosen distributions and the underlying assumptions, we propose a new approach in which we model disease progression as a gamma degradation process with random starting point (onset). We derive the probabilities of cases getting detected by screens and minimize the distance between observed and calculated distributions to get estimates of the parameters of the gamma process, screening sensitivity, sojourn time and lead time. We investigate the properties of the proposed model by simulations.

(S p ), in which one has the disease but it has not yet manifested itself through clinical symptoms. The progression ends from our point of view when symptoms appear and one reaches the clinical state (S c ) (Zelen and Feinleib 1969).
Screening programs are organized aiming for early detection of diseases in hopes of improving survival. However, early detection automatically means that the survival of cases that were diagnosed by screens is longer than the survival of cases that were diagnosed by clinical symptoms (S c ). In Fig. 1 one can see case A (in black) of which the disease was detected early, and case B (in grey) which was detected after showing symptoms. Although both cases become onset and are deceased at the same time, case A will appear to have survived longer simply because its survival is recorded from the first date of diagnosis. This apparent increase in survival which is observed purely due to early detection is called lead time bias (Gordis 2008).
The estimation of the sojourn time, which is the amount of time spent in the preclinical state (S p ) allows correcting the lead time bias as well as the evaluation of existing programs and optimizing future ones. Sojourn time thus governs the movement between S p and S c . For gauging the movement between S f and S p , we define the preclinical intensity as the probability of moving from S f into S p during (t + dt).
The classical approach for modelling the process (Wu et al. 2005) is by assuming that both sojourn time and the preclinical intensity are random variables of which the parameters are to be estimated. The sensitivity is given a functional form (e.g. logistic form) with some parameters. Thus one can determine the probability of a case to be detected by a screen or during an interval between screens by symptoms in terms of the parameters. Subsequently a likelihood function can be formulated and maximized in order to get parameter estimates, see e.g. Hijazy and Zempléni (2020).
However, there is a lot of discrepancy between results in the literature, for instance an entry-exit model into and out of S p is used by Duffy et al. (1995), who applied their model to the Swedish two-county study of breast cancer, the resulting estimate for the mean sojourn time is 2.3 years. On the other hand, Weedon-Fekjaer et al. (2005) obtained their estimates through weighted non-linear least-square regression using a three step Markov chain model, applying their method to the Norwegian Breast Cancer Screening Program (NBCSP), they estimated the mean sojourn time to be 6.1 years for those aged between [50,59] and 7.9 years for those between [60,69]. These differences are most likely caused by the underlying assumptions while building the model, as these can have a crucial effect on the estimates. To deal with this discrepancy, more information has to be incorporated into such models. Namely, we chose to include a measure of sickness or degradation. The measure of sickness on diagnosis should provide some information about the preclinical time, e.g. in breast Fig. 1 Apparent increase in survival due to early detection cancer, a case detected with a large tumor is likely to have been preclinical for a long time. In other words, integrating the tumor size at diagnosis into the model forms an additional constraint and deals with the discrepancies between the results. Nonetheless, in order to incorporate tumor size into the model, one has to specify the rate of growth.
Very early results by Collins et al. (1956) and Schwartz (1961) proposed simple linear exponential growth, introducing the notion of doubling time, that is the time needed for a tumor to double its size. Using an exponential growth is still by far the most popular in the literature. However, Laird (1964) found that exponential growth of tumors is realistic only in cases where tumors were observed for relatively brief periods. Moreover, when tumor growth was followed for a sufficiently extensive period of time, the results showed that nearly all tumors grow more and more slowly as the tumor got larger, opposing the constant specific growth rate that would be expected for simple exponential growth. Norton (1988) suggested a Gompertz function in which successive doublings occur at increasingly longer intervals. Figure 2 shows the decelerating rate of Gompertzian growth beside an exponentially growing tumor of constant rate. Speer et al. (1984) proposed modelling by a more generalized approach using the Gompertzian kinetics. They included the noisiness in the growth process, i.e. they assumed that from time to time, in a random fashion, a spontaneous change occurs in the growth rate.
Although it seems that there is no general consensus about the shape of tumor growth, it looks like a logistic or a Gompertz growth shape is more likely than an exponential one. Besides, it seems that the growth process has some randomness and the growth rate is not constant. These properties led us to use the gamma process, since the expected value of the gamma process is given by the product of the shape and the scale. We used the Gompertz function as the shape of a gamma process as it was proposed by Norton (1988). As a result, in our model, tumors grow in gamma distributed increments with an expected Gompertz growth (see Fig. 3).
Since its introduction by Abdel-Hameed (1975) the gamma process has been extensively used in the literature to describe the stochastic and monotone degradation accumulating over time. The gamma process is very tractable and has nice properties, besides, the process is flexible in the sense that multiple functional forms of the shape can be easily adapted. This enabled us to establish the distributions of detected tumor sizes on screen, which in turn allows the use of classical estimation methods.
The paper is organized as follows: in Section 2 we lay the setup of the model, derive the distributions of the sojourn time, lead time and the distributions on screens. In Section 3 we apply the model on simulated data and show the results. In Section 4 we state some concluding remarks, possible extensions of the model as well as its limitations.

Model
Gamma processes have been identified as the main way to model degradation phenomena. A non-stationary gamma process can model degradation when there is some temporal variability in the degradation phenomenon. For a shape parameter η(t) (time dependent) and a scale parameter β, the marginal distribution of a non-stationary gamma process Y t at time t assuming that the process starts from 0 at t = 0 is the gamma distribution, namely: where η(t) is a non-negative, monotone increasing function for t ≥ 0 and η(0) = 0. Recall that one says that (Y t ) t≥0 is a non-stationary gamma process if: -The increments of the gamma process in the interval (t, t + h) denoted by ΔY (t, h) = Y t+h − Y t , t > 0, h > 0 are independent random variables over disjoint time intervals. -Each increment ΔY (t, h) follows a gamma distribution with constant scale parameter and time-varying shape parameter Δη(t, h) = η(t + h) − η(t) for all t and h. The density of the increments is given by: Now suppose that the tumor growth is a gamma process Y t with a time dependent shape parameter η(t) = m 1 (1 − exp(−m 2 t)) (Norton 1988) and a constant scale parameter β. For a tumor becoming onset at age t p > 0 with size 0, denote by X(t p , t p , t 1 ) = Y t 1 −t p , t 1 ≥ t p the tumor size at age t 1 . Similarly, let X(t p , t 1 , t 2 ) = Y t 2 −t p − Y t 1 −t p , t 2 ≥ t 1 ≥ t p denote the increments from age t 1 till t 2 for a tumor becoming onset at t p . Then, for a given t p the density of X(t p , t 1 , t 2 ) is given by f X(t p ,t 1 ,t 2 ) (x) = f ΔY (t 1 −t p ,t 2 −t 1 ) (x).
As the exact onset of the disease is unknown, the preclinical intensity is assumed to be a random variable independent of the tumor growth. The distribution is chosen to be lognormal LN (μ, s 2 ) sub-density, meaning that it is the density multiplied by the lifetime risk, as not everyone in the population is affected by the disease. The choice of the lognormal distribution is based on the results of Lee and Zelen (1998) who found that the transition probability of breast cancer to the preclinical state is right skewed with a heavy tail, the lognormal distribution has similar properties. Denote by t p the age in which the case moves into S p , and by r the life time risk of breast cancer, the density of the preclinical intensity is then given by: Now consider a breast cancer screening program consisting of K screens with fixed inter-screening time (time between two consecutive screens) Δ, suppose that the first screen takes place when a patient is aged t 0 , Denote by τ i = t 0 + (i − 1)Δ the time of the i th screen. Moreover, assume that the sensitivity of the screen has a logistic form depending on the tumor size. This is motivated by the results of Michaelson et al. (2003) who determined estimates of the sizes at which breast cancers become detectable on mammographic and clinical grounds, showing that smaller tumors are very hard to detect, other factors include the density of the breast, age and others. Keeping our approach simple, we will only use the tumor size, namely, let us define the sensitivity as: where b 0 and b 1 are parameters to be estimated and x is the tumor size. Since the logistic function takes values between 0 and 1 and is monotonically increasing in x, it is suitable for modelling sensitivity. The parameter b 0 determines the location of the curve while b 1 is the growth rate or the steepness of the curve. Furthermore, suppose that progression into the clinical state S c happens when the tumor size reaches a critical size denoted by C independent from the growth process, in other words, the patient's tumor size has reached a level in which it is noticed or causes symptoms. Assume that C is a gamma distributed random variable with shape λ and scale ξ , where λ and ξ are parameters to be estimated. Figure 3 shows some paths of the gamma process for the parameter values which will be used throughout the paper (Scenario 1). A tumor becomes onset at a random time t p triggering the growth of a tumor, which then grows as a gamma process till it is either detected by a screen or by reaching its critical threshold (showing symptoms). The aim is to establish the distribution of the size of detected tumors on screens and then use suitable estimation methods for the parameter governing the process.

Sojourn Time
As we defined the sojourn time as the amount of time spent in the preclinical state S p , it is then the amount of time before hitting the critical tumor size C. Denote by X t the marginal distribution of the tumor size t time after the onset (X(t p , t p , t + t p )) and consider the stopping time T C = inf{t ≥ 0; X t ≥ C}, which is just the sojourn time with cdf F T C . Note that for some parametrizations, there could be a positive probability that T C is infinite. In these cases, the distribution of T C will not be a proper one and consequently the expected value and the variance of T c are undefined. Nonetheless, after the parameters are estimated, one may truncate the distribution to get finite estimates of the mean and variance of cases with finite sojourn time as shown in Section 2.3. The cdf of the sojourn time may be derived using the law of total probability, namely: and as the process is increasing, P (T c ≥ t|C = x) = P (X t < x). When C is gamma distributed, a closed form for the cdf was derived by Paroissin and Salami (2014): where 2 F 1 is the generalized hypergeometric function of order (2,1) (Gradshteyn and Ryzhik 1965), note that p F q is given by: where (a) k is the Pochammer symbol defined as (a) k = Γ (a + k)/Γ (k). Assuming that η is differentiable, then the derivative of F T C is given by: where ψ is the digamma function and F 2:2,1 2:1,0 is the Kampé de Fériet function (Ancarani and Gasaneo 2009).

Distribution of the Tumor Sizes on Screens
Next, the distribution of screened cases has to be established. Note that we will be dealing with subdensities, since not all cases will move to the preclincal state and not all preclinical cases are detected. In other words, we will be establishing the subdensity of the sizes of detected tumors on screens. That being said, we also need to consider that an individual participating in screen i means that the individual has not shown symptoms yet. This means that the subdensity is derived under the condition that the individual did not hit the critical threshold yet. Starting with the first screen, the subdensity is built up from those who have progressed to the preclinical state S p before τ 1 , did not move into the clinical state before τ 1 , and were screened positively.
Since the random threshold, the onset, and the process are assumed independent, for a given preclinical age t p the conditional subdensity of screened tumor sizes on the first screen is obtained by the density of the increments in (t p , τ ) weighted by the sensitivity Λ(x) as it was screened positively and by the probability of not hitting the threshold before τ 1 . Consequently, the conditional subdensity is then given by: ( 2 ) The full subdensity is derived by applying the law of total probability to Eq. 2, however, as the participants in screen τ 1 have not yet become clinical, the subdensity needs to be adjusted by the probability of a case not becoming clinical before τ 1 , therefore we have: The subdensity of screened tumor sizes on the second screen is derived in a similar fashion, though we need to consider two parts of the subdensity separately. The first part corresponds to those who have moved into S p during (τ 1 , τ 2 ) and the second is for those who moved to S p before τ 1 . Denote by f τ j ,(τ i ,τ i+1 ) the contribution of cases developing between (τ i , τ i+1 ) to the subdensity on τ j . Starting with the first part of the the subdensity f τ 2 ,(τ 1 ,τ 2 ) , that is built from cases becoming preclinical between τ 1 and τ 2 , not becoming clinical before τ 2 and then screened positively: The second part f τ 2 ,(0,τ 1 ) corresponds to those who have moved to S p before τ 1 , their disease was not detected by the first screen and stayed in the preclinical state till τ 2 when they were finally screened positively. Therefore: As a result, the subdensity of screened tumor sizes on the second screen is In general, the subdensity on screen i can be derived following the same logic, dividing the time-line into disjoint intervals (0, τ 1 ), (τ 1 , τ 2 ), · · · , (τ i−1 , τ i ). Thus, it is given by the sum of the contributions. Namely: where f τ i ,(τ j −1 ,τ j ) (x) is given by: Note that the distribution of interval cases is derived in a similar manner. Interval cases are defined as those who progress into S c between two screens, namely, these are ones who reach the critical threshold between screens (see Fig. 3). However, we chose not use them in the current model as data between screens is not always available.

Estimation of Parameters
Under the current setup, the parameters to be estimated are those for the sensitivity (b 0 and b 1 ), those for the preclinical intensity (μ and s), the parameters controlling the gamma process (m 1 , m 2 and β) and the random threshold parameters λ and ξ .
However, as the integrals in expression (4) do not have a closed form, numerical integration must be used. But as multidimensional numerical integration is slow and calculating the full likelihood means computing these integrals for all the sample elements and parameters, that would not be computationally feasible. Thus, we suggest using the so called minimal divergence estimators instead. These estimators are based on dividing the sample space into intervals and minimizing the divergence between observed and expected distributions, therefore limiting the number of integrals that need to be computed.
From a theoretical point of view, if we have a set of observations X , grouping them into {A 1 , · · · A M } defines a discrete statistical model in which the expected probabilities of A i are denoted by q i (θ) = P θ (A i ) for i = 1 · · · M and letp i = n i /n be the relative frequency of A i , i = 1 · · · M. For fixed (n 1 , · · · , n M ) the likelihood is: Therefore, the log-likelihood can be written as: whereP = (p 1 , · · · ,p M ), Q(θ) = (q 1 (θ), · · · q M (θ)), D K-L is the Kullback-Leibler divergence (Kullback and Leibler 1951 and l is a constant in θ . Then to estimate θ by the discrete model maximum-likelihood estimator is equivalent to minimize the Kullback-Leibler divergence. However, the Kullback-Leibler divergence is not the unique divergence measure, one can chooseθ as the estimator which solves the following: D(P , Q(θ)) = inf θ∈Θ D(P , Q(θ)).
With D being a divergence measure for our discrete model. Some of the other divergence measures include the chi-squared divergence, Hellinger divergence , Burbea-Rao divergence (Burbea and Rao 1982) and several others. Minimum divergence estimators are quite popular in the literature, especially when the likelihood has a complex form. Asymptotic properties of these estimators and comparisons to the maximum likelihood estimator are also studied in the literature, see Broniatowski (2014), Jimenz and Shao (2001), and many others. We decided to use the χ 2 measure as it is well known and easy to interpret. Recall that the χ 2 divergence is defined as: Translating this to our setup, binning the observations on screens into intervals leads to a similar discrete model. One way to carry this out is by grouping the observations on each screen into intervals with approximately equal frequencies. Formally, let us introduce some notations: denote by M the chosen number of intervals, denote the resulting intervals on screen j by Y i,j i = 1 · · · M, j = 1 · · · K. Let N j the number of participants in the j th screen and denote byp i,j = n Y i,j /N j the observed relative frequency of Y i,j . Introduce q i,j (θ) as the probability of having a tumor size in Y i,j on the j th screen i.e. q i,j (θ) = Y i,j f τ j (x)dx. Furthermore, denote by Y 0,j the number of cases who participated in screen j but no tumor was detected (q 0,j (θ) = 1 − M i=1 q i,j (θ)). The divergence to be minimized is then: Note that if the model is to be applied to a data set with different ages at program entry, one can extend (6) by summing over the ages at program start t 0 . On the other hand, if data about interval cases is available, one can further extend the model by including the distance between expected and observed interval cases, therefore using all the available data.
Having the parameters estimated, the main aim is to calculate the sojourn time distribution, however, under some parametrizations of the process, there is a positive probability that the tumor will never reach the critical threshold C. In other words, the tumor would never become symptomatic. As a result, the expected value of the sojourn time will be mathematically infinite. In fact, a serious concern around screening programs is that they may cause overdiagnosis. Overdiagnosis is the diagnosis of a medical condition that would never have caused any symptoms or problems. Reported estimates of breast cancer overdiagnosis range from 0% to 54% (Elmore and Fletcher 2012).
Therefore, after the parameters are estimated, one can compute the probability of never showing symptoms by P (NS) = 1 − F T C (∞). If this probability is positive, then one has to truncate the distribution of the sojourn time to get estimates for the mean and variance. The truncation is done at the maximum realistic value of the sojourn time (Q). The adjusted distribution is then given by: The expected value and the variance of the sojourn time are then directly obtainable even if η is not differentiable using E[T C ] = ∞ 0 (1−F T C (t))dt and E(T 2 C ) = 2 ∞ 0 t (1−F T C (t)) dt. Note that the truncation is done to get finite expected values and is not incorporated into the model.

Lead Time
Lead time bias in the current framework is the amount of time that a screened case would have needed to show symptoms. Specifically, it is the unobservable future time needed to exceed C. From a reliability point of view, lead time is the remaining preclinical lifetime till the degradation reaches C. For a given threshold C = c, onset t p and size at detection x τ . The conditional survivor function of the lead time R is given by the probability that the tumor size will not exceed c in time t r after the screen τ as: To release the condition on t p , we need to get the the conditional density of t p |X τ = x τ denoted by w (t p |x τ ) which is obtainable by Bayes law. Namely: Note that a detected tumor of size x τ means that the threshold C must be larger than x τ . Taking all of that into account gives: As the lead time is directly related to the sojourn time, a similar truncation to (7) is needed to get finite values. In other words, the lead time will be computed given that detected cases are not overdiagnosed. A favorable outcome of our approach is that one is able to estimate the expected value and the variance of the lead time based on the tumor size at detection x τ . Lead time bias is corrected by deducting the expected lead time from the overall survival of a screened patient.

Applications
In order to check the performance of the model, simulations were carried out by discretizing the time-line into h-sized intervals. Movement into the preclinical state is simulated by a Bernoulli random variable with probability p h = t h t h−1 w(t p )dt p where t h−1 and t h are the boundaries for the h-sized interval. When the Bernoulli (p h ) gives a success, the gamma process is then simulated starting from (t h−1 + t h )/2. A patient in a screen if it did not reach the critical threshold C or if it is still disease free. Note that the threshold is simulated by a gamma random variable and the screens are simulated using a Bernoulli random variable with probability Λ(x). The simulator is initiated under the assumption that all cases were disease free B years before the first screen (occurring when cases are 50 years old), where B is the maximum feasible value of the sojourn time. Patients which show symptoms before the first screen are discarded. The simulator is run until the desired number of participants on the first screen N is reached.
The preclinical intensity parameters used in simulations are u = 3.86 and s = 0.293, these values result in an average preclinical age of 50 and a standard deviation of 15, the risk r is set to one for the sake of getting more preclinical cases. Parameters for the critical threshold are defined as λ = 8 and ξ = 0.25 resulting in an average critical size of 2 cm and a variance of 0.5.
That being said, the first aim is to study the performance of the model for different number of participants combined with different number of intervals (M) on screens. For that purpose, we simulated Scenario 1, the defined process parameters in this scenario are: m 1 = 5, m 2 = 0.2 and β = 2 resulting in a mean sojourn time of 2.15 years (truncated at 20 years). The defined values for sensitivity are b 0 = −2.5 and b 1 = 3.5. We decided to simulate 3 screening programs for a single age group (t 0 = 50) with different number of participants: N = 10000, N = 50000 and N = 200000. 50 datasets were generated for each of the programs, the model was run for M = 5, 10, 20, 30 and 40 on each dataset.
The second aim is to check the performance of the model under different setups, for that purpose, we simulated Scenario 2 with N = 50000. This scenario was simulated to mimic extremely aggressive growth with m 1 = 10, m 2 = 1 and β = 2.5 resulting in an adjusted mean sojourn time of 0.144 years (truncated at 2 years). The defined sensitivity parameters are b 0 = 1.5 and b 1 = 3 leading to a larger probability of detection for smaller tumors combined with a steeper increase in the sensitivity as tumor size increases. The model is run on each dataset (N=50000) for the different values of M and the results are presented in Table 2. Plots of the sojourn time cdf and screening sensitivity are shown in Fig. 5.
From a practical point of view, the critical threshold parameters λ and ξ are usually known. Estimates of these parameters can be directly obtained by applying maximum likelihood to symptomatic cases who never participated in a screen. Likewise, the parameters controlling the preclinical intensity for breast cancer are also given in terms of age-specific incidence rates (see Lee and Zelen (1998)). Employing this, we decided to fix preclinical intensity and critical threshold parameters, leaving only the parameters controlling the process and the sensitivity to be estimated θ = (m 1 , m 2 , β, b 0 , b 1 ). We binned the measurements on screens into M = 5, 10, 20, 30, 40 intervals with almost equal frequencies.
The numerical integration is implemented using the statistical software R by the function suave from the package cubature, suave implements a Monte Carlo algorithm for multidimensional numerical integration by importance sampling combined with a globally adaptive subdivision strategy (Hahn 2005). The minimization of the distance is done using the function optim. The mean of the estimates resulting from the 50 datasets and their standard deviations are displayed in Table 1.
Starting with the first scenario, the tumor size densities on the first and second screens for N = 10000 and N = 200000 are displayed in Fig. 4. Note that the number of detected cases on the first screen is 8516 in the large program and 417 for the small one. Whereas on the second screen the number of detected cases is 3731 and 186 respectively. It is noticed that the resulting densities have many irregularities, additionally we could observe that the tumor size on the second screen is denser around its peak. This is natural since larger cases either were detected on the first screen or reached the threshold during the inter-screening Table 1 Results for different sample sizes at first screen (N ) and number of intervals (M) for Scenario 1: 0.8

Second screen
Size Density N=10000 N=200000 Fig. 4 Density of the tumor size on the first and the second screen for scenario 1: interval. Nonetheless, even the small sample seems to be very informative, although some irregularities in the density are noticed. From the first block of Table 1 where N = 200000, it is noticed that the model performs really well and is nearly unbiased even for M = 5. We observed a decrease in the standard deviations with the increase of M, however there is no significant decrease beyond M=30.
Moving to the second block, the estimates are good although with higher standard deviations. Additionally, we still observe an increase in the precision with the increase of M. In the third block, where N = 10000 it is noticed that the standard deviations are much larger. This is caused by the irregularities in the densities for the small sample size, nonetheless, the model still gives acceptable estimates.
Furthermore, we have noticed that there is a strong correlation between the elements of θ , the strongest positive correlation was found between b 0 and m 2 (its value for N = 200000 and M = 40 was 0.753). Recall that m 2 controls the process rate to reach m 1 , and b 0 controls the location of the curve, in other words, b 0 adjusts the weight of the process on a screen. Decreasing b 0 means less weight on smaller tumors, the model adjusts to this by a smaller rate m 2 and vice versa. The strongest negative correlation was found to be between the scale of the process β and m 1 (for N = 200000 and M = 40 it was -0.724), both of which control the expected value and the variance of the process, so to preserve the balance for decreasing values of β the values for m 1 are increased and vice versa.
Moving on to the results of Scenario 2 displayed in Table 2, it seems that there is a small bias combined with large standard deviations. We observed a large variation in sensitivity estimates (b 0 and b 1 ). This is likely due to the sharp decrease of the number of screened  cases in this scenario. In fact, under a very short sojourn time, the chances for a case to reach a screen before becoming symptomatic are quite slim. As a result, only around 208 cases are detected on the first screen. Nonetheless, the resulting average sojourn time distribution seen in the top right part of Fig. 5 is very close to the actual one. The same is true for the sensitivity, however it is noticed that on average, the estimated sensitivity is slightly higher than the actual one.
After estimating the parameters, we calculated the lead time for cases detected on the first screen (aged 50), and had a tumor of size 1 cm on detection. After truncating the lead time distribution, the resulting expected lead time bias is 1.849 years for scenario 1 (and 0.165 years for scenario 2). This shows the magnitude of the bias caused by early detection, as a screened case with a tumor of size 1 cm would appear to have survived 1.849 years more than a symptomatic with the same onset and date of death. The implications of this are very serious, as any administered treatments after screening might falsely appear to be effective due to the prolonged survival. Furthermore, it is clearly beneficial to link lead time bias with the tumor size on detection, as the tumor size gives an indication for the duration of which the tumor size was asymptomatic, therefore giving information of how much more time it needs to show symptoms.

Concluding Remarks
Although the proposed model is somewhat computationally expensive, it proves to be an accurate and a powerful tool to use with degradation processes triggered at a random onset. The model is very flexible, as one is free to choose e.g the form of η. There is definitely room for further research, the extended gamma process can also be used (Guida et al. 2012) if the mean over variance ratio is not constant. Moreover, it is also possible to use the transformed gamma process (Giorgio et al. 2018) if damage accumulates gradually over time in a sequence of tiny increments in which the degradation increments over disjoint time intervals are not independent.
One more advantage of using the gamma process-based approach is that with some modifications, one is able to incorporate a process with random covariates as in (Lawless and Crowder 2004). For instance, if one wishes to investigate the effect of age on tumor growth, adding a covariate corresponding to age in the shape of the process should do the trick.
On the other hand, one main limitation for focusing on tumor size as a degradation measure is that the approach is limited to study solid cancers. Besides, some types of cancer are expressed through multiple tumors and are not restricted to a single primary one. However, it might be possible to find a proper measure of degradation on diagnosis in case of other degenerative diseases.