Modelling time-varying covariates effect on survival via functional data analysis: application to the MRC BO06 trial in osteosarcoma

Time-varying covariates are of great interest in clinical research since they represent dynamic patterns which reflect disease progression. In cancer studies biomarkers values change as functions of time and chemotherapy treatment is modified by delaying a course or reducing the dose intensity, according to patient’s toxicity levels. In this work, a Functional covariate Cox Model (FunCM) to study the association between time-varying processes and a time-to-event outcome is proposed. FunCM first exploits functional data analysis techniques to represent time-varying processes in terms of functional data. Then, information related to the evolution of the functions over time is incorporated into functional regression models for survival data through functional principal component analysis. FunCM is compared to a standard time-varying covariate Cox model, commonly used despite its limiting assumptions that covariate values are constant in time and measured without errors. Data from MRC BO06/EORTC 80931 randomised controlled trial for treatment of osteosarcoma are analysed. Time-varying covariates related to alkaline phosphatase levels, white blood cell counts and chemotherapy dose during treatment are investigated. The proposed method allows to detect differences between patients with different biomarkers and treatment evolutions, and to include this information in the survival model. These aspects are seldom addressed in the literature and could provide new insights into the clinical research.


Introduction
Osteosarcoma is a malignant bone tumour mainly affecting children and young adults. Although osteosarcoma is the most common primary malignant bone cancer, it is a rare disease and has an annual incidence of 3-4 patients per million (Smeland et al. 2019). Multidisciplinary management including neoadjuvant and adjuvant chemotherapy with aggressive surgical resection (Ritter and Bielack 2010) or intensified chemotherapy has improved clinical outcomes although the overall 5-year survival rate has remained unchanged in the last 40 years at 60-70% (Anninga et al. 2011). Therefore, it is extremely important to provide an effective tool to evaluate the prognosis for osteosarcoma and to guide the diagnosis. Time-varying (or time-dependent) covariates are often of interest in clinical and epidemiological research: patients are followed during the study and subject-specific measurements are recorded at each visit. Well-known examples include biomarkers which change during follow-up or cumulative exposure to medications (Austin et al. 2020), such as chemotherapy. Depending on patients' treatment history or development of toxicity, biomarkers values change and chemotherapy treatment is modified by delaying a course or reducing the dose intensity. To study the association between time-varying responses with time-to-event outcome (e.g., death) is a challenging task which could offer new insights into the direction of personalised treatment.
In osteosarcoma treatment, patients usually undergo assessment of hematologic and serum biochemical parameters (Lewis et al. 2007), such as white blood cell (WBC) counts and alkaline phosphatase (ALP). The role of ALP as tumour marker for osteosarcoma has not been established, although several studies suggested that high ALP level is associated with poor overall or event-free survival and presence of metastasis (Ren et al. 2015;Hao et al. 2017). Chemotherapy is usually modelled by different allocated regimens, i.e., by Intention-To-Treat (ITT) analysis (Gupta 2011). ITT ignores anything that happens after randomization, such as protocol deviations or changes in drug intake over time, i.e., delays or dose reduction . Lancia et al. (2019b) showed that there is mismatch between target and achieved dose of chemotherapy and the impact of dosis on patients' survival is still unclear. A novel method to study received chemotherapy dose and biomarkers as time-varying variables is proposed. This approach has never been applied to osteosarcoma treatment and provides new insight in understanding the effect of chemotherapy dosis intensity on sarcoma in childhood cancer. Moreover, as will be clear in the following, the application is inspiring from a statistical modelling perspective.
Models for time-to-event data which are able to deal with the dynamic nature of time-varying responses during follow-up are not well developed. One approach for using time-varying covariate data is the Time-Varying covariate Cox Model (TVCM) (Therneau and Grambsch 2000;Kalbfleisch and Prentice 2002), that is an extension of the Cox proportional hazard model (Cox 1972) accounting for covariates that can change value during follow-up. Since time-dependent observations are only available at the time of measurements, TVCM uses the last-observation-carried-forward (LOCF) approach (Tsiatis and Davidian 2004), which leads to the pitfall of introducing bias due to the continuous nature of the process underlying the data, and fails to account for possible measurement errors (Arisido et al. 2019). Joint models address these issues by modelling simultaneously longitudinal and time-to-event data using shared random effects (Henderson et al. 2000;Tsiatis and Davidian 2004;Chi and Ibrahim 2006;Dantan et al. 2011;Rizopoulos 2012Rizopoulos , 2016Gould et al. 2015;Proust-Lima et al. 2016;Hickey et al. 2016Hickey et al. , 2018. They are parametric models that allow for the inference on the association between the hazards characterizing the event outcome and the longitudinal processes. However, they require additional strong assumptions over TVCM that need to be carefully validated to avoid biased estimates (Arisido et al. 2019). Their benefits are hence strictly linked to the correct specification of longitudinal trajectories and baseline hazard function. In addition, inference computations could become prohibitive, especially for approaches developed in a Bayesian framework.
During the past two decades, Functional Data Analysis (FDA) has been increasingly used to analyse, model and predict dynamic processes (Ramsay andSilverman 2002, 2005;Müller 2005;Yao et al. 2005;Ferraty and Vieu 2006;Liu and Yang 2009;Ullah and Finch 2013;Ieva et al. 2013;Ieva and Paganoni 2016;Martino et al. 2019;Spreafico and Ieva 2021). The idea behind FDA and functional models is to express discrete observations arising from time series, i.e., longitudinal timevarying observations, in the form of functions (Ramsay andSilverman 2002, 2005). Functional representation incorporates trends and variations in the evolution of the process over time (Ullah and Finch 2013). Since functional data are infinitedimensional covariates, some dimensionality reduction methods are needed to summarize and select a finite dimensional set of elements representing the most important features of each covariate. This information can then be included into timeto-event models. To model the relationship between survival outcomes and a set of finite and infinite dimensional predictors Functional Linear Cox Regression Models (FLCRM) have been recently proposed (Gellar et al. 2015;Lee et al. 2015;Qu et al. 2016;Kong et al. 2018;Li and Luo 2019). In case of an infinite dimensional process, Kong et al. (2018) characterized the joint effects of both functional and scalar predictors on time-to-event outcome employing Functional Principal Component Analysis (FPCA). FPCA is one of the most popular dimensionality reduction method in FDA and it is used to summarise each function to a finite set of covariates through FPC scores, while losing a minimum part of the information. An extended version of the FLCRM by Kong et al. (2018) to the case of multiple functional predictorsnamed Multivariate FLCRM (MFLCRM)-was introduced by Spreafico and Ieva (2021) to model recurrent events effect on long-term survival. However, since the main focus of the work was to develop a methodology for effectively modelling time-varying recurrent events in terms of the functional compensators underlying the processes of interest, the authors have neither compared MFLCRM with other survival models, nor considered its predictive performances over time. In case of multiple longitudinal processes, Li and Luo (2019) exploited the multivariate FPCA approach by Happ and Greven (2018) to extract the FPC scores from the multiple longitudinal trajectories in order to make personalized dynamic predictions. However, the authors did not focus on the smoothing and functional representation aspects of the processes realized by the observed longitudinal data, on the clinical interpretation of the FPC scores and on their association with overall survival. Since it is often the changing patterns of the functional trajectories rather than the actual values that affects patients' survival, FDA provides a novel modelling and prediction approach, with a great potential for many applications in public health and biomedicine (Ullah and Finch 2013).
Motivated by a clinical question concerning the effect of biomarkers and dose variations during treatment on survival for osteosarcoma patients, an innovative FDA approach, named Functional covariate Cox Model (FunCM), is proposed and compared to a standard TVCM. In FunCM, FDA techniques are first exploited to represent time-varying processes and their derivatives over time in terms of functional data. Unlike joint models, FDA approach does not make assumptions on the distributions of longitudinal processes being computationally advantageous (Li and Luo 2019). Then, additional information contained into the evolution of the functions over time are included into MFLCRMs for overall survival through FPCA. A cross-validation method is implemented to compare MFLCRMs and standard TVCM in terms of their predictive performances at different time horizons. Three novelties of this work are listed here: (i) application of advanced statistical techniques to deal with time-varying covariates in the field of osteosarcoma treatment; (ii) reconstruction of the functional representations for biomarkers and chemotherapy dose values, and their rates of change, to retrieve information on the progression of processes over time; (iii) comparison between TVCM and FunCM in terms of both clinical interpretability and time-dependent predictive performances. This novel approach provides more information about the effect of individualized treatment adaption on survival for osteosarcoma patients.
The rest of this article is organized as follows. In Sect. 2 TVCM and FunCM to represent time-varying covariates by means of FDA and to include them into survival models are discussed. MRC BO06/EORTC 80931 Randomized Controlled Trial and longitudinal representations of time-varying covariates are described in Sect. 3. Results are presented in Sect. 4. Section 5 ends with a discussion of strengths and limitations of the current approach, identifying some developments for future research.

Time-varying covariates and survival frameworks
A time-varying (or time-dependent) process is a covariate whose value can change over the duration of follow-up (e.g., time-varying biomarkers, current use of medication, and cumulative dose of drugs). In this study, the main interest is in analysing the association between patient's survival and variations during treatment of his/her multiple time-varying characteristics. The focus is hence on patients who had completed the entire chemotherapy treatment protocol in a pre-defined and clinically acceptable timing period.
Follow-up starts from date of randomization T 0 and is divided into a pre-defined 6-months chemotherapy treatment period ½T 0 ; T Ã 0 -also called observation periodconsidered for chemotherapy treatment completion, and a post-treatment follow-up period from T Ã 0 onwards (see Fig. 1). Under the TVCM framework, the Overall Survival (OS) is measured from randomization (T 0 ) to the date of death or last follow-up date, and the time-varying covariates can be defined over the entire follow-up period. Let M be a set of timevarying processes. Let z  The TVCM is an extension of the proportional hazard model by Cox (1972) accounting for covariates that can change value during follow-up (Therneau and Grambsch 2000;Kalbfleisch and Prentice 2002). Under TVCM, the proportional hazards model for patient i has the form where h 0 ðtÞ is the baseline hazard function, x i and z i ðtÞ ¼ z TVCM can also be stratified to allow for control by "stratification" of a predictor that does not satisfy the proportional hazard assumption (Kalbfleisch and Prentice 2002). Under stratified TVCM, the hazard function h ig tjx i ; z i ðtÞ ð Þcontains also a subscript g that indicates the g-th stratum, as well as the baseline hazards h 0g ðtÞ, where the strata are different categories of the stratification variable. Notice that the baseline hazard functions are different in each stratum.

Functional covariate Cox model
FunCM approach to represent time-varying covariates by means of FDA and to include them into survival models is now introduced. A summary of the proposed methodology is provided in Appendix A.1.

From longitudinal to functional representation
To model the continuous longitudinal vectors z ðmÞ i defined over ½T 0 ; T Ã 0 as functions x ðmÞ i ðtÞ, FDA techniques can be exploited, as discussed by Ramsay andSilverman (2002, 2005). The observed data z is the vector of coefficients for patient i. Other functional forms can be used to take into account the nature of the process itself (e.g., positive, increasing, decreasing). For example, for an increasing process, the functional data object can be defined using the monotone functional form W In the presence of constrain due to the specific application, data can be alternatively smoothed by regression analysis using the transformation gðxÞ ¼ log xÀL m U m Àx , where L m and U m denote the lower and upper bounds respectively. For each patient i the customized functional predictor m is defined as: Starting from the customized functional datum, the FDA approach also allows to reconstruct its derivative dx where h 0 ðtÞ is the baseline hazard function, x i is the vector of scalar (non functional) covariates with regression parameters c. The vectorx To select the truncation parameters K m , representing the number of FPCs to be considered, Spreafico and Ieva (2021) chose the model with the highest Concordance index (Pencina and D'Agostino 2004), that is an overall measure of discrimination in survival analysis. In this work, the truncation parameters K m are selected in terms of predictive discrimination and calibration performances at different time horizons through the cross-validation procedure introduced in Sect. 2.3.3. From (4) the integrals in (3) where a ðmÞ k is the scalar representing the quantity R S m n ðmÞ k ðsÞa ðmÞ ðsÞds. Introducing approximation (5) in Eq. (3), the hazard function becomes: k ðsÞa ðmÞ ðsÞds is the regression parameter related to the k-th FPC score related to process m. Therefore, defining the following quantities: where the vector of coefficients h can be estimated by maximising the partial likelihood function (Cox 1972). Notice that even in this case the hazard function in Eq. (6) could be stratified.

Selection of truncations parameters
The truncation parameters K m in Eq. (6)  The predictive performance of the models is assessed in terms of discrimination and calibration. Discrimination is assessed through the time-dependent area under the curve (AUC), estimated through the nonparametric method by Li et al. (2018). Calibration is assessed by the weighted version of the Brier score under the assumption of independent censoring (Graf et al. 1999). Higher AUC and lower Brier score indicate better discrimination and calibration, respectively.

Sample cohort selection and baseline characteristics
Clinical studies usually collect information about baseline characteristics and multiple time-varying processes to measure the disease progression. Data from the MRC BO06/EORTC 80931 Randomized Controlled Trial for patients with nonmetastatic high-grade osteosarcoma recruited between 1993 and 2002 (Lewis et al. 2007) were analysed. Patients were randomized between conventional (Reg-C) and dose-intense (Reg-DI) regimens. Details concerning the trial protocol are provided in Appendix A.2.
The dataset included 497 eligible patients; 19 patients who did not start chemotherapy (13) or reported an abnormal dosage of one or both agents (6) were excluded. Motivated by the clinical research question concerning the effect of doses intensity on survival, only patients who completed all six cycles within 180 days (i. e., T Ã 0 of the observation period) were included in the analyses. TVCM analysis was carried out on 377 patients (75.9% of the initial sample). Among them, one subject presented T i \T Ã 0 and was excluded from the FunCM cohort (376 patients-75.7% of the initial sample). The final cohorts for both TVCM and FunCM analyses are shown in Fig. 2.

Time-varying characteristics
Due to the skewed nature of the longitudinal trajectories of both ALP and WBC biomarkers, their logarithmic transformations shifted by one were considered. The  is due to the presence of conditions usually experienced by patients in childhood cancer therapies, such as bone growth, tumour necrosis, inflammatory states, infections or toxicity (see Williamson et al. 2015).
The time-varying standardized cumulative dose of chemotherapy is now introduced. Let j 2 f1; . . .; 6g be the cycle index and t ij the time of the j-th cycle for the i-th patient. The standardized cumulative dose of chemotherapy (DOX? CDDP) for the i-th patient at time t ij is defined as: Total target dose at the end of six cycles ½mg=m 2

Results
Since the role of received chemotherapy dose and biomarkers on patient's survival is still unclear for osteosarcoma (Ren et al. 2015;Hao et al. 2017;Lancia et al. 2019b), a new time-varying/functional perspective may help in understanding their relationship, providing new insights for childhood cancer. In this regard, the methodologies proposed in Sect. 2 were applied to the MRC BO06 osteosarcoma trial. Statistical analyses were performed in the R-software environment (Core Team 2020).

Time-varying covariate Cox model
To study the effect of time-varying biomarkers and doses on survival, a TVCM was fitted on the cohort of 377 patients (see Fig. 2). In particular, the hazard function in Eq. (1) was adjusted for gender at randomization ðx i Þ and stratified by age group g 2 fchild; adolescent; adultg, as follows: where h 0g ðtÞ is the baseline hazard function for the g-th age stratum, z  Table 2 hazard ratios along with their 95% confidence interval are shown. Gender at randomization and time-varying WBC were associated to survival, whereas time-varying ALP biomarker and chemotherapy dose showed no effects on survival. Being a male was associated to a 1.5-times faster experience of the event. The higher the value of WBC at time t, the higher the risk of death. This model ignored the continuous nature of the processes underlying the data.

Functional representation of time-varying biomarkers and chemotherapy dose
To convert the longitudinal values of ALP and WBC biomarkers registered during the observation period, where b c ðdÞ i is the vector of coefficients estimated by penalized regression analysis using the transformation gðxÞ ¼ log xÀL d U d Àx . Finally, starting from the customized  Fig. 4) also provide information on treatment adjustments. Dose reductions are represented by final standardised cumulative dose smaller than 1. For patients with a similar final dose, the slope displays information on the duration of treatment: the lower the slope, the longer the duration of treatment, reflecting delays compared to protocol (see Appendix A.2). Figures 4 and 5 show that, taking into account the continuous nature of the processes underlying the data, a customized functional representation of the timevarying covariates and their derivatives highlights trends and variations in the shape of the processes over time.

Functional principal component analysis for time-varying biomarkers and chemotherapy
The functional trajectories provided in Eqs. (11), (12) and (13)   explained 13.1% of the variability and positive scores reflected "flat" curves, whereas negative score reflected curves with highly negative slopes in the first cycles (right panel). The lower the score, the higher the ALP levels during the first two cycles of the treatment. FPC scores thus summarize the different patterns of the functional biomarker trajectories between patients during treatment, being a more informative representation than the baseline value or the last available measure used through LOCF.
Results of FPCA on functional standardized cumulative dosex k is the is eigenvalue related to the k-th component and c are constants chosen in order to let the scores values lie within one, two or three (AEc ¼ AE1; AE2; AE3) standard deviations (i.e., square roots of k ðdÞ k ). The first component n ðdÞ 1 explained 86.9% of the variability and reflects information on treatment administration and adjustments with respect to protocol. Positive scores (i.e., curves above the average l ðdÞ ðtÞ in the left panel) indicate patients without dose-reduction (i.e., their final standardized cumulative dose is greater or equal to 1) and with possible delays in treatment: the lower the positive score, the higher the time needed to end the treatment. Negative scores (i.e., curves below the average l ðdÞ ðtÞ) represent patients with both time-delays and dosereduction: the lower the negative score, the higher the total dose-reduction. The second component n ðdÞ 2 explained 9.8% of the variability and a positive score indicated a faster growth in the chemotherapy assumption in the first period compared to the second one, with respect to the mean (right panel). Every two patients reported different values of FPC scores, reflecting delays or dose reductions during chemotherapy. This representation illustrates different treatment dynamics, also among patients allocated to the same regimen. Summarizing differences in both trends and variations related to the shape of chemotherapy doses consumption processes over time, the use of FPC scores is more informative than an IIT analysis by different allocated regimens or a LOCF approach that considers only the last available value.

Multivariate functional linear Cox regression model
To study the effect of risk factors on survival, several MFLCRMs based on different sets of baseline and functional predictors (see Table 3) were estimated. Since functional trajectories and their relative derivatives are correlated, in each MFLCRM only one type was considered. Each model was adjusted for gender and stratified by Table 3 Selected truncation parameters K m and integrated AUC (iAUC) for different sets of baseline and functional predictors

Model
Baseline Truncation parameters K m iAUC age group at randomization g 2 fchild; adolescent; adultg. When functional rate of changes of ALP or WBC biomarkers were included in the models, the values of logarithmic ALP or WBC levels at randomization were also considered as adjusting baseline covariates. Cross-validation with five folds was employed to select the truncation parameters K m for each set of covariates (see Table 3). Time-dependent AUCs and Brier scores were estimated with R packages tdROC (function tdROC) by Li and Wu (2016) and ipred (function sbrier) by Peters and Hothorn (2019), respectively. Figure 8 shows the cross-validated mean values of time-dependent AUC and Brier score over different time horizons for all estimated models (solid lines) and for TVCM in Eq. (10) (dashed black lines). All functional models outperformed TVCM and showed similar Brier score measures over time, therefore time-dependent AUC was used to select the final model. Weighted averages of the several time-dependent AUCs over time, estimated through the integrated AUCs (iAUC) by Heagerty and Zheng (2005), are reported in Table 3. According to the highest iAUC, the best MFLCRM was Model 3, defined as follows: where h 0g ðtÞ is the baseline hazard function for the g-th age stratum, To estimate the effect of the selected functional predictors on survival, MFLCRM (14) was fitted on the FunCM cohort of 376 patients (see Fig. 2). In Table 4 hazard ratios along with their 95% confidence interval are shown. Level of WBC at randomization and the FPC scores related to alkaline phosphatase f ðALPÞ i1 ; f ðALPÞ i2 were associate to survival. The higher the value of WBC at randomization the higher the risk of death, whereas no effects were observed due to the rate of change in WBC during the protocol observation period. Patients with high ALP trajectories had poor survival, especially in case of curves with highly negative slopes during the first cycles of chemotherapy protocol. FPC scores related to functional chemotherapy dose showed no effects on survival. Estimated survival probabilities are shown in Fig. 9. High values of baseline WBC corresponded to poor survival (top-left panel). The score f

Discussion
To study the association between time-varying processes and time-to-event data is a challenging problem in clinical research and the development of models and methods able to deal with dynamic time-varying covariates is of statistical interest and of clinical relevance. Research into functional modelling have received considerable attention in recent years. In this work, a novel approach based on FDA techniques to investigate the dynamics of time-varying processes over time and to include additional information that may be related to the survival into the time-to-event model was presented. Data from the MRC BO06/EORTC 80931 randomized clinical trial for osteosarcoma treatment were analysed. Biomarkers and chemotherapy dose were incorporated as time-varying covariates into time-to-event models using both a TVCM and a FunCM approach. The standard TVCM with LOCF approach ignored the continuous nature of the processes underlying the data. To overcome this issue, FunCM exploited FDA techniques to represent time-varying characteristics in terms of functions, enriching the information available for modelling survival with relevant time-varying features related to the evolution of the processes over time. These features were included into MFLCRMs by FPCA to study the effects of functional risk factors on patients' overall survival. Differences in results for TVCM and MFLCRM were due to the different nature of the information incorporated in the two models. TVCM considered as constants the last biomarkers/dose levels over different time points (expressed in days). In practice, among the measurements recorded during the observation period, only the last value had any real impact on overall survival, as only one patient presented with a time-to-event of less than 180 days. This discarded both information about the  continuous nature of the processes and the history of the actual levels measured. MFLCRM included information related to different levels variations and timing during the entire observation period, and functional biomarkers were defined over cycles. Thanks to the introduction of relevant dynamic features related to the continuous functional nature of the processes, MFLCRM resulted more informative than TVCM, outperforming it both in terms of calibration and discrimination over time. MFLCRM results suggested that osteosarcoma patients with high ALP trajectories during treatment, especially during the first cycles of the chemotherapy protocol, have poor overall survival. Dose-intense profiles were not associated with survival, even if functional chemotherapy representations were able to capture individual realisations of the intended treatment, detecting differences between patients randomised to the same regimen. This suggested that considering only the assumed dose as treatment proxy is not enough. Chemotherapy presents some particular aspects, such as latent accumulation of toxicity, which must be taken into account .
The proposed FunCM focused on the representation and the reconstruction of the functional trajectories related to the time-varying processes of interest. Such data are usually considered in a very simplistic way in cancer prediction models, where they act as fixed baseline or as time-dependent LOCF covariates. In this way the amount of information they may provide is not considered, as it is often the changing patterns of the functional trajectories rather than the baseline/last value that affects patients' survival. The strength and innovation of FunCM was the ability to capture the individual realisations of the process over time through a customized functional reconstruction. The developed techniques allowed (i) to account for the continuous time-varying nature of the processes underlying the data and their properties, such as nonlinearity, positivity, constraints, monotonicity, (ii) to move from sparse and irregular longitudinal data to functions defined over a common continuous domain, overcoming the issues of values missingness and different temporal grids, and (iii) to reconstruct and provide derivatives information in a tailored way. The use of derivatives is important both in extending the range of simple graphical exploratory methods and in the development of more detailed methodology (Ramsay and Silverman 2005). In fact, interesting patterns are often much more apparent in derivatives than in the original curves. Furthermore, through a proper dimensionality reduction technique, this methodology allowed to extract additional information contained in the functions. This result is an effective exploratory and modelling technique to highlight trends and variations in the evolution of the processes over time.
In contrast to a TVCM approach, the use of FunCM requires that patients survived for a period at least equal to the length of the observation period used to compute the functional predictors. This might imply a loss of information in situations with high rate of mortality during the observation period (that is not the case under study as only one of the cohort patients who had completed the chemotherapy treatment protocol died during the first 6-months after randomization-see Fig. 2). In those cases, a joint modelling approach can be used to overcome both LOCF and selection bias issues, since its allows the simultaneous modelling of longitudinal and time-toevent outcomes. However, joint models are computational expensive in case of multiple longitudinal outcomes and require assumptions on the distributions of the processes that need to be carefully validated to avoid biased estimates.
This work opens doors to many further developments, both in the field of statistical methods and in cancer research. The dimensionality reduction via FPCA is just one way to work with these data in order to use them within inferential contexts. In fact, the reconstruction via FDA allows to properly use the functional data to address relevant clinical research questions, according to the needs of the analysis and the outcomes of interest. From a clinical point of view, it will be necessary to simultaneously consider chemotherapy treatment modifications and the occurrence of adverse events. This aspect need to be taken into account into the representation of the dynamic evolution of these processes. To model them simultaneously is not a trivial task.
The complexity of the chemotherapy treatment asks for the developments of new methodologies. This study shows that working in this direction is a difficult but profitable approach, which could lead to new improvements for subject-specific survival predictions and personalised treatment.
A.2. MRC BO06/EORTC 80931 randomized controlled trial protocol Data from the MRC BO06/EORTC 80931 Randomized Controlled Trial for patients with non-metastatic high-grade osteosarcoma recruited between 1993 and 2002 were analysed (Lewis et al. 2007). The trial randomised patients between conventional treatment with doxorubicin (DOX) and cisplatin (CDDP) given every 3 weeks (Reg-C) versus a dose-intense regimen of the same two drugs given every 2 weeks (Reg-DI), supported by granulocyte colony-stimulating factor. Chemotherapy was administered for six cycles (a cycle is a period of either 2 or 3 weeks depending on the allocated regimen), before and after surgical removal of the primary osteosarcoma. In both arms, DOX (75 mg/m 2 ) plus CDDP (100 mg/m 2 ) were given over six cycles. Surgery to remove the primary tumour was scheduled at week 6 after starting treatment in both arms, that is, after 2 cycles (2 Â [DOX?CDDP]) in regimen-C and after 3 cycles (3 Â [DOX?CDDP]) in regimen-DI. Postoperative chemotherapy was intended to resume 2 weeks after surgery in both arms. Figure 10 shows the trial design. Planned total cumulative dose was 1,050 mg/m 2 in both regimens. Planned treatment time from beginning first cycle was 140 and 98 days for Reg-C (6 cycles Á 3 weeks/cycle Á 7 days/week ? 14 days of surgery period) and Reg-DI (6 cycles Á 2 weeks/cycle Á 7 days/week ? 14 days of surgery period), respectively. Laboratory tests were usually performed before each cycle of chemotherapy (in some cases also during and after the cycle) in order to monitor patient's health status and the development of toxicities or adverse events. Delays or chemotherapy dose reductions during treatment were possible in case of toxicity. Additional details can be found in the primary analysis of the trial by Lewis et al. (2007).
Funding Open access funding provided by Politecnico di Milano within the CRUI-CARE Agreement. No funding to declare. Fig. 10 Patients are randomized at baseline to one of the two regimens, with the same anticipated cumulative dose but different duration Data availability Data are not publicly available due to privacy restrictions. Access to the full dataset of MRC BO06 trial can be requested to MRC Clinical Trials Unit at UCL, Institute of Clinical Trials and Methodology, UCL, London.

Declarations
Conflict of interest The authors declare that they have no conflict of interest. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.