Decision models for value assessment of health technologies are frequently developed using spreadsheet software such as Microsoft Excel. However, we opted for the statistical programming language R to develop the open-source models, for several reasons.
Excel is sufficient for relatively simple analyses, such as clock-forward state-transition models or partitioned survival models, but it may not be feasible to implement more complex models, such as a clock-reset semi-Markov model, efficiently. In the case of the IVI-RA model, for example, we used an individual-patient simulation to allow modeling of sequential treatment strategies in which individual treatment duration is not limited to an exponential distribution, and to make it possible for both morbidity and mortality to depend on the prior course of disease [9]. In the IVI-NSCLC case, we used a multi-state model that was parameterized using parametric (i.e., Weibull) and flexibly parametric (i.e., fractional polynomial) models; in addition, we considered sequential treatment, so that transition rates depended on time since entering intermediate states rather than on time since treatment initiation [13]. We consequently needed to simulate outcomes based on a semi-Markov model using an individual-level simulation [18], which cannot be implemented efficiently in a spreadsheet. A script-based programming language, such as R, makes implementation of individual-level simulation models more straightforward.
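To make the clock-reset idea concrete, the following is a minimal sketch of an individual-level simulation of a semi-Markov model. It is written in Python with the standard library purely for illustration and is not the IVI-RA or IVI-NSCLC implementation, both of which are written in R. The three-state structure, the Weibull parameters, the time horizon, and the function names `simulate_patient` and `mean_time_in_state` are all hypothetical assumptions.

```python
import random

# Hypothetical three-state model: 0 = stable, 1 = progressed, 2 = dead.
# Each allowed transition has illustrative Weibull (scale, shape) parameters;
# none of these values are taken from the IVI models.
TRANSITIONS = {
    0: {1: (4.0, 1.3), 2: (12.0, 1.1)},
    1: {2: (2.5, 0.9)},
}
DEAD = 2


def simulate_patient(rng, horizon=40.0):
    """Return the (state, entry_time) pairs visited by one simulated patient."""
    state, clock = 0, 0.0
    path = [(state, clock)]
    while state != DEAD:
        # Clock reset: latent transition times are drawn afresh on entering a
        # state, so hazards depend on time since state entry, not model time.
        draws = {
            target: rng.weibullvariate(scale, shape)
            for target, (scale, shape) in TRANSITIONS[state].items()
        }
        # Competing risks: the earliest latent time determines the next state.
        next_state = min(draws, key=draws.get)
        clock += draws[next_state]
        if clock >= horizon:
            break  # patient is censored at the time horizon
        state = next_state
        path.append((state, clock))
    return path


def mean_time_in_state(n_patients=5000, horizon=40.0, seed=1):
    """Average time spent in each non-absorbing state across patients."""
    rng = random.Random(seed)
    totals = {0: 0.0, 1: 0.0}
    for _ in range(n_patients):
        path = simulate_patient(rng, horizon)
        for i, (state, entry) in enumerate(path):
            if state == DEAD:
                continue
            exit_time = path[i + 1][1] if i + 1 < len(path) else horizon
            totals[state] += exit_time - entry
    return {state: total / n_patients for state, total in totals.items()}


if __name__ == "__main__":
    print(mean_time_in_state())
```

Because each patient carries their own history, quantities such as time since entering the progressed state are available at every step. This is what a cohort-level spreadsheet model struggles to track without a large number of tunnel states.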
In our IVI-NSCLC model, the statistical model used for parameter estimation was seamlessly integrated with the simulation model. We developed a multi-state network meta-analysis model to estimate relative treatment effects that had the same structure as the state-transition model used for the simulations. This highlights a key advantage of R: because it is designed specifically for statistical computing, parameter estimation and the subsequent simulation of model outcomes can be performed consistently in one environment, without unnecessarily simplifying the structure of the simulation model. Conversely, Excel frequently prevents the structure of the decision model, which describes how outcomes develop over time and the impact of treatment, from being closely aligned with the statistical model used for parameter estimation.
The impact of parameter uncertainty on decision uncertainty is typically quantified by means of a probabilistic sensitivity analysis (PSA). The uncertainty in the model input parameters is propagated through the model by randomly sampling values for the input parameters from suitable probability distributions. Although a PSA can be performed in Excel using Visual Basic for Applications, R has validated functions for sampling from multiple univariate and multivariate distributions that make it a natural programming language for performing a PSA [19]. In addition, R has packages (such as hesim and BCEA) to produce cost-effectiveness planes, cost-effectiveness acceptability curves, and the cost-effectiveness acceptability frontier [14, 20, 21, 22, 23]. A PSA can be implemented more efficiently in R than with Visual Basic for Applications by using programming techniques such as vectorization, linking to compiled languages (e.g., C/C++), or parallel computing. This is imperative for computationally intensive, individual-level simulation models, such as the IVI-RA and IVI-NSCLC models. In fact, simulations with these models could not have been performed in a reasonable amount of time if implemented with Excel and Visual Basic for Applications. Computational speed is also an important consideration when a web-based user interface is provided to interact with the model.
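The mechanics of a PSA can be sketched in a few lines. The example below is a hypothetical illustration in Python using only the standard library. It does not use hesim or BCEA and does not reproduce any IVI model. The two-strategy decision problem, every distribution and its parameters, and the function names `run_psa` and `ceac` are assumptions made for the sake of the example.

```python
import random


def run_psa(n_samples=2000, seed=42):
    """Draw PSA samples and return (incremental cost, incremental QALYs) pairs."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Hypothetical input distributions (all values are illustrative only).
        p_response_new = rng.betavariate(60, 40)       # response probability, new
        p_response_old = rng.betavariate(45, 55)       # response probability, old
        qalys_per_response = rng.betavariate(30, 70)   # QALYs gained per responder
        cost_new = rng.gammavariate(100, 300.0)        # shape, scale; mean 30,000
        cost_old = rng.gammavariate(100, 200.0)        # shape, scale; mean 20,000

        # Propagate the sampled inputs through a deliberately trivial model.
        inc_qalys = (p_response_new - p_response_old) * qalys_per_response
        inc_costs = cost_new - cost_old
        samples.append((inc_costs, inc_qalys))
    return samples


def ceac(samples, thresholds):
    """Probability that the new strategy is cost-effective at each threshold.

    A sample counts as cost-effective when its incremental net monetary
    benefit, threshold * incremental QALYs - incremental cost, is positive.
    """
    curve = {}
    for wtp in thresholds:
        n_cost_effective = sum(1 for d_cost, d_qaly in samples
                               if wtp * d_qaly - d_cost > 0)
        curve[wtp] = n_cost_effective / len(samples)
    return curve


if __name__ == "__main__":
    psa = run_psa()
    for wtp, prob in ceac(psa, [0, 50_000, 100_000, 200_000, 400_000]).items():
        print(f"threshold {wtp:>7}: P(cost-effective) = {prob:.3f}")
```

In a real application the trivial model inside the loop would be replaced by the full simulation, which is why the per-sample run time matters and why vectorization, compiled code, or parallel execution become important.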
A clear benefit of a script-based programming language, like R, is that the complete analysis, from parameter estimation to simulation of model outcomes and quantification of decision uncertainty, can be performed using a reproducible script [24]. We believe this has the potential to improve the transparency of decision and economic modeling studies considerably. Furthermore, with R, model documentation can be created in which each figure, table, and number cited in the text is based directly on the output of the code run in the script. For examples, we refer readers to the online tutorials of the IVI-RA and IVI-NSCLC models (https://innovationvalueinitiative.github.io/IVI-RA/articles/00-intro.html; https://innovationvalueinitiative.github.io/IVI-NSCLC/articles/tutorial.html) and the corresponding technical reports (https://innovationvalueinitiative.github.io/IVI-RA/model-description/model-description.pdf; https://innovationvalueinitiative.github.io/IVI-NSCLC/model-doc/model-doc.pdf).
Although R is a commonly accepted and frequently used programming language for data analysis and statistical modeling, it has not been the primary software of choice for the development of economic models in the context of health technology assessment. However, R has an active user and developer community, and with the availability of recent packages, such as heemod, hesim, and BCEA, it is increasingly well suited for the development of decision and economic models as well. Some in the health technology assessment community might argue that R is less transparent than Excel, but in our opinion this confuses transparency with software familiarity. Workshops such as those by the Decision Analysis in R for Technologies in Health (DARTH) team, published tutorials [25, 26], and our OSVP models will hopefully make R a more obvious choice for the development of decision models.
While Excel may be adequate in some circumstances, what we set out to do with the OSVP models requires software with capabilities beyond those of Excel. Even if Excel were sufficiently flexible, it would in our opinion still be suboptimal relative to R in terms of transparency, reproducibility, modifiability, and computational efficiency.