Capability instruments in economic evaluations of health-related interventions: a comparative review of the literature

Purpose Given increasing interest in using the capability approach for health economic evaluations and a growing literature, this paper aims to synthesise current information about the characteristics of capability instruments and their application in health economic evaluations. Methods A systematic literature review was conducted to assess studies that contained information on the development, psychometric properties and valuation of capability instruments, or their application in economic evaluations. Results The review identified 98 studies and 14 instruments for inclusion. There is some evidence on the psychometric properties of most instruments. Most papers found moderate-to-high correlation between health and capability measures, ranging between 0.41 and 0.64. ASCOT, ICECAP-A, -O and -SCM instruments have published valuation sets, most frequently developed using best–worst scaling. Thirteen instruments were originally developed in English and one in Portuguese; however, some translations into other languages are available. Ten economic evaluations using capability instruments were identified. The presentation of results shows a lack of consensus regarding the most appropriate way to use capability instruments in economic evaluations, with discussion about capability-adjusted life years (CALYs), years of capability equivalence and the trade-off between maximisation of capability versus sufficient capability. Conclusion There has been increasing interest in applying the capability-based approach in health economic evaluations, but methodological and conceptual issues remain. There is still a need for direct comparison of the different capability instruments and for clear guidance on when and how they should be used in economic evaluations.


Background
Economic evaluations assess whether an intervention provides value for money through the comparative analysis of alternative courses of action in terms of both costs and consequences [1]. The assessment of consequences in economic evaluation requires information about their identification (what), measurement (how much) and valuation (how valuable) [2]. Standard methods of health economic evaluations identify outcomes based on a rather narrow definition of health that aims to express outcomes in Quality-Adjusted Life Years (QALYs). However, there are many interventions, particularly in the areas of mental health, end-of-life care, public health and social care, where the impacts of interventions go beyond this narrow view of health. The contemporary literature (e.g. [3][4][5][6]) recognises the need to move away from the standard methods for assessing effects of interventions and toward incorporating outcomes beyond the QALY framework when producing an economic evaluation that feeds into decision making about resource allocation in health-related interventions. The most promising approach to address this issue is the application of the capability framework, which was introduced by Sen [7] in the early 1980s as an alternative to standard utilitarian welfare economics. The core focus of the capability approach is on what individuals are able to be and do in their lives (i.e. capable of). The application of the capability approach in health economics has gained popularity because it potentially provides a richer evaluative space for the evaluation of interventions [8].
There has been increasing interest in developing instruments for using the capability approach in the measurement and valuation of outcomes for health economic evaluations. Capability instruments have been in the public domain for over a decade and publications have started to shift from methodological issues towards use of the measures within economic evaluations. Some decision-making institutions currently recommend the inclusion of capability measures in economic evaluations in certain contexts. The Zorginstituut in the Netherlands [9] recommends the inclusion of ICEpop CAPability measure for Older people (ICECAP-O) alongside the EuroQol instrument (EQ-5D) for the evaluation of interventions in long-term care, where the relevant outcomes extend beyond health. The most recent methods guideline [10] of the National Institute for Health and Care Excellence (NICE) acknowledges that the intended outcomes of interventions go beyond changes in health status for some decision problems; hence, 'broader, preference weighted measures of outcomes, based on specific instruments, may be more appropriate…' and 'the economic analysis may also consider effects in terms of capability and well-being' (p. 137). The manual specifically recommends the Adult Social Care Outcomes Toolkit (ASCOT) and ICECAP-O instruments.
However, the choice between instruments and their practical application in particular contexts lack a systematic approach. For instance, the ICECAP-O recommended by NICE is targeted at a subgroup of the population (older adults), whilst the ASCOT was specifically developed for the assessment of social care interventions. A recent review of the literature examined current trends in the application of ICECAP-O [11]. The authors found that the ICECAP-O has mainly been included as a secondary economic measure and that the reporting of results is brief, with minimal detail and often no discussion or interpretation. An overview of the psychometric properties of all potential capability instruments and their usefulness for economic evaluations would contribute to providing clear guidance. This could later be used as a reference point for future comparative analysis of policies or interventions. Hence, the main aim of this paper is to synthesise the current evidence about the application of capability instruments in health economic evaluations. This translates into the following objectives: (i) to summarise information about the development, psychometric properties and preference valuation of relevant capability instruments; (ii) to compare the identified capability instruments in terms of their psychometric properties and up-to-date application in health economic evaluations; (iii) to identify applied evaluations that have used the capability-based approach in health economic evaluations and (iv) to pinpoint the challenges and considerations in the application of the capability approach in economic evaluations of health-related interventions.

Identification of relevant studies
The identification of papers was based on two main approaches: a traditional systematic literature search and a comprehensive pearl growing method [12]. The grey literature search in Google Advance either generated an unmanageable number of hits due to the term "capability" being used across a number of disciplines with varying meanings, as well as having generic lay use and interpretation of the term; or there was no addition to the search of other databases when more precise terms were used. As the development and validation of the capability approach in health economics currently appears to be concentrated among a limited group of researchers, as an additional step, websites dedicated to the instruments identified through the systematic search were specifically targeted and reviewed for relevant information.

Systematic literature search
Firstly, we conducted a systematic literature search. Search terms combined expressions for economic evaluation and frequently used terms for the capability approach, including synonyms and names of the best-known instruments in the area of health economics. Search terms are presented in Appendix 1. The selection of databases was based on similar reviews of health measures (PROMs) [6,13] in the area and included Embase, Medline, Web of Science, PsycINFO and Scopus. The literature search was conducted on 1 February 2019 and the review was limited to the last 20 years, the period in which the first publications in this topic area appeared [14]. Relevant systematic literature reviews were searched for further references and their findings were kept for comparison and discussion.

Comprehensive pearl growing method
The term 'capability' produces very broad ranging results when used as a search term due to its wide range of meanings, including lay meanings. The so-called comprehensive pearl growing method [12] is a technique used to ensure all relevant articles are included, particularly where vocabulary issues affect a search strategy. This method is particularly useful in interdisciplinary research and where recent developments are expected in the literature. The process of pearl growing commences with the identification of 'key pearls' (i.e. key studies) that can be identified from within the literature as being compatible with the aim of the review [12]. Once the key pearls have been identified, these are used to generate the 'first wave of pearls', that is, papers that have cited the key pearls within their reference list. The method has been used successfully in a different type of review in the context of capabilities [13]. This second approach was implemented to validate the strategy applied during the systematic search and to identify potential further papers.
Two waves of the pearl growing method were conducted: one focusing on the development of instruments and a second wave related to the application of the instruments. A third wave was deemed unnecessary because the most recent generation of seminal papers identified had been published only recently and had not yet been cited. The results are presented in Table 1. The key pearls used for citation searching in the first wave were the developmental studies of the four most commonly used and reported capability instruments: ASCOT, ICECAP-O, its version for adults (ICECAP-A) and the Oxford CAPabilities questionnaire-Mental Health (OxCAP-MH). The second wave relied on the three main papers from the last 5 years (but already with some relevant citations) that aimed to identify recent developments and up-to-date knowledge in the application of the capability approach in health economic evaluations. The number of citations was retrieved from Scopus on 14 March 2019.
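The citation-searching step described above can be sketched in a few lines of Python. This is only an illustrative sketch: the paper identifiers, the toy citation index and the function name `citation_wave` are hypothetical and do not come from the review, which retrieved citations from Scopus.

```python
def citation_wave(key_pearls, cited_by):
    """Return every paper that cites at least one key pearl.

    key_pearls: iterable of paper ids used as the starting set.
    cited_by:   dict mapping a paper id to the ids of papers citing it.
    """
    found = set()
    for pearl in key_pearls:
        found.update(cited_by.get(pearl, ()))
    # A key pearl citing another key pearl is not a new finding.
    return found - set(key_pearls)


# Hypothetical toy citation index (paper id -> ids of citing papers).
cited_by = {
    "dev_A": ["p1", "p2", "app_X"],
    "dev_B": ["p2", "p3"],
    "app_X": ["p4"],
}
# Hypothetical set of papers already found by the systematic search.
systematic_hits = {"p1", "p2", "p3", "app_X"}

wave1 = citation_wave(["dev_A", "dev_B"], cited_by)  # development papers
wave2 = citation_wave(["app_X"], cited_by)           # application papers

# Citations that the systematic search did not capture; these would
# still need screening against the inclusion criteria.
beyond_search = (wave1 | wave2) - systematic_hits
print(sorted(beyond_search))
```

Comparing the union of the waves against the systematic search hits mirrors the validation purpose of the second approach: any remaining identifiers are candidates for screening, not automatic inclusions.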

Study selection
Titles and abstracts were sifted by two researchers (TH and AL) and studies were included for further assessment if they met the following inclusion criteria: (1) Full paper available in English or German languages. (2) Scope of study is the area of health or health-related interventions, including any interventions specifically targeting the promotion of health and prevention and treatment of ill-health irrespective of the sector where these were implemented. Hence, our study also included potentially relevant studies from the social care and public health sectors. (3) Focus of research is the evaluation or assessment of the outcomes of interventions using the capability approach. (4) Paper includes information on the use (or recommended use) of the capability approach in economic evaluations. (5) Paper is an applied evaluation OR focuses on the development, psychometric validation (or comparison to other tools) or preference valuation of instruments.
The full paper was retrieved if a study met the inclusion criteria based on its title and abstract. Full papers were then assessed by two researchers (TH and AL) for inclusion based on their contribution to at least one of the aims of this literature review and subsequently allocated to the categories of either (i) applied evaluations (using a capability instrument in a completed economic evaluation) or (ii, iii, iv, v) methods papers. Methods papers were further categorised based on their relevance to the identification, measurement and valuation of outcomes, as well as the practical application of tools and theoretical contributions. Papers were grouped into categories of (ii) instrument development, (iii) psychometric validation or quantitative comparison of instruments, (iv) preference valuation of instruments and (v) methods for incorporation of the capability approach in economic evaluations. The latter category includes potential fields of application, approaches to using the results and incorporation of the results into a potential framework, for instance, Capability-Adjusted Life Years (CALYs), years of full capability or years of sufficient capability equivalence. Some of the studies with significant theoretical contributions to the application of the capability approach in health economic evaluations which did not fit the above criteria were noted for discussion. No specific quality assessment was applied; all studies which provided information on either the psychometric properties or use of capability instruments in economic evaluations were included.
The instruments were assessed based on their psychometric properties according to the COSMIN checklist [15], feasibility [16], potential for transferability and evidence regarding valuation.

Data extraction and analysis
Separate data extraction forms were created for empirical and psychometric evaluation (and other methods) studies. The search for information on valuation included any kind of preference-based valuation of instruments (or their dimensions/domains) and the existence of value sets. Further information on data extraction is presented in Appendix 2.
Trends in the literature were analysed based on the number of different types of studies published each year. The information elicited from the studies was structured according to the capability instrument in question. Information about economic evaluations, and the psychometric properties and correlation coefficients from studies comparing instruments are presented in review tables. Due to the variability of methods used in the validation and comparison studies, only narrative synthesis, including tabulation and frequency analyses, was conducted as no statistical pooling was possible. The information gathered was synthesised in a qualitative rather than quantitative manner by TH.

Search results
The literature search identified 98 studies for inclusion (Appendix 4 provides a complete list). The pearl growing method identified 29 citations beyond those captured by the systematic search strategy. However, none of the additional references met the inclusion criteria, and the papers included in this review were actually all picked up by the systematic search. An overview of the literature search based on the PRISMA statement is presented in Fig. 1.
The increasing number of relevant publications in recent years is a clear trend (shown in Fig. 2). A further trend also appears to be a shift from developmental studies towards the

Development of instruments
The literature review identified 14 capability instruments. Table 2 shows the heterogeneity of the capability instruments in terms of development methods, disease areas, types of interventions, population groups and the questionnaire structure.

Availability of evidence on the characteristics of capability instruments
As Table 3 demonstrates, there is at least some evidence about the psychometric properties of most instruments.
The most recently developed instruments, unsurprisingly, have less information available about their reliability, validity and responsiveness; an exception is OCAP-18 which was among the first capability instruments to be developed, but for which there is no further psychometric evidence available. The main difference across different groups of capability instruments is whether valuations that reflect the preferences of patients or the general public are available. The ASCOT and most ICECAP instruments have reported valuation studies and are therefore considered to possess evidence regarding their ability to reflect values of informants, whilst this is currently missing, for instance, for OxCAP-MH.

Different language versions of instruments
Apart from ACQ-CMH-104, all instruments were originally developed in English. The ASCOT, ICECAP-A, ICECAP-O and OxCAP-MH instruments have been translated into further languages, and these new versions have been validated (Table 4).

Validity
There were 25 studies among the included papers that used Pearson's or Spearman rank correlation coefficients to quantitatively assess the validity of all language versions of the capability instruments and/or compare them to other instruments. Quantitative evidence was provided on the validity of six capability instruments, including ACQ-CMH-104, ASCOT, ICECAP-A, ICECAP-O, OxCAP-MH and Women's Capabilities Index. Table 5 (and Appendix 5) summarise the correlations.
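For readers less familiar with the two coefficients, the following minimal Python sketch shows how Pearson's and Spearman's rank correlation could be computed for paired scores from a capability instrument and an HRQoL instrument. The function names and the example scores are hypothetical illustrations, not data from the included studies.

```python
import math


def pearson(x, y):
    """Pearson's product-moment correlation of two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical paired index scores for six respondents.
capability_scores = [0.55, 0.70, 0.62, 0.90, 0.81, 0.47]
hrqol_scores = [0.60, 0.66, 0.71, 0.85, 0.92, 0.52]

print(round(pearson(capability_scores, hrqol_scores), 2))
print(round(spearman(capability_scores, hrqol_scores), 2))
```

Spearman's coefficient depends only on the ordering of responses, which makes it the more cautious choice when index scores are skewed or show ceiling effects.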
There is variation between studies in the correlation measures used, the instruments compared, the characteristics of the population, number of informants, testing of hypotheses generated regarding likely associations between the data and testing across known groups for discriminant and convergent validity. Hence, it is difficult to provide general statements about the comparison of capability instruments with other PROMs, or to conduct statistical pooling of the results. High correlation estimates (above 0.8) were found between capability instruments: ASCOT/ICECAP-O [49] and ICECAP-A/AQoL-8D [20]. The examined studies provided very diverse estimates for the correlations between Health-related Quality of Life (HRQoL) and the different capability instruments. Most studies compared the ASCOT, ICECAP-A and ICECAP-O instruments with either disease-specific or generic HRQoL instruments. A wide range of disease-specific instruments were applied across studies, mainly being used when informants consisted of patients and social care recipients. EQ-5D-3L/-5L was used in 92% (n = 23) of the included validation and comparison studies as a HRQoL measure. In most cases, the 5L version of the EQ-5D instruments provided higher correlation coefficients compared to the 3L version. The higher correlation with capability instruments could be explained by lower ceiling effects and higher sensitivity to minor changes in the 5L version compared to the 3L version.
There seems to be a consensus in the literature that the capability approach provides complementary information to HRQoL measures. However, capability instruments could also be perceived as enhanced rather than complementary to the narrow interpretation of well-being/quality of life when focusing only on HRQoL. Most studies [25][26][27] found that the ICECAP and EQ-5D instruments provide complementary information, and mapping between them is not recommended. Engel et al. [24] found that the ICECAP-A provides evidence above that gathered from most commonly used preference-based HRQoL instruments. Similar findings were reported for other capability instruments. Forder and Caiels [68] found that ASCOT has greater validity in measuring the effects of social care services than EQ-5D. Van Leeuwen et al. [28] investigated the validity of ICECAP-O and ASCOT among Dutch older adults. Although it could be attributable to cultural transferability issues, they found that respondents did not feel that these instruments give a comprehensive picture of their HRQoL because they did not find all domains of the instruments relevant, whilst other important domains were not covered, particularly concerns or delight about the well-being of family members. HRQoL instruments capture an important part of broader well-being, and some studies [22,23] established a strong and positive association between capability and HRQoL instruments, which questions whether they focus on complementary constructs. Evidence suggests that some capability instruments could rather be interpreted as an enhancement of the HRQoL concept; for instance, an exploratory factor analysis [17] found that all EQ-5D-5L items and seven OxCAP-MH items loaded on one factor and nine remaining OxCAP-MH items loaded on a separate factor.
It is questionable whether the issues discussed above relate to all HRQoL measures or only the EQ-5D Utility instrument. Lower correlation between the OxCAP-MH and EQ-5D Utility scores was observed in the Vergunst et al. [19] study than between OxCAP-MH and EQ-5D-VAS. This could be explained by the fact that the latter reflects the patient's overall judgement about their health status rather than focusing only on five dimensions of their health, which is arguably more in line with the underlying broader well-being concept and the non-preference-based index score used for the OxCAP-MH instrument.

Interpretability
In terms of ease of understanding, Bailey et al. [29] investigated the appropriateness of ICECAP-SCM to measure QoL and found that the capability instrument appeared more meaningful, easier to complete and had fewer errors among patients and close persons, compared to EQ-5D-5L. However, these results did not apply to healthcare professionals, who preferred the EQ-5D-5L over ICECAP-SCM when measuring clinician-rated health states because it focused on observable attributes. Similar studies have also demonstrated the feasibility of use of other ICECAP measures [81,90]. Malley et al. [70] and Towers et al. [67] demonstrated the feasibility of using ASCOT among older people and care home residents; however, the study also highlighted the need for proxy respondents in some situations. This later led to the development of a proxy version of the ASCOT, which demonstrated good feasibility [58]. Davis et al. [30] reported that the level of agreement between patient and proxy for the EQ-5D-3L was significantly better than the level of agreement observed for the ICECAP-O in patients with vascular cognitive impairment. The authors conclude that due to its complexity, the ICECAP-O may have limited clinical, research and policy-related utility among individuals with mild cognitive impairment. However, these results need to be interpreted carefully due to the differing number of levels and the greater ability of proxies to observe the dimensions in EQ-5D. Although it could be explained by translational issues, van Leeuwen et al. [28] also reported difficulties with understanding the ASCOT and ICECAP-O in a study assessing a small number (n = 10) of Dutch, community-dwelling frail older adults. Simon et al. [39] explored the feasibility of OxCAP-MH among severely ill mental health service users.
Patients provided positive feedback and felt that the questions allowed them to express their views and experience on topics they considered important but which were often left out of clinical or research interviews [39].

Responsiveness
The sensitivity of the capability instruments to measure changes is generally reported to be higher than that of HRQoL measures [6,17,31,32,33,34]. However, some authors found capability instruments to be less responsive than HRQoL measures. Davis et al. [35] and Couzner et al. [36] reported that the difference in values between the patient and general population groups was far more pronounced for the EQ-5D-3L than for the ICECAP-O. There is a consensus in the literature that changes related to the broader meaning of health are better captured by the capability instruments than by EQ-5D [37][38][39]. Coast et al. [40] found strong evidence of association of general health with all capability attributes except for the attachment domain of ICECAP-A. Laszewska et al. [17] found that the OxCAP-MH may be seen as enhanced rather than complementary in its concept, when compared to EQ-5D-5L.

Valuation of instruments
Of the 14 reviewed capability instruments, only four have a published valuation set. These used the best-worst scaling method, most often relying on the MaxDiff model. Informants mainly came from the general public. There is no published evidence available for the valuation of the remaining ten capability questionnaires (Table 6).

Applied economic evaluations and potential methods to incorporate the capability approach
Ten applied evaluations were identified in this review that have used a capability-based instrument as a secondary outcome measure in health economic evaluations. No economic evaluation was found where a capability instrument was used as a primary measure of health outcomes. The information extracted from the applied evaluations is presented in Table 7 and in Appendix 6.
The number of economic evaluations reporting the use of a capability instrument has increased in recent years and further increases can be expected given that this search identified a number of recent study protocols (e.g. [41,42,114]). Four further studies were identified that specifically addressed the issues and discussed considerations when incorporating the capability approach into health-related economic evaluations.
A recent review [13] focused on using the capability approach in health research, not limited to economic evaluations. It identified four distinct common areas of application including: (1) physical activity and diet; (2) patient empowerment; (3) multidimensional poverty and (4) assessments of health and social care interventions. The authors also noted that there is a noticeable non-reliance on health status as a sole indicator of capability in health, and differences were found across studies in approaches to applying mixed methods, selecting capability dimensions and weighting capabilities. The current review identified applied economic evaluations from areas with widely accepted issues related to outcomes beyond the QALYs framework, e.g. mental health, visual impairment, chronic diseases and health decline in older people.
The presentation of results in the included economic evaluations demonstrates that there is a lack of consensus regarding the most appropriate way to use capability instruments in economic evaluations. Some authors present cost and outcome data separately and conduct a cost-consequence analysis [42][43][44][45], whilst others reported the results following the idea behind the incremental cost-effectiveness ratio (ICER) [31,46]. This lack of consensus about the use of capability instruments in decision making relates to the different approaches taken by different research groups to valuation, which means that in practice these measures are not comparable along the lines of a QALY. The idea of CALYs has been proposed by Mansdotter et al. [47], who highlight the following issues. First, it is questionable which capabilities are able to explain differences in well-being and are sensitive to public policies in high-income countries. Second, questions of the relevant instruments should capture voluntary and involuntary positions because an applied conceptualization of the capability approach includes opportunity as well as achievement. Third, methods for weighting capability and threshold values should be established, similar to QALYs. Finally, a trade-off should be made between the maximisation of capability and equity.
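As a reminder of the arithmetic behind ICER-style reporting, the sketch below computes an incremental cost per unit of capability outcome. It is a generic illustration of the standard ratio; the function name, the choice of years of full capability as the effect unit and all figures are hypothetical and are not taken from the included evaluations.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost per additional unit of effect.

    Costs are in any single currency; effects are in a capability-based
    unit assumed here to be years of full capability.
    """
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when effects are equal")
    return delta_cost / delta_effect


# Hypothetical example: the new intervention costs 2000 more per person
# and yields 0.25 additional years of full capability.
ratio = icer(cost_new=12000, cost_old=10000, effect_new=1.5, effect_old=1.25)
print(ratio)  # cost per additional year of full capability
```

A ratio of this kind supports decisions only when a threshold value for the capability outcome exists, which is one of the open issues listed above. Negative ratios also need care, because they arise both when an intervention dominates its comparator and when it is dominated.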
Mitchell et al. [48] proposed the concept of years of sufficient capability, which is more closely aligned to the theory underpinning the capability approach because it has a greater focus on those in capability poverty. The process of defining a threshold for sufficient capability should be based on generating a sufficient capability score and using these scores to produce a capability outcome over time [48]. The use of ICECAP-A in the economic evaluations included in this literature review seems to focus on the choice between the options of years of full capability vs. years of sufficient capability equivalent [48].
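The difference between the two outcomes can be illustrated with a short Python sketch. It rests on assumptions that are not stated in this review: capability index scores lie on a 0 to 1 scale, a score at or above the threshold counts as fully sufficient, scores below it are rescaled in proportion to the threshold, and the outcome over time is the trapezoidal area under the score profile. The function names, threshold and scores are hypothetical.

```python
def sufficient_score(score, threshold):
    """Rescale a 0-1 capability score so the threshold counts as 1.

    Assumption for illustration: scores at or above the threshold are
    capped at 1, scores below it are divided by the threshold.
    """
    if not 0 < threshold <= 1:
        raise ValueError("threshold must lie in (0, 1]")
    return min(score / threshold, 1.0)


def years_under_curve(times, scores):
    """Trapezoidal area under a score-over-time profile (times in years)."""
    if len(times) != len(scores) or len(times) < 2:
        raise ValueError("need matching lists with at least two points")
    total = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        total += width * (scores[i] + scores[i - 1]) / 2
    return total


# Hypothetical follow-up at baseline, 6 months and 12 months.
times = [0.0, 0.5, 1.0]
scores = [0.6, 0.8, 0.9]
threshold = 0.8  # hypothetical sufficiency threshold

years_full = years_under_curve(times, scores)
years_sufficient = years_under_curve(
    times, [sufficient_score(s, threshold) for s in scores]
)
print(round(years_full, 4), round(years_sufficient, 4))
```

Under these assumptions, gains above the threshold add nothing to years of sufficient capability, so the measure gives priority to improvements among people below the threshold.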
The current state of the art identified in the reported economic evaluations applying the capability approach to their assessment is in line with the previously identified main challenges [50], including the need to research what the value of a capability improvement is, how to use the instruments globally, and how to compare the sensitivity of each measure to different patient groups and conditions. Only one study [49] was identified that offered a critique of using the capability approach in health economic evaluations. The authors claim that the method used in the questionnaires to measure capability will result in a capability set that is an inaccurate description of the individual's true capability set. The measured capability set will either represent only one combination and ignore the value of choice in the capability set, or represent one combination that is not actually achievable by the individual. In addition, existing methods of valuing capability may be inadequate because they do not consider that capability is a set. (Although the Oxford instruments were developed based on Nussbaum's 10 basic human capabilities.) Hence, it may be practically more feasible to measure and value capability approximately rather than directly. Nevertheless, the argument is based on the questionable assumption that all capabilities have to be traded against other capabilities.

Discussion
This systematic literature review about capability instruments in economic evaluations of health-related interventions included 98 articles and identified 14 capability-based instruments. It provides a unique, comprehensive synthesis of the relevant evidence by focusing on the full spectrum of potentially available capability measures and summarising the practical and theoretical aspects of use of these instruments in economic evaluations. Most identified information related to the ASCOT, ICECAP-A, ICECAP-O and OxCAP-MH instruments. The development of capability instruments relies on methods similar to those applied in the case of HRQoL measures. Capability instruments were often compared to EQ-5D, but less often to each other. Possible reasons for this are that some instruments are population or disease-specific, and that the inclusion of two instruments measuring the same concept in an applied evaluation study is assumed to unnecessarily increase participants' completion burden. In general, the information identified in the literature regarding the comparison of capability measures with other instruments could not be used for a pooled analysis. This is mainly due to the vast variation in the correlation measures used, the instruments compared, the characteristics of the populations and the number of informants. Despite the diverse quantitative estimates for the correlations with EQ-5D, the different capability instruments and the limited available data, this review confirms that capability measures capture a wider range of outcomes than the EQ-5D and may be more responsive when an intervention is likely to have broad impacts on HRQoL. Following the guidelines [51] to evaluate the strength of correlations, this generally observed moderate-to-high correlation suggests that EQ-5D and capability instruments measure somewhat similar, yet complementary concepts.
However, there are competing statements in the literature regarding the association between capability and HRQoL instruments. Most authors argue that these measures complement each other; however, some studies suggest that capability instruments could be perceived as enhancements of the HRQoL concept. It is possible that this relationship depends on the choice of both capability and health instruments used in these comparisons. For instance, the OxCAP-MH has a relatively high number of items, which potentially capture a broader range of capability concepts than measures such as the ICECAP measures. Similarly, the EQ-5D measure of health has a narrower focus than other health measures such as measures based on SF-36 or the AQoL. The higher correlations between capability instruments and the EQ-5D-VAS scores than those observed between capability instruments and the EQ-5D utility scores suggest that respondents' overall judgement of their health status on a VAS better reflects the broader quality-of-life concepts present in the capability approach than specific scores for a limited number of HRQoL dimensions. Moreover, the differences in correlations found between measures may be due to differences in the populations studied. Hence, further research could explore which population subgroups and disease areas could benefit from the inclusion of certain capability instruments in economic evaluations.
Three of the identified 14 capability instruments were used in applied economic evaluations of interventions in the health and social care field; however, only as secondary outcome measures. Eight of the ten identified applied economic evaluations were conducted in the United Kingdom. This may be because the measures were developed in the UK and were available only in English for some years. From the perspective of (health) economists concerned with economic evaluations, a good outcome measure should possess three main characteristics [2]. First, it should be comparable across diseases and interventions so that results can be interpreted comparatively for resource allocation purposes. The capability instruments identified in this literature review were developed for specific population groups; hence, a comparison is currently challenging without a standard application of, for instance, the CALYs framework. Second, the instruments should have a scale with interval properties. All instruments provide a summary score; however, only a few are anchored and therefore have interval properties. The ICECAP scores are anchored on no capability and full capability, and the ASCOT scales are anchored on death and full capability. Finally, most economists are looking for an outcome measure for economic evaluation that reflects preferences, either of individual patients or of the general public. Instruments with tariffs derived from the general population (ASCOT, ICECAP-A and ICECAP-SCM) or the relevant subpopulation (ICECAP-O) possess this characteristic. On the other hand, reducing capability information to a single, preference-based index value on a scale of 0-1 may limit the actionable policy relevance of the information [39]. The two approaches, however, are not mutually exclusive, and more research is needed on the relative values of different capabilities and how they vary with population characteristics (e.g. age, disease experience, culture).
More information about the weights people allocate to the attributes and levels of capability instruments would be needed to improve our understanding of the relative value of individual capability domains and dimensions.
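To illustrate why an anchored index with interval properties matters for economic evaluation, the following is a minimal sketch of how anchored capability index scores (0 = no capability, 1 = full capability) could be aggregated over time into capability-adjusted life years. It assumes an area-under-the-curve calculation analogous to the one commonly used for QALYs; this is an assumption for illustration, not a method prescribed by the reviewed studies. The function name and the example values are hypothetical.

```python
from typing import Sequence


def capability_adjusted_life_years(
    times_years: Sequence[float],
    index_scores: Sequence[float],
) -> float:
    """Area under the capability index curve (trapezoidal rule).

    Assumption: index scores are anchored on a 0-1 scale and change
    linearly between measurement points. Hypothetical helper, not an
    established library function.
    """
    if len(times_years) != len(index_scores):
        raise ValueError("times and scores must have the same length")
    if len(times_years) < 2:
        raise ValueError("at least two measurement points are required")
    if any(not 0.0 <= s <= 1.0 for s in index_scores):
        raise ValueError("index scores must lie on the anchored 0-1 scale")

    points = list(zip(times_years, index_scores))
    total = 0.0
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        if t1 <= t0:
            raise ValueError("measurement times must be strictly increasing")
        # Trapezoid: interval length times the mean score over the interval.
        total += (t1 - t0) * (s0 + s1) / 2.0
    return total


# Hypothetical example: baseline, 6-month and 12-month index scores.
example = capability_adjusted_life_years([0.0, 0.5, 1.0], [0.6, 0.8, 0.8])
```

Such an aggregation is only meaningful when the scores are anchored and have interval properties; for unanchored summary scores, the resulting area has no clear interpretation.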
A major limitation of this study design is that the search was limited to publications in English and German. In addition, this review only assessed instruments and studies reported in the literature, and a thorough grey literature search could not be conducted due to difficulties with the search term capability. In terms of grey literature, only dedicated websites of capability instruments were reviewed for relevant information. This resulted in some gaps; for instance, some cost-effectiveness components of studies that have used ASCOT have not been written up as journal articles and therefore fell outside the findings of this review [118,119]. Furthermore, ongoing research and developments could not be included, which could be important in such a rapidly developing area. For example, we found information about ongoing economic evaluations [41,42,114] using the identified instruments whose results are expected to be published soon; additional capability instruments might also have been used in unpublished economic evaluations, and some are currently under development. There is a potential need to update this literature review in the future to gather information from this rapidly growing body of literature about the potential development of additional capability measures, the further validation of existing ones, the empirical use of capability measures in economic evaluations, and the lessons learned from these applications.

Conclusion
There has been increasing interest in the application of the capability-based approach in economic evaluations of health-related interventions. Different instruments are available, and the choice between them should be based on both the research question and the characteristics of the instruments. Further research should focus on comparing the existing capability instruments and on examining the correlations across capability measures. This would help future researchers choose the most suitable capability instrument for their study and provide further information for instrument developers.
Author contributions TH and JS conceived of the presented idea and developed the conceptual framework of this research. JS provided the resources to this study. TH and AL conducted the systematic literature search and sifting of abstracts and titles. TH took the lead in writing the manuscript in close consultation with JC, AL, TS and JS. All authors provided critical feedback and helped shape the research, analysis and manuscript.

Compliance with ethical standards
Conflict of interest JC has led the development of the ICECAP measures. JS has led the development of the OxCAP-MH measure. The remaining authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/