Key Points for Decision Makers

Evidence on the impact of health insurance in sub-Saharan Africa is derived primarily from observational studies, i.e. studies that cannot discern causal relationships, but only highlight an association between the outcome of interest and insurance exposure.

Only 7% of all studies reviewed employed qualitative or mixed methods, suggesting that the field of impact evaluation is still largely dominated by a positivist epistemology reflected in a purely quantitative tradition.

As the number of experimental and quasi-experimental studies has increased in recent years, we can expect a substantial expansion and improvement of the evidence base on the impacts of health insurance in sub-Saharan Africa.

1 Background

Over the last few years and across sub-Saharan African countries, the push towards Universal Health Coverage (UHC) has led to an increasing number of reforms being implemented in the health-financing sector [1]. Health-financing reforms have affected the collection, pooling and purchasing functions in an attempt to increase revenue generation and cross-subsidization at the population level while at the same time increasing efficiency in resource allocation [2, 3]. The range of reforms has been wide, from targeted subsidies to user fee removal, from health insurance to results-based financing. These reforms have been the subject of a large number of evaluations, focused both on implementation processes and impacts, in an attempt to synthesize evidence and increase cross-learning across settings and countries. Of particular interest is the growing number of reviews, which condense evidence on the impact produced and the challenges met during the implementation of single specific interventions [4,5,6,7,8].

These reviews consistently point to methodological weaknesses in the designs selected and the methods used to evaluate the abovementioned health-financing reforms. This observation represented the starting point for our work. We conducted a scoping review of the designs and methods used to evaluate health insurance reforms in sub-Saharan Africa (SSA). Our objective was to generate knowledge on methodological applications and challenges and reflect on what could be done to improve the quality of the currently available evidence base. In order to limit the scope of the review to a manageable size, we focused specifically on impact evaluations related to health insurance. Our choice was motivated by the central role that insurance plays in securing a sustainable path to UHC. In addition, since insurance is a health-financing strategy with a long history across sub-Saharan Africa, we were able to cover a long time span and a wide geographical spread, observing changes in study design and methods over time and across settings.

2 Methods

We conducted our scoping review by applying the classic six-step framework developed by Arksey and O’Malley [9]. We selected the scoping review methodology above the systematic review approach because our objective was neither to condense evidence on a given topic nor to appraise the strength of such evidence, but rather to provide a broad overview of all methods being used to evaluate health insurance reforms.

2.1 Research Question

Our work was guided by the following research question: What designs and methods have been used in studies evaluating all forms of health insurance reforms in SSA?

Our specific objectives were: (1) to review methods and designs used in studies evaluating the impact of health insurance reforms in SSA; (2) to describe the contexts of these evaluations; (3) to map their evolution over time and across geographical zones; (4) to review the methodological challenges faced during the evaluations, as reported by the authors themselves.

2.2 Search Strategy

We conducted a systematic literature search using four scientific electronic databases (PubMed, Embase, Global Health and EconLit), two specific francophone databases (Cairn and BDSP), and grey literature (OpenGrey, J-Pal, CERDI, WHOLIS, Abt partnership, GiZ, World Bank). Reference lists of included articles were also screened to find potential additional relevant articles.

Our final search strategy was validated by a librarian at the University of Montréal and was performed in all databases on 2 November 2017. It consisted of the following combinations of key concepts “Insurance” AND “Evaluation” AND “Sub-Saharan Africa”. We included all possible associated keywords with each key concept and appropriate descriptors for each database as shown in Supplementary Material 1.

2.3 Selection of Relevant Studies

The screening of articles occurred in two phases: (1) a selection based on titles and abstracts only, with references lacking an abstract automatically included for full-text screening; and (2) a selection based on full texts. We piloted the abstract-based selection strategy on 50 random references to validate the selection criteria [10]. One reviewer (SD) screened all the articles, and in case of uncertainty, the reference was discussed with the other investigators (MDA and VR) until consensus on inclusion or exclusion could be reached.

The inclusion criteria were: (1) presented an impact evaluation of any form of health insurance; (2) occurred in SSA; (3) was published between January 1980 and November 2017; (4) was written in English or French.

The exclusion criteria were: (1) focused on determinants of enrolment; (2) focused on the scheme performance (sustainability, financial viability, and/or cost efficiency); (3) presented a process evaluation (including quality of service delivery assessments); (4) did not present a detailed methods section (i.e. lack of elements on study design, data collection and data analysis); (5) presented an evaluation of a health reform other than health insurance; (6) was a feasibility/prevision/projection study to evaluate future health insurance reforms. We excluded items for which we could not retrieve full text as well as non-original research (e.g. reviews, comments, editorials). However, we did screen their references for potentially relevant original studies.

Inclusion and exclusion criteria were meant to ensure that we kept our focus on the designs and methods used to evaluate the impact of health insurance in sub-Saharan Africa. On the one hand, our inclusion criteria were rather broad to ensure that we would not miss any relevant impact evaluation. On the other hand, our exclusion criteria were set to screen out articles that focused on health insurance but were not impact evaluations.

We defined an impact evaluation study as any study that explicitly addressed the impact or effect of a health insurance scheme on any measure related to health-service utilization, health-service delivery (including quality of care), health status and financial protection. We also included as impact evaluations studies conducted in clinical settings to assess whether health-service delivery and health status differed between insured and non-insured people and/or periods.

2.4 Charting the Data

The authors engaged in a close iterative process to agree on the final validation of the data-extraction form. One reviewer (SD) extracted data from three studies (as a pilot round) and the principal investigator (MDA) revised the data extraction of the same three studies to further improve data extraction categories in alignment with our specific study objectives. Once the grid was validated, a study assistant (CW) was also involved in the data extraction after appropriate training with the authors (SD and MDA). The completed data extraction grid, provided in Supplementary Material 2, contained the following main categories: general information, insurance reform data, data collection, study design and data analysis, methodological considerations.

Our study design classification aimed at differentiating studies according to their ability to accurately identify causal relationships through minimization of potential threats to internal validity (such as the inability to control for bias due to observable and unobservable covariates). In line with existing literature [11, 12], we defined as observational any study that used cross-sectional, repeated cross-sectional, and/or longitudinal data, but relied only on descriptive statistics and/or simple modelling techniques (such as simple regression models not adjusting for selection bias). While observational studies may provide important information on existing associations between insurance exposure and outcomes of interest, they can make no causal claims, since they do not adequately control for possible sources of bias in the estimation of these associations. As such, observational studies are generally not a preferred option in impact assessments.

At the opposite end, we classified as experimental any study that relied on a randomized allocation of insurance exposure [12]. Studies that rely on randomization are the ones with the fewest threats to internal validity, since randomization minimizes the probability that the effect observed on the outcome of interest is confounded by observable and unobservable covariates. Standing between observational and experimental studies, we defined as quasi-experimental any study that applied statistical techniques capable of approximating an experiment at the analytical level [12]. These included: difference-in-differences approaches, instrumental variable models, fixed-effects models, and propensity score matching. By controlling for potential bias due to observable and/or unobservable covariates, these studies attempt to accurately estimate the relationship between insurance exposure and outcome by reducing potential threats to internal validity, even in the absence of randomization.
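The classification rule described above can be summarized as a small decision function. The following is a minimal sketch under stated assumptions, not the procedure actually implemented during data extraction: the function name `classify_design`, its arguments and the method labels are hypothetical and introduced purely for illustration.

```python
# Hypothetical labels for the quasi-experimental techniques listed above.
QUASI_EXPERIMENTAL_METHODS = {
    "difference_in_differences",
    "instrumental_variables",
    "fixed_effects",
    "propensity_score_matching",
}


def classify_design(randomized_allocation, methods):
    """Return the design category for one study.

    randomized_allocation: True if insurance exposure was randomly
        allocated and the analysis exploited that randomization.
    methods: iterable of analytical-method labels used in the study.
    """
    if randomized_allocation:
        return "experimental"
    if QUASI_EXPERIMENTAL_METHODS & set(methods):
        return "quasi-experimental"
    # Descriptive statistics or simple regression models that do not
    # adjust for selection bias.
    return "observational"


print(classify_design(True, ["regression"]))
print(classify_design(False, ["propensity_score_matching"]))
print(classify_design(False, ["descriptive_statistics"]))
```

In this sketch, randomization takes precedence over the analytical technique, mirroring the ordering of the definitions given above.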

Regarding the outcomes, we divided them in four broad categories:

  • Services use: including healthcare facility utilization; maternal care utilization, including facility-based delivery or skilled-birth attendance at delivery, antenatal and post-natal visits; under 5-year-old child immunizations or health-service utilization in case of fever, diarrhoea or cough; health-service utilization for specific vulnerable groups (women of reproductive age, children under the age of 5 years, poor); and, in just one case, use of traditional medicine.

  • Financial protection: including individual and/or household out-of-pocket health expenditures; catastrophic health expenditures; socio-economic differentials after healthcare use such as loans, impoverishment and indebtedness; income variations and methods to obtain cash to pay for care; costs of care.

  • Health outcomes: including all-cause mortality in the general population, age-specific mortality, under 5 year-old mortality, perinatal and neonatal mortality (under 7 or 28 days of life); under 5- or 2-year-old anemia and stunting; changes in health status (including specific measures of hypertension) in adults, for under 5-year-old children, and for paediatric patients; pregnancy rate; and low birth weight.

  • Quality-of-care outcomes: including consumer satisfaction with health services; observed structural or process quality of care; and responsiveness of healthcare system.

2.5 Collating, Summarizing and Reporting Findings

Data extracted were summarized and analysed using simple descriptive statistics such as frequency and, when applicable, mean and median, as well as content analysis. We identified and characterized methodological patterns and then attempted to identify relationships between categories to understand, for instance, whether a given methodological approach was associated with an earlier or later period or with a given country and/or set of authors.
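As an illustration of the kind of simple descriptive statistics referred to above, the sketch below derives frequencies and percentages from the study-design counts reported in Section 3.5. The variable names are illustrative assumptions, and the snippet is not the analysis code used for the review.

```python
# Study-design counts as reported in Section 3.5 (66 studies in total).
design_counts = {
    "observational (quantitative)": 32,
    "quasi-experimental": 27,
    "experimental": 2,
    "mixed methods (observational)": 5,
}

total = sum(design_counts.values())

# Percentage of all included studies, rounded to one decimal place.
percentages = {
    design: round(100 * n / total, 1) for design, n in design_counts.items()
}

for design, n in design_counts.items():
    print(f"{design}: n = {n}; {percentages[design]}%")
```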

2.6 Consultation

This work was disseminated at the “Final Review Workshop of the AERC Collaborative Research Project on Healthcare Financing in Sub-Saharan Africa: Framework Phase” held in Port Louis, Mauritius, 31 May–1 June 2018. The meeting convened by AERC, which funds the present study, offered the opportunity to discuss the present paper and validate its findings with other researchers engaged in Healthcare Financing research.

3 Results

3.1 Process of Article Selection

The electronic search for relevant literature yielded a total of 3041 items, of which 2830 were found in scientific databases and 211 in grey literature. Of these, 194 articles were selected for full-text screening and 66 were finally included in the present review. The detailed selection process is illustrated in Fig. 1.

Fig. 1 PRISMA flow chart

3.2 Geographic Repartition and Time Trends

The largest share of included studies was conducted in Ghana (n = 32; 48.5%) [13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41], followed by Burkina Faso (n = 8; 12.1%) [42,43,44,45,46,47,48,49]; Nigeria [50,51,52,53,54,55] and Rwanda [56,57,58,59,60,61] (n = 6 each; 9.1%); Kenya (n = 4; 6.1%) [62,63,64,65]; and Tanzania (n = 2; 3%) [66, 67]. Ethiopia [68], South Africa [69], Mauritania [70], Mali [71] and Zambia [72] contributed one study each (Fig. 2). Three of the included studies (4.5%) were multi-location studies comparing data across several countries. One study (1.5%) compared Rwanda and Ghana [73], while two studies (3%) compared Senegal, Mali and Ghana [74, 75]. All studies included were published in English (n = 66; 100%).

Fig. 2 Geographical repartition. Black > 10 studies; dark grey 9 to 7 studies; medium grey 6 to 4 studies; light grey 3 or 2 studies; very light grey 1 study

Only ten out of 66 studies (15.2%) were published in or before 2011, while the remaining 56 studies (84.8%) were published between 2012 and 2017 (Fig. 3).

Fig. 3 Time trends of publication of impact evaluation of health insurance reforms

3.3 Authorship and Authors’ Affiliation Analysis

The number of authors involved per paper ranged from single authorship to 13 authors, with a median of four authors per paper. Most papers (n = 45; 68.2%) included authors from at least one African university or research centre, but always exclusively from the country involved in the study (including six papers where two African universities or research centres were involved). Almost one-third of all papers (n = 21; 31.8%) did not include any African university or research centre. Only two papers had an author affiliated with the Ministry of Health of the country involved in the study (3%). Authors from international agencies were involved in six papers (9.1%): United Nations International Children’s Emergency Fund (UNICEF) twice, World Bank twice, Organisation for Economic Co-operation and Development (OECD) once and World Health Organization (WHO) once. First authors were affiliated with an African university or research centre in just over one-third of all studies (n = 24; 36.4%).

3.4 Type of Insurance Being Evaluated and Context

More than half of all studies pertained to evaluations of a National Health Insurance Scheme (NHIS; n = 37; 56.1%) and more than one-quarter (n = 20; 30.3%) pertained to community-based health insurance, reported either as Community-Based Health Insurance (CBHI; n = 12; 18.2%) or as Mutual (or “Mutuelles” in French) Health Organizations (MHO; n = 8; 12.1%). The remaining studies concerned State Insurance (n = 4; 6.1%), private insurance, including one micro-health insurance (n = 2; 3%), and obstetric risk insurance (n = 1; 1.5%). Two studies (3%) pertained to various types of insurance scheme (Fig. 4).

Fig. 4 Type of health insurance evaluated. CBHI community-based health insurance; MHO “mutuelle” health organization (type of CBHI); NHIS national health insurance scheme

The largest group of studies, 31 out of 66 (47%) [13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33, 36,37,38,39,40,41, 76,77,78], focused on the National Health Insurance Scheme in Ghana, implemented since 2003 to replace the “cash and carry” system, which required direct payment of health services at point of use. The NHIS has developed over time, becoming fully operational in 2008. The NHIS is administered at district level, although the funding is centralized and nationally standardized. The impact evaluations we reviewed were carried out at municipal or district level in 11 studies (17%), at regional level in four studies (6%) and at national level in 16 studies (24%). Only one other study carried out in Ghana (1.5%) concerned a different scheme, the district-level CBHI Nkoranza scheme, which was operational from 1992 to 2005, when it was replaced with the NHIS [34].

The eight studies (12% of the total) on Burkina Faso were all impact evaluations of the Nouna CBHI scheme and performed the analysis at Nouna district level [42,43,44,45,46,47,48,49]. The Nouna CBHI was implemented according to a stepped-wedge design phased over 3 years. The Nouna CBHI initiative was developed in collaboration between the Nouna Health District authorities and researchers at the Centre de Recherche en Santé de Nouna and at Heidelberg University (Germany).

All six studies (9% of the total) from Rwanda concerned the “Mutuelles” Health Organizations [56,57,58,59,60,61], which are community-based health insurance schemes first piloted in 1999 and scaled up to the national level starting in 2006. Over 100 MHO schemes were created between 2000 and 2003. In 2008, the government endorsed a legal framework to enable MHO “creation, organisation, functioning and management”, effectively turning these schemes into a compulsory social health protection measure offering coverage country-wide. All six studies were evaluations at national level.

Among the four studies (6% of the total) in Kenya, three evaluated the National Hospital Insurance Fund owned by the Kenyan Government [62, 64, 65], at national level in one study and at district level in two studies, and one study evaluated the impact of a private insurance owned by the Jamii Bora Trust Microfinance Institute (at district level) [63].

In Nigeria, two studies (3% of the total) evaluated the impact of the National Health Insurance Scheme [54, 55], and four studies (6% of the total) evaluated the impact of a public-private partnership providing a State Health Insurance programme (Kwara State) [50,51,52,53], which had started as a community-based health insurance. All the studies performed in Nigeria were at district level.

Both studies from Tanzania presented district-level evaluations, one on the impact of the National Health Insurance Fund [67] and one on the evaluation of the Community Health Fund [66]. The study from Ethiopia [68] evaluated the pilot CBHI promoted by the Ethiopian Government with data from 16 districts located across four main regions. In Zambia, one study [72] evaluated a voluntary health insurance program referred to as “Pre-payment”, including governmental, private and community-based schemes at national level. In South Africa, one study [69] evaluated the impact of membership in one of the many private health insurance schemes in this country.

Finally, three multi-country studies (4.5% of the total) were included in this review. One compared the NHIS in Ghana with the CBHI schemes in Rwanda with an evaluation made at national level [73]. Two studies, one at national and one at district level, examined the impact of various CBHI schemes in Ghana, Mali, and Senegal [74, 75].

3.5 Study Designs

Sixty-one out of 66 studies (92.4%) were quantitative studies while only five (7.6%) were defined as mixed methods. We could not identify any exclusively qualitative study focused specifically on impact. The majority of studies used cross-sectional data measurements (n = 51; 77.3%); five (7.6%) and ten (15.2%) studies relied respectively on repeated cross-sectional data and longitudinal data measurements.

Among the quantitative studies, most applied an observational design (n = 32; 48.5%), followed by a quasi-experimental design (n = 27; 40.9%). Only two studies (3% of the total) applied an experimental design. All five mixed methods studies (7.6% of the total) applied an observational design to their quantitative component. Figure 5 reports on the geographic repartition by study design.

Fig. 5 Designs of included studies by country

When we look at insurance types, half of the 20 studies evaluating CBHI relied on an observational design (n = 10; 50%), followed by a quasi-experimental design (n = 8; 40%). The only two studies using experimental designs (10%) evaluated a CBHI scheme. Replicating a similar pattern, twice as many studies evaluating an NHIS relied on an observational design (n = 25; 67.6%) as on a quasi-experimental design (n = 12; 32.4%). Regarding the other types of insurance, the obstetric risk package in Mauritania was evaluated using a quasi-experimental design; the State Insurance in Nigeria was evaluated exclusively using quasi-experimental designs (n = 4; 100%); private insurance schemes were evaluated in South Africa using a quasi-experimental design and in Kenya using an observational design; the mix of CBHI, NHIS and private insurance in Zambia was evaluated using an observational design.

Regarding temporal trends, we can observe that quasi-experimental studies became more prominent in more recent years with the most remarkable shift occurring in 2016 and 2017 (Fig. 6).

Fig. 6 Temporal trends according to study designs

3.6 Types of Outcomes and Level of Analysis

Hereafter, we present study outcomes as we categorized them at the analysis stage into broad categories.

Health services use: 44 out of 66 studies (66.7%) reported on outcomes referring to service use. Seven out of the 44 studies (15.9%) used a combination of two or more outcomes from this category. In 23 studies (52.3%), the analysis was at individual level while in 16 studies (36.4%) it was at household level. In five studies (11.4%) addressing multiple outcomes, analysis was conducted both at the individual and the household level.

Financial protection: 23 out of 66 studies (34.8%) reported on outcomes referring to financial protection. In three studies, proxy outcomes were used to reflect financial protection of the households: child labour and schooling. Five out of the 23 studies (21.7%) used a combination of two or more outcomes from this category. In four studies (17.4%), the analysis was conducted at individual level, in 17 studies (73.9%) at household level, and in two studies (8.7%), which addressed multiple outcomes, the analysis was conducted both at the individual and the household level.

Health outcomes: 18 out of 66 studies (27.3%) reported on health outcomes. One out of the 18 studies (5.6%) used a combination of two or more outcomes from this category. In 13 studies (72.2%), the analysis was at individual level while in four studies (22.2%) it was at household level. In one study (5.6%) in Burkina Faso, population rates were used at country level as outcome.

Quality-of-care outcomes: only four out of 66 studies (6.1%) reported on quality-of-care outcomes. Two out of the four studies (50%) used a combination of two or more outcomes. In two studies (50%), the analysis was at individual level, while in two studies (50%) it was at household level.

Forty-four out of the 66 studies (66.7%) reported on outcomes from only one of the above categories, while 22 studies (33.3%) used outcomes from at least two of the above categories (health services use, financial protection, health outcomes and quality of care).

When looking at the qualitative component of the mixed-methods studies, we could not extract details on qualitative data collection and analysis processes since neither was described in detail, attention being devoted primarily to quantitative data collection and analysis. Qualitative methods were used to collect information on birth history and experiences with insurance [13, 67]; access and quality of care [14]; complementary information about functioning, problems and successes of one private micro-health insurance scheme [63]; and one single case study to illustrate the impact of financial challenges facing families of children affected by cancer [65].

3.7 Analytical Approach

The only two experimental studies included in this review reported on the experience of the Nouna CBHI scheme. The CBHI scheme was introduced in the Nouna Health District (Burkina Faso) according to a stepped-wedge cluster-randomized trial design stretching over 3 years and starting in 2004. The whole region was divided into 33 clusters and every year 11 additional clusters were offered CBHI. A total of 990 households (30 households per cluster) were included in a panel survey, with interviews taking place at least once a year. Only two studies [42, 43] reporting on the Nouna experience made use of the randomization in the insurance assignment when estimating impacts. All other studies based on the Nouna experience applied a quasi-experimental study design [44,45,46,47,48,49].
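The stepped-wedge rollout described above can be sketched as follows. This is an illustrative simulation only: the random seed, the function name `assign_waves` and the cluster identifiers are hypothetical, and the code does not reproduce the actual allocation used in Nouna.

```python
import random

N_CLUSTERS = 33              # clusters in the Nouna Health District
CLUSTERS_PER_WAVE = 11       # additional clusters offered CBHI each year
HOUSEHOLDS_PER_CLUSTER = 30  # households per cluster in the panel survey


def assign_waves(n_clusters, clusters_per_wave, seed=0):
    """Randomly order clusters and split them into successive rollout waves."""
    rng = random.Random(seed)  # arbitrary seed, for reproducibility only
    clusters = list(range(1, n_clusters + 1))
    rng.shuffle(clusters)
    return [
        sorted(clusters[i:i + clusters_per_wave])
        for i in range(0, n_clusters, clusters_per_wave)
    ]


waves = assign_waves(N_CLUSTERS, CLUSTERS_PER_WAVE)
for year, wave in enumerate(waves, start=1):
    print(f"Year {year}: clusters newly offered CBHI -> {wave}")

panel_households = N_CLUSTERS * HOUSEHOLDS_PER_CLUSTER
print(f"Panel survey households: {panel_households}")
```

Because the order in which clusters gain access is random, clusters not yet offered insurance in a given year can serve as a comparison group for those already offered it.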

Amongst the longitudinal quasi-experimental studies (n = 8 studies; 12% of the total), three studies (37.5%) [50,51,52] relied on difference-in-differences, two (25%) relied on instrumental variables and regression with fixed effects [15, 45], one (12.5%) relied on propensity score matching and regression with fixed effects [68], one (12.5%) relied on multivariate hierarchical analyses [70], and one (12.5%) relied on concentration curves and regression with random effects [44]. Among the three quasi-experimental studies relying on repeated cross-sectional data (5% of the total), one study (33%) relied on a difference-in-differences approach [16]; one on an instrumental variable approach [17]; and one used both propensity score matching and instrumental variables [56]. Finally, among the 16 quasi-experimental studies relying on cross-sectional data (24% of the total), 12 (75%) used propensity score matching [18, 19, 21, 22, 25, 47, 53, 57, 58, 62, 69, 73], four (25%) used instrumental variables [20, 23, 24, 46] and none used difference-in-differences. Table 1 presents a summary of the designs and analytical approaches used in relation to the study outcomes.

Table 1 Descriptive summary of all included studies
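To make the difference-in-differences logic mentioned in this section concrete, the sketch below computes the basic two-group, two-period estimator from group means. All numbers are invented for illustration and do not come from any of the reviewed studies; the function name `did_estimate` is likewise hypothetical.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on group means."""
    change_treated = mean(treated_post) - mean(treated_pre)
    change_control = mean(control_post) - mean(control_pre)
    return change_treated - change_control


# Invented outcome: health-facility visits per person per year.
insured_before = [1.0, 2.0, 3.0]
insured_after = [3.0, 4.0, 5.0]
uninsured_before = [1.0, 1.0, 1.0]
uninsured_after = [2.0, 2.0, 2.0]

effect = did_estimate(insured_before, insured_after,
                      uninsured_before, uninsured_after)
print(effect)
```

Subtracting the change in the comparison group removes time trends common to both groups, under the assumption that the two groups would otherwise have followed parallel trends.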

Five out of 66 studies (8%) performed an explicit equity analysis, reporting outcomes stratified by socio-economic status [33, 44, 53, 68, 72], but only one study also relied on concentration curves to report on the equity impact of health insurance [44].

Only two mixed-methods studies reported on their qualitative analytical approach, in both cases described as inductive open coding [13, 67].

3.8 Methodological Limitations Reported in the Included Studies

In 19 out of 66 studies (28.8%) [14, 16, 18, 23, 24, 28, 34, 40, 42, 44, 46, 47, 53, 57, 60, 66, 68, 71, 72], there was no mention of the study's methodological limitations. Self-selection into voluntary insurance and the subsequent sample bias were discussed in 20 studies [13, 15, 17, 19, 22, 25, 26, 41, 50,51,52, 58, 62, 63, 67, 69, 70, 73, 75, 77], which openly acknowledged the related identification problems. Another major limitation frequently discussed by authors (n = 13 studies) concerned recall bias [13, 21, 25, 26, 30, 31, 33, 35, 38, 39, 48, 73, 76], as information on insurance membership (exposure) and health status and/or health-service use (outcomes of interest) was often collected at different time points, not always allowing a perfect match between the two. Authors of studies relying on secondary data often discussed limitations due to the nature of the secondary data [15, 19, 21, 27, 32, 33, 36, 37, 39, 49, 55, 56, 58, 64, 67, 77], including lack of information on specific covariates to refine model estimation; impossibility of checking the quality and/or accuracy of the data; and, for one study relying on medical charts, large amounts of missing data [65]. Finally, the authors of only eight studies (four quasi-experimental studies [22, 61, 62, 73] and four observational studies [37, 38, 59, 76]) acknowledged the impossibility of establishing a causal link between insurance and outcomes of interest due to the cross-sectional nature of the data being used.

4 Discussion

Our work contributes to the existing literature by looking specifically at the methodology applied to evaluate the impact of health insurance in SSA, and by doing so, complements the evidence emerging from existing reviews focused on synthesizing the content rather than the methods applied in insurance studies [4,5,6,7,8].

The first striking result of our review is the overwhelming majority of quantitative studies, with only a handful of studies looking into the impact of health insurance using mixed methods and no study at all doing so using exclusively qualitative methods. This finding is a clear indication of how, in spite of emerging literature advocating for the application of mixed methods in health policy and systems research [79,80,81,82,83], the field of impact evaluation is still largely dominated by a positivist approach, whereby unraveling the impact of an intervention on an outcome of interest is intrinsically associated with a quantitative approach. This reflects a relatively narrow understanding of causality, almost exclusively focused on quantifying impacts, with little interest in explaining causal pathways to change. The handful of mixed-methods studies included in our review confirm this observation since none of them used qualitative methods to explore causal pathways, but only to report on people’s experiences and views of insurance. On the one hand, this finding is somewhat surprising, considering the wealth of literature that has emerged over the last few years highlighting the role that both methodologies have to play in causal analysis [84,85,86,87]. One needs to consider, however, that our review reaches back to the early impact evaluation literature covering the experience of the first insurance schemes in the continent. It is very possible that the future of the impact literature on insurance will be more inclusive of qualitative and mixed-methods approaches to evaluation. On the other hand, this finding may reflect actual research capacity in the African continent. Scarcity of trained mixed-methods researchers and of adequate funding opportunities for mixed-methods research have been identified before as important barriers to knowledge generation in Africa [88]. In the long run, however, explicitly promoting investments in mixed-methods training and research is likely to reduce, if not completely remove, the imbalance between quantitative and qualitative studies we observed in our review.

The second most striking element that emerged from our review concerns the clustering of studies in a few selected countries, in spite of the fact that health insurance reforms, at least in the form of micro-health insurance schemes, have concerned a much larger number of countries. At the same time, we have already noted how the African institutions involved in the reviewed studies were always located in the same country as the scheme being evaluated. It follows that this clustering of studies in a few selected countries is most likely the joint result of a focus on the most prominent schemes, such as the Ghanaian NHIS, and of the presence of large research infrastructures capable of supporting evaluation efforts, such as in the case of the Nouna CBHI. The impact of many schemes might have simply gone undocumented due to the absence of local research infrastructure. Similarly, it is not surprising that the only two experimental studies were conducted to evaluate the Nouna CBHI, considering that the scheme was purposely set up as a stepped-wedge cluster-randomized community-based trial in an area of the country with enhanced research capacity due to the presence of a Health and Demographic Surveillance System (HDSS) site [89].

Unlike the Nouna CBHI, most insurance schemes that were evaluated were not set up within the framework of research projects aimed at generating scientific evidence on the impact of health insurance. This is likely to be the main factor explaining why the majority of the studies included in our review relied exclusively on observational designs, making best use of whatever data could be acquired in a simple cross-sectional setting. With time, however, even studies relying on cross-sectional data made more extensive use of quasi-experimental approaches, primarily propensity score matching and instrumental variables. This shift is likely linked to the increasing attention that has been paid over the last few years to issues of identification when assessing the impact of health interventions, including health insurance [12, 90, 91]. The results of our review indicate that the progressive shift from observational to quasi-experimental studies marked a better capacity to account for one of the fundamental problems in estimating the impact of health insurance, i.e. selection bias related to self-selection into an insurance scheme. The fact that propensity score matching was the technique most frequently used in studies applying a quasi-experimental approach is not surprising, given that it allows making best use of even cross-sectional data, offering a pragmatic albeit second-best solution to account for selection bias even in settings where limited data are available [92].

The wide range of outcomes reported across the studies is indicative of the breadth of the impacts attributable to insurance. It is interesting that a few studies even explored the non-health impacts of a health intervention by looking into schooling [58] or child labour [16, 57]. Nevertheless, it is not surprising that the majority of studies focused on health-service use and financial protection indicators. Changes in service use and financial protection are not only the most direct consequence of insurance, but also the changes that can most easily be observed shortly after the onset of an intervention. Changes in service provision and, even more so, in health status can only be induced by insurance over a longer period of time and therefore escape the evaluation framework that can be applied to schemes that have emerged mostly over the last decade.

In closing our discussion of the findings, we need to acknowledge a few limitations of our study. First, we cannot exclude the possibility that we did not include all relevant literature. Although we conducted our search as carefully as possible, we might have failed to identify relevant studies, especially if impact measures were embedded within studies that did not primarily report an impact assessment. Similarly, it is possible that grey literature studies produced in prior decades were no longer available at the time we conducted our search and hence could not be included in our review. In addition, it is possible that we did not include studies conducted directly by the insurance implementing agencies themselves. These concerns were raised during peer review; hence, we feel obliged to highlight such potential limitations. We postulate that such studies were either not present in any of the databases we searched or were excluded during extraction due to the lack of a clear description of design and methods, as was unfortunately often the case for grey literature. Moreover, although we included material in both English and French, we could not screen for articles in Portuguese, the third official language in SSA. Second, we need to acknowledge that, in line with the objectives of a scoping review, we did not attempt to judge the quality of the methods being used, but simply to provide a comprehensive description of their use. Nevertheless, we acknowledge the need for further work to assess the extent to which the methodologies applied were suited to answering the research questions set in the individual studies.

5 Conclusions

The findings of our scoping review are in line with prior observations indicating that evidence on the impact of health insurance in SSA is rather weak, since it rests primarily on observational studies, with a striking dominance of quantitative data. Still, we identified an increase in the use of quasi-experimental methodologies in more recent studies, suggesting that we could observe a broadening and deepening of the evidence base on insurance over the next few years. While judging the strength of the evidence generated on specific outcomes is beyond the scope of this review, and has been done before [4,5,6,7,8], we wish to echo our earlier comment and encourage further efforts to enhance knowledge and understanding of research methodologies at the policy level too. This should be seen as an investment in policy making, enabling policy actors to assess autonomously the validity and credibility of the evidence presented to them with the aim of informing policy.