1 Introduction

Given the constraints of limited resources and increasing demand for healthcare, particularly due to higher life expectancy and an aging population, governments are compelled to balance effectiveness, equity, and financial stability in their healthcare strategies (Grassetti and Rizzi 2019). As the growth of public health spending has been widely documented (Baltagi et al. 2017; Vincenzino 1995; White 2007; WHO et al. 2018), the discussion on co-payment healthcare services is unsettled, demonstrated by the existence of arguments both supporting and opposing its implementation or facilitation and has consistently remained on the agenda of governments in recent decades (Atella et al. 2006; Färdow et al. 2019). Co-payment, intended as a financial contribution for healthcare services and goods consumed, works as a lever to enhance consumers’ responsibility in demanding only the optimal quantities of healthcare goods and services. Co-payments are often used to reduce moral hazard in healthcare. Indeed, moral hazard can arise when individuals have insurance coverage that shields them from the full cost of medical services, leading to over-utilization or unnecessary medical expenses.

However, the effectiveness of this approach is subject to debate, as co-payments can also act as a barrier to healthcare access, particularly for individuals with lower income or limited resources. Co-payment can impose a financial burden on low-income individuals, as it may represent a significant portion of their disposable income, making it difficult to afford essential healthcare. Higher co-payments may discourage individuals from seeking care for minor or preventive health issues, such as screenings, leading to delayed diagnosis and treatment. Discouraging individuals from accessing prevention services, co-payments may lead to missed opportunities for early detection and prevention of diseases (Rezayatmand et al. 2013). These are the reasons why many healthcare systems have implemented measures to reduce or eliminate co-payments for certain needy populations. This paper tests whether a co-payment exception for needy populations increases the use of healthcare services. We focus on specialist physician services, which play a crucial role in preventive healthcare, as they enable early detection, provide specialized knowledge and guidance, assess individual risks, can develop personalized treatment plans, and it has the potential of offering valuable health education (Jusot et al. 2012; Lueckmann et al. 2021). Increasing the use of these services for needy populations can significantly improve health outcomes and quality of life (Solanki and Schauffler 1999).

Estimating the effect of co-payment on healthcare utilization is empirically challenging because, to a certain extent, the amount of co-payment depends on individual’s characteristics, e.g., income, that also determine the demand for health services. This paper addresses this issue by exploiting a multidimensional discontinuity in age and income, required in order to benefit from co-payment exemption in the Italian National Health System (NHS). Notably, people aged over 65 and with a household income less than 36,165.98 euro are eligible for co-payment exemption, defined as co-payment exemption for income. Both requirements are not exogenous with respect to health conditions and the ensuing healthcare consumption. It has been largely shown that age is one of the main drivers of healthcare utilization (Oliveira and de la Maisonneuve 2006), and it is well known that individuals’ income is a fundamental factor of individual health (Subramanian and Kawachi 2004; O’Donnell et al. 2007). The (multidimensional) discontinuity in age and income provides us with a natural experiment that allows us to study the phenomenon in a quasi-experimental setting where treatment, i.e., co-payment exemption, is “as if” randomly assigned in proximity of the multidimensional discontinuity. Considering that the discontinuity is determined by two variables, we use a recent methodological development of Regression Discontinuity (RD), the Multiple Regression Discontinuity (MRD). Furthermore, this complex discontinuity offers another advantage in that it allows us to target the estimates on the sub-population of less wealthy people, considered by the policy makers as one of the most needy and, for this reason, worth protecting and therefore studying. In this regard, our paper provides the policy makers and practitioners with useful knowledge not yet available, that may help them to tailor ad-hoc interventions. Our findings provide convincing evidence of a positive and significant effect of co-payment exemption on the utilization of specialist visits.

Empirical evidence for similar frameworks began with the 1974 RAND Health Insurance Experiment that provided the first opportunity to estimate the effects of health insurance and plan characteristics on medical spending (Aron-Dine et al. 2013). Between 1974 and 1981, the RAND experiment randomly assigned health insurance to a representative sample of families in the United States (Newhouse 1974). Results show that more generous insurance coverage leads to increases in healthcare utilization. This finding has been confirmed more recently by the Oregon Health Insurance Experiment (Finkelstein et al. 2012). A substantial number of works based on observational data followed the seminal works analyzing the consequences and effectiveness of co-payment scheme adoption in terms of demand for health services, health outcomes and distributional effects, according to the type of health service (see Kiil and Houlberg (2014), Lin et al. (2020) and the literature therein for a survey).

Even though healthcare co-payment may rationalize the demand for public health services and potentially enhance the efficiency of the public health sector, it can also create or increase inequality of access and utilization among users’ (De Matteis et al. 2019). Thus, the introduction of co-payment may have a negative impact on equality in healthcare, i.e., it is a sort of regressive form of healthcare finance (Atella et al. 2006; Gemmill et al. 2008; Costa et al. 2012; Inoue and Kachi 2017). In these cases, a sort of protection for more needy patients can be adopted by introducing exemptions from healthcare co-payment. Such a policy can be seen as a Social Health Insurance (SHI) targeted to improve health and to provide more vulnerable and needy people with protection against the financial consequences of health shocks. The research question of how a SHI targeted to the most vulnerable and needy people may affect healthcare consumption is now increasingly explored in low and middle income countries (Bernal et al. 2017; Serna 2021), but evidence about the effect of such a policy in an advanced economy is still limited. As there is ample evidence of increasing worldwide within-countries inequality (Arestis et al. 2011; Liberati 2015; Goodhart 2017), it is becoming crucial to understand how healthcare consumption could react to such a policy in a relatively advanced economy. With this aim, we empirically test the relationship between co-payment exemption and the utilization of specialist visits (outpatient services) in Italy. In particular, we focus on the effect of co-payment exemption for income as one of the several forms of co-payment exemptions introduced by the Italian Legislator to guarantee healthcare services to needy and destitute people.Footnote 1

Italy is characterized by a substantial presence of heterogeneity in income (Lagravinese 2015), multidimensional well-being (Greco et al. 2018) and health conditions (Lagravinese et al. 2019) and for these reasons the Government establishes and regulates co-payment exemption for people with specific features. Evidence about the effects of co-payment on healthcare services in the Italian NHS are provided by Fiorio and Siciliani (2010), Atella and Kopinska (2014), and Ponzo and Scoppa (2021). Fiorio and Siciliani (2010) estimate the relationship between variation in co-payment and demand for pharmaceuticals. Atella and Kopinska (2014) analyze the causal effect of co-payment on compliance with statins in a sample of Italian hypercholesterolemic patients. By using a quantile regression model, they find that the introduction of co-payment reduces the compliance in regions with low healthcare organization and performance (compared to the national average). In particular, by interacting geographic area and town size, they show that the compliance increases with the municipality size of northwest and northeast, especially among low compliers; conversely, it diminishes with the city size in the center and in the south. In addition, they find that the exemptions from co-payments are associated with higher compliance, and this association is three to four times greater for poor compliers considering all exemption categories. Moreover, females are significantly less compliant than males, especially among poor compliers. Also the clinical history (case of acute events) are important determinants of compliance among poor compliers as well as the treatment length and type of statin therapy reduce the compliance mainly among poor compliers. Ponzo and Scoppa (2021) investigate the effect of co-payment exemption on the demand for different health services. In particular, the authors do not disentangle the effects of the exemption due to chronic diseases from the one granted for economic conditions. This limitation is essentially due to the lack of information about respondents’ income in their database. Currently, the European healthcare systems, and particularly the Italian system, have been strongly put under pressure, due to the pandemic. By the same token, the US private system has raised considerable doubts.Footnote 2 In this scenario, disentangling the effect for less wealthy individuals is of great importance for a correct and informed health policy conduction. Our paper moves a step further, exactly in this direction. Indeed, our dataset is similar to that of Ponzo and Scoppa (2021) but we complement it with respondents’ income as we are interested in the effect of co-payment excemption for income. Thus, we consider all people aged more 65Footnote 3 and we enrich previous analyses introducing also income requirement for co-payment exemption eligibilityFootnote 4 This allows us to refine the analysis on co-payment exemption due to economic status, obtaining revealing results. Interestingly, we find that the effect of exemption on the number of specialist visits decreases as income increases, and it increases as individuals age. This piece of information would have not been attainable by following a standard RD. From a methodological point of view, this novelty comes at the cost of higher econometric complexity, which at the same time makes the problem more challenging and stimulating. The enhanced database allows us to consider eligibility for exemption according to the twofold requirement in terms of age and income. To tackle this empirical challenge we rely on one of the most recent developments of Regression Discontinuity (RD), namely the Multiple Regression Discontinuity (MRD) and we consistently adapt some tests developed for the single score case to this more complex setup. Our results show positive effects of co-payment exemption on the number of specialist visits.

The remainder of the paper is organized as follows: Sects. 2 and 3 introduce the institutional setting and the dataset, respectively. Section 4 explains the identification strategy which is tested in Sect. 5. The results are presented in Sect. 6, while Sect. 7 includes the robustness checks and Sect. 8 assesses to what extent the results hold away from the frontier. Finally, Sect. 9 concludes the study.

2 Institutional setting

The Italian National Health System (NHS), introduced in 1978, is a universal healthcare system providing comprehensive health insurance coverage and uniform health benefits to the whole population.

During the nineties, the Constitutional reform has introduced a decentralization process, called devolution process to contain the public expenditure, transferring greater formal responsibility and power to make decisions from Central Government to Regions in public finance issues and in the healthcare sector. Thus, each region introduced specific taxes (especially on activities and on personal income) and healthcare sector was rethought at regional level. Different Regional Health Systems (RHS) were established acquiring more autonomy in programming, funding, organization and delivery of healthcare.

In this context, regions adopted specific healthcare co-payment schemes in terms of application and structures to contain their public healthcare expenditure.Footnote 5 A co-payment was introduced for specialistic visits, for diagnostic care utilization and on the pharmautical expenditure,Footnote 6 where regions can independently apply these forms of healthcare cost-sharing to all these services or only for someones, varying costs in a range established by the Central Governement.Footnote 7 These regional heterogneity increase inequality in healthcare access and utilizations over time. However, in order to guarantee equal access and utilization of healthcare services and to safeguard more fragile and vulnerable patients, and also, for needy and destitute people, the Government establishes and regulates the co-payment exemption for people with specific features. Thus, the Italian NHS establishes different types of co-payment exemptions based on age and income level, health status (chronic diseases and rare diseases) and disability.

To our purpose, we focus on a case of total co-payment exemption strictly pertaining economic issues: the one granted according to income status. This type of exemption was introduced in 1994Footnote 8 and still applies substantially unaltered. We analyze cross-sectional data over the period 2011–2012 with exempted individuals according to the aforementioned Law. In spite of the regional and local autonomy, the Italian NHS guarantees free healthcare services to patients having household income less than 36,151.98 euro and who are older than 65 years. Despite the double requirement this type of exemption is normally referred to as “exemption for income” and we will maintain this wording throughout. To the best of our knowledge there are no waivers at regional/local level for this benefit and at these thresholds.Footnote 9 Other exemptions of partial or total exemptions can be granted according to disabilities, chronic diseases, such as cancer, war-related experience, unemployment, social pension and other issues. We will confine our attention to the “exemption for income”, being the most related to economic issues.

3 Data and summary statistics

The estimation sample we use is obtained by merging two different sources. The first one, the so called multipurpose survey on “Health conditions and use of health services", conducted by the Italian National Institute of Statistics in 2011–2012 (ISTAT 2013), is a cross section containing around 119,000 interviews. This survey is conducted every five years, to monitor the use of healthcare services in the Italian population and the prevalence of chronic health conditions. Each individual is randomly selected from the municipal registry lists, creating a statistically representative sample of resident populationFootnote 10 Unfortunately, this data source lacks information about income. In line with recent contributions in machine learning (Feigenbaum 2016; Athey and Wager 2021) and the widespread practice in epidemiology (Murray 2018; Schenker and Welsh 1988) we have implemented a statistical matching to take the household income data for the same year (2012) from the Italian version of the EU’s Statistics on Income and Living Conditions—EU-SILC (ISTAT 2021).Footnote 11 Focusing on income exemption we have dropped from the sample all the units benefiting from other types of exemptions, both partial and total, in general exempted individuals aged less than 65.Footnote 12

Information on the outcome variable of our interest, i.e., the total number of specialist visits, are provided through a detailed questionnaire in which respondents are asked about their utilization of one of these healthcare services in the last four weeks before the interview. In particular, the total number of specialist visits was built by summing the number of utilization of a set of specialist visits, such as geriatric, cardiology, gynecology, oculist, orthopedic, neurological, psychiatric and so on.

For our purpose, treatment is captured by a dummy variable, exemption, that is equal to 1 when the patient benefits from (total) co-payment exemption for income reasons. Table 1 presents descriptive statistics on the resulting merged sample. The sample span is 37,586 individuals and almost \(4\%\) benefit from total co-payment exemption. In the four weeks preceding the interview, on average, less than 1 respondent had a specialist visit. The average age is 41 years old, while the average income is about 24,000 euro. Furthermore, the generic visits are the number of visits made by the general practitioner in the four week preceding the interview, which are completely free in Italy for everybody.

Table 1 Summary statistics—full sample

Table 2 reports summary statistics by co-payment exemption. The average differences between the two sub-samples are significant for most of the variables. In general, treated individuals experience a greater number of specialist visits, whereas it is not possible to find a statistically significant difference when generic visits are concerned. This descriptive evidence raises the fundamental question we want to answer in this paper. Do exempted individuals consume more healthcare services because of the exemption? Descriptive evidence seems to suggest “YES”, but the statistical evidence in Table 2 does not provide any clue in terms of causality. To unveil the possible causal relationship behind the difference in mean we resort to a more sophisticated tool described in the following sections. Finally, and consistent with the eligibility criteria, the table reports that on average treated individuals are older and less wealthy than non-treated individuals.

Table 2 Summary statistics by co-payment exemption

4 Identification strategy

Some features of RD Design, such as its transparency and conceptual simplicity, have made it one of the most reliable techniques to identify the effects of policy interventions. Over recent years, we are witnessing advances in many directions, such as optimal bandwidth and robust inference (Calonico et al. 2014, 2017, 2020), manipulation tests (McCrary 2008; Cattaneo et al. 2018), local randomization (Cattaneo et al. 2015, 2016), power tests (Cattaneo et al. 2019), multiple cutoffs (Cattaneo et al. 2016, 2020), inference away from the cutoff (Angrist and Rokkanen 2015; Dong and Lewbel 2015; Battistin and Rettore 2008) just to cite a few. For our purpose, a relevant extension pertains to cases in which eligibility depends upon more than one running variable, or simply scores, the so called Multiple Regression Discontinuity (MRD), in which typically, but not necessarily, the scores are two. Examples are given by borders identified by latitude and longitude (Keele and Titiunik 2015; Dell 2010; Gerber et al. 2011) or test scores at school (Jacob and Lefgren 2004; Papay et al. 2011; Zajonc 2012; Clark and Martorell 2014) and multiple requirements for eligibility in an income support program (Frey 2019). A common practice to deal with such occurrences consists in reducing the MRD problem to a single score problem. This can be done in two ways. The simplest consists in using a single score on the units with the other score already crossing the cutoff. It is worth noting that in this case the estimated effect is not fully informative of the policy effect, neither at the vertex, i.e., the joint cutoffs defining eligibility, so that the local information content RD provides is made even narrower by this technique. Another slightly more complex practice consists of collapsing the two scores into one by computing a distance measure, e.g., Euclidean norm, Mahalanobis distance, and applying the standard single score RD on the new running variable. While this strategy seems perfectly sensible for geographic boundaries and when the scores are of the same scale, such as test scores, it casts doubts when the scores pertain to very different phenomena. Examples can be given by population and Human Development Index (Frey 2019) or age and income, as in our analysis. To be more clear, think of a situation in which the resulting collapsed score returns a certain value, let us say h, meaning that individual i is distant h to the treatment frontier. The same value h can apply to both “too young” individuals or “too wealthy” individuals, bearing very different implications. For cases like this Frey (2019) suggests leaving the scores separate and applying a RD experiment in a multidimensional setting. Indeed, the distinct nature of the two scores make likely that the sub-populations being compared along the frontier differ considerably. Under the one-score approach this heterogeneity might defeat the spirit of the RD Design. Differently, keeping the two scores separate one can exploit heterogeneity along the frontier, i.e., along the combinations of the scores allowing for eligibility. In other words, when the scores have a distinct nature the sub-populations along the frontier may be heterogeneous and this heterogeneity can help unveiling the heterogeneous effects of the policy.

In our specific case, individual i aged \(x_{1i}\), with family income \(x_{2i}\) is eligible for treatment, i.e., co-payment exemption, if \(x_{1i}\ge 65\) AND \(x_{2i}<36,165.98\) euro. The combination of the two scores defines two regions in a plane, one containing eligible individuals and one non-eligible, the boundary between the two can be represented by \({\mathbb {B}}=\left( x_{1i}\ge 65 \cap x_{2i}<36,165.98\right) \). The situation is depicted in Fig. 1.

Fig. 1
figure 1

Eligibility boundary. Note: the black solid line represents the eligibility boundary. Individuals located in the upper left quadrant are eligible for treatment. Red triangles represent actual treated and blue X’s non-treated individuals. (Color figure online)

The black line represents the boundary delimiting the eligibility region which is contained in the upper left, UL, quadrant. By far, this region is populated by treated individuals (red triangles), while non-treated are located mostly in the remaining region. As it is possible to see from the picture, we are in a typical situation of non-perfect compliance as there are (few) non-treated eligible individuals, i.e., blue X’s, in the UL quadrant, and treated individuals in the non-eligibility region. The non-compliance problem arises from the fact that being exempted is not automatic, but rather potential eligible individuals must apply following a complex procedure that, to a certain extent, may vary according to the regions.Footnote 13 Briefly, potential eligibles every year must self-declare their household’s income uploading it in the Electronic Health Folder, EHF, (fascicolo sanitario elettronico) a special section of the Ministry of Health website. In turn, most of the times, household’s income must be previously computed by specialists operating in this sector, such as business consultants or tax assistance centers. Yet, access to EHF is possible only through by advanced electronic means of personal identification that can discourage elderly people. Once applied for the benefit, self-declarations are transferred to fiscal authorities who can indict applicants in case of false declarations. Even once successfully concluded this complex procedure, exemption is not automatic. Indeed, there can be cases in which valid applications do not appear in the system due to mere computer errors and delays. A further reason for non-perfect compliance is that exemption is granted on the basis of past income and this may generate an inconsistency between eligibility and current income. Another interesting feature arising from the picture is that along the horizontal side of the boundary there are many points, above and below the black line, while the sample is not as ’thick’ along the vertical side, where the highest concentration of points is next to the vertex.Footnote 14 The discontinuity created by the administrative rule, coupled with non-perfect compliance, give us the possibility to estimate a treatment effect along the frontier by using a fuzzy MRD. We report the econometric details of this estimator in the Online Appendix, while offering the main intuition in what follows. The main peculiarity of MRD with respect to RD is that while the latter allows to estimate only one treatment effect at the cutoff, the former, in principle, allows to estimate as many treatment effects as the points along the frontier, the so called location-specific treatment effects (Cattaneo et al. 2023). This makes the MRD particularly attractive, because the location-specific effects are informative of how the effects change according to the specific values of the scores, i.e., along the boundary. Empirically, the number of computable location-specific effects depends upon the variability of the scores in the data along the frontier. The higher the variability, the higher the number of computable location-specific effects and the more informative the estimates will be. From a practical point of view, this technique consists of estimating as many two-stage least squares, i.e., instrumental variable regressions, as the points along the frontier one wants to consider, where the instrument is eligibility. In the potential outcome framework terminology, each estimate coincides with a Local Average Treatment Effects (LATE). At each boundary point compliers are those units who change treatment status thanks to the instrument, i.e., the eligibility rule. In particular, the first stage is:

$$\begin{aligned} {W}_{i}=\alpha _0+\alpha _1Z_i+\alpha _2 {\tilde{x}}_{1i}+\alpha _3 {\tilde{x}}_{2i}+\alpha _4 Z_i {\tilde{x}}_{1i}+\alpha _5 Z_i {\tilde{x}}_{2i}+u_i \end{aligned}$$
(1)

where \(W_i\) is the binary treatment indicator, \(W=1\left\{ if \, treated\right\} \), Z is the eligibility dummy, \(Z=1\{x_{1i}\ge 65 \cap x_{2i}<36,165.98\}\) and \({\tilde{x}}_{ji}=x_{ji}-{\bar{x}}_i\) for \(j=1,2\) are the centred scores, where \({\bar{x}}_1=65\), \({\bar{x}}_2=36{,}165.98\) represent the cutoffs. In plain English, the treatment indicator is regressed on the binary eligibility indicator, on the centred scores, and on the interactions of eligibility and (centred) scores, plus an error term, \(u_i\). The \(\alpha _1\) coefficient is equivalent to the proportion of compliers at the cutoff in a one-score RD setting, i.e., the jump in the probability of being treated. In this case, it represents the proportion of compliers at the vertex point of the boundary. The interaction terms are included to make the relationship between W and Z to depend also on the distance to the boundary. Of course, other controls can be easily added to (1). In the second step, the fitted value of \(W_i\), i.e., \({\hat{W}}_i\), is plugged into the second stage regression:

$$\begin{aligned} Y_i=\beta _0+\beta _1 {\hat{W}}_i+\beta _2{\tilde{x}}_{1i}+\beta _3{\tilde{x}}_{2i}+\beta _4 Z_i {\tilde{x}}_{1i}+\beta _5 Z_i {\tilde{x}}_{2i}+\eta _i \end{aligned}$$
(2)

where Y is the outcome variable. \(\beta _{1}\) is the conditional average treatment effect, CATE, of the policy at the vertex of the boundary, in other words it represents the jump in the outcome at the vertex point. To make valid inference the standard errors must be corrected to take into account that \({\hat{W}}_i\) is a generated regressor. At this point, two important issues must be discussed. First, the treatment effect holds only at the vertex, therefore an optimal two dimension bandwidth, i.e., an optimal sphere, must be derived. Second, the same procedure must be repeated at different points along the frontier and an average effect can be computed by averaging over these estimates.Footnote 15 As far as the first issue is concerned, Zajonc (2012) in its application argues that the optimal sphere can be proxied by the Scott’s rule:

$$\begin{aligned} {\hat{h}}_j={\hat{\sigma }}_j \cdot N^{-1/6} \ \ \hbox {for} \ \ j=1,2 \end{aligned}$$
(3)

where \({\hat{h}}_j\) and \({\hat{\sigma }}_j\) are the estimated bandwidth and standard deviation of score \(x_j\), respectively, and N is the sample size. As for the second issue, actually it stems from the first. Since the estimate as obtained from Eqs. (1) and (2) are locally valid, i.e., in a narrow neighborhood of the vertex, it is possible to repeat the same procedure to other points along the frontier, such as to exploit heterogeneous effects that different combinations of the scores may give rise to. To this purpose, Eqs. (1) and (2) must be re-estimated by re-centering the scores, one at a time at different cutoffs. For instance, to estimate the conditional effect at a given point along the vertical side of the frontier, let us say 75 years of age, one must center \(x_1\) at 75, \({\tilde{x}}^{1}_{1i}=x_{i1}-75\), keep \(x_2\) centred at the vertex, and re-estimate (1) and (2). That is:

$$\begin{aligned} W_i=\alpha _0+\alpha _1Z_i+\alpha _2 {\tilde{x}}^{1}_{1i}+\alpha _3 {\tilde{x}}_{2i}+\alpha _4 Z_i {\tilde{x}}^{1}_{1i}+\alpha _5 Z_i {\tilde{x}}_{2i}+u_i \end{aligned}$$
(4)
$$\begin{aligned} Y_i=\beta _0+\beta _1 {\hat{W}}_i+\beta _2{\tilde{x}}^{1}_{1i}+\beta _3{\tilde{x}}_{2i}+\beta _4 Z_i {\tilde{x}}^{1}_{1i}+\beta _5 Z_i {\tilde{x}}_{2i}+\eta _i \end{aligned}$$
(5)

Similarly, to estimate a point along the horizontal side of the frontier \(x_{2}\) must be re-centred and \(x_{1}\) kept at the vertex. The CATE’s can be aggregated to obtain a summary statistic of the treatment effect along the boundary, \({\hat{\tau }}_{avg}\), by taking a weighted average;

$$\begin{aligned} {\hat{\tau }}_{avg}=\frac{\sum _{k=1}^{K}{\hat{\beta }}_{1k}({\tilde{x}}^{k}_{1}, {\tilde{x}}^{k}_{2})\cdot {\hat{\lambda }} ({\tilde{x}}^{k}_{1}, {\tilde{x}}^{k}_{2})}{\sum _{k=1}^{K}{\hat{\lambda }} ({\tilde{x}}^{k}_{1}, {\tilde{x}}^{k}_{2})} \end{aligned}$$
(6)

the weights \({\hat{\lambda }} ({\tilde{x}}^{k}_{1}, {\tilde{x}}^{k}_{2})\) are the estimated density of compliers at each boundary point. They can be obtained by following Abadie (2003) who suggests how the proportion of compliers can be identified along the boundary and by applying a one score RD algorithm to a minimum distance measure at the vertex. The intuition behind this simpler way of calculating the average effect along the entire frontier is that the weighted average of the effects is no longer a function of two scores and one score RD is a weighted average, therefore the average of the (average) effects along the frontier can be conveniently obtained by applying a one score RD with an appropriate kernel function. For more details about the possibility to estimate \(\lambda (x)\) as a combination of local averages and kernel density estimates with potential boundary corrections we refer the interested reader to Hjort and Jones (1996), Jones (1993); Loader (1996)

Unfortunately, one immediate drawback of this appealing procedure is presented by the limited number of points in the spheres along the boundary. The small sample problem of one score RD is even exacerbated in the MRD because it is repeated as many times as the preferred points along the frontier, this may engender non statistically significant results. In order to lessen weak power problems it is advisable not to include covariates.

5 Testing the identification strategy

One of the central issues for valid identification of causal effects with RD experiments is continuity. Continuity of the score(s) density at the cutoff, continuity of the score(s) over the support and continuity of the regression function of the outcome over the score(s). In the case of MRD there is not a consensus in the literature about the tests that one can run in order to validate or reject the identification strategy. For this reason, we have taken a prudential standpoint by adapting the tests developed for single score RD to the MRD in different ways. Broadly speaking, the adaptation we have made consists in running a multiple number of tests. Specifically, for each null hypothesis to be put under empirical testing we run the test: (i) on each score keeping the other in the eligibility area; (ii) on each score keeping the other along the frontier. This procedure generates \(2\times 2\) tests for each null hypothesis.

Fig. 2
figure 2

McCrary manipulation test keeping the other score in the eligibility area. Note: the continuous black line represents the estimated density function, while the gray area represents a 95% CI. The left-hand-side of the figure reports the estimated density of age for units with income below the cutoff. The right-hand-side of the figure reports the estimated density of income for units with age above the cutoff

Identification strategy can break down if units can manipulate the score(s). The McCrary (2008) test is based on the idea that in the absence of manipulation of the score the density of the units does not exhibit a discontinuity around the cutoff. Presence of continuity is usually interpreted as a signal for absence of manipulation, namely random assignment into treatment or control group. The test is implemented as a Wald test and the null is continuity of the density. We implement an advanced version of the original McCrary test as suggested by Cattaneo et al. (2018) who allow us to select the bandwidth entering the test in a data-driven way, consistent with the estimation procedure. The formal test reports a P-value=0.464 and 0.398 for age and income, respectively, leading us not to reject the null hypothesis for the first version of the test, i.e., keeping the other score in the eligibility area, while visual evidence is provided in Fig. 2. Virtually identical results are obtained when repeating the test along the frontier, obtaining a P value of 0.674 and 0.32 for age and income, respectively, and a similar picture in Figure A.4.

Fig. 3
figure 3

Continuity of the regression function keeping the other score in the eligibility area. Note: the figure plots polynomial approximations of the conditional expectations of the outcome variable. The optimal number of bins is determined according to the evenly spaced optimal IMSE criterion. The left-hand-side reports the conditional expectation of the outcome with respect to the centred age, for units with income below the cutoff. The right-hand-side of the figure reports the conditional expectation of the outcome with respect to the centred income, for units with age above the cutoff

As far as continuity of the running variables are concerned, an additional empirical test is reported in Figure A.1, A.2 and A.3 in the Online Appendix, in which the distributions of the scores over their entire support are reported, keeping the other score in the eligibility area (A.1), keeping the other score at its cutoff (A.2) and keeping the score in a narrow interval along the boundary A.3 while the other is in the eligibility area. A substantial continuity can be visualized in all the cases.

A third check consists in ascertaining the continuity of the regression function. This can be done by dividing the support of the running variable(s) into bins and computing local means of the outcome within those bins. To avoid choosing an arbitrary number of bins Calonico et al. (2015) suggest a data-driven procedure based on the minimization of the integrated mean square error (IMSE). Interestingly, this test is a sort of placebo test in that it tests for discontinuity in the regression function away from the cutoffs. If we found discontinuities at some point different from the cutoffs, we would conclude against a correct identification. Visual evidence is reported in Fig. 3. The pictures are quite informative. On the one hand, they do not provide any evidence of potential discontinuities away from the cutoff in the underlying regression functions, thus acting as falsification tests and validating the key assumption of the design. On the other hand, the pictures show a clear jump in the conditional regression function at the (centred) cutoffs, providing evidence of a positive effect of treatment on outcome. Additional evidence is provided in the Online Appendix in Figure A.5 that repeats the exercise along the boundary and confirms the result.

6 Results

The first-stage results are reported in Table 3.

Table 3 First stage results at preferred points along the frontier

Each column reports the first stage repeated at different frontier points. Notably, column (1) is estimated at the vertex, as in Eq. (1), the models H1 and H2 in column (2) and (3) have been estimated along the horizontal side of the frontier, that is [age = 65, income = 10k]; [age = 65, income = 20k], respectively, whereas, models V1 and V2 in column (4) and (5) have been estimated along the vertical side [age=66, income=36,151]; [age=67, income=36,151], respectively. Put differently, columns (2) through (5) represent the empirical counterpart of Eq. (4). Each regression equation has been estimated using only the units in the two-dimensional bandwidth. Given the limited number of observations within the spheres no covariates have been added in order not to lose further degrees of freedom. Notwithstanding, in order to make identification as plausible as possible, following the suggestion by Cattaneo et al. (2015) and Cattaneo et al. (2016), we have tested whether pre-intervention covariates, such as demographic and socio-economic characteristics, are affected by treatment. Indeed, if treatment assignment is “as if random” within the sphere, the distribution of pre-intervention covariates should be the same for treated and control units. Table 4 reports the results showing that in no case we find evidence of treatment effects on covariates.Footnote 16

Table 4 Balancing test

The centred scores have been normalized to the same scale by dividing them by their standard deviations, so that the two dimensional bandwidth is the same across all the estimates, easing comparability and interpretability. The first stages are always statistically significant and compliers are between \(22\%\) and \(35\%\). It follows that being eligible increases the probability of being exempted from co-payment by 22–35%. From another vantage point, these figures represent the take-up rate of the policy and compared to another Italian policy introduced in 2012, the same year our dataset refers to, such as a support for young innovation oriented firms, they are substantially higher, 4–5 times, Mellace and Ventura (2023). This high take-up rate witnesses the relevance and the attractiveness of the policy under evaluation. Interestingly, the highest significance level, i.e., \(1\%\), is achieved in correspondence with the largest number of observations, columns (2) and (3). Indeed, Fig. 1 shows a remarkable presence of points along the horizontal side and sparse points along the vertical side with concentration around the vertex. This explains why the estimates have been performed at age 66 and 67. Considering as outcome variable the number of specialist visits, we can now estimate the CATEs, namely the causal effects for complies along the boundary, which are obtained from Eqs. (2) and (5) for the vertex and the other points along the boundary, respectively. Angrist (2001) guarantees that linear two-stage least squares are still valid with limited or count dependent variables, as in our case. The results are reported in Table 5.

Table 5 CATE’s at preferred points along the frontier. Dependent variable: No. of specialist visits

Of course, the points along the boundary considered in Table 3 are the same as Table 5, so that the estimate in the first column is at the vertex, H1 and H2 in columns (2) and (3) are along the horizontal side and the remaining two along the vertical side. Overall, the effect of co-payment exemption on the number of specialist visits is positive and significant at all the points considered. The limited number of observations at each point is likely to raise weak power problems, in this light a \(10\%\) significant level is encouraging and can be regarded as a sort of lower bound. The point estimates range between about 1 and 1.8 extra visits generated by the policy in the four weeks before the interview. Caution should be exercised when interpreting this figure since it represents only individuals who have recently received the exemption, not an average of all subjects. As a result, it is reasonable to consider that in the subsequent years, after having the exemption for an extended period, the number of visits might decrease. In essence, these results cannot be generalized to the entire sample of exempt individuals; instead, they should be regarded as “short-term” estimates applicable to those who have just received the exemption at around 65 years of age. These results are in line with the literature showing a negative demand effect of the co-payment (Cherkin et al. 1992; Cockx and Brasseur 2003; Layte et al. 2009; Nolan 2008; Van de Voorde et al. 2001; Winkelmann 2004), while they contrast with studies finding no (or positive) effects of co-payment on healthcare use (Layte et al. 2009; Rosen et al. 2011). Interestingly, considering column (2) and (3) the CATE decreases as income increases, while considering column (4) and (5) the CATE increases as individuals age, whereas, the effect at the vertex is remarkably close to the simple mean of the CATE’s in the table. As just noted, the limited number of observations at each point prevents us from testing the statistical equality of the estimated effects. However, their dynamics with respect to income and age is encouraging, as fully sensible in economic terms and consistent with the literature.

The estimated average effect computed along the entire frontier, as in Eq. (6) is reported in Table 6.

Table 6 Average effect along the frontier, MSE optimal BW

The table reports the estimate of the average effect along the frontier with the robust bias-corrected standard error and the associated Pval, as suggested by Calonico et al. (2017) and Cattaneo et al. (2017). This effect is a weighted mean of all the effects computable along the frontier, therefore it is not a function of a bivariate score. One score RD can be regarded as a weighted mean within the bandwidth, where the weights are provided by a kernel function. For the score we have followed Cattaneo et al. (2020) and Cattaneo et al. (2023) implementing the closest perpendicular distance to the boundary defined by the treatment assignment, with positive values indicating treated units and negative values indicating control units. To avoid discretionary choices of the bandwidth we have applied the data driven optimal MSE bandwidth, according to Imbens and Kalyanaraman (2012). The effect is always positive and significant at the \(5\%\) level. In the baseline specification reported in column (1) the effect amounts to 0.376, meaning that on average among the treated individuals with combination of income and age such as to be along the frontier being exempt from co-payment increases the number of specialist visits by 0.376 every four weeks. Such an average value is below the interval estimated in Table 5, i.e., 1–1.8 for two reasons. First, for practical reasons Table 5 refers to five points while Table 6 to an infinite collection. Second, the one score estimator puts less weights (because of a smaller density along the frontier) on the effect for lower income and older individuals, whose effect is greater. To check for the robustness of the result, we have carried out a number of perturbations of the baseline by including squared values of the score, column (2), allowing for asymmetric bandwidth on the left and on the right-hand-side of the cutoff, column (3), and using a different kernel function, column (4). The average effect is always positive and significant and does not show appreciable differences along the 4 different specifications. As a very final check we have repeated the same exercise by using a different data-driven optimal bandwidth, the Coverage Error Rate (CER), instead of the MSE, because Calonico et al. (2017) argue that CER has better inference properties. All the results are confirmed and evidence is provided in the Online Appendix in Table A.1.

7 Robustness

To check the robustness of our results we implement a placebo test in the spirit of Abadie et al. (2010) and the ensuing copious literature. For the placebo treatment we construct a fictitious treatment area based on the joint event \(\left( x_{1i}\le 55 \cap x_{2i}>40,000\right) \). The situation is depicted in Fig. 4.

Fig. 4
figure 4

Actual and fictitious treatment area. Note: The black solid lines delimit boundaries. The quadrant “Actual” represents the actual eligibility area for treatment. The quadrant “Fictitious” represents the placebo area for fictitious treatment

The CATE’s estimated at the vertex and along the fictitious boundary show no significant results, as reported in Table  7 as well as in Figure A.6.

Table 7 Falsification test along the fake frontier

As further proof, the average CATE along the entire fictitious frontier amounts to 0.0126 with a P value=0.67.

Another falsification test consists in keeping the actual eligibility area but changing the outcome. In particular, we know that generic and pediatric visits are free from payment, for any patient both exempted and not. Thus, if we find a significant effect of exemption on this outcome that will signal something wrong in the identification strategy. In no case it is possible to find an effect at any conventional significance level. For sake of space evidence about the average effect along the entire frontier is reported in the Online Appendix, Table A.2 and Table A.3, respectively, for the MSE and CER optimal data driven bandwidth, and their robustness checks.

8 External validity

One of the major limitations one comes across when using RD is about the local validity of the estimates, in the sense that the results apply only at the cutoff. In the case of MRD this limitation is mitigated by the fact that the discontinuity consists of a border, rather than a point, so that the estimates take on a wider interpretation, with respect to the single score case. Notwithstanding, the main question still applies: do the results we have found apply away from the frontier? Put differently, if the government slightly changed the eligibility criteria, would we find the same effect we have estimated? To answer this question, Dong and Lewbel (2015) suggest that the treatment effect derivative, TED, is an appropriate statistic for the stability of the results around the cutoff. The intuition is very simple: if small changes in a score (computed at its cutoff) lead to significant changes in the treatment effect, there is evidence of instability of the results, namely they cannot be taken as valid away from the cutoff. Thus, in our specific case, the idea is to apply this technique twice, once for each score keeping the other passing the cutoff, i.e., in the eligibility area. Table 8 reports a full set of TED estimates computed on \(x_1\) keeping \(x_2<{\bar{x}}_2\). In other words, in Table 8 we are checking whether by slightly shifting the vertical side of the frontier the results still apply. Similarly, in Table 9 we are checking whether by slightly shifting the horizontal side of the frontier the results still apply. The double check is performed under a variety of alternative specifications, just to make sure that the results are general enough and not driven by the specific functional assumption of the regression equation nor by other ad-hoc choices.

Table 8 TED for eligible units by income
Table 9 TED for eligible units by age

Column (1) of Tables 8 and 9 reports the estimated TED by using the same bandwidth we have used in MRD estimates, as suggested by Scott’s rule. Columns (2) through (5) use the optimal data-driven MSE bandwidth (Imbens and Kalyanaraman 2012) which is re-calculated each time under different circumstances. In column (2) a linear model is adopted, in column (3) a quadratic model, column (3) adopts an asymmetric bandwidth allowing for different lengths on the right- and on the left-hand-side of the cutoff, and, finally a different kernel function is used in column (5). The results show a great regularity. In each column of both tables the estimated TED is always statistically non-significant at the conventional levels, indicating that our results might have higher external validity.

Finally, in order to rule out inference problems we have followed the procedure suggested by Calonico et al. (2017) replacing the MSE optimal data-driven bandwidth with the CER. Evidence is provided in Tables A.3 and A.4 in the Online Appendix and show exactly the same results as the ones in Table A.4 and A.5 lending further support to the hypothesis of external validity.

9 Conclusions

This paper investigates the causal effect of co-payment exemption for income on the utilization of healthcare services with a particular focus on the number of specialist visits in the Italian NHS. From a methodological perspective we implement the recently developed multiple regression discontinuity (MRD) model exploiting the discontinuity in the two features connected to the eligibility criteria: household’s income and age. Given our context, the technique allows us to estimate the causal effect in a quasi-experimental setting.

Results provide strong evidence in favor of a positive and significant effect of co-payment exemption for income on the number of specialist visits. Our evidence is in line with the literature supporting a negative healthcare demand effect of the co-payment, while it is in contrast with studies finding no (or positive) effects of co-payment on healthcare use.

From a policy perspective this evidence supports the hypothesis that a reduction in the use of specialist visits can be obtained introducing an appropriate co-payment framework. However, it is worth recalling that a reduction in the use of specific healthcare services may potentially decrease the total healthcare costs, but not necessarily. Indeed, it is possible that co-payment for some services may cause substitution effects with services that are free or subject to a lower co-payment. Furthermore, the reduction in some effective and necessary care may lead to a deterioration in public health, and in the long run the savings brought by co-payment may be canceled by increases in the use of other services. Finally, by eliminating co-payment exemption it is reasonable to expect that low income individuals will reduce their use of specialist visits more than wealthier individuals. Under the standard assumption of decreasing marginal utility of income, low-income individuals will suffer from a higher utility loss associated with the additional cost they must bear. Our study presents compelling proof that waiving co-payment for underprivileged populations leads to a notable rise in the utilization of specialist physician services. These services play a pivotal role in preventive healthcare by facilitating early detection, offering specialized knowledge and guidance, evaluating individual risks, creating personalized treatment plans, and providing valuable health education. Consequently, the co-payment exception has the potential to greatly enhance health outcomes and overall quality of life.

One of the main limitations of this study is that given the low number of observations at each point along the frontier, we cannot condition the estimates upon covariates because that would lead to a further loss of observations and neither we can test for equality of the estimated effects. In this respect, the new waves of the Multipurpose Survey are expected to be richer and more accurate, especially in terms of sample span and questions asked to respondents. We leave this issue to further research.