Introduction

Regional variations in medical practice have long been studied by health services researchers and health economists. Glover (1938) was the first to describe variations in tonsillectomy rates across geographic areas in the UK. Decades later, the Dartmouth group became widely known for providing evidence of unwarranted regional variations across a wide array of surgical procedures in the US (see the Dartmouth website for an extensive overview).Footnote 1 The seminal paper of Wennberg and Gittelsohn (1973) is considered the first study on the existence of geographic variations in several surgical procedures. Many more studies have appeared since; they are well documented in Phelps (2000) and Skinner (2012), both of which provide an in-depth review of the empirical literature on geographical variation.

Much of the empirical evidence stems from the Medicare program in the United States. In recent years, however, a substantial amount of research on practice variations has been carried out in European countries such as England and the Netherlands (Appleby et al. 2011; Westert et al. 2004).

The main challenge within the practice variation literature is to discriminate between warranted and unwarranted variation. Unwarranted variation refers to regional differences that cannot be explained on the basis of illness, strong scientific evidence or the preference of well-informed patients (Fischer et al. 2008). In practice, however, unwarranted variation is extremely difficult to isolate from warranted variation (Mercuri and Gafni 2011).

Wennberg (2010) distinguishes three main categories of care: effective care, preference-sensitive care, and supply-sensitive care. In this paper we study the last category, in which differences in payment systems across physicians may influence the clinical decisions of physicians, and thus utilization rates.

That financial incentives play a role in explaining utilization rates is well known (see e.g. Chandra et al. 2012). A famous example is the anecdotal account by Gawande (2009), who argues that physicians in McAllen, Texas, are more entrepreneurial than physicians in El Paso, Texas, which results in a dramatic overprovision of several surgical procedures in McAllen. Other well-known examples relate to the impact of changes in payment systems at the macro level. For example, in 1983, the Medicare program moved from a retrospective, fee-for-service (FFS) payment to a prospective DRG payment for hospital care. The result was a drop in volume and a large reduction in total hospital days (Hodgkin and McGuire 1994). In the Netherlands, the abolition of hospital budgets in 2001 led to a sharp increase in hospital spending and volume, especially for supply sensitive treatments (Vijsel et al. 2011). Chandra et al. (2012) document the growing literature that studies the impact on utilization rates of exogenous demand shocks or cuts in payments. Many studies find that physicians are likely to respond to financial incentives (see Chaix-Couturier et al. 2000 for a systematic review of the literature). However, Chandra et al. (2012) state that the role of fee differences in explaining medical production across patients or geographical areas is unknown. One major reason for the lack of research in this area is that, under Medicare, all physicians receive the same type of payment.

We fill this gap in the literature by studying the effect of different physician remunerations on regional variation in hospital treatments in the Netherlands. Physicians maximize their utility by caring not only about patient benefits but also about their own income and leisure time (McGuire 2000). Depending on the type of payment, a physician may weigh these three components differently, which will subsequently influence patient benefits and utilization rates. Selection may also play a role: entrepreneurial physicians may put a higher weight on income and will more often choose a FFS than a salary-based payment.

In contrast to Medicare in the US, Dutch physician remunerations differ. Physicians are paid either FFS or salary, and are not constrained by the hospitals in terms of production. We examine variations in eight supply-sensitive surgical procedures and compare the outcomes with hip fractures, a procedure for which unwarranted variation is likely to be small. Moreover, our panel data set covers the totality of the Dutch population over four consecutive years, which allows us to estimate fixed and random effect panel data models.

Our results contribute to the evidence that physicians respond to financial incentives. We show that significantly more patients get treatment in geographical areas with a high percentage of FFS physicians. This effect is strong for highly supply sensitive treatments, such as cataracts and tonsillectomies, while we do not find an effect for weakly supply sensitive treatments, such as hip fractures.

This article is organized as follows. First we describe the “Institutional setting” of the Dutch hospital sector; next we describe our “Data and descriptive statistics”. The “Methods” section explains the estimation procedure and is followed by the “Results” section and the “Robustness checks” section. The “Conclusions” section concludes.

Institutional setting

The institutional and regulatory framework of a health care system influences the incentives of both physicians and patients and, hence, the scope for practice variations (Bickerdyke et al. 2002). Several fairly recent institutional factors make the Dutch healthcare system supply sensitive.

Consumers are almost fully insured and there are few incentives for patients to restrain demand.Footnote 2 Except for emergency care, hospital admission occurs only upon GP referral.Footnote 3 Throughout our study period almost all patients could freely choose their preferred hospital.Footnote 4 Patients can thus demand services that provide only a small benefit relative to the costs borne by the insurer. As a result, the Dutch insurance system makes it attractive for physicians to stimulate production, as they know patients will have only minor payment concerns.

The government opted in 2006 to liberalize the delivery of health care, including hospital services. Hospitals are all not-for-profit and were historically funded with budgets. In 2001 the strict budgets were replaced by volume-based and open-ended budgets in which “money follows the patients”. This led to an increase in inpatient admission rates of more than 3 percent per year from 2001 to 2007, while at the same time day care admissions increased by about 9 percent annually (Vijsel et al. 2011). Although the main idea of the liberalization of the health care sector was that health insurers discipline hospitals to deliver care efficiently, an evaluation concluded that this process is still in its early stages (ZonMw 2009).

In 2005 a new hospital payment system, ‘diagnosis treatment combination’ (DTC), was implemented. A DTC relies on an episode-based registration within hospitals. A unique characteristic of the DTC system is the absence of DTC coders, i.e., physicians register DTCs themselves and can change the DTC registration during the treatment (Steinbusch et al. 2007). The DTC system was meant to facilitate the role of insurers as purchasers of care. In 2005, hospitals received a fixed, centrally determined price for initially 90 percent of the DTCs (part A). The remaining 10 percent (part B) was left to volume and price negotiation between health insurers and hospitals. Part B was extended from 10 percent in 2005 to 34 percent in 2009.Footnote 5

Dutch physicians are either self-employed professionals organized by specialty in partnerships (FFS physicians) or they receive a salary from the hospital (Schäfer et al. 2010).Footnote 6 Since the introduction of DTCs in 2005 FFS physicians received a fixed fee for every treatment.Footnote 7 The income earned by FFS physicians is mainly determined by their production. Financial incentives for FFS physicians and hospital management are aligned since the hospital can also increase revenue by treating more patients.

Salaried physicians receive a fixed monthly wage irrespective of their production.Footnote 8 We hypothesize that salaried physicians will place different weights on patient benefits, and therefore on utilization rates, than FFS physicians. Since the abolition of fixed hospital budgets in 2001 created room for additional production, we hypothesize that medical production and utilization rates will be higher in areas where relatively more patients visit a FFS physician.Footnote 9 Note that this hypothesis is very different from saying that FFS physicians have a higher production level than salaried physicians, which is generally true (see “Data and descriptive statistics” section).

Between 2006 and 2009 the Netherlands had 8 university hospitals and 2 specialty hospitals. The number of general hospitals decreased from 88 in 2006 to 85 in 2009 due to mergers (NZa 2010, 2011). The liberalization of the hospital market led to a concentrated market of general hospitals and a rapid growth of private clinics, often hospital-affiliated. The number of private clinics increased from 37 in 2005 to 129 in 2009. Private clinics offer only part B treatments and accounted for about 6 percent of total production (NZa 2010). The opening of new private clinics and changes in the remuneration schemes or capacity of hospitals create the variation in the percentage of different types of physicians visited by patients in Dutch zip code areas.

Data and descriptive statistics

The analysis relies on three data sources: the Dutch Healthcare Authority (NZa) for DTC information and waiting lists, Statistics Netherlands (CBS) for demographic and socio-economic factors, and Dutch Hospital Data (DHD) for information on physicians working in hospitals.

DTC data and the construction of treatment density

DTC information for the period 2006–2009 was drawn from administrative data collected by the NZa (DTC-informatiesysteem DIS). Table 1 provides a summary of our dataset and description for each treatment: the number of annual DTCs, the patient’s average age, the percentage of men, the number of hospitals and private clinics performing the treatment, and our ex-ante expectation on supply sensitivity.

Table 1 Number of treatments, patient and hospital characteristics, and supply-sensitiveness of treatments

The nine treatments were chosen on the basis of recurrence and supply sensitivity. Recurrence is important to obtain enough power for econometric tests. The degree of supply sensitivity is important to check whether our results are in line with ex-ante medical expectations. According to the consulted medical advisors, the treatments for cataract, tonsillectomy, varicose veins, hernia, inguinal hernia, and arthrosis are supply sensitive, whereas the treatment of hip fracture is non-supply sensitive.

Our dataset includes about 1.7 million DTCs collected from all Dutch general and university hospitals and 78 private clinics. For each DTC we obtained the four-digit zip code of the patient’s residence as well as the four-digit zip code of the hospital visited by the patient. We used the four-digit zip codes to construct a panel dataset with zip codes as units and years as periods. A DTC is opened at the first physician consult and closed at final examination.Footnote 10 We assign a treatment to the year in which the DTC is opened since about 75 percent of DTCs are completed within the same year. Each hospital diagnosis corresponds to a homogeneous group of unique DTC codes within a medical specialty. The Appendix provides further information on the DTC codes.

Roughly 8 percent of our data contains incomplete DTC records. Some hospitals delivered incorrect information, such as wrong or non-existent zip codes, which are crucial for the construction of our panel dataset. We thus deleted a two-digit zip code area when a hospital in that area delivered incorrect zip codes for more than 20 percent of its treatments in a given year.Footnote 11
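The exclusion rule described above can be sketched in a few lines of Python. This is an illustration only, not the code used for the study: the record layout (the keys `hospital`, `hospital_zip`, `year`, `patient_zip`) and the function name are hypothetical, and the sketch assumes that an area is dropped only for the year in which the threshold is exceeded, which the text does not state explicitly.

```python
from collections import defaultdict


def areas_to_drop(records, valid_zips, threshold=0.20):
    """Return the set of (two-digit area, year) pairs to exclude.

    records: iterable of dicts with the hypothetical keys
        'hospital', 'hospital_zip' (four-digit string), 'year', 'patient_zip'.
    valid_zips: set of existing four-digit zip codes (strings).
    """
    total = defaultdict(int)   # treatments per (hospital, year)
    bad = defaultdict(int)     # treatments with an invalid patient zip code
    location = {}              # hospital -> two-digit area of the hospital

    for r in records:
        key = (r["hospital"], r["year"])
        total[key] += 1
        if r["patient_zip"] not in valid_zips:
            bad[key] += 1
        location[r["hospital"]] = r["hospital_zip"][:2]

    dropped = set()
    for (hospital, year), n in total.items():
        # Strictly "more than 20 percent", as in the text.
        if bad[(hospital, year)] / n > threshold:
            dropped.add((location[hospital], year))
    return dropped
```

A hospital with exactly 20 percent incorrect zip codes is kept under this reading, because the rule applies only above the threshold.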

Our dependent variable, treatment density, is defined as the number of treatments in a four-digit zip code area divided by the population size,Footnote 12 which creates a panel data set of repeated observations for approximately 3,600 four-digit zip code areas for the years 2006–2009. For very small areas, treatment density shows greater variation and we were confronted with missing values and outliers.Footnote 13 According to Diehr et al. (1992) geographical areas should not be too small; we thus excluded all four-digit zip code areas with fewer than 500 inhabitants, losing 1 percent of the total number of DTC records and about 850 four-digit zip code areas. The final analysis relied on about 3,000 four-digit zip code areas. The descriptive statistics for treatment density are presented in Table 2.
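As a minimal sketch of how such a dependent variable could be built, the following Python function divides treatment counts by population and applies the 500-inhabitant cut-off. The function name and the dictionary layout are hypothetical, and the sketch assumes that an area without any DTC record in a given year has a density of zero.

```python
def treatment_density(treatments, population, min_inhabitants=500):
    """Compute treatments per inhabitant for each (zip4, year).

    treatments: dict mapping (zip4, year) -> number of treatments
    population: dict mapping (zip4, year) -> number of inhabitants
    Areas with fewer than `min_inhabitants` inhabitants are excluded.
    """
    density = {}
    for key, inhabitants in population.items():
        if inhabitants < min_inhabitants:
            continue  # too small: density would be noisy
        # Assumption: no record for an area-year means zero treatments.
        density[key] = treatments.get(key, 0) / inhabitants
    return density
```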

Table 2 Descriptive statistics of treatment density

Control variables

From Statistics Netherlands (CBS) we collected demographic and socio-economic data at four-digit zip code level (see Table 3).Footnote 14

Table 3 Descriptive statistics of 4-digit zip code areas

We included several variables in our analysis that indirectly control for health status.Footnote 15 The first 20 variables in Table 3 reflect the age distribution for 5-year cohorts per four-digit zip code. Next, we include information on gender and social and economic status of the population, the zip codes’ income distribution (three classes of national income distribution: the lowest 40 percent, the upper 20 percent, and the middle 40 percent) and data on the working and self-employed population and people receiving social assistance. We also included four urbanization levels for each zip code area, defined as the number of addresses per square kilometre. Another factor that potentially influences regional health care use is the mortality rate, defined as the number of deceased per 1,000 inhabitants.Footnote 16

The number of treatments in a geographical area may be associated with the availability of health care. For example, if the number of providers in an area increases, travel costs decrease, thereby facilitating access to care. We included the number of hospitals within 20 km as a proxy for hospital availability and three proxies for GP availability: (1) the average distance to the closest GP, (2) the number of GPs within a radius of three kilometres, and (3) the average distance to the closest GP centre. In the Netherlands GPs work as gatekeepers; patients need a referral for hospital admission. For some years data were unavailable (see Table 3). In those cases we used data for adjacent years as a proxy.

We controlled for excess demand with data on average annual waiting times, which we obtained from NZa (Table 4). Data were available for six treatments. For hip fractures waiting time was less relevant since in most cases treatment is urgent. We calculated the weighted average waiting time in a four-digit zip code area by weighting the waiting times of all hospitals that patients from that zip code area visited.
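The weighting step can be illustrated with a small Python sketch. It assumes that the weights are the numbers of patients from the zip code area who visited each hospital; the text does not spell out the weights, so this choice, like the function and argument names, is an assumption.

```python
def weighted_waiting_time(visits, waiting_time):
    """Patient-weighted average waiting time for one zip code area.

    visits: dict mapping hospital -> number of patients from the area
    waiting_time: dict mapping hospital -> average waiting time
    Returns None when no patient from the area was treated.
    """
    total = sum(visits.values())
    if total == 0:
        return None
    weighted_sum = sum(n * waiting_time[h] for h, n in visits.items())
    return weighted_sum / total
```

For instance, if 30 patients visit a hospital with a waiting time of 4 weeks and 10 patients visit one with 8 weeks, the area's weighted waiting time is (30 × 4 + 10 × 8) / 40 = 5 weeks.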

Table 4 Average waiting times of two-digit zip code areas

Physician data

We obtained data on the type of physicians from Dutch Hospital Data (DHD). For almost all individual general and university hospitals we had information, for each specialty, on the type of remuneration (FFS versus salary). The majority of physicians (about 68 percent on average) were paid FFS. Of the two types of salaried physicians, university hospital (UH) physicians treat on average fewer patients than general hospital (GH) physicians, presumably because they devote more time to education, research, and more complicated treatments. Kruijthof (2005) reports that Dutch FFS physicians have longer working hours, devote more time to patients and have fewer management responsibilities than salaried physicians. One could argue, however, that this may also be the result of patient selection, such as treating a higher proportion of short-stay patients (Wright 2007). Longer working hours may also reflect that FFS physicians put a lower weight on leisure time than salaried physicians.

We constructed the average percentages of physicians visited by patients for each remuneration type (see Table 5).Footnote 17 This approach also provided extra information on our data limitations, since it allowed us to identify the share of patients, about 15 percent on average and about 30 percent for varicose veins, who visited a physician of an unknown (UN) type. Recall that the majority of UN physicians work in private clinics and are paid FFS. This enables us to approximate the supply sensitivity effect of physicians working in private clinics as well.

Table 5 The average percentage of physicians visited by patients, per type of physician (UH university hospital, GH general hospital, FFS fee-for-service, UN unknown)

We defined the average percentages of physicians on the two-digit zip code level. There are about 90 two-digit zip code areas in the Netherlands. By using a more aggregated zip code we obtained a smoother pattern without large outliers. Moreover, defining our supply side variables on the four-digit zip code level would be problematic when zero patients are treated in a certain area: all percentages would then be zero. As soon as one patient is treated in such an area, at least one of the percentages becomes positive. This would create an artificial positive correlation between treatment density and the supply side variables. The same reasoning holds for the three-digit zip code level. The disadvantage of aggregating to the two-digit zip code level is that some variation across areas is lost. However, Table 6 shows that enough between and within variation of the physician percentages remains.
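The aggregation to two-digit areas can be sketched as follows. This is a simplified illustration under stated assumptions: the input is a hypothetical list of (four-digit zip code, physician type) pairs for one treatment and one year, and the two-digit area is taken to be the first two digits of the patient's zip code.

```python
from collections import Counter

TYPES = ("UH", "GH", "FFS", "UN")


def physician_shares(visits):
    """Percentage of patients visiting each physician type per two-digit area.

    visits: iterable of (zip4, physician_type) pairs, zip4 as a string.
    Returns a dict mapping zip2 -> {type: percentage}.
    """
    counts = {}
    for zip4, ptype in visits:
        counts.setdefault(zip4[:2], Counter())[ptype] += 1

    shares = {}
    for zip2, c in counts.items():
        total = sum(c.values())
        # A two-digit area appears here only if it has at least one patient,
        # so the all-zero case of empty four-digit areas does not arise.
        shares[zip2] = {t: 100.0 * c[t] / total for t in TYPES}
    return shares
```

Every four-digit area within the same two-digit area is then assigned the same four percentages, which by construction add up to 100.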

Table 6 Standard deviations physician percentages

Methods

Estimation method

We estimated the demand and supply Eq. [1] for each of the nine treatments using OLS, panel data random effects (RE), and panel data fixed effects (FE):

$$\begin{aligned} y_{it} =\alpha _i +\gamma _t +\delta _{{\textit{GH}}} p_{{\textit{GH}},it} +\delta _{{\textit{FFS}}} p_{{\textit{FFS}},it} +\delta _{{\textit{UN}}} p_{{\textit{UN}},it} +Z_{it}^{\prime } \beta +\varepsilon _{it} \end{aligned}$$
(1)

In Eq. [1], the dependent variable \(y_{it}\) is the treatment density in area \(i\) in year \(t\). The supply side variables \(p_{\theta ,it}\) represent the percentage of physicians of type \(\theta \) visited by patients from area \(i\) in year \(t\). Recall that the variables \(p_{\theta ,it}\) vary on the two-digit zip code level, while the dependent variable varies on the four-digit zip code level. The vector \(Z_{it}\) includes the set of control variables, \(\gamma _t\) is a year fixed effect, and \(\alpha _i\) is a constant \(\alpha \) (OLS) or an individual random (RE) or fixed effect (FE) depending on the estimated specification.
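To make the role of \(\alpha _i\) and \(\gamma _t\) in the FE specification concrete, the sketch below implements the within transformation for a deliberately simplified case: a balanced panel and a single regressor. It is not the estimation code behind the reported results, which involve several regressors and were presumably obtained with standard econometric software; the function name is hypothetical.

```python
def two_way_within_slope(y, x):
    """Slope of y on x after removing area and year effects.

    y, x: lists of lists indexed [area][year]; the panel must be balanced.
    Each variable is demeaned by its area mean and year mean (adding back
    the grand mean), then the slope is obtained by OLS without intercept.
    """
    n, t = len(y), len(y[0])

    def demean(v):
        grand = sum(sum(row) for row in v) / (n * t)
        area_mean = [sum(row) / t for row in v]
        year_mean = [sum(v[i][j] for i in range(n)) / n for j in range(t)]
        return [[v[i][j] - area_mean[i] - year_mean[j] + grand
                 for j in range(t)] for i in range(n)]

    yd, xd = demean(y), demean(x)
    sxy = sum(xd[i][j] * yd[i][j] for i in range(n) for j in range(t))
    sxx = sum(xd[i][j] ** 2 for i in range(n) for j in range(t))
    return sxy / sxx
```

Because the transformation removes everything that is constant within an area, the slope is identified only from changes over time within areas. This is why regressors with little within variation are estimated imprecisely in the FE model.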

The variables of interest are the supply side variables \(p_{\theta , it}\), where we distinguish between the four physician types (\(\theta =\hbox {GH, FFS, UN, UH}\)). Note that GH and UH are salaried physicians working at general and university hospitals respectively. FFS are FFS physicians working at general hospitals. UN are FFS physicians working for private clinics. We excluded the variable \(p_{UH,it}\) since its inclusion would lead to perfect multicollinearity.Footnote 18 The estimates of \(p_{\theta ,it}\) are thus relative to the base category, which is the percentage of patients visiting a UH physician in area \(i\) in year \(t\).
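The interpretation relative to the base category can be made explicit with a short derivation. It is a sketch that assumes the four percentages add up to 100 in every area and year; the coefficients \(\tilde{\delta }_\theta \) of a hypothetical regression containing all four shares are introduced here for illustration only.

$$\begin{aligned} \sum _{\theta } \tilde{\delta }_\theta p_{\theta ,it}&=\tilde{\delta }_{UH} \Big ( 100-\sum _{\theta \ne UH} p_{\theta ,it} \Big ) +\sum _{\theta \ne UH} \tilde{\delta }_\theta p_{\theta ,it} \\&=100\,\tilde{\delta }_{UH} +\sum _{\theta \ne UH} \left( \tilde{\delta }_\theta -\tilde{\delta }_{UH} \right) p_{\theta ,it} \end{aligned}$$

Under this assumption the constant \(100\,\tilde{\delta }_{UH}\) cannot be separated from the intercept, and each coefficient in Eq. [1] corresponds to \(\delta _\theta =\tilde{\delta }_\theta -\tilde{\delta }_{UH}\), that is, the effect of shifting patients from UH physicians to physicians of type \(\theta \).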

If supply sensitivity does not play a role we expect that \(\delta _\theta =0\); i.e. the type of physician visited by patients is not related to the treatment density. However, when the remuneration type of the physician does matter we expect the following two hypotheses to hold.

Hypothesis I

Treatment density is higher when patients visit relatively more FFS or UN physicians compared to UH physicians. That is, \(\delta _{FFS} >0\) and \(\delta _{UN} >0\).

Hypothesis II

Treatment density is higher when patients visit relatively more FFS physicians compared to GH physicians. That is, \(\delta _{FFS} >\delta _{GH} \).

Both hypotheses state that FFS and UN physicians have stronger (financial) incentives to treat more patients than UH and GH physicians. We suppose that the size of the effect is stronger for supply sensitive treatments. However, for hip fractures financial incentives are unlikely to play a large role. This leads to the third hypothesis:

Hypothesis III

Hypotheses I and II do not hold for the treatment of hip fractures.

Economic effects

The estimated coefficients of interest, \(\delta _\theta \), do not provide information about the economic significance of the supply side effects. Therefore we calculated the marginal effects. The marginal effects \(\eta _{\theta _1 -\theta _2}\) represent the effect on treatment density of a one percentage point increase in \(p_{\theta _1}\), while \(p_{\theta _2}\) decreases simultaneously by one percentage point. The marginal effects are measured as a percentage change in treatment density compared to the average treatment density. Formally:

$$\begin{aligned} \eta _{\theta _1-\theta _2} =100\% \times \left( {\frac{\hat{\delta }_{\theta _1} -\hat{\delta }_{\theta _2}}{\bar{y}}} \right) \end{aligned}$$
(2)
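Eq. [2] is simple enough to compute directly. The sketch below does so in Python; the function name is hypothetical and the numbers in the usage note are made up for illustration, not taken from the estimation results. Since UH is the base category, its coefficient is zero by construction, so \(\eta _{FFS-UH}\) is obtained by passing 0 as the second coefficient.

```python
def marginal_effect(delta_1, delta_2, mean_density):
    """Percentage change in treatment density, relative to its mean, when
    the share of type 1 rises by one percentage point and the share of
    type 2 falls by one percentage point (Eq. [2])."""
    return 100.0 * (delta_1 - delta_2) / mean_density


# Illustrative values only (not estimates from the paper):
eta_ffs_gh = marginal_effect(0.5, 0.3, 10.0)   # FFS versus GH
eta_ffs_uh = marginal_effect(0.5, 0.0, 10.0)   # FFS versus base category UH
```

With these made-up inputs, the FFS versus GH effect is about 2 percent and the FFS versus UH effect is 5 percent of the average treatment density. Swapping the two types only changes the sign of the effect.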

In addition to these baseline estimates, we perform several robustness checks. We correct for patients’ cross-border mobility, and we tackle the low within variation for UH physicians by adding the percentage of UH physicians to the percentage of GH physicians. The rationale behind this is that UH and GH physicians are both paid a fixed salary.

Results

Estimation results

Table 7 presents the full estimation results for cataract surgery. For all nine treatments we present the estimated coefficients on our variables of interest \((p_{\theta ,it})\) in Tables 8, 9, and 10.

Table 7 Estimation results cataract surgery (t-statistics in parentheses)
Table 8 OLS estimation results all treatments (t-statistics in parentheses)
Table 9 RE estimation results all treatments (t-statistics in parentheses)
Table 10 FE estimation results all treatments (t-statistics in parentheses)

Column (1) shows the result of an OLS estimation of Eq. [1] in which we excluded the supply side variables \(p_{\theta ,it}\). Cataract surgeries occur more frequently in areas with a higher share of older people, reflecting that it is mostly older people who need cataract surgery. Treatment density is lower in areas with relatively more Western immigrants and higher in areas with relatively more non-Western immigrants. For urbanization we find the largest effects in the middle categories. Waiting time for cataract surgery has a negative impact on the treatment density, indicating that treatment density is lower in areas with a higher waiting time. The variables related to distance to the GP have a negative impact on treatment density as well. For other treatments, the effect of control variables is potentially very different. For instance, for tonsillectomy we find that treatment density is higher in areas with relatively more people in the age group 0 to 5 years. Since we are interested in the impact of physician remuneration, we do not report the full results here but limit ourselves to the coefficients of interest. The regression results for all treatments are available from the authors upon request.

In columns (2), (3), and (4) of Tables 7, 8, 9, and 10 we add the physician remuneration variables. In the OLS and RE specifications we observe only small changes in the estimated coefficients for the control variables. In the FE model, however, most control variables are no longer significant because of the low within variation of these explanatory variables.

The physician remuneration variables are positive and highly significant in all three specifications. Treatment density is higher when patients visited relatively more FFS, GH, or UN physicians (compared to the UH type). This finding confirms hypothesis I stating that \(\delta _{FFS} >0\) and \(\delta _{UN} >0\) for cataract treatments. Note that the estimated coefficients on the variables \(p_{\theta ,it}\) are much higher in absolute terms in the FE model. This is related to the fact that the within variation for UH physicians is low (see Table 6).Footnote 19 We postpone the discussion of hypothesis II to the next section where we discuss the economic significance by comparing the coefficients on the variables \(p_{\theta ,it}\).

In Tables 8, 9, and 10 we show the estimated coefficients on the physician remuneration variables \(p_{\theta ,it}\) for all nine treatments. The coefficients in the tables show that hypothesis I is confirmed in all three models for cataract, tonsillectomy, hernia, and arthrosis (knee and hip). For varicose veins (surgery) hypothesis I is confirmed in both the OLS and the RE model. For varicose veins (dermatology) hypothesis I is confirmed in both the RE and the FE model. For inguinal hernia, hypothesis I is only confirmed in the FE model.

Moreover, we can already conclude that hypothesis I is not confirmed for the treatment of hip fractures in any of the three models. The coefficients on the variables \(p_{\theta ,it}\) are all insignificant in the three models, except for the coefficient on \(p_{GH,it}\) in the OLS model. This coefficient is very small and does not have the expected sign. This indicates that the treatment of hip fractures is not related to the type of physician visited by patients and confirms the first part of hypothesis III, which states that hypothesis I does not hold for the treatment of hip fractures.

With the Breusch and Pagan Lagrange multiplier test we tested the RE model against the OLS model, favouring the RE model (at a 5 percent significance level) for all nine treatments. A generalized Hausman test, in turn, rejects the RE model in favour of the FE model in all cases.Footnote 20 However, we continue to present OLS and RE estimates for reasons of comparison.Footnote 21

Economic importance

Table 11 presents the marginal effects \(\eta _{\theta _1 -\theta _2}\). Since the FE model is the preferred specification we focus on the results in columns (3), (6), and (9) of Table 11.

Table 11 Marginal effects main model

The results in columns (3), (6), and (9) of Table 11 show the economic importance of the effects. For example, for cataract surgery we find that treatment density increases by 3.4 percent (compared to the average) when the percentage of patients visiting a FFS-physician increases by one percentage point and the percentage of patients visiting a UH-physician decreases by one percentage point.Footnote 22 Column (6) of Table 11 shows that this number is 0.27 percent when the percentage of patients visiting a FFS-physician increases by one percentage point and the percentage of patients visiting a GH-physician decreases simultaneously by one percentage point. We observe hardly any difference between FFS and UN-physicians for cataract surgery.

Hypothesis II, stating that \(\delta _{FFS} >\delta _{GH}\), is confirmed for cataract surgery, tonsillectomy, hernia, and the treatment of arthrosis (knee) in the FE model (column 6). For these treatments the estimated marginal effect is in the range of 0.17 percent (hernia) to 0.81 percent (arthrosis–knee). For the other treatments we find insignificant results, indicating that there is no difference between FFS and GH physicians. We conclude that for four treatments (cataract, tonsillectomy, hernia, and knee arthrosis) practice variations can be explained by differences in the remuneration among physicians.

It is also interesting to investigate how treatment density relates to the percentage of patients visiting a UN physician. Column (9) of Table 11 shows that UN physicians, who mainly work in private clinics, perform similarly to FFS physicians.Footnote 23 However, for the treatment of varicose veins (dermatology) we find that treatment density is significantly higher in areas where relatively more patients visit a UN physician rather than a FFS physician (see column (9) in Table 11). This suggests that for varicose veins private clinics have increased the number of treatments and that they may have created additional demand.

We also confirm hypothesis III. As already shown in the previous section, none of the physician remuneration variables \(p_{\theta , it}\) has a significant impact on the treatment density of hip fractures. Moreover, we find no significant differences between the percentage of FFS and GH physicians in the FE model. In other words, practice variations in hip fractures are not related to the remuneration of the physicians in our preferred specification. In the OLS and RE models, hypothesis II is confirmed for hip fractures. This is in contrast to hypothesis III. However, if we take a closer look at the coefficients we see that this results from the fact that the coefficient on \(p_{GH,it}\) is negative and significant (almost significant in RE) while the coefficient on \(p_{FFS,it}\) is close to zero and not significant. We conclude that for hip fractures this difference is not likely to be related to financial incentives of physicians.

Robustness checks

Excluding border areas

Unfortunately, we have no information on cross-border mobility. Some patients go abroad for treatment. Cross-border health care costs represent about 1.2 percent of the total health care costs in the Netherlands. This could influence our results if treatment densities are measured incorrectly in border areas. For this reason, we re-ran our regressions excluding the 26 two-digit zip code areas that are adjacent to the Belgian and German borders. This reduces the total number of observations by about 30 percent (to 7,029 observations). Table 12 shows the marginal effects resulting from the analysis without border areas. The results are highly comparable to our previous findings presented in Table 11.

Table 12 Robustness check border areas

Low within variation for university hospitals

Table 6 shows that the within variation of the percentage of patients visiting a university hospital is low. Indeed, almost no changes took place in the market of university hospitals during our sample period. For the other group of hospitals we observe that a few hospitals exited and new hospitals entered the market. Moreover, for a small number of hospitals we observe changes in remuneration schemes of physicians during the sample period. The low within variation for UH physicians aggravates identification problems in the FE model because identification is solely based on this variation. For example, the estimated marginal effects \(\eta _{FFS-UH}\) are much higher in the FE model than in the other models. To circumvent this problem we add the percentage of UH physicians to the percentage of GH physicians. The rationale behind this is that UH and GH physicians are both paid a fixed salary.Footnote 24 Table 13 presents the results, which are in line with our previous findings. The marginal effects \(\eta _{FFS-(GH+UH)}\) are now a weighted average of the former two marginal effects \(\eta _{FFS-UH}\) and \(\eta _{FFS-GH}\) presented in Table 11. The marginal effects \(\eta _{FFS-UN}\) do not change very much compared to the findings in Table 11.

Table 13 Robustness check GH and UH together

Conclusions

The Dutch government liberalized the provision of hospital services in 2001 by introducing payment systems that were to a large extent volume-based and open-ended, such that “money follows the patients”. In combination with limited cost-sharing for patients and free hospital choice, this led to a strong growth in hospital production.

We showed that physicians respond to financial incentives and these effects appear to be stronger for supply sensitive treatments. We found that utilization rates are higher in areas where more patients are treated by FFS physicians. For example, if patients visit one percent more FFS physicians instead of salaried physicians, then the number of cataract treatments in that area increases by about 0.27 percent. For treatments of varicose veins we find that physicians working in private clinics treat even more patients than physicians working in general hospitals that are paid fee for service. This may suggest a selection effect. Entrepreneurial physicians will put a high weight on income in their utility function, and are therefore more likely to be working for private clinics.

We were able to identify this effect by using exogenous variation for groups of physicians that face different financial incentives. This variation stems from physicians, hospitals and clinics entering and leaving the market, and from changes in physician remuneration in some hospitals during the sample period. Moreover, the validity of our results is supported by a similar analysis for the treatment of hip fractures. In line with our expectations, we find that the hip fracture treatment density is not related to physician remuneration.

In this study we were not able to analyse the evolution of the utilization rates before 2005, when the DTC system was introduced in the Netherlands, due to a lack of data and a different registration of medical conditions. We therefore cannot reject the hypothesis that an uneven treatment intensity in the years preceding the DTC implementation could explain the differences found in this study.

There are many directions for further research. Other supply factors, such as the number of physicians, the degree of concentration of physicians in hospitals, differences in physician practices or prices of treatments, may be important as well for explaining practice variations. It would also add value to have some information on quality of care. Although we find variation between FFS physicians and salaried physicians, we cannot infer from this variation that the quality of health care is poor. Also, adding more control variables, such as the health status of patients, could improve the results.

From a policy point of view it is hard to say whether it is preferable to have more or fewer physicians on FFS. Our results indicate that allowing FFS physicians and private clinics on the market can be a mixed blessing. On the one hand, these physicians may treat individual patients more efficiently, but on the other hand, they may treat too many patients. If the government believes that “overtreatment” exists, then replacing FFS physicians with salaried physicians could be a sensible policy option. However, to answer this question properly we need more information about the possible differences between FFS physicians, UN physicians and salaried physicians. Information about the quality of treatments will help policymakers to make a better decision on having more or fewer FFS physicians. For example, if it turns out that FFS and UN physicians deliver better quality of care than salaried physicians, then this could also be seen as an argument to support more FFS physicians.

Our finding that supply side factors are related to practice variations may also have important policy implications for supply and demand side regulation. First, if hospital management suspects too much production, it could adjust its contracts with physicians. For example, regressive tariffs could be introduced to dampen production. This strategy, however, will be difficult to realize since hospital income depends upon production. Second, the option of being an FFS physician could be made less attractive. This is currently the subject of political debate in the Netherlands. Third, insurers could address unwarranted practice variations by benchmarking hospitals and using managed care activities to discipline physicians. Fourth, unwarranted practice variations can be mitigated by increasing cost-sharing arrangements for consumers for those supply sensitive treatments.