1 Introduction

In recent years there has been growing interest in comparing the relative performance of health systems [1]. Many OECD countries are developing national performance measurement frameworks for monitoring and comparing the overall efficiency of their health care systems. These frameworks cover important dimensions of health care provision and usually include a selection of indicators for health outcomes, distribution, productivity, and patient satisfaction. In this setting, output measurement based on various activities may provide a useful means to assess and compare the technical aspects of hospital production (productivity and technical efficiency measurement based on intermediate types of output such as patients grouped by DRGs). However, this approach still faces many difficulties. Making output measures comparable is not a trivial task, since the output grouping definitions, coding classifications and coding practices usually vary considerably across countries. Recently, the OECD has noted the need for a generic patient grouping system (e.g. Diagnosis Related Groups, DRG) that can take into account the varying patient classifications across the OECD countries [2]. The OECD report recognises the importance of, and need for, a general mapping of the various procedure coding systems. However, the use of different procedure classifications is not the only obstacle to the harmonisation of DRGs: the local applications of standardised disease classifications, such as the ICD-10, are also known to vary across European countries [3].

Due to the difficulties in the measurement of output and case-mix, international comparisons of hospital efficiency are relatively scarce in the literature. Hansen and Zwanziger [4] used cost functions to compare marginal costs in general acute care among US and Canadian hospitals. Mobley and Magnussen [5] and Magnussen and Mobley [6] examined the relative performance of Norwegian (regulated, public) and Californian (unregulated, competitive) hospitals using DEA and empirical data from 1997. Steinman et al. [7] compared a sample of Swiss and German hospitals and Linna et al. [8] compared hospital efficiency between Norway and Finland using data envelopment analysis. All these studies have found considerable differences in the average efficiency across the countries, but did not report explicitly on how the output measurement and case-mix adjustments were accomplished. It is obvious that several compromises in the accuracy of the output measurement had to be made in most of the previous comparative studies.

In this study we explored the possibility of pushing the limits of an international comparison of hospital efficiency further by using patient-level data from several countries. One of the main objectives in this study was to investigate the output grouping definitions with the primary data and how to improve validity in the total output measure when employed in international comparisons. Typical problems and differences in the output definitions which may affect DRG-based efficiency comparisons are reported in this article.

The study focused on hospital care since all Nordic countries have fairly good administrative and register data on hospitals (hospital discharge data) and even the possibility of linking registers using the personal ID number. The study was done in four Nordic countries (Denmark, Norway, Sweden and Finland) in a setting where the structure of organising hospital care and the available data (e.g. coding and the primary classifications used) are sufficiently similar. In addition, each of the Nordic countries applied similar DRG grouping systems based on the common NordDRG system.

2 Hospital financing and organisation in the Nordic countries

2.1 Hospital financing

In the Nordic countries, health care has been decentralised to local or regional authorities who usually purchase and also provide the health services. In all Nordic countries the funding of hospital care is a mixture of global budgeting and activity-based funding (ABF). Activity-based funding is a payment model based on the volume and type of hospital services provided to each patient, whereas in global budgeting the volume and total price are fixed prospectively. In Denmark, activity-based financing was not yet widely implemented during the study period. In addition, in Finland the implemented activity-based financing did not have the same incentive effects as in Norway and Sweden, which applied DRG-based funding to a greater extent. Overall, however, Norway, Sweden and Finland could be classified as using ABF and Denmark as using global budgeting.

Since health and hospital care is mainly publicly provided in the Nordic countries, taxation is the main source for financing hospital care. In 2002, the total expenditure on health as a percentage of GDP was 9.8 in Norway, 9.3 in Sweden, 8.8 in Denmark and 7.8 in Finland. The hospital care providers were regional health enterprises in Norway (after 2002), county councils in Denmark and Sweden, and hospital districts in Finland.

While municipalities and counties for the most part were responsible for arranging all health care, these responsibilities were divided differently in each country. The responsibility for hospital care in Norway, Denmark and Sweden was given to regional authorities but in Finland to local authorities. In Finland, hospital care is financed mostly by municipalities, which get their income from local taxes and non-earmarked state subsidies. In Sweden and Denmark hospital care is financed by county council taxes (with additional general grants), while in Norway there were previously two public financiers (the counties and the state); after 2002 the state has been the only financier.

In Denmark, the overall budget was negotiated each year in a Budget Cooperation between state government and the local governments, who were represented by the County Council Association and the Association of Municipalities. In this co-operation an overall ceiling of the growth in the local tax rate was agreed upon and the level of state block-grants was negotiated. In Sweden the central government allocated financial assistance and acted as supervisor of activities in the county councils. In Norway after 2002, the hospital care was financed directly from the state budget as the state now owned the public hospitals, reflecting the lowest possible decentralisation level.

2.2 Hospital organisation

The public hospitals in all of the Nordic countries were responsible for producing the majority of secondary and tertiary level services in health care. In each of the Nordic countries one could, during the study period, still observe a rather similar structure in hospital organization, where hospitals could be classified roughly into three types: university teaching hospitals, central hospitals and local hospitals (as an exception, there are a few specialized hospitals, e.g. for orthopaedics and ophthalmology).

University teaching hospitals provided the most specialized tertiary level services and were usually organized to serve regions having populations between 0.5 and 1.0 million. University teaching hospitals were also the main sources of teaching and research output by providing medical education, and conducting clinical research.

Central hospitals provided services to a region/county/district and provided somewhat less specialized tertiary services compared to the university teaching hospitals. Typical central hospitals included several (8–15) specialty clinics and emergency departments.

Local hospitals were usually smaller hospitals supplying secondary level services to a group of municipalities within a county/district. A typical local hospital included inpatient wards for 3–5 specialties (e.g. internal medicine, general surgery, obstetrics/gynaecology) and some additional specialties on an outpatient basis.

In Sweden the 21 county councils (including two regions and one municipality responsible for health care delivery) were divided into six regions having eight university and region hospitals providing tertiary level services. The total number of Swedish hospitals in this study was 49, of which seven were university hospitals. In Sweden there were 69 hospitals in total, but 17 hospitals were included in the final accounts of another hospital and did not have decision-making power (and hence could not be defined as individual decision-making units). In addition, three very small hospitals (Mora, Skellefteå and Lycksele) were excluded. However, the excluded Swedish hospitals represented sufficiently closely the overall distribution of hospital characteristics in Sweden, and their exclusion did not introduce any significant bias into our comparison.

In Finland, the 20 hospital districts owned and ran more than 40 acute care public general hospitals, which were divided into three categories: university teaching hospitals (five hospitals), central hospitals (15 hospitals) and other municipal hospitals. Each hospital district hosted one central hospital which in five hospital districts were university teaching hospitals and in 15 hospital districts (non-university) central hospitals.

In Norway, the Health Enterprise Act of 2002 fundamentally changed the hospital care system. Hospital ownership was transferred from the counties to the central government, so that after the reform hospital care was provided by public enterprises and financed by the central government. Currently there are five Regional Health Enterprises (RHEs) that report to the Ministry of Health and are responsible for delivering health services in their regions. In Norway, 8 of the 43 hospitals can be classed as university hospitals and the rest as central or local hospitals.

The Danish hospitals were owned and run by the 14 counties. In Denmark, there were 10 university hospitals, yet some of them were clearly smaller units compared to e.g. Swedish and Finnish university hospitals. The total number of Danish hospitals in this study was 54 (there were approximately 60 public hospitals in total in the country).

In Finland, the executive management of hospitals usually consists of a chief physician, a chief nurse and a director of finance, while hospital managers are accountable to the council of the hospital district. In Sweden, the hospitals are managed by a combination of elected public officials sitting on boards and hospital managers. In Denmark, a recent trend has been to merge management functions and/or create matrix organisations by merging departments from several hospitals into functional units with joint responsibilities for particular treatment areas. Norway is on a somewhat similar path, as the boards of the local health trusts manage the hospitals, though it has been noted that the management role of the trust boards is not clear [9, 10]. In all Nordic countries hospital staff were paid on a salary basis, with wage levels based on nationwide contracts between employer and employee organisations.

2.3 Hospital data generating processes

The basic model for data generation is rather similar in all of the countries in this study. Medical personnel are responsible for feeding in the key medical record information (age, sex, primary and secondary diagnoses) in the hospital’s system. After discharging the patient, the hospital’s data administration merges data sets from various hospital systems (e.g. the procedures from the operating room information technology systems, systems for medical diagnostics, patient administration systems for the wards, outpatient records) to create a standardized discharge abstract. After quality checking, this discharge abstract is sent on to the national statistical authorities to form a national discharge registry.

However, although it is the physician’s responsibility to give a diagnosis for the patient, there are differences in how this given diagnosis is interpreted as a valid ICD-10 diagnosis and coded accordingly in the system. In some cases it may be fed in the system by nurses or assistants on the wards, sometimes by the physicians. It is known that the diagnosis coding practices differ in the hospitals within countries and also across countries. For example, the propensity of using secondary diagnoses varies considerably across the countries and this variation may affect the comparisons markedly. It was already noted in the comparison between Norway and Finland in 1999 that the relative shares of patients with serious co-morbidities affecting the DRG grouping were significantly higher in the Norwegian hospitals [8]. Another potential weakness in the data generating process was in the outpatient records. Many inpatient treatments were already shifted to outpatient care during the early 2000s, but the coding of diagnoses was not of sufficiently high quality in the IT systems for outpatient care. However, each of the Nordic countries in the present study shared the same types of problems in the data administration, and consequently our hypothesis was that there were not large systematic country-level differences in these aspects of data quality.

3 Data

The data set was based on national discharge databases collected from hospitals in Finland, Norway, Sweden and Denmark. In the current study, data from 2002 were used. The year 2002 was the last year when the Danish hospitals could be measured using the common NordDRG grouper, since the Danish Ministry of Health decided to build its own grouping system (the so-called DKDRG grouper) and also a separate outpatient grouping system (DAGS) for ambulatory patients [11].

Outputs were measured as (DRG) weighted discharges in inpatient care, day surgery, day-care and the number of outpatient visits. DRG weights were averages calculated from the national cost weight sets. Inputs were measured as costs in real terms, using wage and consumer price indices to adjust operating costs. Descriptive statistics are shown in Table 1.

Table 1 Descriptive statistics

3.1 Output data—DRG grouping

In 1996, the Nordic countries launched a modified DRG system based on the Nordic version of the ICD-10 and a new Nordic Classification of Surgical Procedures (NCSP) introduced in 1996 (http://www.norddrg.net/norddrgmanual/). The current version of the NordDRG applies to Nordic diagnosis and procedure codes but imitates the classification rules in the 12th edition of the DRG classification issued by the US-HCFA in 1994. Alongside the common Nordic version, each country has its own localised national version. Fortunately, during the study period the national versions had not diverged too much, and thus in this study we could group inpatient admissions and day care episodes using each country’s own NordDRG version. Primary grouping of the patient data was done with yearly versions of the national groupers. However, slight modifications to the primary groupings were necessary to ensure comparability. First, all cases where DRGs were split into subgroups in the national versions were aggregated back to the original grouping (Footnote 1). Secondly, in Norway there were considerable volumes in some DRG groups that did not exist in the other countries. For example, normal newborns were grouped and counted as output in Norway, whereas in the other Nordic countries delivery-related DRGs included only the hospital discharges for the mother’s stay on the maternity ward. An adjustment was necessary only for normal newborns, since newborns with any other types of problems would be counted in the paediatric DRG groups. In addition, the Norwegian inpatient grouping included significant volumes of rehabilitation, dialysis treatment and radiation therapy. These treatments were provided mainly in day care or outpatient visit settings in the other countries. These discrepancies were adjusted in the data used in this study (Footnote 2).

3.2 Output data—the definition for a DRG case (discharge)

Before grouping the output data, one critical task was to harmonize the definition of a ‘discharge’. It turned out that in Norway and Denmark discharges were defined as ‘hospital discharges’ while in Sweden and Finland the discharges were defined as ‘specialty discharges’. Specialty discharge means that if the patient is transferred to other clinics within the same hospital, a new discharge is counted. Due to the varying definitions for a discharge, Swedish and Finnish hospital data had to be fixed by merging patient discharge data in cases where clinical transfers were found. The main diagnosis for the hospital discharge was inherited from the specialty discharge having the largest DRG cost weight.
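The merging step can be illustrated with a minimal sketch. The record layout and field names below are hypothetical, and the transfer rule (same patient, same hospital, next episode starting on the day the previous one ended) is an assumption made for illustration; the linkage criteria actually used in the study are not stated. Only the rule that the main diagnosis is inherited from the specialty discharge with the largest DRG cost weight comes from the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class SpecialtyDischarge:
    patient_id: str      # hypothetical field names, for illustration only
    hospital_id: str
    admitted: date
    discharged: date
    main_diagnosis: str  # ICD-10 code
    drg_weight: float    # cost weight of the DRG assigned to this episode


@dataclass
class HospitalDischarge:
    patient_id: str
    hospital_id: str
    admitted: date
    discharged: date
    main_diagnosis: str
    n_episodes: int


def merge_transfers(episodes: List[SpecialtyDischarge]) -> List[HospitalDischarge]:
    """Collapse chains of within-hospital transfers into hospital discharges."""
    ordered = sorted(episodes, key=lambda e: (e.patient_id, e.hospital_id, e.admitted))
    chains: List[List[SpecialtyDischarge]] = []
    for ep in ordered:
        last = chains[-1][-1] if chains else None
        # Assumed transfer rule: same patient, same hospital, and the new
        # episode starts on the day the previous one ended.
        if (last is not None
                and last.patient_id == ep.patient_id
                and last.hospital_id == ep.hospital_id
                and ep.admitted == last.discharged):
            chains[-1].append(ep)
        else:
            chains.append([ep])

    merged = []
    for chain in chains:
        # Main diagnosis is inherited from the episode with the largest DRG cost weight.
        heaviest = max(chain, key=lambda e: e.drg_weight)
        merged.append(HospitalDischarge(
            patient_id=chain[0].patient_id,
            hospital_id=chain[0].hospital_id,
            admitted=chain[0].admitted,
            discharged=chain[-1].discharged,
            main_diagnosis=heaviest.main_diagnosis,
            n_episodes=len(chain),
        ))
    return merged
```

In this sketch the merged record would still have to be passed through the national grouper to obtain the DRG of the hospital discharge; that step is not shown.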

3.3 Output data—the definition for day care

Day care included cases where the patient did not stay overnight in the patient ward, but where treatment was considerably more resource intensive than in the usual outpatient visits. Short stay surgery (e.g. cataract surgery) is a typical example of day care in the operative specialties, and renal dialysis in the medical specialties. There were, however, large differences among the Nordic countries in the organization and supply of day care services. Day care cases were grouped using the local NordDRG groupers and weighted accordingly.

3.4 Output data—the definition for outpatient visits

Outpatient visits were measured as simple counts without case-mix weighting. Outpatient visits included all emergency and scheduled visits, and both first and follow-up visits. The treatment in a typical outpatient visit may include minor diagnostic procedures or treatments, but usually more resource intensive cases are defined as day care (see above). It is possible that the Nordic countries use slightly different definitions for outpatient visits and conservative day care. Thus there may be slight differences in the outpatient visit output. In operative day care the inclusion criteria were rather clear, since an explicit list of procedure codes was used.

3.5 Output data—DRG cost weights

All patient cases (discharges and day care) were grouped and weighted using the estimated average costs in each country separately. Because the national DRG weights varied, we used weighted average DRG weights to aggregate inpatient care and day care, thereby obtaining a common set of weights. Because cost-accounting data at the patient level were not produced regularly in all hospitals in the Nordic countries, the cost weights were derived from the cost information of samples of hospitals in each country. In Finland, the cost weights were derived from the cost information of the largest hospital district, the Helsinki-Uusimaa district (HUS). The costing sample covered approximately 30% of all acute hospital care in Finland. Cost items included diagnostic tests, procedures, medical services, support services and overhead costs. In Norway the cost weights are adjusted annually in a national price list for the DRG system and were based on the average price per DRG per year in a sample of hospitals. In Sweden the cost weights have, since 1999, been derived from calculated cost-per-patient information from a sample of hospitals that covered about 50% of all acute hospital care in 2008 (with regional hospitals overrepresented).
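As an illustration of how a common weight set could be formed, the following sketch computes a case-weighted average of the national weights for each DRG. The text does not state which weighting basis was used, so weighting by national case volumes is an assumption, as is the premise that the national weights are already expressed on a comparable relative scale. The function names and DRG codes are hypothetical.

```python
from typing import Dict

Weights = Dict[str, Dict[str, float]]  # country -> DRG code -> national cost weight
Cases = Dict[str, Dict[str, int]]      # country -> DRG code -> number of cases


def common_drg_weights(weights: Weights, cases: Cases) -> Dict[str, float]:
    """Case-weighted average of national DRG weights (assumed weighting basis)."""
    drgs = {drg for per_country in weights.values() for drg in per_country}
    common: Dict[str, float] = {}
    for drg in sorted(drgs):
        numerator, total_cases, values = 0.0, 0, []
        for country, per_country in weights.items():
            if drg not in per_country:
                continue  # this DRG has no national weight in this country
            n = cases.get(country, {}).get(drg, 0)
            numerator += per_country[drg] * n
            total_cases += n
            values.append(per_country[drg])
        # Fall back to a simple mean if no cases were recorded for the DRG.
        common[drg] = numerator / total_cases if total_cases > 0 else sum(values) / len(values)
    return common


def weighted_output(hospital_cases: Dict[str, int], common: Dict[str, float]) -> float:
    """DRG-weighted output of one hospital under the common weight set."""
    return sum(n * common[drg] for drg, n in hospital_cases.items())
```

A hospital's DRG-weighted output is then the sum of its case counts multiplied by the common weights, which makes the output measure independent of the national weight set under which the cases were originally grouped.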

Finally, for the analysis in the present study, the output was grouped into six categories: inpatient medical case types (IM), inpatient surgical case types (IS), day care/outpatient surgical products (OS), day care medical cases (OM), other DRG cases (ODRG) and outpatient visits (OV).

3.6 Input data

The cost data were based on the year-end accounts (according to the hospital book-keeping) and internal reports from the hospital accounting systems. In this study, hospital costs include all production related costs in a hospital, excluding capital costs and costs of teaching and research. In order to compile the cost data into a common currency, hospital costs were adjusted using the input price index. Input price adjusted operating costs (ADJ_COST) were used in all analyses.

3.7 Input prices

International comparisons of health expenditure and health prices must be based on a common currency. Purchasing power parities (PPP) are rates of currency conversion constructed to eliminate differences in price levels between countries. Generally, the reported PPPs adjust for price differences at the level of the total GDP, not sub-aggregates of the GDP, such as health expenditures. However, cross-country differences in health care prices are not necessarily consistent with differences in prices in general.

To approximate an input price index, we weighted 70% (the average share of wage expenditures) of hospital operating costs using the wage index and the remaining 30% (e.g. materials, equipment and rents) using the PPP conversion adjustment. Nurses’ wages accounted for 50% of the total expenditure and physicians’ wages for 20%. Table 1 presents the price indices employed for each of the Nordic countries. The input prices were substantially lower in Finland than in the other Nordic countries.
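The weighting can be illustrated with a small sketch. The 50/20/30 shares come from the text, but combining the components as an arithmetic weighted average into a single deflator is an assumption, and the index values in the usage note are hypothetical rather than the values reported in Table 1.

```python
def input_price_index(nurse_wage_index: float,
                      physician_wage_index: float,
                      ppp_index: float,
                      nurse_share: float = 0.50,
                      physician_share: float = 0.20) -> float:
    """Composite input price index relative to a reference country (index 1.0).

    Assumed form: arithmetic weighted average of the wage indices and the
    PPP-based index for non-wage items (materials, equipment, rents).
    """
    other_share = 1.0 - nurse_share - physician_share
    return (nurse_share * nurse_wage_index
            + physician_share * physician_wage_index
            + other_share * ppp_index)


def adjust_cost(operating_cost: float, index: float) -> float:
    """Deflate operating costs by the composite input price index."""
    return operating_cost / index
```

For example, with hypothetical wage indices of 1.30 for both nurses and physicians and a PPP-based index of 1.10, the composite index is 0.50 × 1.30 + 0.20 × 1.30 + 0.30 × 1.10 = 1.24, so operating costs of 124 in that country correspond to input price adjusted costs of 100.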

4 Methods

The measurement of cost efficiency is usually accomplished by using parametric stochastic frontier (SF) methods or nonparametric Data Envelopment Analysis (DEA). Parametric estimation of the stochastic frontier needs a behavioural hypothesis of cost minimisation. In addition, because the econometric approach is parametric, it confounds the effects of misspecification of the functional form with inefficiency. The DEA approach is nonparametric and, due to the milder conditions set for the form of the technology, is less prone to this type of specification error [12]. DEA is based on the relative efficiency measures proposed by Farrell [13], and in this framework a hospital is judged to be efficient if it is operating on the best practice production frontier. Because we employ the input price adjusted operating costs as the input variable, our definition of cost efficiency only approximates Farrell’s measure of total efficiency [13].

In assessing the cost efficiency of hospitals in this study we used DEA which utilizes linear programming techniques in the calculation of unit-specific efficiency scores [14]. DEA constructs a piecewise linear efficient frontier which serves as the reference in the evaluation of efficiency. If a hospital is efficient, it lies on the frontier and will receive an efficiency score of 1.0 (100% efficiency). Inefficient hospitals will receive a score lower than 1.0. For example, if the score for a hospital is 0.80 as measured in the input direction, its inefficiency is 20% and it could produce its output with 20% less input. Alternatively, with an output-efficiency score it produces 80% of its potential and it could increase its output by 25% using the same resources. With the CRS assumption the scores would be the same whether measured in the input or output direction. Bias-corrected efficiency estimates can be obtained by using bootstrapping methods from simulated distributions in pseudo samples [15, 16]. If the hospital size distribution is not similar, scale assumptions may influence not only the individual efficiency measures, but also the group averages. Therefore we presented results using models with both constant and variable returns to scale.

Cost efficiency was calculated by solving the following linear program:

$$ \begin{aligned} \min_{\lambda,\, z_{CE}} \quad & z_{CE} \\ \text{s.t.} \quad & \lambda \cdot Y \geqslant y_0, \\ & \lambda \cdot C \leqslant z_{CE} \cdot c_0, \\ & \lambda_i \geqslant 0, \\ & \lambda \cdot i = 1 \end{aligned} $$
(1)

where Y is the n × m matrix of observed outputs, C is the n × 1 matrix of observed costs, λ is a 1 × n vector of intensity variables and i is a column vector of 1s. y_0 is the 1 × m output vector and c_0 a scalar representing the cost level of the hospital being evaluated, and z_CE is its cost efficiency score. Eliminating the last equality constraint changes the model to constant returns to scale.
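A minimal sketch of how program (1) could be solved for one hospital is given below. It uses `scipy.optimize.linprog` with the HiGHS backend, which is a substitute chosen for illustration and not the XA solver used in the study; the function name `cost_efficiency` and its arguments are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def cost_efficiency(Y, C, unit, vrs=True):
    """Input-oriented cost efficiency z_CE of hospital `unit` from program (1).

    Y: n x m array of outputs, C: length-n array of costs.
    vrs=True keeps the convexity constraint (lambda . i = 1);
    vrs=False drops it, giving constant returns to scale.
    """
    Y = np.asarray(Y, dtype=float)
    C = np.asarray(C, dtype=float)
    n, m = Y.shape

    # Decision variables: lambda_1..lambda_n followed by z_CE; minimise z_CE.
    objective = np.zeros(n + 1)
    objective[-1] = 1.0

    # lambda . Y >= y_0   rewritten as   -Y^T lambda <= -y_0
    output_rows = np.hstack([-Y.T, np.zeros((m, 1))])
    # lambda . C <= z_CE * c_0   rewritten as   C lambda - c_0 z_CE <= 0
    cost_row = np.hstack([C, [-C[unit]]]).reshape(1, -1)
    A_ub = np.vstack([output_rows, cost_row])
    b_ub = np.concatenate([-Y[unit], [0.0]])

    A_eq = b_eq = None
    if vrs:
        A_eq = np.hstack([np.ones(n), [0.0]]).reshape(1, -1)
        b_eq = [1.0]

    result = linprog(objective, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                     bounds=[(0, None)] * (n + 1), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.fun
```

As a check on a toy case, consider three hypothetical hospitals with a single output and (output, cost) pairs of (10, 10), (20, 15) and (10, 20). Under CRS the second hospital defines the frontier and the scores are 0.75, 1.00 and 0.375; under VRS the first hospital also becomes efficient and the scores are 1.00, 1.00 and 0.50, which illustrates how the scale assumption can change individual scores.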

Three models were constructed in order to study the differences in estimated DEA efficiency scores across different model specifications. The number of output variables was varied using different data aggregations. In Model 1, DRG grouped inpatient and day care cases were aggregated and outpatient visits were used as the second output. In Model 2 the specification was based on 3 outputs, with inpatient and day care cases separated. Model 3 used 6 outputs, where inpatient discharges and day care were further split into subgroups: medical, surgical and other cases in inpatient care, and medical and surgical cases in day care. Each of the three models was calculated under the assumptions of constant returns to scale (CRS) and variable returns to scale (VRS). All models used were input-oriented and solved using the XA solver from Sunset Software Technology. The characteristics of the models are summarised in Table 2.

Table 2 Models used in the analysis

However, DEA models are known to be sensitive to variable selection. For example, in DEA adding an input or output variable can never lower, and typically raises, the resulting average efficiency scores, as noted by Farrell [13]. Moreover, a production unit is evaluated along its best dimension(s) and will continue to be efficient no matter how many variables are added to the model. The curse of dimensionality occurs in DEA when there is an excessive number of inputs and outputs in relation to the number of decision-making units. It is important to note that bootstrapping methods do not fix this problem, and thus it is informative to report the results using rank-order methods (e.g. the number of units on the frontier) to supplement means and standard deviations. Following the recommendations for model specification discussed in [12], three models with varying numbers of output variables were used to test the sensitivity of the results.

5 Results

This study revealed considerable differences in cost efficiency between Nordic hospitals. In 2002, the average efficiency was highest in Finland, where the mean efficiency was between 0.73 and 0.80 in models using CRS and between 0.86 and 0.88 in models using VRS (Table 3). In Denmark the average efficiency was closest to the Finnish average, with a difference of only 0.00–0.09 efficiency units. The difference in the average efficiency between Danish and Finnish hospitals was clearly larger using the VRS models. Sweden appeared to have the least efficient hospitals with a difference of 0.13–0.20 efficiency units compared to Finnish hospitals in the average efficiency.

Table 3 DEA efficiency scores and the number of efficient units

While the individual hospital scores and even the (country) average efficiency scores varied markedly between model specifications, the ranking of the (country) group averages remained the same in all of the models used. The differences in group averages generally seemed to be larger in models with fewer output variables and in models using constant returns to scale.

In Fig. 1 the bias-corrected efficiency scores and the bootstrapped 95% confidence intervals are displayed for each of the hospitals for Model5. According to preliminary results there was more variation in cost efficiency among Swedish and Finnish hospitals, whereas the variance of efficiency scores was smallest in Norwegian hospitals (Fig. 1). The confidence intervals seemed to be somewhat wider in the Finnish hospitals (Appendix 1).

Fig. 1 Bias-corrected efficiency scores for Model5

In all countries there were fully efficient hospitals, depending on the model specification. The fewest efficient units were found in Model1, where only one Danish and one Finnish hospital were estimated to be fully efficient. Among the most efficient hospitals there were many types and sizes of hospitals, ranging from small local hospitals to the largest university hospitals. However, in the CRS models the small local hospitals were clearly overrepresented, whereas using the VRS models tended to significantly increase the relative performance of the larger university hospitals.

6 Discussion

Our findings showing substantial differences in average cost efficiency were quite robust using different models. Moreover, after adjusting for input price differences, there still remained a difference of 0.13–0.20 in the efficiency measures, suggesting that efficiency may have been clearly lower in Norwegian and Swedish hospitals. However, although the average efficiency in Danish hospitals was slightly lower than in Finnish hospitals (0.00–0.09 units), the ranking for Denmark may be sensitive to changing some of the assumptions used in this comparison.

The country-level average differences in efficiency have been found to be surprisingly stable over time. Based on updated data sets from 2005 to 2007, hospital efficiency in the Nordic countries was assessed in a recent working paper [17]. The average efficiency in Finnish, Swedish and Norwegian hospitals converged only slightly, and calculations using preliminary and partly incomplete data suggested that significant differences in average efficiency had persisted in 2005–2007.

The present study also demonstrated some important issues which have to be taken into account in country-level international comparisons: 1) cost efficiency differences may turn out to be substantial, explaining a large part of the differences in health expenditure; 2) the estimates of average cost efficiency are, however, somewhat sensitive to the model specification, and if data comparability is ensured, only the country ranking can be reliably demonstrated; 3) PPP adjustments can be misleading in cases where input prices are clearly different in the health sector, such as in our case where, compared to Finnish hospitals, the hospital wage index in 2002 was 30% higher in Norway and 28% higher in Denmark. However, the differences in the average wage levels of hospital personnel used in the present study were consistent with the official statistics for health workforce remuneration reported by the OECD [18]. Further investigations of a proper input price index are still needed.

The search for explanations for cost efficiency differences using the present data proved to be difficult. Our cross-sectional analysis does not allow us to draw far-reaching conclusions on any causal relationships. In the previous comparative study between Finland and Norway, institutional, structural and technical explanations were discussed [8]. First, there had been actions to improve the organisation of health care delivery between primary and specialised health care in Finland, whereas in Norway patients crossed an “administrative border” when leaving the hospital. Second, it was argued that municipalities in Norway were able to shift part of the costs of health care to the hospitals, thereby creating longer lengths of stay and lower levels of efficiency.

Third, according to the previous study there were substantial differences in the organization of care for some patient groups between Norway and Finland [8]. This pointed to structural differences, the most distinct of which was the higher number of outlier days. The significantly higher number of outlier days per discharge in Norway may be due to a) a different DRG case-mix, b) generally longer lengths of stay (LOS) in Norway, or c) larger variation in inpatient care (bed-days).

The fourth factor explaining the differences is more technical and relates to coding issues and the accuracy of the case-mix adjustment. According to the results, regional hospitals, which usually provide more specialized services, seemed to be less efficient than local hospitals under the constant returns to scale assumption. The regional hospitals were mostly large hospitals, which made it difficult to distinguish between technology and other correlates of efficiency. One possible explanation could be that the large regional hospitals are often teaching hospitals. As shown in previous studies, teaching hospitals probably have special characteristics which may affect efficiency comparisons [19]. It may be that teaching and research activities absorb more inputs than the compensation for these activities covers. It is also possible that the DRG case-mix adjustment does not capture all case complexity adequately. Moreover, since the hospitals’ size distribution was not similar across the countries, this may have affected our findings relating to differences in country-level efficiency. For example, the Swedish hospitals were clearly larger, judging by the statistics in Table 1.

In this study we observed that there were some differences in the average case-mix between the Nordic countries (Table 1). This may be due to coding differences. Using patient-level data, it was possible to provide insight into the differences of patient care by looking more closely at the produced DRGs in each country. For each of the DRG groups the relative share of cases within each DRG group was compared. In addition, we compared the average case-mix, LOS and case-mix weighted cases/population in every DRG group and wider aggregates of DRGs, the major diagnostic categories (MDC) (Fig. 2).

Fig. 2
The difference between the observed and expected case-mix adjusted output in each of the MDCs (MDC 1–26). Results are presented as the percentage of total output

According to our analysis of individual DRGs, the largest differences were found among the typical day surgery/day care cases, such as tonsillectomies, hernias, cataract surgeries, abortions, vein ligations and carpal tunnel releases. These cases were underreported in the Swedish and Danish hospitals compared to Finnish and Norwegian hospitals, possibly reflecting the fact that in Sweden and Denmark these cases are mainly reported as outpatient visits. Thus, the total output in Swedish and Danish hospitals may be slightly underestimated.

Another large discrepancy was found in the conservative treatment of some high-volume patient groups in internal medicine. There were substantially fewer cases of atherosclerosis with co-morbidities in the Finnish hospitals, perhaps due to differences in the use of secondary diagnoses. A similar difference could be observed in the cardiac arrhythmia cases with and without complications. Moreover, the number of chest pain and angina pectoris cases was clearly lower in Finnish hospitals, which may be due to either coding or treatment differences in cardiovascular diseases.

The percentage of acute myocardial infarctions (AMI) with cardiovascular complications (DRG 121) in the total number of AMIs leaving hospital alive was 48.5% in Norway but only 33.1% in Finland. This must be due to differences in coding practices which favoured Norwegian hospitals in the productivity comparison. However, further observation of coding practices revealed that 48% of deliveries were accompanied by complicating secondary diagnoses in Norway (38% in 1999) and 49% in Finland (9% in 1999). The percentage of ‘complicated’ Caesarean sections was also higher in Finland: 40% in Norway and 52% in Finland.

However, exploring the MDCs revealed that although there were differences in the prevalence of individual DRGs, aggregation of output to MDC level cancelled out the effects of coding individual DRGs. The variation across the Major Diagnostic Categories (MDCs) was not large, indicating that the MDC distribution of cases was rather similar in the Nordic countries. In Figure 2 the difference between the observed and expected case-mix adjusted output is presented, measured as the percentage of total hospital output in each country. In most of the MDCs the difference was less than 1%, with a few exceptions. In Norway, the higher case-mix in MDC 0 (Pre-MDC groups), MDC 5 (Diseases and disorders of the circulatory system) and MDC 18 (Infectious and parasitic diseases) gave an advantage of some 1–2% (measured in efficiency units) in hospital-level efficiency comparisons. In Finland, cases in MDC 18 were also seemingly more severe. In Sweden, the case-mix in MDC 9 (Diseases and disorders of the skin and subcutaneous tissue) was substantially higher, while the lower case-mix in the same MDC gave some disadvantage to Norwegian and Finnish hospitals. It can be concluded that systematic bias in the case-mix measurement can explain only a small part of the observed differences in efficiency.
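The MDC-level comparison can be illustrated with a minimal sketch. It assumes that the expected output share of an MDC is the pooled share across all countries; the text does not define the expected value, so this is only one plausible reading. The function name `mdc_deviation` and the input figures are hypothetical and serve only to show the arithmetic.

```python
from collections import defaultdict

def mdc_deviation(output):
    """Observed minus expected MDC share, in percentage points of total output.

    output: {country: {mdc: case-mix weighted cases}}
    Assumption: the expected share of an MDC is its pooled share over
    all countries (not necessarily the definition used for Fig. 2).
    """
    pooled = defaultdict(float)
    for per_mdc in output.values():
        for mdc, weighted_cases in per_mdc.items():
            pooled[mdc] += weighted_cases
    grand_total = sum(pooled.values())

    result = {}
    for country, per_mdc in output.items():
        total = sum(per_mdc.values())
        result[country] = {
            mdc: 100.0 * (per_mdc.get(mdc, 0.0) / total - pooled[mdc] / grand_total)
            for mdc in pooled
        }
    return result

# Hypothetical case-mix weighted output for two countries and two MDCs.
toy_output = {
    "A": {5: 60.0, 9: 40.0},
    "B": {5: 40.0, 9: 60.0},
}
deviations = mdc_deviation(toy_output)
```

Within each country the deviations sum to zero by construction, so a surplus in one MDC is always offset by a deficit elsewhere; only the pattern across MDCs is informative.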

Although there were several potential biases in the data and methods used, as discussed above, it seems that there remain unexplained differences in the average cost efficiency between the Nordic countries. Identifying the causes of these differences is, however, quite challenging since it involves both substantive and methodological problems. Firstly, while the main differences between countries identified here would be captured by country dummies in a second stage regression, there are only four countries. This makes it impossible to statistically separate effects that only vary between and not within countries, although one might hope to generate fruitful hypotheses. Secondly, the variables that vary within each country (between hospitals or over time) are often difficult to operationalize and data may not be available. Thirdly, there are methodological challenges in modelling simultaneously the causes and extent of efficiency, particularly in the bootstrapped non-parametric DEA methodology [19].
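The identification problem with country dummies can be made concrete with a small sketch. It builds a second stage design matrix with four country dummies and checks that adding a variable which is constant within each country does not raise the rank of the matrix, so its coefficient cannot be estimated separately. The policy indicator, the hospital list and the helper names are hypothetical and do not describe the actual data.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix (list of equal-length rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

COUNTRIES = ["FI", "SE", "NO", "DK"]
# Hypothetical country-level indicator: identical for all hospitals in a country.
POLICY = {"FI": 0, "SE": 1, "NO": 1, "DK": 0}

def design_matrix(hospital_countries, extra=None):
    """One row per hospital: four country dummies, plus an optional country-level column."""
    rows = []
    for c in hospital_countries:
        row = [1 if c == k else 0 for k in COUNTRIES]
        if extra is not None:
            row.append(extra[c])
        rows.append(row)
    return rows

hospitals = ["FI", "FI", "SE", "SE", "NO", "NO", "DK", "DK"]
rank_dummies = matrix_rank(design_matrix(hospitals))
rank_with_policy = matrix_rank(design_matrix(hospitals, POLICY))
```

Here the matrix with the extra column has five columns but the same rank as the dummies alone, because the country-level column is a linear combination of the dummies. Adding more hospitals per country would not change this; only variation within countries would.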

Despite the methodological challenges, our findings could be important in assessing the performance of health care and further analyses may reveal interesting policy implications.

Using a second stage analysis and panel data from 1999–2004 and 2005–2007, it was possible to investigate the effect of hospital reform (centralizing ownership), size (economies of scale), activity based financing and treatment practices (variation in the average DRG adjusted length of stay and the share of outpatient treatment) on productivity [17, 20]. The ownership reform in Norway was found to increase productivity. As expected, a positive deviation from expected LOS was associated with reduced productivity. Interestingly, these studies failed to detect any clear effects of the changes in activity based financing. Some evidence of diseconomies of scale was found in [17].

Kittelsen et al. (2008) concluded that structural changes and better management may be the most likely explanations for increased efficiency [20]. In a comparison of Finland and Sweden, the Finnish hospital care system would be expected to differ from the Swedish system in productivity in two dimensions. Firstly, the purchaser–producer split favours higher productivity in Finland, while the second, activity-based funding (ABF), favours higher productivity in Sweden. According to previous analysis, the impact of ABF on productivity differences between the countries was modest [20]. Higher productivity in Finland may thus be partly related to the role of the purchaser. The cost control by municipalities in Finland may be more effective than that of the counties/regions in Sweden and Denmark, or the central government in Norway. The Finnish municipalities are responsible for other public services in addition to health care. Thus their resource allocation to hospital care must be balanced annually with the allocation of resources to other sectors, while increases in hospital costs must be financed either by increasing the local tax rate or by diminishing the resources allocated to social services, including day care for children.

The substantial differences in productivity between Nordic hospitals warrant further investigation, and there are still unknown factors which make the transformation process from inputs to health care services more efficient in Finnish hospitals. Better insight into these can only be achieved by collecting more detailed data on the various resources used (labor, materials, capital) and investigating the technical efficiency of hospital production. One obvious extension already started by our research group (the Nordic Hospital Comparison Group) is to include various measurements of hospital quality in future comparisons. In addition, the costs and outcomes in several large patient groups in Nordic secondary and tertiary care will be explored in the 7th Framework Programme’s EuroDRG and EuroHOPE projects.