Introduction

Diagnosis-related groups (DRGs) are widely used in Europe for a range of different purposes [1]. They form the basis of hospital performance comparisons, they are used to facilitate hospital management, and in current DRG-based hospital payment systems, DRGs define the payment categories, i.e., hospital products [2]. DRGs are “diagnosis related” groups of patients that have (a) similar resource consumption patterns and that are (b) clinically meaningful [3]. They are defined by patient classification systems (PCS), i.e., DRG systems, which group treatment cases into DRGs on the basis of classification variables such as diagnoses, procedures and demographic characteristics.

Making use of DRGs requires that groups of patients are sufficiently homogenous in terms of treatment costs. Otherwise, performance comparisons on the basis of DRGs do not adequately control for differences of patients within different groups and reimbursement for a large number of patients is not appropriate; it can be either too high or too low. In order to assure homogenous groups of patients, DRG systems need to consider the most important determinants of resource consumption as classification variables. In many countries, professional medical associations, specialist experts, or consultants formally participate in the process of selection, definition, and update of classification criteria via committees, expert hearings, or consultations [4–6]. It is, therefore, of utmost importance for specialist groups such as surgeons that they are aware of how their respective patients are classified by their DRG system in order to assess whether the classification variables adequately reflect differences in the complexity of treating different groups of patients using different techniques.

Comparative analyses of how countries’ DRG systems classify patients can help surgeons to scrutinize national standards of classification against European equivalents in order to identify potential scope for improvement. Furthermore, analyses of how the services of surgeons in treating different patients are valued and reimbursed in other DRG systems may inform and substantiate discussions about the adequacy of cost weights (or other indicators of resource consumption). Yet, detailed comparative analyses of classification algorithms for appendectomy are very scarce, suffer from a very limited scope, and have not assessed the classification of patients using routine inpatient data [7, 8].

This study therefore performs a comprehensive assessment of DRG systems across 11 European countries and has three main objectives: (1) to assess classification variables and algorithms used to group patients with appendectomy into DRGs, (2) to compare the composition of these DRGs and variations in relative resource intensity, and (3) to determine DRGs and hospital price levels for six case vignettes of appendectomy patients with different combinations of demographic, diagnostic, and treatment variables.

The results were generated in the framework of the EuroDRG project, which selected 10 episodes of care to assess European DRG systems and their ability to define homogenous groups of patients. In this article, we focus on appendectomy as it is one of the most common emergency surgical procedures in high-income countries [9–12].

Materials and methods

Definition of episode of care and appendectomy index case

As part of the EuroDRG project, researchers from 11 European countries (i.e., Austria, England, Estonia, Finland, France, Germany, Ireland, the Netherlands, Poland, Sweden, and Spain) agreed upon a common definition for an appendectomy episode of care (EoC). The definition was based on the 2007 version of the International Classification of Diseases 10th edition (ICD-10) for diagnoses and the 2008 version of the ICD-9 Clinical Modification for procedures and is presented in Table 1. Researchers from each country translated the definition into national codes for diagnoses and procedures considering available mappings from the Hospital Data Project if applicable [13].

Table 1 Definition of episode of care and reference case

An appendectomy index case was defined (i.e., adult age, uncomplicated appendicitis, open appendectomy, treated as inpatient) to facilitate comparisons of relative resource intensity of DRGs within countries (see below).

Data sources

In each country, researchers identified national or regional hospital databases and obtained access to all information necessary for the purposes of this study. Table 2 provides an overview of the databases and data years available for each country. Databases were required to contain information about diagnoses, procedures, and DRGs of individual patients so that appendectomy patients conforming to the agreed definition could be identified.

Table 2 Data years and databases by country

Analysis of patient classification systems

The number of appendectomy EoC cases and the corresponding DRGs were extracted from the databases for each country. Detailed comparative analyses of classification variables and grouping algorithms of national DRG systems [14–22] were performed for those most frequent DRGs that together comprised a cumulative percentage of at least 97% of all appendectomy EoC cases. All DRGs originating from an included base DRG were included in the analyses. Grouping algorithms were mapped graphically to facilitate easy comparison of differences and similarities between systems. In addition, the percentage of all appendectomy EoC cases grouped into each DRG and the percentage of all cases within each DRG conforming to the definition of appendectomy were calculated.
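The selection of the most frequent DRGs can be illustrated with a minimal sketch. It assumes one DRG code per extracted case; the function name, the toy DRG codes, and the tie-breaking rule are hypothetical and not taken from the study, and the additional step of including all DRGs that share a base DRG is not reproduced here.

```python
from collections import Counter


def select_frequent_drgs(case_drgs, coverage=0.97):
    """Pick the most frequent DRGs until their cumulative share of cases
    reaches at least `coverage`. Returns (selected DRG codes, share covered)."""
    counts = Counter(case_drgs)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0
    selected, covered = [], 0
    # Most frequent first; ties broken by DRG code so the result is deterministic.
    for drg, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        if covered >= coverage * total:
            break
        selected.append(drg)
        covered += n
    return selected, covered / total


# Hypothetical extract: one DRG code per appendectomy EoC case.
cases = ["A"] * 60 + ["B"] * 30 + ["C"] * 8 + ["D"] * 2
selected, share = select_frequent_drgs(cases)
print(selected, round(share, 2))
```

In this toy extract the three most frequent DRGs are retained and the rarest one is dropped, because the first three already cover more than 97% of cases.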

In order to compare relative resource intensity of DRGs within each country, a DRG cost index was calculated with the index case assuming a value of 1. The value of all other DRGs was calculated by dividing the national measure of resource consumption (i.e., cost weight, score, average tariff) of each DRG by that of the index DRG.
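As a small illustration of this calculation, the sketch below divides each DRG's resource measure by that of the index DRG. The DRG labels and cost weights are invented for the example and do not correspond to any national system.

```python
def drg_cost_index(resource_measure, index_drg):
    """Express each DRG's resource measure relative to the index DRG (index = 1)."""
    base = resource_measure[index_drg]
    if base <= 0:
        raise ValueError("index DRG needs a positive resource measure")
    return {drg: value / base for drg, value in resource_measure.items()}


# Hypothetical cost weights for three DRGs of one country.
weights = {"DRG_INDEX": 0.5, "DRG_CC": 0.75, "DRG_COMPLEX": 2.0}
index = drg_cost_index(weights, "DRG_INDEX")
print(index)
```

Because the index is a ratio within one country, it does not depend on whether the national measure is a cost weight, a score, or an average tariff.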

DRGs and hospital quasi prices

Six standardized case vignettes of patients with different combinations of primary and secondary diagnoses, procedures, age, and length of stay were defined (Table 3). This selection is meant to cover a range of different DRGs in different countries’ systems. Case vignettes 1, 2, and 3 represent rather complicated cases of appendicitis, while the remaining ones are less complicated cases.

Table 3 Case vignettes: Patient classification variables

DRG-based hospital payment systems differ between and often even within countries [23], which complicates comparisons across countries. Therefore, quasi prices were ascertained for each case vignette and for the index case using an approach similar to that of Koechlin et al. [8]. Quasi prices were calculated by converting national measures of resource consumption (i.e., cost weights, average tariffs, scores) into monetary values using national conversion rates that were supposed to reflect the national average costs of treatment and, if possible, to include the full set of costs, i.e., recurrent and capital costs. If necessary, prices were deflated to year 2008 national currency using national gross domestic product (GDP) deflators [24] and converted to Euros using average currency exchange rates for the year 2008 [25].
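The three steps described above (monetary conversion, deflation to 2008, conversion to Euros) can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual procedure: the function name, the parameter names, and all numbers are hypothetical, the deflators are assumed to share a common base year, and the exchange rate is assumed to be quoted as national currency units per Euro.

```python
def quasi_price_eur_2008(measure, conversion_rate, deflator_data_year,
                         deflator_2008, currency_per_euro_2008):
    """Convert a national resource measure into a quasi price in 2008 Euros."""
    # Step 1: monetary value in national currency of the data year.
    price_nominal = measure * conversion_rate
    # Step 2: rescale to 2008 price levels with the ratio of GDP deflators
    # (assumption: both deflators use the same base year).
    price_2008 = price_nominal * deflator_2008 / deflator_data_year
    # Step 3: convert to Euros (assumption: rate is national currency per Euro).
    return price_2008 / currency_per_euro_2008


# Hypothetical inputs: cost weight 1.2, base rate 2000, 4% price change, 0.8 per Euro.
price = quasi_price_eur_2008(1.2, 2000.0, 100.0, 104.0, 0.8)
print(round(price, 2))
```

For a country whose data year is 2008 and whose currency is the Euro, both adjustment factors equal 1 and the quasi price reduces to the measure multiplied by the conversion rate.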

Results

Figure 1 provides a graphic illustration of grouping algorithms and classification variables of PCS in 11 European countries. The figure includes classification variables of those DRGs that together account for more than 97% of appendectomy cases in each country. On the left hand side, the figure specifies for each country the version of the PCS and the percentage of all appendectomy cases shown in the graph. The arrows indicate the sequence in which different types of classification variables are considered in the grouping algorithm. In addition, indicators to assess the composition of DRGs and the relative resource intensity of cases within each DRG are shown. Finland and Sweden are shown together with only one grouping algorithm as both use the NordDRG system which is identical for the presented DRGs.

Fig. 1 Graphic illustration of grouping algorithms and classification variables of PCS in 11 European countries

Patient classification of appendectomy cases in Europe

Overview: number of DRGs and number of classification variables

The figure demonstrates that there is great variation in DRG systems across Europe. The number of DRGs comprising more than 97% of cases differs considerably in different countries’ systems. In Ireland, appendectomy cases are classified into only two DRGs, while in Germany 11 DRGs exist to account among other things for different levels of complexity and different age groups.

In addition, the number of classification variables differs: the Austrian system differentiates only between different age groups, when classifying appendectomy patients; the French system differentiates (1) primary diagnoses, (2) level of complications or comorbidities (CC), (3) age groups, (4) with or without death during admission, and (5) length of stay.

Characteristics of classification variables

Different DRG systems classify appendectomy patients on the basis of different classification variables. There are three main groups of classification variables: (1) treatment characteristics, (2) patient characteristics, and (3) provider/setting characteristics. Only the first two are considered in most DRG systems.

In all systems, treatment characteristics, i.e., the procedure of appendectomy, dominate the grouping algorithm and are always considered prior to the specific primary diagnosis except in the Dutch Diagnose Behandeling Combinaties (DBC) system. Only the All-Patient (AP)-DRG system in use in Spain and the Dutch DBC system differentiate between laparoscopic and open appendectomy. In the German (G)-DRG system and the AP-DRG system, a small number of patients are classified on the basis of other small intestinal/digestive system surgical procedures. The length of stay (LOS) is considered only in the French system.

A maximum of four patient characteristics are considered in the grouping process. In seven countries, the DRG systems differentiate between patients with a primary diagnosis for complicated appendicitis (i.e., appendicitis with generalized peritonitis or peritoneal abscess, each defined by specific ICD-10 codes), and those without. In most countries, the presence of relevant secondary diagnoses, i.e., complications and CC, also influences the classification of patients. However, while some countries’ systems only differentiate between with and without CC, others define several levels of CC (e.g., major CC in the AP-DRG system or level one to four CC in the French system), and again other systems calculate cumulative patient clinical complexity levels. Furthermore, age plays an important role in the classification process of several systems (i.e., Austria, England, France, and Germany). Interestingly, the French system differentiates between elderly (i.e., above 80 years) and others, whereas the German system differentiates between children (i.e., below 10 or below 15 years) and others. Death is considered a classification variable only in the French Groupes Homogènes de Malades (GHM) system.

Provider and setting characteristics are considered only in the Finnish and Swedish versions of NordDRGs and in the Dutch DBC system. In these systems, the grouping process differentiates between cases treated in inpatient and outpatient settings (not shown in the case of the Netherlands, where only very few cases are concerned). In addition, the Dutch DBC system considers provider characteristics by determining the department, where patients are treated (i.e., surgery).

Composition of DRGs and variation in relative resource intensity

In most countries, the vast majority of appendectomy EoC cases are grouped into the shadowed DRG (in Fig. 1) containing the index case (see Table 1), i.e., between 55% in Germany and 92% in Ireland. Finland is the only country where the majority of patients (56%) are classified into a DRG containing appendectomy cases with generalized peritonitis or peritoneal abscess or patients with other complications and comorbidities. Within these index DRGs, almost all patients conform to our EoC definition (i.e., around 90% or above, as shown in the second to last column). Only in Austria does the index DRG include about 25% of patients who do not have a diagnosis of appendicitis. This might be explained by the fact that the diagnosis is not part of the Austrian grouping algorithm.

The cost index shows that the index DRG is the lowest-valued DRG in all countries except in Finland and Sweden, where separate “outpatient” DRG cost weights exist that are about 20% lower than the index DRG in Finland and 55% lower than the index DRG in Sweden. In general, in DRG systems with only two or three DRGs for appendectomy patients (i.e., in Austria, England, Finland, Sweden, Ireland, the Netherlands, and Poland), even the highest-valued DRG has a cost index below 2, implying that the systems do not adequately account for cases that are more than twice as complex as the index case. In Spain (Catalonia, AP-DRG V23), the most complex DRG containing more than 3% of patients and accounting for major complications such as chronic heart failure or pneumonia has a cost index of 4.75. In France, the most complex DRG (patients with complicated appendicitis, level 4 CC or level 3 CC and age greater than 80 years, and a LOS longer than 5 days) is valued more than five times as high as the index case.

In DRG systems where age is considered in the classification process, hospitals generally receive higher payments for elderly patients and for children. The differences can be quite large: For example, in Austria patients above age 69 have a cost index of more than 1.6. However, the difference between children and adults is relatively small in England and Germany. In the AP-DRG system and the Dutch DBC system, the only two systems that differentiate between open and laparoscopic appendectomy, hospitals receive higher payments for laparoscopic appendectomy than for open appendectomy. The difference between open and laparoscopic appendectomy is relatively small in Spain but amounts to 17% in the Netherlands.

In countries differentiating between complicated and uncomplicated appendicitis as primary diagnosis, the cost index is considerably higher for complicated appendicitis. In all countries except for Finland, the cost index is at least 1.4 for complicated appendicitis cases. Only in Finland, where almost all patients are classified as complicated appendicitis, the cost index is around 1.1.

DRGs and hospital quasi prices for case vignettes

Table 4 shows a comparison of DRGs and hospital quasi prices reflecting national average hospital payments for each case vignette under the assumption that hospital payment would be exclusively based on DRGs. For each case vignette, the first column specifies the DRG into which a case vignette patient would be classified and whether the patient would be considered an inlier or an outlier, i.e., whether the predefined length of stay is below the lower or above the upper DRG system-specific length of stay threshold. The second column specifies for each patient the corresponding quasi price. In the last column of the table, the index DRGs (see Fig. 1) and corresponding quasi prices are presented.
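The inlier/outlier distinction can be illustrated with a short sketch. The thresholds below are invented, and treating a stay exactly on a threshold as an inlier is an assumption made for this example; national systems define their own thresholds and boundary rules.

```python
def los_status(los_days, lower_threshold, upper_threshold):
    """Classify a stay relative to hypothetical DRG-specific LOS thresholds."""
    if los_days < lower_threshold:
        return "short-stay outlier"
    if los_days > upper_threshold:
        return "long-stay outlier"
    # Assumption: stays exactly on a threshold count as inliers.
    return "inlier"


# Hypothetical thresholds of 2 and 9 days for one DRG.
for days in (1, 4, 12):
    print(days, los_status(days, 2, 9))
```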

Table 4 Comparison of hospital (quasi) prices for appendectomy patients in Europe (in year 2008 Euros)

Large variation in hospital payments evidently exists across countries. In general, costs appear to be lower in countries with a low GDP per capita [24], i.e., Estonia and Poland, and higher in countries with a higher GDP (even though exceptions exist). Interestingly, however, countries that pay a higher price for one patient do not necessarily pay a higher price for all kinds of patients. For example, hospitals in France would receive much higher payments for appendectomy performed on a young patient with peritoneal abscess, wound infection, disruption of the operation wound, and a long length of stay (patient 3) than hospitals in England. However, hospitals in England would receive higher payments for performing appendectomy on a young patient with no secondary diagnoses and a short length of stay (patient 4) than hospitals in France.

Discussion

This study presents results of the most comprehensive available comparative analysis of grouping algorithms, classification variables, and prices used for appendectomy patients in different DRG systems in Europe. It shows great variations across countries: (1) in the number of DRGs comprising more than 97% of appendectomy cases and in the number of considered classification variables; (2) in the characteristics of classification variables that take account of treatment, patient, and provider/setting characteristics; (3) in the degree of differentiation between complex and less complex cases, i.e., in the relative resource intensity of different DRGs; and (4) in quasi prices for different types of patients (case vignettes).

As DRGs are used to assess the performance of hospitals (including that of surgeons) and to determine hospital payment [1, 2], it is important that DRG systems consider the most appropriate classification variables and define as many groups as necessary to assure that performance comparisons and hospital payments are fair [26]. Given the identified large variations between DRG systems in different countries for the classification of a relatively well-defined group of patients, it is at least questionable whether all DRG systems consider as classification variables the most important determinants of resource consumption within their country of use. Surgeons can influence decisions about how to define classification variables in their roles as advisors to national authorities responsible for defining and updating the patient classification systems of their countries [4–6]. International comparisons can provide a useful new perspective when thinking about how to improve an existing DRG system. However, before drawing conclusions on the basis of this study’s findings, limitations of our data and methodology need to be considered.

Firstly, the data that was used to identify patients and to assess the relative importance of different DRGs in different countries, originated from routine inpatient databases in 11 countries. As highlighted by the Hospital Data Project [13], there are differences in coding practices across countries, and the quality of data is not always comparable. One surprising finding of our study was that the majority of patients in Finland were coded as having appendicitis with generalized peritonitis or peritoneal abscesses or other complications and CC, whereas in all other countries, this percentage was well below 30%. We do not know whether this represents inappropriate coding or late presentation of appendectomy patients in Finland. However, both would be reasons for concern. As the data analyzed in this study is used to determine DRG-based payments, it is possible that some patients were inappropriately coded in order to maximize hospital revenues (“up-coding”) [27, 28].

Secondly, differences in hospital payment systems between countries complicate comparative analyses of payment levels (Table 4). On the one hand, different countries set DRG-based payment rates at different levels as they include different cost categories. For example, in Germany, fixed capital costs are not included in DRG-based payment rates, whereas in most other countries, DRG-based payment rates are supposed to cover capital costs [8]. On the other hand, different systems of additional payments exist, e.g., England assigns additional HRGs for certain diagnostic evaluations such as CT scans [29], and Poland and Austria have additional per diem-based payments for stays in intensive care units. Furthermore, the Netherlands, Finland, and—prior to HRG4—also England could have several DRGs per hospital stay, each leading to additional DRG-based payments. Last but not least, DRG-based payments are adjusted in several countries to account for differences between hospitals or regions. Therefore, the absolute price levels should not be directly interpreted as reflecting more expensive care in one country compared to another. However, relative price levels within countries that were used for comparisons in Fig. 1 should be less affected by differences in payment systems as they were always compared to the in-country DRG index case.

Thirdly, while our comparison has shown that classification of appendectomy patients and DRG-based hospital payment for these patients vary markedly across countries, we have not looked at the regular changes that take place between different versions of the same DRG system over time. Kobel et al. [30] have shown that the number of DRGs has considerably increased in almost all DRG systems (except for the Dutch DBC system) between 2004 and 2010, from well below 1,000 DRGs in 2004 to 1,200 DRGs and more in three systems (i.e., G-DRG, 1,200; HRG, 1,389; GHM, 2,297). In addition, because our comparative assessment of classification variables has focused on the more frequent cases of appendectomy, including only 97% of cases in the most populated DRGs, we were unable to point out differences between systems in how they deal with rare high-cost cases, which, however, may be particularly relevant for reimbursement issues [31].

In spite of these limitations, our study has major implications for surgeons and national authorities involved in the redesigning of national DRG systems. First, awareness about classification algorithms and variables in other countries should encourage surgeons to think about alternative and possibly better ways to classify their patients into DRGs. For example, while seven countries differentiate between patients with a complicated primary diagnosis of appendicitis (i.e., with generalized peritonitis or peritoneal abscess) and those without, four countries (Austria, England, Ireland, and the Netherlands) do not make this distinction. However, at least in England, patients with a complicated primary diagnosis stay on average almost twice as long in hospital as those without (i.e., between 5.4 and 6.3 days versus about 3.0 days for most other cases [32]), suggesting that it would be worth testing whether homogeneity of patients within DRGs could be increased by introducing a classification variable for complicated primary diagnoses.

Second, some DRG systems achieve a greater degree of differentiation between more and less complex patients than others, as is reflected in the different range of the cost index in Fig. 1. If DRG systems do not adequately account for differences between patients, hospitals and surgeons that treat a greater share of more complex cases than others are not adequately paid for their greater efforts. Possibly, in countries with only few DRGs to account for differences in complexity, some of the differences in patient populations between hospitals are accounted for through adjustments outside of the DRG systems. For example, in Ireland and in some states in Austria, hospital payments are adjusted for the type of hospital, e.g., teaching hospitals in Ireland receive higher payments [33]. However, ideally, differences in patient characteristics would be accounted for in the patient classification systems and not in the payment systems.

Third, the aim of any DRG system is to give a concise measure of what hospitals do. This measure is useful only if DRGs describe a sufficiently homogenous group of patients [2]. Therefore, quantitative research is needed to verify whether the most important determinants of cost are considered in different patient classification systems, and whether differences between systems reflect country-specific differences in treatment patterns. The third phase of the EuroDRG project attempts to contribute to this discussion. However, it is also important for surgeons and other medical specialists to be aware of the significance of adequately designed DRG systems and to engage in optimizing these systems. Combined with their clinical experience in treating patients, the information presented in this article about how DRG systems classify appendectomy patients can help surgeons to engage with national DRG authorities. Ultimately, this contributes to assuring adequate reimbursement for treated patients and fair performance comparisons on the basis of DRGs.