Underestimated prevalence of heart failure in hospital inpatients: a comparison of ICD codes and discharge letter information

Kaspar, Mathias; Fette, Georg; Güder, Gülmisal; Seidlmayer, Lea; Ertl, Maximilian; Dietrich, Georg; Greger, Helmut; Puppe, Frank; Störk, Stefan

doi:10.1007/s00392-018-1245-z

Original Paper
Open access
Published: 17 April 2018

Volume 107, pages 778–787, (2018)
Clinical Research in Cardiology

Mathias Kaspar^1,2,
Georg Fette^1,2,
Gülmisal Güder^1,
Lea Seidlmayer^1,
Maximilian Ertl^2,3,
Georg Dietrich^2,
Helmut Greger^3,
Frank Puppe^2 &
Stefan Störk^1

Abstract

Background

Heart failure is the predominant cause of hospitalization and amongst the leading causes of death in Germany. However, accurate estimates of prevalence and incidence are lacking. Reported figures originating from different information sources are compromised by factors like economic reasons or documentation quality.

Methods

We implemented a clinical data warehouse that integrates various information sources (structured parameters, plain text, data extracted by natural language processing) and enables reliable approximations to the real number of heart failure patients. Performance of ICD-based diagnosis in detecting heart failure was compared across the years 2000–2015 with (a) advanced definitions based on algorithms that integrate various sources of the hospital information system, and (b) a physician-based reference standard.

Results

Applying these methods for detecting heart failure in inpatients revealed that relying on ICD codes resulted in a marked underestimation of the true prevalence of heart failure, ranging from 44% in the validation dataset to 55% (single year) and 31% (all years) in the overall analysis. Percentages changed over the years, indicating secular changes in coding practice and efficiency. Performance was markedly improved using search and permutation algorithms from the initial expert-specified query (F1 score of 81%) to the computer-optimized query (F1 score of 86%) or, alternatively, optimizing precision or sensitivity depending on the search objective.

Conclusions

Estimating prevalence of heart failure using ICD codes as the sole data source yielded unreliable results. Diagnostic accuracy was markedly improved using dedicated search algorithms. Our approach may be transferred to other hospital information systems.

Background

Heart failure has become the leading diagnosis at hospital discharge and the most important driver of in-hospital mortality in Germany [1]. These estimates are based on counts provided by hospitals utilizing the German modification of the International Classification of Diseases (ICD)-10. Respective ICD-10 codes identifying heart failure within the German ICD-10 catalogue are I11.*, I13.0, I13.2, and I50.*. However, estimating the true number of patients suffering from heart failure based on this catalogue is unreliable for several reasons. The heart failure syndrome, especially in its early stages, may go unrecognized or may not be encoded as an explicit diagnosis; further, various financial incentives provided by the German health care system drive the likelihood of a specific ICD code entering a patient’s list of discharge diagnoses. These incentives favor the encoding of diagnoses associated with the most favorable reimbursement profile and may therefore considerably affect the “true number” of patients burdened by heart failure. After hospital discharge, however, only the most important diagnoses (e.g., the top three) are reported to and collected by higher level organizations, e.g., health insurances or statutory registries. Thus, the “prevalence” of a certain condition may be augmented or suppressed depending on its re-imbursement profile and subsequent quality of coding. Furthermore, the quality of documentation itself, e.g., staff training [2] and the marked changes imposed on respective workflows (e.g., change from paper-based records to electronic record systems [3]), have a major influence on disease statistics. Despite these shortcomings, the above-mentioned approach of collecting ICD diagnoses remains the prime source for public health decisions [4, 5].

Besides the statutory census of disease statistics, attempts have been made towards a more reliable and comprehensive identification of diagnoses from clinical routine data. Most of them, however, based their algorithm on coded diagnoses [6]. For these reasons, a better and earlier recognition of heart failure patients is of utmost importance [7]. Since modern hospitals can provide a wealth of electronic patient-based information, these data may be used to improve or corroborate diagnostic certainty and comprehensiveness.

The objective of the current study was to approximate the “true number” of patients suffering from heart failure at a tertiary care center. Against a physician-based reference standard, we compared the performance of ICD-based diagnosis versus advanced definitions based on algorithms that integrate various sources of the hospital information system. We hypothesized that (a) ICD-based diagnosis may underestimate the true prevalence of heart failure and (b) a catalogue of criteria defining heart failure utilizing various sources of the hospital information system will advance diagnostic accuracy.

Methods

The Würzburg data warehouse

The clinical data warehouse (DWH) implemented at the Würzburg University Hospital provides homogeneous and structured access to pseudonymized data of 100% of the hospital’s patient cases originally stored within various separate information systems (e.g., the central administrative system, electronic patient chart used on wards, and systems for laboratory values and discharge letters). The only current exceptions are data from psychiatric and child care facilities, which are excluded for data protection reasons. The technical set-up is based on open-source systems and has been described elsewhere [8,9,10]. Data of the DWH can be queried in (a) structured form (e.g., patient demographics, diagnoses as ICD codes, procedures as codes of the German procedure classification “Operationen- und Prozedurenschlüssel” (OPS), and laboratory values); (b) semi-structured form (e.g., echocardiography, cardiac catheterization), and (c) unstructured form (e.g., discharge letters). The most innovative add-on to the DWH is the information extraction and ad hoc text search functionality, which allows users to create parametrized information from semi- and unstructured reports and to search for any textual item (e.g., search within discharge letters for text combinations including variants and negations, or extract numeric parameters from echocardiographic reports) [11].

Patient selection

The Medical Department I of the Würzburg University Hospital specializes in, but is not limited to, emergency medicine, intensive care, cardiology, pulmonology, nephrology, and endocrinology. For the current analysis, we used all cases of patients treated at the Medical Department I between the years 2000 and 2015 for whom a discharge letter was available.

Reference standard for the definition of heart failure

A sample of consecutive patients treated at the Medical Department I was drawn from the DWH within a randomly selected period (January 1 to January 31, 2009), yielding 1042 cases. These patients were manually checked by a cardiologist with long-standing experience in the care of heart failure patients (GG). Information used by the physician included ICD codes, the discharge letter, and the echocardiographic report (if available). The physician assigned a label (“heart failure: yes/no”) to each case, which was then used as reference standard for subsequent analyses.

Algorithms for automated detection of heart failure

In order to investigate heart failure detection algorithms, 18 subqueries of relevant heart failure-related concepts were defined within the user interface of our DWH; they are presented in the rows of Table 1. Each subquery considers a specific fact and is either a restriction on a numeric DWH parameter (e.g., subquery Echo-EF ≤ 45 represents a left ventricular ejection fraction (LVEF) ≤ 45% captured from echocardiographic reports after information extraction), the existence of an ICD diagnosis (e.g., subquery ICD-Any-HF represents the existence of any heart failure-related ICD code), or a text search within the discharge letter suggesting presence of heart failure (e.g., subquery Text-Left-HF represents the occurrence of a textual synonym for “left ventricular heart failure”). Text searches were specified to account for typing errors, synonyms, and negations [12].
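To make the idea of a negation-aware text subquery concrete, the following is a minimal Python sketch. It is not the DWH's query engine: the synonym and negation patterns are hypothetical examples, the function name `text_hit` is invented for illustration, and tolerance for typing errors is deliberately left out. A negation cue anywhere earlier in the same sentence cancels a hit, which is a simplifying assumption.

```python
import re

# Hypothetical synonym patterns; the actual DWH text queries are not
# reproduced in the paper.
SYNONYMS = [r"heart\s+failure", r"cardiac\s+insufficiency", r"herzinsuffizienz"]

# Hypothetical negation cues (English and German examples).
NEGATIONS = [r"\bno\b", r"\bwithout\b", r"\bkein\w*\b", r"\bohne\b"]


def text_hit(letter: str) -> bool:
    """Return True if some sentence mentions a synonym that is not negated."""
    for sentence in re.split(r"[.!?\n]+", letter.lower()):
        for synonym in SYNONYMS:
            for match in re.finditer(synonym, sentence):
                # Assumption: a negation cue anywhere before the match,
                # within the same sentence, cancels the hit.
                prefix = sentence[: match.start()]
                if not any(re.search(neg, prefix) for neg in NEGATIONS):
                    return True
    return False


print(text_hit("Known chronic heart failure, NYHA III."))
print(text_hit("No signs of heart failure. Pneumonia treated."))
```

A production query would additionally need fuzzy matching for typing errors and a narrower negation scope than "anywhere earlier in the sentence".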

Table 1 Automated advanced data warehouse interrogation to detect heart failure

The algorithms used to detect patients with heart failure (i.e., M_ICD, M_Expert, A_Precision, A_Sensitivity, A_F1) are presented in the right-hand columns of Table 1, each defined by the selection of subqueries that are combined to form the full algorithm. A hit in any one of an algorithm’s subqueries counts as presence of heart failure. Two of the algorithms were manually specified: M_ICD indicates an algorithm that solely utilizes ICD codes and M_Expert indicates an algorithm (i.e., subqueries used for this DWH interrogation) pre-specified by cardiologists based on clinical experience. The other three algorithms (A_Precision, A_Sensitivity, and A_F1) originated from iterative permutation testing utilizing all defined subqueries. They were optimized to yield the most favorable results regarding the chosen measures (i.e., precision, sensitivity, and F1 score; for definitions see “Data analysis”) with regard to the reference standard definition of heart failure described in the previous section. The algorithms were computed utilizing exactly the same data that the physician used to evaluate the reference standard: the discharge letter, the ICD codes, and the echocardiographic report (if available).
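The combination logic can be sketched as follows. This minimal Python example treats each subquery as a Boolean column over cases, combines a selection of subqueries by logical OR, and searches all non-empty subsets for the highest F1 score. It is only one plausible reading of the "iterative permutation testing" and not the authors' R implementation; the function names and the five-case toy data are invented for illustration. With 18 subqueries, an exhaustive search covers 2^18 − 1 = 262,143 subsets, which is computationally feasible.

```python
from itertools import combinations


def f1_score(pred, truth):
    """F1 from Boolean predictions and reference labels (0.0 if undefined)."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum(t and not p for p, t in zip(pred, truth))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def combine_or(hits, selection):
    """A case counts as heart failure if ANY selected subquery hits."""
    n_cases = len(next(iter(hits.values())))
    return [any(hits[name][i] for name in selection) for i in range(n_cases)]


def best_subset(hits, truth):
    """Exhaustively search all non-empty subquery subsets for the best F1."""
    names = sorted(hits)
    best_score, best_combo = 0.0, ()
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            score = f1_score(combine_or(hits, combo), truth)
            if score > best_score:
                best_score, best_combo = score, combo
    return best_score, best_combo


# Toy data (invented): five cases, three subqueries named after Table 1 rows.
truth = [True, True, True, False, False]
hits = {
    "ICD-Any-HF": [True, False, False, False, False],
    "Text-Heart-Failure": [False, True, True, False, True],
    "Echo-EF<=45": [False, False, True, True, False],
}
score, combo = best_subset(hits, truth)
print(combo, round(score, 3))
```

Swapping `f1_score` for a precision or sensitivity objective, with a floor on the other measure, would give analogues of A_Precision and A_Sensitivity under the same assumptions.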

Data analysis

The data for the current analyses were exclusively taken from the DWH via its graphical user interface and the subqueries defined in Table 1. Analysis was done using the software package R [13]. Presence of heart failure was captured based on data of individual hospital visits of individual patients (i.e., one case); each patient was counted once per year. The proportions of true positive, false positive, true negative, and false negative matches were calculated. Further, precision, sensitivity, and the F1 score were computed to provide integrated measures of the accuracy of the match between automated heart failure detection and the reference standard. Precision (also called positive predictive value) is the share of algorithmically labeled heart failure patients who indeed have heart failure, out of all algorithmically labeled heart failure patients; e.g., a precision of 100% means that all selected patients truly have heart failure. Sensitivity (also called true positive rate or recall) is the share of algorithmically labeled heart failure patients who indeed have heart failure, out of all patients with heart failure; e.g., a sensitivity of 100% means that all patients with heart failure are selected. The F1 score is the harmonic mean of precision and sensitivity and is used as the overall accuracy measure in this analysis; e.g., an F1 score of 100% means that exactly the patients who truly have heart failure are selected, and an F1 score of 85% would describe the prevalence of heart failure with an estimated error of 15%. Measures for any permutation of the subqueries were computed in R, utilizing single DWH exports of each subquery. The aims were to maximize the F1 score, to yield a precision > 90% while retaining a sensitivity > 60%, and to yield a sensitivity > 90% while retaining a precision > 60%. Frequencies and percentages were used to present aggregated data across periods under study.
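Written in terms of the counts of true positives (TP), false positives (FP), and false negatives (FN), the three measures defined above take their standard form. The symbols TP, FP, and FN are introduced here for illustration and do not appear in the original text.

```latex
\[
\text{precision} = \frac{TP}{TP + FP}, \qquad
\text{sensitivity} = \frac{TP}{TP + FN},
\]
\[
F_1 = \frac{2 \cdot \text{precision} \cdot \text{sensitivity}}
           {\text{precision} + \text{sensitivity}}
    = \frac{2\,TP}{2\,TP + FP + FN}.
\]
```

Because the F1 score does not involve true negatives, it is not inflated by the large number of cases without heart failure.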

Results

From 2000 to 2015, 110,742 individual patients were treated at and received a discharge letter from the Department of Medicine I of the Würzburg University Hospital. Of these patients, 71,625 had at least one inpatient visit. After splitting the 16-year period into four 4-year periods (i.e., 2000–2003, 2004–2007, 2008–2011, and 2012–2015), respective counts for all patients (inpatients) were 25,753 (17,941), 32,301 (19,592), 37,300 (21,743), and 42,119 (25,692).

Verification of the heart failure detection algorithm

Table 2 presents the performance characteristics derived from cross-validating the heart failure detection algorithms (defined in Table 1) against the reference standard set, i.e., the 1042 manually labeled inpatients, of whom 222 (21%) were identified by the expert as suffering from heart failure.

Table 2 Performance of automated heart failure detection algorithms versus reference standard

The algorithm that was solely based on ICD codes (M_ICD) resulted in a good precision of 94%, but a low sensitivity of 50%, and an F1 score of 65%. The low sensitivity illustrates the low share of patients with heart failure detected by this algorithm. The 6% shortfall from a precision of 100% was caused by seven of the 1042 patients who had a heart failure-related ICD diagnosis but were not labeled as having heart failure. The expert-specified heart failure algorithm (M_Expert) improved the detection rate and resulted in a precision of 76%, a sensitivity of 87%, and an F1 score of 81%. Divergent conclusions between the algorithm and the reference standard were found in 89 cases, with 60 patients mistakenly classified as having heart failure and 29 patients mistakenly classified as not having heart failure.
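The reported percentages can be cross-checked from the counts given in this paragraph. The short Python sketch below rebuilds the counts for M_Expert directly from the text (222 reference positives, 60 false positives, 29 false negatives). For M_ICD only the seven false positives are reported, so the true-positive count of 111 is an assumption inferred from the rounded sensitivity of 50%. The helper name `metrics` is hypothetical.

```python
def metrics(tp, fp, fn):
    """Return (precision, sensitivity, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return precision, sensitivity, f1


N_HF = 222  # reference-standard positives among the 1042 cases

# M_Expert: 60 false positives and 29 false negatives are reported.
expert = metrics(tp=N_HF - 29, fp=60, fn=29)

# M_ICD: 7 false positives are reported; tp=111 is an ASSUMPTION derived
# from the rounded sensitivity of 50% (0.50 * 222 = 111).
icd = metrics(tp=111, fp=7, fn=N_HF - 111)

for name, (p, s, f) in {"M_Expert": expert, "M_ICD": icd}.items():
    print(f"{name}: precision={p:.0%} sensitivity={s:.0%} F1={f:.0%}")
```

Under these counts the rounded values match the reported 76%, 87%, and 81% for M_Expert and 94%, 50%, and 65% for M_ICD.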

Since the manually defined algorithms resulted in low scores, the algorithm M_Expert was refined further. Three algorithms were developed and tested, each optimizing certain aspects of diagnostic accuracy: A_Precision aimed to increase the reliability of the classification as heart failure patient (reduced false positives), A_Sensitivity aimed to reduce the number of patients not classified as heart failure patient (reduced false negatives), and A_F1 aimed for an overall improved accuracy of the classification as heart failure patient (balanced precision and sensitivity). The algorithm with the highest F1 score (i.e., A_F1) resulted in a precision of 89%, a sensitivity of 84%, and an F1 score of 86%. The 14% shortfall from an F1 score of 100% was caused by 59 false matches that originated from “borderline cases” with limited or unclear textual information, which left more room for interpretation and misclassification by both computer and expert. Some errors were the result of missing data in the DWH, e.g., missing LVEF values or terms that indicate negations of heart failure in the discharge letter.

Prevalence of heart failure

Figure 1 illustrates the annual frequencies of all inpatients of the Department of Medicine I with a discharge letter and the subgroup of patients with heart failure identified by the automated algorithms described in Table 1. Across the entire period, A_F1 identified 18,167 unique patients with heart failure. In the year 2000, the count of patients with heart failure started at n = 620 and showed an average annual gain of 9.3% over the entire period. After the year 2012, the annual gain appeared to accelerate from 7.4% before 2012 to 17.1% thereafter. By contrast, the average annual gain of all inpatients was 3.4%. The application of A_Precision and A_Sensitivity resulted in 10,786 and 25,084 patients identified with heart failure, respectively.

Several patients were treated multiple times over the years, which resulted in sums of unique patients per year reported in Fig. 1 that were higher than the above-reported sum of unique patients of all years. This included 3115 unique patients with 4583 heart failure-related re-hospitalizations after an initial heart failure-related hospitalization within the entire period. A characterization of inpatients with heart failure identified by the application of A_F1 is presented in Table 3 for the four 4-year periods from 2000 to 2015, grouped by age and sex.

Table 3 Frequencies of all patients with heart failure identified by the A_F1 algorithm by age group, gender and the 4-year periods (each with unique patients per time period)

Each search term of the detection algorithm A_F1 contributed with varying impact to the identification of heart failure. In the reference standard, the largest contributions emerged from “Text-Heart-Failure” (59% capture rate), “ICD-Any-HF” (56%), and “Text-Cardiac-Decompensation” (53%). Further important contributors were “Echo-EF ≤ 45” (24%) and “Text-Systolic-Failure” (4%). In the case of “ICD-Any-HF”, for example, this means that 44% of all patients with heart failure did not have an ICD code indicative of heart failure. The contribution of the individual search terms to the overall analysis varied substantially over the years, as presented in Fig. 2. This illustration presents the search terms of the first heart failure related hospitalization per patient and year. Noteworthy is the relatively small contribution of the term LVEF ≤ 45% from echocardiography, although echocardiography was frequently performed: in 58% of patients on cardiologic ward and 29% of patients on other wards of internal medicine.

Figure 3 illustrates the contribution of ICD codes to the detection of heart failure in contrast to the additional contribution of other search terms (text/echo terms) over the entire period using the A_F1 algorithm. Within the years 2000–2015, the overall share of patients with heart failure identified by ICD codes was 69% of the total sample of patients with heart failure, which means that 31% of patients with heart failure remained undetected throughout the entire period.

Comorbidities and heart failure

We further analyzed whether the comorbidity profile differed between subjects in whom presence of heart failure was identified via ICD codes and subjects in whom heart failure was identified via additional sources of the DWH. Table 4 lists the most frequent comorbidities reported in the 18,167 patients with heart failure detected by the A_F1 algorithm in the mutually exclusive subgroups “detected by ICD codes” or “detected by other search terms” (specific DWH interrogation other than ICD code). Reported comorbidities were identified by their respective ICD code. Patients with ICD-coded comorbidities more frequently also had an ICD code for heart failure. Of note, the subgroup identified without ICD codes appeared to have a slightly lower burden of comorbidity.

Table 4 Frequency of comorbidities in inpatients with heart failure, detected by ICD codes and additionally detected via data sources provided by the data warehouse

Discussion

The current analysis sheds new light on the magnitude of underestimation of heart failure prevalence in hospitalized patients. Identifying patients with heart failure from the hospital information system solely based on ICD-coded discharge diagnoses substantially underestimated the “true number” that could be gleaned after adding specific text searches and echocardiographic parameters to the search profile.

We observed a large degree of heart failure underestimation when using ICD codes only: within a single year, the count was up to 55% (average 31%) lower than the “true number” of patients with heart failure. The last years of the analysis showed a trend towards better patient identification. The gap of underestimation became considerably smaller over time, which indicates that coding strategies as well as diagnostic and therapeutic algorithms may indeed affect the “prevalence” of a disease. The narrowing of the detection gap was accompanied by a marked increase in the absolute frequency of encoding ICD diagnoses for heart failure, starting with the year 2012 for inpatients. Furthermore, the percentage of patients with heart failure increased from about 15% in 2000 to about 35% in 2015. Reasons for such high proportions might be that we only included patients from the Medical Department I (hosting wards for intensive care, cardiology, pulmonology, endocrinology, nephrology), where heart failure is a frequent diagnosis, but also identified patients having heart failure as a secondary or tertiary diagnosis. Another explanation for these developments might be that the Comprehensive Heart Failure Center, a facility devoted to the integration of research and care of patients with heart failure, was founded at the Würzburg University Hospital in the year 2010. This spurred numerous structural and research projects involving several hospital departments, led to a higher degree of awareness for the heart failure syndrome, and ultimately might not only have increased the count of patients with heart failure admitted to the hospital, but also improved the coding ratio.

Verifying the diagnosis of heart failure patients based on physician claims or hospital data has been attempted earlier [6, 14,15,16,17,18,19,20,21]. However, most of these studies focused on confirming or refuting the diagnosis of heart failure with the help of experts in subjects pre-identified via several variants of ICD codes, via study inclusion/exclusion criteria or manual screening. Subsequently, reported identification figures were fairly precise (i.e., yielded high precision). Frolova et al. for example aimed to verify ICD-based diagnosis of acute heart failure amongst patients admitted to the hospital with suspected acute heart failure [17] and found a precision of 93% (sensitivity of 76%) leading to an F1 score of 84%. In contrast, we aimed to identify “true heart failure” amongst all-comers, i.e., without an increased pre-test likelihood for the presence of such diagnosis. As expected, performance of the algorithm relying solely on ICD-based identification (= M_ICD) was worse in our data set (F1 score 65%). There are few studies focusing on all-comers for detecting heart failure [15, 22, 23], all reporting lower F1 scores (82, 53–67, 80%, respectively) compared to our analysis (86%).

Applying text extraction methods to detect heart failure has rarely been attempted. Meystre et al. [24], for example, focused on the information extraction of a few highly selected parameters (e.g., LVEF value and medication) from texts, in contrast to an overall detection of heart failure. They utilized a pre-defined data set of heart failure patients rather than all-comers and, consequently, obtained high F1 scores of up to 99% for single parameters (e.g., LVEF value). While interesting to demonstrate feasibility, such concepts do not mirror clinical reality. No related work was found utilizing information extraction to detect heart failure in all-comers.

Another major finding of the current study is that readily available information from the hospital information system considerably improves the identification of heart failure patients beyond the traditional identification via ICD codes. The option to enrich the search strategy by clinical variables supporting or denying the presence of heart failure is not new, but a variety of problems may impede its implementation: (1) the information is only selectively documented in clinical routine; (2) the desired information is stored in a non-structured format and appropriate data extraction tools are unavailable or unreliable; (3) the information is stored in a structured format but cannot be accessed for analysis (e.g., because it is stored in dedicated research data bases); (4) the quality of the stored data (structured or non-structured) is unreliable; (5) the information behind variables (meta data) is highly flexible but cannot be connected to the source data; (6) the individual patient and the corresponding cases of a patient (repeat hospitalizations) cannot reliably be discerned. Our approach utilized the hospital’s clinical DWH as described earlier [8,9,10] and integrated the full spectrum of digital information collected per patient in the hospital information system.

The most elaborate part of providing a DWH is the implementation of the data extract–transform–load (ETL) process to transfer data from the information systems to a unified database, which often, but not always, requires consideration of local peculiarities depending on the available information systems. We implemented this process for most of the information stored within our systems, be it structured or unstructured. Our DWH query system utilizes a locally developed add-on [10] to provide text search functionality to DWH systems. This add-on could be instantaneously added as an extension to the often utilized i2b2 DWH system [25] or, with little extra work, to other similar DWH systems. Importantly, these tools were tested and optimized across their repeat utilization for various studies, including data validation against the primary systems after DWH extraction [9].

The combined use of these interfaces and generation of automated detection algorithms markedly improved the identification of patients with heart failure. We found better, albeit still unsatisfactory, accuracy when employing the algorithm based on “clinical information” alone (i.e., the algorithm M_Expert). We therefore tested other, data-optimized algorithms, and observed another major improvement of heart failure detection: the algorithm A_F1 optimized precision and sensitivity and yielded the overall best results. Importantly, our approach allowed us to adjust and optimize the detection algorithm for different scenarios or use cases. For example, to identify potential study participants via the algorithm (thus enabling a study nurse to fine-tune the results), A_Sensitivity might yield the best results. For the scenario of a post hoc analysis, A_F1 or A_Precision might be the preferred solutions. Interestingly, the NT-proBNP queries “Lab-NT-proBNP ≥ 1000” and “Lab-NT-proBNP ≥ 3000” (see Table 1) were not selected by the permutation analysis for any algorithm. This may be explained by collinearity with other terms indicative of heart failure; e.g., for the A_F1 algorithm: “Echo-EF ≤ 45”, “ICD-Any-HF”, “Text-Heart-Failure”, “Text-Cardiac-Decompensation”, and “Text-Systolic-Failure”. We also considered using Framingham heart failure signs and symptoms [26] for detection of heart failure (see [27, 28]) either alone or in combination with borderline echocardiographic data, but were unsuccessful in demonstrating superior precision and sensitivity.

Our analyses support the notion that comorbidities of heart failure may also affect coding practices for heart failure. When comparing the presence of common comorbidities with the detection of heart failure via ICD-based versus alternative approaches, the differences were highly significant for almost all conditions. Interestingly, a sizeable proportion of patients with heart failure received an ICD code for the respective comorbidity, but not the ICD code for heart failure itself. This might indicate that heart failure was not the focus of their hospitalization and not a dominant contributor from the reimbursement perspective. From a health policy perspective this means that many patients with heart failure as a concomitant condition leave the hospital without being reported to statutory data banks as heart failure patients. This not only adds to the detection gap, but also constitutes a major information gap for care providers after hospital discharge who play a key role in the treatment of heart failure in Germany [29].

Limitations

A limitation of this study is that the reference standard was defined by a single cardiologist with long-standing experience in heart failure instead of multiple experts. The count of true heart failure patients may vary considerably depending on the care setting, the type of catchment area, and numerous other influencing factors. Hence, absolute counts are likely not directly comparable between hospitals. Similarly, the successful implementation of adapted detection algorithms needs to be confirmed before our results may become generalizable to other hospitals, both in Germany and internationally.

Conclusions

Coded discharge diagnoses substantially underestimate the number of heart failure patients compared to the added information available within discharge letters and echocardiographic reports. Therefore, statistics about heart failure solely based on ICD codes might be misleading. The degree of underestimation might vary substantially across case types (inpatients versus outpatients) and over the course of subsequent years. The latter might be influenced by internal factors, e.g., improved coding practices, and/or external factors, e.g., the set-up of specialized centers such as the Comprehensive Heart Failure Center Würzburg.

References

Federal Statistical Office. http://www.gbe.de. Accessed Mar 2017
Asakura K, Ordal E (2012) Is your clinical documentation improvement program compliant? Health Financ Manag 66(10):96–100
Pourasghar F, Malekafzali H, Koch S, Fors U (2008) Factors influencing the quality of medical documentation when a paper-based medical records system is replaced with an electronic medical records system: an Iranian case study. Int J Technol Assess Health Care 24(4):445–451
Hall MJ, Levant S, DeFrances CJ (2012) Hospitalization for congestive heart failure: United States, 2000–2010, NCHS Data Brief (108), pp 1–8
Schellenbaum GD, Heckbert SR, Smith NL, Rea TD, Lumley T, Kitzman DW, Roger VL, Taylor HA, Psaty BM (2006) Congestive heart failure incidence and prognosis: case identification using central adjudication versus hospital discharge diagnoses. Ann Epidemiol 16(2):115–122
Saczynski JS, Andrade SE, Harrold LR, Tjia J, Cutrona SL, Dodd KS, Goldberg RJ, Gurwitz JH (2012) A systematic review of validated methods for identifying heart failure using administrative data. Pharmacoepidemiol Drug Saf 21:129–140. https://doi.org/10.1002/pds.2313
Störk S, Handrock R, Jacob J, Walker J, Calado F, Lahoz R, Hupfer S, Klebs S (2017) Epidemiology of heart failure in Germany: a retrospective database study. Clin Res Cardiol 106(11):913–922
Fette G, Ertl M, Wörner A, Kluegl P, Störk S, Puppe F (2012) Information extraction from unstructured electronic health records and integration into a Data Warehouse, Lecture notes in informatics (LNI) vol 208, pp 1238–1252
Kaspar M, Ertl M, Fette G, Dietrich G, Toepfer M, Angermann C, Störk S, Puppe F (2016) Data linkage from clinical to study databases via an R data warehouse user interface. Experiences from a large clinical follow-up study. Methods Inf Med 55(4):381–386
Dietrich G, Ertl M, Fette G, Kaspar M, Krebs J, Mackenrodt D, Störk S, Puppe F (2017) Extending the query language of a data warehouse for patient recruitment. Stud Health Technol Inform 243:152–156
Toepfer M, Fette G, Beck PD, Klügl P, Puppe F (2014) Integrated tools for query-driven development of light-weight ontologies and information extraction components. In: Ide N, Grivolla J (eds) Proceedings of the workshop on open infrastructures and analysis frameworks for HLT. Association for Computational Linguistics and Dublin City University, pp 83–92
Aho AV (1990) Algorithms for finding patterns in strings. In: Handbook of theoretical computer science, volume A: algorithms and complexity. The MIT Press, Cambridge, pp 255–300
R Development Core Team (2008) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. ISBN 3-900051-07-0. http://www.R-project.org. Accessed 13 Apr 2018
Quach S, Blais C, Quan H (2010) Administrative data have high variation in validity for recording heart failure. Can J Cardiol 26(8):e306–e312
Schultz SE, Rothwell DM, Chen Z, Tu K (2013) Identifying cases of congestive heart failure from administrative data: a validation study using primary care patient records. Chronic Dis Inj Can 33(3):160–166
Agarwal SK, Wruck L, Quibrera M, Matsushita K, Loehr LR, Chang PP, Rosamond WD, Wright J, Heiss G, Coresh J (2016) Temporal trends in hospitalization for acute decompensated heart failure in the United States, 1998–2011. Am J Epidemiol 183(5):462–470
Frolova N, Bakal JA, McAlister FA, Rowe BH, Quan H, Kaul P, Ezekowitz JA (2015) Assessing the use of international classification of diseases-10th revision codes from the emergency department for the identification of acute heart failure. JACC Heart Fail 3(5):386–391
Rosenman M, He J, Martin J, Nutakki K, Eckert G, Lane K, Gradus-Pizlo I, Hui SL (2014) Database queries for hospitalizations for acute congestive heart failure: flexible methods and validation based on set theory. J Am Med Inform Assoc 21(2):345–352
Alqaisi F, Williams LK, Peterson EL, Lanfear DE (2009) Comparing methods for identifying patients with heart failure using electronic data sources. BMC Health Serv Res 9:237
Loehr LR, Agarwal SK, Baggett C, Wruck LM, Chang PP, Solomon SD, Shahar E, Ni H, Rosamond WD, Heiss G (2013) Classification of acute decompensated heart failure: an automated algorithm compared with a physician reviewer panel: the Atherosclerosis Risk in Communities study. Circ Heart Fail 6(4):719–726
Khand AU, Shaw M, Gemmel I, Cleland JG (2005) Do discharge codes underestimate hospitalisation due to heart failure? Validation study of hospital discharge coding for heart failure. Eur J Heart Fail 7(5):792–797
Quan H, Parsons GA, Ghali WA (2002) Validity of information on comorbidity derived from ICD-9-CCM administrative data. Med Care 40:675–685
Quan H, Li B, Saunders LD, Parsons GA, Nilsson CI, Alibhai A, Ghali WA (2007) Assessing validity of ICD-9-CM and ICD-10 administrative data in recording clinical conditions in a unique dually coded database. Health Serv Res 43:1424–1441
Meystre SM, Kim Y, Gobbel GT, Matheny ME, Redd A, Bray BE, Garvin JH (2017) Congestive heart failure information extraction framework for automated treatment performance measures assessment. J Am Med Inform Assoc 24(e1):e40–e46
Murphy SN, Weber G, Mendis M, Chueh HC, Churchill S, Glaser JP, Kohane IS (2010) Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). J Am Med Inform Assoc 17(2):124–130
McKee PA, Castelli WP, McNamara PM, Kannel WB (1971) The natural history of congestive heart failure: the Framingham study. N Engl J Med 285(26):1441–1446
Byrd R, Steinhubl S, Sun J, Ebadollahi S, Stewart W (2014) Automatic identification of heart failure diagnostic criteria, using text analysis of clinical notes from electronic health records. Int J Med Inform 83(12):983–992
Vijayakrishnan R, Steinhubl S, Ng K, Sun J, Byrd R, Daar Z, Williams B, Defilippi C, Ebadollahi S, Stewart W (2014) Prevalence of heart failure signs and symptoms in a large primary care population identified through the use of text and data mining of the electronic health record. J Cardiac Fail 20(7):459–464
Störk S, Handrock R, Jacob J, Walker J, Calado F, Lahoz R, Hupfer S, Klebs S (2017) Treatment of chronic heart failure in Germany: a retrospective database study. Clin Res Cardiol 106(11):923–932


Acknowledgements

This work was supported by grants from the Bundesministerium für Bildung und Forschung (01EO1004 and 01EO1504).

Author information

Authors and Affiliations

Comprehensive Heart Failure Center (CHFC), Department of Internal Medicine I, Würzburg University Hospital, Am Schwarzenberg 15, 97078, Würzburg, Germany
Mathias Kaspar, Georg Fette, Gülmisal Güder, Lea Seidlmayer & Stefan Störk
Chair of Computer Science VI, University of Würzburg, Würzburg, Germany
Mathias Kaspar, Georg Fette, Maximilian Ertl, Georg Dietrich & Frank Puppe
Service Center Medical Informatics, Würzburg University Hospital, Würzburg, Germany
Maximilian Ertl & Helmut Greger

Authors

Mathias Kaspar
Georg Fette
Gülmisal Güder
Lea Seidlmayer
Maximilian Ertl
Georg Dietrich
Helmut Greger
Frank Puppe
Stefan Störk

Corresponding author

Correspondence to Stefan Störk.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article

Cite this article

Kaspar, M., Fette, G., Güder, G. et al. Underestimated prevalence of heart failure in hospital inpatients: a comparison of ICD codes and discharge letter information. Clin Res Cardiol 107, 778–787 (2018). https://doi.org/10.1007/s00392-018-1245-z


Received: 20 January 2018
Accepted: 09 April 2018
Published: 17 April 2018
Issue Date: September 2018
DOI: https://doi.org/10.1007/s00392-018-1245-z
