Data electronically extracted from the electronic health record require validation

Scheid, Lisa M.; Brown, L. Steven; Clark, Christopher; Rosenfeld, Charles R.

doi:10.1038/s41372-018-0311-8

Article
Published: 24 January 2019

Volume 39, pages 468–474, (2019)

Journal of Perinatology

Lisa M. Scheid¹˒²,
L. Steven Brown³,
Christopher Clark³ &
…
Charles R. Rosenfeld² (ORCID: orcid.org/0000-0002-9208-6539)

Abstract

Objectives

Determine sources of error in electronically extracted data from electronic health records.

Study design

Categorical and continuous variables related to early-onset neonatal hypoglycemia were preselected and electronically extracted from the records of 100 randomly selected neonates within 3479 births with laboratory-proven early-onset hypoglycemia. The extraction language was written by an information technologist, and the extracted data were validated by blinded manual chart review. Agreement was assessed with the kappa coefficient for categorical variables and with percent validity for continuous variables.
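The two agreement measures named in the study design can be illustrated with a minimal Python sketch. It assumes Cohen's kappa for two sources (electronic extraction versus manual chart review) and treats percent validity as the share of extracted values that match the chart value. The function names, the `tolerance` parameter, and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(extracted, reviewed):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(extracted) != len(reviewed) or not extracted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(extracted)
    # Observed agreement: fraction of records where both sources agree.
    observed = sum(a == b for a, b in zip(extracted, reviewed)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    freq_e = Counter(extracted)
    freq_r = Counter(reviewed)
    labels = freq_e.keys() | freq_r.keys()
    expected = sum(freq_e[c] * freq_r[c] for c in labels) / (n * n)
    if expected == 1.0:
        # Both sources use one identical label throughout; kappa is undefined,
        # so this sketch reports complete agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def percent_validity(extracted, reviewed, tolerance=0.0):
    """Percentage of continuous values matching the chart value within tolerance."""
    if len(extracted) != len(reviewed) or not extracted:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(abs(a - b) <= tolerance for a, b in zip(extracted, reviewed))
    return 100.0 * matches / len(extracted)


# Hypothetical data: 1 = diagnosis flagged, 0 = not flagged.
extracted_flags = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
reviewed_flags = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
kappa = cohen_kappa(extracted_flags, reviewed_flags)

# Hypothetical continuous values (for example, a birth weight in kg).
extracted_values = [3.2, 2.9, 4.1, 3.5]
reviewed_values = [3.2, 2.9, 4.0, 3.5]
validity = percent_validity(extracted_values, reviewed_values)
```

In this made-up example the two sources agree on 8 of 10 flags, but kappa is lower than 0.8 because part of that agreement is expected by chance alone.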

Results

Of 23 categorical variables, 8 (35%) had acceptable kappa (0.81–1) and 5 (22%) had slight-to-fair agreement (kappa < 0.40). Notably, “hypoglycemia” had poor agreement (kappa 0.16). In contrast, 6/8 continuous variables had validity ≥ 94%. After the extraction language was corrected, 6/9 variables were corrected and inter-rater validation improved. However, “hypoglycemia” was not corrected and remained an issue.
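The kappa ranges quoted in the results resemble the widely used Landis and Koch bands. The sketch below maps a kappa value to those conventional labels. That the study used exactly these bands is an assumption, and the function name `agreement_band` is hypothetical. Note that the abstract itself describes a kappa of 0.16 as "poor", whereas the conventional label for that value is "slight".

```python
def agreement_band(kappa):
    """Map a kappa value to the conventional Landis and Koch label (assumed bands)."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Values quoted in the results: the acceptable threshold and the "hypoglycemia" kappa.
band_acceptable = agreement_band(0.81)
band_hypoglycemia = agreement_band(0.16)
```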

Conclusions

Data extraction without validation procedures, especially categorical variables using International Classification of Diseases-9 (ICD-9) codes, often results in incorrect data identification. Electronically extracted data must incorporate built-in validating processes.

Acknowledgements

Dr. Scheid was a trainee in Neonatal-Perinatal Medicine at UT Southwestern Medical School. This study was presented in part as an oral presentation at the annual meeting of the Pediatric Academic Societies in San Francisco, CA, May 2017.

Funding:

George L. MacGregor Professorship in Pediatrics awarded to Dr. Rosenfeld

Author information

Authors and Affiliations

Department of Pediatrics, University of Texas Health Science Center at Houston, Houston, TX, USA
Lisa M. Scheid
Department of Pediatrics, Division of Neonatal-Perinatal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA
Lisa M. Scheid & Charles R. Rosenfeld
Parkland Health and Hospital System, Dallas, TX, USA
L. Steven Brown & Christopher Clark

Corresponding author

Correspondence to Charles R. Rosenfeld.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Scheid, L.M., Brown, L.S., Clark, C. et al. Data electronically extracted from the electronic health record require validation. J Perinatol 39, 468–474 (2019). https://doi.org/10.1038/s41372-018-0311-8

Received: 20 June 2018
Revised: 07 December 2018
Accepted: 23 December 2018
Published: 24 January 2019
Issue Date: March 2019
DOI: https://doi.org/10.1038/s41372-018-0311-8
Springer Nature America, Inc.

This article is cited by

Retrospective application of algorithms to improve identification of pregnancy outcomes from the electronic health record
- Kajal Angras
- Victoria E. Boyd
- A. Dhanya Mackeen
Journal of Perinatology (2023)
Decreasing delivery room CPAP-associated pneumothorax at ≥35-week gestational age
- Edward F. Stocks
- Mambarambath Jaleel
- Luc P. Brion
Journal of Perinatology (2022)
The impact of the Baby Friendly Hospital Initiative on neonatal hypoglycemia
- Marina S. Oren
- Whittney D. Barkhuff
- Tara L. DuPont
Journal of Perinatology (2020)
