Addressing Bias in Electronic Health Record-based Surveillance of Cardiovascular Disease Risk: Finding the Signal Through the Noise

Bower, Julie K.; Patel, Sejal; Rudy, Joyce E.; Felix, Ashley S.

doi:10.1007/s40471-017-0130-z

Cardiovascular Disease (R. Foraker, Section Editor)
Published: 02 November 2017

Volume 4, pages 346–352 (2017)
Current Epidemiology Reports

Abstract

Purpose of review

Use of the electronic health record (EHR) for cardiovascular disease (CVD) surveillance is increasingly common. However, these data can introduce systematic error that influences the internal and external validity of study findings. We reviewed recent literature on EHR-based studies of CVD risk to summarize the most common types of bias that arise. Subsequently, we recommend strategies informed by work from others as well as our own to reduce the impact of these biases in future research.

Recent findings

Systematic error, or bias, is a concern in all observational research, including EHR-based studies of CVD risk surveillance. Patients captured in an EHR system may not be representative of the general population, due to issues such as informed presence bias, perceptions about the healthcare system that influence entry, and access to health services. Further, the EHR may contain inaccurate information or be missing key data points of interest (e.g., due to loss to follow-up or over-diagnosis bias). Several strategies, including implementation of unique patient identifiers, adoption of standardized rules for inclusion/exclusion criteria, statistical procedures for data harmonization and analysis, and incorporation of patient-reported data, have been used to reduce the impact of these biases.

Summary

EHR data provide an opportunity to monitor and characterize CVD risk in populations. However, understanding the biases that arise from EHR datasets is instrumental in planning epidemiological studies and interpreting study findings. Strategies to reduce the impact of bias in the context of EHR data can increase the quality and utility of these data.
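As a rough illustration of the informed presence bias named under Recent findings, the sketch below simulates patients whose exposure and outcome are truly independent, but whose chance of having either condition recorded grows with the number of encounters. The crude comparison then shows a spurious association, while stratifying on encounter count (here with a Mantel-Haenszel risk ratio) pulls the estimate back toward 1. Everything in it is an assumption made for illustration: the function names, the prevalences (0.4 and 0.3), the per-visit recording probability, and the use of stratification as a simple stand-in for conditioning on the number of visits. It is not the analysis used in any of the cited studies.

```python
import random
from collections import defaultdict


def simulate(n_patients, seed=0, max_visits=8, p_detect=0.2):
    """Simulate EHR records where exposure and outcome are truly independent.

    All parameter values are hypothetical and chosen only for illustration.
    Returns a list of (visits, exposure_recorded, outcome_recorded) tuples.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n_patients):
        visits = rng.randint(1, max_visits)      # number of encounters
        has_exposure = rng.random() < 0.4        # true exposure status
        has_outcome = rng.random() < 0.3         # true outcome, independent of exposure
        # Each visit is an independent chance for a condition to be recorded.
        p_recorded = 1 - (1 - p_detect) ** visits
        exposure_recorded = has_exposure and rng.random() < p_recorded
        outcome_recorded = has_outcome and rng.random() < p_recorded
        records.append((visits, exposure_recorded, outcome_recorded))
    return records


def crude_risk_ratio(records):
    """Risk of a recorded outcome in recorded-exposed vs. recorded-unexposed."""
    exposed = [y for _, a, y in records if a]
    unexposed = [y for _, a, y in records if not a]
    return (sum(exposed) / len(exposed)) / (sum(unexposed) / len(unexposed))


def mantel_haenszel_risk_ratio(records):
    """Mantel-Haenszel risk ratio, stratified by number of visits."""
    # Per stratum: [exposed cases, exposed total, unexposed cases, unexposed total]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for visits, a, y in records:
        cell = strata[visits]
        if a:
            cell[0] += y
            cell[1] += 1
        else:
            cell[2] += y
            cell[3] += 1
    numerator = 0.0
    denominator = 0.0
    for a_cases, a_total, u_cases, u_total in strata.values():
        total = a_total + u_total
        numerator += a_cases * u_total / total
        denominator += u_cases * a_total / total
    return numerator / denominator


if __name__ == "__main__":
    data = simulate(200_000, seed=1)
    print("crude RR:", round(crude_risk_ratio(data), 3))
    print("RR stratified by visits:", round(mantel_haenszel_risk_ratio(data), 3))
```

Under these assumed parameters the crude risk ratio should come out noticeably above 1 even though no true association exists, and the stratified estimate should sit close to 1. In real EHR data the number of encounters may itself be affected by exposure or outcome, so conditioning on it is not guaranteed to remove bias.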

References

Papers of particular interest, published recently, have been highlighted as: • Of importance •• Of major importance

Benjamin EJ, Blaha MJ, Chiuve SE, et al. Heart disease and stroke statistics—2017 update: a report from the American Heart Association. Circulation. 2017;135(10):e146–603.
Kite BJ, Tangasi W, Kelley M, Bower JK, Foraker RE. Electronic medical records and their use in health promotion and population research of cardiovascular disease. Curr Cardiovasc Risk Rep. 2015;9(1):1–8.
Porta M. A dictionary of epidemiology. Oxford: Oxford University Press; 2014.
Tripepi G, Jager KJ, Dekker FW, Zoccali C. Selection bias and information bias in clinical research. Nephron Clin Pract. 2010;115(2):c94–9.
Delgado-Rodríguez M, Llorca J. Bias. J Epidemiol Community Health. 2004;58(8):635–41.
Bower JK, Bollinger CE, Foraker RE, Hood DB, Shoben AB, Lai AM. Active use of electronic health records (EHRs) and personal health records (PHRs) for epidemiologic research: sample representativeness and nonresponse bias in a study of women during pregnancy. eGEMs. 2017;5(1):1263.
Zaccai JH. How to assess epidemiological studies. Postgrad Med J. 2004;80(941):140–7.
Flood TL, Zhao Y-Q, Tomayko EJ, Tandias A, Carrel AL, Hanrahan LP. Electronic health records and community health surveillance of childhood obesity. Am J Prev Med. 2015;48(2):234–40.
Kandola D, Banner D, O’Keefe-McCarthy S, Jassal D. Sampling methods in cardiovascular nursing research: an overview. Can J Cardiovasc Nurs. 2014;24(3):15–8.
•• Goldstein BA, Bhavsar NA, Phelan M, Pencina MJ. Controlling for informed presence bias due to the number of health encounters in an electronic health record. Am J Epidemiol. 2016;184(11):847–55. Coins the term informed presence, meaning that a patient’s presence in an EHR repository is not random; rather, it signifies that the patient is ill. Uses a simulation study to test the assumption that more visits to a clinician mean a higher probability that disease will be detected. Conditioning on the number of health care visits eliminates the selection bias that informed presence induces.
Romo ML, Chan PY, Lurie-Moroni E, Perlman SE, Newton-Dame R, Thorpe LE. Characterizing adults receiving primary medical care in New York City: implications for using electronic health records for chronic disease surveillance. Prev Chronic Dis. 2016;13:E56.
Christopher AS, McCormick D, Woolhandler S, Himmelstein DU, Bor DH, Wilper AP. Access to care and chronic disease outcomes among Medicaid-insured persons versus the uninsured. Am J Public Health. 2016;106(1):63–9.
•• Weiskopf NG, Hripcsak G, Swaminathan S, Weng C. Defining and measuring completeness of electronic health records for secondary use. J Biomed Inform. 2013;46(5):830–6. Highlights the lack of a formal definition of EHR completeness. Defines EHR completeness according to four elements: documentation, breadth, density, and predictive capability. Tests this definition against representative data from EHRs in New York-Presbyterian Hospital’s clinical data warehouse and finds that only half of the approximately 4 million records satisfied at least one component of completeness.
Haneuse S, Bogart A, Jazic I, et al. Learning about missing data mechanisms in electronic health records-based research: a survey-based approach. Epidemiology (Cambridge, Mass). 2016;27(1):82–90.
• Haneuse S, Daniels M. A general framework for considering selection bias in EHR-based studies: what data are observed and why? eGEMs. 2016;4(1):1203. Assesses the missing-at-random assumption in EHR data from patients with depression; weight change among these patients was investigated, and a survey was used to delineate the factors responsible for missing weight data. Also describes strategies for handling missing data in EHR research.
• Rusanov A, Weiskopf NG, Wang S, Weng C. Hidden in plain sight: bias towards sick patients when sampling patients with sufficient electronic health record data for research. BMC Med Inform Decis Mak. 2014;14(1):51. Applies the criteria of the American Society of Anesthesiologists Physical Status Classification System to 10,000 records from Columbia University Medical Center and determines that records with laboratory and medication information belonged to the most ill patients. Selecting records with sufficient data for research therefore inadvertently favors sicker patients.
McVeigh KH, Remle N-D, Perlman S, Chernov C, Thorpe L, Singer J, et al. Developing an electronic health record-based population health surveillance system. New York: New York City Department of Health and Mental Hygiene; 2013.
Healthy People 2020: Access to Health Services. https://www.healthypeople.gov/2020/topics-objectives/topic/Access-to-Health-Services. Accessed September 28, 2017.
White AD, Folsom AR, Chambless LE, et al. Community surveillance of coronary heart disease in the Atherosclerosis Risk in Communities (ARIC) Study: methods and initial two years’ experience. J Clin Epidemiol. 1996;49(2):223–33.
The ARIC Investigators. The Atherosclerosis Risk in Communities (ARIC) Study: design and objectives. Am J Epidemiol. 1989;129(4):687–702.
Jha AK, DesRoches CM, Campbell EG, et al. Use of electronic health records in US hospitals. N Engl J Med. 2009;360(16):1628–38.
•• Chiolero A, Santschi V, Paccaud F. Public health surveillance with electronic medical records: at risk of surveillance bias and overdiagnosis. Eur J Pub Health. 2013;23(3):350–1. Describes surveillance and over-diagnosis bias with examples.
•• Casey JA, Schwartz BS, Stewart WF, Adler NE. Using electronic health records for population health research: a review of methods and applications. Annu Rev Public Health. 2016;37:61–81. Describes the content of EHR data, how study variables are constructed and study populations are defined, and the many uses of EHR data in epidemiological, environmental, and social epidemiology studies; compares traditional epidemiological studies with EHR-based studies.
• Cowie MR, Blomster JI, Curtis LH, et al. Electronic health records to facilitate clinical research. Clin Res Cardiol. 2017;106(1):1–9. Describes the use of EHR data in research including epidemiological research, challenges associated with use of EHR in clinical studies, and potential solutions.
Hersh WR, Weiner MG, Embi PJ, et al. Caveats for the use of operational electronic health record data in comparative effectiveness research. Med Care. 2013;51(8 Suppl 3):S30–7.
•• Nair S, Hsu D, Celi LA. Challenges and opportunities in secondary analyses of electronic health record data. In: Secondary analysis of electronic health records. Cham: Springer International Publishing; 2016. p. 17–26. Reviews biases that may occur when using pooled data from multiple healthcare systems for public health research. Identifies barriers to developing a clinical data warehouse from multiple healthcare systems, including monetary and collaboration constraints. Notes that current pooled EHR data sets are composed mainly of records from high-revenue healthcare systems, and that the promising opportunity of EHR data for secondary analyses may have drawbacks such as data security issues.
•• Friedman DJ, Parrish RG, Ross DA. Electronic health records and US public health: current realities and future promise. Am J Public Health. 2013;103(9):1560–7. Recognizes the potential contributions of EHR data to population health research and provides recommendations to make EHR data less biased for secondary analyses. These include increased population coverage of EHRs, standardized collection and recording of measures, and legislation that would enable more effective pooling of EHR data from multiple sources.
Sommers BD, Gawande AA, Baicker K. Health insurance coverage and health—what the recent evidence tells us. N Engl J Med. 2017;377(6):586–93.
NIH. GUID (Global Unique Identifier). 2017; https://data-archive.nimh.nih.gov/guid.
Resources. TNNGPG. Global Unique Identifier (GUID). 2017; https://ncats.nih.gov/grdr/guid. Accessed September 29, 2017.
Wood R, Clark D, King A, Mackay D, Pell J. Novel cross-sectoral linkage of routine health and education data at an all-Scotland level: a feasibility study. Lancet. 2013;382:S10.
Deb S, Austin PC, Tu JV, et al. A review of propensity-score methods and their use in cardiovascular research. Can J Cardiol. 2016;32(2):259–65.
Austin PC. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behav Res. 2011;46(3):399–424.
Thompson CA, Arah OA. Selection bias modeling using observed data augmented with imputed record-level probabilities. Ann Epidemiol. 2014;24(10):747–53.
Khalifa A, Meystre S. Adapting existing natural language processing resources for cardiovascular risk factors identification in clinical notes. J Biomed Inform. 2015;58(Suppl):S128–32.

Author information

Authors and Affiliations

Division of Epidemiology, College of Public Health, The Ohio State University, 1841 Neil Avenue, 344 Cunz Hall, Columbus, OH, 43210, USA
Julie K. Bower, Sejal Patel, Joyce E. Rudy & Ashley S. Felix
Division of Cardiovascular Medicine, The Ohio State University College of Medicine, Columbus, OH, USA
Julie K. Bower

Corresponding author

Correspondence to Julie K. Bower.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflicts of interest.

Human and Animal Rights and Informed Consent

This article contains no studies with human or animal subjects performed by any of the authors.

Additional information

This article is part of the Topical Collection on Cardiovascular Disease

About this article

Cite this article

Bower, J.K., Patel, S., Rudy, J.E. et al. Addressing Bias in Electronic Health Record-based Surveillance of Cardiovascular Disease Risk: Finding the Signal Through the Noise. Curr Epidemiol Rep 4, 346–352 (2017). https://doi.org/10.1007/s40471-017-0130-z

Issue Date: December 2017
