Personalized genetic medicine is near the top of the list for every medical school and hospital planning committee. Not only does it have the high-tech cachet coveted by marketing and development offices, but it is also genuinely thought to be a route to better outcomes and cost reduction. The hope is that the additional costs of genetic profiling will be more than offset by the savings created by individualized treatment and avoidance of predictable iatrogenic complications [1]. The paths to discovery of the genetic determinants of common disease are now on firm ground. Cost-effective complete genome sequencing for individuals is very close [2, 3]. However, the interpretation and application of the resulting data now loom as critical challenges.

Enter the electronic health record (EHR). Successful investigation and application begin with carefully specified phenotypes. In medical practice, phenotypes are represented in the records of patient encounters. Records of subjective symptoms, measurements of physiological states like blood pressure and weight, physical exam findings, results of laboratory testing and imaging, and temporal changes in disease activity all have to be consistently and accurately reflected in the medical record. In an ideal world all these data would be: recorded completely using a common vocabulary; available to all the healthcare providers; and accompanied by protection of the privacy rights of the individual patient. Hospital, county, state, and national healthcare systems might also be able to access the data in aggregate to inform decisions about what works at what cost. Capturing, archiving, and retrieving all these data cannot be accomplished without EHRs. Genetic data, while no more or less complicated than other specialized clinical data, like imaging, do present some special concerns [4]. For one thing, once they are accurately collected, DNA results do not change. New ethical and legal issues are clearly emerging not only as a consequence of the requirements of the individual patient, but also as a result of the system's needs. At the moment, these issues relate mainly to the acquisition of data for research, but actual clinical application to individual patient care is already beginning. How do we balance the protection of the individual with the informational needs of the community? How do we minimize the burden of genetic information while actively exploiting it for the benefit of individuals?

Privacy versus the gold mine - research capture of electronic health record information

Attempts to maximize data utility in the research context have generally focused on broad data sharing [5, 6]. Already, massive amounts of genotypic data have been produced from genome-wide association studies (GWAS) across the globe. Archiving these data and making them available to other researchers will provide a tremendous community resource that will hopefully speed scientific discovery and result in new medical advances. In the early days of genome research, data were shared in publicly accessible databases, such as the Single Nucleotide Polymorphism database (dbSNP). Individual privacy was protected by removing personally identifying information, including most clinical information. However, ethical considerations related to the fact that DNA is itself a unique identifier [7, 8] called into question the adequacy of public databases and led to the creation of controlled-access databases, such as the database of Genotypes and Phenotypes (dbGaP) in the United States [9] and the European Genotype Archive in Europe [10]. These databases add a layer of protection to ensure that only bona fide researchers can access participant data and only if the research purpose is consistent with the participant's original consent. This heightened protection makes it somewhat more cumbersome for researchers to access individual-level data, but it also presents an opportunity to increase the utility of the data by linking them to certain clinical variables and other phenotypic information.
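The two conditions that a controlled-access database enforces (a bona fide researcher, and a research purpose consistent with the participant's original consent) can be illustrated with a minimal sketch. This is not how dbGaP or any other archive is actually implemented; the names `AccessRequest`, `may_access`, and the purpose labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    # Hypothetical fields; a real archive would verify these through its own review process.
    researcher_verified: bool  # has the requester been vetted as a bona fide researcher?
    purpose: str               # stated research purpose


def may_access(request: AccessRequest, consented_purposes: set) -> bool:
    """Grant access only if both conditions hold:
    the researcher is verified AND the purpose falls within the original consent."""
    return request.researcher_verified and request.purpose in consented_purposes


# A participant who consented only to (hypothetical) cardiovascular research.
consent = {"cardiovascular research"}

print(may_access(AccessRequest(True, "cardiovascular research"), consent))   # True
print(may_access(AccessRequest(True, "ancestry research"), consent))         # False
print(may_access(AccessRequest(False, "cardiovascular research"), consent))  # False
```

The point of the sketch is that the check is tied to each participant's consent rather than to the database as a whole, which is what distinguishes this model from the earlier public databases.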

Most GWAS collect and share only limited clinical data. This significantly reduces the utility of individual genotypes and limits the ability to study the functional significance of genetic variation. Both 'deep phenotyping' (collecting extensive phenotypic information at the outset of a study) [11] and 'targeted phenotyping' (re-contacting individual participants to obtain additional phenotypic information as it is needed) [12] have been proposed, but both of these methods are costly and time-consuming. To assess the alternative option of linking genotypic data with the EHR, the Electronic Medical Records and Genomics (eMERGE) Network was created and funded by the National Institutes of Health (NIH). It consists of five institutions in the United States that use data directly from the EHR to propel genetic research [13].

Linking genotypic data to the EHR provides maximum utility, but it also poses significant privacy risks for participants. These risks can be managed in several ways. The academic model provides some privacy protection through controlled-access databases, and manages residual risk through the process of informed consent [14]. An alternative approach is that of projects like the Personal Genome Project (PGP), which promises no privacy protection and recruits only those individuals who are comfortable sharing their clinical information, including their genetic data, in a public database [15]. Finally, at the other extreme are companies like Private Access, Inc., which recently partnered with Genetic Alliance to introduce a web-based program that gives patients individualized control over who can access their records and data [16]. Each of these approaches has advantages and disadvantages, but they all strive for the same thing: to maximize the scientific utility of genetic information while respecting individual rights and protecting patient privacy.

How quickly we forget - clinical application of the personal genomic profile

It is difficult enough to link genetic information with the EHR in research; integrating it in the clinical setting raises a host of additional challenges. A quick glance at any list of disease-associated single nucleotide polymorphisms (SNPs) directly illustrates the problem. Physicians and patients cannot fully retain the contents of the matrix consisting of genes, alleles, diseases and probabilities. It will not be feasible for individuals to update their information based on the newest research results. The time-honored approach to genetic counseling falters under the weight of thousands of low-, medium-, and high-risk predictions for every single person. Physicians and patients alike will be overwhelmed.

This is a problem well matched to the capabilities of a uniform EHR. The health record will not only accurately retain large, complex genetic results, but we believe that there will be an evolution of clinical practice guidelines into active interpretive algorithms that incorporate genomic information [17]. These algorithms will compute individual risk and apply it to clinical decision support as needed. The vast majority of the individual genetic data will be latent, never really having any part in a person's health story. Real-time 'research' by dynamic monitoring of interventions and outcomes in large clinic populations will facilitate more rapid recognition of important genetic effects and changing environmental factors.
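As a rough illustration of how a practice guideline might become an active interpretive algorithm, the sketch below stores a patient's genotypes in the record and consults them only when a clinical event triggers a matching rule. Everything in it is an assumption made for illustration: the `Rule` structure, the `decision_support` function, and the variant, drug, and advice strings are hypothetical placeholders, not real variants or clinical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    trigger: str        # clinical event that activates the rule (hypothetical label)
    variant: str        # hypothetical variant identifier
    risk_genotype: str  # genotype at which the rule fires
    advice: str         # message shown to the clinician


# Hypothetical rule set; in practice this would be updated as research results change.
RULES = [
    Rule("order:drug_A", "variant_1", "T/T", "Consider an alternative to drug_A"),
    Rule("order:drug_B", "variant_2", "A/G", "Consider a dose adjustment for drug_B"),
]


def decision_support(event, genotypes, rules=RULES):
    """Return advice from every rule triggered by `event` whose risk genotype
    matches the patient's stored genotype. Unrelated genotypes are never read."""
    return [
        rule.advice
        for rule in rules
        if rule.trigger == event and genotypes.get(rule.variant) == rule.risk_genotype
    ]


# Stored genomic profile; variant_3 is never consulted by any rule and stays latent.
patient = {"variant_1": "T/T", "variant_2": "A/A", "variant_3": "C/C"}

print(decision_support("order:drug_A", patient))  # ['Consider an alternative to drug_A']
print(decision_support("order:drug_B", patient))  # []
print(decision_support("order:drug_C", patient))  # []
```

In this design neither physician nor patient has to remember the genotype matrix: the rules live in the system and can be revised centrally, while the stored data stay silent until a relevant decision arises.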

What are some of the challenges that will likely emerge from such a system? We suggest that several areas will be of immediate concern. The first is undesired presymptomatic diagnosis or knowledge of reproductive risk. Another important challenge is that there will initially be a great deal of uncertainty in the interpretation of genetic information because the penetrance of deleterious alleles in the general population is unknown. If genetic screening is most inexpensively carried out using very broad tools, there will inevitably be loss of autonomy in the choice to test for certain diseases and to decline testing for others. Finally, one could imagine that there could be active harm caused by genetic testing. Personal data could be utilized in ways that are not beneficial to the individual, for example, in employment discrimination, forensic analyses, or even identity theft. These challenges are real and need to be responsibly managed at the institutional, national, and international levels.

Yet this type of system for managing genetic information in the EHR must be developed if there is any hope of widespread clinical integration of genomic data. Already, consumer-directed health networks, such as Google Health, are enabling consumers to collect, store, manage, and control access to various types of medical information [18]. It will not be long before these internet-based information tools incorporate genome-wide data that can be easily mined for their clinical significance over time as additional information is collected. Without such a system, attempts to counsel patients on the basis of genomic information will be futile.

Authors' information

AM is a member of the following committees: NIH Electronic Medical Records and Genomics Network (eMERGE), ELSI Consent and Community Participation Workgroup; NIH Advisory Committee to the Director (ACD), Working Group on Participant and Data Protection (PDP); Personalized Health Care Working Group (PHC), American Health Information Community (AHIC), Office of the National Coordinator for Health Information Technology (ONC HIT), Secretary of the Department of Health and Human Services (DHHS).