Linking Genes to Cardiovascular Diseases: Gene Action and Gene–Environment Interactions
A unique myocardial characteristic is its ability to grow/remodel in order to adapt; this is determined partly by genes and partly by the environment and the milieu intérieur. In the “post-genomic” era, a need is emerging to elucidate the physiologic functions of myocardial genes, as well as potential adaptive and maladaptive modulations induced by environmental/epigenetic factors. Advances in genome sequencing and analysis have lately become exponential, escalating our knowledge of the sometimes controversial genetic underpinnings of cardiovascular diseases. Current technologies can identify candidate genes variously involved in diverse normal/abnormal morphomechanical phenotypes, and offer insights into multiple genetic factors implicated in complex cardiovascular syndromes. The expression profiles of thousands of genes are regularly ascertained under diverse conditions. Global analyses of gene expression levels are useful for cataloging genes and correlated phenotypes, and for elucidating the role of genes in maladies. Comparative expression of gene networks coupled to complex disorders can contribute insights as to how “modifier genes” influence the expressed phenotypes. Increasingly, a more comprehensive and detailed systematic understanding of the genetic abnormalities underlying, for example, various genetic cardiomyopathies is emerging. Implementing genomic findings in cardiology practice may well lead directly to better diagnostics and therapeutics. A strong appreciation is currently evolving for the value of studying gene anomalies, and of doing so in a cohesive rather than disjointed manner. However, it is challenging for many practitioners and investigators to comprehend, interpret, and utilize the increasingly accessible and affordable clinical cardiovascular genomics studies. This survey addresses the need for fundamental understanding in this vital area.
Keywords: Genotype and expressed phenotypes; Exons, introns, and alternative splicing; Monogenic and polygenic traits and gene networks; Major gene, “modifier genes,” and pleiotropy; Regulatory DNA “switches” and regulation of gene expression; Gene interactions and epistasis; Genetic cardiomyopathies, HCM, DCM; Environmental influences and epigenetics; Mutations and haplotypes
“With the tools and the knowledge, I could turn a developing snail's egg into an elephant.
It is not so much a matter of chemicals because snails and elephants do not differ that much.
It is a matter of timing the action of genes.”—Barbara McClintock, 1902-92, American geneticist, Nobel laureate.
Quoted (p. 176) in Bruce Wallace: The Search for the Gene.
The heart integrates structure and function across multiple spatial scales. Great challenges revolve around developing the means of linking morphomechanical myocardial characteristics across the spatial scales. A unique characteristic of myocardium is its ability to grow and remodel in response to changing environments; this is determined partly by genes and partly by the physical environment [2, 3]. There is an emerging need to ascertain not only the physiological functions of myocardial genes and proteins but also their adaptive/maladaptive modulations by mechanical “environmental factors.” In a series of recent analyses, I have outlined the role of mechanical stresses [2, 3, 4, 5], or a lack thereof, as important epigenetic factors in cardiovascular development, adaptations, and disease.
The essential tasks of cardiac myocytes are the development of contractile forces by the fibrillar sarcomeres and their transmission to the myocardial extracellular matrix (ECM); insufficient accomplishment of either dynamic task initiates cardiac remodeling entailing hypertrophy and dilatation, provokes symptoms, and can lead to heart failure. Accordingly, all cardiac diseases with abnormal growth and remodeling may be in the class of diseases with a mechanical etiology. Different levels of mechanical dynamic stimuli such as right ventricular/left ventricular (RV/LV) pressure, endocardial fluid shear stress, or myocardial deformations may initiate or modulate signal transduction cascades and other cellular processes that underlie cardiac adaptations, remodeling, or transition to pathology; these may be reversible and amenable to treatment. Moreover, it is now evident that systolic and diastolic (filling-related) local mechanical stimuli (compressive, tensile, and shear stresses) are major controllers of growth and remodeling. Cells are capable of sensing mechanical forces and converting them into biological signals via mechanotransduction mechanisms [3, 4, 5]. Therefore, mechanical forces contribute to the regulation at the molecular and cellular level of various processes, such as gene expression, cell growth, proliferation and apoptosis, and synthesis and degradation of ECM components, thus inducing altered morphology, properties, and function [2, 3, 4, 5]. Intriguingly, the influence of the ECM collagenous cardiac skeleton is reflected by the remarkable consistency of the cardiac shape over a lifetime; in fact, each individual cardiomyocyte retains only the extent of mobility permitted by its rigid collagenous suspension via wavy (non-straight) collagen struts within the ECM network.
A detailed GLOSSARY is listed in the Electronic (Online) Supplementary Material, and is referred to in the text as “Glossary, ESM.”
The Cardinal Concepts of Genotype and Expressed Phenotype
The central concepts of genotype (Genotypus) and phenotype (Phänotypus) were first introduced in 1909 by Wilhelm Johannsen, a Danish geneticist, in his German-language textbook titled Elemente der exakten Erblichkeitslehre (Elements of an exact theory of heredity). Today, the genotype is viewed as the descriptor of the genome, which is the set of physical DNA molecules inherited from the parents [2, 3]. The phenotype is the descriptor of the phenome, which encompasses the manifest properties of cells, organs, and the whole organism, i.e., structural (morphology) and functional (physiology) characteristics, including growth patterns, specialized functions, and metabolic, behavioral, and other activities [2, 3]. Analogous considerations apply for constituent cells, tissues, and organs.
Unquestionably, however, there is some inexactness in the assignment of an individual to one genotype, because numerous mutations most likely take place at random in cells throughout development, growth, and life, so that not all cells in the body have one identical genome. Likewise, even cloned genetic copies or monozygotic twins with nearly identical genotypes will diverge from each other in phenotype due to discrepancies in their developmental environments. Strictly speaking, therefore, each phenotype is a category possessing only a single member.
The Human Genome Project and Next-Generation Sequencing
Because the genes are made of DNA, what produces or influences strongly the observed similarities and differences in structure and function between different individuals should be similarities and differences in DNA. Accordingly, one might reasonably expect that to investigate such phenotypic similarities and differences, we must sequence and compare the corresponding DNA, anticipating distinct “genes for” not only different phenotypic traits but also many heritable diseases, including cardiovascular problems. After all, the evidence for the “heritability” of a trait comes from its stronger resemblance among close relatives than unrelated individuals.
Base pairs that form between specific nucleobases are the building blocks of the DNA double helix (see Supplementary Table 1; cf., Fig. 5) and may be construed as the teeth of a zipper, or the rungs of the polynucleotide DNA spiral staircase, contributing to its folded structure. The helical strands are held together by hydrogen bonds between cytosine and guanine and between thymine and adenine residues. The aggregate size of the human genome is about 3 billion base pairs, arrayed in 23 (haploid number) chromosomes; the chromosomes themselves range in size from about 250 million base pairs (250 Mb) for chromosome 1 to about 50 Mb for chromosome 21. When the highly anticipated draft sequence and initial analyses of the human genome assemblies were published by the Human Genome Project (HGP) in February 2001 [9, 10, 11], with some 90–95 % of the genome’s 3 billion DNA base pairs covered at an accuracy of 99.9 %, a perplexing conclusion was that the number of human protein-coding genes, at <30,000, was significantly smaller than previous estimates, which had reached as many as 140,000 genes. The current estimate is only 20,000–25,000 genes.
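The pairing rule described above (cytosine with guanine, thymine with adenine) can be illustrated with a minimal sketch. The function names and the example strand are hypothetical and chosen only for illustration.

```python
# Watson-Crick pairing as stated in the text: C pairs with G, T pairs with A.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand: str) -> str:
    """Return the base-by-base partner of each nucleotide in `strand`."""
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own 5'-to-3' direction."""
    return complement_strand(strand)[::-1]


example = "ATGC"  # hypothetical four-base strand
partner = complement_strand(example)
partner_5_to_3 = reverse_complement(example)
```

Because the two strands of the helix run antiparallel, the partner strand is conventionally written reversed, which is what the second function does.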
Table: Genes commonly implicated in HCM and DCM, in descending order of frequency.
HCM (% of HCM associated with mutation of this gene): Myosin-binding protein C, cardiac type; Troponin T, cardiac muscle; Troponin I, cardiac muscle; Tropomyosin alpha-1 chain.
DCM (% of DCM associated with mutation of this gene): 3 %–4 %; 2 %–4 %; Myosin-binding protein C, cardiac type.
In the past, genetic evaluation of conditions such as the genetic cardiomyopathies was performed by sequential screening of a very limited number of genes. Next-generation sequencing (NGS) has increased the throughput and minimized cost, enabling simultaneous screening of large assortments of genes for multiple patients in a single sequencing run. Thanks to continuing advances in accuracy, throughput, pricing, and simplification of data analysis, it has now become practicable to apply NGS to clinical molecular diagnostics of inherited cardiac conditions. Once disease-associated modifications in a primary variant gene’s DNA polynucleotide sequence are identified, it may become easier to resolve how the structure of the resultant polypeptide (and thence, protein/enzyme) gene product changes in a manner impacting its biological function; this has obvious therapeutic implications.
A Gene Is Not a Continuous DNA Segment: Exons, Introns, and “Alternative Splicing”
DNA coding for a protein generally comprises separate sections of DNA, near one another in the genome, but with interposed lengths of non-coding DNA separating them. The coding portions of DNA are termed exons, and the intervening lengths that separate them, introns. On average, there are 8.8 exons and 7.8 introns per gene in the human genome. The entire DNA sequence is transcribed into pre-messenger RNA (pre-mRNA), but subsequent processing (splicing) generally removes the introns, leaving only the exons to determine the polypeptide/protein amino acid sequence. Consequently, both exons and introns, as well as assorted other regulatory DNA sequences upstream and downstream of the exon/intron set of a gene, are implicated in the actual manufacture of the RNA that guides the assembly of the protein product. Accordingly, genes should be envisioned as complex, spatially discontinuous, composite (rather than unitary) objects. This is akin to the idea that a cell culture is an object. Introns were discovered by observing that the mRNA that coded for proteins was almost always shorter than the DNA from which it had been transcribed. They are removed by splicing enzymes in the intranuclear spliceosomes before mRNA, ribosomal RNA (rRNA), and transfer RNA (tRNA) can carry out their functions in the cytoplasmic ribosomes.
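The removal of introns described above can be sketched on a toy transcript. The sequences below are invented for illustration and do not correspond to any real gene. A transcript with n exons has n − 1 introns, which is consistent with the averages of 8.8 exons and 7.8 introns per gene quoted in the text.

```python
# Toy pre-mRNA laid out as alternating exon/intron segments (hypothetical sequences).
# The introns here begin with GU and end with AG, the canonical splice-site dinucleotides.
SEGMENTS = [
    ("exon", "AUGGCC"),
    ("intron", "GUAAGUCCAG"),
    ("exon", "UUUGAA"),
    ("intron", "GUGAGUUUAG"),
    ("exon", "CCCUAA"),
]


def pre_mrna(segments):
    """Primary transcript: every segment, exons and introns alike, is transcribed."""
    return "".join(seq for _, seq in segments)


def splice(segments):
    """Mature mRNA: introns are dropped and the exons are joined in order."""
    return "".join(seq for kind, seq in segments if kind == "exon")


primary = pre_mrna(SEGMENTS)
mature = splice(SEGMENTS)
n_exons = sum(1 for kind, _ in SEGMENTS if kind == "exon")
n_introns = sum(1 for kind, _ in SEGMENTS if kind == "intron")
```

The mature transcript is shorter than the primary one, which mirrors the observation that led to the discovery of introns.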
Down syndrome (DS), caused by trisomy 21 (extra copy of chromosome 21), is the most common birth defect and is associated with intellectual disability, endocarditis, and heart defects that may require surgery. The gene DSCAM has been identified in the DS critical region; a homolog (Glossary, ESM) of the DSCAM protein in fruit flies (Drosophila melanogaster) has 38,016 isoforms arising from alternative splicing of exon clusters [25, 26], potentially altering gene function—the italicized symbol refers to the gene that encodes for the non-italicized protein symbol. A few regions of the large pre-mRNA molecule that undergoes alternative splicing remain in all versions of the protein, but this RNA also has four blocks containing multiple exons from which it chooses one version of each. It is akin to assembling clothing out of an apparel collection that provides only 1 type of hat, but 12 kinds of socks, 33 shirts, 48 varieties of trousers, and 2 different shoe types. Altogether, these items could be combined in (1 × 12 × 33 × 48 × 2) different ways to create 38,016 possible ensembles. Alternative splicing is particularly prevalent in genes encoding contractile proteins and in other genes expressed in muscle cells, including cardiomyocytes [27, 28, 29].
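The combinatorial arithmetic in the clothing analogy can be checked with a short sketch. The block labels are placeholders; only the counts (1, 12, 33, 48, 2) come from the text.

```python
from math import prod

# Number of mutually exclusive alternatives per block, as given in the analogy:
# 1 hat, 12 socks, 33 shirts, 48 trousers, 2 shoes. Labels are illustrative only.
ALTERNATIVES_PER_BLOCK = {
    "constant_regions": 1,
    "block_1": 12,
    "block_2": 33,
    "block_3": 48,
    "block_4": 2,
}


def isoform_count(alternatives):
    """One alternative is chosen per block, so the counts multiply."""
    return prod(alternatives.values())


total_isoforms = isoform_count(ALTERNATIVES_PER_BLOCK)
```

Because one choice is made independently in each block, adding a single further block with k alternatives would multiply the total by k, which is why a single gene can yield so many isoforms.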
The Riddle of “Junk DNA” and Regulatory DNA “Switches”
Of the 3 billion DNA base pairs of the human genome, over 98 % lie outside the protein-coding sequences of our 20,000–25,000 genes and were, until recently, regarded as junk DNA with no protein-coding function. With so few genes, many DNA segments must contain information used in coding for more than just one protein, through the mechanism of alternative RNA splicing [4, 5] described in the preceding section. The human genome is a highly structured RNA-producing machine. The junk DNA hypothesis is being discounted decisively as more and more of the non-coding DNA is found to be transcribed into functional RNA molecules, albeit with functions not as yet characterized comprehensively. The current understanding of non-coding RNA (ncRNA) transcripts is still very incomplete, owing to enormous difficulties in genome-wide ncRNA gene mining. Synonyms for ncRNA are non-messenger RNA (nmRNA), non-protein-coding RNA (npcRNA), and functional RNA (fRNA). Beyond transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs), ncRNAs are classified as introns; short, long, antisense, and small interfering RNA; microRNA; and pseudogenes, and into many other functional classifications.
Some sequences in the junk DNA act as genetic switches, which determine where and when genes partaking in genetic networks get expressed; without these switches, called regulatory DNA, genes are inert. Regulatory DNA switches control human genes and, thus, how the body makes different kinds of cells, tissues, and organs, how it adapts to changing conditions, and how normal gene circuitry gets rewired in disease; in therapeutics, ncRNAs are emerging as a substantial component of the advancing field of cardiac regenerative medicine [31, 32]. The switches (DNA sequences) for controlling genes are distributed throughout the genome, and combinations of different switches function together to control genes, often at great distances apart along the genome. There is a large diversity of transcribed RNA molecules that play crucial roles in regulating gene expression or in guiding RNA modifications. This implies that the non-coding DNA influences the behavior of the protein-encoding genes, the “coding DNA.” In this way, non-coding polynucleotide sequences can shape, more or less profoundly, expressed phenotypic traits [34, 35]. Small non-coding microRNAs (miRNAs) are present not only in the cytoplasm but also in several extracellular compartments, including circulating plasma; consequently, miRNAs are useful as diagnostic biomarkers for an increasing number of diseases. Recent research has shown that long ncRNA (lncRNA) drives the gene expression changes underlying heart failure, wherein certain fetal genes are upregulated while adult genes become downregulated, suggesting a therapeutic potential of lncRNA in heart failure management.
Only a small fraction of regulatory DNA regions are active in any given cell type; this fraction is almost totally unique to each type of cell and becomes a sort of molecular barcode label of the cell’s identity. It seems that the naïve dualistic classification into coding versus non-coding transcripts (ncRNA) is rather unworkable with respect to the subtlety and complexity of genome regulation. It is vital to always remember that cells in organs and tissues, including myocardial cells (endocardium, cardiomyocytes, and fibroblasts), do not differ in the genes present but in their differential gene expression, including that of alternatively spliced mRNA transcripts.
Lack of Correspondence Between Expressed Phenotypic and Genomic Complexity
The Human Genome Project (HGP) additionally characterized the entire genomes of several other organisms employed widely in biomedical research, such as mice, rats, fruit flies, and roundworms. Standard gene names and symbols can be found in databases that are specific to particular organisms (e.g., human: www.genenames.org/; mice: www.informatics.jax.org/; rats: rgd.mcw.edu; flies: flybase.org/; worms: www.wormbase.org/). The parallel sequencing efforts have acted synergistically, because most organisms have numerous “homologous” genes with similar function, reflecting their shared ancestry. Homologous genes and proteins that are faithfully preserved and may not have changed over millions of years of evolution could constitute essential integrants of life and attest to evolutionary conservation of core processes. The prime surprise of such parallel investigations was that a fertilized human egg could produce so very different an organism from a mouse (Mus musculus) egg, whereas the human has merely 300 distinctive genes not found in the mouse! It appears that the increased intricacy in the mechanisms that regulate gene expression is an important reason why humans are more complex than “lowly” creatures, such as worms and flies, and lower mammals, such as mice. Importantly, depending in part on “environmental” epigenetic regulating factors [2, 3, 4, 5], human genes can give rise to multiple related proteins, each potentially capable of performing a different function in our bodies.
Accompanying such unexpected findings has been an uneasiness about the lack of correspondence between genomic and phenotypic complexity. Implicit in this is a non-linear, non-reductionist framework of disease, not necessarily relying on relatively simple causal explanations rooted in Mendelian gene mutation. This well-conceived framework perceives disease as an outcome of complex, multifactorial abnormalities at the manifold-scale levels of organs, tissues, and cells. Involved are integrated systems whose function depends definitively on their morphomechanical organization, their mutual interrelations and interdependencies, and on interactions among them and with their “environment.” This view does not negate the value of analyses at reduced system levels (molecules, genes, etc.) so as to understand how the interacting components work. However, at each higher level, important morphomechanical characteristics may emerge (in the sense of shifting from a billiard-ball causality to an interacting network complexity with increasing capacity for autonomous change), which could not have necessarily been predicted from just a knowledge of lower-level components.
The International Haplotype Map Project: Common Maladies Linked to Genotype
The International Haplotype Map Project (HapMap, http://www.hapmap.org/) has shifted attention from the single gene to the whole genome, by cataloging/mapping details of genetic similarities and differences in human beings. NGS can sequence human genomes and transcriptomes and explore gene expression changes on a genome-wide scale and with single-base precision. Genome-wide association studies (GWAS) identify loci of single-nucleotide polymorphisms (SNPs, pronounced “snips”) across the genome and correlate them with disease susceptibility. Since the initial dissemination of the haplotype map (HapMap) of the human genome in 2005, a considerable number of GWAS have been conducted to establish genetic determinants of complex, multifactorial diseases, including cardiovascular disorders, and individual responses to therapeutic agents.
A haplotype is a grouping of DNA variations, or a series of polymorphisms, that are close together in the genome and tend to be inherited jointly; it can refer to a combination of alleles (Glossary, ESM) or to a set of SNPs acquired on the same chromosome—new mutations create new haplotypes. Use of the HapMap in association studies can reveal links between common disorders and genotype; genotype is one of three factors that determine phenotype, the other two being epigenetics and non-inherited environmental factors. The haplotypes in individuals with a specific disease are compared to those of a comparable (control) group without it. If a particular haplotype occurs with higher frequency in affected individuals compared to controls, gene(s) influencing the disease may be situated within or near that haplotype.
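The case-control comparison described above can be sketched as a 2 × 2 table analysis. The counts below are hypothetical and serve only to illustrate the arithmetic. The function name is likewise an assumption, and real association studies typically count chromosomes rather than individuals and correct for multiple testing.

```python
def haplotype_association(case_with, case_without, control_with, control_without):
    """Compare carrier frequency of one haplotype between cases and controls.

    Returns the two frequencies, the odds ratio, and the Pearson chi-square
    statistic (1 degree of freedom, no continuity correction).
    """
    a, b, c, d = case_with, case_without, control_with, control_without
    n = a + b + c + d
    return {
        "freq_cases": a / (a + b),
        "freq_controls": c / (c + d),
        "odds_ratio": (a * d) / (b * c),
        "chi_square": n * (a * d - b * c) ** 2
        / ((a + b) * (c + d) * (a + c) * (b + d)),
    }


# Hypothetical counts: 60 of 200 affected and 40 of 200 unaffected individuals carry the haplotype.
result = haplotype_association(60, 140, 40, 160)
```

With these invented counts the haplotype is carried by 30 % of cases versus 20 % of controls. An odds ratio above 1 together with a chi-square statistic above 3.84 (the 5 % critical value for 1 degree of freedom) would flag the region for follow-up. It would not by itself identify a causal gene.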
Use of the HapMap should result in advances, such as treatments adapted to a patient’s genetic makeup to maximize effectiveness while minimizing side effects. It should be recognized, however, that genetic variation entailing multiple variants (alleles) of any given gene in the population (polymorphism) probably has a decidedly graded effect on gene expression because, for any gene, many factors participate in regulating its expression; genetic variability in any one of these factors is apt to bring about a change in expression. Moreover, at any genomic locus, there are different alleles or variants that can exist across a single haplotype cluster or in several haplotype clusters nearby. Accordingly, there is no single haplotype for disease risk but, more accurately, a collection of haplotypes that confer a graded risk of the disease. Instead of being thought of as independent hereditary units, genes, expressions of DNA, are to be conceived as parts of biochemical/molecular systems within cells, including those of the myocardium. Intense work on these systems explores the functional interdependence of DNA and molecules of other kinds, and how diverse molecules may function together and operate in ways whereby they both regulate and are regulated by each other and by physical forces in their environment [4, 5].
Distributed Gene Networks
As the facts discovered through hypothesis-driven and big data project research in genetics continue to grow, so does the need for advanced tools to aid in visualization and analysis [42, 43]. Consequently, software platforms for visualizing gene interaction networks and for integrating these interactions with gene expression profiles (such as the freeware programs Genes2Networks, FunCoup, and Cytoscape, with their associated plug-ins) become indispensable. They can help identify the top disease gene candidate(s) from “seed” lists of generally accepted human gene symbols; their output includes dynamic three-color network maps, with statistical analysis reports [43, 44, 45, 46, 47]. The plug-ins allow users to alter modes of network visualization, add layouts, add novel functionality (for instance, network graph animation), and search for pathways through a network. Powerful extensions are obtainable from the Cytoscape web site (http://cytoscape.org/), where new releases, documentation, and tutorials are available.
Rather than operating along linear pathways, the collective complexity of connections among genes is better expressed as a distributed network. Some genes can cause greater abnormalities than others when mutated, but this hinges strongly on the context of other alleles that are present. The multipart connectedness of interactive gene ensembles demands a comprehensive systems approach: a versatile understanding of the gene system/subsystem that can refer to the wider network plan and simultaneously appraise the intricate interconnections among its elements. It calls for network thinking. It is always important to remember too that cells, tissues, and organs are not static entities, and that it is consequently necessary to ascertain dynamic network changes reflecting both the temporally and spatially variable nature of the interconnected gene assemblies. Comparative analysis of the underlying gene networks under different operating conditions may provide invaluable pathophysiologic understanding when investigating disease-specific genetic alterations underlying characteristic morphomechanical phenotypic differences, e.g., in various cardiac abnormalities. Such dynamic studies of the implicated genetic networks should provide more accurate, comprehensive understanding regarding the roles of genes in diseases such as hypertrophic (HCM) and dilated (DCM) genetic cardiomyopathies.
Gene Interaction Networks: Degeneracy Underlies Genomic Stability and Plasticity
Human (ENCODE) and model organism (modENCODE) genome decoding projects and the human HapMap Project profiling genetic variants provide enormous quantities of genomic data from a diverse assortment of organisms and environments [5, 52]. They supply a “parts list” of genes for a given (normal or abnormal) genome of interest. However, to understand the complexity encoded within it, knowledge is required concerning which of these parts function together and how. Rather than running in linear pathways, as modeled by the classic “one gene–one enzyme hypothesis” of Beadle and Tatum, which depicts each gene in sequence as responsible for producing a single enzyme affecting a single step in the pathway, the increasing complexity of relationships among genes is better described as a distributed network of interconnected elements (cf., Fig. 2).
Networks do not function in the same way as simple pathways; their elements can assume new roles as conditions change or other genes suffer mutations/damage. To do so, linear pathways would require inefficient redundancies; not so with gene networks, which are intrinsically more versatile, adjustable, and adaptable. Network tweaking can occur after perturbations, when gene networks exploit alternate strategies to preserve the output. The mechanisms used to arrive at that output are nonetheless different: They exemplify degeneracy, namely, using multiple pathway combinations to achieve functional plasticity. Degeneracy is a key organizational feature of our genetic code. Not to be confused with medical jargon, degeneracy in network systems analyses refers to the capacity of alternative components (viz., gene groupings), by adjusting expression of different genes, to produce a similar phenotypic outcome in one context and dissimilar ones in other contexts.
Having structural arrangements that can selectively, but not exclusively, yield the same output under one condition, and dissimilar outputs in different conditions, improves robustness to perturbation and increases the flexibility of the cell, organ, or organism to changing environments or to gene mutation and damage (“knockout” experiments [55, 56]). Such considerations emphasize the need for caution in using gene knockouts to attribute phenotypic effects to genes. Appropriately, attention is now shifting to understanding the difficult problem of how normal or disease genes and gene interaction networks, which link diseases to causal and contributing mutated genes, interact cooperatively in a multipart distributed dynamic system.
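The contrast between a linear pathway and a degenerate network under knockout can be illustrated with a deliberately crude toy model. Everything here is an assumption made for illustration: the gene names are hypothetical, and "output preserved" is reduced to whether any route from the input signal to the output node survives, which ignores dosage, kinetics, and context.

```python
from collections import deque

# Hypothetical wiring: two alternative routes lead from the signal to the output.
DEGENERATE_NETWORK = {
    "signal": ["geneA", "geneB"],
    "geneA": ["geneC"],
    "geneB": ["geneD"],
    "geneC": ["output"],
    "geneD": ["output"],
}

# Hypothetical linear pathway: a single chain with no alternative route.
LINEAR_PATHWAY = {"signal": ["geneA"], "geneA": ["geneC"], "geneC": ["output"]}


def output_preserved(network, knocked_out=frozenset(), start="signal", goal="output"):
    """Breadth-first search that skips knocked-out genes."""
    if start in knocked_out or goal in knocked_out:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in network.get(node, []):
            if nxt not in seen and nxt not in knocked_out:
                seen.add(nxt)
                queue.append(nxt)
    return False


single_ko_network = output_preserved(DEGENERATE_NETWORK, {"geneA"})
single_ko_pathway = output_preserved(LINEAR_PATHWAY, {"geneA"})
double_ko_network = output_preserved(DEGENERATE_NETWORK, {"geneA", "geneD"})
```

In this sketch, knocking out geneA abolishes the output of the linear pathway but not that of the network, whereas knocking out geneA and geneD together does. This is one way to see why an unchanged phenotype after a single knockout does not show that the gene is uninvolved.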
The guiding assumption in experimental interventions employing mutant data is that mutants can reveal the key constituents in a process and that they can illuminate causative mechanisms. The established approach starts off with a particular aberrant phenotype, e.g., the congenital heart defect–associated dysmorphic Noonan syndrome, or HCM, or hypercholesterolemia or other phenotypic abnormalities, for which a genetic screen is applied to identify anomalous genes implicated. Known disease genes are then used to build a network around these genes to identify new genes/nodes that could be additional disease genes affecting syndrome severity. Note, however, that given that the functional relationships among genes can shift with each genetic perturbation, ipso facto then, the very workability of extrapolating backward to a normal genetic system from mutant data becomes questionable. Likewise, it is hard to estimate the odds that phenotypic differences arise because of flanking sequences rather than the targeted mutant allele, especially as the phenotype(s) being studied are commonly selected with a bias toward identifying or confirming a known variant gene role.
Gene Mechanisms Operative in Determining Phenotype
Ever since Mendel’s classic work, geneticists have sought to both describe distributions of phenotypic attributes and explain them in terms of independent causative genetic antecedents or “genes.” The developmental theory of preformationism, which was first proposed by Aristotle and became particularly influential during the Scientific Revolution [2, 59, 60, 61, 62], originated the working notion that a gene is defined by the phenotype that it causes. In preformationism, the egg or sperm was essentially understood to contain all the final features/traits of the mature organism. Development consisted of an enlargement of these preformed characteristics (or primordia, the initial buds of embryonic cells from which the body organs/parts develop) into the individual features of the adult organism; i.e., these primordia truly corresponded to the adult features with which they had a simple and direct causal relationship.
Genes, now conceptualized as DNA, are conceived as parts of chemical/molecular systems within cell nuclei, and as blueprints of particular peptides/proteins. However, while molecular geneticists came to understand that in biological systems genes actually code for proteins, it became convenient and seemingly natural to think about preformationist-like genes for the individual traits of an organism/individual and, by extrapolation, also for classical genetic diseases in humans. The idea that genes could be “for” human traits was sustained by the discovery that genes for classical Mendelian medical disorders, such as various anemias (see below), often acted just like the hereditary elements discovered in Mendel’s pea plants. Such patterns of thinking appear to be deeply rooted and influence thinking modes concerning gene–phenotype relations and mutant disease genes in general, even today.
The evidence that the association between individual genes and complex cardiac diseases is commonly non-specific and manifold does not mean that the identification of implicated gene collections is inconsequential. Such findings can begin the important process of elucidating how specific genes interact with each other and with environmental factors/exposures [2, 3, 4, 5] to engender disease, and can provide new targets for treatment. With the full sequence of the human genome in hand, NGS technologies have subsequently reduced the cost of DNA sequencing from approximately US$0.50 per kilobase (kb) to less than US$0.001 per kb and radically increased throughput by sequencing in a massively parallel fashion. They allow differences in long stretches of gene-containing chromosomal DNA taken from many individuals to be searched for and compared extremely efficiently.
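To put the quoted per-kilobase costs in perspective, the following back-of-the-envelope sketch applies them to a genome of about 3 billion base pairs, the size given earlier in the text. It is a single-pass estimate under stated assumptions: it ignores sequencing depth (coverage), sample preparation, and analysis costs, and because the newer figure is quoted as "less than" US$0.001 per kb, the second result is an upper bound.

```python
GENOME_BP = 3_000_000_000   # approximate human genome size quoted in the text
COST_PER_KB_BEFORE = 0.50   # US$ per kilobase, earlier sequencing
COST_PER_KB_AFTER = 0.001   # US$ per kilobase, upper bound for NGS


def raw_sequencing_cost(base_pairs, cost_per_kb):
    """Cost of reading every base once at a flat per-kilobase price."""
    return base_pairs / 1_000 * cost_per_kb


cost_before = raw_sequencing_cost(GENOME_BP, COST_PER_KB_BEFORE)
cost_after = raw_sequencing_cost(GENOME_BP, COST_PER_KB_AFTER)
fold_reduction = COST_PER_KB_BEFORE / COST_PER_KB_AFTER
```

Under these assumptions a single pass over the genome falls from roughly US$1.5 million to at most about US$3,000, a reduction of at least 500-fold.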
There are two principal patterns of gene expression operative in determining phenotype: monogenic and compound polygenic (Fig. 3). When pleiotropy is demonstrated, one gene is responsible for the determination of many traits; with polygenic traits, the expression of a particular phenotype results from interaction of the products of many genes. Composite traits/syndromes are generally polygenic in nature; moreover, the effect of any one gene can be dependent on the presence of one or more “modifier genes” (genetic background), a phenomenon labeled epistasis, affecting the observed phenotype in considerable ways. Importantly, epistatic mutations have different effects in combination than individually. Multiple modifier genes can act on a limited number of primary loci; with increasing numbers of loci, the possible number of different phenotypic outcomes increases exponentially, and this fact creates a continuous distribution for the composite trait.
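The way a growing number of loci pushes a composite trait toward a continuous distribution can be sketched with a simple additive model. The assumptions are strong and are made only for illustration: each locus has two alleles of equal frequency, every "plus" allele adds the same increment to the trait, loci are independent, and there is no epistasis or environmental effect. This is precisely the simplification that the modifier-gene discussion above warns against in real traits.

```python
from math import comb


def genotype_combinations(n_loci):
    """Each biallelic locus has 3 genotypes (e.g., aa, Aa, AA), so counts multiply."""
    return 3 ** n_loci


def additive_trait_distribution(n_loci, plus_allele_freq=0.5):
    """Probability of carrying k 'plus' alleles out of 2 * n_loci (binomial)."""
    n_alleles = 2 * n_loci
    p = plus_allele_freq
    return [
        comb(n_alleles, k) * p ** k * (1 - p) ** (n_alleles - k)
        for k in range(n_alleles + 1)
    ]


three_loci = additive_trait_distribution(3)
ten_loci = additive_trait_distribution(10)
```

With 3 loci there are 27 genotype combinations but only 7 trait classes; with 10 loci there are 59,049 combinations spread over 21 classes. As loci are added, the steps between classes become finer and the histogram approaches a smooth bell-shaped curve.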
Monogenic Phenotype–Engendering Pattern
The monogenic pattern identifies a one gene → one trait pathway. The pathway may be influenced by other genes or the environment (gene–gene, GxG, and gene–environment, GxE, interactions), but in some settings, when mutation involves an influential DNA locus, these influences can become effectively unimportant. Expressed monogenic (monolocus) disorders, or Mendelian diseases, tend to run in families. Such disorders of cardiovascular import include various anemias, as follows [68, 69]: Mutations in the HBB gene affecting β-globin, a subunit of hemoglobin, give rise to sickle cell anemia and also to beta thalassemia; alpha thalassemia typically results from deletions involving the HBA1 and HBA2 genes that produce the α-globin subunit of hemoglobin. Fanconi anemia (FA) shows considerable genetic heterogeneity but is attributable mostly to mutations in one of three genes, FANCA, FANCC, and FANCG, which disrupt the FA multigene network that repairs DNA damage; rapidly dividing bone marrow cells are particularly affected, resulting in the anemia.
Similarly, a number of large-scale, gene-based SNP association studies have led to the discovery of specific genetic determinants or risk factors conferring susceptibility to common cardiovascular and related diseases, including the CTLA4 gene for type 1 diabetes, the factor V Leiden variant of the F5 gene for deep vein thrombosis, the ApoAV gene for hypertriglyceridemia, the PDE4D gene for ischemic stroke, and the LTA gene, which has been implicated in the pathogenesis of atherosclerosis, coronary heart disease, and myocardial infarction.
Mutations have been identified as the cause of other important cardiovascular monogenic diseases that, in the words of one authority, “are currently thought to include” various genetic cardiomyopathies, encompassing HCM and DCM; channelopathies, involving long/short QT syndromes and catecholaminergic polymorphic ventricular tachycardia; familial (primary) pulmonary hypertension; Marfan syndrome; and familial hypercholesterolemia. Various kinds of mutations are recognized, as follows: They can be located in an exon that contains coding DNA, thus causing an amino acid substitution giving a protein with impaired function (missense mutations). Other mutations create a stop codon, viz., one that does not correspond to any amino acid, signaling a premature stop of translation and yielding a truncated and non-functional polypeptide, or even no RNA and protein (nonsense mutations). When one base pair is missing, a shift in the DNA reading frame ensues, leading to a different (commonly nonsensical) protein (frameshift mutations). Additionally, some mutations interfere with the process of splicing (see Fig. 1).
However, essentially monogenic origins of disease, as in the preceding disorders, are not the rule. Only 2 % of the total disease domain corresponds to monogenic causality, and even here the final phenotype is modulated by many factors, principally degeneracy (Glossary, ESM). Factor V Leiden offers an example of multiple contributing circumstances. Most of the thromboembolic events associated with it occur when other risk factors, such as use of oral contraceptives, surgery, and bed rest, are also present; both GxG and GxE interactions, reflected in changes in the composition of blood and in stasis of venous flow, impact the overall risk of venous thrombosis.
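The mutation classes just described can be illustrated with a toy classifier. The sketch below uses the standard genetic code on a short made-up coding sequence; the function names and the example sequence are hypothetical, and splice-site mutations are deliberately not modeled.

```python
# Standard genetic code, built from the conventional TCAG ordering ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def classify(reference: str, mutant: str) -> str:
    """Label a mutant coding sequence relative to the reference (toy rules)."""
    length_change = abs(len(reference) - len(mutant))
    if length_change % 3 != 0:
        return "frameshift"          # reading frame is shifted
    if length_change != 0:
        return "in-frame indel"      # whole codons gained or lost
    ref_protein, mut_protein = translate(reference), translate(mutant)
    if mut_protein == ref_protein:
        return "silent"
    if len(mut_protein) < len(ref_protein):
        return "nonsense"            # premature stop codon truncates the protein
    return "missense"                # amino acid substitution


if __name__ == "__main__":
    reference = "ATGGAGTGGAAA"       # hypothetical sequence: Met-Glu-Trp-Lys
    for label, mutant in [
        ("GAG>GTG", "ATGGTGTGGAAA"),
        ("TGG>TGA", "ATGGAGTGAAAA"),
        ("1-bp deletion", "ATGAGTGGAAA"),
    ]:
        print(label, "->", classify(reference, mutant))
```

The GAG to GTG change mirrors, at the codon level, the glutamate-to-valine substitution of sickle hemoglobin, although the surrounding sequence here is invented.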
Polygenic Phenotype–Engendering Mechanisms
The second pattern is distinctly polygenic, signifying that phenotype is determined by a number of genes acting conjointly in a network, as alluded to previously. Inherited cardiac conditions, diabetes, and generally complex, multifactorial maladies typically result from the combined effects of a number of genetic variants. Multiple gene abnormalities may well be implicated in variant phenotypic manifestations of complex cardiovascular diseases, including genetic forms of DCM and HCM (see Table 1 and the “Can, on Their Own, Same-Gene Mutations Cause Starkly Contrasting Traits?” section). Over the last two decades, numerous disease-causing gene alterations for different cardiomyopathies, without clear genotype–phenotype correlations, have been recognized [74, 75]. Until recently, clinical genetic testing was available typically for apparently monogenic diseases, although the majority of genetic heart diseases are both genetically and clinically highly heterogeneous and complex. Complexity stems from the wide-ranging locus heterogeneity (many genes implicated) and allelic heterogeneity (many different mutations implicated within those genes), which until recently had confounded the interpretation of incomplete genetic testing data. As I detail in the section “The Challenge of Pleiotropic Phenotypes of Mutant Genes: ‘Modifier Gene’ Effects,” large-scale studies are necessary to identify modifier genes crucial for cardiac phenotype.
Can, on Their Own, Same-Gene Mutations Cause Starkly Contrasting Traits?
As things stand now, there is no conclusive, elegant explanation of how mutations in the same gene, for instance, MYH7 (see Table 1), can by themselves cause such contrasting morphomechanical and clinical disorders as DCM and HCM. This fact is currently interpreted as a de facto manifestation of “phenotypic plasticity.” It is also a striking example of phenomenological “pleiotropy” (from the Gk. words pleio, alluding to “multiple,” and tropic, pertaining to “transformation”), where mutations in the same gene can bring forth different morphomechanical phenotypic traits [79, 80]. I am not talking here about monogenic mutations in the same gene that can be associated with quantitatively different levels of probability of pathogenicity for HCM (viz., benign variants, variants of uncertain significance, and likely or definitely pathogenic variants). I am talking about ostensibly monogenic causal mutations in the same gene causing qualitatively and drastically divergent pleiotropic outcomes: the anomalous phenotype of DCM versus that of HCM (see Table 1 and Fig. 3b)! With this in mind, I deem it thought provoking, and thus instructive, to summarize briefly the striking clinical differences that typify the morphomechanical/clinical phenotypes of these two remarkable cardiac disorders.
DCM: Morphomechanical Phenotypic Characteristics
The preceding hemodynamic changes accrue from mutations that have been attributed to multiple genes, as summarized in Table 1. Allelic heterogeneity is the rule, and very few specific mutations are encountered in multiple families. Presently, application of NGS and DNA sequence assembler technologies coupled with software platforms for visualizing gene interaction networks can speed up the discovery of new DCM genes and of multicomponent DCM genetic networks. Non-genetic causes of dilated cardiomyopathy are, among others, ischemic injury associated with coronary artery disease or prior myocardial infarction, valvular and congenital heart disease, severe long-standing hypertension, viral, inflammatory, and toxin-linked myocarditides.
HCM: Morphomechanical Phenotypic Characteristics
Genetic causes can typically be found in 50 to 75 % of cases of hypertrophic cardiomyopathy; expanding the genetic screening analysis through increasing application of NGS technology, to enable simultaneous investigation of a broad panel of inherited genes, is projected to augment the genetic-positive yield. In those patients in whom a mutation is identified, about 80 % involve two genes (encoding myosin heavy chain 7 and myosin-binding protein C; see Table 1). The pathologic features of HCM consist of marked and asymmetric LV hypertrophy with an especially thickened ventricular septum, atrial enlargement, and an undersized LV cavity; hypertrophy and disarray of the cardiomyocytes and interstitial fibrosis are present throughout the myocardium. The cardiac phenotype and clinical course of patients with HCM are typically very variable with regard to the pattern and degree of hypertrophy, the age at onset, and the clinical outcome. Modifier genes, the environment, gender, and comorbidity (coexistence of, for example, ischemic or valvular disease) contribute to such differences.
HCM exemplifies during systolic ejection what I have previously termed “polymorphic gradients.” Protean and polymorphic are the complicated systolic fluid dynamics of HCM [2, 82, 98] (see Fig. 4). Previous CFD analyses [2, 82, 98], using ejection velocity and pressure gradient patterns obtained by multisensor catheters and angiographic measurements, indicate that convective (Bernoulli) pressure gradients, which are proportional to the square of the applicable flow rate or velocity, as shown by the Bernoulli equation, are accentuated preeminently; in the subaortic region, narrowed by the septal hypertrophy, they may give rise to a Venturi mechanism entraining, or sucking, the neutrally buoyant mitral leaflets toward the septum in a systolic anterior motion (SAM) with concomitant mitral regurgitation. Failure of the mural (posterior) leaflet to move as far forward as the aortic (anterior) one yields non-apposition and an interleaflet gap resulting in the regurgitation; here I use “aortic” and “mural” because they are the modern attitudinally correct terms. It is noteworthy that, as is highlighted in Fig. 4, the pressure gradient rises to its peak levels and maintains them in the face of minuscule forward, or even negative, aortic root velocities recorded by the catheter-mounted electromagnetic sensor. Such negative aortic velocities (as revealed in Fig. 4) probably represent vortices, with recirculating retrograde velocity components. Further morphomechanical systolic flow dynamic details can be found in previous publications [2, 82, 98]; here, I simply note that whether mitral leaflet–septal contact is the cause of the enormous mid- and late-systolic intraventricular pressure gradient remains controversial.
The prominent ejection pressure gradients in HCM have focused attention mostly on systole and away from important coexisting biventricular [2, 100] diastolic function abnormalities (see Fig. 4). The relative contributions of relaxation defects, asynchrony, altered passive diastolic properties, and geometry to the diastolic dysfunction in HCM remain incompletely characterized [2, 100, 101, 102]. This might be expected for a disease characterized by profound genetic and epigenetic phenotypic heterogeneity, with substantial variation in expressivity and age-dependent penetrance. Diastolic abnormalities span the gamut, starting from those that can accompany substantial physiologic remodeling hypertrophy as seen in athlete’s heart, and then developing progressively, through derangements that come with compensatory reactive adaptations to pathological systolic pressure overloads [2, 89, 102, 104, 105, 106], to distinctive qualitatively new problems peculiar to HCM. Figure 4 illustrates hemodynamics of severe diastolic dysfunction in HCM.
Recently, I proposed a new paradigm of diastolic dynamics in HCM, emphasizing the relationship of myofiber sheet and ultraconstituent distortions to LV mechanics and end-systolic shape. This innovative approach affords understanding of intricate facets in patterns of diastolic rebound and suction needed for LV filling in many of the polymorphic phenotypes of HCM. Moreover, it may lead to extraepicardial or intramural implantable recoil devices to promote diastolic elastic rebound in selected subsets of HCM patients: those with massive hypertrophy and excessive ejection fraction leading to virtually complete LV emptying, or unduly reduced end-systolic chamber dimensions.
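The quadratic dependence of the convective gradient on velocity can be sketched numerically. The snippet below evaluates only the steady convective Bernoulli term between an upstream and a downstream site; it ignores the local (unsteady) acceleration component and viscous losses, and the blood density and example velocities are assumed, illustrative values rather than measurements from the cited studies.

```python
RHO_BLOOD = 1060.0      # kg/m^3, assumed typical blood density
PA_PER_MMHG = 133.322   # pascals per millimeter of mercury


def convective_gradient_mmhg(v_upstream: float, v_downstream: float,
                             rho: float = RHO_BLOOD) -> float:
    """Convective Bernoulli pressure drop, 0.5 * rho * (v2^2 - v1^2), in mmHg.

    Velocities are in m/s. Steady, inviscid flow along a streamline is assumed.
    """
    delta_p_pa = 0.5 * rho * (v_downstream ** 2 - v_upstream ** 2)
    return delta_p_pa / PA_PER_MMHG


if __name__ == "__main__":
    # Hypothetical velocities: 1 m/s in the chamber, rising in a narrowed outflow tract.
    for v_jet in (2.0, 3.0, 4.0):
        gradient = convective_gradient_mmhg(1.0, v_jet)
        print(f"jet {v_jet:.1f} m/s -> gradient {gradient:.1f} mmHg")
```

With negligible upstream velocity the expression reduces to roughly 4v² mmHg (v in m/s), the familiar simplified clinical form, so doubling the jet velocity quadruples the convective gradient.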
The Challenge of Pleiotropic Phenotypes of Mutant Genes: “Modifier Gene” Effects
Regarding any complex disease as purely monogenic appears to be a sweeping simplification. Although supposedly monogenic, genetic HCM and DCM exhibit striking morphomechanical attributes conjuring up a composite pathogenesis embodying the effects of multiple genetic factors. This explains why existing approaches fail to account cogently for the variability and pleiotropism in cardiomyopathy phenotypes. As discussed in the introduction of the preceding section, various schemes have been invoked conventionally to account for the prima facie surprising and perplexing observations relating to causal mutations in the same gene that can lead to either DCM or HCM. While it is plausible that distinct variants in a given gene can cause more than one cardiomyopathy, some investigations have revealed the same causal variant in HCM patients and in patients with genetic DCM, and alluded to phenotypic plasticity to account for this finding [78, 107, 108]. However, oddly, the invoked molecular mechanisms underlying HCM (variations of the thin filament proteome, associated with an increase in Ca2+ sensitivity of the myofibril, a faster cross-bridge turnover rate, and augmented contractility) and DCM (alterations of the thin filament proteome, associated with a lower Ca2+ sensitivity, a slower rate of cross-bridge turnover, and diminished contractility) are fundamentally conflicting and incongruous [109, 110]. This and the contrasting juxtaposition of the morphomechanical characteristics of genetic DCM and HCM synopsized in the preceding section and Fig. 4 call forcefully into question whether the same variant by itself can truly cause both disease phenotypes.
On the other hand, there are plausible, albeit unconventional, reasons for such puzzling findings. They pertain to the unaccounted influences of numerous potential “modifier genes,” which are operative in the dissimilar, compound genetic backgrounds (cf., Fig. 2) that delineate the “genetic context” of individual patients. The modifier gene concept is not new, having been advanced in 1941 by Haldane. Modifier gene mutations may have different effects in combination than individually. Research involving breeding cardiomyopathic mutations into distinct genetic backgrounds in animal (mouse) models should complement human investigations of the influence of the diverse genetic backgrounds on morphomechanical effects of dominant mutations, ultimately allowing us to characterize better the modifier gene networks.
Clearly, the phenotype need not be determined “single-handedly” by the mutation(s) in a single gene (e.g., MYH7). Modifier genes can bring about pleiotropism by strongly affecting the phenotype(s) resulting from the primary mutation; thus, genotype is not reliably predictive of phenotype [113, 114]. The final expressed phenotype is the outcome of the primary causal mutations, of modifier genes ushering GxG interactions into the equation, and potentially of environmental factors (see section “Epigenetics and Gene-by-Environment Interactions”). This crucial fact explains why persisting approaches treating complex diseases as if they were Mendel’s peas fall short in cogently explaining the variability and, at times, striking pleiotropism in cardiomyopathic phenotypes. Such approaches face the correlation versus causality problem: “post hoc, ergo propter hoc.” Are investigators apt to assume too easily that “because something follows something else it must be due (solely) to it”? Malaria occurs most often in persons living near marshes; does it necessarily follow that the miasma that rises from such places was the cause of the disease (the name “malaria” literally means “bad air”)? It is crucial to recognize that complex non-linear relationships are common in gene expression patterns and possibly can allow for a small set of weakly interacting signals to express a significant effect. Powerful approaches employing genetic algorithms (GAs) and artificial neural networks (ANNs), considered in the next section, can be exploited in future studies to search for more complex and subtle interactions involving modifier genes, which may be passed over by less sensitive methods; these interactions can lead to either DCM or HCM syndrome traits when acting in concert with the main causal variant. Modifier genes cannot be ignored as mediators of pleiotropy in cardiomyopathies and other complex disease phenotypes.
The NGS and bioinformatics technologies that are now available [115, 116] offer the prospect of widening the scope of genetic investigations to encompass genetic diagnosis of heterogeneous and polygenic (cf., Fig. 2) clinical cardiological disorders. There is consequently an emerging strong appreciation for the value of studying epistasis, or gene interactions, and for addressing these problems in a non-disjointed, cohesive manner. Comparative expression of networks of genes coupled to complex maladies can contribute insights as to how those genes interact with each other. It is anticipated that increasingly a better, more comprehensive and detailed, mechanistic understanding of genetic abnormalities underlying, for example, various genetic cardiomyopathies can emerge; pari passu, the distinction between monogenic and polygenic underlying pathogenetic mechanisms will become more questionable for many syndromes. This is becoming evident quite rapidly now, as in vivo and in vitro studies indicate that various cardiomyopathic mutations pass on an extensive array of assorted cardiomyocyte operating flaws, encompassing reduced myosin ATPase activity, altered acto–myosin interaction and cross-bridging kinetics, impaired contractility, and altered Ca2+ sensitivity.
DNA Microarray and Other Gene Expression Methodologies: Linking Genes to Transcription Control and Specific Traits and Diseases
Microarray technology allows for large-scale gene experimentation and makes it possible to find the expression levels of genes across many different conditions. In principle, a typical microarray experiment entails the hybridization, or combination through base pairing, of fluorescent-dye-labeled mRNA molecules in samples of interest to their complementary DNA templates, which act as detectors and are robotically printed onto glass-slide arrays in a particular order. Each spot on a microarray contains multiple identical strands of DNA; many such single-stranded DNA/gene spots are used to assemble an array. The amount/density (brightness) of mRNA bound to each detector spot on the array indicates the expression level of the corresponding gene, thus forming a profile of gene expression for the sample. RNA in samples can be converted into its complementary single-stranded DNA (cDNA) using the enzyme reverse transcriptase. Because RNA is readily degraded by omnipresent RNases, cDNA is more convenient to work with than mRNA. Accordingly, in practice, the mRNA of samples is commonly converted into cDNA for use in DNA microarray investigations.
The raw data sets retrieved from microarray experiments are obviously too massive for meaningful conclusions to be inferred by inspection. Of the tens of thousands of genes in experiments, only a much smaller number show strong correlation with the targeted phenotypes. Selecting such a small subset out of the thousands of genes in microarray data is important for accurate classification of the phenotypes; when a reduced number of genes are selected, their pathogenic relationship with the target disease is more easily identified. Widely used methods typically rank genes according to their differential expression among the phenotypes and pick the top-ranked genes. Powerful computers run sophisticated software implementing genetic algorithms (GAs), which have been shown to be a robust search method for problems with such large search spaces [119, 120, 121, 122], and artificial neural networks (ANNs), which emulate and scale up enormously the brain’s many efficient ways to store and process information [123, 124]; neural networks are structured to provide the capability to solve problems without expert guidance and without explicit programming. They can seek patterns in data that no one knows are there.
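The rank-and-select step mentioned above can be sketched in a few lines. The example ranks genes by the absolute value of Welch's t-statistic between two phenotype groups; this statistic is one common choice that I assume here for illustration, and the gene labels and expression values are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


def top_genes(expression, k):
    """Return the k genes with the largest absolute differential-expression score.

    expression maps gene name -> (disease_values, control_values).
    """
    scores = {gene: abs(welch_t(disease, control))
              for gene, (disease, control) in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical log-scale expression values; the gene labels are placeholders.
    toy_expression = {
        "gene_A": ([8.1, 7.9, 8.3, 8.0], [5.0, 5.2, 4.9, 5.1]),
        "gene_B": ([6.0, 6.2, 5.9, 6.1], [6.1, 5.9, 6.0, 6.2]),
        "gene_C": ([3.0, 3.4, 2.8, 3.2], [4.0, 4.3, 3.9, 4.2]),
    }
    print(top_genes(toy_expression, k=2))
```

A real analysis would also correct for multiple testing across thousands of genes, which this sketch omits.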
Compared to the speedily burgeoning knowledge of protein-encoding genes, the detection of transcriptional regulatory sequences within the human genome is progressing slowly. Genetic network models are typically constructed for polygenic interactive systems, in order to assess interactions and regulation of genes. Gene expression data are obtained using high-throughput technologies, such as microarrays and Serial Analysis of Gene Expression (SAGE) [5, 125], which can help in identifying candidate genes that may be variously involved in diverse normal or disease morphomechanical phenotypes; thus, they may offer innovative insights into complex cardiovascular syndromes whose etiologies implicate multiple genetic factors. Using such technologies, the expression intensity patterns of numerous genes are ascertainable under various conditions. The global analyses of gene expression levels that are obtained are useful for cataloging genes and overall phenotypes, and hence for elucidating the role of genes in human disorders. These activities are often interconnected, as we wish to identify genes that are linked to particular sets of morphomechanical traits, or to important cardiovascular anomalies—e.g., “marker genes” that are differentially expressed in particular phenotypic syndromes.
Piecing Together Genomics, Transcriptomics, Proteomics, Metabolomics, and Phenomics
Innovative sequencing and array technologies are fueling groundbreaking advancements not only in research but also in translational and consumer genomics, and molecular diagnostics, rendering possible investigations that were not even imaginable just a few years ago. As I have emphasized in a just published, comprehensive, and detailed analysis, to which I refer the interested reader, genome-wide gene expression methods (NGS technologies, high-density DNA microarrays, SAGE, Northern blotting, etc.) only give information about relative mRNA abundance, viz., concerning the transcriptome. Obviously, this is essential, but certainly not the complete story. This information should be combined with proteome data (from, e.g., Western blotting, 2D polyacrylamide gel electrophoresis (2D-PAGE), nuclear magnetic resonance (NMR) spectroscopy, and mass spectrometry (MS), or liquid chromatography and tandem mass spectrometry (LC-MS-MS), etc.) to conclude whether particular transcripts are, in fact, being translated and expressed, and in what form and intensity. In view of diverse posttranslational protein modifications, which are key mechanisms to increase proteomic diversity and regulate cellular activities, there is also the question of the biological activity of the protein products and their interaction with endogenous metabolites (endogenous metabolome) and exogenous factors, such as pharmaceuticals (exogenous metabolome) [126, 127].
Metabolomics technologies encompassing 2D-PAGE, LC-MS-MS, etc., together with powerful multivariate statistical analyses and informatics software, can afford a simultaneous and relative quantification of thousands of different metabolites within a particular sample, making available a wealth of relevant biochemical data. While transcriptomics and proteomics contribute essential insights into the coordinated regulation of metabolic adaptability, metabolomics sheds light on the actual enzyme activity expressing metabolic regulation together with mass action effects. Thus, in cardiac disease with altered myocardial metabolism, metabolomics is likely to improve greatly on single biomarker-based approaches by ascertaining metabolic biosignatures of wide-ranging biochemical changes. Considering the archetypical paradigm of gene → transcript → protein → metabolite, the best disease biomarkers will be ascertained through studies that show correlation between biomolecules at all four levels. Gene expression work is just one part of a comprehensive “systems biology” approach, melding various disciplines such as molecular biology, informatics, biochemistry, and statistics.
The confluence of next-generation electronic health record (EHR) and high-throughput NGS genotyping technologies offers cardiovascular translational investigators a unique opportunity to integrate genomic patient data into EHR systems to apply precision medicine to a large-scale patient base, encompassing information ranging into many petabytes of data (1 petabyte = 1 quadrillion bytes)! Computer speed, memory, and bandwidth have advanced such that multimodality digital medical images can be part of an EHR system, and at high resolution. These advances will allow integration of genomics, pharmacogenomics, transcriptomics, proteomics, and metabolomics information with clinical facts and other multifaceted phenomics data (digital pathology and virtual microscopy of myocardial biopsies, multimodal cardiac digital imaging data including angiocardiography, echocardiography, Doppler, CT, MRI and PET, solid-state multisensor catheter-derived hemodynamics, and so forth) into a single unified digital workflow. Versatile repurposing for genetic research of EHR data containing rich phenotypic information will lead to the establishment of more detailed EHR phenotypes and should stimulate innovative approaches and collaborative initiatives with enhanced productivity.
Big Data and Innovative Therapeutic Directions
Big data processing, which relies on the simultaneous application of statistics, computer programming, bioinformatics, and graphic/visualization techniques, can draw from text, images, audio, and video to deliver complete interactive analytics enabling the discovery and communication of meaningful data patterns, often in real time. Such patterns facilitate the inference of relationships and dependencies and allow predictions of morphomechanical behaviors and clinical outcomes. Rapid advances in genomics alongside progress in medical imaging, computational biology, and informatics are creating opportunities to develop tools to truly personalize diagnosis and treatment in line with the Hippocratic model: “It’s far more important to know what person the disease has than what disease the person has.” To wit, they are making personalized medicine possible today by matching drugs to individual patients. Such capabilities have never before been feasible at this scale; they should enable academic investigators and clinicians to reach precise diagnoses effectively and should greatly speed up the advent of personalized cardiology and genomic discoveries.
However, the precise extraction of detailed disease and therapeutic-response phenotype information contained in EHRs is not an easy task; EHR-driven phenotyping has faced multiple challenges [129, 130]. GWAS and massively parallel DNA sequencing strategies detect hundreds of thousands of loci per individual, affecting both normal variation and susceptibility to disease, and can shed light on the genetic architecture of complex traits. However, their precise biological relevance must be proven before mutations can be causally linked to a specific disorder. The definitive aim of gene discovery in complex disease is to detect and characterize precisely the specific biological networks/pathways and processes that bring about the disorder. The critical issue is not whether patients have more instances of one or more rare variants than the controls but, rather, which mutated genes are causal and contributing to the illness in the affected. Variable penetrance, epistasis, epigenetic changes, and gene–environment interactions will complicate pertinent efforts; each affected individual will nonetheless exhibit interference/interruption of related key biological processes underlying the anomalies in the morphomechanical traits. Defining comprehensively the ways in which genomics, transcriptomics, proteomics, and metabolomics of complex disorders like HCM and DCM are impacted by DNA mutations will contribute new insights into normal physiology and disease pathophysiology and will provide important clues for rational therapy/management. This discovery pathway can ultimately lead to innovative and potentially patient-specific therapeutic objectives and methods.
Clustering: a Statistical Data Mining Procedure for Analyzing Gene Expression Data
Microarray technology has made it possible to monitor the expression levels of thousands of genes in parallel during important (patho)physiological processes/conditions and across assortments of related samples. Currently, the main focus in genomic research is shifting from sequencing to using the genome sequences to ascertain how genomes function. A microarray gene expression matrix is a table where rows represent genes, columns represent various samples (or examined conditions), and the number in each cell denotes the expression level of the particular gene in the particular sample. Clustering is a helpful statistical data mining procedure for analyzing such gene expression data; it arranges genes together in groups that have potentially related functions or are co-regulated, thus helping to establish the relationships among them in the form of gene regulatory networks [123, 132, 133].
Any a priori interesting set of co-expressed/co-regulated genes, together with its co-regulating conditions or samples, within a gene expression matrix represents a regulatory unit or module (RM). Given a gene expression matrix, biclustering or two-mode clustering algorithms can be applied to discern one or more local patterns, or “biclusters” [134, 135, 136], in which a subset of genes exhibit similar expression levels over a subset of conditions, i.e., specific subsets of rows exhibit similar behavior across specific subsets of columns, and vice versa.
Therefore, biclustering is a data mining technique that allows simultaneous clustering of the rows and columns of a gene expression matrix; it has been the most common technique for extracting gene RMs. Each bicluster is a tuple, or data structure, containing elements of two sets: the rows and columns. It is also possible to cluster the rows and columns of the gene expression matrix separately. Biclustering techniques/algorithms can simultaneously cluster genes and conditions, to reveal distinctive “checkerboard” patterns in matrices of gene expression data, if such patterns exist [134, 137]. When biclustering is used to reveal transcriptional modules composed of genes showing coherent expression profiles over time, it is termed “temporal biclustering.” Biclustering carries out row clustering and column clustering in tandem, according to the similarities among expression profile trajectories of genes and of samples [137, 138, 139].
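A minimal sketch of clustering rows of a gene expression matrix follows. It groups genes whose expression profiles correlate strongly (Pearson r at or above a threshold) with the first member of an existing group. This greedy threshold rule is a simplification chosen for brevity, not a specific published algorithm, and the gene labels, values, and threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def cluster_genes(matrix, threshold=0.9):
    """Greedily group genes whose profiles correlate with a cluster's first gene.

    matrix maps gene name -> list of expression levels (one per sample/condition).
    """
    clusters = []
    for gene, profile in matrix.items():
        for cluster in clusters:
            if pearson(profile, matrix[cluster[0]]) >= threshold:
                cluster.append(gene)
                break
        else:
            clusters.append([gene])   # no similar cluster found: start a new one
    return clusters


if __name__ == "__main__":
    # Rows are genes, columns are four hypothetical conditions.
    expression_matrix = {
        "g1": [1.0, 2.0, 3.0, 4.0],
        "g2": [2.0, 4.0, 6.0, 8.5],
        "g3": [4.0, 3.0, 2.0, 1.0],
        "g4": [8.0, 6.0, 4.5, 2.0],
    }
    print(cluster_genes(expression_matrix))
```

Here g1 and g2 rise together across conditions while g3 and g4 fall together, so two candidate co-regulated groups emerge; hierarchical or k-means clustering would be the usual choices in practice.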
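To make the notion of a bicluster concrete, the sketch below scores a chosen subset of rows and columns with the mean squared residue, a coherence measure commonly used in the biclustering literature (low values indicate that the selected genes vary in parallel across the selected conditions). Using this particular score is my assumption for illustration; the text does not single out any scoring function, and the matrix values are invented.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix defined by the row and column indices.

    A perfectly additive pattern (value = row effect + column effect) scores 0.
    """
    sub = [[matrix[r][c] for c in cols] for r in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows
    total = sum(
        (sub[i][j] - row_means[i] - col_means[j] + overall_mean) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return total / (n_rows * n_cols)


if __name__ == "__main__":
    # Rows are genes, columns are conditions (hypothetical values).
    expression = [
        [1.0, 2.0, 9.0, 3.0],
        [2.0, 3.0, 1.0, 4.0],
        [4.0, 5.0, 7.0, 6.0],
        [9.0, 1.0, 4.0, 2.0],
    ]
    whole = mean_squared_residue(expression, [0, 1, 2, 3], [0, 1, 2, 3])
    candidate = mean_squared_residue(expression, [0, 1, 2], [0, 1, 3])
    print(f"whole matrix: {whole:.3f}, candidate bicluster: {candidate:.3f}")
```

The candidate bicluster (genes 0 to 2 over conditions 0, 1, and 3) is internally coherent even though the full matrix is not, which is the kind of local pattern that row-only or column-only clustering can miss.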
Effectively, gene regulation matrices are postulated to reveal the structure of the underlying genetic network. These matrices can be utilized to figure out how genes act jointly to control transcription and bring about specific phenotypic characteristics, in health and disease, bearing in mind that the regulatory network is also influenced by extracellular environmental signals, as is discussed in the next section. Gene networks are commonly deciphered in combination with figures (cf., Fig. 2) to illustrate the generally intricate interrelations between the network elements. Because of their complexity, such networks are not always easy to grasp. Since the 1960s, approaches from mathematics and physics have been applied to characterize and simulate small gene networks quite rigorously [140, 141]. Comprehensive analytical models of large genetic networks could transform investigation and understanding of complex cardiac diseases, such as the genetic cardiomyopathies. However, such models are not yet within our reach, because large genetic systems involving hundreds/thousands of regulatory genes are enormously complicated to model and, furthermore, because experimental data for large genetic systems are inadequate.
High-throughput technologies encompassing NGS technologies (discussed earlier), bioinformatics methods, and systems biology approaches allow probing aspects of gene regulatory networks on a genome-wide scale. Such high-throughput technologies and molecular biological methods nowadays make it possible to study in parallel great numbers of genes and proteins, expediting the study of sizeable gene networks [142, 143, 144]. This is empowering us to deal with multipart gene networks more efficiently and promises to gradually lead to more effective, comprehensive understanding of multifactorial and likely polygenic heart diseases, such as genetic DCM and HCM. For instance, one could use a gene expression data set to assemble a corresponding gene network model that is consistent with the data. Discrepancies between simulated data produced using this model and new experimental/clinical data, not used to construct the model, would then reveal model deficiencies. The discrepancies could consequently be used to decide on alternative genetic network models, or to upgrade the model. Thus derived understanding could then be translated into effective therapy: identified variant genes might be corrected in stem cells and subsequently given back to the patient, resulting in functional protein products and ameliorating the aberrant/cardiomyopathic phenotype.
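The model-checking loop described above, in which predictions are compared with data not used to build the model, can be sketched as follows. The two candidate models are deliberately trivial linear stand-ins for a regulator acting on a target gene, with coefficients assumed to have been fitted beforehand; all names and numbers are hypothetical.

```python
from math import sqrt


def rmse(model, data):
    """Root-mean-square discrepancy between model predictions and observations."""
    return sqrt(sum((model(x) - y) ** 2 for x, y in data) / len(data))


# Candidate network models: target expression as a function of regulator expression.
# Coefficients are assumed to come from an earlier fit to a separate training set.
candidate_models = {
    "activation": lambda regulator: 1.0 + 2.0 * regulator,
    "repression": lambda regulator: 5.0 - 2.0 * regulator,
}

# New (regulator, target) observations that were NOT used to construct the models.
held_out_data = [(0.5, 2.1), (1.0, 2.9), (1.5, 4.2), (2.0, 4.8)]

errors = {name: rmse(model, held_out_data)
          for name, model in candidate_models.items()}
best_model = min(errors, key=errors.get)

if __name__ == "__main__":
    for name, error in errors.items():
        print(f"{name}: held-out RMSE = {error:.3f}")
    print("model retained:", best_model)
```

A large discrepancy on the new data flags a deficient model, which can then be replaced or refined; realistic gene network models would of course involve many interacting genes rather than a single regulator.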
It should become increasingly clear in the near future that many genetic heart diseases have a more intricate polygenic etiology than is generally thought. Modifier polygenes, whose individual effects on the phenotype are too small to be observed, could, acting in concert with other (non-allelic) genes, strikingly modify various phenotypic traits (cf., discussion of HCM vs. DCM phenotypes with the same “causal” primary mutation; cf., Fig. 4).
Epigenetics and Gene-by-Environment Interactions
Despite the largely unforeseen, perplexing findings mentioned in the section “Lack of Correspondence Between Expressed Phenotypic and Genomic Complexity,” and as suggested by the Epigram, many biomedical scientists worldwide had known for decades that genetics alone, in the sense of Mendelian genetic determinism, is ineffectual in explaining complex developmental, adaptive, and maladaptive outcomes. Additional integrative information management systems with extensive capabilities had to be operational. Already back in 1975, on the heels of the advancements in molecular biology of the 1950s and 1960s, MC King and AC Wilson compared human and chimpanzee macromolecules by various methods (immunology, amino acid differences, and protein electrophoresis). They determined that the extent of protein-encoding gene variation between chimpanzees (Pan troglodytes) and humans was too small (≈1%) to account by itself for the remarkable phenotypic differences between the two species. They argued that it was not structural gene dissimilarities that were responsible for the morphomechanical phenotypic differences, but gene regulation processes altering gene expression, that is, the timing, extent, and manner in which gene products are assembled. Later investigations, employing more modern approaches and techniques, validated this view [146, 147].
There are significant differences in gene expression patterns and it is these divergences that primarily account for the striking phenotypic disparities. It might be speculated that such variances in gene regulation and expression, whether wide or narrow, are of great consequence, in view of the relatively small genomic differences between species as well as between individuals within a species. However, there is no simple mapping from genome to phenotype because of the confounding contributions of multifaceted environmental factors to the ways in which the genetic blueprint may unfold [2, 3, 4, 5]. The genes are highly specific chemically and thus are called into play only under very specific conditions, yet their morphomechanical effects hinge on quantitative influences of their environment and also of immediate or remote products of other genes, which are resultants of all that has gone on before in the organism.
Epigenetics examines morphomechanical attribute variations, pertaining to appearance and functions, which are not caused by changes in the DNA sequence of (nucleo)bases (see Fig. 5). It explores external or environmental factors that switch genes on and off and affect how the genetic blueprints are read and, thus, the specific properties of cells, tissues, organs, and individuals. In the present context, environment is used in a broad Bernardian sense [2, 4, 5], to account for an extensive range of non-genetic factors that can be implicated in the etiology of complex developmental, morphomechanical adaptive/maladaptive changes, and disease. Essentially, epigenetic mechanisms regulate the exceptional genomic plasticity and are fundamental in cardiac adaptations, remodeling, reverse remodeling, and disease [2, 3, 4, 5]. Epigenetic explanations are very powerful because they provide a detailed mechanistic account of how environmental and genetic factors might interact to affect morphomechanical traits and physiological processes such as tissue and organ development, dynamic adjustments, and the onset and evolution of disease.
Epigenetics Forms a Bridge Between Genotype and the Exhibited Phenotypes
Different genotypes can be expected to show varying transcriptional responses to identical environments and the same genotype to show varying transcriptional responses to different environments. Viewed through this prism, the effects of the environment and genetic factors are not mutually exclusive but are operating complementarily, and epigenetics forms a bridge between the genotype and the exhibited normal and abnormal/disease phenotypes (see Fig. 5). As new technologies make comprehensive description of gene expression levels more accessible, it will without doubt become essential to study gene expression phenotypes as a function of the applicable environmental context as well as the product of a particular genotype.
Epigenetic factors may involve both monogenic and polygenic mechanisms (cf., Fig. 3). Epigenetics engenders complexity levels beyond gene–gene interactions and encompasses interactions between genes, gene products (proteins) and genes, and between these elements and environmental forces/cues, including past and present experiences and influences upon the organism, and its constituents. For a true understanding of the determinants of gene expression, expression variation must be considered not simply as a product of genetic dissimilarity or of environmental disparities, but as a joint product of genes and the environment.
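The idea of a joint product of genes and environment can be illustrated with a simple difference-of-differences calculation on a two-genotype, two-environment design. The sketch below is only illustrative: the genotype and environment labels and all expression values are invented, and the additive decomposition is an assumed simplification.

```python
# Hypothetical mean expression levels (arbitrary units) of one gene,
# indexed by (genotype, environment); all values are invented for illustration.
EXPRESSION = {
    ("reference", "baseline"): 1.0,
    ("reference", "loaded"): 1.5,
    ("variant", "baseline"): 1.2,
    ("variant", "loaded"): 3.0,
}


def environment_effect(genotype):
    """Change in expression produced by the environment within one genotype."""
    return EXPRESSION[(genotype, "loaded")] - EXPRESSION[(genotype, "baseline")]


def genotype_effect(environment):
    """Change in expression produced by the genotype within one environment."""
    return EXPRESSION[("variant", environment)] - EXPRESSION[("reference", environment)]


def gxe_interaction():
    """Difference of differences: zero only if the two effects are purely additive."""
    return environment_effect("variant") - environment_effect("reference")


if __name__ == "__main__":
    print("environment effect, reference genotype:", round(environment_effect("reference"), 2))
    print("environment effect, variant genotype:", round(environment_effect("variant"), 2))
    print("GxE interaction term:", round(gxe_interaction(), 2))
```

In these invented data the same environmental change raises expression far more in the variant genotype than in the reference genotype, so neither the genotype nor the environment alone accounts for the observed variation; a nonzero interaction term is what the GxE designation refers to.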
Examination and understanding of so-called gene-by-environment, or GxE, interactions [2, 3, 4, 5] can allow for identification of unappreciated linkages between genetic and environmental risks and various triggering or aggravating factors. Moreover, it can allow for an increased understanding of the biology of important multifactorial maladies and may impact risk stratification for targeted health screening and prevention efforts. In a series of recent publications [2, 3, 4, 5], I have outlined the role of mechanical stresses, or of a lack thereof, as important epigenetic factors in cardiovascular development, adaptations, and disease. In a recently published two-article series [4, 5], I have proposed that variable forces associated with diastolic RV/LV rotatory intraventricular flows can exert physiologically and clinically important, albeit still unappreciated, epigenetic actions influencing functional and morphological cardiac adaptations and/or maladaptations. Taken in toto, this two-part investigation formulates a new paradigm in which intraventricular diastolic filling vortex–associated forces play a fundamental epigenetic role, and examines how heart cells react to these forces.
Epigenetic mechanisms involve complex, multifactorial interactions and successions of interlinked morphomechanical states resulting from them. Secondary changes in DNA may also contribute to the stabilization of the new state, but they are not encoded by genes; they are elicited by regulatory modifications in DNA methylation patterns, in DNA-binding histone proteins, and in chromatin structure [2, 3, 4, 5]. These adjustments result in transformed transcriptional patterns and, thus, in altered morphomechanical characteristics at all levels of the organism. Because some of these epigenetic modifications may affect DNA passed on through the germline (in eggs and sperm), they can be transmitted from parents to offspring and subsequent generations (Fig. 5). In this circumscribed sense, epigenetics has given new life to Lamarckian theory and the previously discarded idea that characteristics acquired during an individual’s life are heritable.
While innovation can occur as individuals and groups tackle new problems, innovation can also occur as a response to unplanned changes, such as have been ushered in by the post-genomic era. The ancient Greek philosopher Heraclitus once proclaimed that there is nothing permanent except change. Advances in genome sequencing and research have come to be exponential in recent years, leading to a precipitous and powerful augmentation of our knowledge of intricate genetic foundations of many cardiovascular diseases. It is becoming quite challenging for practitioners and investigators alike to keep abreast of the clinically ever more accessible and affordable contemporary diagnostic developments in cardiovascular genetics. Technical advances in NGS with advanced laboratory tools, databases, and analytical software threaten to overtake the ability to comprehend and dependably interpret, in the clinical or laboratory setting, the implications and ramifications of new research findings.
Future studies, integrating the current wealth of genome-wide technologies, are poised to bring us clinically advantageous information on whether and how multiple variable factors collaborate to modify human disease and disease risk. Implementing such genomic findings in cardiology practice may well lead rapidly to better diagnosis and management of common cardiovascular conditions, such as the genetic cardiomyopathies and heart failure. The adoption of new methods can admittedly be unjustifiably slow at times. However, the utility of whole-genome sequencing of personal genomes in clinical practice will rise sharply as more and more genomic variations linked to clinically significant disease phenotypes are identified and corroborated, making it cost-effective in the process to diagnose multiple diseases in tandem. Cardiovascular specialists are still a long way from submitting their patients’ full genomes for sequencing, not because the price is prohibitive, but because the data are difficult to interpret properly. This survey has aimed at addressing the need for fundamental understanding in this burgeoning field, which is desirable in order to provide high-grade, personalized care for patients and their families in the post-genomic era.
Compliance with Ethical Standards
Sources of Funding
Research support, for work from my Laboratory surveyed here, was provided by National Heart, Lung, and Blood Institute, Grant R01 HL 050446; National Science Foundation, Grant CDR 8622201; and North Carolina Supercomputing Center and Cray Research.
Conflict of interest
I declare that I have no conflict of interest, whatsoever.
All procedures performed in studies involving human participants that are reviewed here were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments. All applicable international, national, and/or institutional guidelines for the care and use of animals in studies involving animals that are reviewed here were followed.