Journal of Cardiovascular Translational Research, Volume 8, Issue 9, pp 506–527

Linking Genes to Cardiovascular Diseases: Gene Action and Gene–Environment Interactions


DOI: 10.1007/s12265-015-9658-9

Cite this article as:
Pasipoularides, A. J. of Cardiovasc. Trans. Res. (2015) 8: 506. doi:10.1007/s12265-015-9658-9


A unique myocardial characteristic is its ability to grow/remodel in order to adapt; this is determined partly by genes and partly by the environment and the milieu intérieur. In the “post-genomic” era, a need is emerging to elucidate the physiologic functions of myocardial genes, as well as potential adaptive and maladaptive modulations induced by environmental/epigenetic factors. Advances in genome sequencing and analysis have lately become exponential, escalating our knowledge of the sometimes controversial genetic underpinnings of cardiovascular diseases. Current technologies can identify candidate genes variously involved in diverse normal/abnormal morphomechanical phenotypes, and offer insights into multiple genetic factors implicated in complex cardiovascular syndromes. The expression profiles of thousands of genes are regularly ascertained under diverse conditions. Global analyses of gene expression levels are useful for cataloging genes and correlated phenotypes, and for elucidating the role of genes in maladies. Comparative expression of gene networks coupled to complex disorders can contribute insights as to how “modifier genes” influence the expressed phenotypes. Increasingly, a more comprehensive and detailed systematic understanding is emerging of the genetic abnormalities underlying, for example, various genetic cardiomyopathies. Implementing genomic findings in cardiology practice may well lead directly to better diagnostics and therapeutics. A strong appreciation is currently evolving for the value of studying gene anomalies, and of doing so in a cohesive rather than disjointed manner. However, it is challenging for many practitioners and investigators to comprehend, interpret, and utilize the increasingly accessible and affordable clinical cardiovascular genomics studies. This survey addresses the need for fundamental understanding in this vital area.


Genotype and expressed phenotypes; Exons, introns, and alternative splicing; Monogenic and polygenic traits and gene networks; Major gene, “modifier genes,” and pleiotropy; Regulatory DNA “switches” and regulation of gene expression; Gene interactions and epistasis; Genetic cardiomyopathies, HCM, DCM; Environmental influences and epigenetics; Mutations and haplotypes

“With the tools and the knowledge, I could turn a developing snail's egg into an elephant.

It is not so much a matter of chemicals because snails and elephants do not differ that much.

It is a matter of timing the action of genes.”—Barbara McClintock, 1902-92, American geneticist, Nobel laureate.

Quoted (p. 176) in Bruce Wallace: The Search for the Gene [1].


The heart integrates structure and function across multiple spatial scales [2]. Great challenges revolve around developing the means of linking morphomechanical myocardial characteristics across the spatial scales. A unique characteristic of myocardium is its ability to grow and remodel in response to changing environments; this is determined partly by genes and partly by the physical environment [2, 3]. There is an emerging need to ascertain not only the physiological functions of myocardial genes and proteins but also their adaptive/maladaptive modulations by mechanical “environmental factors.” In a series of recent analyses, I have outlined the role of mechanical stresses [2, 3, 4, 5], or a lack thereof, as important epigenetic factors in cardiovascular development, adaptations, and disease.

The essential tasks of cardiac myocytes are the development of contractile forces by the fibrillar sarcomeres and their transmission to the myocardial extracellular matrix (ECM); insufficient accomplishment of either dynamic task initiates cardiac remodeling entailing hypertrophy and dilatation, provokes symptoms, and can lead to heart failure. Accordingly, all cardiac diseases with abnormal growth and remodeling may be in the class of diseases with a mechanical etiology [2]. Different levels of mechanical dynamic stimuli such as right ventricular/left ventricular (RV/LV) pressure, endocardial fluid shear stress, or myocardial deformations may initiate or modulate signal transduction cascades and other cellular processes that underlie cardiac adaptations, remodeling, or transition to pathology; these may be reversible and amenable to treatment. Moreover, it is now evident that systolic and diastolic (filling-related) local mechanical stimuli (compressive, tensile, and shear stresses) are major controllers of growth and remodeling. Cells are capable of sensing mechanical forces and converting them into biological signals via mechanotransduction mechanisms [3, 4, 5]. Therefore, mechanical forces contribute to the regulation at the molecular and cellular level of various processes, such as gene expression, cell growth, proliferation and apoptosis, and synthesis and degradation of ECM components, thus inducing altered morphology, properties, and function [2, 3, 4, 5]. Intriguingly, the influence of the ECM collagenous cardiac skeleton is reflected by the remarkable consistency of the cardiac shape over a lifetime; in fact, each individual cardiomyocyte retains only the extent of mobility permitted by its rigid collagenous suspension via wavy (non-straight) collagen struts within the ECM network [6].

A detailed GLOSSARY is listed in the Electronic (Online) Supplementary Material, and is referred to in the text as “Glossary, ESM.”

The Cardinal Concepts of Genotype and Expressed Phenotype

The central concepts of genotype (Genotypus) and phenotype (Phänotypus) were first introduced in 1909 by Wilhelm Johannsen, a Danish geneticist, in his German-language textbook titled Elemente der exakten Erblichkeitslehre (Elements of an exact theory of heredity) [7]. Today, the genotype is viewed as the descriptor of the genome, which is the set of physical DNA molecules inherited from the parents [2, 3]. The phenotype is the descriptor of the phenome, which encompasses the manifest properties of cells, organs, and the whole organism, i.e., structural (morphology) and functional (physiology) characteristics, including growth patterns, specialized functions, and metabolic, behavioral, and other activities [2, 3]. Analogous considerations apply for constituent cells, tissues, and organs.

Unquestionably, however, there is some inexactness in the assignment of an individual to one genotype, because numerous mutations most likely take place at random in cells throughout development, growth, and life, so that not all cells in the body share one identical genome. Likewise, even cloned genetic copies or monozygotic twins with nearly identical genotypes will diverge from each other in phenotype due to discrepancies in their developmental environments. Therefore, phenotypes are categories that actually possess only a single member.

The Human Genome Project and Next-Generation Sequencing

Because the genes are made of DNA, what produces or influences strongly the observed similarities and differences in structure and function between different individuals should be similarities and differences in DNA. Accordingly, one might reasonably expect that to investigate such phenotypic similarities and differences, we must sequence and compare the corresponding DNA, anticipating distinct “genes for” not only different phenotypic traits but also many heritable diseases, including cardiovascular problems. After all, the evidence for the “heritability” of a trait comes from its stronger resemblance among close relatives than unrelated individuals.

Base pairs that form between specific (nucleo) bases are the building blocks of the DNA double helix (see Supplementary Table 1; cf., Fig. 5) and may be construed as the teeth of a zipper, or the rungs of the polynucleotide DNA spiral staircase, contributing to its folded structure. The helical strands are held together by hydrogen bonds between cytosine and guanine and between thymine and adenine residues. The aggregate size of the human genome is about 3 billion base pairs, arrayed in 23 (haploid number) chromosomes; the chromosomes themselves range in size from about 250 megabases (Mb) for chromosome 1 to about 50 Mb for chromosome 21 [8]. The highly anticipated draft sequence and initial analyses of the human genome assemblies were published by the Human Genome Project (HGP) in February 2001 [9, 10, 11], with the layout of the entire genome’s 3 billion DNA base pairs some 90–95 % covered at an accuracy of 99.9 %. A perplexing conclusion was that the number of human protein-coding genes, at <30,000, was significantly smaller than previous estimates, which had reached as many as 140,000 genes. The current estimate is only 20,000–25,000 genes [12].
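The pairing rule just described (cytosine with guanine, thymine with adenine) can be sketched in a few lines of Python. This is only an illustration: the function names and the six-base example sequence are hypothetical and do not come from the article or its supplementary material.

```python
# Pairing rules as stated in the text: C pairs with G, and T pairs with A.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a DNA strand, in the same order."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in the opposite direction."""
    return complement(strand)[::-1]


example = "ATGCGT"  # hypothetical sequence, for illustration only
print(complement(example))          # TACGCA
print(reverse_complement(example))  # ACGCAT
```

Each base on one strand fixes the base opposite it, which is why sequencing a single strand suffices to know both.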

Next-generation sequencing (NGS) [5, 13] combines new, low-cost, rapid DNA sequencing machines with other advanced laboratory tools, databases, and analytical software, and takes advantage of huge increases in computer processing speeds. As a result, determining the specific order of nucleotides (viz., nucleosides and bases; see Supplementary Table 1) in a human genome, which would once have taken many years or months, now takes less than one-half hour. Such mind-boggling advances should yield clinically invaluable definitive information on precisely how the genes “determine” morphomechanical characteristics of tissues and organs, and on the correlation of inherited changes in DNA sequence (genetic variants, or mutations) to various diseases, including cardiovascular anomalies (cf., Table 1). They are already changing the classic practice of clinical cardiology in several ways, increasing awareness of inheritance of defective genes and their impact on cardiovascular health and anomalies, and providing newfound cardiogenomic diagnostic, prognostic, and therapeutic tools. Since the genetic makeup can shape the outcome of a pharmacological intervention (pharmacogenetics), it can be checked appropriately so that pertinent findings hopefully may guide therapy [14, 15].
Table 1. Genes commonly implicated in HCM and DCM, in descending order of frequency

% of HCM associated with mutation of this gene | Protein name
40 % |
40 % | Myosin-binding protein C, cardiac type
5 % | Troponin T, cardiac muscle
5 % | Troponin I, cardiac muscle
2 % | Tropomyosin alpha-1 chain

% of DCM associated with mutation of this gene | Protein name
20 % |
6 % |
4.2 % |
3 %–4 % |
2 %–4 % | Myosin-binding protein C, cardiac type

Table prepared using data from pertinent chapters of GeneReviews [Pagon RA, Adam MP, Ardinger HH, et al., editors. GeneReviews® [Internet]. Seattle (WA): University of Washington, Seattle; 1993–2015]

In the past, genetic evaluation for conditions such as genetic cardiomyopathies was performed by sequential screening of a very limited number of genes. NGS has increased the throughput and minimized cost, enabling simultaneous screening of large assortments of genes for multiple patients in a single sequencing run. Thanks to continuing advances in accuracy, throughput, pricing, and simplification of data analysis, it has now become practicable to apply NGS to clinical molecular diagnostics of inherited cardiac conditions [16]. Once disease-associated modifications in a primary variant gene’s DNA polynucleotide sequence are identified, it may become easier to resolve how the structure of the resultant polypeptide (and thence, protein/enzyme) gene product changes in a manner impacting its biological function [17]; this has obvious therapeutic implications.

A Gene Is Not a Continuous DNA Segment: Exons, Introns, and “Alternative Splicing”

DNA coding for a protein generally comprises separate sections of DNA, near one another in the genome, but with interposed lengths of non-coding DNA separating them. The coding portions of DNA are termed exons, and the lengths that disconnect them introns. On average, there are 8.8 exons and 7.8 introns per gene in the human genome [18]. The entire DNA sequence is transcribed into pre-messenger RNA (pre-mRNA), but ensuing revising processes generally remove the introns, leaving only the exons to determine the polypeptide/protein amino acid sequence. Consequently, both exons and introns, as well as assorted other regulatory DNA sequences upstream and downstream of the exon/intron set of a gene, are implicated in the actual manufacture of the RNA that guides the assembly of the protein product. Accordingly, genes should be envisioned as complex, spatially discontinuous, composite—rather than unitary—objects. This is akin to the idea that a cell culture is an object. Introns were discovered by observing that the mRNA that coded for proteins was almost always shorter than the DNA from which it had been transcribed. They are removed by splicing enzymes in the intranuclear spliceosomes before mRNA, ribosomal RNA (rRNA), and transfer RNA (tRNA) can carry out their functions in the cytoplasmic ribosomes [19].

Exons, the coding sequences of a gene, and only some interspersed introns are generally incorporated into mRNA; in humans, reliable cases of intron retention have been reported for <5 % of genes [20]. However, the RNA portions produced by a set of exons may be combined together in multiple ways, a phenomenon labeled as alternative splicing (AS) [21]. AS allows one set of exons to code for numerous distinct proteins (see Fig. 1); even entire exons can be skipped in the middle of the transcript (exon skipping of “cassette” exons), resulting in a different protein “isoform.” Changes in splicing, even without changes in overall gene expression, may thus have important phenotypic effects. Self-evidently, AS can play a major role in strikingly expanding the potential informational content of eukaryotic genomes. Recent estimates indicate that around 95 % of human genes undergo alternative splicing [22] and that chromatin and histone modifications (see section “Epigenetics and Gene-by-Environment Interactions.”) can regulate alternative splicing [23]. Many splicing regulators have tissue-specific expression patterns, resulting in widespread differences in AS patterns across different tissues. In view of alternative splicing, it follows that each gene does not, strictly speaking, encode a specific protein, since RNAs transcribed from the same gene can be rearranged to produce different protein isoforms, having dissimilar or even antithetical actions [24].
Fig. 1

Alternative splicing (AS) comprises mechanisms that enable a single gene to splice its mRNA transcripts in multiple ways, thus creating an assortment of dissimilarly spliced mRNA species, polypeptides, and proteins. Alternative splicing can allow one gene to generate different proteins (see protein isoforms discussion in text) in diverse tissues. Consequently, diversity encoded within the genome and organismal complexity can increase greatly without increasing the genome size. The human genome contains around 20,000–25,000 genes, substantially fewer than the 90,000 distinct proteins that are estimated to be assembled. Constitutive splicing (depicted in the top panel) means that all exons are joined together in the order in which they occur in the precursor mRNA (pre-mRNA) that is synthesized from a DNA template in the nucleus by transcription. Alternative splicing also occurs in the nucleus and entails various mechanisms that can alter the resulting mRNA products in several ways (exemplified in the lower panels) including exon skipping, mutually exclusive exons, and the use of alternative splice donor or acceptor sites. Certain such alterations in gene expression can result in disease states
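The difference between constitutive splicing and exon skipping can be made concrete with a toy sketch in Python. The segment labels and RNA sequences below are hypothetical, chosen only for illustration; real splicing depends on splice-site recognition by the spliceosome and on regulatory factors that this sketch does not model, and mutually exclusive exons and alternative donor/acceptor sites are not covered.

```python
# Hypothetical pre-mRNA as an ordered list of (kind, label, sequence) segments.
PRE_MRNA = [
    ("exon", "E1", "AUGGCU"),
    ("intron", "I1", "GUAUAG"),
    ("exon", "E2", "GAACCU"),
    ("intron", "I2", "GUCCAG"),
    ("exon", "E3", "UUCUAA"),
]


def splice(pre_mrna, skip=()):
    """Join exons in their original order, dropping all introns
    and any exons whose labels are listed in `skip`."""
    return "".join(
        seq for kind, label, seq in pre_mrna
        if kind == "exon" and label not in skip
    )


constitutive = splice(PRE_MRNA)          # all exons retained
exon_skipped = splice(PRE_MRNA, {"E2"})  # cassette exon E2 skipped

print(constitutive)  # AUGGCUGAACCUUUCUAA
print(exon_skipped)  # AUGGCUUUCUAA
```

The same pre-mRNA thus yields two different mature transcripts, and hence potentially two protein isoforms, without any change in the underlying DNA.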

Down syndrome (DS), caused by trisomy 21 (extra copy of chromosome 21), is the most common birth defect and is associated with intellectual disability, endocarditis, and heart defects that may require surgery. The gene DSCAM has been identified in the DS critical region; a homolog (Glossary, ESM) of the DSCAM protein in fruit flies (Drosophila melanogaster) has 38,016 isoforms arising from alternative splicing of exon clusters [25, 26], potentially altering gene function—the italicized symbol refers to the gene that encodes for the non-italicized protein symbol. A few regions of the large pre-mRNA molecule that undergoes alternative splicing remain in all versions of the protein, but this RNA also has four blocks containing multiple exons from which it chooses one version of each. It is akin to assembling clothing out of an apparel collection that provides only 1 type of hat, but 12 kinds of socks, 33 shirts, 48 varieties of trousers, and 2 different shoe types. Altogether, these items could be combined in (1 × 12 × 33 × 48 × 2) different ways to create 38,016 possible ensembles. Alternative splicing is particularly prevalent in genes encoding contractile proteins and in other genes expressed in muscle cells, including cardiomyocytes [27, 28, 29].
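The isoform count in the apparel analogy is a straightforward product of the number of choices in each block. A minimal check in Python, using only the figures quoted above (the variable names are illustrative):

```python
from math import prod

# Choices per block, as given in the analogy: one constant item
# plus four variable blocks offering 12, 33, 48, and 2 alternatives.
options_per_block = [1, 12, 33, 48, 2]

# One alternative is chosen independently from each block,
# so the number of distinct combinations is the product.
total_isoforms = prod(options_per_block)
print(total_isoforms)  # 38016
```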

The Riddle of “Junk DNA” and Regulatory DNA “Switches”

Of the 3 billion DNA base pairs of the human genome, over 98 % lie between our 20,000–25,000 genes and were, until recently, regarded as junk DNA with no protein-coding functions. This implies that many DNA segments must contain information used in coding for more than just one protein, through the mechanism of alternative RNA splicing [4, 5] described in the previous section. The human genome is a highly structured RNA-producing machine. The junk DNA hypothesis is being discounted decisively as more and more of the non-coding DNA is found to be transcribed into functional RNA molecules, albeit with functions not as yet characterized comprehensively [30]. The current understanding of non-coding RNA (ncRNA) transcripts remains very incomplete due to enormous difficulties in genome-wide ncRNA gene mining. Synonyms for ncRNA are non-messenger RNA (nmRNA), non-protein-coding RNA (npcRNA), and functional RNA (fRNA). Beyond transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs), ncRNAs are classified as introns; short, long, antisense, and small interfering RNA; microRNA; and pseudogenes, and into many other functional classifications.

Some sequences in the junk DNA act as genetic switches, which determine where and when genes partaking in genetic networks get expressed; without these switches, called regulatory DNA, genes are inert. Regulatory DNA switches control human genes and, thus, how the body makes different kinds of cells, tissues, and organs, how it adapts to changing conditions, and how normal gene circuitry gets rewired in disease; in therapeutics, ncRNAs are emerging as a substantial component of the advancing field of cardiac regenerative medicine [31, 32]. The switches (DNA sequences) for controlling genes are organized throughout the genome, and combinations of different switches function together to control genes, often across great distances along the genome [33]. There is a large diversity of transcribed RNA molecules that play crucial roles in regulating gene expression or in guiding RNA modifications. This implies that the non-coding DNA influences the behavior of the protein-encoding genes, the “coding DNA.” In this way, non-coding polynucleotide sequences can shape, more or less profoundly, expressed phenotypic traits [34, 35]. Small non-coding microRNAs (miRNAs) are present not only in the cytoplasm but also in several extracellular compartments, including circulating plasma; consequently, miRNAs are useful as diagnostic biomarkers for an increasing number of diseases [36]. Recent research has shown that long ncRNA (lncRNA) drives the gene expression changes underlying heart failure, wherein certain fetal genes are upregulated while adult genes become downregulated, suggesting a therapeutic potential of lncRNA in heart failure management [37].

Only a small fraction of regulatory DNA regions is active in any given cell type; this fraction is almost totally unique to each type of cell and becomes a sort of molecular Zebra barcode label of the cell’s identity. It seems that the naïve dualistic classification into coding versus non-coding transcripts (ncRNA) is rather unworkable with respect to the subtlety and complexity of genome regulation. It is vital to always remember that cells in organs and tissues, including myocardial cells (endocardium, cardiomyocytes, and fibroblasts), do not differ in genes present but only in the differential gene expression of alternatively spliced mRNA transcripts.

Lack of Correspondence Between Expressed Phenotypic and Genomic Complexity

The Human Genome Project (HGP) characterized additionally the entire genomes of several other organisms employed widely in biomedical research, such as mice, rats, fruit flies, and roundworms. Standard gene names and symbols can be found in databases that are specific to particular organisms (e.g., human, mice, rats, flies, and worms). The parallel sequencing efforts have acted synergistically, because most organisms have numerous “homologous” genes with similar function, reflecting their shared ancestry. Homologous genes and proteins that are faithfully preserved and may not have changed over millions of years of evolution could constitute essential integrants of life and attest to evolutionary conservation of core processes [9]. The prime surprise of such parallel investigations was that a fertile human egg could produce an organism so very different from that of a mouse (Mus musculus) egg, whereas the human has merely 300 distinctive genes not found in the mouse! It appears that the increased intricacy in the mechanisms that regulate gene expression is an important reason why humans are more complex than “lowly” creatures, such as worms and flies, and lower mammals, such as mice [2]. Importantly, depending in part on “environmental” epigenetic regulating factors [2, 3, 4, 5], human genes can give rise to multiple related proteins, each potentially capable of performing a different function in our bodies.

Accompanying such unexpected findings has been an uneasiness about the lack of correspondence between genomic and phenotypic complexity. Implicit in this is a non-linear, non-reductionist framework of disease, not necessarily relying on relatively simple causal explanations rooted in Mendelian gene mutation. This well-conceived framework perceives disease as an outcome of complex, multifactorial abnormalities at the manifold-scale levels of organs, tissues, and cells. Involved are integrated systems whose function depends definitively on their morphomechanical organization, their mutual interrelations and interdependencies, and on interactions among them and with their “environment.” This view does not negate the value of analyses at reduced system levels (molecules, genes, etc.) so as to understand how the interacting components work. However, at each higher level, important morphomechanical characteristics may emerge (in the sense of shifting from a billiard-ball causality to an interacting network complexity with increasing capacity for autonomous change), which could not have necessarily been predicted from just a knowledge of lower-level components.

The International Haplotype Map Project: Common Maladies Linked to Genotype

The International Haplotype Map Project (HapMap) has shifted attention from the single gene to the whole genome, by cataloging/mapping details of genetic similarities and differences in human beings. NGS can sequence human genomes and transcriptomes and explore gene expression changes on a genome-wide scale and with single-base precision [5]. Genome-wide association studies (GWAS) identify loci of single-nucleotide polymorphisms (SNPs, pronounced “snips”) across the genome and correlate them with disease susceptibility [38]. Since the initial dissemination of the haplotype map (HapMap) of the human genome in 2005 [39], a considerable number of GWAS have been conducted to establish genetic determinants of complex, multifactorial diseases, including cardiovascular disorders, and individual responses to therapeutic agents.

A haplotype is a grouping of DNA variations, or a series of polymorphisms, that are close together in the genome and tend to be inherited jointly; it can refer to a combination of alleles (Glossary, ESM) or to a set of SNPs acquired on the same chromosome—new mutations create new haplotypes. Use of the HapMap in association studies can reveal links between common disorders and genotype; genotype is one of three factors that determine phenotype, the other two being epigenetics and non-inherited environmental factors. The haplotypes in individuals with a specific disease are compared to those of a comparable (control) group without it. If a particular haplotype occurs with higher frequency in affected individuals compared to controls, gene(s) influencing the disease may be situated within or near that haplotype.
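The case-control comparison described above can be sketched as a 2 × 2 table of haplotype counts. The Python example below computes an odds ratio and a Pearson chi-square statistic (without continuity correction); the counts are hypothetical and chosen only for illustration, and the function name is not drawn from any HapMap tool. Actual association studies additionally correct for multiple testing, population structure, and other confounders, none of which is modeled here.

```python
def haplotype_association(case_with, case_without, ctrl_with, ctrl_without):
    """Return (odds_ratio, chi_square) for a 2 x 2 haplotype-by-status table.

    Assumes every cell count is greater than zero.
    """
    a, b, c, d = case_with, case_without, ctrl_with, ctrl_without
    n = a + b + c + d
    # Odds of carrying the haplotype among cases relative to controls.
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2 x 2 table, 1 degree of freedom.
    chi_square = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    return odds_ratio, chi_square


# Hypothetical counts: 200 case chromosomes and 200 control chromosomes.
odds_ratio, chi_square = haplotype_association(60, 140, 30, 170)
print(f"odds ratio = {odds_ratio:.2f}, chi-square = {chi_square:.2f}")
```

An odds ratio above 1 together with a chi-square value exceeding 3.84 (the 5 % critical value for one degree of freedom) would suggest that the haplotype is over-represented in affected individuals, pointing to a nearby gene as a candidate.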

Use of the HapMap should result in advances, such as treatments adapted to a patient’s genetic makeup to maximize effectiveness while minimizing side effects. It should be recognized, however, that genetic variation entailing multiple variants (alleles) of any given gene in the population (polymorphism) probably has a decidedly graded effect on gene expression because, for any gene, many factors participate in regulating its expression; genetic variability in any one of such genes is apt to bring about a change in expression [40]. Moreover, at any genomic locus, there are different alleles or variants that can exist across a single haplotype cluster or in several haplotype clusters nearby. Accordingly, there is no single haplotype for disease risk but, more accurately, a collection of haplotypes that confer a graded risk of the disease. Instead of being thought of as independent hereditary units, genes, expressions of DNA, are to be conceived as parts of biochemical/molecular systems within cells, including those of the myocardium. Intense work on these systems explores the functional interdependence of DNA and molecules of other kinds, and how diverse molecules may function together and operate in ways whereby they both regulate and are regulated by each other and by physical forces in their environment [4, 5].

Distributed Gene Networks

As the facts discovered through hypothesis-driven and big data project research [41] in genetics continue to grow, so does the need for advanced tools to aid in visualization and analysis [42, 43]. Consequently, software platforms for visualizing gene interaction networks and for integrating these interactions with gene expression profiles (such as the Genes2Networks, the FunCoup, and the Cytoscape freeware programs and associated plug-ins) become indispensable. They can help identify the top disease gene candidate(s) from “seed” lists of generally accepted human gene symbols; their output includes dynamic three-color network maps, with statistical analysis reports [43, 44, 45, 46, 47]. The plug-ins allow users to alter modes of network visualization, add layouts, add novel functionality (for instance, network graph animation), and search for pathways through a network. Powerful extensions are obtainable from the Cytoscape web site, where new releases, documentation, and tutorials are available.

A gene system with associated subsystems is best conceptualized as forming a complex distributed gene regulatory network (GRN) [48, 49], as is exemplified in Fig. 2. The nodes represent genes and transcription factors, and the edges connecting them the direct or indirect interactions among them that regulate gene expression [50]. These interactions can be directional or bidirectional and can induce activation or inhibition; thus, an arrow entering a gene signifies that the gene is regulated either positively or negatively by the gene originating the arrow. Nodes with several links are named hubs. Nodes and edges jointly form the network topology and normal or abnormal gene network function involves dynamic processes unfolding within this topology (see Fig. 2). Having identified the constituents of the system, a large assortment of techniques can be employed on a gene-by-gene basis, frequently in a hypothesis-driven manner, to determine which components interact, how and, where possible, in what order [51].
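The network vocabulary used here (nodes, directed edges, activation or inhibition, hubs) maps naturally onto a small data structure. The Python sketch below is a toy illustration: the node names, the edges, and the threshold of three links for calling a node a hub are all assumptions made for the example, not data from FunCoup or any other resource cited in the text.

```python
from collections import Counter

# Hypothetical regulatory edges: (regulator, target, effect).
EDGES = [
    ("TF1", "geneA", "activates"),
    ("TF1", "geneB", "activates"),
    ("TF1", "geneC", "inhibits"),
    ("geneA", "geneB", "inhibits"),
    ("geneB", "TF1", "activates"),  # with TF1 -> geneB, a bidirectional link
    ("geneC", "geneD", "activates"),
]


def degrees(edges):
    """Count the links (incoming plus outgoing) touching each node."""
    counts = Counter()
    for regulator, target, _effect in edges:
        counts[regulator] += 1
        counts[target] += 1
    return counts


def hubs(edges, min_links=3):
    """Return nodes with at least `min_links` links (threshold is an assumption)."""
    return sorted(node for node, k in degrees(edges).items() if k >= min_links)


def regulators_of(edges, gene):
    """List (regulator, effect) pairs for arrows entering `gene`."""
    return sorted((r, e) for r, t, e in edges if t == gene)


print(hubs(EDGES))                    # ['TF1', 'geneB']
print(regulators_of(EDGES, "geneB"))  # [('TF1', 'activates'), ('geneA', 'inhibits')]
```

Storing the sign of each edge alongside its direction keeps the distinction between positive and negative regulation available to any later dynamic analysis.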
Fig. 2

A static gene-to-gene interaction network summary map around five selected query genes (MYH7, MYBPC3, TNNT2, TNNI3, TPM1) implicated in genetic HCM—cf., Table 1 and discussion in text. The query genes are highlighted in yellow; the network displayed includes only the strongest links between the non-query subnetwork genes. Prepared using FunCoup, v. 3.0 build 2014-02; the name stands for functional coupling. FunCoup, developed by the Science for Life Laboratory of the Karolinska Institutet of Sweden, is a framework to infer genome-wide functional couplings in 11 model organisms, including the human. Functional coupling is an unspecific form of association that encompasses direct physical interaction but also more general types of direct or indirect interaction like regulatory interaction or participation in a normal or disease process (Color figure online)

Rather than operating along linear pathways, the collective complexity of connections among genes is better expressed as a distributed network. Some genes can cause greater abnormalities than others when mutated, but this hinges strongly on the context of other alleles that are present. The multipart connectedness of interactive gene ensembles demands a comprehensive systems approach: a versatile understanding of the gene system/subsystem that can refer to the wider network plan and simultaneously appraise the intricate interconnections among its elements. It calls for network thinking [2]. It is always important to remember too that cells, tissues, and organs are not static entities, and that it is consequently necessary to ascertain dynamic network changes reflecting both the temporally and spatially variable nature of the interconnected gene assemblies. Comparative analysis of the underlying gene networks under different operating conditions may provide invaluable pathophysiologic understanding when investigating disease-specific genetic alterations underlying characteristic morphomechanical phenotypic differences, e.g., in various cardiac abnormalities. Such dynamic studies of the implicated genetic networks should provide more accurate, comprehensive understanding regarding the roles of genes in diseases such as hypertrophic (HCM) and dilated (DCM) genetic cardiomyopathies.

Gene Interaction Networks: Degeneracy Underlies Genomic Stability and Plasticity

Human (ENCODE) and model organism (modENCODE) genome decoding projects and the human HapMap Project profiling genetic variants provide enormous quantities of genomic data from a diverse assortment of organisms and environments [5, 52]. They supply a “parts list” of genes for a given (normal or abnormal) genome of interest. However, to understand the complexity encoded within it, knowledge is required concerning which of these parts function together and how. Rather than running in linear pathways, as modeled by the classic “one gene–one enzyme hypothesis” of Beadle and Tatum [53], which depicts each gene in sequence as responsible for producing a single enzyme affecting a single step in the pathway, the increasing complexity of relationships among genes is better described as a distributed network of interconnected elements (cf., Fig. 2).

Networks do not function in the same way as simple pathways; their elements can assume new roles as conditions change or other genes suffer mutations/damage. To do so, linear pathways would require inefficient redundancies; gene networks do not, because they are intrinsically more versatile, adjustable, and adaptable. Network tweaking can occur after perturbations, when gene networks exploit alternate strategies to preserve the output. The mechanisms used to arrive at that output are nonetheless different: They exemplify degeneracy, namely, using multiple pathway combinations to achieve functional plasticity. Degeneracy is a key organizational feature of our genetic code. Not to be confused with medical jargon, degeneracy in network systems analyses refers to the capacity of alternative components (viz., gene groupings), by adjusting the expression of different genes, to produce a similar phenotypic outcome in one context and dissimilar ones in other contexts [54].
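A toy model may help convey how degeneracy preserves output after a perturbation. In the Python sketch below, a hypothetical network offers two alternative routes from a signal to an output; knocking out a gene on one route leaves the output reachable through the other. The node names and topology are invented for illustration, and simple reachability is of course a drastic simplification of real gene regulation.

```python
from collections import deque

# Hypothetical network: two alternative routes lead from "signal" to "output".
NETWORK = {
    "signal": ["geneA", "geneB"],
    "geneA": ["geneC"],
    "geneB": ["geneD"],
    "geneC": ["output"],
    "geneD": ["output"],
    "output": [],
}


def output_reached(network, knocked_out=frozenset(), start="signal", goal="output"):
    """Breadth-first search from start to goal that ignores knocked-out genes."""
    if start in knocked_out or goal in knocked_out:
        return False
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in network.get(node, []):
            if nxt not in knocked_out and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


print(output_reached(NETWORK))                      # True
print(output_reached(NETWORK, {"geneA"}))           # True: route via geneB and geneD
print(output_reached(NETWORK, {"geneA", "geneD"}))  # False: both routes blocked
```

Note that the single knockout of geneA produces no visible change in output, which illustrates why the absence of a phenotype after a knockout does not show that the gene plays no role.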

Having structural arrangements that can selectively, but not exclusively, yield the same output under one condition, and dissimilar outputs in different conditions, improves robustness to perturbation and increases the flexibility of the cell, organ, or organism to changing environments or to gene mutation and damage (“knockout” experiments, [55, 56]). Such considerations emphasize the need for caution in using gene knockouts to attribute phenotypic effects to genes. Appropriately, attention is now shifting to understanding the difficult problem of how normal or disease genes and gene interaction networks, which link diseases to causal and contributing mutated genes, interact cooperatively in a multipart distributed dynamic system [57].

The guiding assumption in experimental interventions employing mutant data is that mutants can reveal the key constituents in a process and that they can illuminate causative mechanisms. The established approach starts off with a particular aberrant phenotype, e.g., the congenital heart defect–associated dysmorphic Noonan syndrome, or HCM, or hypercholesterolemia or other phenotypic abnormalities, for which a genetic screen is applied to identify anomalous genes implicated. Known disease genes are then used to build a network around these genes to identify new genes/nodes that could be additional disease genes affecting syndrome severity [58]. Note, however, that given that the functional relationships among genes can shift with each genetic perturbation, ipso facto then, the very workability of extrapolating backward to a normal genetic system from mutant data becomes questionable. Likewise, it is hard to estimate the odds that phenotypic differences arise because of flanking sequences rather than the targeted mutant allele, especially as the phenotype(s) being studied are commonly selected with a bias toward identifying or confirming a known variant gene role.

Gene Mechanisms Operative in Determining Phenotype

Ever since Mendel’s classic work, geneticists have sought to both describe distributions of phenotypic attributes and explain them in terms of independent causative genetic antecedents or “genes.” The developmental theory of preformationism, which was first proposed by Aristotle and became particularly influential during the Scientific Revolution [2, 59, 60, 61, 62], originated the working notion that a gene is defined by the phenotype that it causes. In preformationism, the egg or sperm was essentially understood to contain all the final features/traits of the mature organism. Development consisted of an enlargement of these preformed characteristics (or primordia, the initial buds of embryonic cells from which the body organs/parts develop) into the individual features of the adult organism; i.e., these primordia truly corresponded to the adult features with which they had a simple and direct causal relationship.

Genes, now conceptualized as DNA, are conceived as parts of chemical/molecular systems within cell nuclei, and as blueprints of particular peptides/proteins [63]. However, while molecular geneticists came to understand that in biological systems genes actually code for proteins, it became convenient and seemingly natural to think about preformationist-like genes for the individual traits of an organism/individual and, by extrapolation, also for classical genetic diseases in humans. The idea that genes could be “for” human traits was sustained by the discovery that genes for classical Mendelian medical disorders, such as various anemias (see below), often acted just like the hereditary elements discovered in Mendel’s pea plants. Such patterns of thinking appear to be deeply rooted and influence thinking modes concerning gene–phenotype relations and mutant disease genes in general, even today.

The preformationist concept of “a gene for…” implies concentrated specificity between gene and phenotype. However, the direct, clear-cut, and distinct causal relationship implicit in the concept of “a gene for…” most likely does not apply to complex cardiovascular maladies, such as various inherited cardiomyopathies (e.g., genetic HCM, DCM, restrictive cardiomyopathy, and arrhythmogenic right ventricular cardiomyopathy [64]), coronary artery disease, and heart failure [65]. The causal chain from genes to such maladies is probably long and encompasses many interconnected genes (dozens, hundreds, or even thousands; cf., Fig. 2). Much remains unresolved. However, important observations suggest that many genes influence risk for complex cardiovascular disorders and, considered individually, they are not diagnostically specific in their effect—as I argue in the forthcoming section “Polygenic Phenotype–Engendering Mechanisms.” Accordingly, their impact resembles the many-to-one or the many-to-many action patterns depicted in Fig. 3, rather than a monogenic one-to-one relationship.
Fig. 3

Some gene action patterns operative in determining phenotypic traits. a In a monogenic one-to-one pattern, a single gene determines a particular phenotype or trait. b In a monogenic one-to-many configuration, a single gene determines/affects a number of different phenotypes or traits and each of the latter is determined/influenced by that single gene. This is a manifestation of pleiotropy, or the phenomenon in which a single gene contributes to multiple apparently unrelated phenotypic traits. c In a many-to-one arrangement, a particular phenotype or trait embodies the actions of several genes, which may encompass a primary gene and assorted modifier genes. d In a many-to-many action pattern, each gene determines/influences several phenotypes/traits and each of the latter embodies the actions of several genes

The evidence that the association between individual genes and complex cardiac diseases is commonly non-specific and manifold does not mean that the identification of implicated gene collections is inconsequential. Such findings can begin the important process of elucidating how specific genes interact with each other and with environmental factors/exposures [2, 3, 4, 5] to engender disease, and can provide new targets for treatment [65]. With the full sequence of the human genome in hand, NGS technologies have subsequently reduced the cost of DNA sequencing from approximately US$0.50 per kilobase (kb) to less than US$0.001 per kb and radically increased throughput by sequencing in a massively parallel fashion [66]. They allow differences in long stretches of gene-containing chromosomal DNA taken from many individuals to be searched for and compared extremely efficiently.

There are two principal patterns of gene expression operative in determining phenotype: monogenic and compound polygenic (Fig. 3). When pleiotropy is demonstrated, one gene is responsible for the determination of many traits; with polygenic traits, the expression of a particular phenotype results from interaction of the products of many genes. Composite traits/syndromes are generally polygenic in nature; moreover, the effect of any one gene can be dependent on the presence of one or more “modifier genes” (genetic background), a phenomenon labeled epistasis, affecting the observed phenotype in considerable ways. Importantly, epistatic mutations have different effects in combination than individually. Multiple modifier genes can act on a limited number of primary loci; with increasing numbers of loci, the possible number of different phenotypic outcomes increases exponentially, and this fact creates a continuous distribution for the composite trait.
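
The combinatorial growth described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, biallelic loci in a diploid genome (three possible genotypes per locus) and an equal-effect additive trait score; neither assumption is taken from the studies cited here, and the function names are hypothetical. Under these assumptions the number of multilocus genotype combinations grows as 3^n, while the additive score accumulates around intermediate values, which is how a composite trait can come to look continuously distributed.

```python
from collections import Counter
from itertools import product


def genotype_count(n_loci):
    """Number of multilocus genotype combinations for n biallelic diploid loci."""
    return 3 ** n_loci


def additive_trait_distribution(n_loci):
    """Count genotype combinations per additive score.

    Each locus contributes 0, 1, or 2 copies of a hypothetical risk allele;
    the score is the simple sum over loci (equal effects assumed).
    """
    counts = Counter(sum(g) for g in product((0, 1, 2), repeat=n_loci))
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, "loci:", genotype_count(n), "genotype combinations")
    print(additive_trait_distribution(2))
```

The counts here are unweighted combinations, not population frequencies; real loci differ in allele frequency and effect size, and epistasis would break the simple additivity assumed in the score.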

Monogenic Phenotype–Engendering Pattern

The monogenic pattern identifies a one gene → one trait pathway. The pathway may be influenced by other genes or the environment (gene–gene, GxG, and gene–environment, GxE, interactions), but in some settings, when mutation involves an influential DNA locus, these can become effectively unimportant. Expressed monogenic (monolocus) disorders or Mendelian diseases tend to run in families [67]. Such disorders of cardiovascular import include various anemias, as follows [68, 69]: Mutations in the HBB gene affecting β-globin, a subunit of hemoglobin, give rise to sickle cell anemia and also to beta thalassemia; alpha thalassemia typically results from deletions involving the HBA1 and HBA2 genes that produce the α-globin subunit of hemoglobin. Fanconi anemia (FA) shows considerable genetic heterogeneity but is attributable to mutations mostly in one of three genes, FANCA, FANCC, and FANCG, which disrupt the FA multigene network that repairs DNA damage; rapidly dividing bone marrow cells are particularly affected, resulting in the anemia.

Similarly, a number of large-scale, gene-based SNP association studies have led to the discovery of specific genetic determinants or risk factors conferring susceptibility to common cardiovascular diseases [70], including the CTLA4 gene and type 1 diabetes, the factor V Leiden F5 gene and deep vein thrombosis, the ApoAV gene and hypertriglyceridemia, the PDE4D gene and ischemic stroke, and the LTA gene that has been implicated in the pathogenesis of atherosclerosis and coronary heart disease and myocardial infarction.

Mutations have been identified as the cause of other important cardiovascular monogenic diseases that in the words of one authority “are currently thought to include” [71] various genetic cardiomyopathies, encompassing HCM and DCM; channelopathies, involving long/short QT syndromes and catecholaminergic polymorphic ventricular tachycardia; familial (primary) pulmonary hypertension; Marfan syndrome; and familial hypercholesterolemia. Various kinds of mutations are recognized [72], as follows: They can be located in an exon that contains coding DNA, thus causing an amino acid substitution giving a protein with impaired function (missense mutations). Other mutations create a stop codon, viz., one that does not correspond to any amino acid, signaling a premature stop of translation and yielding a truncated and non-functional polypeptide, or even no RNA and protein (nonsense mutations). When a single base pair is deleted, a shift in the DNA reading frame ensues (frameshift mutation), leading to a different, commonly nonsensical, protein. Additionally, some mutations interfere with the process of splicing (see Fig. 1).
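
The mutation classes just described can be made concrete with a minimal Python sketch. The short coding sequence below is a hypothetical toy example, not a real gene, and the function names are likewise illustrative; only the standard genetic code table is factual. The sketch translates a reference and a mutated sequence and labels the change as silent, missense, nonsense, or frameshift.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical toy coding sequence: ATG GAG AAG TGC TTA TAA
REFERENCE = "ATGGAGAAGTGCTTATAA"


def translate(dna):
    """Translate codon by codon, stopping at the first stop codon.

    An incomplete trailing codon is ignored.
    """
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def classify(reference, mutant):
    """Crude label for the effect of a mutation on the encoded protein.

    Simplification: in-frame insertions/deletions are not given their own
    label and fall through to the protein comparison below.
    """
    if (len(mutant) - len(reference)) % 3 != 0:
        return "frameshift"
    ref_protein, mut_protein = translate(reference), translate(mutant)
    if mut_protein == ref_protein:
        return "silent"
    if len(mut_protein) < len(ref_protein) and ref_protein.startswith(mut_protein):
        return "nonsense"
    return "missense"


if __name__ == "__main__":
    examples = {
        "GAG>GTG substitution": "ATGGTGAAGTGCTTATAA",
        "AAG>TAG substitution": "ATGGAGTAGTGCTTATAA",
        "single-base deletion": "ATGAGAAGTGCTTATAA",
    }
    print("reference", translate(REFERENCE))
    for label, seq in examples.items():
        print(label, translate(seq), classify(REFERENCE, seq))
```

Splicing mutations are deliberately left out, since modeling them would require exon and intron boundaries that this toy sequence does not have.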

However, essentially monogenic origins of disease, as in the preceding disorders, are not the rule. Only 2 % of the total disease domain corresponds to monogenic causality, and even here the final phenotype is modulated by many factors, principally degeneracy (Glossary, ESM). Factor V Leiden offers an example of multiple contributing circumstances [73]. Most of the thromboembolic events associated with it occur when other risk factors, such as use of oral contraceptives, surgery, and bed rest, are also present; both GxG and GxE interactions, reflected in changes in the composition of blood and in stasis of venous flow, impact the overall risk of venous thrombosis.

Polygenic Phenotype–Engendering Mechanisms

The second pattern is distinctly polygenic, signifying that phenotype is determined by a number of genes acting conjointly in a network, as alluded to previously. Inherited cardiac conditions, diabetes, and generally complex, multifactorial maladies typically result from the combined effects of a number of genetic variants. Multiple gene abnormalities may well be implicated in variant phenotypic manifestations of complex cardiovascular diseases, including genetic forms of DCM and HCM (see Table 1 and the “Can, on Their Own, Same-Gene Mutations Cause Starkly Contrasting Traits?” section). Over the last two decades, numerous disease-causing gene alterations for different cardiomyopathies, without clear genotype–phenotype correlations, have been recognized [74, 75]. Until recently, clinical genetic testing was available typically for apparently monogenic diseases, although the majority of genetic heart diseases are both genetically and clinically highly heterogeneous and complex. Complexity stems from the wide-ranging (many genes implicated) locus and allelic (many different mutations implicated within those genes) heterogeneities, which until recently had been confounding incomplete genetic testing data [76]. As I detail in section ‘The Challenge of Pleiotropic Phenotypes of Mutant Genes: “Modifier Gene” Effects,’ large-scale studies are necessary to identify modifier genes crucial for cardiac phenotype [77].

Can, on Their Own, Same-Gene Mutations Cause Starkly Contrasting Traits?

As things stand now, there is no conclusive, elegant explanation of how mutations in the same gene, for instance, MYH7 (see Table 1), can by themselves cause such contrasting morphomechanical and clinical disorders as DCM and HCM. This fact is currently interpreted as a de facto manifestation of “phenotypic plasticity” [78]. It is also a striking example of phenomenological “pleiotropy” (from the Gk. words pleio, alluding to “multiple,” and tropic, pertaining to “transformation”), where mutations in the same gene can bring forth different morphomechanical phenotypic traits [79, 80]. I am not talking here about monogenic mutations in the same gene that can be associated with quantitatively different levels of probability of pathogenicity for HCM (viz., benign variants, variants of uncertain significance, and likely or definitely pathogenic variants). I am talking about ostensibly monogenic causal mutations in the same gene [81] causing qualitatively drastically divergent pleiotropic outcomes: the anomalous phenotype of DCM versus that of HCM (see Table 1 and Fig. 3b)! With this in mind, I deem it thought provoking and thus instructive to briefly summarize next the striking clinical differences that typify the morphomechanical/clinical phenotypes of these two remarkable cardiac disorders.

DCM: Morphomechanical Phenotypic Characteristics

As I advanced originally in a systolic clinical fluid dynamics survey in JACC [82], in the dilated cardiomyopathies, left ventricular enlargement results in a relative disproportion between the size of the characteristically globular chamber and the aortic ring, a “systolic ventriculoannular (outflow valve) disproportion, SVAD” [2, 82, 83, 84]. Analogous considerations underlie my subsequent formulation of the concept of a fluid dynamic “diastolic ventriculoannular (inflow valve) disproportion, DVAD” [2, 85, 86, 87, 88, 89, 90], which tends to impede inflow and filling in the dilated chamber. SVAD is functionally equivalent to an outflow port stenosis. Thus, the resultant characteristic changes in the configuration of multisensor catheter-derived ejection pressure gradients in DCM (see Fig. 4) are qualitatively reminiscent not of the non-obstructive normal pattern [2, 82, 91] but, rather, of the pattern that typifies aortic valve stenosis [2, 82, 92, 93, 94, 95, 96].
Fig. 4

a Solid-state, multisensor left heart catheterization in dilated cardiomyopathy (DCM) at rest (top) and during supine bicycle exercise (bottom). Notable both at rest and during exercise are the elevated LV filling pressures and the more rounded and symmetric than normal transaortic pressure gradients and ejection flow waveforms, reflecting reduced upstroke slopes in the aortic root flow signals with prolonged times to peak flow and increased downstroke steepness—there is both a depressed rate of ejection velocity increase and an enhanced rate of velocity falloff. The very rapid downstroke of the ejection waveform, especially during exercise, probably reflects a high wall stress level maintained throughout ejection and the operation of the inverse force-velocity relation [2, 82, 93, 96]; the normal swift decrease in wall stress that accrues from wall thickening and radial contraction (a ventricular “self-unloading”) is restrained in DCM. b (top panels) Left heart catheterization (similar to (a)) in hypertrophic cardiomyopathy (HCM); deep LVP and AOP signals demonstrate very large early and enormous mid and late systolic dynamic transvalvular pressure gradients. Severe diastolic dysfunction in HCM is also revealed by the micromanometric catheter during supine ergometer cycle exercise; particularly striking is the persistently downsloping micromanometric LVP throughout the diastolic period and up to the ensuing atrial contraction, suggesting greatly impaired ventricular relaxation. This is in sharp contrast to the exercise-induced sharp upslope of diastolic micromanometric LVP in the normal pattern, shown in (c). b (bottom panel) From top downward are shown: linear AO flow velocity signal and AOP, deep LVP, and LVOTP micromanometric signals measured by retrograde triple-tip pressure plus velocity multisensor left heart catheter; the LAP micromanometric signal is measured by transseptal catheter. 
Inertial forces associated with local and convective accelerations of intraventricular blood dominate the early phase of ejection. This phase is characterized by increasing deep and outflow tract left ventricular and aortic root pressures, while aortic root flow velocity briskly attains and transiently remains near its peak. It is the interaction of flow-field geometry, namely, outflow tract narrowing by massive subaortic septal hypertrophy, with enhanced early velocities and accelerations that underlies the augmentation of the multisensor micromanometric catheter-derived early ejection pressure gradients. Viscous effects grow rapidly with shrinking cavity size. Augmented convective acceleration forces, associated with wall collapse displacing a sequentially increasing blood volume from apex to aortic ring and necessitating a strong increase in velocity along the outflow axis independently of any coexisting geometric taper, and viscous shear forces readily account for the enormous mid and late systolic intraventricular pressure gradients [2, 82, 98]. AO: aortic; LV: left ventricular; LVOT: left ventricular outflow tract; LA: left atrial. ((a) is slightly modified from Pasipoularides [82], with permission of the American College of Cardiology; (b) and (c) are adapted and modified, with permission of PMPH-USA, from Pasipoularides [2].)

The preceding hemodynamic changes accrue from mutations that have been attributed to multiple genes, as summarized in Table 1. Allelic heterogeneity is the rule, and very few specific mutations are encountered in multiple families. Presently, application of NGS and DNA sequence assembler technologies coupled with software platforms for visualizing gene interaction networks can speed up the discovery of new DCM genes and of multicomponent DCM genetic networks. Non-genetic causes of dilated cardiomyopathy are, among others, ischemic injury associated with coronary artery disease or prior myocardial infarction, valvular and congenital heart disease, severe long-standing hypertension, viral, inflammatory, and toxin-linked myocarditides.

HCM: Morphomechanical Phenotypic Characteristics

Genetic causes can typically be found in 50 to 75 % of cases of hypertrophic cardiomyopathy; expanding the genetic screening analysis through increasing application of NGS technology to enable simultaneous investigation of a broad panel of inherited genes is projected to augment the genetic-positive yield [97]. In those patients in whom a mutation is identified, about 80 % involve two genes (encoding myosin heavy chain 7 and myosin-binding protein C—see Table 1). The pathologic features of HCM consist of marked and asymmetric LV hypertrophy with an especially thickened ventricular septum, atrial enlargement, and an undersized LV cavity; hypertrophy and disarray of the cardiomyocytes and interstitial fibrosis are present throughout the myocardium. The cardiac phenotype and clinical course of patients with HCM are typically very variable with regard to the pattern and degree of hypertrophy, the age at onset, and the clinical outcome. Modifier genes, the environment, gender, and comorbidity (coexistence of, for example, ischemic or valvular disease) contribute to such differences.

HCM exemplifies during systolic ejection what I have previously termed “polymorphic gradients” [82]. Protean and polymorphic are the complicated systolic fluid dynamics of HCM [2, 82, 98] (see Fig. 4). Previous CFD analyses [2, 82, 98], using ejection velocity and pressure gradient patterns obtained by multisensor catheters and angiographic measurements, indicate that convective (Bernoulli) pressure gradients proportional to the square of the applying flow rate or velocity, as shown by the Bernoulli equation, are accentuated preeminently; in the narrowed (due to the septal hypertrophy) subaortic region, they may give rise to a Venturi mechanism entraining, or sucking, the neutrally buoyant mitral leaflets toward the septum in a systolic anterior motion (SAM) with concomitant mitral regurgitation. Failure of the mural (posterior) leaflet to move as much forward as the aortic (anterior) one yields non-apposition and an interleaflet gap resulting in the regurgitation—here I use “aortic” and “mural” because they are the modern attitudinally correct nomenclatures [99]. It is noteworthy that, as is highlighted in Fig. 4, the pressure gradient rises to its peak levels and maintains them in the face of minuscule forward, or even negative, aortic root velocities recorded by the catheter-mounted electromagnetic sensor. Such negative aortic velocities (as revealed in Fig. 4) probably represent vortices, with recirculating retrograde velocity components. Further morphomechanical systolic flow dynamic details can be found in previous publications [2, 82, 98]; here, I simply note that whether mitral leaflet–septal contact is the cause of the enormous mid- and late-systolic intraventricular pressure gradient remains controversial.
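
The quadratic dependence of the convective gradient on velocity can be illustrated with a minimal Python sketch of the steady Bernoulli term, ΔP = ½ρ(v₂² − v₁²). The blood density value, the function name, and the sample velocities are assumptions chosen for illustration and are not measurements from the catheterization studies cited above. The sketch deliberately omits the local (unsteady) acceleration and viscous contributions, so it should not be read as a model of the full intraventricular gradient.

```python
RHO_BLOOD = 1060.0     # kg/m^3; assumed typical blood density
PA_PER_MMHG = 133.322  # pascals per millimeter of mercury


def convective_gradient_mmhg(v_upstream, v_downstream, rho=RHO_BLOOD):
    """Steady convective (Bernoulli) pressure drop along a streamline, in mmHg.

    Velocities are in m/s. Unsteady inertial and viscous terms are ignored.
    """
    delta_p_pa = 0.5 * rho * (v_downstream ** 2 - v_upstream ** 2)
    return delta_p_pa / PA_PER_MMHG


if __name__ == "__main__":
    # Illustrative downstream velocities with negligible upstream velocity.
    for v in (1.0, 2.0, 4.0):
        print(v, "m/s ->", round(convective_gradient_mmhg(0.0, v), 1), "mmHg")
```

With these assumed constants, the coefficient ½ρ works out to just under 4 mmHg per (m/s)², so doubling the velocity through a narrowed subaortic region quadruples the convective pressure drop.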

The prominent ejection pressure gradients in HCM have focused attention mostly on systole and away from important coexisting biventricular [2, 100] diastolic function abnormalities (see Fig. 4). The relative contributions of relaxation defects, asynchrony, altered passive diastolic properties, and geometry to the diastolic dysfunction in HCM remain incompletely characterized [2, 100, 101, 102]. This might be expected for a disease characterized by profound genetic and epigenetic phenotypic heterogeneity, with substantial variation in expressivity and age-dependent penetrance [100]. Diastolic abnormalities span the gamut, starting from those that can accompany substantial physiologic remodeling hypertrophy as seen in athlete’s heart [103], and then developing progressively, through derangements that come with compensatory reactive adaptations to pathological systolic pressure overloads [2, 89, 102, 104, 105, 106], to distinctive qualitatively new problems peculiar to HCM [100]. Figure 4 illustrates hemodynamics of severe diastolic dysfunction in HCM.

Recently, I proposed a new paradigm of diastolic dynamics in HCM [100], emphasizing the relationship of myofiber sheet and ultraconstituent distortions to LV mechanics and end-systolic shape. This innovative approach affords understanding of intricate facets in patterns of diastolic rebound and suction needed for LV filling in many of the polymorphic phenotypes of HCM. Moreover, it may lead to extraepicardial or intramural implantable recoil devices to promote diastolic elastic rebound in selected subsets of HCM patients [100]: those with massive hypertrophy and excessive ejection fraction leading to virtually complete LV emptying, or unduly reduced end-systolic chamber dimensions.

The Challenge of Pleiotropic Phenotypes of Mutant Genes: “Modifier Gene” Effects

Regarding any complex disease as purely monogenic appears to be a sweeping simplification. Although supposedly monogenic, genetic HCM and DCM exhibit striking morphomechanical attributes conjuring up a composite pathogenesis embodying the effects of multiple genetic factors. This accounts for the inadequacy of existing approaches in cogently explaining the variability and pleiotropism in cardiomyopathy phenotypes. As discussed in the introduction of the preceding section, various schemes have been invoked conventionally to account for the prima facie surprising and perplexing observations relating to causal mutations in the same gene that can lead to either DCM or HCM. While it is plausible that distinct variants in a given gene can cause more than one cardiomyopathy, some investigations have revealed the same causal variant in HCM patients and in patients with genetic DCM, and alluded to phenotypic plasticity to account for this finding [78, 107, 108]. However, oddly, the invoked molecular mechanisms underlying HCM (variations of the thin filament proteome, associated with an increase in Ca2+ sensitivity of the myofibril, a faster cross-bridge turnover rate, and augmented contractility) and DCM (alterations of the thin filament proteome, associated with a lower Ca2+ sensitivity, a slower rate of cross-bridge turnover, and diminished contractility) are fundamentally conflicting and incongruous [109, 110]. This and the contrasting juxtaposition of the morphomechanical characteristics of genetic DCM and HCM synopsized in the preceding section and Fig. 4 call forcefully into question whether the same variant by itself can truly cause both disease phenotypes.

On the other hand, there are plausible, albeit unconventional, reasons for such puzzling findings. They pertain to the unaccounted influences of numerous potential “modifier genes,” which are operative in the dissimilar, compound genetic backgrounds (cf., Fig. 2) that delineate the “genetic context” of individual patients. The modifier gene concept is not new, having been advanced in 1941 by Haldane [111]. Modifier gene mutations may have different effects in combination than individually. Research involving breeding cardiomyopathic mutations into distinct genetic backgrounds in animal (mouse) models should complement human investigations of the influence of the diverse genetic backgrounds on morphomechanical effects of dominant mutations, ultimately allowing us to characterize better the modifier gene networks [112].
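
The statement that modifier mutations may act differently in combination than individually can be expressed as a simple interaction term. The Python sketch below is a minimal illustration on an additive scale; the function name and the phenotype values (arbitrary units for some quantitative trait) are hypothetical and are not drawn from the cited animal or human studies. It measures how far the double-mutant phenotype departs from the sum of the two single-mutant effects.

```python
def epistasis(p_wild, p_primary, p_modifier, p_double):
    """Deviation of the double mutant from the additive expectation.

    Arguments are phenotype values (arbitrary units) for the wild type,
    the primary mutation alone, the modifier alone, and both together.
    Zero means the two effects simply add; nonzero indicates interaction.
    """
    additive_expectation = (p_primary - p_wild) + (p_modifier - p_wild)
    return (p_double - p_wild) - additive_expectation


if __name__ == "__main__":
    # Purely additive case: effects of 1.0 and 1.0 combine to 2.0.
    print(epistasis(0.0, 1.0, 1.0, 2.0))
    # Interacting case: a weak modifier greatly amplifies the primary effect.
    print(epistasis(0.0, 1.0, 0.5, 4.0))
```

The choice of scale matters: an interaction that appears on an additive scale may vanish on a multiplicative one, which is one reason background effects can be hard to characterize.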

Clearly, the phenotype need not be determined “single-handedly” by the mutation(s) in a single gene (e.g., MYH7). Modifier genes can bring about pleiotropism by affecting robustly the phenotype(s) resulting from the primary mutation; thus, genotype is not reliably predictive of phenotype [113, 114]. The final expressed phenotype is the outcome of the primary causal mutations, modifier genes ushering GxG interactions into the equation, and potentially of environmental factors—see section “Epigenetics and Gene-by-Environment Interactions.” This crucial fact explains why persisting approaches treating complex diseases as if they were Mendel’s peas fall short in cogently explaining the variability and, at times, striking pleiotropism in cardiomyopathic phenotypes. Such approaches face the correlation versus causality problem: “post hoc, ergo propter hoc.” Are investigators apt to assume too easily that “because something follows something else it must be due (solely) to it”? Malaria occurs most often in persons living near marshes; does it necessarily follow that the miasma that rises from such places was the cause of the disease (the name “malaria” means this)? It is crucial to recognize that complex non-linear relationships are common in gene expression patterns and possibly can allow for a small set of weakly interacting signals to express a significant effect. Powerful approaches employing genetic algorithms (GAs) and artificial neural networks (ANNs), considered in the next section, can be exploited in future studies to search for more complex and subtle interactions involving modifier genes, which may be passed over by less sensitive methods; these interactions can lead to either DCM or HCM syndrome traits when acting in concert with the main causal variant. Modifier genes cannot be ignored as mediators of pleiotropy in cardiomyopathies and other complex disease phenotypes.

The NGS and bioinformatics technologies that are now available [115, 116] offer us the prospect of widening the scope of genetic investigations to encompass genetic diagnosis of heterogeneous and polygenic (cf., Fig. 2) clinical cardiological disorders. There is consequently an emerging strong appreciation for the value of studying epistasis, or gene interactions, and for addressing these problems in a non-disjointed, cohesive manner [117]. Comparative expression of networks of genes coupled to complex maladies can contribute insights as to how those genes interact with each other. It is anticipated that increasingly a better, more comprehensive and detailed, mechanistic understanding of genetic abnormalities underlying, for example, various genetic cardiomyopathies can emerge; pari passu, the distinction between monogenic and polygenic underlying pathogenetic mechanisms will become more questionable for many syndromes. This is becoming evident quite rapidly now, as in vivo and in vitro studies indicate that various cardiomyopathic mutations pass on an extensive array of assorted cardiomyocyte operating flaws, encompassing reduced myosin ATPase activity, acto–myosin interaction and cross-bridging kinetics, impaired contractility, and altered Ca2+ sensitivity [118].

DNA Microarray and Other Gene Expression Methodologies: Linking Genes to Transcription Control and Specific Traits and Diseases

Microarray technology allows for large-scale gene experimentation and makes it possible to find the expression levels of genes across many different applying conditions. In principle, a typical microarray experiment entails the hybridization, or combination through base pairing, of fluorescent-dye-labeled mRNA molecules in samples of interest to their complementary DNA templates, acting as detectors robotically printed onto glass-slide arrays in a particular order. Each spot on a microarray contains multiple identical strands of DNA; many such single-stranded DNA/gene spots are used to assemble an array. The amount/density (brightness) of mRNA bound to each detector spot on the array indicates the expression level of the various genes, thus forming a profile of gene expression for the sample. RNA in samples can be converted into its complementary single-stranded DNA (cDNA) using the enzyme reverse transcriptase. Because RNA is readily degraded by omnipresent RNases, cDNA is more convenient to work with than mRNA. Accordingly, in practice, the mRNA of samples is commonly converted into cDNA to use in DNA microarray investigations.

The raw data sets retrieved from microarray experiments are obviously too massive to infer meaningful conclusions by inspection. Of the tens of thousands of genes in experiments, only a much smaller number show strong correlation with the targeted phenotypes. Selecting such a small subset out of the thousands of genes in microarray data is important for accurate classification of the phenotypes; when a reduced number of genes are selected, their pathogenic relationship with the target disease is more easily identified. Widely used methods typically rank genes according to their differential expressions among the phenotypes and pick the top-ranked genes. Powerful computers are used implementing sophisticated software, such as genetic algorithms (GAs), which have been shown to be a robust search method for problems with such large search spaces [119, 120, 121, 122] and artificial neural networks (ANNs), which emulate and scale up enormously the brain’s many efficient ways to store and process information [123, 124]; neural networks are structured to provide the capability to solve problems without expert guidance and without the need of programming. They can seek patterns in data that no one knows are there.
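
The widely used ranking strategy mentioned above can be sketched in a few lines of Python. This is a simple filter method based on Welch's t statistic, not a genetic algorithm or a neural network; the gene labels, expression values, and function names are hypothetical toy data introduced only for illustration. Each gene is scored by how strongly its expression differs between two phenotype groups, and the top-scoring genes are retained.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


def top_genes(expr_case, expr_control, k):
    """Rank genes by |t| between case and control samples; return the top k."""
    scores = {gene: abs(welch_t(expr_case[gene], expr_control[gene]))
              for gene in expr_case}
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical log-scale expression values, four samples per group.
    case = {
        "GENE_A": [8.1, 8.4, 7.9, 8.3],
        "GENE_B": [5.0, 5.2, 4.9, 5.1],
        "GENE_C": [6.0, 7.5, 5.2, 6.9],
    }
    control = {
        "GENE_A": [5.0, 5.3, 4.8, 5.1],
        "GENE_B": [5.1, 4.9, 5.0, 5.2],
        "GENE_C": [6.1, 5.0, 7.2, 6.4],
    }
    print(top_genes(case, control, 2))
```

Because each gene is scored independently, a filter of this kind cannot detect genes that matter only in combination, which is the gap that search methods such as GAs and ANNs are meant to address. In real data with tens of thousands of genes, multiple-testing correction would also be needed.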

Compared to the speedily burgeoning knowledge of protein-encoding genes, the detection of transcriptional regulatory sequences within the human genome is progressing slowly. Genetic network models are typically constructed for polygenic interactive systems, in order to assess interactions and regulation of genes. Gene expression data are obtained using high-throughput technologies, such as microarrays and Serial Analysis of Gene Expression (SAGE) [5, 125], which can help in identifying candidate genes that may be variously involved in diverse normal or disease morphomechanical phenotypes; thus, they may offer innovative insights into complex cardiovascular syndromes whose etiologies implicate multiple genetic factors. Using such technologies, the expression intensity patterns of numerous genes are ascertainable under various conditions. The global analyses of gene expression levels that are obtained are useful for cataloging genes and overall phenotypes, and hence for elucidating the role of genes in human disorders. These activities are often interconnected, as we wish to identify genes that are linked to particular sets of morphomechanical traits, or to important cardiovascular anomalies—e.g., “marker genes” that are differentially expressed in particular phenotypic syndromes.

Piecing Together Genomics, Transcriptomics, Proteomics, Metabolomics, and Phenomics

Innovative sequencing and array technologies are fueling groundbreaking advancements not only in research but also in translational and consumer genomics, and molecular diagnostics, rendering possible investigations that were not even imaginable just a few years ago. As I have emphasized in a just published, comprehensive, and detailed analysis [5], to which I refer the interested reader, genome-wide gene expression methods (NGS technologies, high-density DNA microarrays, SAGE, Northern blotting, etc.) only give information about relative mRNA abundance, viz., concerning the transcriptome. Obviously, this is essential, but certainly not the complete story. This information should be combined with proteome data—from, e.g., Western blotting, 2D polyacrylamide gel electrophoresis (2D-PAGE), nuclear magnetic resonance (NMR) spectroscopy, and mass spectrometry (MS), or liquid chromatography and tandem mass spectrometry (LC-MS-MS), etc.—to conclude whether particular transcripts are, in fact, being translated and expressed, and in what form and intensity. In view of diverse posttranslational protein modifications, which are key mechanisms to increase proteomic diversity and regulate cellular activities, there is also the question of the biological activity of the protein products and their interaction with endogenous metabolites (endogenous metabolome) and exogenous factors, such as pharmaceuticals (exogenous metabolome) [126, 127].

Metabolomics technologies encompassing 2D-PAGE, LC-MS-MS, etc., together with powerful multivariate statistical analyses and informatics software, can afford a simultaneous and relative quantification of thousands of different metabolites within a particular sample, making available a wealth of relevant biochemical data. While transcriptomics and proteomics contribute essential insights into the coordinated regulation of metabolic adaptability, metabolomics sheds light on the actual enzyme activity expressing metabolic regulation together with mass action effects. Thus, in cardiac disease with altered myocardial metabolism, metabolomics is likely to improve greatly on single biomarker-based approaches by ascertaining metabolic biosignatures of wide-ranging biochemical changes [128]. Considering the archetypical paradigm of gene → transcript → protein → metabolite, the best disease biomarkers will be ascertained through studies that show correlation between biomolecules at all four levels. Gene expression work is just one part of a comprehensive “systems biology” approach to gene expression [5], melding various disciplines such as molecular biology, informatics, biochemistry, and statistics.

The confluence of next-generation electronic health record (EHR) and high-throughput NGS genotyping technologies offers cardiovascular translational investigators a unique opportunity to integrate genomic patient data into EHR systems to apply precision medicine to a large-scale patient base, encompassing information ranging into many petabytes of data (1 petabyte = 1 quadrillion bytes)! Computer speed, memory, and bandwidth have advanced such that multimodality digital medical images can be part of an EHR system, and at high resolution. These advances will allow integration of genomics, pharmacogenomics, transcriptomics, proteomics, and metabolomics information with clinical facts and other multifaceted phenomics data (digital pathology and virtual microscopy of myocardial biopsies, multimodal cardiac digital imaging data including angiocardiography, echocardiography, Doppler, CT, MRI and PET, solid-state multisensor catheter-derived hemodynamics, and so forth) into a single unified digital workflow. Versatile repurposing for genetic research of EHR data containing rich phenotypic information will lead to the establishment of more detailed EHR phenotypes and should stimulate innovative approaches and collaborative initiatives with enhanced productivity.

Big Data and Innovative Therapeutic Directions

Big data processing, which relies on the simultaneous application of statistics, computer programming, bioinformatics, and graphic/visualization techniques, can draw from text, images, audio, and video to deliver complete interactive analytics enabling the discovery and communication of meaningful data patterns, often in real time. Such patterns facilitate the inference of relationships and dependencies and allow predictions of morphomechanical behaviors and clinical outcomes. Rapid advances in genomics alongside progress in medical imaging, computational biology, and informatics are creating opportunities to develop tools to truly personalize diagnosis and treatment in line with the Hippocratic model: “It’s far more important to know what person the disease has than what disease the person has.” To wit, they are making personalized medicine possible today by matching drugs to individual patients. Such capabilities have never before been feasible at such a scale; they should enable academic investigators and clinicians to reach precise diagnostics effectively and should speed up enormously the advent of personalized cardiology and genomic discoveries.

However, the precise extraction of detailed disease and therapeutic-response phenotype information contained in EHRs is not an easy task; EHR-driven phenotyping has yielded multiple challenges [129, 130]. GWAS and massively parallel DNA sequencing strategies detect hundreds of thousands of loci per individual, affecting both normal variation and susceptibility to disease [131], and can shed light on the genetic architecture of complex traits. However, their precise biological relevance must be proven before mutations can be causally linked to a specific disorder. The definitive aim of gene discovery in complex disease is to detect and characterize precisely the specific biological networks/pathways and processes that bring about the disorder. The critical issue is not whether patients have more instances of one or more rare variants than the controls but, rather, which mutated genes are causal and contributing to the illness in the affected. Variable penetrance, epistasis, epigenetic changes, and gene–environment interactions will complicate pertinent efforts; each affected individual will nonetheless exhibit interference/interruption of related key biological processes underlying the anomalies in the morphomechanical traits. Defining comprehensively the ways in which genomics, transcriptomics, proteomics, and metabolomics of complex disorders like HCM and DCM are impacted by DNA mutations will contribute new insights into normal physiology and disease pathophysiology and will provide important clues for rational therapy/management. This discovery pathway can ultimately lead to innovative and potentially patient-specific therapeutic objectives and methods.

Clustering: a Statistical Data Mining Procedure for Analyzing Gene Expression Data

Microarray technology has made it possible to monitor the expression levels of thousands of genes in parallel during important (patho)physiological processes/conditions and across assortments of related samples. Currently, the main focus in genomic research is shifting from sequencing to using the genome sequences to ascertain how genomes function. A microarray gene expression matrix is a table where rows represent genes, columns represent various samples (or examined conditions), and the number in each cell denotes the expression level of the particular gene in the particular sample. Clustering is a helpful statistical data mining procedure for analyzing such gene expression data; it groups together genes that have potentially related functions or are co-regulated, thus helping to establish the relationships among them in the form of gene regulatory networks [123, 132, 133].
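To make the gene expression matrix and the clustering idea concrete, the following sketch groups genes whose expression profiles are highly correlated across samples. It is one simple possibility among many (single-linkage grouping at a fixed Pearson correlation threshold), not the specific procedure of the cited references; the matrix values, gene labels, threshold of 0.9, and function names are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_genes(matrix, genes, threshold=0.9):
    """Single-linkage grouping: genes joined by any correlation >= threshold share a cluster."""
    n = len(genes)
    parent = list(range(n))  # union-find forest over gene indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if pearson(matrix[i], matrix[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(genes[i])
    return sorted(groups.values())


# Rows are genes, columns are samples/conditions (hypothetical values).
genes = ["g1", "g2", "g3", "g4"]
matrix = [
    [1.0, 2.0, 3.0, 4.0],  # g1 rises across conditions
    [2.1, 4.0, 6.2, 8.1],  # g2 rises together with g1
    [4.0, 3.0, 2.1, 1.0],  # g3 falls across conditions
    [8.0, 6.1, 4.0, 2.2],  # g4 falls together with g3
]
clusters = cluster_genes(matrix, genes)
print(clusters)
```

Correlation is used here rather than Euclidean distance so that genes with the same trend but different absolute expression levels fall into the same group.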

Any particular set of co-regulated genes and co-regulating conditions within a gene expression matrix, encompassing a priori interesting sets of co-expressed/co-regulated genes and co-regulating conditions or samples, represents a regulatory unit or module (RM). Given a gene expression matrix, biclustering or two-mode clustering algorithms can be applied to discern one or more local patterns, or “biclusters” [134, 135, 136], in which a subset of genes exhibit similar expression levels over a subset of conditions—i.e., specific subsets of rows exhibit similar behavior across specific subsets of columns, and vice versa.

Therefore, biclustering is a data mining system that allows simultaneous clustering of the rows and columns of a gene expression matrix; it has been the most common technique for extracting gene RMs. Each bicluster is a tuple, or data structure, containing elements of two sets: the rows and columns. It is also possible to cluster separately the rows and columns of the gene expression matrix. Biclustering techniques/algorithms can simultaneously cluster genes and conditions, to reveal distinctive “checkerboard” patterns in matrices of gene expression data, if such patterns exist [134, 137]. When biclustering is used to reveal transcriptional modules composed of genes showing coherent expression profiles over time, it is termed “temporal biclustering.” Biclustering carries out row clustering and column clustering in tandem, according to the similarities among expression profile trajectories of genes and of samples [137, 138, 139].
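The notion of a bicluster as a pair of row and column sets can be sketched as follows. As a coherence measure the example uses the mean squared residue score popularized by Cheng and Church; this choice is an assumption made for illustration and is not claimed to be the criterion used in references [134–139]. The matrix values and the names `Bicluster` and `mean_squared_residue` are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class Bicluster(NamedTuple):
    """A bicluster is a pair: a set of row (gene) indices and a set of column (condition) indices."""
    rows: FrozenSet[int]
    cols: FrozenSet[int]


def mean_squared_residue(matrix, bc):
    """Mean squared residue of the submatrix selected by bc; 0 means perfectly additive coherence."""
    rows, cols = sorted(bc.rows), sorted(bc.cols)
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_r, n_c = len(rows), len(cols)
    row_means = [sum(r) / n_c for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    overall = sum(row_means) / n_r
    total = 0.0
    for i in range(n_r):
        for j in range(n_c):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_r * n_c)


# Hypothetical 4 genes x 4 conditions matrix; genes 0, 1, 3 co-vary over conditions 0, 1, 3.
matrix = [
    [1.0, 2.0, 9.0, 3.0],
    [2.0, 3.0, 1.0, 4.0],
    [5.0, 0.0, 7.0, 2.0],
    [4.0, 5.0, 2.0, 6.0],
]
coherent = Bicluster(rows=frozenset({0, 1, 3}), cols=frozenset({0, 1, 3}))
whole = Bicluster(rows=frozenset(range(4)), cols=frozenset(range(4)))
score_coherent = mean_squared_residue(matrix, coherent)
score_whole = mean_squared_residue(matrix, whole)
print(score_coherent, score_whole)
```

The selected submatrix scores essentially zero while the full matrix does not, which is the sense in which a bicluster captures a local pattern that whole-matrix row or column clustering could miss.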

Effectively, gene regulation matrices are postulated to reveal the structure of the underlying genetic network. These matrices can be utilized to figure out how genes act jointly to control transcription and bring about specific phenotypic characteristics, in health and disease, bearing in mind that the regulatory network is also influenced by extracellular environmental signals, as is discussed in the next section. Gene networks are commonly deciphered in combination with figures (cf., Fig. 2) to illustrate the generally intricate interrelations between the network elements. Because of their complexity, such networks are not always easy to grasp. Since the 1960s, approaches from mathematics and physics have been applied to characterize and simulate small gene networks quite rigorously [140, 141]. Comprehensive analytical models of large genetic networks could transform investigation and understanding of complex cardiac diseases, such as the genetic cardiomyopathies. However, such models are not yet within our reach, because large genetic systems involving hundreds/thousands of regulatory genes are enormously complicated to model and, furthermore, because experimental data for large genetic systems are inadequate.

High-throughput technologies encompassing NGS technologies (discussed earlier), bioinformatics methods, and systems biology approaches allow probing aspects of gene regulatory networks on a genome-wide scale. Such high-throughput technologies and molecular biological methods nowadays make it possible to study in parallel great numbers of genes and proteins, expediting the study of sizeable gene networks [142, 143, 144]. This is empowering us to deal with multipart gene networks more efficiently and promises to gradually lead to more effective, comprehensive understanding of multifactorial and likely polygenic heart diseases, such as genetic DCM and HCM. For instance, one could use a gene expression data set to assemble a corresponding gene network model that is consistent with the data. Discrepancies between simulated data produced using this model and new experimental/clinical data, not used to construct the model, would then reveal model deficiencies. The discrepancies could consequently be used to decide on alternative genetic network models, or to upgrade the model. Thus derived understanding could then be translated into effective therapy: identified variant genes might be corrected in stem cells and subsequently given back to the patient, resulting in functional protein products and ameliorating the aberrant/cardiomyopathic phenotype.
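The build, simulate, and compare cycle described above can be illustrated with a deliberately tiny stand-in for a gene network model: a single regulator and a single target gene. The sketch assumes a linear dependence (model A) versus no regulation (model B) as the two candidate models; all data values and function names are hypothetical, and real network models are far richer.

```python
from statistics import mean


def fit_linear(x, y):
    """Least-squares fit of y = a + b * x; returns the pair (a, b)."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b


def discrepancy(predict, x_new, y_new):
    """Mean squared difference between model-simulated values and newly observed values."""
    return mean((predict(xi) - yi) ** 2 for xi, yi in zip(x_new, y_new))


# Data used to construct the models: regulator level x, target-gene level y.
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [1.0, 3.1, 4.9, 7.0]
# New data that were NOT used to construct the models.
x_new = [4.0, 5.0]
y_new = [9.1, 10.9]

# Candidate model A: the target responds linearly to the regulator.
a, b = fit_linear(x_train, y_train)
model_a = lambda x: a + b * x
# Candidate model B: the target is unregulated (constant expression).
const = mean(y_train)
model_b = lambda x: const

err_a = discrepancy(model_a, x_new, y_new)
err_b = discrepancy(model_b, x_new, y_new)
preferred = "A" if err_a < err_b else "B"
print(err_a, err_b, preferred)
```

A large discrepancy on the new data flags a deficient model; here it would argue for discarding the unregulated model B in favor of model A, or for upgrading both.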

It should become increasingly clear in the near future that many genetic heart diseases have a more intricate polygenic etiology than is generally thought. Modifier polygenes whose individual effect on the phenotype is too small to be observed, acting in concert with other (non-allelic) genes, could actually modify strikingly various phenotypic traits (cf., discussion of HCM vs. DCM phenotypes with the same “causal” primary mutation; cf., Fig. 4).

Epigenetics and Gene-by-Environment Interactions

Despite the largely unforeseen, perplexing findings mentioned in the section “Lack of Correspondence Between Expressed Phenotypic and Genomic Complexity,” and as suggested by the Epigram, many biomedical scientists worldwide had known for decades that genetics alone, in the sense of Mendelian genetic determinism, is ineffectual in explaining complex developmental, adaptive, and maladaptive outcomes. Additional integrative information management systems with extensive capabilities had to be operational. Already back in 1975, on the heels of the advancements in molecular biology of the 1950s and 1960s, through a comparison of human and chimpanzee macromolecules by various methods (immunology, amino acid differences, and protein electrophoresis), MC King and AC Wilson determined [145] that the extent of protein-encoding gene variation between chimpanzees (Pan troglodytes) and humans was too small (≈1 %) to account by itself for the remarkable phenotypic differences between the two species. They argued that it was not structural gene dissimilarities that were responsible for the morphomechanical phenotypic differences, but gene regulation processes altering gene expression, the timing, extent, and manner in which gene products are assembled. Later investigations, employing more modern approaches and techniques, validated this view [146, 147].

There are significant differences in gene expression patterns and it is these divergences that primarily account for the striking phenotypic disparities. It might be speculated that such more or less wide variances in gene regulation and expression are of great consequence, in view of the relatively small genomic differences between species as well as individuals within a species. However, there is no simple mapping from genome to phenotype because of the confounding contributions of multifaceted environmental factors to the ways in which the genetic blueprint may unfold [2, 3, 4, 5]. The genes are highly specific chemically and thus are called into play only under very specific conditions, yet their morphomechanical effects hinge on quantitative influences of their environment and also of immediate or remote products of other genes, which are resultants of all that has gone on before in the organism.

Since the whole genome is replicated in each cell division, one can deduce that the information needed to determine cell type and the changing morphomechanical characteristics of cells, tissues, and organs cannot be encompassed within the DNA sequence itself. Instead, different transient environmental signals through early development and throughout life methodically switch on some genes and silence others, thus modulating gene expression patterns to achieve the appropriate adult cell types and their subsequent dynamic adaptations (see Fig. 5). This is accomplished by modifying DNA-containing chromatin, the genetic material, in ways that are both semi-stable and heritable. As shown in Fig. 5, the two principal mechanisms behind these modifications involve the post-translational modification of proteins, the histones, found within chromatin, and the methylation of individual units of DNA [4, 5, 148, 149]. Moreover, regulatory microRNAs can simultaneously and selectively repress hundreds of genes post-transcriptionally, by inhibiting mRNA translation into protein [5, 150].
Fig. 5

Illustrated is genomic DNA packed tightly into chromosomes, as well as a DNA molecule unwound to reveal its 3D structure, along with epigenetic modifiers. Epigenetics translates into “superposed on the genome.” So, if the genome is like the hardware of a computer, the epigenome would represent the software that instructs the computer when to work, how to work, and how much. Epigenetic mechanisms allow cells carrying identical DNA to differentiate into numerous cell types, and to maintain differentiated—and highly adaptable to their changing “environment”—cellular states. Epigenetics thus forms a bridge between the genotype and the exhibited normal and abnormal/diseased phenotypes. Epigenetic modifications encompass, among others, histone modifications such as acetylation (1) and DNA methylation (2), which regulate gene expression. Standard variant genetic mechanisms (3) are linked to changes in the primary DNA sequence occasioning more or less severe abnormal morphomechanical traits. DNA cytosine methylation and histone modifications have profound roles in gene regulation; the two mechanisms cooperate in controlling gene expression. In contrast to genetic changes that are specific, epigenetic modifications of gene expression are more general and usually involve more than one gene. Complex diseases, such as genetic cardiomyopathies (e.g., HCM and DCM) that are conventionally classified as “monogenic” are most likely caused by causal variants with large effect sizes acting in conjunction with “modifier genes.” In addition to the main causal variant, e.g., MYH7, which typically displays a Mendelian pattern of inheritance, several other non-Mendelian variants contribute to each of the divergent phenotypes that can be actually expressed (HCM or DCM, see discussion in text)

Epigenetics examines morphomechanical attribute variations, pertaining to appearance and functions, which are not caused by changes in the DNA sequence of (nucleo)bases (see Fig. 5). It explores external or environmental factors that switch genes on and off and affect how the genetic blueprints are read and, thus, the specific properties of cells, tissues, organs, and individuals. In the present context, environment is used in a broad Bernardian sense [2, 4, 5], to account for an extensive range of non-genetic factors that can be implicated in the etiology of complex developmental, morphomechanical adaptive/maladaptive changes, and disease. Essentially, epigenetic mechanisms regulate the exceptional genomic plasticity and are fundamental in cardiac adaptations, remodeling, reverse remodeling, and disease [2, 3, 4, 5]. Epigenetic explanations are very powerful because they provide a detailed mechanistic account for how both environmental and genetic factors might interact to affect morphomechanical traits and physiological processes such as tissue and organ development, dynamic adjustments, and the onset and evolution of disease.

Epigenetics Forms a Bridge Between Genotype and the Exhibited Phenotypes

Different genotypes can be expected to show varying transcriptional responses to identical environments and the same genotype to show varying transcriptional responses to different environments. Viewed through this prism, the effects of the environment and genetic factors are not mutually exclusive but are operating complementarily, and epigenetics forms a bridge between the genotype and the exhibited normal and abnormal/disease phenotypes (see Fig. 5). As new technologies make comprehensive description of gene expression levels more accessible, it will without doubt become essential to study gene expression phenotypes as a function of the applicable environmental context as well as the product of a particular genotype.

Epigenetic factors may involve both monogenic and polygenic mechanisms (cf., Fig. 3). Epigenetics engenders complexity levels beyond gene–gene interactions and encompasses interactions between genes, between gene products (proteins) and genes, and between these elements and environmental forces/cues, including past and present experiences and influences upon the organism and its constituents. For a true understanding of the determinants of gene expression, expression variation must be considered not simply as a product of genetic dissimilarity or of environmental disparities, but as a joint product of genes and the environment.

Examination and understanding of so-called gene-by-environment, or GxE, interactions [2, 3, 4, 5] can allow for identification of unappreciated linkages between genetic and environmental risks and various triggering or aggravating factors. Moreover, it can allow for an increased understanding of the biology of important multifactorial maladies and may impact risk stratification for targeted health screening and prevention efforts. In a series of up-to-date publications [2, 3, 4, 5], I have outlined the role of mechanical stresses, or of a lack thereof, as important epigenetic factors in cardiovascular development, adaptations, and disease. In a just off-the-press two-article series [4, 5], I have proposed that variable forces associated with diastolic RV/LV rotatory intraventricular flows can exert physiologically and clinically important, albeit still unappreciated, epigenetic actions influencing functional and morphological cardiac adaptations and/or maladaptations. Taken in toto, this two-part investigation formulates a new paradigm in which intraventricular diastolic filling vortex–associated forces play a fundamental epigenetic role, and examines how heart cells react to these forces.

Epigenetic mechanisms involve complex, multifactorial interactions and successions of interlinked morphomechanical states resulting from them. Secondary changes in DNA may also contribute to the stabilization of the new state, but they are not encoded by genes; they are elicited by regulatory modifications in DNA methylation patterns, in DNA-binding histone proteins, and in chromatin structure [2, 3, 4, 5]. These adjustments result in transformed transcriptional patterns and, thus, in altered morphomechanical characteristics at all levels of the organism. Because some of these epigenetic modifications may affect DNA passed on through the germline (in eggs and sperm), they can be transmitted from parents to offspring and subsequent generations (Fig. 5). In this circumscribed sense, epigenetics has given new life to Lamarckian theory and the previously discarded idea that characteristics acquired during an individual’s life are heritable.


While innovation can occur as individuals and groups tackle new problems, innovation can also occur as a response to unplanned changes, such as have been ushered in by the post-genomic era. The ancient Greek philosopher Heraclitus once proclaimed that there is nothing permanent except change. Advances in genome sequencing and research have come to be exponential in recent years, leading to a precipitous and powerful augmentation of our knowledge of intricate genetic foundations of many cardiovascular diseases. It is becoming quite challenging for practitioners and investigators alike to keep abreast of the clinically ever more accessible and affordable contemporary diagnostic developments in cardiovascular genetics. Technical advances in NGS with advanced laboratory tools, databases, and analytical software threaten to outpace our ability to comprehend and interpret dependably, in the clinical or laboratory setting, the implications and ramifications of new research findings.

Future studies, integrating the current wealth of genome-wide technologies, are poised to bring us clinically advantageous information on whether and how multiple variable factors are collaborating to modify human disease and disease risk. Implementing such genomic findings in cardiology practice may well lead rapidly to better diagnosing and managing of common cardiovascular conditions, such as the genetic cardiomyopathies and heart failure. The adoption of new methods can admittedly be unjustifiably slow at times. However, the utility of whole-genome sequencing of personal genomes in clinical practice will rise sharply as more and more genomic variations linked up with clinically significant disease phenotypes become determined and corroborated, making it cost-effective in the process to diagnose multiple diseases in tandem. Cardiovascular specialists are still a long way from submitting their patients’ full genomes for sequencing, not because the price is prohibitive, but because the data are difficult to interpret properly. This survey has aimed at addressing the need for fundamental understanding in this burgeoning field, which is desirable in order to provide high-grade, personalized care for patients and their families in the post-genomic era.

Compliance with Ethical Standards

Sources of Funding

Research support, for work from my Laboratory surveyed here, was provided by National Heart, Lung, and Blood Institute, Grant R01 HL 050446; National Science Foundation, Grant CDR 8622201; and North Carolina Supercomputing Center and Cray Research.

Conflict of interest

I declare that I have no conflict of interest, whatsoever.

Ethical approval

All procedures performed in studies involving human participants that are reviewed here were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments. All applicable international, national, and/or institutional guidelines for the care and use of animals in studies involving animals that are reviewed here were followed.

Ares Pasipoularides

Supplementary material

12265_2015_9658_MOESM1_ESM.docx (22 kb)
Glossary, ESM (DOCX 21 kb)
12265_2015_9658_MOESM2_ESM.docx (12 kb)
Supplementary Table 1 (DOCX 12 kb)

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Duke University School of Medicine, Durham, USA
  2. Duke/NSF Research Center for Emerging Cardiovascular Technologies, Durham, USA
  3. Department of Surgery, Duke University School of Medicine, Durham, USA
