Behavioural scientists are often interested in understanding how human traits, such as intelligence, educational attainment, depression and anxiety, are influenced by inherited and environmental factors. Since long before the advent of human genomics, researchers have relied on twin studies to answer these questions. The classical twin design compares trait resemblances in monozygotic (MZ) twins to those in dizygotic (DZ) twins. Because MZ twins are genetically identical and DZ twins share 50% of their genes on average, any additional similarity between identical twins should be related to genes, provided that the twins share the same environments1,2.

Researchers can apply the classical twin design to estimate the influence of genes for any trait they can measure. They can also extend the classical twin design to model causal relations between exposures and traits, to describe differences within discordant twin pairs, to estimate genetic and cultural components of inheritance, and to account for measured genetic variables and environmental exposures.

In this Review, we examine the use of twin studies as researchers become interested in new phenotypes, including those based on omics technologies. We reflect on methods for achieving progress in fields that have traditionally valued twin studies of individual differences, and we explore which fields offer new opportunities for twin research and how insights from other disciplines can inform twin studies. We begin by introducing the classical twin design. We then summarize concordance and discordance in twin pairs for major psychiatric and somatic disorders, arguing that no other design is as informative for studying penetrance and genetic prediction of disease risk. Next, we turn to assumptions and advances in research methods and the study of twinning as a phenotype, which is important for the mothers of twins and for female fertility and may inform developmental biology. We conclude by discussing existing knowledge gaps in twin studies.

Classical twin design and twin concordance and discordance

Classical twin design

The classical twin design compares the resemblance of MZ and DZ twins for univariate or multivariate traits to estimate heritability and genetic correlations among phenotypes (a glossary of key terms is provided in Box 1). It decomposes phenotypic variance and covariance among traits into genetic and non-genetic components, on the basis of the biometrical model P = G + E, where P, G and E represent univariate or multivariate phenotypes, individual genetic values and environmental deviations, respectively3. P is measured, whereas G and E need not be observed, but their influence is estimated by comparing MZ and DZ twins, as illustrated in Fig. 1. The influence of the genotype can be summarized as the heritability of a trait: the proportion of variance due to genetic factors.

Fig. 1: Path diagram of a twins-and-parents design plus PGSs to disentangle genetic and cultural transmission.

The circles denote latent (that is, unmeasured) variables, and the squares denote measured variables. To ease the presentation, variances for latent variables that are constrained at 1 are not drawn. The model as depicted is for DZ twins.

Twin resemblance can be quantified as concordance rates or as correlations. Concordance and discordance refer to the degree of similarity or difference between twins in terms of their phenotypes. Concordance is typically applied to dichotomous phenotypes, such as the presence of a disorder, while correlations are used to summarize twin resemblance for continuous traits. Similar concordance rates in MZ and DZ twin pairs suggest a larger role of environmental factors, while a higher concordance in MZ than DZ twin pairs suggests stronger genetic effects. For continuous traits, we double the difference between the MZ and DZ correlations to obtain a first estimate of heritability (that is, the proportion of variance explained by genetic factors). For dichotomous traits, the relation between twin concordance and heritability is more complex because it also depends on the prevalence of the disorder in the population.
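The doubling rule for continuous traits can be sketched in a few lines of Python. This is a minimal illustration, not a substitute for model fitting: the function name and the example correlations are hypothetical, and the shared- and unique-environment lines use the standard back-of-envelope formulas (2rDZ − rMZ and 1 − rMZ), which are an assumption added here rather than something stated in the text.

```python
def falconer_estimates(r_mz: float, r_dz: float) -> dict:
    """First-pass variance components from MZ and DZ twin correlations."""
    h2 = 2.0 * (r_mz - r_dz)   # heritability: twice the MZ-DZ difference
    c2 = 2.0 * r_dz - r_mz     # shared environment (assumed standard formula)
    e2 = 1.0 - r_mz            # unique environment plus measurement error
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical correlations, chosen only for illustration.
est = falconer_estimates(r_mz=0.60, r_dz=0.35)
print(est)
```

The three components sum to 1 by construction; estimates outside the 0 to 1 range signal that the simple additive model does not fit the observed correlations.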

Twin concordance and discordance

Figure 2a presents an overview of estimates of proband-wise twin concordance rates (or case-wise with full ascertainment)4 in complex diseases, obtained either by computing proband-wise concordance from published data or by contacting the authors. To estimate heritability, researchers cannot rely solely on the difference in concordance rates between MZ and DZ twins; twin resemblance must be estimated on the underlying liability scale using tetrachoric correlations5,6. Figure 2b (based on Smith7) shows the relation between concordance, prevalence of a disorder in the population and heritability on the liability scale. The figure specifies this relation for prevalences ranging from 0.01% for rare traits to 80% for common ones. It is especially striking that for traits with a low prevalence and a high heritability, most MZ twins will be discordant.

Fig. 2: Concordance rates in twins.

a, Proband-wise MZ and DZ twin concordance rates. References for the studies and the non-rounded concordance rates are provided in Supplementary Table 2. b, Expected concordance rates in MZ twins based on Smith7 detailing the relation between twin concordance, disorder prevalence and disorder heritability on the liability scale. The expected MZ twin concordance, prevalence and heritability for the highlighted disorders are provided in Supplementary Table 3.
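The dependence of concordance on prevalence and liability-scale heritability can be sketched numerically. The code below is an illustrative liability-threshold calculation, not the procedure used to produce Fig. 2b: it assumes that both twins' liabilities are standard normal, that a twin is affected when liability exceeds the threshold implied by the prevalence, and that the MZ liability correlation equals the heritability (that is, no shared environment). The function name and input values are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probandwise_concordance(prevalence: float, liability_r: float,
                            steps: int = 4000) -> float:
    """Expected proband-wise concordance under a liability-threshold model."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    if not 0.0 <= liability_r < 1.0:
        raise ValueError("liability_r must lie in [0, 1)")

    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    resid_sd = math.sqrt(1.0 - liability_r ** 2)
    upper = max(threshold, 0.0) + 8.0  # density beyond this point is negligible

    def integrand(x: float) -> float:
        # Density of twin 1 at liability x, times the probability that
        # twin 2 exceeds the threshold given twin 1's liability.
        z = (threshold - liability_r * x) / resid_sd
        return STD_NORMAL.pdf(x) * (1.0 - STD_NORMAL.cdf(z))

    # Simpson's rule over [threshold, upper]; needs an even number of steps.
    if steps % 2:
        steps += 1
    h = (upper - threshold) / steps
    total = integrand(threshold) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(threshold + i * h)
    both_affected = total * h / 3.0

    return both_affected / prevalence


for h2 in (0.2, 0.5, 0.8):
    print(h2, round(probandwise_concordance(prevalence=0.01, liability_r=h2), 3))
```

Under these assumptions, even a liability correlation of 0.8 at 1% prevalence yields an expected concordance well below 100%, in line with the pattern described for rare, highly heritable traits.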

These results provide valuable information for research and society. While genetic effects are evident from the higher concordance in MZ twins than in DZ pairs, concordance in MZ twins does not approach 100%, even for highly heritable traits such as schizophrenia8. Explaining why one person becomes affected and their twin with an identical genotype is protected involves identifying environmental factors, epigenetic mechanisms and DNA mutations that occurred in one twin but not in the co-twin.

The extent to which the concordance rates in MZ twins are lower than 100% conveys how predictive the genotype will be for individual outcomes. An individual’s genetic liability to a disorder can be seen as providing a baseline, with other factors influencing whether that liability ends up passing the threshold for disease penetrance. Thus, twin concordance rates powerfully illustrate that even a strong genetic predisposition does not have to result in the development of a disease.

The genome is not deterministic. Epigenetic processes (including X-inactivation and responses to in utero exposures and the post-natal environment) may lead to differences in gene expression within MZ pairs. Epigenetic and gene expression differences have been found in MZ twins discordant for inflammation-related diseases (such as lupus9) and for psychiatric and neurodegenerative disorders (including schizophrenia10 and Alzheimer’s disease11). Differences in discordant MZ twin pairs have also been observed for proteomic12,13 and metabolomic profiles14,15 and in multi-omics studies16,17.

For biomarkers, including omics profiles, it remains challenging to establish whether twin discordance is a cause or consequence of the disease. Here MZ twin pairs are valuable to examine causality18, especially when combined with the longitudinal data in twin registries. In one example addressing the causes versus consequences of smoking behaviour19, longitudinal phenotype data were combined with gene expression profiles, and 132 differentially expressed genes were identified in current, former and never smokers. Nearly all genes (125) had reversible effects on gene expression levels. In 56 MZ pairs discordant for current smoking, only six genes were differentially expressed, but the effects for 75% of the genes in the discordant MZ twin pairs were in the same direction as in the total population.

For diseases that are more common at an older age, one needs to consider that the disease can still manifest at later follow-up. Careful consideration of the age composition of the sample is therefore required, as illustrated by studies investigating twin concordance rates for Parkinson’s disease20,21. When twins were studied at an average age of 74 years, researchers found similar concordance rates for late-onset Parkinson’s disease (>50 years of age) in MZ and DZ pairs. When they later searched the US National Death Index for all twin pairs with Parkinson’s disease in at least one twin, they found higher concordance among MZ than DZ twins at follow-up, and heritability was now estimated at 20% for late-onset Parkinson’s disease.

In an era of genetic testing services and polygenic scores (PGSs), the upper limit of genomic predictions must be defined and acknowledged. Researchers and clinicians may currently see risk prediction tools as having an upper limit set by trait heritability, but in fact predictions can go only as far as the MZ twin concordance rates.

Assumptions of the classical twin design

In its most basic application, the classical twin design assumes random mating in the parents of twins, and that environmental factors that influence a trait are similar for MZ and DZ twins so that any differences in the similarity of the twins’ traits can be attributed to genetic factors. This so-called equal environment assumption has been tested numerous times and holds for most phenotypes22. Researchers can test assumptions regarding equality of means and variances in MZ and DZ twins and the absence of interactions or correlations between genes and environment23 with appropriate data and methods, and deviations from these assumptions may provide new insights, as we discuss below.

Another assumption entails that twins are representative of the population to which heritability and other estimates apply. Twins and singletons are similar in most respects24,25, even though twins are born earlier on average and grow up with a sibling of the same age. Twins and non-twins have similar biomarker profiles and diseases26, cognitive functions27, health behaviours28, personality29 and psychopathology30. Despite early hypotheses that twins from DZ pairs are more often chimeric than singletons, a large study in twin-family pedigrees found this not to be the case31. Although twins are born earlier on average and have lower birth weight32 and body mass index (BMI)33,34, genetic correlations of twin and singleton birth weights are not significantly different from each other35, meaning that the same genes influence birth weight in twins and singletons. Consequently, data from twins can be combined with other population-based results or included in population-based genome-wide association studies (GWASs) to increase the sample size. As David Lykken stated in his 1982 presidential address to the Society for Psychophysiological Research, “Twins are probably more representative of the general population than any other group … This representativeness is even more true of the families of twins”36.

DNA sequence differences within MZ twin pairs

Post-zygotic (that is, after fertilization) mutations occurring in one twin of an identical pair can lead to genetic and phenotypic differences between them. However, if such mutations occur early in embryonic development, that is, before a split occurs, both twins will have the mutation. In a DNA sequencing project of 381 Icelandic twin pairs and two triplet sets, 39 pairs differed by over 100 mutations, while 38 pairs did not differ at all37. The median number of post-zygotic mutations differing within twin pairs was 14, with a higher estimate of 48 for high-coverage pairs. In 78 parent–offspring trios from the same population, 4,933 de novo mutations were reported, that is, an average rate of 1.2 × 10⁻⁸ per nucleotide per generation38. A review also put 1.18 × 10⁻⁸ per position as the best estimate of the average human germline mutation rate, corresponding to 74 novel single nucleotide polymorphisms (SNPs) and approximately three novel structural variants39 per genome per generation. De novo mutations are enriched in coding and regulatory regions of the genome40. Thus, it is not surprising that de novo mutations account for a substantial component of some rare genetic syndromes and that mutations in one twin but not the other can lead to discordance between them. For highly polygenic traits, individual SNP effects are very small. If only a few SNPs are discordant per pair, the effects of these are negligible. While it is likely that de novo mutations do not impact twin-based heritability estimates of polygenic traits, we should be cautious when investigating rare genetic disorders that involve only a small number of genes, and particularly disorders that involve de novo mutations.
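The arithmetic linking the per-position mutation rate to the per-genome count can be checked with a short calculation. The diploid genome size used below (about 6.27 × 10⁹ positions) is an assumption chosen for illustration and is not given in the text; the rate and the trio counts are taken from the paragraph above.

```python
def expected_de_novo(rate_per_position: float, diploid_positions: float) -> float:
    """Expected number of new point mutations per genome per generation."""
    return rate_per_position * diploid_positions


# Assumed number of positions in a diploid human genome (illustrative only).
DIPLOID_POSITIONS = 6.27e9

# Mean number of de novo mutations per trio in the Icelandic data.
mean_per_trio = 4933 / 78

# Count implied by the review's best-estimate rate.
expected_snvs = expected_de_novo(1.18e-8, DIPLOID_POSITIONS)

print(round(mean_per_trio, 1), round(expected_snvs))
```

The observed per-trio mean is somewhat lower than the count implied by the rate, which is expected if only part of the genome can be reliably assessed in sequencing data.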

DNA sequence differences between MZ twins can also inform on the timing of developmental mutations. Germ cells are specified around the third week after fertilization, a process referred to as primordial germ cell specification (PGCS). If post-zygotic de novo mutations are present in both the soma and germline of a twin, we can classify them as pre-PGCS. Both twins can share pre-PGCS mutations, or these can be present in just one twin, indicating whether mutations occurred before or after the twinning split. When post-zygotic mutations are present in the soma or the germline of both a proband twin and their offspring, we can classify them as post-PGCS mutations.

Methodological developments in twin studies

The classical twin design can be extended to explore gene–environment interactions and correlations. Gene–environment interaction (G×E) can be conceptualized as moderation, where genes moderate the environmental effects or where the environment moderates or controls genetic effects. Gene–environment correlation (rGE) describes a correlation between G and E in the model P = G + E and is sometimes viewed as genetic control over exposure to different environments. There is ample theoretical work on gene–environment interactions but relatively little empirical work. In contrast, several developments have stimulated the study of rGE. These involve adding measured G or E to the design, extensions to longitudinal data and the addition of family members, such as the parents or the offspring of twins.

Gene–environment correlation

Detecting rGE in the twin design can be achieved by including measured genetic or environmental variables. For instance, if the phenotype is depression, marital status may be a relevant measured environment, and a PGS for depression a relevant genetic variable. If genetic factors influence depression and marital status, this gives rise to rGE. Obviously, this approach requires a prior expectation about which environmental variable to assess. By including a measured genetic variable—for example, a PGS for depression—we can estimate the correlation between the shared family environment and all genetic factors that influence depression41. This approach is exploratory, as it can detect rGE but does not require prior hypotheses concerning the environmental influences involved in the rGE.

An alternative approach to studying rGE in the classical twin design is to analyse longitudinal data42. Given repeated measures, rGE can be modelled by estimating the regression of the latent environmental variable at time t (E_t) on the phenotype at time t − 1 (P_{t−1}). This regression implies rGE through the following chain: the genotype at t − 1 (G_{t−1}) influences the phenotype at t − 1, which in turn influences E_t. P_{t−1} thus mediates the relationship between G_{t−1} and E_t, giving rise to rGE (Fig. 3). This influence of the phenotype on the environment may arise if children actively seek out or create environmental circumstances that match their genotype. Because the child plays an active role, this is called active rGE or ‘niche-picking’. This may apply to intelligence and musicality, for example. Alternatively, the behaviour of the child may elicit responses from persons in the environment, called evocative rGE. Examples are highly outgoing children, who encourage social interaction, or aggressive children, who may discourage social interaction and elicit corrective parenting.

Fig. 3: Path diagram for a time-dependent process.

The model is shown for a pair of twins, whose longitudinal phenotypes are influenced by their genotype and the shared and unshared environment. The latent genotype and the shared and unshared environment can be influenced by earlier times (for example, stable genetic influences) and by innovations (for example, new expressed genes). The red arrows show phenotype-to-environment transmission, which leads to GE correlation.
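The way phenotype-to-environment transmission induces a gene–environment correlation can be illustrated with a small simulation. This is a deliberately simplified sketch for unrelated individuals rather than twin pairs: it uses two time points, a stable genotype with unit variance and unit-variance environmental innovations, all of which are assumptions made for illustration. The parameter name beta_pe (the regression of the current environment on the earlier phenotype) is hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_rge(beta_pe: float, n: int = 5000, seed: int = 1) -> float:
    """Correlation between genotype and the later environment."""
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(n)]        # stable genotype
    e_prev = [rng.gauss(0, 1) for _ in range(n)]   # environment at t - 1
    p_prev = [gi + ei for gi, ei in zip(g, e_prev)]  # phenotype at t - 1
    # Environment at t: partly shaped by the earlier phenotype, plus innovation.
    e_now = [beta_pe * p + rng.gauss(0, 1) for p in p_prev]
    return pearson(g, e_now)


print("no transmission:  ", round(simulate_rge(0.0), 2))
print("with transmission:", round(simulate_rge(0.5), 2))
```

With beta_pe set to zero the genotype and the later environment are uncorrelated apart from sampling noise; any non-zero value produces a correlation because the earlier phenotype carries genetic information into the environment.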

Simultaneous environmental and genetic transmission across generations also leads to correlations between genotype and environment in offspring, as their genotype depends on the parental genotypes (genetic transmission), and their environment (partly) depends on the parental genotypes, as mediated by the parental phenotype (cultural transmission). This form of rGE is called passive rGE, as the twin offspring are the (passive) recipients of parental genes and an environment (partly) created by parents. The environment that parents create depends on the genotype they transmit to their offspring and on the part that is not transmitted. The parent–twin design can incorporate the effects of the non-transmitted genotypes on offspring outcomes43 (depicted in Fig. 1).

In the children-of-twins design, which involves MZ and DZ twin pairs as parents, researchers compare the resemblance between parents and their own children to the resemblance between offspring and their parent’s co-twin44,45,46. In both the children-of-twins design and the parents-of-twins design, researchers estimate the effect that remains after accounting for direct genetic transmission, which induces a correlation between genotype and the shared environment.

A new way of modelling parental transmitted and non-transmitted genotypes is structural equation modelling combined with PGSs (SEM–PGS), which disentangles parental genetic and environmental effects on offspring traits while controlling for assortative mating, thereby providing unbiased estimates of genetic transmission47. Researchers can use SEM–PGS with trio data but also with twin pairs48. This way, they can study parental effects by combining both GWAS and twin-based designs.

The outcomes of intergenerational twin studies can help to inform intervention approaches49 and avoid parent-blaming50. An example can be seen in cases in which a child is struggling in school, and the teacher blames the child’s parents for not providing a suitable home environment or not being involved in their child’s education. In a parent–twin study of offspring years of education, there was a direct effect of parental genes that were inherited by the twin offspring51, making it difficult to hold the parents responsible for passing on their genes. The indirect (genetic nurturing) effect of parental genes that were not transmitted to offspring weakened after controlling for parental IQ or socio-economic status, implying that the genetic nurturing effect on education reflects family socio-economic status rather than any deliberate parenting behaviour51. Genetic nurture effects are often interpreted as parenting effects but may also reflect the broader family environment.

Gene–environment interaction

Researchers can investigate G×E by comparing heritability estimates across different environments. A recent example compared heritability before and during the first COVID-19 lockdown in the Netherlands (gene-by-crisis interaction). It showed a slight increase in heritability estimates for optimism and meaning of life during the first months of the pandemic and lower-than-unity genetic correlations across time (0.75 and 0.63), implying G×E, with both quantitative and qualitative differences in genetic influences52.

Purcell proposed a general method for modelling changes in heritability as a function of a measured moderator, which may itself be heritable53. Some caution is needed if such moderators are themselves correlated between twins, but several solutions exist54. Molenaar et al. developed an alternative approach that does not require a measured moderator to estimate G×E55. Their method exploits the fact that G×E will introduce heteroscedasticity56,57. By modelling environmental variance conditional on the genetic variance, they test for departures from bivariate normality in twin data. This method can explicitly estimate G×E without a priori hypotheses about specific interacting genotypes or environments.

Interaction among family members

Individuals can contribute to the environment of their family members, including twins and their co-twins44,58, and these effects can be either competitive or cooperative59. Under competition, the higher the expression of a trait in one sibling, the less it will be exhibited in the other sibling. Under cooperation, a trait expressed more strongly in one sibling will also be exhibited more in the other sibling. The presence of sibling interaction may be optimally detected in twin designs. Here, violation of the equal variance assumption in the two zygosity groups implies either cooperation or competition: inflated phenotypic variances and increased twin correlations imply cooperation, while deflated variances and decreased DZ correlations relative to MZ correlations suggest competition. Both inflation and deflation act more on the variance of MZ twins than that of DZ twins, thereby making the twin design a powerful tool to detect and distinguish competitive and cooperative interactions.
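The effect of sibling interaction on variances and correlations can be made concrete with a small calculation. The sketch below assumes a linear reciprocal-interaction model in which each twin's phenotype equals their own latent score plus s times the co-twin's phenotype; this specific model, the function name and the input values are assumptions for illustration and are not spelled out in the text. Solving the two equations gives closed-form expressions for the phenotypic variance and twin correlation.

```python
def interaction_moments(s: float, r: float, base_var: float = 1.0):
    """Phenotypic variance and twin correlation under sibling interaction.

    s: interaction path (positive = cooperation, negative = competition)
    r: twin correlation that would be observed without interaction
    """
    if abs(s) >= 1.0:
        raise ValueError("interaction path must lie strictly between -1 and 1")
    denom = (1.0 - s ** 2) ** 2
    variance = (1.0 + s ** 2 + 2.0 * s * r) * base_var / denom
    covariance = (2.0 * s + r * (1.0 + s ** 2)) * base_var / denom
    return variance, covariance / variance


# Hypothetical no-interaction correlations: 0.6 for MZ and 0.3 for DZ pairs.
for label, s in (("cooperation", 0.2), ("competition", -0.2)):
    var_mz, r_mz = interaction_moments(s, 0.6)
    var_dz, r_dz = interaction_moments(s, 0.3)
    print(label, round(var_mz, 3), round(var_dz, 3), round(r_mz, 3), round(r_dz, 3))
```

In this sketch the MZ variance moves further from its baseline of 1 than the DZ variance under both cooperation and competition, which is the variance signature described above.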

Multilevel twin models

Data from twins are naturally clustered because they include individuals from the same family unit. These family units are further nested in higher-level clusters such as geographic region, neighbourhood or school. If both twins in a pair share higher-level clustering variables, their effects will be reflected in the estimate for the common environment in the classical twin design. In multilevel twin models, variance is decomposed into within-pair variance on the first level (that is, representing differences between twins within a pair), between-family variance on the second level (that is, differences between pairs or families) and the variance of a higher-level clustering variable on the third level (for example, differences between geographic regions)60. For instance, multilevel twin modelling can clarify the relationship between geographic regional clustering and ancestry, as seen in regional clustering for height in seven-year-old Dutch twins, where, after adjusting for ancestry, the variation in height was no longer explained by region61. Hence, regional clustering effects may mask genetic ancestry effects, which can be disentangled in multilevel twin models.

Causal modelling in twin designs

Randomized controlled trials are considered the gold standard for establishing causality, but these are often not feasible or ethical in health and behavioural research. In such cases, researchers can turn to genetic designs such as Mendelian randomization (MR) and the direction of causation (DOC) twin models to study causality. MR uses an observed genetic variable (for example, SNP or PGS) as an instrumental variable that is assumed to correlate with the exposure variable, to be independent of confounding variables and to influence the outcome only through the exposure62. However, MR analysis with SNPs has low statistical power, and PGSs will often be unfeasible due to genetic pleiotropy. The bivariate DOC twin model was proposed for twin data to test causal relationships between two variables. Depending on whether trait A causes trait B or vice versa (that is, the direction of causation), the model predicts different cross-twin cross-trait correlations. The DOC model works only in situations in which two variables have different modes of inheritance, for example, when one trait is influenced by the shared environment while the other is mainly influenced by genes63. Combining MR with the DOC twin approach in an MR–DOC model yields a causal test that does not require the mode of inheritance to differ for the two variables and relaxes several critical MR assumptions concerning pleiotropy. If PGSs of twins are used as instrumental variables in a bivariate classical twin design, the PGSs can correlate with both predictor and outcome64,65. Combining DNA-based methods with twin-based designs can thus overcome the limitations of individual methods. While no single approach is free of bias, triangulation can help to address the limitations of methods and facilitate replicable and reproducible results.
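The logic of using a genetic variant as an instrumental variable can be illustrated with a simulation. The sketch below uses the Wald ratio (covariance of instrument and outcome divided by covariance of instrument and exposure), a standard instrumental-variable estimator that is named here as an assumption since the text does not specify an estimator. All effect sizes, the allele frequency and the sample size are hypothetical, and the simulated SNP has no pleiotropic effect on the outcome.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def simulate_mr(true_effect: float = 0.3, n: int = 20000, seed: int = 7):
    """Return (confounded regression slope, Wald ratio estimate)."""
    rng = random.Random(seed)
    snp, exposure, outcome = [], [], []
    for _ in range(n):
        z = int(rng.random() < 0.3) + int(rng.random() < 0.3)  # allele count 0-2
        u = rng.gauss(0, 1)                                    # unmeasured confounder
        x = 0.5 * z + u + rng.gauss(0, 1)
        y = true_effect * x + u + rng.gauss(0, 1)
        snp.append(z)
        exposure.append(x)
        outcome.append(y)
    naive = covariance(exposure, outcome) / covariance(exposure, exposure)
    wald = covariance(snp, outcome) / covariance(snp, exposure)
    return naive, wald


naive_slope, wald_ratio = simulate_mr()
print("naive regression slope:", round(naive_slope, 2))
print("Wald ratio estimate:   ", round(wald_ratio, 2))
```

The naive slope is inflated by the confounder, whereas the Wald ratio recovers a value close to the simulated causal effect; the wide sampling error of the ratio with a single SNP reflects the low power noted above.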

The added value of twins when decomposing heritability

The classical twin design can be combined with genome-based restricted maximum likelihood methods to simultaneously estimate SNP heritability and pedigree-based heritability66,67. SNP heritability refers to the contribution of measured SNPs to trait heritability. Pedigree-based heritability captures heritability on the basis of the known relationships among not-too-distantly related family members. Here, closely related family members are typically defined as cousins two or three times removed or closer. This approach is referred to as threshold genetic relationship matrix (GRM) and relies on two GRMs: one GRM including all genetic relationships, and one where the genetic relationship among distantly related pairs of individuals is set to zero to capture pedigree-associated variation. The total heritability is obtained by summing the estimates as obtained from both GRMs (Fig. 4). By including additional GRMs that comprise specific SNPs, researchers can extend the threshold GRM approach to capture the contributions of functional or biologically relevant elements68. For example, a four-GRM approach was used to obtain heritability estimates for known metabolite loci and distinguish between metabolite classes69. Besides the GRMs in the threshold approach, from which all metabolite loci were removed, this study included a third GRM to capture the metabolite loci of a specific metabolite class and a fourth GRM to include all metabolite loci for all other metabolite classes.

Fig. 4: The threshold GRM approach.

Two GRMs (orange boxes) are the basis to obtain two heritability estimates: SNP heritability and pedigree-based heritability. The two estimates sum to the total heritability. In the GRM giving rise to the pedigree-based heritability, only relations among third or fourth cousins (who share 2.5% of their genetic material—that is, 0.025) or among second cousins (0.05) or more closely related family members are retained.
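The thresholding step that produces the second GRM can be sketched as follows. This is an illustration of the matrix operation only, using plain Python lists and a tiny made-up matrix; the function name and the example values are hypothetical, and the variance-component estimation itself (which requires dedicated software) is not shown. The default cutoff of 0.05 is one of the two values mentioned in the caption of Fig. 4.

```python
def threshold_grm(grm, cutoff: float = 0.05):
    """Copy of a GRM in which off-diagonal entries below the cutoff are zeroed."""
    n = len(grm)
    return [
        [grm[i][j] if (i == j or grm[i][j] >= cutoff) else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical GRM: individuals 0 and 1 are siblings, individual 2 is unrelated.
grm_all = [
    [1.00, 0.50, 0.010],
    [0.50, 1.00, -0.002],
    [0.010, -0.002, 1.00],
]

grm_pedigree = threshold_grm(grm_all)
for row in grm_pedigree:
    print(row)
```

Fitting both matrices jointly lets the full GRM capture variation tagged by SNPs among all individuals, while the thresholded GRM captures the additional variation shared by close relatives.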

Advances in health and behavioural phenotypes


DZ and MZ twinning have different biological mechanisms. DZ twins result from two separate fertilizations and have different placentas and fetal membranes. MZ twins, in contrast, come from a single sperm and egg and separate within two weeks of fertilization. MZ twins may have shared or separate amnions and chorions70. DZ twinning has a genetic component71,72,73, but estimates of heritability for being a DZ twin mother are remarkably scarce. Duffy and Martin estimated the heritability of twinning in historical pedigrees to lie between 8% and 20%74. Because they could not distinguish between MZ and DZ twins, the true heritability may have been underestimated, but their estimate was consistent across time (8–19 generations) and ancestries (West Africa, Europe and Canada).

The first GWAS for being a mother of natural DZ twins identified two genes, which replicated in the deCODE Icelandic database75—namely, follicle-stimulating hormone beta subunit (FSHB) and SMAD family member 3 (SMAD3)—with risk alleles increasing the frequency of twin births in Iceland by 18% and 9%, respectively. Whereas FSH genes are strong candidates for DZ twinning, the finding for SMAD3, which is expressed in the human ovaries, was new. SNPs associated with DZ twinning have an impact on several reproductive traits, including increased fertility (having more children), decreased risk of polycystic ovary syndrome, earlier onset of menstruation, earlier natural menopause and giving birth to the first child at a younger age.

By contrast, the genetic aetiology for MZ twinning remains unclear, but van Dongen et al. discovered a strong epigenetic signature for MZ twinning comprising 834 differentially methylated positions in adult somatic tissues76. The loci were enriched for putative metastable epi-alleles, which are epigenetic marks or modifications that occur during early development. This lifelong signature opens up new avenues to investigate the vanishing MZ twin syndrome (that is, the disappearance of an embryo during early pregnancy77) and congenital disorders that have an overrepresentation of MZ twins78.

Phenotypes of increasing interest

Twin studies are of interest not only to psychology, biology and medicine but also to fields such as political science, sociology and economics79,80. In economics, twin studies have demonstrated genetic effects in classic economic paradigms such as the trust game, which manipulates participants’ willingness to invest money and to reciprocate others’ trust81. In political science, twin studies have shown that genetic contributions to political ideology are largely independent of the chosen measurement, time or population82. Inside and outside of these fields, topics previously studied become relevant again, and new ones arise. For example, during the global COVID-19 pandemic, a study in twins estimated a heritability of 31% for COVID-19 on the basis of self-reported symptoms83.

The outcomes of twin studies can benefit interventions. As genetic influences can change over a lifetime, the longitudinal nature of twin registers can inform policy decisions aimed at a specific age group84. For example, because obesity often starts in childhood85 and twin studies show that the heritability of BMI increases throughout childhood86, public health interventions might be most beneficial earlier in childhood rather than later in life. We may see this process as a recursive loop87 where policies can inspire new questions for researchers, and results from research can lead to new policies. Society and its political, economic and social agents influence funding decisions, thus making twin studies relevant to a wide audience. This includes economists looking at the costs and benefits of policy decisions, sociologists studying the effects of policy decisions in society and political scientists studying the entire process of policy development.

Scientists develop interests in phenotypes that become accessible through technological advances. Digital fingerprinting, the unique trace of a person’s activity on the internet88,89, is one example of such a phenotype. Language used on Facebook can predict depression symptoms90, and more complete digital fingerprints may serve as predictors for other health and behavioural outcomes. However, researchers do not know whether digital fingerprints in MZ twins are as similar as their physical fingerprints. One study found that genetic effects can fully explain the familial resemblance in the frequency of internet use and that genetic and environmental factors account for different aspects of use91. It seems plausible that variance in digital fingerprints will also have a heritable component. If MZ twins have highly similar digital fingerprints (more so than DZ twins or siblings), this also creates options for the application of the classical twin design to nationwide register data, as these registers contain information on the entire population of twins but often lack information on zygosity for same-sex pairs.

One proposed approach to handling unknown zygosity combines twin and sibling data from registers92. Male–female pairs are necessarily DZ, whereas same-sex pairs can be MZ or DZ. As there are approximately equal numbers of opposite- and same-sex DZ twin pairs, the number of opposite-sex pairs tells us what proportion of same-sex pairs are DZ. As same-sex DZ twins and same-sex siblings have the same degree of genetic similarity, we can use the ratios of concordant and discordant pairs among the latter as weighting factors to determine the number of concordant and discordant pairs for same-sex DZ and MZ twin pairs. This approach has been used to estimate the heritability of attention deficit hyperactivity disorder, as inferred from ICD-10 diagnoses and drug prescriptions, in German health insurance data (0.77 for females and 0.88 for males). These estimates closely resemble those obtained from classical twin studies. Thus, with only the sex and diagnostic status of twin and sibling pairs, researchers could estimate the contribution of genetics.
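The counting logic of this approach can be illustrated with a minimal Python sketch. It is a simplified reading of the description above, not the published method: the names `PairCounts`, `split_same_sex_twins` and `probandwise_concordance` are hypothetical, the counts are invented for illustration, and the sketch ignores the sex-specific analyses and the liability modelling that an actual heritability estimate would require.

```python
from dataclasses import dataclass


@dataclass
class PairCounts:
    """Numbers of pairs by diagnostic status (hypothetical container)."""
    concordant: float  # both members diagnosed
    discordant: float  # exactly one member diagnosed
    unaffected: float  # neither member diagnosed

    @property
    def total(self) -> float:
        return self.concordant + self.discordant + self.unaffected


def split_same_sex_twins(same_sex_twins, n_opposite_sex_twins, same_sex_siblings):
    """Return (estimated DZ counts, estimated MZ counts) for same-sex twin pairs."""
    # Assumption: same-sex DZ pairs are about as numerous as opposite-sex pairs.
    n_dz = min(n_opposite_sex_twins, same_sex_twins.total)
    sib_total = same_sex_siblings.total
    # Assumption: same-sex siblings show the same distribution of diagnostic
    # status as same-sex DZ twins, so their proportions act as weights.
    dz = PairCounts(
        n_dz * same_sex_siblings.concordant / sib_total,
        n_dz * same_sex_siblings.discordant / sib_total,
        n_dz * same_sex_siblings.unaffected / sib_total,
    )
    # The remaining same-sex pairs are attributed to MZ twins (clamped at zero).
    mz = PairCounts(
        max(same_sex_twins.concordant - dz.concordant, 0.0),
        max(same_sex_twins.discordant - dz.discordant, 0.0),
        max(same_sex_twins.unaffected - dz.unaffected, 0.0),
    )
    return dz, mz


def probandwise_concordance(pairs):
    """Probandwise concordance: 2C / (2C + D)."""
    affected = 2 * pairs.concordant + pairs.discordant
    return 2 * pairs.concordant / affected if affected else float("nan")


# Invented counts, for illustration only.
twins_same_sex = PairCounts(concordant=60, discordant=140, unaffected=1800)
siblings_same_sex = PairCounts(concordant=20, discordant=180, unaffected=1800)
dz_est, mz_est = split_same_sex_twins(twins_same_sex, 1000, siblings_same_sex)
print("MZ concordance:", round(probandwise_concordance(mz_est), 3))
print("DZ concordance:", round(probandwise_concordance(dz_est), 3))
```

A higher estimated concordance in MZ than in DZ pairs points to genetic influence. Converting such concordances into a heritability estimate requires an additional model that this sketch does not cover.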

Neural and symptom networks

Twin studies can shed light on the genetic influence on the development, prognosis and comorbidity of neuropsychological and psychopathological disorders by analysing the network parameters of these disorders93,94. Researchers investigate the genetics of nodes and edges in graph-theory-based networks or in network analyses of psychopathological phenotypes. Network parameters, such as the efficiency of the structural brain network (derived from magnetic resonance imaging data), have strong genetic correlations with intelligence95,96 and schizophrenia97. In networks of structural and functional brain connectivity, the heritability of network parameters ranges from 0.05 to 0.74 depending on participants’ age and the analytical method used (Supplementary Table 1). In networks for psychopathological disorders, central nodes (nodes with a relatively large number of connections to other nodes in the network) generally have higher heritability98.

Applications of network analyses face numerous challenges, including a lack of criteria for identifying the completeness of networks, and the heterogeneity and replicability of networks99,100. Network configuration can vary for different patients and across time, challenging the generalizability and reliability of network approaches98. Twin studies can help to resolve such issues by considering twin resemblance for network parameters, where high correlations in MZ twins imply high reliability36.

Omics traits

Recent technological advancements in measuring genetic variation, transcriptomes, epigenomes, proteomes and metabolomes have enabled researchers to study heritability in twin cohorts through classical twin designs or a combination of classical twin designs and genome-based restricted maximum likelihood. Heritability estimates for gene expression from twin studies that characterize RNA transcripts in a cell by microarrays or sequencing101 differ by tissue. For example, adipose tissue has a higher average heritability (26%) than lymphoblastoid cell lines (21%) and skin tissue (16%)102. The average heritability of peripheral blood gene expression is between 10% and 20%, with mean heritability estimates significantly higher in RNA sequencing studies than those from microarray-expression data68,103.

Twin-based heritability estimates for DNA methylation vary by sex and age. DNA methylation, the most commonly measured epigenetic mechanism in epidemiological studies, involves the addition of a methyl group to the C5 position of cytosines in the DNA, which decreases the genome accessibility for transcription, thereby regulating gene expression104. On the basis of adult twin data, the heritability of blood DNA methylation across the genome was 19% on average, with common SNPs explaining on average 7% of the variance—that is, 37% of the total twin-based heritability105. Heritability for DNA methylation decreased with age, driven by an increase in environmental variance. A combination of multiple twin datasets across the lifespan (0–92 years) demonstrated an increase in familial correlations in MZ and DZ twins from birth to adolescence and a decrease after young adulthood106. The rate of change in familial correlations was similar in MZ and DZ pairs, which suggests that age affected the influence of environmental rather than genetic effects on DNA methylation across the lifespan.

Twin studies of proteomics and metabolomics generally report higher heritability estimates than those found for transcriptome and epigenome variables. One study in female twins estimated that genes accounted for 13.6% of the variance in plasma protein levels107. Post-translational modifications of proteins, particularly glycosylation (attachment of a glycan to a protein), are also heritable, with many N-glycans showing high heritability (>50%) in plasma108,109. Metabolites comprise a diverse set of small molecules (<1.5 kDa) involved in cellular metabolism, including amino acids, sugars and lipids110. Twin-based heritability estimates of blood metabolite levels cluster around 50% but can differ among metabolite classes111,112. Similar heritability estimates have been observed for urinary metabolites113,114.

To fully understand biological processes, we need to collect and analyse data from multiple omics domains together115. Simultaneous multi-omics modelling can complement single-omics analyses by accounting for the relationships among omics domains116. We anticipate that the fusion of twin designs with data from multiple omics domains will enhance the categorization or differentiation of diseases and the forecasting of biomarkers.

Concluding remarks

In today’s globalized and increasingly interconnected research world, collaboration and cooperation are key. Over 60 twin registers are in place worldwide, across multiple continents in 26 countries117, with many collaborating in the Collaborative Project of Development of Anthropometrical Measures in Twins (CODATwins)118, which amasses and shares data on height, BMI and size at birth from 54 twin projects from 24 countries. The results for height and BMI are remarkable, as the differences in heritability estimates were minor across twin cohorts from different cultural–geographic regions and in individuals from different birth cohorts or for whom height and BMI were obtained at different ages.

Existing registries cannot always answer novel research questions, and the replication of findings in understudied twin populations warrants the development of new twin registries119. In particular, African, Arab, Hispanic and other non-European populations are still underrepresented in twin research, as are some age groups. With the lengthening of average life expectancies and the rapid rise of aging populations, understanding the causes and effects of aging and age-related declines in health is vital120. While twin studies can make valuable contributions to these research topics, few current registries include twins of advanced age. Eventually, longitudinal registries may help by following their participants into advanced age, and establishing geriatric cohorts will accelerate research into aging. Another solution, which also tackles the attrition often observed in longitudinal studies, involves linking existing twin registers to electronic health records and other national registers.

Periconceptional and prenatal variables remain understudied121. In a large meta-analysis of heritability estimates in over 14 million twin pairs for over 17,000 traits122, fewer than 5% of the studies published between 1958 and 2012 investigated early life traits or childhood-onset disorders. The low heritability estimates and substantial shared environmental effects reported for several of these traits might make them especially interesting to study123,124. Researchers should ideally follow twins and their mothers from the pre-pregnancy period. This may seem difficult, as twin pregnancies can hardly be predicted, but many fertility clinics hold medical records on the conception of twins and have the potential to create such research resources.

For all phenotypes, we need to consider sample sizes when applying twin methods. Is larger always better, or can ‘better’ also involve phenotypes with improved reliability or designs that profit from unique events? An example would be the National Aeronautics and Space Administration twin study that enabled the investigation of the effect of long-duration spaceflight on the human body in one pair of identical twins125. While astronaut Mark Kelly stayed on Earth, his identical twin Scott spent a year in space. Spaceflight affected some bodily functions even six months after returning to Earth, including gene expression changes. Other changes, such as body weight and metabolite levels, were less persistent and returned to the pre-flight levels after a shorter time.

Twin studies have helped in understanding the influence of the environment by analysing traits that tend to be labelled as ‘environmental’, such as social environment, leisure-time activities and life events. With an average heritability of 49%, these ‘environmental’ variables are partially under genetic control126. The classical twin design is still one of the most powerful designs for distinguishing between unique and shared environmental influences. GWAS approaches are increasingly powerful for estimating heritability and the bivariate genetic correlations between traits but, in contrast to twin studies, they do not inform on the phenotypic or environmental correlations between phenotypes and do not account for the total heritability of the traits. Moreover, traits that show evidence of shared environment are predicted to show evidence of genetic nurture in designs with non-transmitted PGSs127. The results from twin studies can thus generate new hypotheses about genetic nurture.
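How the classical twin design separates shared from unique environmental influences can be illustrated with Falconer's approximation, which uses only the MZ and DZ twin correlations. The sketch below is a textbook simplification, assuming additive genetic effects, equal environments and no assortative mating. It is not the structural equation modelling that twin researchers typically apply, and the function name `falconer_ace` and the example correlations are hypothetical.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Approximate variance components from MZ and DZ twin correlations."""
    # Additive genetic variance (A): MZ pairs share all, DZ pairs on average
    # half, of their segregating genes, so the difference is doubled.
    a2 = 2 * (r_mz - r_dz)
    # Shared environment (C): MZ resemblance not explained by genes.
    c2 = 2 * r_dz - r_mz
    # Unique environment (E), including measurement error: whatever makes
    # genetically identical twins differ.
    e2 = 1 - r_mz
    return {"A": a2, "C": c2, "E": e2}


# Invented correlations, for illustration only.
components = falconer_ace(r_mz=0.8, r_dz=0.5)
print({name: round(value, 2) for name, value in components.items()})
```

In this approximation, a DZ correlation that exceeds half the MZ correlation indicates shared environmental influences, whereas an MZ correlation below 1 indicates unique environmental influences.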

Twin registers often have multigenerational genotype data and a wide variety of phenotypes and will continue to be essential contributors of data to GWASs, within-family designs, causality modelling, intergenerational transmission and longitudinal studies. Twin researchers mostly collect twin data, though interest in twin research by other disciplines is beneficial, as every discipline has its way of measuring constructs128. Diverse disciplines should collaborate to ensure clarity about the meaning of results, such as when a trait is reported to have high heritability129. Despite rapid technological advances enabling increasingly large-scale omics investigations for complex human traits and the renewed interest in family-based methods, twin designs have contributed to the major discoveries in behavioural genomics and will continue to do so130.