Introduction

The 2011 Roadmap for Mental Health and Wellbeing Research in Europe (ROAMER) [1,2,3] outlined six priorities for research in mental health. Three of these are the focus of CAPICE (Childhood and Adolescence Psychopathology: unravelling the complex etiology by a large Interdisciplinary Collaboration in Europe), an EU-funded Marie Skłodowska-Curie International Training Network project:

  1. Research into prevention, mental health promotion, and interventions in children, adolescents, and young adults;

  2. Focus on the development and causal mechanisms of mental health symptoms, syndromes, and wellbeing across the lifespan (including older populations);

  3. Develop and maintain international and interdisciplinary research networks and shared databases in the area of childhood and adolescence psychopathology.

To address these priorities, CAPICE brings together data from eight population-based birth and childhood cohorts, including twin cohorts, to focus on the causes of individual differences in childhood and adolescent psychopathology and its course (see Tables 1, 2). The CAPICE cohorts have collected longitudinal data on behavioral and emotional symptoms, lifestyle characteristics and environmental measures as well as genetic and epigenetic data (see Table 2), some also from parents. Data from other cohorts in the behaviour and cognition group of the EArly Genetics and Life course Epidemiology (EAGLE) consortium [4] can also be included in the studies, especially for genome-wide association meta-analysis (GWAMA). The EAGLE consortium is a collaboration of population-based birth, childhood and adolescent cohorts, including the cohorts participating in CAPICE. These cohorts are mainly based in Western Europe and Australia as, to our knowledge, no comparable cohorts exist in other countries. If such cohorts do exist, they are very welcome to participate.

Table 1 General descriptions of the CAPICE cohorts
Table 2 Summary of available data in the CAPICE cohorts: number of children with survey data and genome-wide association data (N GWA) and epigenetics data (N epi) at ages 3–5 years, 6–8 years, 9–11 years, 12–13 years, 14–15 years, and 16–18 years

Efforts to prevent and treat childhood psychopathology need to be informed by a clear understanding of the aetiology of mental disorders and of the factors that contribute to a chronic course. Owing to the genetically informative, longitudinal designs of the eight included cohorts, CAPICE is well positioned to address questions regarding the interplay of genetic and environmental factors in the development, course, and comorbidity patterns of child psychiatric conditions. Since CAPICE is an international training network, the analyses have been performed by 12 early-stage researchers (ESRs) under the supervision of senior researchers at academic and non-academic sites across five European countries (Italy, The Netherlands, Norway, Sweden, and the United Kingdom). In this paper, we describe the six CAPICE objectives, as well as the methodological approaches that have been used and the initial results. An overview of the CAPICE research programme is given in Fig. 1.

Fig. 1 Overview of the CAPICE research programme: the aims are to investigate the influence of genetic, epigenetic, and transcriptomic variants on mental health symptoms, the interplay with the environment, and how these influences depend on age. Ultimately, these results will be integrated into a model predicting the persistence of symptoms

Objectives

  1. Elucidate the role of genetic and environmental factors in mental health symptoms across childhood and adolescence, and to establish the overlap in genetic risk factors with other traits related to childhood mental health symptoms.

    The projects that are part of this objective focus on the investigation of the overlap in genetic risk across phenotypes and ages, to explain comorbidity and persistence, respectively. Moreover, analyses will focus on disentangling the role of genetic and environmental factors in associations between childhood mental health symptoms and parental phenotypes or risk factors early in life.

    Previous research suggests that both genetic and non-genetic factors play a role in the development and persistence of mental health symptoms and disorders. Depending on age, gender, and type of mental health symptoms, genetic factors explain between 40 and 80% of the variance in symptoms between individuals [5], indicating that the contributions of genetic and environmental factors vary across disorders. Identifying the mechanisms that underlie the persistence of symptoms and patterns of comorbidity is of particular importance in childhood psychopathology because of the potential impact of these disorders throughout the life course. Epidemiological studies have shown that about 50% of children with mental disorders still suffer from mental disorders in adulthood [6], including severe mental illnesses such as schizophrenia (SCZ) [7]. As comorbidity has been associated with a worse prognosis [6, 8], it is essential to understand the mechanisms behind it.

    It is also still unknown to what extent the associations between environmental factors and mental disorders are explained by causal processes, in which the environmental risk factor directly increases the risk for psychopathology, or by other processes. One example of the latter is gene–environment correlation, where the same genetic variants that influence a child’s risk for a mental disorder are also associated with the child’s risk of being exposed to an unfavourable environment.

  2. Identify genetic, epigenetic, and transcriptomic variants associated with mental health symptoms during childhood and adolescence.

    There has been considerable progress in the field of psychiatric genetics with over two hundred genomic regions significantly associated with SCZ [9], bipolar disorder (BD) [10], depression [11, 12], attention-deficit/hyperactivity disorder (ADHD) [13], and autism spectrum disorder (ASD) [14]. This has been achieved by large international collaborations such as the Psychiatric Genomics Consortium (PGC), performing meta-analyses with total sample sizes even up to 100,000 subjects [12]. The number of variants which have been associated with childhood psychopathology at genome-wide significance is still lower than for adult mental disorders [13, 14], but polygenic analyses have suggested that genetic variation in childhood mental disorders, as in adult disorders, is primarily due to many genetic variants of small effect [15]. This would mean that genome-wide association studies (GWAS) of child and adolescent psychopathology, which so far have been notably smaller than the large meta-analyses above, have likely been underpowered. By increasing the sample size and performing meta-analyses across the EAGLE cohorts, including the CAPICE cohorts, it may be possible to detect genome-wide significant associations.

    In addition, because extended longitudinal data on mental health problems are available, it is possible to test whether genetic variants exert their influence across the lifespan or if their effect is restricted to a certain age period.

    Finally, as previous work has also suggested that the impact of environmental factors on childhood psychopathology may be mediated by epigenetic variation (functional alterations in the genome that do not involve a change to DNA sequence [16]), identification of epigenetic differences associated with childhood mental disorders is also part of this objective.

  3. Identify biological pathways associated with mental health symptoms and to validate potential drug targets based on these pathways.

    The currently identified genetic variants (single nucleotide polymorphisms, SNPs) for psychiatric disorders explain only a small part of their variance [17,18,19]. However, the joint effect of all tested variants explains on average around 30% of the variance of psychiatric disorders [20], which makes these variants worth investigating further [21]. A key step in the pathway from variant identification in GWAS to the development of potential treatment or prevention approaches is the characterization of the biological pathways by which genetic variants affect disorders. For instance, follow-up analyses can specify whether associated variants are involved in the same biological pathways, which can then offer leads to novel drug targets or the repurposing of existing drugs that were initially developed for the treatment of other diseases [21,22,23].

    A recent study [22] offered a framework for drug repositioning that associates transcriptomes imputed from GWAS data with drug-induced gene expression profiles from the Connectivity Map database. When applied to mental disorders, the method identified many candidates for repositioning, several of them supported by preclinical or clinical evidence as they included known psychiatric medications or therapies considered in clinical trials [22]. Validation studies of these approaches have recognized antipsychotics as potential drug targets for mental disorders, confirming that the approach may be a productive technique for developing drug therapies for childhood psychiatric conditions [24].

  4. Build a prediction model that identifies children at the highest risk of developing chronic mental health symptoms.

    To be able to provide treatment that takes into account the risk for persistence of symptoms, it is necessary to build a prediction model that supports stratifying children into groups at high and low risk for a severe, chronic symptom course. Similar risk calculators based on clinical symptoms and cognitive profiles have already been established for high-risk individuals for SCZ or BD [25,26,27]. Including environmental factors and multi-omic biomarkers may further improve the performance of the model [28].

  5. Develop a sustainable international network of researchers in which collaboration is facilitated by data harmonization and information technology (IT) solutions enabling a joint analysis of data over cohorts.

    The eight cohorts involved in CAPICE contain a wealth of data on childhood and adolescent mental health, but the measures used to assess these traits differ across cohorts (see Table 2). One way to improve the power of studies combining results from multiple cohorts is to use common units of measurement. Using item-response theory (IRT) based test linking in these cohorts, it is possible to evaluate the extent to which dimensions from the individual instruments can be mapped onto dimensions that are shared across instruments. Such an analysis was successfully applied earlier to measures of personality [29].

  6. Build a structure to disseminate the results to a broad audience of scientists, clinicians, patients and their parents, and the general public.

    A limitation of many research projects is the difficulty of properly disseminating their results, even when these are clinically relevant. Previous work has suggested that, even in targeted clinical research, it takes 17 years for 14% of discovery research to be integrated into physician practice [30]. Etiological research on mental disorders may face even wider dissemination gaps. To address this issue, CAPICE specifically aims to create structures that allow project results to be disseminated to a broad audience, for instance, through a website and social network channels.

Methodology and results

Cohorts description

Before describing the approaches and results of each objective, we provide a general summary of the data collection of the eight CAPICE cohorts (see Tables 1, 2). All cohorts are longitudinal, started either at birth or in childhood, and use quantitative measures of psychopathology. These measures have been associated with clinical diagnoses [31,32,33,34] and are therefore widely used in clinical practice. The advantage of these dimensional measures is that they capture more of the variation present in the general population than dichotomous measures that only specify the presence or absence of a diagnosis. For genome-wide SNP data, methods to impute the genotypes are widely available, allowing the same genetic variants to be analysed across cohorts. All cohorts have been described in more detail elsewhere: ALSPAC [35,36,37], TEDS [38], GenR [39, 40], NTR [41,42,43], TCHAD [44], NFBC’86 [45], CATSS [46, 47], MoBa [48]. For links to cohort-specific websites and a detailed description of the cohorts, see Table 1. All data used for the analyses were collected under protocols that have been approved by the appropriate ethics committees, and studies were performed in accordance with the ethical standards.

Objective 1: Elucidate the role of genetic and environmental factors in mental health symptoms across childhood and adolescence, and to establish the overlap in genetic risk factors with other traits related to childhood mental health symptoms.

We refer to several excellent overviews for a description of the methods that can be used in genetic epidemiological studies analysing twin/family data and/or molecular genetic data [15, 49, 50]. In short, twin and family studies estimate the proportion of variance in a trait attributable to genetic and environmental factors by comparing the resemblance between pairs of relatives that differ in their relatedness. For example, if monozygotic twins, who are essentially 100% genetically identical, are more alike than dizygotic twins, who share on average 50% of their co-segregating alleles, this is an indication that genetic factors play a role in explaining differences between individuals for a certain trait. This model can be extended to include other family members, to longitudinal or multivariate designs, and to study gene–environment interaction (G × E), even without a direct measure of the environment [51,52,53].
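The logic of comparing monozygotic and dizygotic twin resemblance can be illustrated with Falconer's formulas, which give rough estimates of the additive genetic (A), shared environmental (C), and non-shared environmental (E) shares of variance from the two twin correlations. This is only a minimal sketch under the usual assumptions of the classical twin design (for example, equal environments and no non-additive genetic effects); the analyses referred to above typically rely on structural equation models rather than these closed-form estimates. The function name and the correlations below are hypothetical and chosen for illustration only.

```python
def falconer_estimates(r_mz: float, r_dz: float) -> dict:
    """Rough variance decomposition from twin correlations (Falconer's formulas)."""
    h2 = 2.0 * (r_mz - r_dz)  # additive genetic share (A)
    c2 = 2.0 * r_dz - r_mz    # shared-environment share (C)
    e2 = 1.0 - r_mz           # non-shared environment plus measurement error (E)
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical twin correlations, not taken from any CAPICE cohort
estimates = falconer_estimates(r_mz=0.70, r_dz=0.40)
for component, share in estimates.items():
    print(f"{component}: {share:.2f}")
```

By construction the three shares sum to one, so a larger gap between the monozygotic and dizygotic correlations translates directly into a larger heritability estimate.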

It is also possible to address these questions using genotypic data obtained for GWAS. In GWAS, common single nucleotide polymorphisms (SNPs) (positions in the DNA sequence that vary between individuals) measured across the whole genome are tested for their association with a trait. Many complex traits, like mental disorders, are influenced by multiple genetic loci. In this case, polygenic analyses, taking into account many SNPs, can be applied to investigate the cumulative impact of these SNPs on a trait, as well as on the co-occurrence of phenotypes or the persistence of symptoms over time [15, 49]. To date, CAPICE researchers have performed twin and polygenic risk score analyses of the correlation between common mental health symptoms, such as internalizing problems and ADHD problems, and of the stability of symptoms over time, including into adulthood.
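In its simplest form, a polygenic score is a weighted sum over SNPs: each person's count of effect alleles at a SNP is multiplied by the effect size estimated for that SNP in an independent discovery GWAS. The sketch below shows only this core computation; it assumes that SNP selection, linkage disequilibrium handling, and allele alignment have already been done, and all names and numbers are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2 per SNP) for one person."""
    if len(dosages) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-SNP effect sizes from a discovery GWAS
gwas_weights = [0.02, -0.01, 0.03]
# Hypothetical effect-allele counts for two children at the same three SNPs
children = {"child_a": [2, 0, 1], "child_b": [0, 2, 0]}

scores = {name: polygenic_score(d, gwas_weights) for name, d in children.items()}
```

The resulting scores can then be used as a predictor of a trait measured in the target sample, for example in a regression that also adjusts for ancestry.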

Regarding the co-occurrence of symptoms, the covariance could be explained by one common factor, the so-called p-factor, which was found to be 50–60% heritable. Moreover, genetic factors explained stability in this factor across ages. A polygenic p-factor risk score based on adult psychiatric disorders was also associated with the childhood p-factor [54].

The genetic association between adult psychiatric disorders and childhood and adolescent traits was further investigated with polygenic analyses. The polygenic scores (PGS) for adult depression, neuroticism, BMI, and insomnia were significantly positively associated with childhood ADHD, internalizing and social problems, while the PGS for subjective well-being and educational attainment showed negative associations [55]. Only the bipolar disorder PGS did not yield any significant associations. Effect sizes were in general similar across age and phenotype, although the PGS for educational attainment was more strongly associated with ADHD and the BMI PGS with ADHD and social problems [55]. A follow-up study is currently under way, performing multivariate analyses to shed more light on the pattern of associations (https://osf.io/7nkw8).

Genetic data can also be leveraged to estimate the average causal effect of specific environmental factors, such as perinatal factors, on child or adolescent psychopathology, using a method known as Mendelian randomization (MR). Several ongoing CAPICE projects have applied this approach to estimate the effect of prenatal risk factors, such as maternal smoking, on childhood mental health problems. It is important to recognize that the assumptions that are required for MR may not hold for all prenatal exposures and offspring outcomes [56]. Therefore, methods to identify violations of the MR assumptions are also evaluated and approaches that require fewer assumptions are tested.
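The basic MR estimator can be sketched with summary statistics. For a single genetic variant, the Wald ratio divides the variant's association with the outcome by its association with the exposure; with several variants, the ratios can be combined by inverse-variance weighting. The code below is a minimal illustration, not the procedure used in the CAPICE projects: it assumes valid and independent instruments, uses a first-order standard error that ignores uncertainty in the variant-exposure association, and does nothing to detect violations of the MR assumptions. All function names and numbers are hypothetical.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal estimate from one variant: SNP-outcome effect over SNP-exposure effect."""
    estimate = beta_outcome / beta_exposure
    # First-order approximation that ignores uncertainty in beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_estimate(ratios):
    """Inverse-variance weighted average of per-variant Wald ratios."""
    weights = [1.0 / se ** 2 for _, se in ratios]
    pooled = sum(w * est for w, (est, _) in zip(weights, ratios)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se


# Hypothetical summary statistics per variant:
# (SNP-exposure beta, SNP-outcome beta, SNP-outcome standard error)
variants = [(0.10, 0.020, 0.010), (0.20, 0.050, 0.010), (0.05, 0.010, 0.010)]

ratios = [wald_ratio(bx, by, se) for bx, by, se in variants]
causal_effect, causal_se = ivw_estimate(ratios)
```

Variants with a stronger association with the exposure yield more precise ratios and therefore receive more weight in the pooled estimate.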

Another method to analyse the mechanisms underlying parent-offspring associations is maternal genome-wide complex trait analysis (M-GCTA), which can be applied when parental genotypic data are also available. M-GCTA can be used to estimate whether the association between parent and offspring psychopathology is explained by an environmental effect on top of the effect of genetic transmission [57]. Applying this method to the MoBa data indicated no such effects for anxiety and depression at age 8 [58]. Analyses in a larger, and therefore more powerful, sample that also include externalizing problems are currently being performed.

Objective 2: Identify genetic, epigenetic, and transcriptomic variants associated with mental health symptoms during childhood and adolescence.

GWAS have provided insight into the genetic basis of quantitative variation in complex traits in the past decade [20]. By increasing the sample size and performing meta-analyses across the EAGLE cohorts, including the CAPICE cohorts, it may be possible to detect genome-wide significant associations and to detect age effects. A large-scale GWAMA using a multivariate method to analyse summary statistics that are not independent [59] focused on identifying genetic variants that influence the development and course of internalising symptoms from ages 3 to 18 (https://osf.io/w5adg/). Three gene-wide significant effects were detected, together with significant genetic associations with adult depression and related traits and with childhood traits. Another study explores the effect of (ultra) rare and common variation in genes specific to brain cell types on neuropsychiatric disorders (https://osf.io/uyv2s).
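The gain in power from meta-analysis can be illustrated with the standard fixed-effect, inverse-variance weighted estimator for a single SNP. Note that this sketch assumes independent cohorts and is therefore simpler than the multivariate method for non-independent summary statistics [59] used in the GWAMA described above. The function name and all effect sizes and standard errors are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted pooling of per-cohort estimates for one SNP."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-cohort effect estimates and standard errors for one SNP
cohort_betas = [0.05, 0.03, 0.04]
cohort_ses = [0.02, 0.01, 0.02]

beta, se, z, p = fixed_effect_meta(cohort_betas, cohort_ses)
```

The pooled standard error is smaller than that of any single cohort, which is why small effects that are undetectable in individual cohorts may reach significance in the combined analysis.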

Previous work has also suggested that the impact of environmental factors on childhood psychopathology may be mediated by epigenetic variation, which consists of functional alterations in the genome that do not involve a change to DNA sequence [16]. While there are several forms of epigenetic variation, most epigenetic studies have focused on alterations in DNA methylation. Epigenome-wide association studies (EWAS) are performed to test the effect of maternal mid-pregnancy vitamin D on offspring cord blood methylation, and to test the association between variation in child peripheral and cord blood methylation and the subsequent development of ADHD.

Under very strong assumptions, mediation of the possible average causal effect of prenatal exposures on offspring psychiatric outcomes might be tested using an extension of the MR approach, incorporating genetic variants as proposed instruments for a particular prenatal exposure and for methylation at a specific locus. In certain contexts, this might be considered as a follow-up to the present studies.

The association of prenatal maternal smoking with offspring blood DNA methylation has been investigated in individuals aged 16–48 years, and MR and mediation analyses have been performed to evaluate whether methylation markers have causal effects on disease outcomes in the offspring [60]. Sixty-nine differentially methylated CpGs in 36 genomic regions (P-value < 1 × 10⁻⁷) were found to be associated with exposure to maternal smoking in adolescents and adults, and MR analyses provided evidence for a causal role of four maternal smoking-related CpG sites in an increased risk of SCZ or inflammatory bowel disease [60]. Further studies analyse whether alcohol, tobacco, and caffeine use in pregnancy might be causally related to ADHD in the offspring using negative control and MR approaches (https://osf.io/wxu58, https://osf.io/aqrxp).

Objective 3: Identify biological pathways associated with mental health symptoms and to validate potential drug targets based on these pathways.

Applying drug pathway analyses to the CAPICE GWAS results may permit us to derive hypotheses about potential drug targets and, consequently, possibilities for drug repurposing. The GWAMA on internalizing problems did not detect biological pathways (https://osf.io/w5adg/), and therefore no drug targets could be identified. This is not surprising as there were not many significantly associated genes.

Objective 4: Build a prediction model that identifies children at the highest risk of developing chronic mental health symptoms.

Using cohort data, as well as Swedish registry data, studies have been performed to predict outcomes of psychiatric symptoms in childhood and adolescence, focusing not only on mental disorders but also on somatic medical outcomes. As part of these analyses, a machine learning model including 474 predictors has been developed that can predict mental health problems in adolescence using data from the Child and Adolescent Twin Study in Sweden (CATSS) [61]. The suggested model would not be appropriate for medical purposes, but it helps to build better models to predict mental health outcomes [61].

Moreover, longitudinal analyses of data from the Swedish and Dutch twin registers indicated that adolescent anxiety is associated with psychiatric disorders later in life, even when adjusting for other mental health issues [62].

Objective 5: Develop a sustainable international network of researchers in which collaboration is facilitated by data harmonization and information technology (IT) solutions enabling a joint analysis of data over cohorts.

Using Item-Response Theory (IRT) based test linking, it has been evaluated whether internalizing and ADHD symptoms assessed by different instruments can be mapped onto dimensions that are shared across instruments. These analyses were possible because some of the EAGLE cohorts (ABCD [63], Raine [64], and TEDS [38]) had measured mental health symptoms of the same individuals at the same age with two or more instruments. This could allow individual raw item data from different instruments to be combined to maximize statistical power.
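The idea behind test linking can be sketched with a two-parameter logistic (2PL) IRT model and mean-sigma linking, one of several possible linking methods and not necessarily the one applied in CAPICE. When the same children have trait estimates on two instruments, a linear transformation that matches the means and standard deviations of those estimates places both instruments on one scale, and item parameters can be re-expressed accordingly. All function names and values are hypothetical.

```python
import math
import statistics


def prob_endorse(theta, a, b):
    """2PL item response function: chance of endorsing an item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def mean_sigma_link(theta_x, theta_y):
    """Slope and intercept mapping scale X onto scale Y for the same persons."""
    slope = statistics.pstdev(theta_y) / statistics.pstdev(theta_x)
    intercept = statistics.mean(theta_y) - slope * statistics.mean(theta_x)
    return slope, intercept


def link_item(a, b, slope, intercept):
    """Re-express an item's discrimination a and difficulty b on scale Y."""
    return a / slope, slope * b + intercept


# Hypothetical trait estimates for the same five children on two instruments
theta_x = [-1.0, -0.5, 0.0, 0.5, 1.0]
theta_y = [-1.5, -0.5, 0.5, 1.5, 2.5]

slope, intercept = mean_sigma_link(theta_x, theta_y)
a_linked, b_linked = link_item(a=1.2, b=0.3, slope=slope, intercept=intercept)
```

After linking, an item gives the same endorsement probability for a child whether the child's trait level is expressed on the original or on the shared scale, which is what makes pooling raw item data across instruments defensible.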

In addition, to facilitate data analyses over cohorts, a searchable data catalogue is being created. The variables important for the current project include demographic and family characteristics, individuals’ school achievement, mental health measures (both psychopathology and wellbeing) by various raters (mother, father, self-report, teacher), pregnancy/perinatal measures, several general health and anthropometric measures, parenting, parental mental health, and several genomic measures and biomarkers in children and parents. To build a search engine that returns items including the searched term as well as related terms, text mining of available data documentation has been used to identify relationships between words. These results can then be used to develop an advanced search engine for the data catalogue. If “mental health” is, for example, the search term, the results will also include “emotional problems”, “behavioural problems”, and “psychiatric history of the mother”.
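One simple way to obtain such related terms is to count how often terms co-occur in the documentation of the same variable and to expand a query with its frequent co-occurring terms. The sketch below illustrates this idea only; it is not the text-mining pipeline used for the CAPICE catalogue, and the variable names, terms, and threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(documents):
    """Count how often two terms appear in the same variable description."""
    counts = Counter()
    for terms in documents:
        for left, right in combinations(sorted(set(terms)), 2):
            counts[(left, right)] += 1
    return counts


def related_terms(term, counts, min_count=2):
    """Terms that co-occur with `term` at least `min_count` times."""
    related = set()
    for (left, right), n in counts.items():
        if n >= min_count and term in (left, right):
            related.add(right if left == term else left)
    return sorted(related)


def search(term, catalogue, counts):
    """Return variables tagged with the term or with any related term."""
    expanded = {term, *related_terms(term, counts)}
    return sorted(name for name, terms in catalogue.items() if expanded & set(terms))


# Hypothetical catalogue: variable name -> terms extracted from its documentation
catalogue = {
    "parent_report_emotional": ["mental health", "emotional problems"],
    "self_report_emotional": ["mental health", "emotional problems"],
    "parent_report_behaviour": ["mental health", "behavioural problems"],
    "self_report_behaviour": ["mental health", "behavioural problems"],
    "teacher_report_emotional": ["emotional problems"],
    "birth_weight": ["perinatal"],
}

counts = cooccurrence(catalogue.values())
results = search("mental health", catalogue, counts)
```

In this toy catalogue, a search for "mental health" also returns the teacher-reported variable that is tagged only with "emotional problems", while the unrelated perinatal variable is left out.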

Objective 6: Build a structure to disseminate the results to a broad audience of scientists, clinicians, patients and their parents, and the general public.

To engage the general public with the results from these studies, CAPICE researchers have also created content designed for a lay audience on the website (http://www.capice-project.eu/index), Twitter (https://twitter.com/capice_project), YouTube (https://www.youtube.com/channel/UCgq8uIHiHE69IlcHoYCjwKg/featured?view_as=subscriber), LinkedIn and Facebook. CAPICE was also represented at the Greenman Festival in Wales, UK, and the ESRs presented their work at several international conferences.

Discussion

Genetic research, including psychiatric genetics, has substantially moved forward due to large-scale collaborations in consortia, with meta-analyses being the rule rather than the exception. Due to developments in methodology, it is also possible to use the genome-wide genotypic data for purposes other than the identification of genetic risk variants. Given the small effect sizes, these analyses need large samples to achieve adequate statistical power. The cohorts brought together in CAPICE and the close collaboration with the EAGLE behavior and cognition group (https://www.eagle-consortium.org/) have provided an opportunity to perform these analyses and to advance the field of child psychiatry by addressing essential questions such as “Which genetic variants and biological pathways underlie the continuity of symptoms from childhood into adulthood?”; “Which factors explain the associations of childhood psychopathology with early life and familial risk factors?”; “What is the role of epigenetic factors in the development of child and adolescent psychopathology?”; and “Can we predict which children are at higher risk for poorer outcomes?”

The ultimate aim is that these results will inform the development of future treatment and prevention efforts, including supporting the identification of novel targets for existing pharmacological agents. Moreover, having better prediction tools will bring precision medicine closer for child psychiatry as it provides the opportunity to test interventions specifically targeted at children at high or at low risk for the persistence of symptoms.

We acknowledge that the studies performed in the framework of CAPICE are largely restricted to children and families with a Western European genetic ancestry. Extending genetic data collection to children and families from other backgrounds is essential to gain knowledge on similarities and differences between groups from various backgrounds.

These analyses have all been performed as part of a training program for Early Stage Researchers. These researchers receive not only mentoring from senior academics on their specific projects but also a structured curriculum of workshops on child psychiatry, statistics, and dissemination strategy throughout the grant period. These workshops as well as the secondments at other participating institutions aim to build a new generation of creative and innovative researchers who might exert a relevant impact on academic and non-academic organizations.

In conclusion, CAPICE provides a broad package of training in the field of psychiatric genetics, ranging from the harmonization of phenotypes and the creation of facilities for analyses across cohorts, through state-of-the-art analyses, to the translation of the results into drug target validation or prediction models that can be used in the clinic for targeted interventions.