Introduction

Education equips us with the skills and knowledge that we need to navigate through and master our lives. People’s differences in educational opportunities and outcomes have pervasive, long-term influence on their life-course development, including lifetime cognitive functioning (Chen et al., 2019; Lövdén et al., 2020), socioeconomic status attainment (Mackenbach et al., 2008), the likelihood of becoming employed (Oreopoulos & Salvanes, 2011), marriage partner choice (Cave et al., 2022), health and wellbeing (Cutler & Lleras-Muney, 2010, 2012), and life expectancy (Montez & Hayward, 2014). From the first day of kindergarten, vast differences in children’s educational performance can be observed, and these differences magnify as children grow older (Heckman, 2006; Schoon et al., 2002; von Stumm, 2017; von Stumm et al., 2019). Understanding the origin of people’s differences in educational outcomes is key to improving how we teach and learn, which, in turn, will greatly benefit the psychological development of both individuals and society at large.

The search for causes of children’s differences in school performance has traditionally focused on characteristics of students’ learning environments, such as their teachers’ expertise (Bardach et al., 2021; Nye et al., 2004), classroom sizes (Filges et al., 2018; Karvonen et al., 2018), and school quality (Shin & Chung, 2009; von Stumm et al., 2020). Yet in recent years, the role of genetic propensities on children’s learning differences has garnered much attention (e.g., Asbury & Plomin, 2013; Harden, 2021; Malanchini et al., 2020; Plomin, 2018). Twin studies that compare the phenotypic resemblance of identical twins (i.e., monozygotic) to that of fraternal twins (i.e., dizygotic) documented that differences in education are highly heritable (Ayorech et al., 2017; Polderman et al., 2015; Rimfeld et al., 2018a, b; see Glossary in Supplementary Table S1 for terminology explanations). Heritability refers to the proportion of differences in a phenotypic trait that can be observed between people and attributed to their inherited DNA differences (Plomin, 2018; Visscher et al., 2008). For example, 57% of children’s differences in children’s learning over the primary school years were heritable in a Dutch twin sample of 12-year-olds (Bartels et al., 2002). That is, 57% of the differences in pupils’ scores on the national CITO elementary test that children in the Netherlands take at the end of primary school could be attributed to their genetic differences, while 43% of their differences in test scores were due to environmental factors (e.g., teacher expertise or school quality; Bartels et al., 2002). Other studies confirmed that more than half of students’ differences in school performance can be attributed to their genetic differences (Malanchini et al., 2020; Polderman et al., 2015), including analyses of a UK sample of 16-year-olds for whom 58% of their differences in end-of-year exam grades were heritable; these exam grades indexed pupils’ suitability for further, university-level studies (Shakeshaft et al., 2013).

Notwithstanding the many robust and replicable findings from behavior genetic studies (Plomin et al., 2016), genetically informed research on psychological or behavioral traits is often misinterpreted (Asbury & Plomin, 2013; Crosswaite & Asbury, 2019) and met with skepticism and even hostility (e.g., Lewis-Kraus, 2021). A common misconception is that finding genetic influences implies that people’s differences are “fixed” or “destined,” an idea known as genetic determinism. Yet, genetic influences neither determine destinies nor do they mean that people’s differences are immutable. In fact, genetic and environmental influences exist in a concerted balance: If environments vary greatly in a population, for example when the opportunity to go to school is only available to some but not others, genetic influences will only explain a small proportion of that population’s differences in education. In this case, the effects of having access to a school versus not would be the far stronger predictor of school performance. However, if environmental conditions are more equal for people (e.g., when schooling is available to or even compulsory for all children), genetics accounts for a greater proportion of the differences due to less environmental variance (Knigge et al., 2022). Some have argued that this is the reason why heritability could be thought of as an index of equality of opportunity: If heritability is high, environmental variability is low (Ayorech et al., 2017, 2019), meaning that opportunities are, on average, equally available to everybody in the population.

The realization that eradicating environmental inequality will produce greater genetic inequality reads somewhat uncomfortable, yet it is true. Heritability describes the relative contribution of genetic and environmental influences to phenotypic variability in a particular population at a particular moment in time, but it says little about a trait’s susceptibility to environmental influences (Maccoby, 2000). For example, individual differences in height are highly heritable at about 80% (Yengo et al., 2022), and yet, people’s average height has increased across generations on average by ~ 1 cm per decade (Roser et al., 2013). These gains cannot be explained by changes in the population’s gene pool, which evolves over millennia, but are likely due to environmental changes, such as improved nutrition and medical care (Perkins et al., 2016). Akin to height, steep increases in school enrolment and school completion rates have been reported over the past century. People’s mean years of schooling has grown from 4 to 12 years in high-income countries (i.e., a 67% increase) and from half a year to 8 years in low- and middle-income countries (i.e., a 94% increase; Lee & Lee, 2016; Ritchie et al., 2023). These gains in educational attainment stem from the broader availability of education opportunities—an environmental change—rather than from changes in populations’ gene pools.

Recent advances in genomic research make it possible to predict people’s differences in education directly from their inherited DNA variants that we refer to as polygenic scores but that are also known as polygenic risk score (PRS), polygenic index (PGI), genetic risk scores, or genome-wide score in the literature (Allegrini et al., 2020; Cesarini & Visscher, 2017; Lee et al., 2018; Okbay et al., 2016, 2022; Rietveld et al., 2013; Selzam et al., 2017). Just like heritability estimates, DNA-based predictions for phenotypic traits do not imply immutability or fixedness, nor do they allow per se for causal inferences (Plomin & von Stumm, 2022). They reflect a person’s genetic propensity or likeliness to think, feel, and behave in certain ways (Cesarini & Visscher, 2017; Plomin & von Stumm, 2018). For example, children with lower genetic propensities for educational achievement are likely to find concentrating on homework more difficult, be less engaged with lessons, and forgo study sessions in favor of other activities more often than children with higher genetic propensities for eruditeness. It may therefore take more effort, time, and educational support for children with lower genetic propensities for education to achieve the same learning success as others with higher genetic propensities, but they could nonetheless perform just as well or even better. In other words, children’s genetic propensities for education describe learning tendencies or preferences but do not determine educational trajectories, akin to the observation that children’s performance in secondary school predicts but does not determine tertiary education outcomes (e.g., university-level studies; Starr et al., 2024).

Polygenic scores reflect not only direct genetic influences but also capture gene-environment correlations, assortative mating, and population stratification (Plomin & von Stumm, 2022). Gene-environment correlations arise, for example, because children are recipients of both inherited DNA differences and rearing environments from their parents (Wertz et al., 2019). Assortative mating refers to parents’ greater genetic resemblance than can be expected for two randomly chosen individuals (Horwitz et al., 2023). Population stratification occurs when allele frequencies differ systematically across subpopulations, giving rise to heterogenous population structures (Morris et al., 2020a, b). The effects of gene-environment correlations, assortative mating, and population stratification may account for up to 60% of polygenic score predictions of educational outcomes (Selzam et al., 2019), although estimates vary substantially across studies that have been published to date in this area (Howe et al., 2022).

At present, equal opportunity is the epitome of the education system in meritocratic societies that aim to achieve fairness in education by instructing all children in the same curriculum, at the same pace, and with the same pedagogical methods and approaches. Corresponding education policies favor pedagogies that are thought to benefit all students (e.g., Machin & Vignoles, 2006) but because children differ in their abilities, skills, and interests, the “equal” learning opportunities that school affords them do not translate into equity in learning outcomes (Chmielewski, 2019; von Stumm et al., 2022). A more productive way for achieving fairness in education is likely prioritizing equity in outcomes over equality in opportunity (Knigge et al., 2022; Sokolowski & Ansari, 2018), such as assigning more teaching staff to children who are at risk of academic failure, while better performing children are looked after by fewer teachers (Arold et al., 2022).

Acknowledging and understanding the role of genetics may help improve the fairness of education systems in three ways. The first is that finding genetic influences on children’s learning differences advocates for teaching that recognizes children’s inherent heterogeneity in abilities, skills, and interests, rather than for “one-size-fits-all” pedagogical approaches. Effective teaching must be “personalized” to respond to and remediate children’s vastly different, individual learning needs (Asbury & Plomin, 2013; von Stumm & Wertz, 2021). Some differentiated instructional strategies are already employed by teachers, including individual education plans that support students with special educational needs and disabilities (SEND) to monitor their learning progress and improve their learning outcomes (Dalgaard et al., 2022; Department of Education, 2015; Jørgensen et al., 2020). In 2022, 17% of UK pupils received support for SEND, including 4% of pupils who were taught according to personalized education, health, and care plans (EHC; Farquharson et al., 2022). While teachers strongly and unanimously endorse the principles of personalized learning, the practical implementation of corresponding pedagogies within current schools and education settings remains difficult (Dalgaard et al., 2022; Gunawardena et al., 2024; Martschenko, 2019; Williams-Brown & Hodkinson, 2020).

Second, the availability of DNA-based predictors for children’s learning differences opens possibilities for identifying children at risk for poor educational outcomes and intervening early before maladaptive learning behaviors have manifested. This feature distinguishes DNA from other predictors of learning difficulties, such as teachers raising concerns or children agonizing over homework and struggling to read, that only become observable after they have emerged. Education interventions that involve DNA-based predictors would need to be carefully planned to preclude any discrimination against children because of their genetics (Asbury et al., 2021; Martschenko et al., 2024). Yet, such interventions resemble in many ways current approaches to remediating children’s family background inequality in education: For example, in France and in the UK, schools receive extra funding from the state to support pupils from under-resourced family homes in mitigating their learning disadvantages (Flavier & Moussay, 2014; Roberts et al., 2021). Akin to families’ (lack of) resources as predictor of children’s learning support needs, polygenic scores could be used to guide the allocation of educational resources to the children with the genetic indication for developing learning difficulties (cf. Au, 2022).

Third, a deeper understanding among parents, educators, and policymakers of the role of genetics in education could increase tolerance, commiseration, and kindness toward the children who struggle with school—as well as reduce envy of and animosity toward those who breeze through (Asbury & Plomin, 2013; Asbury et al., 2021). Prioritizing equity of outcomes will reduce the extent to which children are held personally responsible for their academic achievements and failures, an attribution that prevails in meritocratic education systems (Renoux et al., 2024). At present, children who do well in school are widely believed to “deserve” better lives than those who do poorly (Anderson, 1999, 2007; Sandel, 2020)—a rather twisted notion of fairness, if we acknowledge that children’s school performance is largely a function of circumstances beyond their control, such as their family background, including their inherited DNA differences (Harden, 2021; von Stumm et al., 2020; von Stumm & d’Apice, 2021).

Notwithstanding the potential benefits that behavioral genetic findings may hold for designing fair educational systems, they also bear risks of social separation, pessimism, and creating perceptions of unmanageable behavior within the classroom (Larsen et al., 2022; Martschenko et al., 2024). Research on how teachers, students, and parents might interpret polygenic score feedback is currently sparse, as are findings on how polygenic score predictions may affect students’ academic motivations (e.g., Larsen et al., 2022; Matthews et al., 2021).

Our article is structured into four sections. First, we define educational attainment and achievement, review their respective assessments in research studies, and discuss what is known about their heritability. Second, we explain what polygenic scores are and how they can be used to predict individual differences in education. In the third part of this paper, we present meta-analytic findings on the prediction of educational attainment and achievement from polygenic scores. Finally, we embed our meta-analytic results in the broader context of the research literature and discuss how the benefits of DNA-based predictions of education may be leveraged while effectively mitigating their risks.

The Heritability of Educational Achievement and Attainment

Educational achievement refers to students’ relative performance within an educational setting or level, for example the end-of-year exam grades in school. Educational attainment pertains to the level of education that a person acquires, ranging typically from school leaving certificates to vocational training to tertiary qualifications, such as university degrees. In putatively meritocratic societies, educational achievement and attainment are intertwined because the former regulates the access to the latter: Children who achieved relative to others at one educational level (e.g., secondary school) will be offered opportunities for further studies (e.g., at university), while those who performed poorly at an education level typically fail to qualify for progressing to the next (Schoon et al., 2012; von Stumm et al., 2019).

Twin studies from across the world have reported educational achievement as highly heritable: On average, 60% of students’ differences in educational achievement can be attributed to their inherited DNA differences (Bartels et al., 2002; Polderman et al., 2015; Shakeshaft et al., 2013). The heritability of educational achievement increases with age due to gene-environment correlations (rGE; Haworth et al., 2010; Morris et al., 2018) that result from individuals selecting, adapting to, and shaping the environments that correspond to their genotype (Avinun, 2020). Children increasingly seek out environments that align with their genetic propensities as they grow older, for example when avid readers spend weekends at the library, which in turn improves their reading skills; as a result, estimates of genetic influences on phenotypic development increase (Avinun & Knafo, 2014; Haworth et al., 2011; Plomin et al., 1977; Scarr & McCartney, 1983).

Twin study heritability estimates for educational attainment tend to be lower than those for educational achievement (Branigan et al., 2013; Silventoinen et al., 2020). A meta-analysis across 15 populations from seven European countries and Australia (Ntotal = 51,545) reported 40% heritability for attainment, which was assessed as the number of years that people spent in full-time education or the highest educational qualification they obtained (Branigan et al., 2013). Similarly, an analysis of 28 twin cohorts from 16 countries in Europe, North American, Australia, and East Asia reported that differences in educational attainment were on average 43% heritable (Silventoinen et al., 2020). The discrepancy in heritability estimates between educational achievement and attainment likely reflects the constructs’ discrepancy in environmental variance. Educational achievement occurs within defined educational settings (i.e., a class in school) where students experience broadly comparable environments, and thus, their genetic differences gain greater weight (Kovas et al., 2013). By comparison, education attainment measures subsume different environments: students who obtain university degrees have experienced different learning environments than apprentices, who attended vocational training colleges. It follows that differences in educational attainment are on average less heritable than differences in educational achievement, notwithstanding their high covariance at the phenotypic and genetic levels (Kovas et al., 2005; Plomin et al., 2016).

Polygenic Scores as Predictors of Educational Outcomes

The decoding of the human genome (International Human Genome Sequencing Consortium, 2001) and the mapping of the genomic variation in human populations (Gibbs et al., 2003) in the early 2000s made it possible to identify the DNA variants that cause the heritability of phenotypic trait differences. These DNA variants, known as single-nucleotide polymorphisms (SNPs), mark the smallest unit of genetic differences between people (see Supplementary Table S1 for Glossary). These SNPs that drive the heritability of a phenotypic trait can be identified in genome-wide association studies (GWAS). GWAS test associations between people’s differences in a target phenotypic trait and their SNPs across the entire genome and produce summary statistics that reflect the strength (i.e., effect size) and direction with which SNPs are associated with the target phenotype (Choi et al., 2020). These GWAS summary statistics enable computing polygenic scores in independent samples, for whom genotype data are available, by aggregating the SNPs identified in the GWAS, weighed by their effect size (Choi et al., 2020; Pasaniuc & Price, 2016). Polygenic scores therefore index person-specific estimates of genetic propensities for a given target phenotype (Plomin & von Stumm, 2018), while also capturing environmental variances due to gene-environment correlations, assortative mating, and population stratification (Morris et al., 2020a, b). Polygenic scores are unique predictors of psychological traits, in the way they are stable across the lifespan (i.e., they are as predictive of adult outcomes during childhood as they are in adulthood), unaffected by self-report biases (i.e., one cannot inflate or fake their polygenic scores), and preclude backward causation (i.e., inherited DNA differences are unchangeable by behavioral or environmental factors; Plomin & von Stumm, 2018). Polygenic scores are normally distributed and can be modeled as continuous variables in statistical analyses, just like sum scale scores from psychometric tests or other psychological measures. While polygenic scores may be powerful predictors of people’s phenotypic differences, they do not necessarily cause these differences. They are probabilistic rather than deterministic or causal predictors of phenotypic outcomes (Asbury et al., 2021; Harden, 2021; Plomin & von Stumm, 2022), and they also capture environmental influences (Okbay et al., 2022; Selzam et al., 2019).

To date, four large-scale GWAS have sought to identify the DNA variants that are associated with people’s differences in educational attainment (Fig. 1), which is inferred from the number of years that people have spent in full-time education, a coarse yet cost-effective and widely used way of assessing educational attainment (Lee et al., 2018; Marioni et al., 2016; Okbay et al., 2022). The first GWAS for years spent in education included a discovery sample of N ~ 130,000, and corresponding polygenic scores predicted about ~ 2% of the educational attainment variance in independent samples (N ~ 3500 and ~ 6800; Domingue et al., 2015; Rietveld et al., 2013), a rather modest effect size relative to the twin study estimates of 40% heritability (Matthews & Turkheimer, 2022). The difference between polygenic score predictions and twin study heritability estimates is known as the “missing heritability” gap (Manolio et al., 2009), which is likely to due to rare variants that will require GWAS discovery sample sizes of tens of millions to be identified (Watanabe et al., 2019). The “missing heritability” gap will likely narrow as research progresses, especially as larger GWAS samples (Yengo et al., 2022), whole-genome sequencing (Wainschtein et al., 2022), and alternative statistical modeling approaches become available (Grotzinger et al., 2019).

Fig. 1
figure 1

Polygenic score prediction of years spent in education across four published GWAS. Note: For each GWAS for educational attainment (EA), the year of publication (in parentheses) is shown along the x-axis, and the discovery sample sizes are given above their respective bar. Bars indicate the weighted mean proportion of variance (R2) of polygenic score predictions of educational attainment in the independent samples that were reported in the original GWAS articles. Error bars reflect 95% confidence intervals (CIs) Some 95% CIs were not reported in the original article and were estimated via https://www.danielsoper.com/statcalc/calculator.aspx?id=28 for the weighted mean R2. Gray circles reflect the polygenic score prediction from each independent sample, whose N is indexed by circle size (for details on the independent samples, see Supplementary Materials section 2)

The second GWAS on years spent in education, published 3 years after the first, included a discovery sample of N ~ 290,000 adults (Okbay et al., 2016). Corresponding polygenic scores predicted ~ 3% of the variance in educational attainment in two independent samples (N ~ 10,000 and ~ 9000 Okbay et al., 2016), as well as ~ 9% of the variance in educational achievement in ~ 3000 British 16-year-olds (Selzam et al., 2017). Quadrupling the prediction effect size for educational achievement from the previous GWAS (Krapohl & Plomin, 2016; Selzam et al., 2017) suggested that even larger samples were needed to achieve sufficient statistical power for identifying the SNPs that drive the heritability of education outcomes.

Two years later, a third GWAS on years spent in education surpassed one million participants (Lee et al., 2018). Its polygenic scores accounted for ~ 11% of the variance in educational attainment across two independent samples (N ~ 5000 and ~ 9000; Lee et al., 2018) and ~ 16% in educational achievement in ~ 7000 British 16-year-olds (Allegrini et al., 2019). These prediction effect sizes are big enough to translate into real-life differences: Students with polygenic scores in the 10th percentile and below achieve on average a C + in their end-of-compulsory school grades, while students with polygenic scores in the 90th percentile and above achieve an A − (von Stumm et al., 2020). The strength of polygenic score predictions for students’ educational achievement greatly exceeds that of other predictors, such as school quality ratings that explain less than 1% variance in end-of-compulsory school grades after accounting for prior school performance and family background (von Stumm et al., 2020).

In 2022, a fourth GWAS on years spent in education with a discovery sample of N ~ 3 million individuals was published (Okbay et al., 2022). The polygenic scores from this GWAS explained on average ~ 14.3% of the variance in educational attainment across two independent samples (N ~ 10,000 and ~ 9000; Okbay et al., 2022), as well as ~ 13% of variance in grade point average at age 16 years (N ~ 4000), and ~ 8% variance in high school grade rank at age 14 years (N ~ 5000; Okbay et al., 2022). These findings consolidated that polygenic score predictions for education strengthen as GWAS samples sizes increase, potentially to the point that polygenic score predictions will approximate the heritability estimates from twin studies (Yengo et al., 2022).

To summarize, four GWAS for years spent in education were published between 2013 and 2022, with their participant numbers growing from N ~ 130,000 to ~ 3 million, a 23-fold increase in less than 10 years. These GWAS’ polygenic scores predicted from ~ 2 to ~ 14% of the variance in educational attainment, a sevenfold increase (Fig. 1). It is important to note that each consecutive GWAS builds upon the discovery sample of the previous, that is to say their samples are not independent but overlap. In this sense, an earlier GWAS can be considered outdated when the next GWAS on the same target phenotype is published, akin to revised psychometric test with improved items rendering its earlier versions obsolete. When we preregistered the current meta-analysis, the fourth GWAS on years spent in education was yet to be published (Okbay et al., 2022). During the writing of this article, the summary statistics of the fourth GWAS on years spent in education became available to some researchers but not all. Thus, we focus here on studies that reported polygenic score predictions of education using the third GWAS (Lee et al., 2018), which was the largest GWAS for which full summary statistics were broadly available when this meta-analysis was completed.

The Current Study

Our meta-analysis aimed to appraise the status quo of the empirical findings on polygenic score predictions of educational outcomes, to evaluate corresponding prediction effect sizes, and to inform future research and applications of polygenic score predictions in education. To this end, we reviewed empirical studies that reported associations between polygenic scores based on Lee et al. (2018) GWAS for years spent in education and phenotypic measures of educational attainment and achievement. We identified a priori three factors that may moderate this association. The first is the genetic ancestry or ethnic origin of the study population. GWAS samples are typically drawn from populations of European ancestry (Mills & Rahal, 2019, 2020), and corresponding polygenic scores tend to be more predictive in European than in samples from other ancestral populations (Duncan et al., 2019; Okbay et al., 2022; Wang et al., 2020). For example, polygenic scores predicted ~ 10.6% of the variance in educational attainment in participants who identified as white American but only ~ 1.6% in participants who reported to be African American (Lee et al., 2018). The second factor that likely influences polygenic score predictions of education is the sample’s age at phenotypic assessment, because genetic influences on education increase with age as a result of gene-environment correlations (Avinun, 2020; Haworth et al., 2010; Plomin et al., 1977). For example, in ~ 3000 British school pupils, the polygenic score predictions of educational achievement increased from 5% at age 7 years to 14% of the variance at age 16 years (von Stumm et al., 2020). Polygenic score predictions are therefore likely stronger for educational achievement during adolescence than in childhood. The third factor that may affect associations between polygenic scores and educational outcomes are cohort effects or generational trends. Contemporaneously with the global increase in the access and availability of education (Lee & Lee, 2016), polygenic score predictions are likely to account for increasing proportions of variance in education attainment in later as compared to earlier generations (Herd et al., 2019; Rimfeld et al., 2018a, b). One study showed that in US women who were born in the 1930s, polygenic scores explained ~ 9.5% of the variance in educational attainment, but in women born two decades later in the 1950s, polygenic scores predicted even ~ 13.6% of the variance (HRS samples; Herd et al., 2019). This and similar gains in polygenic score predictions of education likely result from greater proportions of women enrolling in and completing tertiary education in the 1970s and thereafter, when traditional gender roles became less prescribed.

Methods

Search Strategy

This meta-analysis was performed in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Page et al., 2021) and preregistered on the Open Science Framework (OSF; https://osf.io/xcm43/). We included articles that had been published by August 2022 and reported empirical associations between phenotypic measures of educational attainment or achievement and polygenic scores based on the summary statistics from the third GWAS for years spent in education (Lee et al., 2018), whose summary statistics became available in July 2018, resulting in a publication period of 49 months (i.e., 4.1 years) for this meta-analysis. We identified the publications through the citation record of the GWAS (Lee et al., 2018) on Google Scholar and Web of Science, which yielded a total of 1329 non-duplicate hits (1257 citations in Google Scholar, 776 citations in the Web of Science; flow diagram Fig. 2). We also included the polygenic score prediction estimates that were reported in the original GWAS article (Lee et al., 2018) in our review (k = 4).

Fig. 2
figure 2

PRISMA flow diagram of the identification of eligible studies. Noten refers to the number of publications; k is the number of estimates

Study Eligibility Criteria, Screening, and Exclusions

Publications that reported the following were included in our meta-analysis: (1) a phenotypic measure of educational attainment (e.g., self-reported years spent in education or highest qualification obtained) OR educational achievement (e.g., school grades or academic test scores reported by children or teachers), (2) polygenic scores based on the summary statistics from Lee et al.’s (2018) GWAS for years spent in education, (3) a statistical estimate of association between a phenotypic measure of education and the polygenic score, (4) an original empirical study (i.e., no meta-analyses or review papers) that was (5) published in a peer-reviewed journal.

Two authors independently screened 20% of the 1329 records; because their interrater agreement was perfect (100%, κ = 1; McHugh, 2012), the first author proceeded to screen all remaining hits. Overall, 77 articles met our eligibility criteria for retention, and full texts were downloaded. Further reasons for exclusions were recorded (flow diagram Fig. 2; Table S2) before extracting, coding, and recording our target variables. Interrater agreement was for coding high (93.75%, κ = 0.64). Overall, 60 articles did not meet our eligibility criteria and were excluded (Fig. 2), while 17 publications as well as the estimates reported in the original Lee et al. (2018) GWAS article were retained for the meta-analysis.

Coding Target Variables

We extracted from the retained publications (1) samples’ ancestry, (2) samples’ birth year, (3) samples’ ages at phenotypic assessment, (4) educational outcome type (i.e., educational attainment or educational achievement) and measure (e.g., school grades or test scores), (5) statistical association estimate (e.g., standardized β coefficient, r coefficient, or R2 coefficient), (6) the polygenic score computation method (i.e., PLINK, PRSice, LDpred, and lassosum), and (7) whether the association estimates were adjusted for age, sex, and the first ten principal components (PC; Table S3). These latter adjustments are typical in polygenic score prediction studies because they help control for population stratification (i.e., systematic differences in genetically heterogenous populations; Allergini et al., 2019; Ni et al., 2021).

Harmonizing Statistical Estimates

Across studies, different statistical estimates were reported to describe the association between polygenic scores and education, including correlation r coefficients (k = 21), standardized β coefficients (k = 8), and R2 values (k = 10). We retained statistical estimates that were reported as r or standardized β coefficients from regression models that included no other covariates except one or more of the following: age, sex, PC (i.e., adjusting for population stratification), and batch and chip (i.e., systematic differences within samples due to genotyping methods) (Peterson & Brown, 2005). Combining r coefficients and β coefficients in a meta-analysis can be problematic and cause bias if predictors and covariates in regression models are intercorrelated (Roth et al., 2018). Because polygenic score covariates (i.e., age, sex, and PCs) are independent from each other (Tables S6 and S16), this issue does not apply here. Where estimates were reported as R2 (k = 10), we transformed them into r coefficients using an effect size converter (https://www.escal.site). We used correlation coefficients as our effect size metric, before transforming them into Fisher’s Ζ coefficients, as is common in meta-analyses (Alexander et al., 1989).

Multilevel Meta-analyses

The R package metafor (R Core Team, 2019; Viechtbauer, 2010) was used to conduct two multilevel meta-analyses using the function rma.mv(). We conducted a three-level model to address the issue of effect size dependency due to “nestedness” (e.g., individual estimates were clustered by study). The multilevel model includes three variance components: at level 1, the sampling error at the level of the participants; at level 2, the variance within-clusters or within-studies (e.g., individual estimates); and at level 3, the variance between-cluster or between-studies. The multilevel approach allows for effect sizes to vary between participants (level 1), estimates (level 2), and studies (level 3), thereby elucidating the sources of variance.

To investigate heterogeneity, we calculated Q, I2 statistics, and σ2. The Q statistic indicates the presence of heterogeneity (Huedo-Medina et al., 2006), and I2 indicates the percentage of variability in estimates that can be attributed to each level within the model (Higgins et al., 2003), using metafor companion package dmetar (Harrer et al., 2019). The σ2 statistic provides a measure of within- and between-study variance (replacing the τ2 statistic of a conventional meta-analysis). σ2 model comparisons indicate if the heterogeneity at the within- and between-level is significant (Assink & Wibbelink, 2016).

Meta-regression

To explore the heterogeneity within polygenic score predictions of educational attainment and achievement, we fitted meta-regression models with the following moderators: (1) sample ancestry (i.e., binary European versus non-European), (2) samples’ mean age and their age range in years at phenotypic assessment; (3) the samples’ mid-birth year and birth year range; (4) the polygenic score computation method (i.e., PRSice, LDpred, PLINK, and lassosum), with LDpred as the reference because it was most frequently used across studies; and (5) whether age, sex, and the first ten principal components were adjusted for (binary yes/no). Continuous moderators (e.g., age and birth year) were centered before adding them to the meta-regression model, to allow for the intercept to be interpretable. We tested each moderator in an individual meta-regression model (not preregistered) and retained those moderators that were individually significant for multiple moderator meta-regression models (Tables S6-S23).

Publication Biases

We tested for publication biases in three ways. First, we generated funnel plots to visually examine the symmetry of data points in the meta-analysis and in the meta-regression after modeling moderator effects. Second, we conducted Egger’s regression tests, using the Standard Errors of the studies’ estimates’ effect sizes as predictors in a meta-regression model. In the absence of biases and heterogeneity, the Standard Errors will be non-significant predictors. Third, we conducted p-curve analyses with histograms of the p-values associated with the studies’ estimates (Simonsohn et al., 2014a, b), generating p-curves for our estimates using an application (http://www.p-curve.com/app4/; see Supplementary Materials section 5 for details). Across all our models, no signs of biases were detected: The funnel plots illustrated no significant degree of asymmetry, the Egger’s regression tests were non-significant (Figs. S1 and S2), and the p-curve analysis indicated that individual studies’ estimates were truly significant (Figs. S3 and S4, Tables S4 and S5).

Results

Studies’ Description

In total, n = 18 publications were identified that reported associations between polygenic scores and educational attainment (n = 11) and/or educational achievement (n = 12). For polygenic predictions of educational attainment, k = 20 estimates from 16 samples were available with Ntotal = 314,757; for polygenic predictions of educational achievement, k = 19 estimates from 10 samples were included with Ntotal = 83,788. Eighteen of the educational attainment estimates (90%) came from European samples across six WEIRD countries (Western, Educated, Industrialized, Rich, and Democratic, Henrich et al., 2010), including Hungary (k = 1), Netherlands (k = 1), New Zealand (k = 1), Norway (k = 1), UK (k = 5), and US (k = 9; Fig. 3). The remaining two estimates (10%) were from samples of African American ancestry from the US. Fifteen of the educational achievement estimates (79%) came from samples of European ancestry across five WEIRD countries, including Australia (k = 1), Netherlands (k = 1), Norway (k = 1), UK (k = 9), and US (k = 3; Fig. 4); the remaining four estimates (21%) were from US samples of African American ancestry.

Fig. 3
figure 3

Polygenic score correlations with educational attainment across countries, ancestries, and birth years. Note: Graph depicts the raw correlation coefficients after harmonizing before Fisher’s z-transformation. Triangles represent samples of African American ancestry; circles represent samples of European ancestry. For samples that ranged in birth year, the birth year mid-point was plotted; birth year ranges were not added to sustain graphical clarity. Symbols' colors reflect samples’ country of origin, and their size reflects the sample size for the correlation estimate. Symbols’ acronyms refer to the respective sample names (full sample names in Table S3). The meta-analytic estimate with 95% confidence intervals (CI) is shown on the right-hand side

Fig. 4
figure 4

Polygenic score correlations with educational achievement across countries, ancestries, and birth years. Note: For legend, see Fig. 3. The size of triangles and circles that reflects sample size are scaled relative to each other; thus, the size of triangles and circles in Fig. 3 refers to samples ranging from N = 406 (FTP) to N = 217,550 (UK Biobank), and in Fig. 4 from N = 402 (MAUES) to N = 26,518 (MoBa)

Multilevel Meta-analysis of Polygenic Score Predications of Educational Attainment

After Fisher’s z-transformation, estimates for the association between polygenic scores and educational attainment ranged from 0.12 to 0.39 with 95% CIs that excluded 0. The meta-analytic association between polygenic scores and educational attainment was ρ = 0.27, p < 0.001 (95% CI from 0.22 to 0.32), reflecting a positive relation of large effect size (Funder & Ozer, 2019; Table 1., Fig. 5). The Q statistic was 881.12 (p < 0.001), with variance being mainly attributed to within-cluster or within-study heterogeneity (i.e., individual estimates) (I2Level 2 = 62.26%). ANOVA model comparisons confirmed that the extent of heterogeneity at level 2 was significant (σ2.1 = 0.005, p < 0.001) (Table 1.).

Table 1 Three-level random effects models of polygenic score predictions of educational attainment and achievement
Fig. 5
figure 5

Forest plots of meta-analyses of polygenic score predictions for (a) educational attainment and (b) educational achievement. Note: “Author and year” show the first authors’ name and year of publication for each estimate of polygenic score predictions. Multiple estimates per study were alphabetized and reported individually. Effect sizes to the right of the solid vertical line indicate positive associations between polygenic scores and educational success. The rectangles represent the individual estimates effect sizes, and the horizontal bars represent 95% CIs for each estimate. The dashed vertical line and the diamond illustrate the meta-analytic effect size, with the dashed horizontal line representing the corresponding 95% CI. No moderator effects were considered in this analysis and plot

Multilevel Meta-analysis of Polygenic Score Predications of Educational Achievement

After Fisher’s z-transformation, estimates for the association between polygenic scores and educational achievement ranged from 0.06 to 0.39. All individual estimates, apart from one (Rabinowitz et al., 2019), were positive and significant (i.e., their 95% CIs excluded 0). The meta-analytic association between polygenic scores and educational attainment was ρ = 0.24, p < 0.001 (95% CI from 0.18 to 0.30), reflecting a positive relation that approximates a large effect size (Funder & Ozer, 2019; Table 1., Fig. 5). The Q statistic was 444.135 (p < 0.001). The variance was mainly attributed to between-study heterogeneity (I2Level 3 = 61%), but ANOVA model comparisons showed significant heterogeneity at both levels 2 and 3 (σ2.1 = 0.004 p < 0.001, σ2.2 = 0.006 p = 0.045) (Table 1.).

Multilevel Meta-regression of Polygenic Score Predications of Educational Attainment

Individual meta-regressions for each moderator were fitted first to identify individually significant moderators (Tables S6-S15). Only sample ancestry emerged as significant moderator of polygenic score predictions of educational attainment (β =  − 0.209, p < 0.001, 95% CI from − 0.29 to − 0.13), with European ancestry samples showing stronger polygenic score predictions compared to African American samples. Accounting for sample ancestry increased the association between polygenic score predictions and educational attainment to ρ = 0.287, p < 0.001 (95% CI from 0.242 to 0.332) (Fig. S5). The Q statistic remained significant (Q = 851.832 p < 0.001), suggesting that sample ancestry accounted for only a small proportion of the total heterogeneity (i.e., 3.34%). Ancestry variance was mainly attributed to between-study heterogeneity (I2Level 3 = 84.41%), and ANOVA model comparisons showed the significant remaining heterogeneity at both levels 2 and 3 (σ2.1 = 0.001 p = 0.049, σ2.2 = 0.004 p = 0.006).

Multilevel Meta-regression of Polygenic Score Predications of Educational Achievement

We identified mean age at assessment, measure of educational achievement, and sample ancestry as significant moderators in individual meta-regression models (Tables S16-S23). When added simultaneously to the meta-regression model, the association pooled estimate increased to ρ = 0.50, p < 0.001 (95% CI from 0.39 to 0.61) (Fig. S5). Two moderators were significant in the model: Mean age of assessment (β = 0.014, p < 0.001, 95% CI from 0.007 to 0.021), with polygenic score predictions being stronger in adolescents than in younger children, and the measure of educational achievement (β =  − 0.116, p = 0.025, 95% CI from − 0.206 to − 0.026), with predictions being higher for school grades than for standardized test scores. Sample ancestry was no longer significant (β =  − 0.008, p = 0.884, 95% CI from − 0.120 to − 0.104). The Q statistic reduced substantially (i.e., by 65.64%) but remained significant (Q = 152.583, p < 0.001), occurring mainly at the within-study level (I2Level 2 = 89.91%). Yet, ANOVA model comparisons showed significant heterogeneity at both levels 2 and 3 (σ2.1 = 0.003 p < 0.001, σ2.2 = 0.001 p < 0.001).

Discussion

A vast body of empirical studies emerged over the past decade on predicting individual differences in psychological and behavioral traits from their inherited DNA differences using polygenic score-based approaches (Cesarini & Visscher, 2017; Malanchini et al., 2020; Plomin & von Stumm, 2018). The potential benefits of DNA-based predictors for healthcare and disease prevention, as well as their risks, are widely discussed, and some areas have started to apply them. For example, polygenic scores for lifetime risk of breast cancer are used to help women make informed decisions about their healthcare (Ho et al., 2022; Jolie, 2013), although they are not (yet) recommended for wider clinical use (Martin et al., 2019; Shah, 2021). The possibility of analogous uses of polygenic scores in educational settings has become the central subject of large-scale empirical research efforts, fierce debates, and some fervent speculations (Abdellaoui & Verweij, 2021; Harden, 2021; Karavani et al., 2019; Martschenko et al., 2024; Okbay et al, 2022; Plomin, 2018; Polyakov et al., 2022; von Stumm & Plomin, 2021). While the current consensus is that polygenic scores are only informative for identifying differences within European ancestries that occur at the population level (i.e., inferences based on aggregated individual-level data; Morris et al., 2020a, b; Okbay et al., 2022), the possibility for individual prediction is on the horizon.

Our meta-analyses appraised the current literature, focusing on the consistency and strength of polygenic score predictions for educational attainment and educational achievement across studies. Our findings show that people’s genetic propensities for years spent in education predict their phenotypic differences in educational attainment and achievement, with large effect size that suggest substantial explanatory power and practical applications (Funder & Ozer, 2019). Indeed, the effect sizes of DNA-based predictions of educational attainment and achievement rival that of other predictors that have been previously extensively reported on, including children’s socioeconomic status (Selvitopu & Kaya, 2021) and their personality traits (Mammadov, 2021), teachers’ characteristics and expertise (Bardach et al., 2021), and school belonging and quality (Allen et al., 2018; von Stumm et al., 2020). Our meta-analytic results suggest that a SD increase in polygenic scores is associated with spending 1.08 more years in full-time education.Footnote 1 In the UK, an additional year of full-time education translates, for example, into obtaining a postgraduate degree (e.g., Masters), which forecasts favorable long-term life outcomes (Oreopoulos & Salvanes, 2011), including a 10% premium in future lifetime earnings compared to those with only an undergraduate degree (OECD, 2001). Our findings also suggested that an SD increase in polygenic scores was associated with obtaining a 0.19 higher school GPA,3 which may seem small in relative terms but can be critical for gaining admission to highly competitive tertiary education programs, such as Ivy League schools. Graduating from an Ivy League or similar school is associated 32% greater earnings compared to obtaining educational qualifications at other 4-year universities (Claybourn, 2023; U.S News & World Report, 2020).

We observed substantial heterogeneity in polygenic score predictions across within- and between-study levels. For one, polygenic score predictions for educational attainment and achievement were reduced by 57% and 50%, respectively, in samples that were described as African American compared to those of European ancestry, although this effect was only significant for educational attainment. These results align with prior findings that polygenic scores based on GWAS in European ancestry populations have limited prediction in populations of other ancestries (Duncan et al., 2019; Mills & Rahal, 2020; Popejoy & Fullerton, 2016; Wang et al., 2020). For the other, polygenic predictions of educational achievement varied by type and age of assessment. First stronger predictions emerged for school grades than test scores. This difference in PGS prediction strength may reflect that school grades capture a broader range of educational skills, such as motivation and intellectual curiosity,  than test scores. Yet, previous studies reported no differential heritability estimates for alternative measures of educational achievement (Bartels et al., 2002; Shakeshaft et al., 2013). Second, polygenic score predictions were stronger for educational achievement during adolescence than in childhood, in line with previous findings that the heritability of cognitive traits increases with age (Haworth et al., 2010; Morris et al., 2018). Even after considering these moderator effects, significant heterogeneity remained in our meta-analysis, whose origin future research must identify.

Two principal observations emerged about the samples for whom polygenic score predictions of education outcomes have been reported. First, only 15% of the samples were of non-European ancestry; they had been recruited in the US, where 46% of all estimates included in this meta-analysis stemmed from. We could not identify a single study that reported polygenic score predictions in samples drawn from non-WEIRD countries (Western, Educated, Industrialized, Rich, and Democratic, Henrich et al., 2010), in line with other behavioral genetics research findings that emanate predominantly from Iceland and the US and UK (Mills & Rahal, 2019). To disrupt the dominance of eurocentrism in genomic research, funding bodies like the National Institute of Health (NIH) now mandate the inclusion of non-European ancestry samples in data collection projects, although dedicated funding calls to study and foster ancestral diversity remain scarce (Fatumo et al., 2022; Mills & Rahal, 2020). Ethical and practical challenges, as well as substantial time and financial investment demands, oftentimes hinder collecting genetic and phenotypic data from diverse human populations. These challenges are likely to perpetuate eurocentrism in genomics for the foreseeable future, at the cost of excluding 84% of the world’s people in genomic studies (only 16% of the global population are of European ancestry; Martin et al., 2019; Mills & Rahal, 2020).

Our second observation was that samples’ birth years and ages at assessment range varied greatly across studies. As a result, systematic interpretations of generation effects and age trends in polygenic predictions of educational outcomes were difficult. Knowing when genetic propensities are more or less strongly associated with education is likely critical for effectively supporting people in their learning, for example by identifying school students’ ages when additional tutoring or making educational choices achieve optimal learning outcomes (Harden et al., 2020). To better elucidate age and generation trends in polygenic score predictions of education, more data are needed. These could also help identify the socio-political conditions under which societies achieve the balance between equality of opportunity and equity of outcomes in education.

Bioethics of Polygenic Score Predictions in Education

Applications of polygenic score predictions for psychological and behavioral traits at the level of the individual are on the horizon, urging the careful consideration of their ethical, legal, and social implications. The bioethics of polygenic scores have been discussed across a range of contexts elsewhere (Asbury & Plomin, 2013; Chapman, 2023; Chapman et al., 2019; Linnér & Koellinger, 2021; Martschenko et al., 2024). Here, we focus on two concerns that pertain in particular to the implementation of DNA-based predictors in education: the risk of widening achievement gaps in schools, and children’s and families’ psychosocial responses to learning about their genetic propensities for educational outcomes.

The greatest fear associated with the use of DNA-based predictors in educational settings, as well as elsewhere, is that they could foster genetic discrimination, diminish support for educational policies that prioritize fairness and inclusion, and reinforce genetic essentialist narratives that designate some children as “lost causes” or “unmanageable” in the classroom (Lapham et al., 1996; Martschenko, 2019; Tiller et al., 2024). For example, students’ differences in their genetic makeup could be (mis)used to justify limiting the learning opportunities for children with lower genetic propensities for educational achievement, while focusing on enhancing the learning experiences for children with higher genetic propensities for educational achievement (Larsen et al., 2022). In such a scenario, children with lower genetic propensities for educational achievement may be assigned more often to schools of poorer quality that employ less accomplished teachers than children with higher genetic propensities for educational achievement. Without education policies in place that prohibit these and other forms of genetic discrimination, polygenic score predictions could—even inadvertently—contribute to creating educational systems that increase rather than reduce achievement gaps. At present, laws make it illegal to discriminate against individuals based on their genotypes in health insurance or employment contexts (i.e., the 2008 Genetic Information Nondiscrimination Act, GINA), but there is no comparable non-discrimination act that protects students from being subject to genetic discrimination in their education.

In meritocratic societies, where power and socioeconomic resources are distributed according to demonstrated achievement and attainment, differences in the access to and exploitability of educational opportunities become the principal cause of social inequality (e.g., Keohane et al., 2014; Starr et al., 2024). Education systems that offer greater opportunities to students who are predicted to excel deepen social inequalities, regardless if the basis of such performance predictions are polygenic scores or prior school performance or family wealth (Farquharson et al., 2022; Hanushek & W ößmann, 2006; Piketty & Wood, 2024; Wilkinson & Pickett, 2010). Because students’ differences in educational opportunities are already conditioned by their family backgrounds (Chmieleski, 2019; von Stumm et al., 2022), polygenic score predictions may deepen further family background inequalities in education, if no policies are in place that recognize and effectively mitigate their perpetuation.

The second bioethical concern that requires addressing is parents’ and children’s psychosocial reactions to information about their genetic propensities for educational achievement. Parents, who learn of their children’s low genetic propensity for educational achievement, may feel guilty and attribute blame to themselves for their children’s education outcomes, akin to the responses observed in parents who learned about the role of genetics in hereditary diseases and in childhood obesity (James et al., 2006; Persky et al., 2019; Weil, 2002). Conversely, children, who are told that they have low genetic propensities for educational achievement, may take personal responsibility for their struggles or failures in school, resigning from learning opportunities, losing their academic motivation, and lowering their educational efforts (i.e., self-fulfilling prophecy; Filippello et al., 2019; Ghasemi & Karimi, 2021; Matthews et al., 2021).

The two bioethical concerns are not independent but may intersect. For example, parents with greater financial resources can purchase DNA-based predictions of their children’s psychological profiles from various companies that offer direct-to-consumer genetic testing (Au, 2022; Martschenko et al., 2024; von Stumm & Plomin, 2021), which is more difficult to afford for families with fewer financial resources. Furthermore, parents with plenty resources can then supply their children with the learning support that their polygenic scores may have indicated they need to excel in school (Au, 2022). Thus, DNA-based predictions of children’s learning may lead to disproportional advantages for children from wealthier backgrounds and contribute to widening educational achievement gaps. We therefore caution that anonymized population-level data must not be migrated into personalized education until the benefits of DNA-based predictions for the individual, the family, and the school can be guaranteed to outweigh their risks.

Safeguarding Polygenic Score Predictions in Education

Notwithstanding the bioethical concerns outlined above, DNA-based predictions of education have the potential to improve teaching and learning in three ways. First, they can help personalizing education to meet students’ different learning needs, for example recommending additional classes or tutoring, or allocating more teaching staff to students with lower polygenic scores (Arold et al., 2022). Second, polygenic scores could serve as an early “warning system” that signals learning difficulties before they have manifested as maladaptive behaviors, such as “learned helplessness” which occurs when students withdraw from educational opportunities after experiencing academic setbacks (e.g., Ghasemi & Karimi, 2021). Third, recognizing that DNA is a core driver of children’s differences in learning, rather than factors that are under their personal control, may increase society’s compassion for and understanding of people’s differences in academic performance (Asbury & Plomin, 2013; Asbury et al., 2021; Harden, 2021). To ensure that these potential benefits of polygenic score predictions can be realized, while the risks of misapplications, misinterpretations, and misattributions are minimized, safeguarding is imperative. We propose here three preliminary ways for ascertaining adequate safeguarding for polygenic score predictions in education.

First, efforts to teach and inform about genetic predispositions should target the wider public rather than a handful of elite university students, as is currently too often the case. The nature and nurture of genetic predispositions can be effectively taught to and understood people from many walks of life (Asbury & Plomin, 2013; Asbury et al., 2022; Crosswaite & Asbury, 2019; Donovan et al., 2019, 2022; Rutherford, 2020) to help preventing misattributions of genetic risks (Driver et al., 2022). Over 60% of people who learn of their polygenic predictions for mental and physical health disorders report experiencing negative emotions, and 75% misinterpret their genetic risks (Peck et al., 2021). College students, who imagined receiving a low-percentile polygenic score for educational attainment, reported afterwards lower self-esteem, academic efficacy, and educational competence relative to individuals, who imagined having a high-percentile score or who did not “receive” a polygenic score (Matthews et al., 2021). These findings align with the mixed blessing model that submits that biogenetic explanations can soften public stigma but also induce pessimism, avoidance, and helplessness among affected individuals (Haslam & Kvaale, 2015; Larsen et al., 2022). As long as the fear of bad news in your genes remains a prevalent public opinion (Harden, 2023), polygenic score applications in educational contexts risk worsening discrimination, prejudice, and inequality rather than translating into benefits.

Second, changing the narrative of genetic influences is—in part—the responsibility of researchers who apply polygenic score predictions in psychological science. In recent years, this responsibility has been met by a number of excellent popular science books that accessibly communicated the ins and outs of behavioral genetics to broad audiences (e.g., Harden, 2021; Plomin, 2018; Rutherford, 2020). Some of these books are specifically aimed at helping teachers (e.g., Asbury & Plomin, 2013) and parents to nurture children’s nature (e.g., Dick, 2021; Gopnik, 2016). Another way of ensuring that research findings in behavioral genetics are adequately disseminated is accompanying research article publications with explanations of the findings in lay language (e.g., FAQs: Beauchamp et al., 2019; Benjamin et al., 2018, 2022). These FAQs are mainly used by journalists and science communicators, who through their own media contributions influence the consensus of opinions formed in public spheres (Garman, 2019; Habermas, 1989). The majority of public communication efforts by researchers are under- or non-funded, and they often only reach a fraction of the population (Office of Science and Technology & Wellcome Trust, 2000). While some funding agencies have started to mandate public engagement and dissemination strategies (e.g., Asbury et al., 2021, 2022), achieving a fundamental change in the public opinion and interpretation of the role of genetics in education will require substantial, concerted efforts.

Finally, facilitating more research into the heterogeneity of children’s learning abilities is crucial to safeguarding the use of DNA-based predictions for education. A central aim of such research should be conducting GWAS for differentiated learning-related phenotypes, such as reading ability (e.g., Eising et al., 2022) or mathematic skills (Chen et al., 2017; Harden et al., 2020), rather than general “education,” to enable pinpointing and differentiating the educational support that will best meet a child’s specific learning needs (Procopio et al., 2022). Such GWAS could maximize the utility of polygenic scores as diagnostic tools or early warning systems for learning difficulties (Procopio et al., 2022).

Limitations

A principal limitation of current polygenic score predictions for education is that they are somewhat blunt tools that cannot discern children’s specific learning needs (Peters & Ansari, 2019), for example struggling with reading versus having difficulties with algebra. Although this limitation is not specific to our meta-analysis, polygenic scores for years spent in education may predict general learning difficulties but cannot forecast known learning difficulties, such as dyslexia and dyscalculia, for which targeted, effective interventions have been developed and validated (e.g., Cancer et al., 2020; Youman & Mather, 20132015). We view this generality of current polygenic score predictions as analogous to the prediction from being eligible for free school meals or raised in foster care, two common criteria for receiving extra educational support (Flavier & Moussay, 2014; Roberts et al., 2021). Associations between free school meals or foster care and educational achievement have been broadly documented (Education Endowment Foundation, 2017; Office for National Statistics, 2022; Schwartz & Rothbart, 2020), yet they are probabilistic rather than causal (Plomin & von Stumm, 2022), akin to our current meta-analytic findings. Children eligible for free school meals or in foster care are vastly heterogenous in their learning abilities and difficulties, just like children of the same polygenic score percentile rank (Allegrini et al., 2019). This heterogeneity prohibits improving children’s learning outcomes through “one-size-fits-all” intervention approaches that typically follow from broad, general predictors, such as difficult early life experiences and polygenic scores.

Our meta-analysis could not model the interplay between genes and the environment, including differentiating polygenic score variance due to gene-environment correlations (rGE), assortative mating, and population stratification (Nivard et al., 2024; Plomin & von Stumm, 2022; Selzam et al., 2019; Wertz et al., 2019). Comparisons of within- and between-family polygenic score predictions have shown genetic effects on educational attainment that are environmentally mediated (i.e., passive rGE; Selzam et al., 2019; Wang et al., 2021). The most recent GWAS of years spent in education found that only ~ 30% of the prediction of educational attainment was due to direct genetic effects, with environmental confounding playing a major role (Okbay et al., 2022). While no studies to date reported comparable findings for educational achievement, polygenic scores should not be interpreted as reflecting direct genetic effects but as predictors that capture genetic and environmental effects (Plomin & von Stumm, 2022).

Conclusion

Our meta-analyses showed that polygenic scores predict educational achievement and attainment with large effect sizes. These findings imply that individuals with higher genetic propensities perform on average better in school, remain longer in full-time education, and obtain higher educational qualifications. Our meta-analyses’ samples came from WEIRD countries, and polygenic score predictions were strongest for school grades in adolescents of European ancestry. Polygenic score predictions remained heterogenous after accounting for a number of plausible moderators, suggesting that at present polygenic scores are neither suitable for individual level assessments nor for personalizing education in applied settings.

Our findings underscore the importance of inherited DNA variants for understanding and addressing students’ heterogeneity in learning abilities. Applications of polygenic scores in education are becoming increasingly tangible, even though they are not yet effective tools for improving teaching and learning (Asbury et al., 2021; Au, 2022; Harden, 2021; Morris et al., 2020a, b). With even more powerful polygenic scores for education on the horizon (e.g., Okbay et al., 2022), the interest in and potential for applying polygenic scores in education is growing, and with that the need for safeguarding and further research. In the words of Marie Curie (cited in Curie & Sheean, 1938, p.13): “Nothing in life is to be feared, it is only to be understood. Now is the time to understand more, so that we may fear less.”