Introduction

Autism spectrum disorder (ASD) is a group of neurodevelopmental disorders presenting genetic and phenotypic heterogeneity across the affected individuals. Within ASD, affected individuals exhibit a combination of abnormalities in adaptive and social functioning, language, and cognitive abilities [1].

Genetic studies indicate that numerous genes across the genome contribute to the risk of ASD. There has been substantial recent progress in the discovery of rare mutations. For example, many studies have identified a large number of copy number variants (CNVs) and single nucleotide variants (SNVs) to be associated with disease. De novo mutations associated with ASD were found to be rare and distributed over many genes across the genome [2–5]. Recessive loci in ASD have also been studied, reflecting the role of inherited variation [6–10].

One hypothetical model with some support is that autism may reflect approximately 2 groups, the first involving lower-functioning patients and a second involving higher-functioning patients (Fig. 1). This model would further hypothesize that lower-functioning autism may be constituted by a genetic cause that reflects a smaller number of individually rare yet highly penetrant mutations [some even reflecting monogenic causes such as fragile X syndrome (FXS) and others], whereas higher-functioning autism might reflect a genetic architecture involving a large number of common susceptibility variants of minor effect. An example of support for this hypothesis is that some forms of rare variation appear to be enriched largely in lower-functioning patients [3, 9, 11–15]. Figure 1 represents a proposed model that may help guide approaches to dissecting the heterogeneity in autism, yet further work is necessary to provide more substantial proof for this model.

Fig. 1

Model on genetic architecture of autism spectrum disorder (ASD; adapted from Morrow et al. [11]). The heterogeneity in autism may be dissected into 2 groups, approximately lower-functioning and higher-functioning groupings. One hypothesis with some support is that the genetic architecture of lower-functioning autism may involve a relatively smaller number per patient of rare and highly penetrant mutations. By contrast, the genetic architecture of higher-functioning autism may involve a large number of common variants with minor effect. The pie chart indicates the multitude of individually rare, highly penetrant genetic causes that may be identified in lower-functioning ASD. Of the cases of autism, at least 70% have no known genetic mechanisms [16]. The data in the pie chart are hypothetical to reflect a large number of individually rare loci identified in cases; the largest piece of the pie represents a majority of cases in whom a genetic mutation is still not identified.

In this review, we highlight progress in the discovery of rare ASD-associated mutations. Although there is also support for an important role for common variation in autism [17], we focus here on rare mutations, which can often pinpoint coding changes in specific genes. We argue that these discoveries provide critical traction for investigation of disease-relevant neurodevelopmental mechanisms. As convergent pathways emerge from the study of enough such rare variants, it is possible that some of these pathways may be targeted for new drug development.

Monogenic Disorders with Autistic Symptoms

There is a variety of monogenic disorders that include autistic symptoms. These disorders provide important models that may be informative about complex forms of autism and provide clues regarding the pathophysiology of autism symptoms. It is important to acknowledge, however, that these syndromes most often constitute a broader range of developmental anomalies than autism alone. Below we review a subset of these syndromes, with an emphasis on how they might inform the field to potential neurodevelopmental mechanisms. All monogenic disorders explained below are summarized in Table 1.

Table 1 Monogenic disorders associated with autistic symptoms

Angelman Syndrome

First described in 1965, Angelman syndrome (AS) is a genetic form of intellectual disability (ID). Phenotypic presentation includes abnormal gait, frequent laughter, seizures, microcephaly, and craniofacial abnormalities. Despite the range of symptoms, individuals with AS are often recognized by a generally happy disposition [18, 19]. AS affects between 1 in 15,000 and 1 in 40,000 individuals [20, 21]. Some studies have provided support to indicate that approximately 60% of patients may also be diagnosed with autism [22].

AS is an example of a condition caused by genetic imprinting, as it results from loss-of-function mutations or deletions involving the maternal copy of UBE3A. UBE3A is located on human chromosome 15 and encodes ubiquitin-protein ligase E3A (UBE3A). UBE3A is involved in the ubiquitination of other proteins within the cell, thereby targeting them for degradation. Evidence suggests that UBE3A has 2 roles within the process of ubiquitination—the first in catalyzing complex formation and the second in recognizing substrate specificity [19, 23, 24].

Within the brain, UBE3A has been shown to affect circuit development and the regulation of excitatory/inhibitory balance. One UBE3A target, activity-regulated cytoskeleton-associated protein, mediates surface expression of α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors (AMPARs) at excitatory synapses. In the absence of UBE3A expression, activity-regulated cytoskeleton-associated protein degradation is decreased, resulting in increased internalization of AMPARs and the development of impaired circuit function [25]. It has also been demonstrated that UBE3A plays a role in inhibitory synaptic function within the neocortex. Loss of maternally derived UBE3A leads to a vesicle cycling defect at inhibitory synapses causing a build-up of vesicles at the axon terminal and decreased synaptic transmission. As the inhibitory synaptic deficit is more severe than that of excitatory synapses, the ultimate consequence is an excitatory/inhibitory imbalance [26]. Interestingly, duplication of the same region of chromosome 15, involving UBE3A and additional genes, is also observed in another autism-like syndrome known as 15q duplication syndrome [29].

Autism with Macrocephaly

Autism with macrocephaly (also known as Cole–Hughes syndrome) is a monogenic form of autism with progressive postnatal macrocephaly [30–32]. Affected individuals can present with ID and language/social delays, as well as obesity, retarded bone growth, and facial anomalies [30, 32]. Of all individuals diagnosed with autism, it is estimated that 20% may have macrocephaly [33].

Autism with macrocephaly can be caused by dominant, partial loss-of-function mutations in PTEN, encoding the phosphatase and tensin homologue protein on chromosome 10 (PTEN) [31, 34, 35]. Although reports of the exact incidence vary, it is estimated that between 5% and 17% of individuals with autism and macrocephaly carry a PTEN mutation [31, 34]. PTEN is a tumor suppressor gene, and the PTEN protein functions as a dual-specificity phosphatase, capable of acting on both tyrosine and threonine/serine residues. Although it is active within several pathways involving cell growth, cell cycle arrest, and apoptosis, one key function is inhibition of phosphoinositide 3-kinase/protein kinase B signaling [31, 34, 36].

Within the brain, PTEN affects the development of neuronal and synaptic morphology. Analyses of PTEN-deficient neurons in mice revealed an increase in the size of neuronal cell bodies and arbors, as well as an increase in dendritic spine density. Synaptic structures throughout the cortex and cerebellum were enlarged, and axons showed evidence of severe myelin thickening. Loss of PTEN expression also resulted in the development of decreased synaptic functionality, or weakening of both synaptic transmission and synaptic plasticity within the hippocampus [37, 38].

PTEN mutations, often within the germline, have also been implicated in other disorders involving autistic symptoms, collectively known as PTEN hamartoma tumor syndromes. Notable examples of PTEN hamartoma tumor syndromes include Cowden syndrome and Bannayan–Riley–Ruvalcaba syndrome, and a defining feature of both disorders is the presence of benign tumors, known as hamartomas, throughout the body [39]. In Cowden syndrome, these hamartomas frequently become malignant, and affected individuals present with macrocephaly, ID, and autistic traits [40]. Characteristic features of Bannayan–Riley–Ruvalcaba syndrome also include macrocephaly, ID, and autistic traits, as well as thyroiditis, vascular malformations, and delayed motor development [31, 41].

CHARGE Syndrome

CHARGE, first described in 1979, is a syndrome encompassing a constellation of congenital anomalies. The term CHARGE is an acronym for the syndrome’s six core features: C, coloboma of the iris/retina; H, heart defects; A, atresia of the choanae; R, retardation of growth/development; G, genital abnormalities; and E, ear abnormalities. Occurring once in every 8500–10,000 live births [42, 43], CHARGE syndrome also involves a range of secondary features, including deafness, laryngomalacia, vestibulocochlear defects, facial nerve palsy, and oral clefts [44–47]. Many individuals with CHARGE syndrome are reported to exhibit autistic-like behaviors, with an estimated 27.5% meeting classification for autism [45].

Although an initial report implicated a mutation in SEMAPHORIN 3E [48], the majority of CHARGE syndrome cases have since been shown to result from dominant, loss-of-function mutations in CHD7, located on human chromosome 8 and coding for chromodomain helicase DNA-binding protein-7 (CHD7). Members of the CHD protein superfamily affect development of the early embryo through their influence on chromatin structure and gene expression. CHD7 is expressed ubiquitously throughout a range of tissue types, resulting in the variety of systems affected in CHARGE syndrome [44, 49].

Of particular relevance to the field of neurodevelopment is the role of CHD7 in the regulation of genes involved in the guidance of neural crest cells and axons. In the absence of CHD7 expression, several genes were shown to be misregulated, including members of the semaphorin and ephrin families [50].

Christianson Syndrome

Christianson syndrome (CS), first noted in 1999 in a 5-generation South African pedigree, is an X-linked recessive neurodevelopmental disorder found in males [51]. Although the exact prevalence is not known, CS is estimated to affect between 1 in 16,000 and 1 in 100,000 individuals [52]. A majority of CS individuals are nonverbal and present with epilepsy, ID, and ataxia. Additional core symptoms are postnatal microcephaly and hyperkinetic behavior. Other symptoms may include eye movement dysfunction, gastroesophageal reflux disease, low muscle tone, cerebellar atrophy, and phenotypic regression [51–53]. In one study of phenotypic diversity within CS, 43% of affected individuals received a clinical diagnosis of autism, and a majority of patients present with autistic symptoms [52].

CS is the result of likely loss-of-function mutations in SLC9A6, encoding the Na+-H+ exchanger 6 (NHE6) protein. NHE6 functions as an endosomal sodium–proton exchanger, exchanging internal hydrogen ions for external sodium ions. NHE6 is localized to early, recycling, and late endosomes, and it has been shown to modulate intraendosomal pH [54].

Evidence reveals that NHE6 also plays a role in the regulation of neuronal circuit development. In the absence of NHE6 expression, brain-derived neurotrophic factor/tropomyosin receptor kinase B signaling is attenuated, which is associated with impoverished axonal and dendritic arborization. Additionally, loss of NHE6 causes reductions in both synapse number and strength, thereby diminishing functional connectivity [54].

FXS

FXS is an X-linked disorder characterized by cognitive, behavioral, and physical features. FXS occurs most often in males, with an approximate incidence of 1 in 4000 (1 in 8000 females) [55]. Cognitive symptoms include ID and language deficits, whereas behavioral symptoms include hyperactivity and social deficits. Prominent physical features are macro-orchidism, enlarged ears, and an elongated face [56]. According to results of a national parent survey, approximately 46% of males and 16% of females affected by FXS are also diagnosed with autism [57].

FXS is caused by mutation in FMR1, involving an expansion of CGG repeats and subsequent gene silencing by hypermethylation [58]. FMR1 is located at a fragile site on the X chromosome and codes for the fragile X mental retardation protein (FMRP) [58]. FMRP is an RNA-binding protein with robust expression in neurons; it is found at the postsynapse within dendrites, where it affects local protein translation [59]. Its loss has been shown to increase long-term depression in mouse hippocampus, suggesting that FMRP may affect protein synthesis by repressing translation of a specific group of mRNAs [60].

Neurofibromatosis, Type 1

Affecting as many as 1 in 2700 newborns [61], neurofibromatosis type 1 (NF1) is a monogenic disorder characterized by both physical and cognitive symptoms. Hallmarks of the syndrome include café-au-lait spots and benign nerve tumors known as neurofibromas. Additional physical symptoms include freckling of the groin and/or underarm, Lisch nodules of the iris, optic pathway gliomas, and bone lesions [62]. Approximately 80% of individuals with NF1 exhibit deficits in cognitive functioning [63]. Poor social skills, signs of attention deficit hyperactivity disorder (ADHD), and difficulties with executive function are also observed in affected individuals. In addition, it is estimated that at least 26% of individuals with NF1 meet the criteria for autism [64].

NF1 is the result of a dominant, loss-of-function mutation in NF1, located on human chromosome 17 and encoding the neurofibromin protein [65–67]. Neurofibromin negatively regulates the Ras–mitogen-activated protein kinase (MAPK) pathway, and mutations in NF1 lead to hyperactivation of the signaling cascade. As neurofibromin is located postsynaptically in dendritic spines, excessive activation of the pathway results in altered synaptic plasticity, thought to be the cause of the cognitive aspects of the NF1 phenotype [64, 68]. An additional form of neurofibromatosis is neurofibromatosis type 2, which is caused by mutations in NF2, encoding the merlin protein. Neurofibromatosis type 2 is generally characterized by tumors within the brain, cranial nerves, and spinal cord [69]. Although related, it is a disorder distinct from NF1, with few current publications addressing a potential association with autistic symptoms [70].

Rett Syndrome

Rett syndrome (RTT) is a progressive neurodevelopmental disorder affecting an estimated 1 in 10,000 individuals, almost all of whom are female [71, 72]. Individuals with RTT develop typically for the first 6–18 months of life, after which they experience a period of regression, losing speech and deliberate use of the hands. Phenotypic presentation of the syndrome also includes microcephaly, ataxia, epilepsy, breathing difficulties, and stereotyped hand movements [73], and up to 40% of individuals with RTT meet the criteria for autism [74].

In 1999, researchers identified dominant, loss-of-function mutations in the X-chromosome gene MECP2, encoding methyl-CpG-binding protein 2 (MeCP2), as the cause of RTT [71]. MECP2 undergoes X-inactivation, resulting in the range of phenotypic severity seen within the syndrome. MeCP2 consists of 2 functional domains—a methyl-CpG binding domain and a transcriptional repressor domain. When MeCP2 is bound to chromatin, the transcriptional repressor domain interacts with histone deacetylases, resulting in transcriptional repression of target genes containing CpG residues [71]. More recent evidence also supports the role of MeCP2 in gene activation [75].

Within the brain, MeCP2 is involved in circuit function and the balance between excitation and inhibition. Evidence demonstrates that in the absence of MeCP2 expression, excitatory synaptic activity within cortex is reduced. Cortical pyramidal neurons exhibit a decrease in both spontaneous firing rate and amplitude of miniature excitatory postsynaptic currents [76].

Just as loss of MeCP2 expression can lead to dysfunction, so too can MECP2 duplication. Increased MECP2 dosage results in a rare disorder distinct from RTT, known as MECP2 duplication syndrome. MECP2 duplication syndrome causes severe ID and progressive spasticity in all affected males. Additional clinical symptoms include impaired or absent speech, impaired or absent walking ability, seizures, and premature lethality [77].

Timothy Syndrome

Timothy syndrome (TS) is a monogenic disorder characterized primarily by cardiac arrhythmias and dysfunction of multiple organ systems. Symptoms include syndactyly, elongation of the cardiac QT interval, congenital heart defects, hypoglycemia, immune system deficiency, and developmental delays. TS and the associated heart defects are often lethal, with an average age of 2.5 years at death. In 1 study of surviving children, 60% met criteria for autism and 80% met criteria for ASD [78]. Owing to the syndrome’s rarity, the exact prevalence of TS is unknown.

TS is caused by dominant, gain-of-function, frequently de novo mutations in CACNA1C, located on human chromosome 12 and encoding the α-1 subunit of the Cav1.2 L-type calcium channel. The Cav1.2 channel is voltage-dependent and is expressed in the heart and throughout the brain. When CACNA1C is mutated, prolonged abnormal calcium current occurs as the result of a failure in channel inactivation [78].

CACNA1C mutations have multiple functional consequences within the central nervous system. Loss of Cav1.2 mRNA expression in the forebrain leads to a loss of late long-term potentiation, as well as decreased activation of signaling pathways such as MAPK/extracellular regulated kinase. Additionally, the CACNA1C mutation seen in TS causes defects in differentiation of projection neurons, resulting in the development of impaired connectivity [79].

Tuberous Sclerosis Complex

Tuberous sclerosis complex (TSC) is an autosomal dominant disorder marked by benign tumors throughout multiple organ systems. These tumors, known as hamartomas, have an unpredictable distribution, but are often found within the brain, the skin, the heart, the kidneys, and the lungs [80, 81]. TSC is associated with a variety of cognitive and developmental deficits, including ID, ADHD, and epilepsy. TSC is estimated to affect as many as 1 in 5800 newborns [82], approximately 40% of whom are diagnosed with ASD [83, 84].

TSC is the result of a loss-of-function mutation in 1 of 2 TSC genes—TSC1, located on human chromosome 9, and TSC2, located on human chromosome 16. TSC1 encodes the hamartin protein, whereas TSC2 encodes a protein known as tuberin, both of which are suggested to function as tumor suppressors [80, 81]. The TSC phenotype in individuals with germline TSC mutations may be explained by the “second hit theory”. Sporadic somatic mutation of the remaining copy of TSC leads to a loss of heterozygosity in a given cell, resulting in tumor formation [85].

As the phenotypic presentation of TSC is similar regardless of which gene is mutated, hamartin and tuberin were presumed to affect the same cellular pathway [81]. The 2 proteins have since been shown to form a complex responsible for inhibition of the mammalian target of rapamycin signaling pathway [86]. In the absence of hamartin/tuberin, there is overactivation of the mammalian target of rapamycin pathway, resulting in abnormal development of neurons, including cellular enlargement [87].

With regard to all of the above monogenic syndromes, one concern about the association of these disorders with autism is that it can be challenging to diagnose autistic symptoms in the setting of severe ID. However, in addition to the presentation of autistic symptoms in the context of these monogenic disorders, there is other evidence that supports a link between these monogenic disorders and idiopathic autism likely due to complex inheritance. For one example, FMRP binds to the mRNA of genes in which mutations have been found in idiopathic autism [88]. Additionally, mutations in CHD7 and its binding partner CHD8, for example, have also been found in idiopathic autism [3]. Rare variants in CHD7 are thought to confer increased autism risk, even in the absence of a CHARGE syndrome diagnosis [89]. Further still, mutations in genes highly related to MECP2, such as MBD5, have been found in idiopathic autism [90].

CNVs in ASD

CNVs have been essential to our understanding of autism genetics. Advances in genomic microarray technology coupled with decreasing cost have allowed for greater testing of individuals and identification of many CNVs in autism [91–93]. CNVs refer to deletions or duplications of genomic segments. They are either inherited from a parent or occur de novo owing to germline alterations. Although CNVs are a source of normal human genetic variation [94, 95], a seminal study in 2007 demonstrated that autism cases have an increased burden of de novo CNVs [96]. The authors found that simplex families (i.e., only 1 affected child) had a significantly greater rate of de novo CNVs (10%) compared with multiplex families (i.e., multiple affected children) and control families (3% and 1%, respectively). Recent studies of simplex families using microarrays with greater resolution found de novo CNV rates of 5.8–7.9% in probands compared with 1.7–2.0% in unaffected siblings [97, 98]. Furthermore, an increased CNV burden was identified in the same loci encompassing balanced chromosomal abnormalities, such as translocations and inversions, in an autism/neurodevelopmental disorder cohort [99].
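To make the size of this burden concrete, the rate ranges reported for the higher-resolution studies [97, 98] can be turned into a fold difference. The short Python sketch below is illustrative arithmetic only; the function name fold_range and the choice to pair the extremes of the two ranges are assumptions made for illustration, not an analysis taken from the cited studies.

```python
def fold_range(case_low, case_high, control_low, control_high):
    """Return the (smallest, largest) fold difference implied by two rate ranges."""
    smallest = case_low / control_high   # most conservative pairing of endpoints
    largest = case_high / control_low    # most generous pairing of endpoints
    return smallest, largest


# De novo CNV rates (%) quoted in the text: probands vs. unaffected siblings
low, high = fold_range(5.8, 7.9, 1.7, 2.0)
print(f"De novo CNVs are roughly {low:.1f}- to {high:.1f}-fold more frequent in probands")
```

Pairing the endpoints this way gives the widest plausible bracket; the fold difference within any single study would fall somewhere inside it.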

Large case–control studies have been employed to elucidate specific loci associated with autism susceptibility. This research strategy compares the frequency of specific CNVs across the genome in probands with autism to unaffected controls. Notable recurrent CNVs include 1q21.1 [98, 100, 101], 7q11.23 [98], 15q13.3 [98], 16p11.2 [98, 102–104], 17q12 [101, 105], and 22q11.2 [98, 106]. Recurrent CNVs may arise as a result of nonallelic homologous recombination whereby regions flanked by segmental duplications may be prone to deletions and duplications during meiosis [107, 108]. There is considerable overlap among recurrent CNVs in autism and other cognitive disorders such as ID, schizophrenia, and epilepsy [109]. This suggests potential shared pathologic mechanisms underlying various neurodevelopmental disorders.
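The case–control comparison described above amounts to asking whether carriers of a given CNV are over-represented among probands. One common way to test this for rare events is a one-sided Fisher's exact test, sketched below in plain Python. This is a minimal illustration and not the method of any particular cited study; the function name and the carrier counts in the usage lines are hypothetical and chosen only to show the calculation.

```python
from math import comb


def fisher_one_sided(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test.

    Returns the probability of seeing at least `case_carriers` carriers among
    cases if carrier status were unrelated to case/control status.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denominator = comb(total, case_total)
    tail = 0
    for k in range(case_carriers, min(carriers, case_total) + 1):
        # Hypergeometric term: k carriers among cases, the rest among controls
        tail += comb(carriers, k) * comb(total - carriers, case_total - k)
    return tail / denominator


# HYPOTHETICAL counts, for illustration only: 10 carriers in 2000 probands
# versus 2 carriers in 2000 controls.
p_value = fisher_one_sided(10, 2000, 2, 2000)
print(f"one-sided p = {p_value:.4f}")
```

Exact tests of this kind are suited to CNVs because carrier counts are small, so large-sample approximations can be unreliable.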

It is important to consider gene expression changes that may result from CNVs. In fact, CNVs usually represent polygenic loci whereby a series of contiguous genes are likely altered by gene dosage, that is, there is a reduction or increase in transcript and protein levels. Therefore, the regulation of the precise level of a series of genes is altered. In this way, while a single gene within a CNV may be informative with regard to mechanisms, it is most likely that these genes implicated by multigenic CNVs act interactively. Recently, there has also been increased focus on the possibility that microRNAs found within CNVs (which also regulate a gene expression program in trans) may be another important part of the polygenic mechanisms resulting from changes in gene copy number [110]. Another interesting feature of CNV disorders is that in some cases, deletions and duplications at the same locus can cause overlapping, reciprocal or quite distinct phenotypes [102, 104, 111, 112]. Finally, some recurrent ASD CNVs exhibit incomplete penetrance whereby they are observed in unaffected siblings and control populations [113].
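The gene-dosage idea can be illustrated with simple arithmetic. The Python sketch below assumes, purely for illustration, that transcript level scales linearly with the number of gene copies; real loci may deviate from this because of feedback or compensation, and the function name expected_dosage is hypothetical.

```python
def expected_dosage(copy_number, normal_copies=2):
    """Relative transcript level under a simple linear copy-number model (an assumption)."""
    return copy_number / normal_copies


for label, copies in [
    ("heterozygous deletion", 1),
    ("typical diploid state", 2),
    ("heterozygous duplication", 3),
]:
    print(f"{label}: {copies} copies, about {expected_dosage(copies):.2f}x expression")
```

Under this simplified model every gene within a multigenic CNV shifts by the same factor at once, which is why the combined effect of contiguous genes, rather than any single gene, may matter most.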

Recurrent CNV Disorders in Autism

1q21.1

CNVs in the 1q21.1 locus are enriched in individuals with autism [98, 100, 101, 114]. Deletions and duplications have been reported in approximately 0.2% (25 in 5218 and 27 in 16,557 subjects) and 0.1% (9 in 5218 and 17 in 16,557 subjects), respectively, of patients from large cohorts with ID, ASD, and congenital anomalies [100, 114]. The 1q21.1 locus is roughly 1.35 Mb, consisting of at least 7 genes [100]. Probands with deletions are characterized by mild-to-moderate developmental delay, microcephaly, joint hypermobility, congenital heart abnormalities, hypotonia, seizures, cataracts, and dysmorphic features (e.g., frontal bossing and bulbous nose) [100, 114]. Probands with the reciprocal duplication are characterized by mild-to-moderate developmental delay, autistic behaviors, relative macrocephaly, and dysmorphic features (e.g., frontal bossing and hypertelorism). The 1q21.1 CNV is incompletely penetrant with variable expressivity, as parents and siblings may also have the same CNV yet display none of the associated symptoms [100, 114].
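The approximate frequencies quoted for 1q21.1 deletions and duplications can be checked from the carrier counts given in the text. The Python sketch below pools the two cohorts; pooling is an assumption about how the rounded figures might be read, since the text does not state how they were derived, and the variable names are illustrative.

```python
# (cohort size, deletion carriers, duplication carriers) as quoted in the text [100, 114]
cohorts = [(5218, 25, 9), (16557, 27, 17)]


def percent(carriers, subjects):
    """Carrier frequency expressed as a percentage."""
    return 100 * carriers / subjects


total_subjects = sum(n for n, _, _ in cohorts)
pooled_del = percent(sum(d for _, d, _ in cohorts), total_subjects)
pooled_dup = percent(sum(u for _, _, u in cohorts), total_subjects)

for n, dels, dups in cohorts:
    print(f"n={n}: deletions {percent(dels, n):.2f}%, duplications {percent(dups, n):.2f}%")
print(f"pooled: deletions {pooled_del:.2f}%, duplications {pooled_dup:.2f}%")
```

The per-cohort deletion frequencies differ noticeably (25 in 5218 is about 0.48%, whereas 27 in 16,557 is about 0.16%), so the rounded figures are best read as rough summaries across cohorts.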

Dosage of the HYDIN paralog located in this region may be important for influencing head size and growth [114]. Additionally, increased dosage of the CON1 subtype of the DUF1220 protein domain is correlated with greater autism severity [115].

7q11.23

Duplications in 7q11.23 are significantly associated with autism susceptibility in a simplex cohort with a frequency of 0.09% in 3816 ASD probands [98]. These de novo duplications spanned 1.37 Mb, encompassing 22 genes. Individuals with 7q11.23 duplications exhibit a highly heterogeneous phenotype. Common features include ASD, cognitive deficits, severe expressive language delay, facial dysmorphisms (e.g., short philtrum and thin lips), anxiety, behavioral problems, and hyperactivity [98, 116, 117].

This CNV locus is especially interesting because the reciprocal deletion causes Williams–Beuren syndrome (WBS), a genomic syndrome notable for excessively social behavior. WBS is characterized by a hypersocial personality, cognitive deficits (particularly in visuospatial skills), anxiety, facial dysmorphisms (e.g., flat nasal bridge, upturned nose, and delicate chin), cardiovascular disease, growth retardation, and connective tissue abnormalities [118]. The deleted region, referred to as the WBS chromosome region, spans 1.5 Mb (in 95% of cases) or 1.8 Mb (in 5%) [119].

Using induced pluripotent stem cell (iPSC) methods, Adamo et al. [120] found that GTF2I mediates the dose-dependent transcription of 7q11.23. These researchers also identified other gene candidates, such as PDLIM1, which is associated with neurite outgrowth; MYH14, associated with hearing impairment; and BEND4, a transcription factor containing a BEN domain. Transcriptional analysis using lymphocytes from the Simons Simplex Collection (SSC) identified misregulated genes, including BCL7B, EIF4H, and LAT2 [121]. Pathway analysis of misregulated genes was notable for forebrain development, determination of bilateral symmetry, and hippocampal development.

15q11-q13

15q11-q13 duplications constitute one of the most common genetic findings in autism [98, 104, 106, 122]. The 15q11-q13 locus contains 5 recurrent breakpoints (BP1–BP5), with BP2–BP3 being particularly associated with autism [98, 122]. This region is also notable for imprinting. Deletions in BP2–BP3 are associated with either Prader–Willi syndrome (PWS; paternal) or AS (maternal), depending on parental origin of the affected chromosome [123]. The phenotype is heterogeneous and may include ASD, hypotonia, speech and motor delays, anxiety, emotional lability, and hyperactivity [124]. As described above, one of the strongest gene candidates for autism in this locus is UBE3A. Some studies have reported that PWS caused by maternal uniparental disomy was more likely to be associated with autism symptomatology than PWS caused by paternal deletions [125, 126]. This suggests that increased dosage of genes in the maternal locus, such as UBE3A, may increase the risk of autism. However, truncating mutations in MAGEL2, a gene in the paternal-only expressed PWS region, were recently found in 4 patients diagnosed with ASD [127].

16p11.2

Deletions and duplications of 16p11.2 have been associated with autism and related neurodevelopmental disorders [98, 102, 103], yet are incompletely penetrant [102, 128]. This 593-kb segment encompasses 25 genes and accounts for approximately 1% of cases of ASD [98, 102]. Both deletions and duplications are associated with language delays, ASD, behavioral problems, congenital anomalies, and seizures [129–131], although the deletion may appear more commonly in autism [102, 132], and the duplication appears to be more common in schizophrenia [112]. Contrasting phenotypes are observed: deletions are associated with increased brain volume/macrocephaly and increased body mass index/obesity, while duplications are associated with decreased brain volume/microcephaly and decreased body mass index [129, 133, 134]. Structural brain alterations were also found in a dosage-dependent manner based on whether the 16p11.2 locus was deleted or duplicated [135]. Brain regions involved in reward circuitry such as the striatum, mediodorsal thalamus, orbitofrontal cortex, and insula had significantly greater gray matter volume in individuals with deletions compared with duplications. However, individuals with duplications had significantly greater gray matter volume in language regions such as the middle and superior temporal gyri and the caudate.

Overexpressing the human transcript of a gene located in 16p11.2, KCTD13, in zebrafish resulted in a microcephaly phenotype, whereas knocking down its expression resulted in a macrocephaly phenotype [136]. Genome-wide transcriptome analysis revealed a significant positive correlation between the 16p11.2 locus and expression of MAPK3, YPEL3, CORO1A, and KCTD13 [121]. Head circumference was also highly correlated with expression of TAOK2, CORO1A, KCTD13, and QPRT. Another transcriptome analysis of 16p11.2 deletions and duplications in 1) cortical mouse tissue and 2) human lymphoblasts from multiplex autism families found dysregulated gene expression in synaptic functioning (e.g., NRXN1, NRXN3), chromatin modification (e.g., CHD8, EHMT1, MECP2), transcriptional regulation (e.g., TCF4, SATB2), and ID (e.g., FMR1) [137].

22q11.2

The 22q11 locus is one of the most robust CNV findings in neurodevelopmental genetics. Deletions and duplications of 22q11.2 are strongly associated with autism across multiple large case–control studies [98, 104, 106]. However, CNVs in this region are associated with a range of phenotypes and are incompletely penetrant [113]. 22q11 deletion syndrome (22q11DS), also referred to as velocardiofacial or DiGeorge syndrome, represents one of the strongest genetic risk factors for neuropsychiatric disorders. Historically, 22q11DS constitutes a major risk factor for a schizophrenia-like condition [138]. Approximately 85% of individuals with 22q11DS have the recurrent 3-Mb deletion, encompassing roughly 60 genes [139]. In the largest study of individuals with 22q11DS, Schneider et al. [140] examined the psychiatric profile of 1402 participants (aged 6–68 years) throughout their lifespan. The prevalence of schizophrenia spectrum disorders greatly increased from early adulthood (23.5%) to adulthood (41.7%). ASD prevalence was highest in adolescents (26.5%). A subset of this cohort available for cognitive testing showed a mean full-scale IQ of 71.25, with almost half having a full-scale IQ > 70. Deficits in executive functioning have also been identified, specifically related to working memory and attention [141].

There are a number of genes in this locus potentially contributing to autism and other phenotypes. COMT may influence cognitive functioning and psychosis [142–144]. Epistasis may also occur between COMT and PRODH, and is likely to affect behavior [139]. Heterozygous Tbx1 mice displayed impairments in social interactions, ultrasonic vocalizations, and working memory [145]. Haploinsufficiency of Tbx1 and Gnb1l in a 22q11DS mouse model appears to be responsible for decreased prepulse inhibition, a sensorimotor gating measure associated with psychiatric disorders [146]. GNB1L is also associated with ASD [147]. Ranbp1 is associated with microcephaly and is important for cortical progenitor proliferation in layer 2/3 [148]. Cxcr4 potentially regulates cortical interneuron migration and placement [149]. There are also a number of microRNA-related loci in this region, including miR-185 and DiGeorge syndrome chromosomal (or critical) region 8, a component of the microprocessor complex that is involved in the initial step of miRNA biogenesis [110].

22q13

22q13 deletion syndrome, also known as Phelan–McDermid syndrome, is characterized by global developmental delay, absent or severely delayed speech, ASD, neonatal hypotonia, normal or accelerated growth, and dysmorphic features [150, 151]. Dysmorphic facial features include dolichocephaly, flat midface, wide nasal bridge, and bulbous nose. Cerebellar vermis hypoplasia, enlarged posterior fossa and ventricles, and thin corpus callosum and white matter are common neuroimaging findings [152].

Haploinsufficiency of SHANK3 is thought to be an important contributor to Phelan–McDermid syndrome [153, 154]. Furthermore, SHANK3 is also considered a major effect locus in autism [153, 154], with deletions found in at least 0.5% of patients with autism [155]. SHANK3 encodes a master scaffolding protein localized to the postsynaptic density in excitatory synapses [156, 157]. Shank3-null mice exhibit deficits in synaptic functioning such as impaired AMPAR trafficking, glutamatergic transmission, and long-term potentiation [158, 159]. These mice recapitulate abnormalities in social behavior, communication, repetitive behaviors, motor functioning, and learning/memory [158–160]. However, SHANK3 is not the only gene responsible for 22q13 deletion syndrome [161]. Mice lacking IB2, also known as MAPK8IP2, have abnormal Purkinje cell morphology, as well as decreased AMPA and increased N-methyl-D-aspartate cerebellar glutamatergic transmission [162]. These mice also exhibit deficits in social, motor, and cognitive functioning. PLXNB2 may also mediate pathological cerebellar symptoms [152].

Other Recurrent CNVs in ASD and CNVs that Identify Individual Genes

A number of other recurrent CNVs are enriched in autism cohorts; also, some recurrent CNVs have identified individual genes. 15q13.3 deletions have been reported in autism [98, 104, 163, 164], with FAN1 being implicated as a putative gene candidate [165]. 17q12 deletions have been identified in individuals with autism [101, 105]. ACACA may be responsible for the ASD phenotype in this CNV [101]. Some CNVs identified in large case–control studies consist of only 1 or a few genes. These include deletions in 1p33 (AGBL4) [104], 2p16.3 (NRXN1) [106], 3p26.2 (SUMF1) [106], 10q23.2 (GRID1) [106], 19q13.33 (CLEC11A, SHANK1, SYT3) [166], and Xp22.1 (DDX53, PTCHD1) [104], as well as duplications in 3p26.3 (CNTN4) [106] and 3q26.31 (NLGN1) [106]. Among the most highly recurrent single-gene CNVs are NRXN1 mutations. Interestingly, both NRXN1 and the postsynaptic binding partners NLGN1-4 have been strongly implicated in autism and are reviewed elsewhere [106, 167–170]. MBD5, a methyl-CpG-binding domain gene, has been identified as the causal gene in the 2q23.1 locus, which is characterized by ASD and ID [171, 172]. Deletions encompassing the C-terminus of AUTS2 are associated with autistic features and developmental delay [173]. Finally, deletions involving ASTN2 and TRIM32 at the 9q33.1 locus are enriched in male patients exhibiting a range of neurodevelopmental phenotypes such as ASD, ADHD, obsessive–compulsive disorder, and speech/language delay [174]. Recurrent CNVs are summarized in Table 2.

Table 2 Recurrent copy number variants in autism

De Novo SNVs in ASD

Advances in sequencing technologies have provided substantial information on de novo SNVs in ASD. Recent studies have focused on whole-exome sequencing (WES) of simplex pedigrees to identify de novo SNVs. Those studies have reported hundreds of candidate genes with coding variants; however, for the vast majority of these candidates only a single mutation was identified (the “n = 1 problem”). Such single-hit genes provide important data, but without recurrent hits it is difficult to determine whether they are truly associated with ASD [3–5, 176]. Sanders et al. [4] developed a model suggesting that observing 2 nonsense/splice site de novo mutations in the same gene in probands is unlikely to occur by chance. The same permutation test indicated that a candidate gene with ≥ 2 disruptive (nonsense, frameshift, or splice site) mutations has a higher probability of being associated with ASD than a gene with 1 disruptive de novo mutation. Based on this approach, a growing number of genes have been implicated to date through the identification of recurrent, deleterious de novo SNVs in pedigree-based WES. For example, sequencing was performed on 2446 families with autistic patients from SSC to identify 44 candidate genes. In 6 genes, including CHD8, DYRK1A, GRIN2B, TBR1, PTEN, and TBL1XR1, recurrent disruptive mutations were detected (Table 3) [177]. Following this study, almost 3500 probands and their unaffected siblings were resequenced for 64 candidate genes in a case–control design. CHD2, ADNP, SYNGAP1, TRIP12, and PAX5 were found to have recurrent mutations in multiple probands [15]. Recurrent de novo mutations (i.e., ≥ 2 in the same gene), including protein-truncating and/or deleterious missense mutations, found to date to be associated with ASD are summarized in Table 3.
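The intuition behind the recurrence argument can be illustrated with a simple Poisson calculation. The sketch below is not the permutation model of Sanders et al. [4]; it assumes, purely for illustration, a uniform per-gene rate of disruptive de novo mutations, and the cohort size, mutation rate, and gene count are hypothetical values rather than estimates from any study.

```python
import math


def prob_at_least_k_hits(n_probands, per_gene_rate, k=2):
    """P(X >= k) for X ~ Poisson(n_probands * per_gene_rate).

    per_gene_rate is an assumed per-child rate of disruptive de novo
    mutations in one gene (hypothetical, uniform across genes).
    """
    lam = n_probands * per_gene_rate
    p_below_k = sum(
        math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k)
    )
    return 1.0 - p_below_k


# Hypothetical illustrative values, not taken from any cohort.
N_PROBANDS = 1000
RATE = 1e-6      # assumed disruptive de novo rate per gene per child
N_GENES = 20000  # assumed number of genes screened

p_one = prob_at_least_k_hits(N_PROBANDS, RATE, k=1)
p_two = prob_at_least_k_hits(N_PROBANDS, RATE, k=2)

# Expected number of genes hit at least twice by chance alone.
expected_recurrent_genes = N_GENES * p_two

print(f"P(>=1 hit in a given gene): {p_one:.2e}")
print(f"P(>=2 hits in a given gene): {p_two:.2e}")
print(f"Expected genes with >=2 chance hits: {expected_recurrent_genes:.3f}")
```

Under these assumptions a second hit in the same gene is far less probable than a first, which is why recurrence carries more evidential weight than a single observation. Real mutation rates vary with gene length and sequence context, so a uniform rate is a deliberate simplification.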

Table 3 De novo recurrent mutations (≥2) in genes associated with autism spectrum disorder. These mutations included protein truncating and/or deleterious missense

Further, the contribution of disruptive SNVs to ASD risk was shown by burden analyses of SSC pedigrees, which found an excess of frameshift indels and loss-of-function mutations in cases compared with controls [3–5, 13]. In a recent study on WES data of 3871 cases of ASD and 9937 ancestry-matched or parental controls, more loss-of-function mutations were observed in cases than in controls. Using the transmission and de novo association statistical model, 7 novel genes, including ASH1L, MLL3, ETFB, NAA15, MYO9B, MIB1, and VIL1, were identified [178].
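A case–control burden comparison of this kind can be sketched as a one-sided Fisher's exact test on a 2 × 2 table of carriers and noncarriers. This is a minimal illustration under stated assumptions, not the transmission and de novo association model used in [178]; the function name and the carrier counts are hypothetical and do not come from any of the cited studies.

```python
from math import comb


def fisher_one_sided(case_carriers, case_total, ctrl_carriers, ctrl_total):
    """One-sided Fisher's exact p-value for an excess of carriers in cases.

    Sums the hypergeometric probabilities of tables with at least as many
    case carriers as observed, holding the table margins fixed.
    """
    total = case_total + ctrl_total
    carriers = case_carriers + ctrl_carriers
    denom = comb(total, carriers)
    max_case_carriers = min(carriers, case_total)
    tail = sum(
        comb(case_total, k) * comb(ctrl_total, carriers - k)
        for k in range(case_carriers, max_case_carriers + 1)
    )
    return tail / denom


# Hypothetical counts: carriers of a loss-of-function variant class.
p_value = fisher_one_sided(
    case_carriers=40, case_total=1000,
    ctrl_carriers=20, ctrl_total=1000,
)
print(f"one-sided p = {p_value:.4f}")
```

An exact test is used here because it needs no large-sample approximation; published burden analyses typically also account for ancestry, coverage, and per-gene mutation rates, which this sketch ignores.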

Understanding how SNVs in implicated genes affect the phenotypic characteristics of ASD is crucial as it is a complex disorder with a variety of presentations across probands. In a recent study on SSC pedigrees, the average number of truncating SNVs was found to be higher in probands with IQs < 100 than in those with IQs ≥ 100. This observation suggests that truncating SNVs likely contribute to lower-functioning ASD [12]. Similarly, a significant decrease in nonverbal IQ was identified in probands with an increased number of truncating mutations [3], and the presence of de novo frameshift indels was observed to be associated with a lower IQ in SSC pedigrees [13]. Interestingly, these studies of de novo SNV burden have shed light on possible epidemiologic factors that may contribute to autism risk. Specifically, multiple studies demonstrated a correlation between increased paternal age and the number of de novo SNVs [3, 13, 179].

Recessive Loci in ASD

There has been substantial progress in demonstrating the role of de novo CNVs and point mutations in ASD, as explained above. However, the contribution of inherited loci in autism remains to be clarified. The Homozygosity Mapping Collaborative for Autism embarked on studies of recessive loci in pedigrees with recent shared ancestry. This project identified an enrichment of implicated loci with genes that were regulated by neuronal activity [6]. These studies also identified a number of mutations in these pedigrees, for example in the endosomal Na+/H+ exchanger 9 (NHE9), glycine metabolic genes, and SYNE1 [6, 181]. Some of the genes implicated showed comparable mutations in autism pedigrees without recent shared ancestry. In addition, some pathways showed evidence supportive of transheterozygous mutations, that is, heterozygous mutations in 2 genes in a common pathway [181]. This genetic mechanism is commonly encountered in model organisms [182]; however, this mechanism of inheritance has not been exhaustively studied.

Homozygosity mapping was also used to identify novel rare, recessive loci in syndromes including autism symptoms. These include a protein truncating mutation in CC2D1A in 9 consanguineous families with severe autosomal recessive nonsyndromic mental retardation [183–185]. CC2D1A is a putative signal transducer involved in positive regulation of the I-κB kinase/nuclear factor-κB cascade. Wild-type protein was shown to be absent in lymphoblastoid cells from the affected subjects [183]. Another example includes homozygous mutations truncating BCKDK, identified in autism with epilepsy and ID in pedigrees with recent shared ancestry. Branched-chain ketoacid dehydrogenase kinase plays an important role in phosphorylating and inactivating the E1α subunit of the branched-chain ketoacid dehydrogenase complex. Mutations in the subunits of this complex cause accumulation of branched-chain amino acids, leading to maple syrup urine disease [186].

As an extension of traditional homozygosity mapping, another approach is homozygous haplotype mapping, which is used to identify haplotypes within shared runs of homozygosity (ROH) regions [7]. ROHs are regions of the genome with consecutive homozygous single nucleotide polymorphisms. An ROH therefore reflects coinheritance of a segment of the genome derived from an ancestor shared by the parents and, in this way, may be a marker of recessive loci. The length of the segment reflects how many generations back the common ancestor lived, such that longer ROH blocks reflect a more recently shared ancestor. Increased ROHs were found to be associated with ID in autism in a population of 2108 families with autistic children from SSC [9]. ROH blocks were widely distributed across the genome. An increased burden of ROH was observed in probands with IQs ≤ 70 compared with their unaffected matched siblings, but not in probands with IQs > 70. The authors also identified an association of increased ROH burden with female sex and with lower IQ across both sexes in autism [9]. Similarly, in another study on 612 affected children with ID, larger homozygous regions were found in the severe ID case group compared with the nonsevere ID case group. In addition, larger ROH blocks accounted for 20% of homozygosity in individuals within the severe ID case group and only for 6% in individuals within the nonsevere case group [187]. In yet another study in a Taiwanese Han population, ROH was found to be associated with speech delay [188].
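The definition of an ROH as a stretch of consecutive homozygous single nucleotide polymorphisms can be made concrete with a short sketch. It assumes genotypes coded as alternate-allele counts (0 or 2 for homozygous, 1 for heterozygous); the function name, the thresholds, and the example data are hypothetical. Dedicated software generally tolerates occasional heterozygous or missing calls within a run, which this simplified version does not.

```python
def find_roh(positions, genotypes, min_snps=3, min_length_bp=0):
    """Return (start_bp, end_bp, n_snps) for each run of homozygosity.

    positions: SNP coordinates in base pairs, sorted along one chromosome.
    genotypes: alternate-allele counts per SNP (0/2 homozygous, 1 het).
    Any value other than 0 or 2 (including a missing call) ends a run.
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have equal length")

    runs = []
    run_start = None  # index of the first SNP in the current run
    # A trailing heterozygous sentinel closes a run that reaches the last SNP.
    for i, genotype in enumerate(list(genotypes) + [1]):
        if genotype in (0, 2):
            if run_start is None:
                run_start = i
            continue
        if run_start is not None:
            n_snps = i - run_start
            start_bp, end_bp = positions[run_start], positions[i - 1]
            if n_snps >= min_snps and end_bp - start_bp >= min_length_bp:
                runs.append((start_bp, end_bp, n_snps))
            run_start = None
    return runs


# Hypothetical toy data: eight SNPs on one chromosome.
positions = [100, 200, 300, 400, 500, 600, 700, 800]
genotypes = [0, 2, 2, 1, 0, 0, 1, 2]

roh_blocks = find_roh(positions, genotypes, min_snps=3)
roh_burden_bp = sum(end - start for start, end, _ in roh_blocks)
print(roh_blocks, roh_burden_bp)
```

Summing block lengths, as in `roh_burden_bp`, gives one simple per-individual burden measure that could then be compared between groups; the length threshold matters because longer blocks point to more recent shared ancestry.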

In some studies the combination of ROH analysis and WES has led to the identification of new candidate gene mutations. For example, in the study by Chahrour et al. [8], ROH analysis and WES of patients from the selected families identified 4 candidate autism-related genes (in 4 of the 16 probands): UBE3B, CLTCL1, NCKAP5L, and ZNF18. Similar to the findings of Morrow et al. [6], the 4 candidate genes were found to be neuronal depolarization dependent, supporting the role of disrupted synaptic transmission in ASD [8]. Lim et al. [14] also used whole-exome analysis to show an increase in rare complete knockouts, including compound heterozygous and homozygous variants, in cases compared with controls, further suggesting a significant contribution of recessive loci to ASD risk. In conclusion, studies to date on recessive loci of ASD have indicated that multiple genetic variants distributed across the genome and inherited in a recessive fashion appear to contribute to the risk of ASD. This may be an approach that can be capitalized on to a greater extent in future larger studies.

Autism Genetic Studies May Point to Convergent Pathways: The Road from Genetics to Therapeutics

As described above, studies have identified many autism risk candidate genes. Additionally, transcriptome studies in postmortem autism brain have identified gene networks disrupted in autism. Despite the vast genetic heterogeneity, accumulating evidence suggests that there are common pathways disrupted in the autistic brain. Discovery of these convergent pathways may be critical to identifying potential therapeutic targets.

Large-scale WES efforts in autism have pointed to several convergent pathways. Functional clustering of de novo SNVs in SSC identified enrichment of FMRP target genes, chromatin modifiers, and genes involved in embryonic development [88]. In another study, analysis of rare coding variants in almost 4000 autism cases identified a set of high-confidence autism risk genes. These genes were enriched for synaptic and postsynaptic genes, targets of the RNA binding proteins FMRP and RBFOX, histone modifiers, and chromatin remodeling genes [178]. Investigations of CNVs in autism have pointed to pathways similar to those detected in exome sequencing studies. Large-scale studies of rare CNVs found that genes involved in ubiquitination and neuronal cell adhesion were enriched in autism-associated CNVs [106]. In addition, deletions were enriched for genes involved in neuronal function and guanosine triphosphatase/Ras signaling [104]. Further studies found an enrichment of genes involved in MAPK signaling and neuronal development in autism-associated deletions [189]. Also, analysis of CNVs identified from WES data found an enrichment of cytoskeletal and autophagy genes in small CNVs in autism [190].

Analyses of co-expression networks have also revealed convergent pathways in autism. Differential gene expression analysis of autism cortex transcriptome data revealed decreased expression of genes functioning at the synapse and increased expression of immune genes [191]. An extended transcriptome study of autism cortex also found upregulation of immune genes, specifically identifying increased expression of activated microglia genes. This study also identified misexpression of synaptic activity genes in autism [192]. To narrow down autism heterogeneity to the pathology of a specific developmental time point, cell type, and brain region, Willsey et al. [193] analyzed expression data from several brain regions at many stages of development. Co-expression networks seeded with genes recurrently mutated in autism were found to converge with networks in deep layer projection neurons of the prefrontal and primary motor–somatosensory cortex during midfetal development [193]. Parikshak et al. [194] mapped autism risk genes to transcriptional networks throughout development and identified glutamatergic neurons in upper cortical layers as a critical cell type and brain region in autism. Undoubtedly, pathway and related network analyses have identified some degree of convergence with regard to potential mechanisms in which genes may participate, but, at present, these mechanisms remain fairly broad. One limitation is that gene pathway methods are constrained by a very incomplete understanding of the function of specific molecules in circuit development and function. One hope is that investigation of specific molecules with disease-associated mutations may be a path forward to a more in-depth understanding of cellular and neurocircuitry abnormalities.
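The general idea of a co-expression network seeded with risk genes can be sketched as follows. This is only an illustration of the concept, not the pipeline of Willsey et al. [193] or Parikshak et al. [194]: it assumes a simple Pearson correlation threshold, and the gene labels, expression values, and the cutoff of 0.8 are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def seeded_coexpression(expression, seeds, min_r=0.8):
    """Map each seed gene to the genes co-expressed with it (r >= min_r).

    expression: dict of gene label -> list of expression values measured
    across the same samples (e.g., brain regions or developmental stages).
    """
    network = {}
    for seed in seeds:
        network[seed] = sorted(
            gene
            for gene, profile in expression.items()
            if gene != seed and pearson(expression[seed], profile) >= min_r
        )
    return network


# Hypothetical expression profiles across four samples.
expression = {
    "SEED": [1.0, 2.0, 3.0, 4.0],
    "UP": [2.0, 4.0, 6.0, 8.0],    # tracks the seed
    "DOWN": [4.0, 3.0, 2.0, 1.0],  # anticorrelated with the seed
    "FLAT": [5.0, 5.0, 5.0, 5.0],  # no variation
}

network = seeded_coexpression(expression, seeds=["SEED"])
print(network)
```

Restricting the samples to a particular brain region or developmental window before computing correlations is what would allow such a network to be tied to a specific time point and tissue.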

iPSC-derived neurons may provide a new, valuable preclinical model for understanding autism-related cellular dysfunction and for predicting treatment outcomes. For example, iPSC-derived neurons were generated from patients diagnosed with the autism-associated disorder TS, which results from mutations in the L-type calcium channel Cav1.2 and leads to decreased channel inactivation. TS neurons show dysregulation of genes involved in catecholamine synthesis, which can be reversed by treatment with an L-type-channel blocker, roscovitine [195]. In another study, disruption of calcium signaling, changes in gene expression, and decreased neurite growth were observed in iPSC-derived neurons from a patient with nonsyndromic autism carrying a translocation disrupting the cation channel gene TRPC6. Some of the defective phenotypes could be reversed by treatment with insulin-like growth factor-1 or a TRPC6 agonist, hyperforin [196]. The use of iPSC-derived neurons to connect gene variants to autism-related cellular phenotypes provides a new path to identifying and screening potential therapeutics for autism. Future studies to expand on our understanding of the convergent pathways identified by these genetic studies may lead to mechanism-based treatments for autism; however, this approach will also rely on the identification of biomarkers that indicate an autism subtype defined by the disruption of a specific pathway. For example, some forms of idiopathic autism with macrocephaly may reflect pathobiology related to overactive signaling, even though the patient may not have an apparent PTEN mutation. In this case, a combination of neuroimaging and perhaps measures of peripheral blood signaling could be tested to predict response to treatment for this specific autism subtype.

Conclusion

The last decade of research has established an unprecedented number of rare mutations in autism and autism-related neurodevelopmental conditions. Extension of these gene discovery approaches will be important to discover a great number of genes and alleles, and thereby convergent mechanisms with a higher resolution of mechanistic understanding. However, now is also an important time to expand our efforts in mechanistic studies. This endeavor may be helped by the recent emergence of iPSC methods; however, in vivo studies will remain important. Such in vivo studies will rely on animal models, and new animal models and/or approaches to study circuitry are warranted. In addition, the field may also benefit from larger neuroimaging studies of genetically characterized patients. In conclusion, discovery of rare genetic mutations has provided an unprecedented opportunity for an important transition to studies that may now elucidate disease-relevant mechanisms in autism.