Autism genetics: Methodological issues and experimental design

Autism is a complex neuropsychiatric disorder of developmental origin, where multiple genetic and environmental factors likely interact resulting in a clinical continuum between “affected” and “unaffected” individuals in the general population. During the last two decades, relevant progress has been made in identifying chromosomal regions and genes in linkage or association with autism, but no single gene has emerged as a major cause of disease in a large number of patients. The purpose of this paper is to discuss specific methodological issues and experimental strategies in autism genetic research, based on fourteen years of experience in patient recruitment and association studies of autism spectrum disorder in Italy.

Autism spectrum disorder (ASD) encompasses a heterogeneous group of neurodevelopmental conditions characterized to a variable extent by social and communication deficits, accompanied by stereotyped behaviors and insistence on sameness (i.e., restricted patterns of interest and activities), with onset prior to three years of age. From a diagnostic point of view, ASD largely coincides with DSM-IV "pervasive developmental disorders", and in particular with autistic disorder, Asperger disorder, and pervasive developmental disorder not otherwise specified (PDD-NOS) [1]. The ASD concept is actually complementary to DSM-IV categorical diagnoses, because it underscores that autistic traits and behaviors are better represented by a dimensional continuum in the general population than by dichotomous "black-or-white" diagnostic categories. The male:female sex ratio for autistic disorder is 4:1, suggesting a direct involvement of the X chromosome, imprinting mechanisms, and/or hormonal effects. Autism is associated with seizures and mental retardation in up to 30% and 65% of cases, respectively [2,3]. A prenatal origin of the disease is strongly supported by neuroanatomical and neuroimaging studies, showing anomalies arising from abnormal neurodevelopmental processes which physiologically occur during the first and second trimester of pregnancy [4]. These anomalies would later result in reduced long-distance neuronal connectivity, yielding a "dysconnection syndrome" [5,6]. However, until recently it was believed that only specific brain regions were affected in autistic individuals, while increasing evidence shows that autism is a systemic disease involving not only the central nervous system (CNS), but also other organs, such as the gut, and the immune system [7][8][9][10][11].
Genetics strongly contributes to the pathogenesis underlying autism [12]. Rett syndrome is a monogenic disorder affecting, in the vast majority of patients, female carriers of mutations in the MeCP2 gene. Idiopathic ASD shows the highest heritability (>90%) among all neuropsychiatric disorders, as estimated by twin studies; its sibling recurrence risk (i.e., the incidence of severe autism among brothers and sisters of children already diagnosed with autism) is 5%-6%, that is 75 times higher than in the general population. Also the presence of mild autistic traits in first-degree relatives of autistic patients again points towards a strong genetic component in ASD. Not surprisingly, during the nineties autism became the target of a major research endeavour, aimed at elucidating its genetic and pathophysiological underpinnings. Within this framework, since 1997 our group has begun recruiting Italian families with 1 or more autistic children (simplex and multiplex families, respectively), in a collaborative project involving a growing number of Italian academic and clinical centers. This review summarizes fourteen years of experience in autism genetic research, reviewing theoretical and practical aspects relevant to experimental design.
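As a rough consistency check on the sibling recurrence figures quoted above, the recurrence risk ratio can be written out explicitly. The symbols $\lambda_s$ (sibling recurrence risk ratio), $K_s$ (risk in siblings of affected children) and $K$ (population risk) are our own notation for this illustration; the implied population figure is only what the quoted 5%-6% and 75-fold values entail when taken at face value, not an independent estimate.

```latex
\lambda_s = \frac{K_s}{K}
\quad\Longrightarrow\quad
K = \frac{K_s}{\lambda_s}
  \approx \frac{0.05}{75} \text{ to } \frac{0.06}{75}
  \approx 6.7 \times 10^{-4} \text{ to } 8 \times 10^{-4}
```

In other words, a 5%-6% sibling risk that is 75 times the population risk corresponds to a population risk of roughly 7-8 per 10000 for the narrowly defined phenotype used in those comparisons.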

Trends in autism genetic research
We have recently proposed the following genetic classification of ASD [13]: (i) "Classical" syndromic forms: approximately 10% of patients suffer from autism as part of a broader genetic syndrome, such as fragile-X and tuberous sclerosis. These forms can be caused by genomic DNA mutations, triplet repeat expansions, or large cytogenetic abnormalities defined by classical G band karyotyping.
(ii) Mitochondrial autisms: rare forms due to mutations or gene dosage abnormalities affecting mitochondrial DNA [14].
(v) Oligogenic or polygenic forms of autism: these cases do not recognize a single-gene origin, but rather multiple genes, each carrying one or more functional polymorphisms commonly distributed in the general population and conferring autism vulnerability or protection. An important role is played here by gene-gene and gene-environment interactions, as well as possible epigenetic effects, whereas each gene by itself confers a small percentage of phenotypic variance.
This classification has been recently presented and discussed in great detail [13]. Here we shall only point out that the role of CNVs is actually more complex than may appear. Initial studies suggested the existence of genomic instability in a sizable group of ASD patients, because de novo CNVs (i.e., microdeletions or microduplications present only in the patient and not in his/her parents) were significantly more common in autistic compared to healthy individuals [15][16][17]. In reality, the overall number of CNVs was later shown to be similar in ASD cases and controls, while rare CNVs overlapping coding genes are enriched in ASD cases, especially so for loci involved in synaptic and neuronal cell adhesion, ubiquitination, cell proliferation and migration, GTPase/Ras signaling [18][19][20]. Pathogenic CNVs seemingly act as rare variants with variable penetrance and expressivity. Some de novo CNVs may act in a dominant way and even display complete penetrance, while CNVs inherited from one of the two parents may act as risk enhancers or even follow a "quasi-recessive" mechanism, in association with a rare polymorphism located in the other allele [21,22]. Finally, many CNVs are commonly distributed in the general population, as listed in public databases such as the Database of Genomic Variants (http://projects.tcag.ca/variation/) [23]. It is thus often complicated to determine whether and to what extent a CNV is actually contributing to the disease or simply represents a chance finding. As a general indication, (i) de novo CNVs have a greater probability of having a negative functional impact as compared to inherited CNVs; (ii) the genomic location of a CNV is more critical to its pathogenic potential than the quantitative profile or mean size of CNVs present in the genome of a given subject; (iii) pathogenic CNVs in autistic patients often include genes which, when mutated, are responsible for monogenic forms of autism, such as NLGN4, SHANK3, and NRXN1.
The genetic complexity outlined above has essentially led to the proposal that the majority of ASD cases may represent a collection of neurodevelopmental syndromes due to rare, if not even "private" mutations or CNVs. According to this view, we have so far been able to identify these mutations only in relatively few cases due to technological limitations, but rare variants will be increasingly detected applying deep-sequencing technologies [24]. Other investigators, while agreeing that rare mutations and CNVs may cause autism in many patients and should indeed be pursued, underscore multiple difficulties in reconciling rare variants with the explanation of autism in the majority of ASD patients: (i) the discrepancy between huge heritability estimates and the low yield of causal mutations and CNVs identified after twenty years of intensive investigation is still striking; (ii) many "causal" mutations and chromosomal rearrangements are not de novo, but more often segregate in the family, underscoring their variable degree of penetrance and heterogeneous expressivity; (iii) even with de novo mutations, genotype-phenotype correlations are extremely labile: the very same mutation can cause behavioral and morphological phenotypes displaying a surprising degree of variability in different patients; (iv) this phenotypic variability closely mimics the impressive phenotypic variability seen when a gene inactivated by homologous recombination is backcrossed onto the genetic backgrounds of different mouse inbred strains [25]. These converging lines of evidence clearly emphasize the importance of common genetic variants (also designated as "genetic background" or "modifier genes") in determining the penetrance and expressivity of rare mutations. They also support the likelihood of oligogenic or even polygenic forms, due to unfavourable gene-gene and gene-environment interactions even in the absence of strong "causal" mutations.
Interestingly, epidemiological studies reported an incidence of 2-5 in 10000 newborn babies before 1985, whereas studies performed after the year 2000 converge upon rates of ASD as high as 20-60 in 10000. This increase may stem from the use of broader diagnostic criteria and increased attention by the medical community, but an actual increase in ASD incidence due to environmental and epigenetic factors acting upon a genetically vulnerable background is also likely. Finally, linkage and association studies have identified numerous susceptibility genes, located on various chromosomes, especially 2q, 7q, 15q and on the X chromosome [12,26]. Indeed the clinical heterogeneity of ASD may reflect the complexity of its genetic underpinnings.

Endophenotypes in autism genetic research
The concept of "endophenotype" was first introduced by Gottesman and Shields [27,28], to designate a quantitative variable which, in the context of a complex behavioral disorder, fulfils the following criteria: (i) it is associated with the disorder in the general population; (ii) it is heritable; (iii) it displays clear familiality, i.e., it is present among unaffected first-degree relatives of affected individuals at a higher frequency and/or intensity compared to the general population ("intermediate phenotype" in a "spectrum disorder"); (iv) it is closer to the genetic level than abnormal behavior or clinical diagnosis. Importantly, an endophenotype should be neither a symptom nor a symptom cluster required for a given diagnosis (although in autism research there are multiple examples of this, such as use of Social Responsiveness Scale scores [29] and of ADI-R scores for the social interaction domain, or the restricted and repetitive behaviours domain [30,31]). Endophenotypes should thus be viewed as internal constructs ideally able to dissect subgroups of patients with similar clinical presentations but different pathogenetic underpinnings. Endophenotypes can thus help fill the gap between behavioral symptoms and underlying genes in psychiatric disorders, as well as provide information on the chain of causal events leading to a complex disorder [32]. Moreover, they can greatly enhance statistical power, by allowing unaffected family members to be used as a logical extension of the proband.
In our experimental design, we have implemented several biochemical and morphological endophenotypes, namely head circumference, serotonin blood levels, and urinary oligopeptides. In fact, macrocephaly (i.e., head circumference >97th percentile) has been consistently described in approximately 20% of autistic children [55,58,59], serotonin blood levels are elevated in 20%-50% of autistic subjects [47], and increased urinary excretion rates of oligopeptides and multiple solutes are found in 20%-60% of autistic patients, with significant interethnic differences [50,60,61]. The implementation of these endophenotypes in our studies is detailed in the following section.

Our roadmap: methodological issues and strategies
Since 1997, we have been recruiting Italian families including one or more children diagnosed with ASD. Our aim was to reach a sample size which would allow us to perform family-based and case-control association studies with reasonable statistical power. Through the years, this collaborative project has involved several Italian academic and clinical centres, yielding a sample currently including 488 simplex and 17 multiplex Italian families which, combined with 38 Caucasian-American families, reaches a total of 522 simplex and 21 multiplex families, with 572 autistic children and 1047 first-degree relatives (Table 1).
All probands fulfil DSM-IV diagnostic criteria [1] for either Autistic Disorder (86%), Asperger Disorder (6%), or PDD-NOS (8%). Only non-syndromic patients are recruited. We thus screen for known genetic, neurological and metabolic disorders, performing EEG, audiometry, karyotype and fragile-X testing, MRI, and urinary aminoacid and organic acid measurement. Among these diagnostic assessments, only MRI is considered not absolutely mandatory, because it is rarely positive and very cumbersome for many ASD children. In contrast, urinary aminoacid and organic acid testing are also rarely positive, but much less expensive, and positive findings are potentially amenable to therapeutic intervention. Patients with sporadic seizures (i.e., fewer than one every 6 months) are included, whereas we exclude patients with more frequent seizures, focal neurological deficits, congenital anomalies, or major dysmorphisms. Patients with either DSM-IV autistic disorder, Asperger disorder, or PDD-NOS have been merged into a single "affected" ASD category, because to our knowledge no convincing evidence proves consistent differences in the genetic and neurobiological underpinnings of these three diagnostic categories. This unifying concept will also be adopted in DSM-V, where a single "autism spectrum disorder" diagnostic category has been proposed to absorb these three DSM-IV categories (http://www.dsm5.org/). DSM-IV diagnoses are nonetheless recorded in our database, as specific analyses may benefit from this information, especially in reference to genetic variants influencing overall disease severity or linguistic skills.
Autistic patients positive at our DSM-IV diagnostic screening and negative for syndromic forms at our medical screening are then clinically characterized using several tools: (a) since 2005, the official Italian translations of the Autism Diagnostic Interview-Revised (ADI-R) [62], and of the Autism Diagnostic Observation Schedule (ADOS) [63], which have replaced the Childhood Autism Rating Scale (CARS) [64], administered until 2005; (b) the Vineland Adaptive Behavior Scales (VABS) [65]; (c) depending on each recruiting center, one of several intelligence or developmental scales (Griffith Mental Developmental Scales, Wechsler Intelligence Scales, Leiter International Performance Scale, Coloured Raven Matrices). Importantly, when applied to autistic children, the CARS and intelligence scales suffer in our experience from lower inter-rater reliability, as compared to other scales. For this reason, the CARS was dropped following the advent of ADOS and ADI-R, and I.Q. scores are not used as a continuous variable but rather dichotomized into two categories, either ≤70 ("mental retardation") or >70 ("normal intellectual level"). We also collect anthropometric measures, namely height, weight and head circumference. We aim at refining our morphometric analysis in the future, by collecting photographs of each patient in frontal and lateral view by web-cam, and applying one of several available automated morphometric analysis methods to the face and the head [56,57,66,67].
Finally, we have designed a simple ad hoc questionnaire currently including 36 patient- and family-history variables [60], exploring the following areas: (i) pregnancy, prenatal and neonatal history; (ii) developmental milestones (cognitive, language and social development, motor development and sphincter control); (iii) medical conditions or other pathologies occurring at autism onset (e.g., gastroenteritis, ear infections, other infectious diseases); (iv) behavioural symptoms (e.g., motor and verbal stereotypies); (v) neurological signs and symptoms (e.g., EEG abnormalities, abnormal pain sensitivity, hypotonia); (vi) allergic and/or autoimmune conditions in the patient and/or in his/her first- and second-degree relatives.
The current English version of this questionnaire is available in the Appendix of the electronic version. We are now in the process of revising this tool and defining its psychometric properties. The selection of its variables was based on the literature available in 1997, on clinical observation, and on parental reports. This questionnaire is not meant to be used as a diagnostic scale: its major advantage, in addition to its time-saving simplicity, is that it collects information on patient- and family-history variables which is extremely important in dissecting different autistic phenotypes, but is not part of any common internationally-known scale.
Our protocol for collecting and managing biomaterials is displayed in Figure 1. Biomaterials collected from each family member (father, mother, autistic children, unaffected brothers and sisters whenever available) include both blood (approximately 19-21 mL total) and urine (80-90 mL). Investigators purely interested in running genetic studies can collect blood into three 7-mL EDTA-containing tubes: DNA can be rapidly extracted from one of the three EDTA-containing tubes and the remaining two tubes are stored at -80°C until the DNA stock needs to be replenished. In our hands, the genomic DNA yield from a 7-mL tube is so abundant that leukocyte immortalization is relatively unnecessary. However, since we are interested in studying biological endophenotypes, in our protocol blood is collected into two 7-mL tubes with EDTA (pink tubes in Figure 1) and into one tube without anticoagulant (red tubes in Figure 1), which is then centrifuged to obtain serum and measure calcium-dependent enzymatic activities, such as arylesterase [68]. One of the two EDTA-containing tubes is also centrifuged at 140 g for 25 min at 4°C within 20 min of venipuncture to collect 1-2 mL of supernatant (i.e., platelet-rich plasma), which is immediately frozen in dry ice and stored at -80°C to later measure serotonin blood levels. Urine samples (yellow tubes in Figure 1) are used to study urinary biomarkers, such as p-cresol [61]. In general, it is extremely important for investigators interested in endophenotyping to diversify the collection of biomaterials as much as possible (blood, plasma, serum, urine, hair bulbs, saliva, etc.). Furthermore, protocols minimizing the need to perform centrifugations should be preferred in multicentre studies.
Our genetic strategy has been primarily aimed at identifying common variants with low penetrance by using family-based and/or population-based association analysis at specific candidate genes. We have used our families as the "experimental sample", while in some studies we have used a subset of Caucasian-American families present in the AGRE (Autism Genetic Resource Exchange) repository [69] as a "replica sample". We always test for genetic contributions not only to affection status, but also to each biological endophenotype. Similarly, we always test not only for vulnerability alleles, but also for protective alleles (i.e., preferential transmission of each allele from heterozygous parents to autistic or to unaffected siblings, respectively). To search for functional correlates of associated SNPs or haplotypes, we employ post-mortem tissues and/or fresh lymphocytes.
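The notion of preferential allelic transmission from heterozygous parents can be illustrated with the classical transmission disequilibrium test (TDT) for a single biallelic marker. This is a minimal sketch under stated assumptions, not the procedure implemented in FBAT or Unphased: genotypes are coded as the number of copies of the allele of interest (0, 1 or 2), each trio is given as (father, mother, child), and the function names `count_transmissions` and `tdt` are hypothetical.

```python
import math


def count_transmissions(trios):
    """Count transmissions (b) and non-transmissions (c) of the allele of
    interest from heterozygous parents to their children.

    Each trio is (father, mother, child), coded as copies of the allele (0/1/2).
    """
    b = c = 0
    for father, mother, child in trios:
        parents = (father, mother)
        n_het = sum(1 for g in parents if g == 1)
        # A homozygous parent always contributes 0 copies (genotype 0)
        # or 1 copy (genotype 2) of the allele to the child.
        from_hom = sum(g // 2 for g in parents if g != 1)
        from_het = child - from_hom
        if not 0 <= from_het <= n_het:
            raise ValueError(f"Mendelian inconsistency in trio {(father, mother, child)}")
        b += from_het
        c += n_het - from_het
    return b, c


def tdt(b, c):
    """McNemar-type TDT statistic, (b - c)^2 / (b + c), with 1 degree of freedom."""
    if b + c == 0:
        return 0.0, 1.0  # no informative (heterozygous) parents
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Illustrative genotypes only, not real data.
    trios = [(1, 0, 1), (1, 0, 1), (1, 2, 2), (1, 1, 2), (1, 0, 0)]
    b, c = count_transmissions(trios)  # b = 5, c = 1
    chi2, p_value = tdt(b, c)
    print(f"transmitted={b} untransmitted={c} chi2={chi2:.3f} p={p_value:.3f}")
```

Running the same counts on trios built around affected children addresses vulnerability alleles, whereas running them on trios built around unaffected siblings addresses protective alleles. Only heterozygous parents are informative, which is why homozygous parents contribute nothing to either count.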
Figure 1 Schematic representation of our protocol for biomaterial collection. Approximate blood and urine volumes are indicated inside each tube and expressed in mL. Pink tubes contain EDTA, red tubes contain no anticoagulant, yellow tubes are sterile 50 mL conical tubes provided to families in order to collect the first morning urine on the same day of blood drawing. One of the two EDTA-containing tubes is rapidly centrifuged to collect platelet-rich plasma, which is immediately frozen in dry ice and stored at -80°C to later measure serotonin blood levels. The remaining blood from the same tube is used for genomic DNA extraction. The other EDTA-containing tube and the second urine tube are stored at -80°C for later use. These biomaterials are collected from all available family members (father, mother, autistic children, unaffected brothers and sisters).
The experimental procedure applied for genetic research in our lab is briefly summarized in the following step-by-step flow chart:
(i) Genomic DNA is extracted from 3-7 mL of EDTA-anticoagulated blood, using a standard salting-out protocol [70];
(ii) Genomic DNA is quantified in triplicate using PicoGreen®. If one of the three values falls beyond a 20% difference from the mean of the other two measurements, this value is discarded;
(iii) DNA is aliquoted (10 ng/μL in 100 μL total volume) in 96-well plates. Four separate sets of aliquots are prepared, one for immediate use and three for replacement. Aliquots in single 200 μL PCR tubes are prepared only for samples whose genotyping requires separate replication;
(iv) Biomaterials are stored at -20°C or -80°C;
(v) Every two months, we check Mendelian inheritance in the DNA samples of newly-recruited families using a set of 10 microsatellites (TPOX, D2S1338, D5S818, D8S1179, TH01, D13S317, D16S539, D18S51, D19S433, D21S11), and sex using amelogenin. For unrelated cases and controls, we use amelogenin only;
(vi) Genotyping is usually performed using the TaqMan™ SNP genotyping assay (Applied Biosystems, CA, USA) on the ABI Prism 7900HT and analyzed with the SDS software;
(vii) Mendelian inheritance is checked using PedCheck (available at http://watson.hgen.pitt.edu/register/) [71];
(viii) Hardy-Weinberg equilibrium and linkage disequilibrium are analyzed using Haploview (available at http://www.broad.mit.edu/mpg/haploview/index.php) [72];
(ix) Family-based association tests are performed using FBAT software (available at http://www.biostat.harvard.edu/~fbat/fbat.htm) [73]. First, haplotypes are tested using the HBAT procedure and applying a sliding-window approach based on the structure of LD blocks. If a haplotypic association is detected, then SNPs constituting the associated haplotype are singly tested. Preferential allelic transmission from heterozygous parents to autistic offspring (vulnerability variants) or to unaffected siblings (protective variants) is tested under additive, dominant, and recessive models. As a confirmatory approach, single SNP and haplotype TDT analyses are also performed using Unphased software (available at http://www.mrc-bsu.cam.ac.uk/personal/frank/software/unphased/) [74]. Moreover, a population-based (i.e., case-control) approach can be implemented in parallel with family-based tests, provided cases and controls are ethnically matched and unrelated. To this aim, we exclude Caucasian-American patients from our case-control analyses, and contrast a sample including at most one Italian patient per family with a sample of over 200 Italian population controls;
(x) Quantitative analyses (q-TDT) are performed for biological endophenotypes (head circumference, serotonin blood levels, urinary solute levels) using FBAT;
(xi) Quantitative analyses on scale scores, clinical symptoms and single scale items are carried out by parametric or non-parametric testing using SPSS® software (SPSS Inc., Chicago, IL);
(xii) Gene-gene and gene-environment interactions are tested using logistic regression and generalized linear models with SPSS®, STATA®, and R software (available at http://www.r-project.org/);
(xiii) Power analyses are performed using PBAT (available at http://www.biostat.harvard.edu/~clange/default.htm) [75];
(xiv) Since our sample results from the merging of families recruited at clinical centers located in different parts of the country, the genetic homogeneity of the sample has been tested using STRUCTURE software (http://pritch.bsd.uchicago.edu/software.html) [76] after genotyping 90 unlinked SNPs distributed genome-wide.
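Two of the quality-control steps in the flow chart above lend themselves to a short sketch: the triplicate quantification rule of step (ii) and the Hardy-Weinberg check of step (viii). Both functions are illustrative assumptions rather than the software actually used. In particular, `filter_triplicate` assumes that at most one reading is discarded (the one deviating most from the mean of the other two, and only if that deviation exceeds 20%), and `hwe_chi_square` is the textbook one-degree-of-freedom chi-square test, which is not necessarily the test implemented in Haploview. The function names and example values are hypothetical.

```python
import math


def filter_triplicate(readings, max_rel_diff=0.20):
    """Apply the triplicate rule: drop the single reading that differs by more
    than max_rel_diff from the mean of the other two, then average what is left.

    Assumption: at most one reading is discarded (the most deviant one).
    """
    if len(readings) != 3 or any(r <= 0 for r in readings):
        raise ValueError("expected exactly three positive readings")
    deviations = []
    for i, value in enumerate(readings):
        others = [r for j, r in enumerate(readings) if j != i]
        reference = sum(others) / 2
        deviations.append(abs(value - reference) / reference)
    worst = max(range(3), key=deviations.__getitem__)
    if deviations[worst] > max_rel_diff:
        kept = [r for j, r in enumerate(readings) if j != worst]
    else:
        kept = list(readings)
    return kept, sum(kept) / len(kept)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    # Illustrative values only, not real measurements.
    kept, mean_conc = filter_triplicate([10.0, 10.4, 14.0])
    print(kept, round(mean_conc, 2))  # [10.0, 10.4] 10.2
    chi2, p_value = hwe_chi_square(30, 40, 30)
    print(f"chi2={chi2:.2f} p={p_value:.4f}")
```

Restricting the rule to a single discarded reading matters: with readings such as 10, 10 and 20, every value differs by more than 20% from the mean of the other two, so a naive application would reject all three.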
The experimental strategy outlined above has thus far provided a satisfactory scientific output. We have identified RELN as a new vulnerability gene [77,78], which currently enjoys the strongest supporting evidence of involvement in autism among all vulnerability genes [79]. Another vulnerability gene, MET, was identified by Campbell and Levitt collaborating with our group for the genetic studies involved in this project [80,81]. In addition, we have identified other new vulnerability genes and confirmed associations initially reported in other samples for several genes, including ADA [82], APOE [83], HOXA1 [84,85], ITGB3 [86], PON1 [87], PRKCB1 [88], SLC6A4 [89,90], and SLC25A12 [91].
The study of biochemical and morphological endophenotypes in ASD has also contributed plenty of useful information. Serotonin blood levels, head circumference and peptiduria have been assessed in practically all our genetic studies, providing interesting evidence in reference to several of the genes listed above. Endophenotypes have also been the object of specific studies. Sacco et al. [55] investigated the clinical, morphological, and biochemical correlates of head circumference in autistic patients. Head growth rates are often accelerated in autism, especially in the first years of life [92]. Fronto-occipital head circumference was measured in 241 non-syndromic autistic patients, 3-16 years old. We could not only confirm that the distribution of head circumference is significantly skewed towards larger head sizes, but most importantly we found macrocephaly to be part of a broader macrosomic endophenotype, characterized by highly significant correlations between head circumference, height, and weight. Furthermore, a head circumference >75th percentile was associated with more impaired adaptive behaviors, and with less impairment in I.Q. measures, motor and verbal language development. Noticeably, larger head sizes were significantly associated with a positive history of allergic/immune disorders both in the patient and in his/her first-degree relatives. This study demonstrated the existence of a macrosomic endophenotype in autism, and strongly pointed towards pathogenetic links with immune dysfunctions either leading to, or associated with, increased cell cycle progression and/or decreased apoptosis.
In another study we searched for basic mechanisms underlying ASD by analyzing the clinical variables present in our ad hoc questionnaire on data collected from 245 Italian patients [60]. We identified at least four principal components that could play a relevant role in the pathogenesis of this disease: (i) "sensory dysfunction and abnormalities in the sleep-wake rhythm"; (ii) "immune system abnormality", also including medical complications during pregnancy and repeated spontaneous abortions; (iii) "cognitive/motor developmental delay"; and (iv) "repetitive behaviour". Interestingly, factors 2-4 each appear associated with one of our three biological endophenotypes, namely head circumference, urinary peptide levels, and serotonin blood levels, respectively. This research suggests the existence of at least four basic processes underlying autism, possibly linked to biological parameters known to be abnormal in specific subgroups of autistic patients. These results, if replicated, could help clinicians identify subgroups of patients characterized by specific symptom patterns, and explore possible differences in clinical course and therapeutic response. They exemplify the usefulness of collecting clinical information including but not limited to ADOS and ADI-R items, as well as biological endophenotypes.

Conclusion: clinical and research perspectives
To this date, the clinical fall-out of two decades of genetic research in ASD is still limited. Yet, it is growing and it does require that we start asking ourselves how we want to use this information in the clinical context. Genetic screenings represent a powerful tool when dealing with monogenic Mendelian disorders, characterized by strong genotype-phenotype correlations. In the case of complex disorders, such as ASD, widespread genetic testing would not only be expensive and time-consuming, but also often unsuccessful due to its etiological complexity. Nonetheless, there are at least two areas where genetic testing can be successfully employed also in complex disorders: first, in order to evaluate the degree of genetic susceptibility to a given disease and, secondly, to search for rare monogenic or cytogenetic forms of the disease. The appropriate use of genetic testing in these two situations is very relevant to good clinical practice for the following reasons: (i) the identification of susceptibility variants can help direct the implementation of early intervention programs; (ii) the identification of the exact genetic cause of an otherwise unexplained disease, especially when affecting children, can significantly reduce the levels of anxiety in parents and improve their compliance to medical interventions and rehabilitation programs. The creation of genetic screening programs in autism requires the definition of a set of phenotypic inclusion criteria, which should be met by affected probands to justify their recruitment into the program. In fact, genetic screening programs must be designed according to clearly-targeted, feasible and cost-effective strategies.
Readers can refer to our recent review [93], where we summarize all available evidence and we propose that, in addition to karyotype analysis (or CNV analysis based on array-CGH, wherever available) and fragile-X testing, genetic screening programs for autistic children are justified in two cases, namely when targeting the PTEN gene in macrocephalic autistic children, and the MeCP2 gene in autistic girls, even if devoid of any sign or symptom typically accompanying Rett syndrome, such as microcephaly, epilepsy, regression, hand stereotypies.
Moving from clinical applications to research perspectives, several points can be proposed to serve as the foundation for plausible pathophysiological models of ASD: (i) Autistic brains show neuroanatomical abnormalities implicating mainly cellular proliferation and neuronal migration. These neurobiological signs arise during the first and second trimester of pregnancy [4]; (ii) Although some motor signs may be already present at birth [94], baby-sibling studies document behavioral abnormalities appearing in most cases around the end of the first year of postnatal life [95]; (iii) Head circumference is normal or even a little below the 50th percentile at birth; head overgrowth occurs between 1 and 7 years of age [92], possibly due to enhanced extracellular fluids in the deep gray matter/superficial white matter of the neocortex [96]. After 7-8 years of age, head growth slows down to below normal; (iv) ASD is a systemic disorder, with many patients displaying macrosomy [55], gastrointestinal symptoms [11], and immune abnormalities [10,97]; (v) Most ASD brains display significant neuroinflammation [98], excessive calcium levels [91], presence of oxidative stress [91], and an overexpression of immune genes even as early as at 4 years of age [99]; (vi) Also in vivo there is an excess of proinflammatory cytokines in the cerebrospinal fluid of autistic children (aged 6-8 years in Vargas et al. [98]), an overexpression of immune genes in lymphoblastoid cell lines especially implicating NK cell function [100], and temporal lobe abnormalities reminiscent of virally-generated lesions, detected in 48% of ASD patients [101].
Finally, there are a number of caveats that any research plan aimed at elucidating the genetic underpinnings of autism should consider: (i) Ethnic homogeneity of the sample is at least as important as sample size in determining the statistical power of genetic studies, especially when planning genome-wide association studies (GWAS). Large populations, like that of China, can provide enough recruitment to minimize the need for multi-site collections, envisioning for example the recruitment of two ethnically-homogeneous samples (i.e., an experimental sample and a replica sample) each from a single recruiting site.
(ii) Study design must be matched with available technologies. Studies whose primary focus is on common variants, rare variants, or CNVs, primarily require large genotyping, DNA sequencing, and microarray facilities, respectively. Our group has so far focused on common variants due to this limitation; starting in 2012, microarray and next generation DNA sequencing will become available, allowing us to pursue also rare variants through whole-exome sequencing, as well as CNV analysis.
(iii) The implementation of biological endophenotypes in autism genetic research can be strongly recommended, both for single gene studies and for GWAS. In general, biological endophenotypes provide at least two advantages over behavioral endophenotypes, namely enhanced reliability and greater validity. Biological parameters are measured using standardized and/or automated procedures, generally more reproducible than psychometric measures. Their lesser complexity and greater proximity to the genetic level facilitates the interpretation of the results. Simple collection protocols will minimize loss of samples due to inappropriate biomaterial handling, especially in multi-site studies.
(iv) The clinical characterization of the patients is at least as important as their genetic assessment. It is critical to use internationally-known scales, such as ADOS and ADI-R, in order to ensure sample comparison and the replication of positive results abroad. At least two clinicians per recruiting center should receive proper training, using an approved Chinese translation of these scales. Inter-rater reliability both within each center and between recruitment centers should also be periodically assessed. However, the clinical assessment should not be limited to these scales, but should expand to include also patient- and family-history variables.
(v) Given the known uniqueness of linkage disequilibrium patterns and of common variants conferring vulnerability to complex disorders in different ethnic groups, as well as the interaction effects with relevant environmental factors, results from western countries may not necessarily overlap with those recorded in eastern Asia, and vice versa. The two should be viewed at least in some instances as complementary, rather than as necessarily overlapping. Our target should ultimately be to understand the pathophysiology of autism: we should not necessarily expect identical genetic forms in different parts of the world.
In conclusion, we have attempted to briefly present a clear rationale and discuss methodological issues important for the establishment of a scientific program aimed at studying the genetics of ASD and related endophenotypes in China. This program should on one hand respect the peculiar characteristics of the Chinese culture, while also generating data comparable with results obtained worldwide. This mutually enriching crosstalk will be immensely useful in helping elucidate the universal aspects of gene-to-behaviour pathways underlying autism, as well as local gene-and environment-driven specificities. We hope this program will not only provide important scientific results, but possibly also represent a great source of information to help policy makers and stakeholders design increasingly effective diagnostic and treatment services for individuals with ASD in the Chinese health system.