Introduction

Autism spectrum disorder (ASD) is a neurodevelopmental disorder defined by impairments in social communication, restricted interests, and repetitive behaviors (American Psychiatric Association 2013). Despite the unitary definition, individuals with ASD exhibit a wide variety of core and secondary symptoms, including dramatic differences in adaptive behaviors, language, and cognitive abilities. This heterogeneity suggests that the ASD diagnostic category includes a variety of distinct disorders (Happé et al. 2006) that develop due to different causes (State and Levitt 2011; Jeste and Geschwind 2014) and are likely to require different interventions and therapies (Zwaigenbaum et al. 2015).

Consequently, ASD research conducted with small groups of participants in isolated laboratories yields findings that are unlikely to replicate across sites. The alternative, which has gained considerable momentum over the last decade, is to develop collaborative research efforts that involve identical data collection at multiple sites and the establishment of a common shared database. Such efforts enable data collection from a larger number of participants who are more likely to represent the true heterogeneity of ASD characteristics in the community. They can therefore enable researchers to pursue a personalized medicine approach, with the goal of dividing the heterogeneous population into distinct subgroups that share specific phenotypic features, etiologies, and/or treatment response patterns. However, these efforts are considerably more difficult and expensive to establish, and they require multidisciplinary collaboration.

Successful examples of such collaborations include the Simons Simplex Collection (Fischbach and Lord 2010) and SPARK (Feliciano et al. 2018) databases, which focus on revealing the genetics of ASD; the EU-AIMS Longitudinal European Autism Project (Loth et al. 2017), which aims to identify distinct biomarkers for ASD using neuroimaging, behavioral assessments, and biological samples from hundreds of individuals with ASD who are sampled longitudinally over several years; and the Autism Brain Imaging Data Exchange (ABIDE) project (Di Martino et al. 2014), which has aggregated hundreds of brain MRI scans from individuals with ASD aged 6 to 60 years along with Autism Diagnostic Observation Schedule (ADOS) and cognitive assessment scores. Until recently, a further example was the Autism Treatment Network (ATN), a consortium of sites in North America that collected a wide variety of behavioral and clinical data from thousands of ASD cases with the goal of improving clinical guidelines for best practice (Coury et al. 2020).

Recently, several countries have begun to create national autism databases in an effort to develop ASD research that is focused on local communities. Given the vast heterogeneity of ASD, one may assume that different ethnic communities and geographic locations with different genetics, environments, education systems, health resources, and cultural norms will require different solutions. Examples of such work include the Australian Autism Biobank (Alvares et al. 2018) and the Italian Autism Network (Muglia et al. 2018).

In 2018, the Israeli Ministry of Science and Technology awarded a grant to scientists at Ben Gurion University (BGU) and physicians at the neighboring Soroka University Medical Center (SUMC) to create the National Autism Research Center of Israel (NARCI). The first goal of the center was to turn a previously regional autism database (Meiri et al. 2017) into a national database with standardized data collection at multiple clinical sites throughout the country. This supports the center’s vision to improve autism diagnosis and treatment in Israel by integrating multidisciplinary research into the social healthcare system of Israel. In February 2019, we conducted the first Israeli Meeting for Autism Research where leading Israeli autism researchers and clinicians discussed shared research goals and defined the current study protocol for the NARCI database.

Methods

Study Design

This national project takes advantage of the centralized structure of the social healthcare system in Israel, which is run by four Health Maintenance Organizations (HMOs) that are obligated to accept any Israeli citizen and provide all citizens with equal medical services that are either free of charge or at very low cost. Within this framework, children who are suspected of having ASD are referred to one of ~ 50 clinics, mostly located in child development centers (with ~ 10 of these clinics handling ~ 70% of national referrals). Approximately 1800 new cases of ASD are diagnosed annually in these clinics, and in many of them, children are invited to periodic follow-ups that are free of charge. Diagnostic protocols vary across sites, with some clinics performing standardized ADOS and cognitive and language assessments while others perform partial assessments or more limited clinical evaluations. The formal ASD diagnosis in Israel must be performed according to DSM-5 criteria and include separate diagnoses from both a psychologist and a physician (pediatric neurologist or child psychiatrist).

Thus far, the NARCI has developed partnerships with four such clinics: the Preschool Psychiatry Unit at SUMC, the Child Development Center at SUMC, the Health Ministry Child Development Center at Beer Sheva, and the ALUT Autism Center at Shamir Medical Center. When needed, the NARCI provides the clinical teams with knowledge, training, and additional personnel to create standardized and reliable assessment protocols across sites. This is a cost-effective way to create clinical research partnerships that simultaneously enable autism research and improve clinical care. This framework is also well suited to multi-site clinical trials and longitudinal research on treatment efficacy. Identical research protocols have received ethical approval at the four participating sites, where ~ 450 new cases of ASD are diagnosed annually. Additional sites will be added in the near future.

Critical components of the study design include standardized data collection at the time of diagnosis and at follow-up visits, subsequent incorporation of the data into identical database structures, and the eventual integration of anonymized data across sites. Database administration is handled centrally by the NARCI with research assistants managing data collection and processing at each of the sites to ensure consistency and reliability.

Selected Scientific Questions

Over 40 leading Israeli autism researchers and clinicians participated in identifying the following nine shared scientific questions that were agreed to be the highest priority for ASD research in Israel. The database is designed to address these key scientific questions:

  1. Are there distinct clinical profiles (i.e., subgroups) in the ASD population that can be defined based on specific behavioral, genetic, and/or other biological measures?

  2. How effective are current diagnostic protocols and screening methods in Israel, and how can they be improved?

  3. Which techniques and technologies enable objective, reliable, and sensitive early detection of ASD and quantification of ASD symptoms (e.g., outcome measures)?

  4. How effective are existing community-based treatments and educational settings in reducing core and secondary ASD symptoms, and how can their efficacy be improved?

  5. How effective are new behavioral, pharmacological, and other medical interventions in reducing core and secondary ASD symptoms?

  6. Which behavioral and/or biological measures enable accurate prognosis of short- and long-term outcomes for individuals with ASD?

  7. What are the genetic and environmental risk factors for ASD in Israel?

  8. How prevalent are various medical comorbidities in Israeli children with ASD?

  9. What are the factors that influence the functioning and well-being of families of children with ASD in Israel?

After selecting the scientific questions, we developed a list of the most important behavioral and biological measures needed to address them. We selected only measures that were feasible given limitations in budget, personnel, and availability of participants.

Participants

Children between 1 and 8 years of age who were referred to one of the participating clinics with a suspicion of ASD are eligible to participate if they have completed the ADOS assessment. To facilitate recruitment, one parent may sign the informed consent form on behalf of both parents. Either parent can withdraw the family from the research at any time, in which case the family is excluded from further research. To date, 961 children and their parents have been recruited at the four sites. This cohort exhibits the typical 4:1 male to female ratio and a wide distribution of ADOS and cognitive scores, demonstrating heterogeneous severity levels (Fig. 1). Initial analyses demonstrate a moderate negative correlation between ADOS-calibrated severity scores and cognitive scores (r = − 0.38, p < 0.0001), suggesting that children with more severe ASD symptoms tend to have lower cognitive abilities. Age at diagnosis was negatively correlated with ADOS-calibrated severity scores (r = − 0.27, p < 0.0001) and positively correlated with cognitive scores (r = 0.17, p < 0.0001), suggesting that children who are diagnosed earlier tend to have more severe ASD symptoms and poorer cognitive abilities. Maternal and paternal age at birth were not significantly correlated with symptom severity (− 0.1 < r < 0.1, p > 0.1).
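The text does not state which correlation statistic was used. As an illustration only, the following minimal sketch assumes a Pearson correlation and shows how such a coefficient could be computed between severity and cognitive scores. The function name and the toy values are hypothetical and are not data from the NARCI cohort.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, NOT data from the NARCI cohort.
toy_severity = [4, 6, 7, 8, 9, 10]          # ADOS-calibrated severity scores
toy_cognitive = [105, 95, 90, 70, 80, 60]   # standardized cognitive scores

r = pearson_r(toy_severity, toy_cognitive)
print(f"r = {r:.2f}")  # negative: higher severity pairs with lower cognitive scores
```

A negative coefficient, as in the toy example, corresponds to the direction of the association reported above between symptom severity and cognitive scores.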

Fig. 1 Characteristics of the children who are currently in the database (N = 961). a Gender and ethnicity. b Age at diagnosis. c Cognitive scores at diagnosis. d ADOS-calibrated severity scores (i.e., comparison score) at diagnosis. e Maternal age at birth. f Paternal age at birth

Since the majority of the children currently in the database are from the south of Israel (i.e., diagnosed at SUMC), ~ 22% of the children are from the Bedouin nomadic Arab community. This is a unique tribal community with ~ 66% consanguineous marriages (i.e., first cousins) and frequent polygamy with very large families, which offers a singular context for genetic studies of ASD. In addition to the mandatory ADOS assessments, a variety of other data has been collected from the children and their families (Table 1). The types and sources of the data are described below.

Table 1 Current availability of different types of information from children who were recruited to the database

Behavioral Assessments

A fundamental necessity of all ASD studies is to acquire standardized behavioral assessments that capture basic phenotypic features from children at multiple ages. The following assessments were selected:

  • DSM-5 assessment (American Psychiatric Association 2013): Physicians and psychologists assess the DSM-5 criteria for ASD in a meeting with the child and parents. The results of this assessment are recorded in a standardized form including the level of required support in each symptom domain.

  • Autism Diagnostic Observation Schedule 2 (ADOS-2) (Lord et al. 2012): Trained clinicians administer this standardized ASD assessment of the child. The ADOS yields scores describing the severity of symptoms in the social and repetitive behavior domains and is a requirement for entry into the database.

  • Preschool Language Scale 3 (PLS-3) (Volden et al. 2011): Speech pathologists administer this standardized language assessment, which yields scores regarding speech production and comprehension.

  • One of the following cognitive assessments, administered by a licensed psychologist:

    1. Bayley Scales of Infant and Toddler Development (Viezel et al. 2014): cognitive test for children < 3.5 years old that yields a developmental quotient score with a distribution that is equivalent to IQ scores (population mean of 100 and a standard deviation of 15).

    2. Mullen Scales of Early Learning (Mullen 1995): cognitive test for children < 5 years old that yields a composite score of early learning abilities with an equivalent distribution to IQ scores.

    3. Wechsler Preschool and Primary Scale of Intelligence (WPPSI) (Luiselli et al. 2013): standardized IQ test for children 2.5–7.7 years old.

Audio and Video Recordings

Recent advancements in computer vision and speech analysis suggest that certain ASD symptoms may be identified and quantified by analyzing video and audio recordings of children with ASD (Budman et al. 2019; Sadiq et al. 2019). To enable such research, a subset of the ADOS, cognitive, and language assessments at all four clinical sites is performed in rooms equipped with audio and video recording systems. When an assessment is recorded, parents sign an additional consent form approving the use of the recordings for clinical and research purposes.

Interviews and Questionnaires

In addition to behavioral assessments, basic phenotypic information is gathered using the following parent interviews and questionnaires:

  • Intake questionnaire with the following information:

    1. Sociodemographic: education, income, ethnicity, age, history of addresses

    2. Family history: number of children, miscarriages, medical history in the immediate and extended family, medications, smoking, alcohol, substance abuse

    3. Pregnancy: use of assisted reproductive technologies, vitamins and supplements, smoking, alcohol, medications, high-risk pregnancy, illnesses, complications

    4. Birth: gestational age, weight, complications, neonatal care unit

    5. Early development: medical history, medications, vaccinations, feeding difficulties, physical growth, motor development, language development, regression

    6. Early interventions: history of early interventions and educational placements

  • Assessment of adaptive behaviors using one of the following options:

    1. Vineland Adaptive Behavior Scales 2nd Ed. (Sparrow et al. 2005): a structured parent interview.

    2. Adaptive Behavior Assessment System 2 (ABAS-2) (Harrison and Oakland 2003): parent questionnaire.

  • Sensory Profile 2 (Dunn and Westman 1996): This is a parent questionnaire for evaluating hypo- and hypersensitivities in a variety of sensory modalities.

  • The Children’s Sleep Habits Questionnaire (CSHQ) (Owens et al. 2000): This is a parent questionnaire for evaluating the severity of sleep disturbances in children.

  • Aberrant Behavior Checklist (ABC) (Aman et al. 1985): This is a parent questionnaire for evaluating the severity of aberrant behaviors in children.

Additional questionnaires that are necessary for addressing key scientific questions will be added in the near future. These include measures of parental characteristics such as the Parenting Stress Index (Abidin et al. 2013) and Beck Depression Inventory (Jackson-Koku 2016), measures of children’s ADHD symptoms such as the Conners’ Rating Scale (Keith Conners et al. 1998), measures of children’s anxiety symptoms such as the SCARED questionnaire (Birmaher et al. 1999), and ASD screening measures such as the Social Responsiveness Scale (Constantino 2012) and the Modified Checklist for Autism in Toddlers (Robins et al. 2001).

Birth and Medical Records

The approved parental consent form authorizes the research team to extract information from the child’s birth and medical records, which are available for most of the children through the electronic records of their HMO. This includes the following information:

  1. Birth: gestational age, weight, Apgar score, complications, newborn hearing test, neonatal intensive care unit

  2. Service use: hospitalizations, medications, referrals, blood tests, and medical exams

  3. Comorbidities: current and previous illnesses and disorders

  4. Development: physical growth, motor and language milestones, and vaccinations

MRI Scans

Surprisingly, initial examination of the available medical records revealed that ~ 20% of the children in the database were referred for a clinical brain MRI at some point before or after their ASD diagnosis. Clinical brain MRI scans in Israel typically include a detailed anatomical 1 × 1 × 1-mm T1-weighted scan and a lower resolution 2 × 2 × 2-mm diffusion-weighted scan (at least 6 directions). The parental consent form allows us to extract the scans from the HMO database and perform additional analyses with the raw data.

Genetic Evaluations

Saliva samples are collected from participating children and both parents for extraction of DNA. The DNA is sent to the Autism Sequencing Consortium (Satterstrom et al. 2020) for whole exome sequencing. The resulting sequences enable studies that assess the prevalence of de novo and inherited genetic abnormalities in the Israeli population. Furthermore, it is possible to relate specific genetic abnormalities to the deep phenotypic data available for this cohort. Participation in the genetic study requires both parents to sign an additional consent form that has been approved at all four sites.

Biological Samples

Protocols for collecting, processing, and storing additional biological samples including blood, stool, urine, and hair samples have already been approved through participation in the Israeli Biorepository Network for Research (MIDGAM, www.midgam.org.il). The MIDGAM is a national organization operating in five major medical centers in Israel with infrastructure for processing and storing a variety of biological samples. Participation will require parents to sign an additional consent form. Collection of biological samples is expected to begin in late 2020.

Longitudinal Data Collection

Within ASD research, it is essential to study how behavioral and biological measures change over time, particularly in the context of intervention (Georgiades et al. 2017). This is critical for revealing the different etiologies of autism, for developing prognostic measures, and for assessing treatment efficacy. With this in mind, the NARCI database places a strong emphasis on longitudinal data collection at participating sites, using the same standardized measures as at the initial diagnosis. While most ASD clinics in Israel invite families for follow-up visits 1–2 years after the initial diagnosis, these visits are typically short and do not include the behavioral assessments performed at diagnosis.

To address this challenge, the NARCI is funding the administration of follow-up behavioral assessments at each of the clinical sites. Data collection at follow-up currently includes questionnaires regarding adaptive behaviors, aberrant behaviors, sensory sensitivity, and sleep disturbances, as well as a follow-up questionnaire regarding the child’s educational placement, behavioral interventions, and use of medications. In addition, standardized ADOS, cognitive, and language assessments are performed by NARCI personnel. Follow-up assessments have been completed with 208 children at the two SUMC sites, with expansion to the other sites planned this year (Table 2).
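Because the same standardized measures are collected at diagnosis and at follow-up, change over time can be derived by pairing the two visits of each child. The following is a minimal sketch of that pairing, not the NARCI implementation. The record layout, field names, and values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assessment:
    child_id: str      # hypothetical pseudonymous identifier
    visit_date: date
    ados_css: int      # ADOS-calibrated severity score
    cognitive: int     # standardized cognitive score


def change_scores(baseline, follow_up):
    """Pair each follow-up with the same child's baseline and return the deltas."""
    by_child = {a.child_id: a for a in baseline}
    changes = {}
    for visit in follow_up:
        first = by_child.get(visit.child_id)
        if first is None:
            continue  # no baseline on record; skip rather than guess
        elapsed_days = (visit.visit_date - first.visit_date).days
        changes[visit.child_id] = {
            "years_between": round(elapsed_days / 365.25, 2),
            "ados_css_change": visit.ados_css - first.ados_css,
            "cognitive_change": visit.cognitive - first.cognitive,
        }
    return changes


# Hypothetical records, NOT data from the NARCI cohort.
baseline = [Assessment("c001", date(2017, 3, 1), 8, 75)]
follow_up = [
    Assessment("c001", date(2018, 9, 1), 6, 82),
    Assessment("c999", date(2018, 9, 1), 5, 90),  # no matching baseline
]
result = change_scores(baseline, follow_up)
print(result)
```

Follow-up records without a matching baseline are skipped in this sketch, so each reported change refers to a single child measured twice with the same instrument.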

Table 2 Current availability of different types of information for children who completed a follow-up assessment 1–2 years after the initial diagnosis

Data Management and Sharing

To ensure privacy, identifiable data is held in separate local password-protected databases at each of the clinical sites (with the exception of the two SUMC sites, where data management is combined). The databases reside on their respective medical center’s internal computer networks, which are protected according to the standards enforced by each center. Research assistants from the NARCI join the local clinical teams at each of the sites and receive authorization from each medical center to access the clinic’s computer network. They then populate identical databases that are built in Microsoft Access (Microsoft Corp., USA). An automated procedure de-identifies the data and enables integration of the data across sites. All clinicians and researchers associated with the participating sites have access to the anonymized, integrated data, which can be extracted for analyses in their respective laboratories. Plans are underway to enable a method for sharing the anonymized data with researchers outside of the participating sites, given appropriate ethics approval. Some forms of data, such as video recordings, cannot be anonymized and are therefore analyzed within the confines of each clinical site.
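The automated de-identification procedure is not described in detail. As one possible illustration, the sketch below assumes that direct identifiers are dropped and replaced by a keyed-hash pseudonym before records from different sites are merged. The field names, the secret key, and the functions `pseudonym`, `deidentify`, and `integrate` are all hypothetical and do not describe the actual NARCI procedure.

```python
import hashlib
import hmac

# Hypothetical field names; the actual NARCI schema is not described in the text.
IDENTIFYING_FIELDS = {"national_id", "name", "address", "birth_date"}


def pseudonym(national_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonymous ID with a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, national_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record: dict, site: str, secret_key: bytes) -> dict:
    """Drop direct identifiers and attach a pseudonym and a site label."""
    clean = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    clean["pseudo_id"] = pseudonym(record["national_id"], secret_key)
    clean["site"] = site
    return clean


def integrate(site_records: dict, secret_key: bytes) -> list:
    """Merge de-identified records from all sites into one list."""
    merged = []
    for site, records in site_records.items():
        merged.extend(deidentify(r, site, secret_key) for r in records)
    return merged


# Hypothetical records, NOT data from the NARCI cohort.
key = b"example-key-kept-outside-the-shared-database"
sites = {
    "site_a": [{"national_id": "123", "name": "A", "ados_css": 7}],
    "site_b": [{"national_id": "456", "name": "B", "ados_css": 5}],
}
merged = integrate(sites, key)
print(merged)
```

Using a keyed hash rather than a plain hash means that pseudonyms cannot be recomputed without the key, while the same child still receives the same pseudonym at every visit. Removing direct identifiers in this way is only a first step; whether the result counts as anonymized also depends on the remaining fields.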

Discussion

The NARCI database is a large shared resource for performing multidisciplinary research regarding ASD risk factors, biomarkers, outcome measures, and treatment efficacy. The study protocol focuses on ASD development and incorporates retrospective data from pregnancy, birth, and medical records along with prospective data that is collected as part of the diagnosis process and in follow-up visits. The wealth of longitudinal behavioral, clinical, and biological data from this large dataset is critical for studying the heterogeneity of ASD etiologies and specifically for understanding ASD development, risk factors, diagnosis, treatment efficacy, and service utilization in the Israeli population.

Research Focus

A central goal of the NARCI is to facilitate clinical research that will improve ASD diagnosis and treatment in Israel. Such clinical research is carried out by many researchers in different fields within social sciences, medicine, engineering, and life sciences who use distinct techniques and measures. Despite differences across fields, the vast majority of clinical studies require thorough behavioral characterization of participants using standardized tests of autism severity (e.g., ADOS), cognitive function, language abilities, and adaptive behaviors. Additional measures of aberrant behaviors, sleep problems, and sensory sensitivities add critical information as do medical records containing medication use and diagnosed comorbidities. Collecting these behavioral data from a large number of 1–8-year-old children at the time of ASD diagnosis is the primary focus of the database. This is demonstrated by the large number of children for whom these data are already available (Table 1).

After establishing standardized behavioral assessments at each of the sites, we added video and audio capture of behavioral assessments, collection of saliva samples for whole exome sequencing, extraction of retrospective clinical records including pregnancy and birth records, collection of eye tracking data, and clinical MRI scans. Collecting these medical and biological data from a large number of 1–8-year-old children at the time of ASD diagnosis is the secondary focus of the database. We plan to add several types of data to this secondary focus including the collection of additional biological samples (blood, hair, stool, and urine) for processing and storage at MIDGAM, and collection of additional questionnaires including parent and family characteristics/function questionnaires.

After establishing the collection of these data during the ASD diagnosis, we developed a routine for collecting longitudinal information at the annual follow-up visits that families complete in most of the ASD clinics. Collection of longitudinal data follows the same priorities as initial data collection with a primary focus on standardized behavioral assessments and a secondary focus on additional medical and biological data (Table 2).

Size Versus Depth

When deciding on the research goals of the NARCI database, we had to choose between collecting restricted information on a relatively large number of participants (e.g., by mining the HMO records of all ASD individuals in Israel) and collecting in-depth information from a relatively small number of participants. We chose the latter because ASD heterogeneity requires in-depth characterization of participants using standardized clinical measures that involve direct interaction with the child. Standardizing measures so that they are collected reliably across sites is paramount for the success of this research. Deeper data collection, however, limits the number of participants and thereby the size of the database.

Note that this trade-off between size and depth is apparent in other collaborative research efforts around the globe. For example, genetic projects with larger cohorts (e.g., the SPARK project with a target cohort of 50,000 ASD individuals) typically reduce the depth of data collected from each participant (e.g., SPARK collects phenotypic data using parent and self-report questionnaires). Conversely, projects that collect more detailed longitudinal phenotypic and/or neuroimaging data tend to recruit smaller cohorts (e.g., the EU-AIMS LEAP project with ~ 400 participants).

Limitations of Large-Scale Research

While large multi-site collaborations are essential for studying large numbers of ASD individuals and understanding the heterogeneity of ASD, they also have several limitations. First, large multi-site collaborations are difficult to manage since participating members sometimes want to pursue different goals. Second, large collaborations tend to respond more slowly to emerging challenges and opportunities. Third, large collaborations are always limited in their scope such that some scientific topics and fields are left out.

Our hope is that these limitations will not impede our efforts to identify distinct subgroups of ASD children with specific behavioral and/or biological characteristics who benefit more from specific interventions and treatments. This step will enable more focused research with specific subgroups in highly specialized labs. We believe that ASD research will profit greatly from a balance between large-scale collaborative research and focused highly specialized research.

Recruitment Progress to Date

Recruitment of families to the database has progressed at an average rate of ~ 180 new families per year (Fig. 2). The experience with the families has been extremely positive, with ~ 80% of referred families agreeing to participate in the database. We believe that this high recruitment rate is due to the integration of the research into the ASD clinics. This enables families to participate without requiring any additional effort (i.e., research is carried out as part of the routine clinical visits). Expansion of the research to additional sites throughout Israel is planned to increase recruitment rates and data collection over the next several years. Our goal is to reach ~ 70% of the 1800 annually diagnosed children in Israel.

Fig. 2 Recruitment rate of new families to the database. Gray bars represent the number of children recruited per month, from January 2015 to February 2020. Black line indicates the cumulative recruitment rate. Average recruitment rate is 14.7 families per month and ~ 180 families per year (N = 961)
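The recruitment figures above can be related to the stated national goal with simple arithmetic. The following sketch uses only numbers reported in the text (14.7 families per month, ~ 1800 annual diagnoses, and a ~ 70% target); the variable names are illustrative.

```python
MONTHLY_RATE = 14.7        # families recruited per month, as reported
ANNUAL_DIAGNOSES = 1800    # approximate new ASD diagnoses per year in Israel
TARGET_COVERAGE = 0.70     # stated national recruitment goal

# Annualized recruitment at the current rate (reported as ~ 180 per year).
annual_rate = MONTHLY_RATE * 12

# Number of children per year implied by the national goal.
target_per_year = round(ANNUAL_DIAGNOSES * TARGET_COVERAGE)

# Factor by which annual recruitment would need to grow to meet the goal.
scale_up = target_per_year / annual_rate

print(f"current: {annual_rate:.1f}/year, target: {target_per_year}/year, "
      f"scale-up: {scale_up:.1f}x")
```

Under these figures, reaching the goal would require roughly a sevenfold increase in annual recruitment, which is consistent with the planned expansion to additional sites.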

Conclusions and Future Plans

The NARCI database is a unique and valuable resource for studying the development of ASD. We believe that further development of the database will enable national studies, which will reveal important findings that are unique to Israel, as well as international studies that will reveal commonalities between the Israeli and the global ASD population. Future plans also include extending the age range of participants to include adolescents and adults with ASD who require immediate solutions for a variety of challenges.