Introduction

Background and Motivation

Autism Spectrum Disorder

Autism Spectrum Disorder (ASD) is a complex neurodevelopmental disorder of apparently increasing prevalence and unknown etiology (Levy et al. 2009; McPartland and Volkmar 2012). The condition is highly heritable (Folstein and Rosen-Sheidley 2001; Geschwind 2009; Berg and Geschwind 2012), and although there has been active research to discover the genetic factors and other biomarkers underlying ASD (Abrahams and Geschwind 2008; Scherer and Dawson 2011; Miles 2011; Devlin and Scherer 2012), diagnosis still depends almost exclusively on behavioral assessments (Matson 2007; Huerta and Lord 2012). ASD affects predominantly males, with a male-to-female ratio currently estimated at approximately 4:1 (Fombonne 2009; El-Fishawy and State 2010; Baron-Cohen et al. 2011). ASD is a lifelong condition with symptoms appearing in early childhood. Individuals affected by ASD exhibit varying degrees of deficits in communication and reciprocal social interaction and show a range of restricted and repetitive interests (Moldin and Rubenstein 2006; Johnson and Myers 2007; Lord and Jones 2012; DSM IV-TR 2000; DSM-5 2013). Diagnosis of affected individuals falls on a spectrum, with variability both in the presence or absence of specific autistic features and in the severity of those features (Tager-Flusberg and Joseph 2003; Walker et al. 2004; Volkmar et al. 2009; Rutter 2011). Existing treatments are primarily behavioral, with early intervention having a positive impact on the lifelong course of the condition (Committee on Children with Disabilities 2001).

There has been extensive research on ASD since Leo Kanner first identified “autistic disturbances” in children in 1943 (Kanner 1943). The naming and classification of the symptoms and conditions that comprise autism and related developmental disorders have undergone changes over the years with the criteria enumerated in the International Classification of Diseases and in the Diagnostic and Statistical Manual of Mental Disorders (DSM) serving as definitional for clinical assessment. DSM-IV recognized several separate disorders: autistic disorder, Asperger’s disorder, childhood disintegrative disorder, and pervasive developmental disorder, not otherwise specified, while DSM-5 recognizes one encompassing disorder: autism spectrum disorder. The shift from DSM-IV to DSM-5 has been viewed as a largely positive shift, but it has also raised some concerns, including whether the changes will have a negative impact on the services provided to affected individuals, as well as whether the changes will make comparison with previous research results more difficult (Wing et al. 2011; Mattila et al. 2011; Mahjouri and Lord 2012; Lord and Jones 2012; Huerta and Lord 2012; Volkmar and Reichow 2013).

Collecting, Accessing, and Sharing ASD Data

Data in all areas of biomedical research are being collected at an astonishing rate, but with varying attention paid to methods that would make those data readily accessible to others. Biomedical ontologies have become recognized for their important role in facilitating data access and sharing among large groups of researchers, often with disparate backgrounds and interests (Rubin et al. 2008; Bodenreider 2008; Gardner et al. 2008; Bug et al. 2008; Larson and Martone 2009; Bilder et al. 2009; Imam et al. 2012; Hoehndorf et al. 2012). A few experiments have used the biomedical literature both to explore the usefulness of autism-focused ontologies and to generate candidate ontologies (Petric et al. 2007; Tu et al. 2008; Macedoni-Lukšič et al. 2011; Hassanpour et al. 2011).

In recent years, a number of large-scale initiatives have contributed to the collection of extensive information from families affected by autism. These initiatives are primarily motivated by a desire to gain an understanding of the genetics of autism. Included among the initiatives are the Autism Genetic Resource Exchange (AGRE), a database of biomaterials and genotypic and phenotypic information; the Simons Foundation Autism Research Initiative (SFARI), a database of clinical and genetic information about families affected by autism and other neurodevelopmental disorders; the National Database for Autism Research (NDAR), an informatics platform for ASD-relevant data; and the Autism Consortium data resource, a database of phenotypic and genetic data on families affected by autism (AGRE 2013; SFARI 2013; NDAR 2013; Autism Consortium 2013). All of these resources have been developed to facilitate collaboration and sharing of data with the goal of accelerating scientific research on ASD (Lajonchere and AGRE Consortium 2010; Fischbach and Lord 2010; Hall et al. 2012).

The Autism Consortium, whose membership includes scientists of varying backgrounds from multiple institutions in the greater Boston area, has recruited hundreds of families to participate in autism spectrum disorder research studies. Extensive phenotypic and genotypic data have been collected not only from affected children but also from each of their family members. The goal of the Consortium is to determine the cause of autism spectrum disorders, thereby speeding diagnosis and leading to the development of new treatments (Autism Consortium 2013).

The goal of the work reported here was to develop an ontology that can be used 1) to provide improved access to the data collected by those who study ASD and other neurodevelopmental disorders, and 2) to assess and compare the characteristics of the instruments that are used in the assessment of ASD.

Methods and Materials

Materials

The Autism Consortium selected some two dozen different screening tools and diagnostic instruments for the collection of phenotypic data from affected individuals and from their family members, including parents and siblings. Table 1 lists the instruments together with their abbreviations, investigative methods, and citations to articles that describe the development, refinement, or evaluation of those instruments.

Table 1 Screening and diagnostic instruments used by the autism consortium

Instrument formats include 1) questionnaires, generally completed either by a parent or another primary caregiver (e.g., CBCL), 2) interviews, administered by a trained individual (e.g., BPASS), or 3) direct assessment, administered by an individual who has been trained to achieve high levels of reliability for that particular instrument (e.g., ADOS). Time to administer any given instrument ranges from 5 minutes (e.g., Dean Handedness) to over two hours (e.g., ADI-R). In some cases, multiple versions of the same instrument exist, generally designed to be administered to different age ranges. For example, three versions of the Behavior Rating Inventory of Executive Function (BRIEF) were used by the Consortium: BRIEF-P for preschool children, BRIEF (Parent Form) for ages 6 to 18, and BRIEF (Self-Report Form) for ages 19 or older. The Autism Consortium Medical History (MH) includes both a comprehensive questionnaire and an interview that addresses substance use.

Some instruments include a relatively small number of questions (e.g., CTOPP), while others include hundreds (e.g., VABS-II). Questions vary in the types of answers required: some call for yes/no answers, open-ended answers, or scores, while others require an assessment, for example, of severity or frequency. Some examples of questions related to restricted and repetitive behavior are shown below:

  • Does s/he ever have things that s/he seemed to have to do in a very particular way? (SCQ)

  • Repeats certain acts over and over; compulsions (CBCL)

  • Having to repeat the same actions such as touching, counting, or washing (SCL-90)

  • REPEATING (Need to repeat routine events; In/out door, up/down from chair, clothing on/off) (RBS-R)

  • Flexibility in schedule and routine (BPASS)

  • Resists change of routine foods, places, etc. (BRIEF-P)

  • Responds appropriately to reasonable changes in routine (for example, refrains from complaining, etc.). (VABS-II)

  • Resistance to Trivial Changes in Environment: Current (ADI-R)

  • Reacts positively when a new and unfamiliar activity is suggested (CCC-2)

Methods

The development of the ontology was informed by a consideration of the extensive literature on the phenotypic characteristics of individuals affected by autism as well as by the detailed content of the autism assessment instruments. It was immediately apparent that the instruments differ in structure and coverage. We undertook a comprehensive analysis of each of the instruments, studying the nature of the questions asked and items assessed, the method of delivery, and the overall scope of the content.

The literature guided us in the top-down development of the overall structure of the ontology as well as in developing its meaningful subcategories. The initial three-branch hierarchy representing autism specific personal traits, social behaviors, and associated medical conditions was expanded iteratively through both manual and automated evaluation. Once we finalized the hierarchy and concepts, we reviewed, refined, and validated the item level mappings to individual concepts in the ontology.

We began by grouping and clustering instrument questions based on similar meanings. We performed initial automated clustering of the instrument question text using latent semantic indexing to create groupings that served as “work lists” for bottom-up development (similar to methods used by Petric et al. 2007). The process involved manual refinement of the automatically generated clusters, including adding items to the initial clusters, merging clusters where appropriate, and splitting clusters that had been created based on shared terms but that, in fact, represented distinct concepts (e.g., “plays well with others”, “plays with parts of objects”). As part of this process, we mapped individual items in each of the instruments to the evolving ontology.
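The reason manual splitting is needed can be illustrated with a small sketch. It does not implement latent semantic indexing; as a deliberately simpler stand-in it groups items by token overlap (Jaccard similarity) in a greedy pass. The stopword list, the threshold, and the function names are assumptions made for illustration, and the item texts are taken or abbreviated from the examples quoted in this paper.

```python
import re

# Tiny illustrative stopword list (an assumption, not from the study).
STOPWORDS = {"with", "of", "the", "a", "to", "in"}

def tokens(text):
    """Return the set of lowercase words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Token-overlap similarity between two item texts (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def work_lists(items, threshold=0.15):
    """Greedy grouping: put each item in the first cluster holding a similar item."""
    clusters = []
    for item in items:
        for cluster in clusters:
            if any(jaccard(item, other) >= threshold for other in cluster):
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters

items = [
    "plays well with others",
    "plays with parts of objects",
    "has trouble playing with others",
    "resists change of routine",
]
clusters = work_lists(items)
for number, cluster in enumerate(clusters, start=1):
    print(number, cluster)
```

In this toy run the shared term "plays" pulls "plays with parts of objects" into the same work list as the two peer-play items, which is the kind of cluster a reviewer would then split by hand.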

We then “bound” the concepts to the possible answers as they are represented in any given instrument. For example, some questionnaires may have true-false questions (e.g., SCQ), others may rate answers on a scale of 1–3 (e.g., BRIEF), while others may use a scale of 1–4 (e.g., SRS). Furthermore, some questions may be phrased positively (“plays well with others”), while others are phrased negatively (e.g., “has trouble playing with others”). In such a case, the same value such as “true”, or “all the time”, means two quite different things. We developed three sets of generic assessment scales based on different types of concepts represented in the ontology: 1) Frequency: “rarely or never, sometimes, almost always, frequently or always, N/A or unknown”; 2) Severity: “average or above, somewhat limited, limited, severely limited, N/A or unknown”; and 3) Presence: “present, absent, unsure or unknown”. For all item level questions, we then created mapping tables from each possible answer (or numeric range of answers) to an assessment on the assessment scale. Similarly, for each concept, we added an attribute determining which assessment scale to use.
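The binding step can be sketched as a pair of lookup tables. The three scales below are the ones listed above; everything else (the instrument names `INSTR_A` to `INSTR_C`, the item identifiers, the concept name "Peer Play", and the specific answer-to-assessment assignments) is hypothetical and chosen only to show how positively and negatively phrased items resolve to the same scale.

```python
# Generic assessment scales as described in the text.
SCALES = {
    "frequency": ("rarely or never", "sometimes", "almost always",
                  "frequently or always", "N/A or unknown"),
    "severity": ("average or above", "somewhat limited", "limited",
                 "severely limited", "N/A or unknown"),
    "presence": ("present", "absent", "unsure or unknown"),
}

# Hypothetical concept attribute: which scale each concept uses.
CONCEPT_SCALE = {"Peer Play": "severity"}

# Hypothetical item-level mapping tables: raw answer -> assessment.
BINDINGS = {
    # Positively phrased true/false item ("plays well with others").
    ("INSTR_A", "q1"): {"concept": "Peer Play",
                        "answers": {"true": "average or above", "false": "limited"}},
    # Negatively phrased true/false item ("has trouble playing with others").
    ("INSTR_B", "q7"): {"concept": "Peer Play",
                        "answers": {"true": "limited", "false": "average or above"}},
    # Negatively phrased item rated on a 1-3 scale.
    ("INSTR_C", "q3"): {"concept": "Peer Play",
                        "answers": {1: "average or above", 2: "somewhat limited",
                                    3: "severely limited"}},
}

def assess(instrument, item, raw_answer):
    """Translate a raw answer into (concept, assessment) on the concept's scale."""
    binding = BINDINGS[(instrument, item)]
    scale = SCALES[CONCEPT_SCALE[binding["concept"]]]
    # Unmapped or missing answers fall back to the scale's "unknown" value.
    value = binding["answers"].get(raw_answer, scale[-1])
    if value not in scale:
        raise ValueError(f"{value!r} is not on the scale for {binding['concept']}")
    return binding["concept"], value

print(assess("INSTR_A", "q1", "true"))
print(assess("INSTR_B", "q7", "true"))
print(assess("INSTR_C", "q3", 3))
```

The same raw value "true" yields opposite assessments for the two true/false items, which is the ambiguity the binding is meant to remove.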

Figure 1 illustrates (a) a section of the ontology, highlighting the concept “Control of Emotional Reactions”, together with (b) the binding that is necessary such that the correct interpretation can be made of the answers to the questions posed.

Fig. 1 Portion of the autism phenotype ontology (a), and binding of answers to concepts (b)

Note that the instruments represented in the figure (e.g., BRIEF, VABS-II, and ADOS) not only have slightly different ways of representing the same concept, but also have different ways of assessing the responses to the questions posed.

As our ontology development environment we used the Protégé Ontology Editor and Knowledge Acquisition System (Noy et al. 2010; Noy et al. 2009; Protégé 2013), a readily available open source ontology development tool. Throughout the development cycle we applied a variety of metrics to our emerging ontology. We leveraged existing National Center for Biomedical Ontology (NCBO 2013) metrics to find structural weaknesses, and we developed additional metrics to analyze information content and to generate suggestions for further ontology development. Iteratively applying the metrics to the evolving ontology guided our revision strategies by highlighting inconsistencies, structural imbalances, and areas in need of review. Metrics included the distribution of concepts across the ontology, including the maximum depth of the concepts in the hierarchy, and the average and maximum number of siblings. In addition, we measured the number of concepts as compared to the number of instances (questions) mapped to those concepts, as well as the number of leaf concepts linked only to questions from a single instrument and the number of leaf concepts not linked to any questions.
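The structural metrics named above can be computed from a parent table and a concept-to-instrument map, as in the following sketch. The hierarchy fragment, the instrument labels `X` and `Y`, and the exact metric definitions (for example, counting siblings as the other children of the same parent, with top-level classes treated as siblings of one another) are assumptions for illustration and may differ from the NCBO definitions used in the study.

```python
from collections import defaultdict

# Hypothetical fragment: concept -> parent (None marks a top-level class).
PARENT = {
    "A": None, "B": None,
    "A1": "A", "A2": "A", "A3": "A",
    "A1.1": "A1", "A1.2": "A1",
    "B1": "B",
}

# Hypothetical mapping: concept -> instruments with at least one mapped question.
MAPPED = {"A1.1": {"X", "Y"}, "A1.2": {"X"}, "A2": {"Y"}, "B1": set()}

def depth(concept):
    """Number of levels from the top-level class down to this concept."""
    levels = 1
    while PARENT[concept] is not None:
        concept = PARENT[concept]
        levels += 1
    return levels

def metrics():
    children = defaultdict(list)
    for concept, parent in PARENT.items():
        children[parent].append(concept)
    leaves = [c for c in PARENT if c not in children]
    siblings = [len(children[PARENT[c]]) - 1 for c in PARENT]
    return {
        "concepts": len(PARENT),
        "max_depth": max(depth(c) for c in PARENT),
        "max_siblings": max(siblings),
        "avg_siblings": sum(siblings) / len(siblings),
        # Leaves whose questions all come from one instrument.
        "single_instrument_leaves": sorted(
            c for c in leaves if len(MAPPED.get(c, ())) == 1),
        # Leaves with no mapped questions at all.
        "unmapped_leaves": sorted(c for c in leaves if not MAPPED.get(c)),
    }

report = metrics()
for name, value in report.items():
    print(name, value)
```

Leaves flagged as unmapped or single-instrument are natural candidates for the kind of review the iterative process describes.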

Once we had developed the first complete version of the ontology, we used the ontology to further study the full set of instruments with the goal of identifying possible overlaps in their coverage. The motivation for this was that the number of instruments is quite large and demands a significant commitment of time from researchers, and even more importantly, from the families themselves. If we could identify consequential overlaps, then there was the possibility that the number of instruments and questions could be considerably reduced. Understanding how instruments overlap and complement each other may, thus, lead to effective grouping of instruments in future research studies.

Because a given instrument might have more than one version, e.g., a different version for a different age group, and because a question may exist in multiple forms within a single instrument, we developed normalization methods in order not to over-count concept coverage. For those cases where we normalized across instruments, when the same question appeared in multiple versions, this was counted as a single question. For example, ADOS has four modules for different age groups/developmental levels, and many of the same questions appear in several versions, such as in ADOS section C, where the item “Imagination/Creativity” appears in all four modules. For our analysis this would represent one question. For those cases where we needed to normalize within an instrument, when there were several scoring scales for an item, we normalized to a single item. For example, CELF-2 has a subtest focusing on “Recalling Sentences”. This test results in a raw subtotal, a scaled score, a percentile rank score and an age equivalency score. We normalized these items so that they are represented as one question for the purposes of coverage analysis.
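A minimal sketch of both normalizations, using the two examples just given. The row layout (instrument, version, item label, score type) is an assumption about how the item inventory might be stored, not the Consortium's actual schema.

```python
# Assumed row layout: (instrument, version or module, item label, score type).
ROWS = [
    ("ADOS", "Module 1", "Imagination/Creativity", "code"),
    ("ADOS", "Module 2", "Imagination/Creativity", "code"),
    ("ADOS", "Module 3", "Imagination/Creativity", "code"),
    ("ADOS", "Module 4", "Imagination/Creativity", "code"),
    ("CELF-2", "single", "Recalling Sentences", "raw subtotal"),
    ("CELF-2", "single", "Recalling Sentences", "scaled score"),
    ("CELF-2", "single", "Recalling Sentences", "percentile rank"),
    ("CELF-2", "single", "Recalling Sentences", "age equivalency"),
]

def normalized_questions(rows):
    """Collapse versions and score types so each item counts once per instrument."""
    return {(instrument, item.strip().lower())
            for instrument, _version, item, _score_type in rows}

questions = normalized_questions(ROWS)
print(len(ROWS), "raw rows ->", len(questions), "normalized questions")
```

Ignoring the version column handles the across-version case, and ignoring the score type handles the within-instrument case, so eight raw rows count as two questions in the coverage analysis.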

In order to have an objective measure for comparing different combinations of instruments, we identified the set of variables that would be relevant to such comparisons. These included the depth and breadth of the ontology concepts covered by the combined instruments, the uniqueness of the concepts covered when combining instruments, an instrument type factor indicating the mode of administration, a time factor indicating the total amount of time needed to administer a combination of instruments, and an instrument count factor for the number of instruments used. (See Supplement 1 for a detailed description of how the objective function is calculated (Online Resource 1).) The variables we identified are by no means the only possible variables that could be used for performing such an instrument coverage assessment, and the specific definition of each variable, as well as the details regarding how the variables are combined into a single objective function, may not be appropriate for many use cases. Nonetheless, our overall objective was to design an assessment approach that captured what we considered to be the important elements of instrument coverage quality. The approach was based on information theoretic principles but used, wherever possible, simple and intuitive mathematical functions whose computed values and impact on the final objective function could be clearly understood by a human user during iterative exploration of various instrument combinations. In addition, it is important to note that the value of the scores for a set of instruments used in combination lies not in the actual score, but, rather, in how a specific score compares relative to the scores of other possible combinations of instruments.
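To make the roles of these variables concrete, the sketch below scores combinations of three made-up instruments. It is not the objective function defined in Supplement 1: the formulas, the weights (0.25, 120 minutes, 0.1), the instruments `P`, `Q`, `R`, and the six-concept ontology are all assumptions, and the depth component is omitted. It shows only how coverage, uniqueness, administration mode, time, and instrument count might be combined into one comparable number.

```python
from collections import Counter
from itertools import combinations

# Hypothetical ontology and instruments (not from the study).
ONTOLOGY = {"c1", "c2", "c3", "c4", "c5", "c6"}
INSTRUMENTS = {
    "P": {"concepts": {"c1", "c2", "c3"}, "minutes": 20, "trained": False},
    "Q": {"concepts": {"c3", "c4"}, "minutes": 45, "trained": True},
    "R": {"concepts": {"c1", "c2", "c3", "c4"}, "minutes": 120, "trained": True},
}

def coverage(names):
    """Fraction of ontology concepts covered by at least one instrument."""
    covered = set().union(*(INSTRUMENTS[n]["concepts"] for n in names))
    return len(covered) / len(ONTOLOGY)

def unique_coverage(names):
    """Fraction of covered concepts that exactly one instrument covers."""
    counts = Counter(c for n in names for c in INSTRUMENTS[n]["concepts"])
    if not counts:
        return 0.0
    return sum(1 for k in counts.values() if k == 1) / len(counts)

def objective(names):
    """Assumed combination: reward coverage and uniqueness, discount cost."""
    trained_share = sum(INSTRUMENTS[n]["trained"] for n in names) / len(names)
    type_factor = 1.0 + 0.25 * trained_share          # favors trained administration
    total_minutes = sum(INSTRUMENTS[n]["minutes"] for n in names)
    time_factor = 1.0 / (1.0 + total_minutes / 120.0)  # longer sessions score lower
    count_factor = 1.0 / (1.0 + 0.1 * (len(names) - 1))
    return (coverage(names) * unique_coverage(names)
            * type_factor * time_factor * count_factor)

best_pair = max(combinations(sorted(INSTRUMENTS), 2), key=objective)
print("best pair:", best_pair, round(objective(best_pair), 3))
print("all three:", objective(("P", "Q", "R")))
```

As the text notes, the absolute values carry no meaning; only the ranking of one combination against another does. In this toy data the full set scores zero because every covered concept is covered more than once.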

Results

Autism Phenotype Ontology

The final ASD phenotype ontology comprises three high-level classes, ‘Personal Traits’, ‘Social Competence’, and ‘Medical History’. Figure 2 shows the top level structure of the ontology.

Fig. 2 Top level structure of the autism phenotype ontology

Table 2 shows a portion of the ontology in tabular form. Each concept has a unique identifier, a tree number, a concept name, a concept definition, and where appropriate, a mapping to a standard ontology, i.e., MeSH (Medical Subject Headings), ICF (International Classification of Functioning, Disability and Health), or the UMLS (Unified Medical Language System).

Table 2 A portion of the autism phenotype ontology in tabular form

Table 3 shows the results of selected metrics for the final version of the ontology. The full ontology comprises 283 concepts distributed across three major branches. (See Supplements 2 and 3 for the full ontology in OWL and tabular format, respectively (Online Resources 2 and 3).)

Table 3 Autism phenotype ontology final metrics

‘Medical History’ has the largest number of concepts, followed by ‘Personal Traits’, and there is a somewhat smaller number of concepts in ‘Social Competence’. The maximum depth of concepts is 5 and the average number of siblings is 4. The maximum number of siblings is 11, found in C4. The total number of questions mapped to concepts is over 5,000, and after normalization this number is reduced to 3,395 (Footnote 1). The majority of leaf concepts that are mapped to only one instrument is found in ‘Medical History’, which is expected given that the primary coverage of medical issues is found in the Autism Consortium Medical History while the other diagnostic instruments have only minimal or no medically related coverage.

Figure 3 illustrates the integration of the ontology with the Autism Consortium database. The figure is a composite of screen shots from the Autism Consortium query tool illustrating the Query by Ontology capability.

Fig. 3 Composite screen shot of autism consortium query tool

In the example shown in Fig. 3, the researcher is interested in retrieving data for all of those individuals in the database who have been assessed with severely limited ability to control their emotions. On the left-hand side, it can be seen that the ontology is expandable by clicking on the area of interest, in this case, ‘Personal Traits’. Exploring ‘Personal Traits’ leads to the choice of ‘Emotional Regulation and Control’. Once that concept is chosen, the ‘Severity’ level is chosen on the top right. The bottom right shows all of the questions that have been mapped to that concept, and now the researcher is able to download from the database all of the relevant data for each of the individuals who meet those criteria. The download includes not only the data that are relevant to the topic of the query, but all of the data that exist in the database about those individuals.
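The query pattern just described can be sketched in a few lines. The parent-child links, the individual identifiers, and the data rows below are made up for illustration (including the assumption that ‘Control of Emotional Reactions’ sits under ‘Emotional Regulation and Control’), and the function names are not those of the Autism Consortium query tool.

```python
# Assumed hierarchy fragment: parent -> children.
CHILDREN = {
    "Personal Traits": ["Emotional Regulation and Control"],
    "Emotional Regulation and Control": ["Control of Emotional Reactions"],
}

# Made-up harmonized rows: (individual, instrument, concept, assessment).
ROWS = [
    ("ind-1", "BRIEF", "Control of Emotional Reactions", "severely limited"),
    ("ind-1", "ADI-R", "Language Ability", "limited"),
    ("ind-2", "VABS-II", "Control of Emotional Reactions", "somewhat limited"),
    ("ind-3", "BRIEF", "Emotional Regulation and Control", "severely limited"),
]

def with_descendants(concept):
    """Return the concept together with everything beneath it."""
    found, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(CHILDREN.get(current, []))
    return found

def query_by_ontology(concept, level):
    """Find matching individuals, then return every row held about them."""
    targets = with_descendants(concept)
    matched = {ind for ind, _instr, c, a in ROWS if c in targets and a == level}
    return [row for row in ROWS if row[0] in matched]

result = query_by_ontology("Emotional Regulation and Control", "severely limited")
for row in result:
    print(row)
```

The second step mirrors the behavior described above: the download contains all rows for the matching individuals, not only the rows that triggered the match.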

ASD Instrument Analysis

Figure 4 shows the distribution of the high-level ontology categories in two different instruments, and answers the question of what percentage of questions within a particular instrument are mapped to which portions of the ontology. Note that ADI-R covers a high percentage of topics in language ability, stereotyped behavior, and interpersonal interactions, together with a range of other topics represented in the ontology. Not surprisingly, the majority of topics covered in BRIEF concern executive function, but BRIEF also covers stereotyped behaviors, adaptive life skills, and cognitive ability, as well as some other concepts. (See Supplement 4 for coverage representations for the full set of instruments (Online Resource 4).)

Fig. 4 Distribution of ontology categories in 2 different instruments

Table 4 shows for each instrument the distribution of its normalized questions across the three major branches of the ontology, A (‘Personal Traits’), B (‘Social Competence’), and C (‘Medical History’), and the number of concepts in each branch of the ontology that those questions cover. (See Supplement 5 for the concepts covered by each individual instrument (Online Resource 5).)

Table 4 Distribution by instrument of question and concept coverage across the ontology

Figure 5 shows the scores that result when two or more instruments are combined (Footnote 2).

Fig. 5 Objective function, coverage, and unique coverage scores for selected combinations of instruments used in the autism consortium study

Higher objective function scores indicate that there is both good coverage of the ontology concepts and an acceptable amount of time and number of instruments involved. For example, when combining two instruments such as MH (medical history) and CBCL (the Child Behavior Checklist), or MH and VABS-II (the Vineland Adaptive Behavior Scales), the result is a good objective function score. Combining a larger number of instruments, for example, combining four instruments, MH, ADOS (Autism Diagnostic Observation Schedule), BRIEF (Behavior Rating Inventory of Executive Function) and SRS (Social Responsiveness Scale), can also result in a good objective function score if the instruments complement each other in coverage, have minimal overlap, and are administered in a reasonable amount of time. Note that when all instruments are used together, the objective function score is quite low. This is because using all instruments incurs a large penalty due to the large number of overlaps in mapped concepts and the cost involved in using such a large number of instruments.

Higher coverage scores indicate that there is good coverage of the ontology concepts. For example, using a combination of the four instruments MH, CBCL, VABS-II, and Mullen (Mullen Scales of Early Learning) results in a good coverage score as does using a combination of six instruments, MH, SRS, BPASS (Broader Phenotype Autism Symptom Scale), Peds-QL (Pediatric Quality of Life Inventory), CELF (Clinical Evaluation of Language Fundamentals), and BRIEF. Note that the highest coverage score, by far, results when all instruments are used together.

However, because there is extensive overlap in the concepts covered, the unique coverage score when using all instruments together drops significantly. Higher unique coverage scores indicate that the combination of the instruments used involves a low number of overlaps. For example, the combination of MH, VABS-II, and WPPSI-III has the highest unique coverage score, indicating that each of the instruments makes a unique contribution to the overall assessment.

Discussion

The completed ontology reflects the full scope of the ASD behavior phenotype and provides a mapping from each of the more than 5,000 questions that comprise two dozen standardized instruments for ASD to a set of several hundred concepts that comprise the ontology. A review of the extensive autism literature led us to propose a high-level structure for the ontology. The three top-level classes, ‘Personal Traits’, ‘Social Competence’, and ‘Medical History’ together with their immediate subclasses are intended to encapsulate the primary characteristics of the ASD behavioral phenotype.

Personal traits such as cognitive ability, executive function, and language abilities together with evidence of stereotyped, restricted, and repetitive behaviors, the ability to control emotions, and the ability to perform complex motor acts are all evaluated as part of the ASD assessment process. Also important for assessing ASD is the level of social competence exhibited by the individual being evaluated. Deficits in recognizing social norms and cues, particularly in communication, together with deficits in reciprocal social interaction, such as an inability to make eye contact, and general level of ability in age-appropriate life skills such as personal hygiene, and other everyday skills that are needed at home and in the community are all part of the ASD assessment. Finally, medical history includes a comprehensive review of the individual’s background including the circumstances associated with pregnancy and infancy, exposures, such as injuries, hospitalizations, and medications, any current medical symptoms or complications, and an indication of the primary diagnoses together with any additional diagnosed comorbidities.

We used standard metrics to evaluate the ontology both for structure and content, and we defined each concept in the ontology both through its position in the hierarchy and with a textual definition. The textual definition gives interested individuals a fuller understanding of what is meant by each concept than its name alone would provide. Each concept has both a unique identifier and a tree number indicating its place in the hierarchy. Over time, and as more is known about ASD, the tree numbers may change, but the unique identifier will stay constant. Whenever possible, we mapped our concepts to standard ontologies, specifically the Medical Subject Headings, the International Classification of Functioning, Disability and Health, or the Unified Medical Language System. This ensures that the ontology can be used to link to other data sources, including the biomedical literature.
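The separation between a stable identifier and a movable tree number can be sketched as a small record type. The identifier format (`EX:0001`), the dotted tree-number format, the placeholder definitions, and the item key `("INSTR_X", "q12")` are all hypothetical; only the set of fields follows the description above.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    concept_id: str    # stable; never changes once assigned
    tree_number: str   # position in the hierarchy; may change over time
    name: str
    definition: str
    external: dict = field(default_factory=dict)  # e.g. MeSH, ICF, or UMLS mapping

    @property
    def depth(self):
        """Depth implied by an assumed dotted tree number such as 'A.2.1'."""
        return len(self.tree_number.split("."))

# Hypothetical concepts with placeholder definitions.
concepts = {c.concept_id: c for c in [
    Concept("EX:0001", "A", "Personal Traits", "placeholder definition"),
    Concept("EX:0002", "A.2", "Executive Function", "placeholder definition"),
]}

# Item mappings reference the stable identifier, never the tree number.
ITEM_TO_CONCEPT = {("INSTR_X", "q12"): "EX:0002"}

def concept_for(instrument, item):
    return concepts[ITEM_TO_CONCEPT[(instrument, item)]]

def relocate(concept_id, new_tree_number):
    """Move a concept in the hierarchy without touching its identifier."""
    concepts[concept_id].tree_number = new_tree_number

depth_before = concept_for("INSTR_X", "q12").depth
relocate("EX:0002", "A.3.1")
moved = concept_for("INSTR_X", "q12")
print(moved.concept_id, moved.tree_number, depth_before, "->", moved.depth)
```

Because question mappings are keyed by the identifier, reorganizing the hierarchy leaves every existing mapping valid.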

The ontology has been fully integrated with the Autism Consortium database. This means that researchers do not need to know the details of the individual ASD instruments but, rather, can query the database by posing questions that are ontology-based. For example, a researcher can query the database for all individuals who have severe deficits in executive function and then can correlate that with the genetic analysis for those individuals. The genetics researcher is often hampered by the lack of ASD phenotypic information available. Perhaps there is an ADI-R or ADOS score and some demographic information captured for the individual, but not much data beyond that. It is now possible to have a much more granular approach to the various features that comprise ASD. The ontology maintains the granularity, with its 283 features (concepts), while at the same time easing the burden of the researcher by abstracting away from the specifics of each of the instruments.

The instruments studied here differ not only in coverage but also in 1) format and method of investigation, 2) focus, 3) terminology, and 4) granularity. Instruments may involve questionnaires, interviews, or direct assessments by a trained examiner.

Depending on the investigative method, features of an instrument can vary widely. For example, for assessing expressive language, direct examinations may include word lists (Mullen), questionnaires may contain several questions about specific aspects of pronunciation either dispersed throughout the instrument or in a specified section, e.g., CCC-2 and VABS-II, and interviews may contain only one or two specifically related questions, but with many components, thus allowing for interpretive flexibility prior to coding a response, e.g., ADI-R.

The focus of the instruments also varies widely. A minority of the instruments has been specifically designed for autism assessment, including, for example, ADI-R, ADOS, and BPASS. The majority focus, instead, on determining various aspects of neurodevelopment, such as executive function (e.g., BRIEF), language capability (e.g., CELF), IQ (e.g., WPPSI-III), and social interaction skills (e.g., SRS).

Large variation in terminology among the instruments includes both the use of different terms to denote the same behavior, and the same or similar terminology to designate distinct traits. This may be seen quite clearly, for example, through the questions about children’s playing behavior. Instruments vary as to whether, for example, they are investigating playing behavior as it relates to social development, communication, or restricted and unusual interests. Whereas both ADI-R and ADOS include sections that specify a focus on play, ADI-R investigates the individual’s participation and interest in group play, while ADOS investigates the individual’s use of imagination and toys. Similar questions about imaginative play are also included in ADI-R, but they appear in the “Language and Communication Functioning” section.

The granularity of the items in each of the instruments also differs, and often in ways that are not readily apparent. There is also a tension between the granularity of the questions in any specific instrument and the granularity of the concepts in the ontology. Questions in some instruments can be quite detailed in covering a particular phenotypic area, while other instruments may have only a few high level questions that cover that same area. In some cases, the detailed questions indicated important areas for further development of the ontology, while, in other cases, we mapped the detailed questions to higher level concepts that already existed in the ontology. Assessment questions regarding self-inflicted injuries serve as one example. Instruments such as ADI-R, ADOS, VABS-II and RBS-R include varying numbers of relevant questions that use differing terminology to investigate the presence of self-injurious behavior to various degrees (Footnote 3). Where ADOS poses a single general question focusing on “any kind of aggressive act to self”, RBS-R contains a section with eight questions investigating specific types of self-injurious acts (Footnote 4). In this case, the ADOS question and the eight RBS-R questions were all mapped to the concept ‘Self-injurious Behavior’.

By mapping all questions from each of the two dozen instruments to the completed ASD phenotype ontology, we have been able to show the overall focus of each of the instruments (Footnote 5). For each instrument, it is now possible to see at a glance the distribution of the topics it covers (Online Resource 4). For example, it can be seen that D-KEFS has a large percentage of questions treating executive function, but also a relatively large percentage that deals with cognitive ability and language ability. Another smaller percentage deals with recognition of social norms. ADI-R covers a range of ontology concepts, especially in A (‘Personal Traits’) and B (‘Social Competence’), but with variable percentage of coverage in each of those areas. BRIEF, as would be expected, has a large percentage of concepts in executive functioning, but it also covers some stereotyped behaviors, and several areas of social competence, albeit at a smaller percentage. WPPSI-III questions are distributed exclusively across cognitive ability and language ability, while in SRS the majority of questions treat interpersonal interactions and recognition of social norms, but several personal traits, such as stereotyped behavior, executive function, and emotional traits are also covered. This view of each of the instruments might be helpful for ASD and other neurodevelopmental investigators as they think about which set of instruments would be most useful in their particular context.

In the clinical setting, the administration of ASD diagnostic instruments is most often paired with the judgments of a multi-disciplinary team of skilled clinicians (Falkmer et al. 2013; Kim and Lord 2012), and using a small number of complementary instruments is often recommended (Risi et al. 2006; Tomanik et al. 2007; Huerta and Lord 2012). In the research setting, a somewhat larger set of instruments can be considered such that the full range of ASD characteristics is recorded, but at the same time it is important not to subject study participants to undue duplication in questions asked and to excessive administration time (Footnote 6).

To address these issues, we developed a method that allows researchers to assess the optimal set of instruments according to several objective criteria. The most important criterion is the overall coverage with the least amount of overlap in the concepts covered. This is modulated by the mode of administration, with higher value given to the involvement of a trained individual, and adjusted further by the cost of administering multiple instruments, where cost consists both of the time it takes to administer the instruments and of the financial and administrative overhead involved. The underlying assumption is that our ontology is both broad and deep enough in its coverage for research purposes. We also believe that the parameters we identified are the relevant ones to consider when comparing sets of instruments used in combination (Footnote 7). It is, of course, possible to develop other approaches and formulas for assessing the relative importance of each of those parameters, resulting in different absolute scores. However, as mentioned above, the importance lies not in the absolute scores themselves but in how the scores for one set of instruments compare with the scores for another candidate set.
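To make the interplay of these criteria concrete, the following is a purely illustrative sketch of one possible scoring function. It is not the formula used in this work: the functional form, the parameter names, and the weights are all assumptions chosen only to show how coverage, overlap, trained involvement, time, and cost could be combined into a single comparable score.

```python
def objective_score(unique_coverage, overlap, trained_fraction, minutes, cost,
                    w_overlap=1.0, w_trained=0.5, w_time=0.01, w_cost=0.01):
    """Hypothetical objective function for a candidate set of instruments.

    unique_coverage:  number of distinct ontology concepts covered by the set
    overlap:          number of redundant concept assignments across instruments
    trained_fraction: share (0..1) of instruments administered by a trained individual
    minutes, cost:    total administration time and financial overhead
    All weights are arbitrary placeholders, not values from the paper.
    """
    # Reward coverage, penalize overlap, and boost for trained involvement.
    benefit = (unique_coverage - w_overlap * overlap) * (1.0 + w_trained * trained_fraction)
    # Discount by the time and financial burden of administration.
    burden = 1.0 + w_time * minutes + w_cost * cost
    return benefit / burden


# Two hypothetical candidate sets that differ only in overlap.
score_low_overlap = objective_score(100, 5, 0.5, 60, 0)
score_high_overlap = objective_score(100, 20, 0.5, 60, 0)
```

Under any such form, only the ordering of the scores for competing candidate sets is meaningful, which is consistent with the point made above about relative rather than absolute scores.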

The results shown in Fig. 5 are indicative of how the ontology can be used to make the necessary judgments. If there is no major time constraint, and if coverage is paramount, then using all 24 instruments is clearly the best choice. If coverage is paramount, but the investigator would like to minimize unnecessary overlap while also minimizing time of administration, then a smaller number of instruments might be used. That is, value is greater when coverage is better, when there are fewer overlaps in the concepts covered, and when there is relatively more involvement by trained professionals. The value of administering more than one instrument is offset by the time it takes to administer multiple instruments (which has an impact both on the professional who is administering the instruments and on the individual who is undergoing the assessment), and by the cost associated with purchasing and learning a new instrument.

The ontology and objective scoring system can also be used iteratively to determine the best combination of instruments for the purpose at hand. For example, suppose an investigator is considering using ADOS together with the medical history assessment. For this pair, the coverage, unique coverage, and objective function scores are 75, 74, and 25, respectively. This indicates that there is virtually no overlap between the two instruments, but the objective function score is relatively low, and important concepts in the executive function section of the ontology and in certain areas of social competence are not covered. Adding BRIEF addresses the missing executive function concepts and results in coverage, unique coverage, and objective function scores of 108, 102, and 35, respectively. Adding SRS addresses the social competence concepts and results in coverage, unique coverage, and objective function scores of 124, 103, and 33, respectively. In this case, although the objective function score is slightly lower, the coverage is superior, and so this might be a reasonable set of instruments to consider.
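The iterative comparison in this example can be sketched as follows. The sketch assumes that the coverage score counts the concepts covered by each instrument in the set, repeats included, and that the unique coverage score counts distinct concepts; this reading is inferred from the reported numbers rather than stated in this section. The instrument names and concept identifiers are hypothetical.

```python
def coverage_scores(instrument_concepts, selected):
    """Return (coverage, unique_coverage) for the selected instruments.

    instrument_concepts: dict of instrument name -> set of ontology concept ids
    selected:            list of instrument names in the candidate set
    """
    # Coverage: concept assignments summed over instruments (repeats counted).
    coverage = sum(len(instrument_concepts[name]) for name in selected)
    # Unique coverage: size of the union of all covered concepts.
    unique = len(set().union(*(instrument_concepts[name] for name in selected)))
    return coverage, unique


# Toy data (illustrative only): three instruments and five concepts.
toy_concepts = {
    "instrument_a": {"c1", "c2", "c3"},
    "instrument_b": {"c3", "c4"},
    "instrument_c": {"c5"},
}

# Grow the candidate set one instrument at a time and record the scores.
candidate = []
history = []
for name in ["instrument_a", "instrument_b", "instrument_c"]:
    candidate.append(name)
    history.append((tuple(candidate), coverage_scores(toy_concepts, candidate)))
```

The gap between the two scores at each step is the overlap introduced by the newly added instrument, which is the quantity an investigator would weigh against the additional concepts gained.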

Conclusions

Our goal has been to develop a comprehensive phenotype ontology for providing intelligent and flexible access to autism-specific phenotypic data and for comparing the characteristics and coverage of a set of instruments that are used to assess ASD and other neurodevelopmental conditions. We developed a high-level structure for the ontology that is both consistent with established knowledge about the autism phenotype and congruent with the many concepts represented in some two dozen instruments used by the ASD community. In developing the ontology, we have been guided by our collaboration with other researchers in the Autism Consortium, by the extensive literature on autism spectrum disorders, and by the multiple phenotypic instruments in use by the Consortium.

Our analysis of the instruments using the newly created ASD phenotype ontology represents a novel approach to assessing and comparing the characteristics and coverage of the instruments that are routinely used in ASD research and diagnosis. The work reported here may have implications for reducing the number of instruments needed for fully assessing ASD both for research and in the clinic. The ontology also has promise for use in research settings where extensive phenotypic data have been collected, allowing a concept-based approach to identifying behavioral features of importance and for correlating these with genotypic data.

Information Sharing Statement

We developed the ontology using the Protégé ontology editor (http://protege.stanford.edu/). We have included the full ontology in the supplementary materials. We will deposit the ontology on the National Center for Biomedical Ontology (NCBO) BioPortal site (http://bioportal.bioontology.org/) so that it will be openly available to the broader research community. The ontology will also be made available to the National Database for Autism Research at the National Institutes of Health.