Validity of the SNAP-IV For ADHD Assessment in South African Children With Neurodevelopmental Disorders

This study investigated the psychometric properties of the Swanson, Nolan, and Pelham ADHD Rating Scale (SNAP-IV) in a sample of South African children with neurodevelopmental disorders (n = 201), primarily Autism Spectrum Disorder and Intellectual Disability. We conducted a confirmatory factor analysis to inspect the two-factor structure of the SNAP-IV. We also calculated ordinal coefficient alpha to estimate internal consistency. Fit statistics for the two-factor model approached acceptable levels. The model fit improved slightly after removing an item related to spoken language. The subscales had acceptable internal consistencies. Findings partially support the use of the SNAP-IV in this group of children. However, there are limitations to its performance in this population likely related to the presence of neurodevelopmental disorders.

One frequently used ADHD rating scale is the Swanson, Nolan, and Pelham ADHD Rating Scale (4th edition; SNAP-IV; Swanson et al., 2012). In research contexts, the SNAP-IV is typically used with school-based, non-clinical samples (Bussing et al., 2008). However, a few promising studies support the use of the SNAP-IV with children who have ID. Two related studies found that the SNAP-IV subscales had strong psychometric properties in a sample of children with ID. The SNAP-IV had excellent reliability and concurrent validity, demonstrated by large, positive correlations with scores on other ADHD rating scales (Miller, Fee, & Jones, 2004; Miller, Fee, & Netterville, 2004). However, these studies did not investigate the structural validity of the SNAP-IV.
In clinical contexts, the SNAP-IV is sometimes used to aid in the assessment of ADHD in child patients with other NDDs. The SNAP-IV is used clinically in this way in the Western Cape province of South Africa, where, similar to other resource-limited settings, specialist neurodevelopmental services accessible to the majority of the population are frequently over-burdened. In addition, the clinical severity of neurodevelopmental populations presenting to tertiary hospitals in Africa is typically high, resulting in a large number of non-verbal children (Bakare & Munir, 2011). Rating scales are especially useful for assessment purposes in low-resourced clinical settings as they are quick and inexpensive to administer. However, until there is sufficient psychometric evidence to support use in these contexts, results should be interpreted with caution. To the best of our knowledge, there are no published studies to date that investigate the psychometric properties of the SNAP-IV in a sample of children with NDDs other than ID.
The aim of this study was to evaluate the validity of scores derived from the SNAP-IV in a sample of young South African children with NDDs. We used the analyses conducted by Yerys and colleagues (2017) as a template for the current study. We hypothesized that (i) the SNAP-IV items would measure two distinguishable constructs, namely, inattention and hyperactivity-impulsivity, (ii) the SNAP-IV subscale scores would correlate positively with subscale scores of another measure of ADHD-related behaviors, and (iii) the SNAP-IV would have stronger positive correlations with externalizing behaviors than with internalizing behaviors.
al., 2006). This, in turn, makes it difficult to assess whether our approach to measuring ADHD-related behaviors in children with other NDDs is valid. The result is a lack of evidence demonstrating that ADHD rating scales measure the same underlying constructs (i.e., have construct validity) with children who have other NDDs.
There is some evidence suggesting that ADHD rating scales may not be valid for children with ASD. One study evaluating the ADHD Rating Scale-Fourth Edition in a sample of children with ASD (without comorbid ID) found that the scale did not adequately distinguish between inattention and hyperactivity-impulsivity, the two constructs thought to underpin ADHD, in this group (Yerys et al., 2017). A confirmatory factor analysis (CFA) found three items intended to measure inattention, including "Does not listen when spoken to directly" and "Easily distracted", were also associated with the hyperactivity-impulsivity factor in these children. These results suggest that the presence of comorbid ASD may influence the ratings of target (ADHD) behaviors. However, in terms of the overall phenotype described, are presented as DSM-5 symptoms (e.g., "Often is forgetful in daily activities"). For each item, respondents select one of four response options (0 "Not at all", 1 "Just a little", 2 "Quite a bit", or 3 "Very much") that best describes the child's behavior over the past year. Subscale and total scores are calculated as an average score across relevant items. We included a "not applicable" (N/A) response option to capture questions relating to speech in non-verbal children.
We obtained Afrikaans and Xhosa translations of the SNAP-IV from another South African research group that had previously used the SNAP-IV (Zeegers et al., 2010). Two Afrikaans-speaking and three Xhosa-speaking members of the NeuroDEV South Africa research team independently reviewed the Afrikaans and Xhosa translations respectively to confirm that the translations were tapping into the intended constructs. The Afrikaans review team consisted of a senior research assistant with a neuropsychology background (B.U.C.) and a pediatric research nurse. The Xhosa review team comprised two research assistants with a psychology background (including M.R.Z.), and a pediatric research nurse. Each team consolidated their suggestions for revisions (minor revisions to the wording of items) to create the final translated versions.
To estimate the prevalence of clinically significant ADHD symptoms in this sample, we set cut-off scores that were aligned with the DSM-5 diagnostic criteria for ADHD. The DSM-5 requires an individual to exhibit at least 6 symptoms in at least one domain (inattention or hyperactivity-impulsivity) to qualify for a diagnosis of ADHD. Pelham and colleagues (1992) recommend defining the presence of a symptom by a score of 3 ("Very much") on the SNAP-IV items, as it produces prevalence rates similar to those reported in the general population. Therefore, we estimated the prevalence of ADHD symptomatology in the sample by calculating the number of participants who scored a '3' on at least six items in at least one domain.
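The cut-off rule described above can be illustrated with a short sketch. This is not the authors' code: the function names are hypothetical, Items 1-9 and 10-18 are taken as the Inattention and Hyperactivity-Impulsivity items respectively, and N/A responses (represented here as `None`) are assumed not to count as symptoms.

```python
from typing import Optional, Sequence

IN_ITEMS = range(0, 9)    # SNAP-IV Items 1-9 (Inattention), zero-based indices
HI_ITEMS = range(9, 18)   # SNAP-IV Items 10-18 (Hyperactivity-Impulsivity)
SYMPTOM_SCORE = 3         # a symptom is "present" only when rated 3 ("Very much")
MIN_SYMPTOMS = 6          # DSM-5 requires at least six symptoms in a domain


def count_symptoms(responses: Sequence[Optional[int]], items: range) -> int:
    """Count items in one domain rated 3; N/A (None) is assumed not to count."""
    return sum(1 for i in items if responses[i] == SYMPTOM_SCORE)


def classify_presentation(responses: Sequence[Optional[int]]) -> str:
    """Apply the six-symptom cut-off to one child's 18 item responses."""
    if len(responses) != 18:
        raise ValueError("expected 18 SNAP-IV item responses")
    inattentive = count_symptoms(responses, IN_ITEMS) >= MIN_SYMPTOMS
    hyperactive = count_symptoms(responses, HI_ITEMS) >= MIN_SYMPTOMS
    if inattentive and hyperactive:
        return "combined"
    if inattentive:
        return "inattentive"
    if hyperactive:
        return "hyperactive-impulsive"
    return "below cut-off"


# Hypothetical respondent: six Inattention items rated 3, all others rated 0.
print(classify_presentation([3] * 6 + [0] * 12))
```

Under this rule, the prevalence estimate is the share of participants whose classification is anything other than "below cut-off".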

Child Behavior Checklist for Ages 6-18 (CBCL/6-18)
The CBCL/6-18 is a self-report questionnaire designed to assess specific problematic behaviors in school-age children, as reported by caregivers (Achenbach & Rescorla, 2001). It is considered a "gold standard" tool for assessing behavioral problems in children and has been validated in over 30 countries (Ivanova et al., 2007). The items listed in the CBCL/6-18 are designed to align with the DSM-5 diagnostic criteria for a number of behavioral disorders, including ADHD. This study was concerned with four subscales: The Attention Problems syndrome scale (10 items), the ADHD DSM-oriented scale (7 items), the Aggressive Behavior syndrome scale (18 items), and the Withdrawn/Depressed syndrome scale (8 items). We used the latter two subscales

Participants
This study was embedded within a larger project titled "NeuroDEV South Africa", which administers the SNAP-IV to parents of case children (children with NDDs) aged 6-17 years (de Menil et al., 2019). This large-scale, multisite study aims to investigate the genetic architecture and phenotypic manifestations of NDDs in African populations. Figure 1 outlines the sample selection process for the current study within the context of the larger NeuroDEV South Africa study.
We invited parents of children attending outpatient developmental, genetic, speech, and neurology clinics at two tertiary hospitals in Cape Town to participate in the NeuroDEV South Africa study. Children were eligible for inclusion as cases if they had a confirmed diagnosis of one or more of the following Diagnostic and Statistical Manual of Mental Disorders 5th edition (DSM-5) NDDs: ADHD, ASD, SLD, ID or Communication Disorder (CD; American Psychiatric Association, 2013). Children with a primary neuromotor disability (e.g., cerebral palsy) were not eligible to participate. We analyzed data from all case children aged 6 years and older who were enrolled in the NeuroDEV South Africa study between August 2018 and November 2021, except for one participant who did not complete the SNAP-IV.

Demographics Questionnaire and Asset Index
Parents completed a demographic questionnaire, which included questions about the child's and parents' home languages. The Asset Index is a socio-demographic questionnaire adapted from previous South African studies, including questions regarding parental educational attainment and household income (Myer et al., 2008; Stein et al., 2015).

Swanson, Nolan and Pelham Questionnaire ADHD Rating Scale (SNAP-IV; Parent Form)
The SNAP-IV ADHD rating scale is an 18-item self-report questionnaire designed to measure symptoms of ADHD (Swanson et al., 2012). The items are consistent with the DSM-5 ADHD diagnostic criteria and are designed to distinguish between different symptom presentations of ADHD, namely, inattentive, hyperactive-impulsive, and combined (both inattentive and hyperactive/impulsive). The subscales are named accordingly: 'Inattention' (IN, 9 items) and 'Hyperactivity-Impulsivity' (HI, 9 items). The items observed variables (indicators) and latent variables (factors). CFA is used to confirm an a priori specification (i.e., hypothesis) about the underlying structure of a tool (Kline, 2016). We used global and local fit statistics to determine the usefulness of the model (Brown, 2015). The chi-square statistic (χ²) is a test of exact fit between the model and the data. The Tucker-Lewis Index (TLI) conveys information about the "goodness of fit" between the model and the data. A higher value indicates better fit, and should ideally exceed 0.95 (i.e., the specified model should improve the fit by 95% relative to a baseline (null) model). The Root Mean Square Error of Approximation (RMSEA) is a measure of approximate fit, where 0 indicates a perfect fit and 0.05 indicates close fit. The Standardised Root Mean Square Residual (SRMR) represents the average difference between the observed and predicted correlation matrices (i.e., the average residual correlation). Like the RMSEA, SRMR values close to 0 suggest good fit. Modification Indices (MIs) approximate the amount by which the chi-square value would decrease (i.e., model fit would improve) if an unspecified parameter were to be estimated. MIs greater than 3.84 indicate a statistically significant decrease in the chi-square statistic (p < 0.05 with one degree of freedom), or improved model fit.
Each MI is associated with a standardized expected parameter change (SEPC), the magnitude and direction of which approximates how much the parameter is expected to change if it were to be estimated. Standardised coefficients (also known as 'factor loadings') are estimates of direct effects between latent and observed variables. The squared factor loadings indicate the proportion of variance in each observed variable that is explained by the latent factor.
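To make these quantities concrete, the sketch below computes three of them with the Python standard library. It is an illustration rather than the authors' procedure: the function names are hypothetical, and the RMSEA function uses the plain point-estimate formula with N - 1 in the denominator, whereas software using robust estimators may report a scaled value. Applied to the one-factor model's statistics, χ²(135, N = 201) = 341.38, it gives 0.087.

```python
import math
from statistics import NormalDist


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a model chi-square (uses N - 1)."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def mi_critical_value(alpha: float = 0.05) -> float:
    """Chi-square critical value for 1 degree of freedom.

    A 1-df chi-square variable is a squared standard normal variable, so the
    critical value is the squared two-sided normal quantile.
    """
    return NormalDist().inv_cdf(1 - alpha / 2) ** 2


def variance_explained(loading: float) -> float:
    """Proportion of an indicator's variance explained by its factor."""
    return loading ** 2


print(round(rmsea(341.38, 135, 201), 3))   # 0.087
print(round(mi_critical_value(), 2))       # 3.84
print(round(variance_explained(0.40), 2))  # 0.16 (0.40 is a hypothetical loading)
```

The second value shows where the 3.84 threshold for modification indices comes from. The third shows that an item with 16% explained variance corresponds to a standardized loading of about 0.40.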
Given our chosen statistical analyses, we conducted a priori power calculations to determine the minimum sample sizes required to obtain reasonable power in our two main analyses. Using the 'semPower' package in R, we determined that, to detect an RMSEA of 0.05 with 134 degrees of freedom at powers of 80% and 90%, sample sizes of 139 and 168 respectively are required (Moshagen & Erdfelder, 2016). For the correlation, an a priori power analysis using G*Power software indicated that to detect an effect size of 0.30 (conservatively selected) with a one-tailed test (α = 0.05) at powers of 80% and 90%, the required numbers of observations are 64 and 88 respectively (Faul et al., 2007).
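As a rough cross-check on the correlation power analysis, the required sample size can be approximated with the Fisher z transformation. This is a textbook approximation under a bivariate-normal assumption, not the routine G*Power uses, so its results can differ by a few observations from the figures reported above. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(r: float, alpha: float = 0.05, power: float = 0.80,
                      one_tailed: bool = True) -> int:
    """Approximate sample size needed to detect a correlation of size r.

    Uses the Fisher z approximation: n = ((z_alpha + z_beta) / atanh(r))^2 + 3.
    """
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha if one_tailed else 1 - alpha / 2)
    z_beta = normal.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)


for target_power in (0.80, 0.90):
    print(target_power, n_for_correlation(0.30, power=target_power))
```

The approximation makes the main drivers visible: the required sample size grows with the target power and shrinks quickly as the expected correlation increases.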

Results
The sample included 201 child participants aged 6-17 years (M = 8.16, SD = 2.61). Table 1 presents the sociodemographic characteristics of the sample, as well as information related to DSM-5 diagnoses, level of spoken language, as measures of externalizing and internalizing behaviors respectively. We obtained licenses to administer the ASEBA Afrikaans and Xhosa translations of the CBCL/6-18. For each item, parents rated the frequency/severity of their child's behavior in the past six months using a three-point scale: 0 ('not true'), 1 ('somewhat or sometimes true'), or 2 ('very or often true'). Subscale scores were calculated by summing the scores of all relevant items. The CBCL/6-18 is widely used in research with children who have ASD. There is some evidence to support the subscales' validity and reliability in this population (Dovgan et al., 2019; Pandolfi et al., 2012, 2014).

Procedure
We obtained informed consent from parents of child participants. If developmentally appropriate, we obtained assent from child participants over the age of 12 years. A member of the data collection team (comprising medical doctors, research nurses, and psychologists) verbally administered the SNAP-IV to all parents as well as the CBCL/6-18 subscales to a subset of parents, in their preferred language. Administration of the CBCL/6-18 began in May 2019, nine months after data collection for the umbrella study commenced. Although we did not record the language of administration for specific tools, anecdotal reports indicate that the majority of participants elected to complete the questionnaires in English.
The University of Cape Town's Human Research Ethics Committee approved this research (367/2019) as part of the NeuroDEV South Africa study (810/2016). The NeuroDEV study also received ethics approval from the Harvard T. H. Chan School of Public Health Institutional Review Board (17-1260) and the Western Cape Department of Health, South Africa.

Statistical analyses
We performed all statistical analyses in RStudio (Version 1.3.1093) for R (Version 4.0.2; R Core Team, 2019). We used the 'psych' package to conduct exploratory data analysis (Revelle, 2020). To estimate the concurrent validity, convergent validity, and discriminant validity of the SNAP-IV, we computed Pearson's correlation coefficients between SNAP-IV subscale scores and CBCL/6-18 subscale scores, and used the 'cocor' package to statistically compare correlation coefficients (Diedenhofen & Musch, 2015). We estimated the internal consistency reliability of the SNAP-IV subscales using ordinal coefficient alpha (Zumbo et al., 2007). To test the expected two-factor structure of the SNAP-IV, we conducted a CFA using the 'lavaan' package (Version 0.6-5; Rosseel, 2012). CFA examines relationships between and highest level of education. Most children came from low- and middle-income families that spoke one of the three main languages spoken in the Western Cape of South Africa. The predominance of male participants (> 70%) is typical of study samples comprising children with NDDs (Springer et al., 2013).
Seventy-five child participants (37%) had more than one NDD diagnosis. The most frequent diagnoses were ID and ASD. ASD and ID were comorbid in 49 participants (24% of the sample). In this sample, CD was primarily comorbid with ID, with only five children having CD as a primary diagnosis. Twenty-two children (11%) were diagnosed with ADHD, which in all but two participants was comorbid with at least one other NDD, including ASD (n = 4), ID (n = 11), ASD and ID (n = 3), CD (n = 2), or SLD (n = 1). Figure S1 in the supplement is a Venn diagram displaying overlap between clinical DSM-5 diagnoses. One hundred and twenty-one children (60%) had delayed (i.e., non-fluent) speech. Sixty-six (33%) children were not enrolled in a formal schooling system. Table 2 presents a summary of the sample's SNAP-IV scores, as well as item response frequencies. Figures S2 and S3 in the supplement display the distribution of subscale scores. Using our tentative cutoff scores for the SNAP-IV, 69 children (34%) exhibited clinically significant symptoms of ADHD. Twenty-five children (12%) met the cut-off for a predominantly inattentive presentation of ADHD, 33 (16%) for a predominantly hyperactive/impulsive presentation, and 11 (5%) for a combined presentation. Of the 22 participants with a confirmed ADHD diagnosis, 3 (14%) met the criterion for a predominantly inattentive presentation, 4 (18%) for a predominantly hyperactive/impulsive presentation, and 5 (23%) for a combined presentation. On average, respondents strongly endorsed the SNAP-IV items, with percentages of "quite a bit" or "very much" responses ranging from 22% to 72% (M = 57.40, SD = 13.55) and percentages of "very much" responses ranging from 11% to 55% (M = 35.88, SD = 11.04). Response patterns did not differ substantially by primary diagnosis (see Figure S3 in the supplement).
Eleven items had small proportions of N/A responses, while Items 15 ("Talks excessively") and 16 ("Blurts out answers") had 51 (25%) and 61 (30%) N/A responses respectively.
We conducted a CFA specifying a model (Model 1) with two latent factors, Inattention and Hyperactivity-Impulsivity,
1 (0.50) Other 2 (1.00)
Note. 'Other' home languages included Shona (n = 4), Chichewa (n = 2), Swahili (n = 1), and Lingala (n = 1). All participants who spoke the aforementioned 'other' languages also spoke English at home. At the time of writing, $1.00 ≈ ZAR15.00. Information about the child's diagnosis, language level, and education were extracted from the child's medical records by a medical officer. DSM-5 = Diagnostic and Statistical Manual of Mental Disorders, 5th ed. *Percentages do not add up to 100 as children may have more than one diagnosis.
and 8 ("Distracted by extraneous stimuli", MI = 12.09, SEPC = 0.432) with the Hyperactivity-Impulsivity factor. We compared Model 1 to a nested one-factor model (Model 1a) to determine if variance in the SNAP-IV indicators is better explained by a single latent factor, "ADHD". However, Model 1a's fit was poorer than that of the original two-factor model, χ²(135, N = 201) = 341.38, p < 0.001, TLI = 0.859, RMSEA = 0.087.
Taken together, these findings indicate that Item 15 ("Talks excessively") may be a poor item in this sample. We considered that this item's poor performance may have been due to its irrelevance for the non-verbal children in this sample. Hence, we ran a second model (Model 2) with a subset of participants who had acquired at least phrase speech (n = 164). However, Item 15's factor loading remained relatively low (see Table 5). We then reran Model 1 without Item 15 (Model 3, N = 201), which slightly improved the model fit (see Table 4). This final model (Model 3) specified two related latent factors, Inattention and Hyperactivity-Impulsivity, with 9 and 8 indicators respectively (see Figure S6 in the supplement). MIs suggesting possible cross-loadings of Items 3 ("Does not listen"; MI = 14.89, SEPC = 0.487) and each with nine indicators (SNAP-IV Items 1-9 and Items 10-18 respectively), as well as an additional path estimating the covariance between the two latent factors. Table 4 presents details of the model fit and Table 5, the standardized coefficients (factor loadings). The approximate model fit indices (TLI, RMSEA, and SRMR) were approaching acceptable levels. There was a significant correlation of 0.748 (p < 0.001) between the two latent factors. In general, the factors explained relatively high proportions of variance within their respective items. An exception was Item 15 ("Talks excessively"), of which Hyperactivity-Impulsivity explained only 16% of the variance.
An inspection of the modification indices and a residual correlation matrix revealed several correlated residuals (r > absolute value of 0.1, see Figure S5, a residual correlation matrix for Model 1, and Table S1, both in the supplement). For example, Items 15 ("Talks excessively") and 16 ("Blurts out answers"; MI = 9.96, r = 0.24) had a large positive residual correlation, suggesting that the specified model is not fully accounting for the covariance between these two items. MIs suggested possible "cross-loadings" of Items 3 ("Does not listen"; MI = 16.44, SEPC = 0.487)

Table 3
Polychoric Correlation Matrix and Item-Total Correlations for the SNAP-IV ADHD Rating Scale (N = 201)
Note. Correlation coefficients within the black borders are cross-subscale correlation coefficients (i.e., between items on the Inattention and Hyperactivity-Impulsivity subscales). Some correlation coefficients are based on n < 201 due to non-applicable responses (see Table 2). ITC = Item-total correlation. ITCs were calculated using total subscale scores (Inattention OR Hyperactivity-Impulsivity), not total SNAP-IV scores. For example, ITC for Item 1 represents the Pearson correlation coefficient between Item 1 scores and Inattention subscale scores (minus Item 1).
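The item-total correlation described in the note (an item correlated with its subscale total minus that item) can be sketched as follows. The function names and the toy data are hypothetical, and the sketch assumes complete responses, so N/A responses would need to be excluded beforehand.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corrected_item_total(scores: List[List[int]], item: int) -> float:
    """Correlate one item with the subscale total computed without that item.

    `scores` holds one row per respondent, with one column per subscale item.
    """
    item_scores = [row[item] for row in scores]
    rest_totals = [sum(row) - row[item] for row in scores]
    return pearson(item_scores, rest_totals)


# Hypothetical responses from five caregivers to a three-item subscale (0-3).
toy_scores = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3], [1, 0, 1]]
print(round(corrected_item_total(toy_scores, 0), 3))
```

Removing the item from the total avoids inflating the correlation by correlating the item with itself.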
Given evidence to support two correlated unidimensional factors, we calculated subscale means as well as internal consistency reliability statistics for the Inatten-
Both chi-square (χ²) and RMSEA significance tests are accept-support tests. Estimate is significant at *p < 0.05, **p < 0.01, ***p < 0.001. All models measure two latent variables, 'Inattention' ('IN') and 'Hyperactivity-Impulsivity' ('HI'). Covariance between the two latent variables is also specified. Estimation method is robust weighted least squares with polychoric correlations, which is recommended for ordinal data (Li, 2016). a Model 1: All 18 items - IN (9) and HI (9)
in the sample who met the cutoffs for clinically significant ADHD-related behaviour. This finding highlights the importance of using clinical interview techniques alongside behavioural screening tools to disentangle symptoms that may present similarly, albeit with different underlying causes. Thorough questioning around the context of these ADHD-related behaviours will shed light on whether a symptom is indeed suggestive of ADHD or better explained by another NDD (e.g., ASD).
Moderate, positive relationships between the SNAP-IV and CBCL/6-18 ADHD-related subscales supported the concurrent validity of the SNAP-IV. It is worth noting that the SNAP-IV and CBCL/6-18 are similar in terms of item content and response format. Hence it is possible that the correlation coefficients may have been inflated by common method variance. The SNAP-IV also demonstrated convergent and discriminant validity when correlating ADHD-related behaviors with externalizing and internalizing behaviors respectively. Combined ADHD (SNAP-IV) was more strongly associated with aggressive (externalizing) behavior than with withdrawn/depressed (internalizing) behavior, as measured by the CBCL/6-18. It is likely, though, that sample-related factors affected responses to the CBCL/6-18. For example, items such as "Teases a lot", "Threatens people" (Aggressive Behavior), "Refuses to talk" and "Secretive" (Withdrawn/Depressed) were not applicable to children who were non-verbal. Perhaps the pre-school version of the CBCL, the CBCL/1.5-5, would have been a more appropriate measure of internalizing and externalizing behaviors, given the average developmental level of the children in this sample.
Most children in the current sample with a confirmed ADHD diagnosis did not meet the SNAP-IV cutoff for clinically significant ADHD-related behaviours. The DSM-5 requires that symptoms of ADHD be consistent with an individual's developmental level. The SNAP-IV was designed for use with typically developing (i.e., non-clinical) school-aged children without serious comorbid conditions (Swanson et al., 2012). The average child in the current sample was chronologically young, had some degree of language delay, and was not attending a mainstream school. Some of the SNAP-IV items, especially those indicating inattention (e.g., "Avoids tasks requiring sustained mental effort"), may not have been relevant to children not yet enrolled in a schooling system where such behaviors are typically expected. In other words, the behaviors indicated by the items (e.g., sustaining attention, modulating verbal activity) may not have been appropriate to expect of a child, taking into account their developmental age and the presence of specific cognitive deficits. The SNAP-IV may therefore be less useful as a measure of ADHD for young children with developmental delay. Notwithstanding the somewhat
CBCL/6-18 Aggression Problems subscale (r = 0.48, 95% CI = 0.35-0.59, p < 0.001), and the CBCL/6-18 Withdrawn/Depressed scores (r = 0.17, 95% CI = 0.02-0.31, p = 0.025). The difference between the latter two correlation coefficients was significant (z = 3.87, p < 0.001), demonstrating good convergent and discriminant validity.

Discussion
The primary aim of the study was to investigate the construct validity of the SNAP-IV in a sample of children with NDDs. A CFA partially supported the claim that the scale measures two factors, Inattention and Hyperactivity-Impulsivity, barring one item that did not seem to measure either factor clearly, as well as two items that "cross-loaded" onto another factor.
Item 15 ("Talks excessively") had a small factor loading. In this sample, excessive talking was likely an indicator of a child's verbal abilities, rather than a measure of hyperactive/impulsive behavior. From a statistical perspective, the large number of N/A responses for Item 15, likely due to the substantial proportion of the sample with limited expressive language, may also have contributed to the item's weak factor loading. We were not able to use multiple imputation techniques to counteract the effects of missing data points, as the data were not missing at random. The large positive residual correlation between Items 15 and 16 may further account for the item's poor performance in the model. The model did not sufficiently account for the observed variance shared between these two items. It is possible that similar response patterns for these items may have contributed to the unexpectedly large amount of shared variance (i.e., a caregiver of a non-verbal child who responded to Item 15 with N/A likely responded to Item 16 in the same way).
Items 3 ("Does not listen when spoken to directly") and 8 ("Distracted by extraneous stimuli") were also consistently problematic across all models. The cross-loadings suggest that these behaviours were not good measures of inattention in this sample. It is possible that parents of children with ASD endorsed these items due to impairments in social communication rather than difficulties with inattention and impulsivity/hyperactivity specifically. For example, the wording of the item "easily distracted by extraneous stimuli" may be interpreted as an impairment of social reciprocity, a core symptom of ASD (e.g., a child not making direct eye contact with the person they are communicating with) or inattentiveness, a core symptom of ADHD (e.g., a child being unable to sustain attention during one activity or another). Misinterpretation of item wordings may result in over-endorsement of items or "false positives", perhaps explaining the higher-than-expected proportion of children have likely provided more accurate and reliable estimates of convergent and discriminant validity of the SNAP-IV.

Summary and Conclusion
Clinicians working in low-resource settings in the Western Cape often administer the SNAP-IV to evaluate ADHD symptoms in children with NDDs. Data derived from the SNAP-IV have important implications for referral and the provision of interventions for comorbid ADHD. However, little is known about its validity in the NDD population, especially in children with moderate to severe ID and language delay as part of their phenotype. The primary aim of this study was to evaluate the validity of the SNAP-IV in a sample of children with NDDs using CFA techniques. The results of the CFA partially supported the validity of the SNAP-IV in this sample. However, there were limitations to its performance related to the clinical characteristics of the sample. The presence of developmental delay and specific cognitive deficits associated with one or more NDDs likely influenced ratings of target ADHD behaviors. Importantly, some SNAP-IV items may be unsuitable measures of hyperactivity-impulsivity in children without functional speech. Other items, especially those measuring inattention, may be less relevant to children not yet enrolled in a formal schooling system. Additional studies are needed to determine whether existing DSM-based ADHD rating scales are able to capture inattentive and hyperactive/impulsive behaviors in the NDD population with sufficient precision. Refining and rewording some SNAP-IV items may be warranted to improve measurement accuracy.

Acknowledgement and sources of support
We extend our gratitude to the families who participated in the study. We also acknowledge the important work of the NeuroDEV South Africa data collection team. The NeuroDEV study is supported by the Stanley Center for Psychiatric Research at the Broad Institute and the National Institute of Mental Health (1U01MH119689-01). The NeuroDEV study also received funding through a grant from the Simons Foundation/SFARI (599648, E.R. and K.A.D.). In addition, M.R.Z. received fellowship funding from the Harry Crossley Foundation.

Conflict of Interest
The authors report no conflicts of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended
supportive results of the CFA (especially given the significant differences between the 'target' sample and the current sample), the clinical context in which the tool is being administered has important implications for the interpretation of items and responses.
The framework for the current study's analyses was based on a previous study by Yerys and colleagues (2017). Although the two study samples were very different in terms of chronological age, developmental diagnoses, and cognitive functioning, the findings were relatively consistent. Notably, both studies found that existing ADHD rating scales may not adequately tap into the latent constructs of inattention and hyperactivity-impulsivity in children with NDDs. Two items ("Does not listen when spoken to directly" and "Distracted by extraneous stimuli") seemed to be poor indicators of inattention in both samples. In addition, both studies demonstrated that the ADHD scales had good construct validity when correlated with internalizing and externalizing behaviors. Overall, this study upholds Yerys and colleagues' conclusion that standard ADHD rating scales may not detect ADHD symptoms with sufficient precision in children with neurodevelopmental disorders.

Limitations
This study had three major limitations. Most of the children in this sample were chronologically young, were diagnosed with ASD or ID, and had some degree of cognitive or language delay. Although the sample was representative of children attending neurodevelopmental clinics in sub-Saharan Africa (Bakare & Munir, 2011; Springer et al., 2013), these results cannot be generalized to NDD populations with different proportions of severe and non-verbal cases. A larger sample of children with NDDs, stratified by diagnosis, chronological age, and degree of language delay, would have allowed a more thorough evaluation of the validity of the SNAP-IV in the NDD population. Another important limitation was the relatively small sample size. CFA techniques often require large sample sizes to ensure precision and generalizability of the results (Kline, 2016). Chi-square tests, fit indices, parameter estimates, modification indices, and standardized residuals are all sensitive to sample size (Kyriazos, 2018). Larger samples are often recommended, especially when models are complex, data are not normally distributed, and when data are missing (Kline, 2016). Although a sample size of at least 200 is generally considered sufficient, a larger sample would likely produce a more stable solution. Finally, including a variety of behavioral measures with different measurement methods would