Introduction

Although the prevalence of childhood visual impairment (VI) is low [1], it has lifelong and far-reaching implications, for both children and their parents. According to parents of children with VI in the age band of 0–2 years and professionals with expertise in VI for this particular age group, sensory and general developmental issues related to attachment and well-being were among the most important concerns [2].

In the Netherlands, low vision services offer guidance such as developmental and behavioral interventions to overcome challenges related to vision loss. One of the most important outcomes of low vision services in children with visual impairment is participation, which for young children usually takes place in the family context [3]. In order to structure the process of identifying needs of children and their parents, the Participation and Activity Inventory for Children and Youth (PAI-CY) was recently developed involving end-users as stakeholders [2]. To aid interpretation, four different questionnaires were developed to reflect the developmental age bands of children as set by the World Health Organization (WHO). The PAI-CY should lead to a patient-based assessment of the impact of VI on functioning and participation and should initiate shared decision-making about interventions needed. Results from a pilot study showed that parents were mostly satisfied with the PAI-CY, whereas professionals suggested some changes which were incorporated in the next version [4]. The current study aims to investigate the psychometric properties of the PAI-CY 0–2 years.

Methods

Participants and procedure

Parents/caretakers (parents for brevity) of children aged 0–2 years registered at two Dutch low vision services were invited to participate. Parents who agreed to participate completed questions regarding socio-demographic and clinical information, the PAI-CY 0–2 years, and an evaluation form. Two weeks later [5], parents completed a retest. Ophthalmic diagnosis, visual acuity, and visual field of the children were retrieved from patient files, although the very young age of the children might make these data less accurate; missing values were supplemented with self-reported data from parents (n = 10). VI was classified into five levels based on the better-seeing eye, corresponding to the WHO criteria [6].

PAI-CY 0–2 years

The preliminary version of the PAI-CY 0–2 years comprises 31 items categorized into seven domains (for descriptive purposes only, in order to provide contextual meaning): attachment, stimulus processing, visual attention, orientation, play, mobility, and communication that were informed by qualitative data from parents and concept-mapping workshops with professionals [2]. Each item is scored on a 4-point Likert scale with the following response options: not difficult (1), slightly difficult (2), very difficult (3), and impossible (4). The response option not applicable was treated as a missing value.

Statistical analyses

Item analyses were conducted by examining missing responses and response category distributions. Items with missing scores > 20% were considered for elimination, as were items with > 70% of the respondents endorsing the first or last response category (i.e., floor or ceiling effects) and items having no answer in one of the response categories. Items showing inter-item correlations > 0.8, indicating potential redundancy, were also considered for elimination, as were items with an item-total correlation < 0.3. Cronbach’s alpha was calculated to evaluate internal consistency reliability.
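The screening rules above can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in R and SPSS); the function names, the use of None for missing or "not applicable" responses, and the complete-case handling in Cronbach's alpha are assumptions made for illustration only.

```python
from statistics import variance


def screen_item(scores, n_categories=4, max_missing=0.20, max_extreme=0.70):
    """Flag one item against the screening cut-offs.

    `scores` holds integers 1..n_categories, or None for a missing response
    (hypothetical encoding, chosen for this sketch).
    """
    observed = [s for s in scores if s is not None]
    missing_rate = 1 - len(observed) / len(scores)
    counts = [observed.count(c) for c in range(1, n_categories + 1)]
    n = max(len(observed), 1)  # avoid division by zero for an empty item
    return {
        "missing": missing_rate > max_missing,
        # more than 70% of respondents in the first or last category
        "floor_or_ceiling": counts[0] / n > max_extreme or counts[-1] / n > max_extreme,
        # at least one response category never used
        "empty_category": 0 in counts,
    }


def cronbach_alpha(rows):
    """Cronbach's alpha from complete cases (each row is one respondent)."""
    complete = [r for r in rows if None not in r]
    k = len(complete[0])
    item_variances = sum(variance(col) for col in zip(*complete))
    total_variance = variance([sum(r) for r in complete])
    return k / (k - 1) * (1 - item_variances / total_variance)
```

An item flagged by any rule is only a candidate for elimination; as in the text, the final decision also weighs content relevance.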

An item response theory (IRT) model was subsequently applied. Items violating basic assumptions were considered for elimination [7,8,9,10]. Unidimensionality [11] was assessed by performing an eigenvalue decomposition on the matrix of robust (Spearman) correlations between the items. A difference approximation to the second-order derivatives along the eigenvalue curve (scree plot) was calculated. This acceleration approximation indicates points of abrupt change along the eigenvalue curve, and the number of eigenvalues before the point with the most abrupt change (the point with the maximum acceleration value) represents the number of latent dimensions that dominate the information content [7]. Subsequently, a principal component analysis (PCA) was performed to check whether all items loaded on a single component (where the component is taken as a proxy for the latent trait). The magnitude of the principal components was checked. Item pairs with excess covariation (> 0.25), signaling local dependence, were flagged. Monotonicity was assessed using Mokken scale analysis. The resulting graphs were visually inspected, and a Loevinger H coefficient was calculated to assess scalability (< 0.3 was considered unsatisfactory) [8,9,10]. The graded response model (GRM) was used [12, 13]; a full model was compared with a constrained model [11, 14, 15], nested within the full model, with equal slope parameters across items. The models were compared using a likelihood ratio test (LRT). Relevant model fit indices were checked [16, 17]. Individual item performance was examined by assessing item fit using the X² statistic. In addition, item and test information curves [11, 18, 19] were computed as well as an item-person map [20].
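The acceleration criterion described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' R code; the function name is hypothetical, and the eigenvalues are assumed to have been obtained already from the Spearman correlation matrix.

```python
def acceleration_factor(eigenvalues):
    """Return (n_dimensions, accelerations) for a scree curve.

    The acceleration at an interior point i is the second difference
    ev[i + 1] - 2 * ev[i] + ev[i - 1]. The number of retained dimensions
    is the number of eigenvalues that precede the point of maximum
    acceleration.
    """
    ev = sorted(eigenvalues, reverse=True)
    if len(ev) < 3:
        raise ValueError("at least three eigenvalues are needed")
    accel = [ev[i + 1] - 2 * ev[i] + ev[i - 1] for i in range(1, len(ev) - 1)]
    # accel[0] belongs to ev[1], so the elbow index in ev is argmax + 1
    elbow = 1 + max(range(len(accel)), key=accel.__getitem__)
    # eigenvalues ev[0] .. ev[elbow - 1] lie before the elbow
    return elbow, accel
```

For a scree curve with one dominant eigenvalue followed by a flat tail, the maximum acceleration falls on the second eigenvalue, so a single dimension is retained.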

Known-group validity [5] was investigated using independent samples t tests and ANOVAs with post hoc Tukey tests for the following characteristics: gender, comorbidity, age (median split: ≤ 20 months vs. > 20 months), and level of VI according to the WHO criteria [6]. Test–retest reliability at item level was investigated using weighted kappa and percentage agreement [5, 21, 22]. The intraclass correlation coefficient (ICC) for thetas calculated on test–retest data was based on absolute agreement in a two-way mixed-effects model.
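For readers unfamiliar with the item-level agreement measures, the following Python sketch shows one common way to compute them. The weighting scheme is an assumption: the text does not state whether linear or quadratic weights were used, so the sketch exposes this as a parameter (power=1 for linear, power=2 for quadratic). Function names and the None encoding for missing responses are hypothetical.

```python
def percent_agreement(test, retest):
    """Percentage of respondents giving the identical answer twice."""
    pairs = [(a, b) for a, b in zip(test, retest) if a is not None and b is not None]
    return 100 * sum(a == b for a, b in pairs) / len(pairs)


def weighted_kappa(test, retest, n_categories=4, power=1):
    """Cohen's weighted kappa for ordinal scores 1..n_categories.

    Uses disagreement weights (|i - j| / (k - 1)) ** power, so that
    kappa = 1 - (weighted observed disagreement) / (weighted expected
    disagreement). Undefined when all answers fall in one category.
    """
    pairs = [(a, b) for a, b in zip(test, retest) if a is not None and b is not None]
    n, k = len(pairs), n_categories
    observed = [[0.0] * k for _ in range(k)]
    for a, b in pairs:
        observed[a - 1][b - 1] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        return (abs(i - j) / (k - 1)) ** power

    d_obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1 - d_obs / d_exp
```

Percentage agreement ignores how far apart two discordant answers are, whereas weighted kappa penalizes larger discrepancies more heavily and corrects for chance agreement, which is why the two are reported side by side.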

All statistical analyses were conducted in RStudio [23] and SPSS version 22 [24].

Results

Of an estimated 290 invited parents, 131 provided written informed consent to participate and completed the first questionnaire (45%). Data from participants with > 25% missing responses on the PAI-CY 0–2 years (n = 14) or with children > 2 years (n = 2) were excluded from the analyses. Table 1 presents characteristics of participants (n = 115). Out of the 115 participants, 54 had complete data on the PAI-CY 0–2 years, 45 respondents had one to three missing items, while nine respondents had > 5 missing items. Items were missing at random; there were no indications of acquiescence bias. Over 90% of the respondents were neutral to very positive about various aspects of the PAI-CY 0–2 years (Fig. 1), and no suggestion for improvement was made more than once. Self-reported administration time (including questions on demographic and clinical characteristics) was 17 ± 7 (range 5–40, median 15) min. The retest (n = 108) was completed after a mean of 30.5 ± 26.4 (range 11–181, median 19) days.

Table 1 Socio-demographic and clinical characteristics of participants (n = 115)
Fig. 1 Evaluation of the PAI-CY 0–2 years by parents (n = 115)

The items “reacting to (sudden) sounds” and “entertaining alone” were deleted because of low factor loadings, low information, and low scalability coefficients. “Looking at a particular item nearby” and “following a toy or person with the eyes” were deleted because of local dependence with too many other items. Furthermore, omitting these items was considered not to violate content validity, because they were thought not to be suitable for the target population (entertaining alone) or because of similarity to other items.

None of the items exceeded our cut-off criterion for missing responses, whereas three items demonstrated floor effects. In general, the fourth answer category was infrequently endorsed, but collapsing answer categories 3 and 4 led to questionable unidimensionality, as the relative increase of the explained variance of the second factor was limited compared to the first factor. Therefore, the response categories of only a few items were collapsed (indicated in bold in Table 2). Four item pairs showed high inter-item correlations (Table 2), whereas none of the items had item-total correlations < 0.3. Cronbach’s alpha was 0.95.

Table 2 Distribution of responses over the response categories, GRM item parameter estimates, item information, item fit, and parameters for test–retest reliability for the PAI-CY 0–2 years

The assumptions for IRT seemed to hold for most items (Table 2 shows items violating assumptions). The items represented a unidimensional model; a one-factor model explained 48% of the variance and mostly yielded high component loadings (> 0.45). A two-factor solution added 10% explained variance. The ratio of 4.8 between the first and second factor is higher than the required minimum of 4 [25]. Out of 351 possible item pairs, three item pairs violated the local independence assumption. One item violated the monotonicity assumption and one item had an insufficient scalability coefficient. It was decided not to delete these items because of content relevance.

The full GRM outperformed the constrained model (LRT = 116.42, df = 26, p < 0.001), and model fit approached satisfactory values (RMSEA = 0.064, SRMR = 0.099, TLI = 0.965, CFI = 0.968). Item parameters, information, and fit statistics are displayed in Table 2. We confirmed the validity of the IRT parameters by examining the differences between test and retest parameters, which were on average small (α: 0.53 ± 0.48, β1: 0.21 ± 0.17, β2: 0.29 ± 0.27, β3: 0.43 ± 0.32); note that test and retest data are highly correlated (r = 0.92). Although some items provided little information, all items were maintained, because removing the item with the lowest information (“reacting to sudden actions”) resulted in more violations of assumptions for the remaining items. Difficulty of items matched respondents’ ability (Fig. 2).
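As a reading aid, the model comparison can be written out in a standard textbook parameterization of the GRM. The exact parameterization used by the authors' software is not stated, so the notation below is an assumption; α and β follow the symbols used for the item parameters above, with item j having m_j ordered categories scored 1 to m_j.

```latex
% Cumulative probability of scoring above threshold k on item j
P(X_{ij} \ge k + 1 \mid \theta_i)
  = \frac{1}{1 + \exp\{-\alpha_j(\theta_i - \beta_{jk})\}},
  \qquad k = 1, \dots, m_j - 1

% Category probabilities, with P(X_{ij} \ge 1) = 1 and P(X_{ij} \ge m_j + 1) = 0
P(X_{ij} = k \mid \theta_i)
  = P(X_{ij} \ge k \mid \theta_i) - P(X_{ij} \ge k + 1 \mid \theta_i)

% Constrained model: \alpha_j = \alpha for all j. Likelihood ratio test:
\mathrm{LRT} = -2\,\bigl(\ell_{\text{constrained}} - \ell_{\text{full}}\bigr)
  \sim \chi^2_{\mathrm{df}},
  \qquad \mathrm{df} = 27 - 1 = 26
```

The full model estimates one slope for each of the 27 items and the constrained model a single common slope, so the models differ by 26 parameters, which matches the reported degrees of freedom. Collapsing response categories changes the number of thresholds per item but not the number of slopes.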

Fig. 2 Item-person map of the PAI-CY 0–2 years

Regarding known-group validity, participants with comorbidity scored worse on the PAI-CY than those without comorbidity (p < 0.001). Moreover, participants with severe VI or blindness scored significantly worse than participants with no VI (p = 0.030 and p = 0.023, respectively). A trend for worsening scores with increasing severity of VI was observed. No significant difference in PAI-CY scores was found for age or gender (Fig. 3).

Fig. 3 Mean disability (theta) by gender, presence of comorbidity, age, and severity of VI. *p < 0.05

Most items showed satisfactory test–retest reliability (Table 2), although for two items (“looking at something for a longer period of time” and “reading books together”) agreement was < 60% and for one item (“reacting to visual stimuli”) weighted kappa was 0.6. The ICC between test and retest data was 0.920 (95% confidence interval 0.880–0.946).

Discussion

With these acceptable psychometric properties, the PAI-CY 0–2 years should be useful in low vision services, in which the perspectives of parents of very young children with VI can now be assessed systematically. After deleting four items, the remaining 27 items showed satisfactory model fit and formed a unidimensional scale measuring developmental and participation needs. Although we observed some violations of IRT assumptions, it was decided not to delete any further items at this stage. Furthermore, removal of the worst performing item (“reacting to sudden actions”) caused more violations of assumptions and worsened model fit. The PAI-CY 0–2 years seems able to discriminate between participants with varying levels of clinical conditions, i.e., comorbidity and degree of VI.

Few instruments focusing on functioning, participation, and/or quality of life are available for children of this young age. Instruments for children with disabilities [26] or visual impairment are even scarcer [27, 28]. To our knowledge, the Children’s Visual Function Questionnaire (CVFQ) is currently the only instrument with a version for children aged below 3 years [29]. The CVFQ was developed to measure vision-related quality of life with domains related to competence, personality, family impact, and treatment difficulty imposed by specific eye conditions and might be complementary to the PAI-CY 0–2 years.

The limited sample size of our study was unavoidable; visual disabilities in early childhood are rare among the 17 million Dutch inhabitants. Because of the small sample size, we took a conservative approach (only four items were deleted), using less stringent criteria for item removal in order not to compromise face and content validity. Deleting potentially relevant and informative items prior to the availability of larger samples could be counterproductive in the long term. The planned use of the PAI-CY 0–2 years by Dutch low vision services and in research as a patient-reported outcome measure will make it possible to confirm its psychometric properties. New applications of IRT for small sample sizes, using longitudinal data, should be considered [30]. This might facilitate further item reduction, increase precision, and improve user-friendliness. Moreover, feasibility and acceptability of the questionnaire to respondents and professionals in clinical care should be monitored.