School psychologists contacted 800 pupils (aged 8-16 years, with equal numbers of boys and girls) from nine elementary schools in Western Vojvodina to participate in the study. They informed all children and adolescents, as well as their parents and teachers, about the purpose of the study. Those who agreed to participate and returned written parental consent completed the questionnaire at school, which was intended to prevent a low response rate. The participants were carefully instructed on how to fill out the KINDL. One hundred and twenty randomly selected pupils completed the questionnaires again after a seven-day period.
Only the data from healthy subjects were used for the present analysis; pupils with major chronic psychological or physical diseases, or with acute illness, were excluded. As in the previous study, only healthy subjects were included, with the aim of developing a questionnaire with appropriate measurement properties for QOL assessments in healthy populations. The data about the subjects' health were taken from medical records available in the schools.
The Serbian Kid-KINDL (8-12 years) and the Kiddo version (13-16 years) are self-report questionnaires developed in the previous study. Each version contains 24 Likert-scaled items in six general subscales: Physical well-being (PW), Emotional well-being (EW), Self-esteem (SE), Family (FAM), Friends (FRI), and School (SC). The score of each item ranges from 1 (never) to 5 (always), while the subscale and total raw scores are computed as the means of the corresponding items. The raw scores are then transformed to a 0-100 scale, with higher scores indicating better QOL. The questionnaires and the scoring procedures are provided at the official website.
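As an illustration of the scoring described above, the following minimal Python sketch computes a subscale score on the 0-100 scale. It assumes a linear rescaling of the 1-5 item mean and that any reverse-keyed items have already been recoded; the function names are hypothetical, and the official scoring procedure remains the authoritative reference.

```python
def to_0_100(mean_score, low=1.0, high=5.0):
    """Linearly rescale a raw mean on the [low, high] Likert range to 0-100.

    Assumption: a plain linear transformation; the source only states that
    raw scores are transformed to a 0-100 scale.
    """
    return (mean_score - low) / (high - low) * 100.0


def subscale_score(items):
    """Return the 0-100 score of one subscale from its item responses (1-5).

    Assumption: items are already recoded so that higher means better QOL.
    """
    raw_mean = sum(items) / len(items)
    return to_0_100(raw_mean)


if __name__ == "__main__":
    # Hypothetical responses to the four items of one subscale.
    print(subscale_score([4, 5, 3, 4]))
```

The total score can be obtained the same way by passing all 24 item responses instead of the four items of a single subscale.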
The distribution of missing data was calculated as the percentage of missing responses out of all possible responses. Only subscales with less than 30% of items missing were scored, and the remaining missing values were handled by mean value replacement. Means (M) and standard deviations (SD) were calculated for each item, subscale, and the total score.
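The missing-data rule can be sketched as follows. This is a hedged illustration rather than the procedure actually used: it assumes that the replacement value is the respondent's own mean over the answered items of the same subscale (the text does not specify which mean was used), and the function names are hypothetical.

```python
def missing_percentage(responses):
    """Percentage of missing responses (None) among all possible responses."""
    missing = sum(1 for r in responses if r is None)
    return 100.0 * missing / len(responses)


def impute_subscale(items, max_missing_fraction=0.30):
    """Apply mean value replacement to one respondent's subscale items.

    Returns None when the fraction of missing items is not below the
    threshold, so the subscale is left unscored. Assumption: the mean of
    the respondent's answered items replaces each missing item.
    """
    answered = [x for x in items if x is not None]
    missing_fraction = (len(items) - len(answered)) / len(items)
    if missing_fraction >= max_missing_fraction:
        return None
    person_mean = sum(answered) / len(answered)
    return [person_mean if x is None else x for x in items]


if __name__ == "__main__":
    print(impute_subscale([4, None, 5, 3]))     # one of four items missing
    print(impute_subscale([4, None, None, 3]))  # two of four items missing
```

With four items per subscale, the 30% threshold means that at most one missing item per subscale is tolerated under this reading.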
Reproducibility (test-retest reliability) concerns the degree to which repeated assessments in stable persons produce similar responses. It was evaluated using the intraclass correlation coefficient (ICC), computed with the two-way random-effects model for absolute agreement. Given that reliability is the degree to which people can be distinguished from each other, the KINDL's ICCs should be 0.6 or higher for comparisons of healthy groups. The retest took place seven days later.
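For orientation, a two-way random-effects ICC for absolute agreement can be computed from the usual ANOVA mean squares. The sketch below implements the single-measure form, which is an assumption because the text does not state whether single or average measures were reported; the function name and the example data are hypothetical.

```python
def icc_absolute_agreement(scores):
    """Single-measure ICC, two-way random effects, absolute agreement.

    scores: list of rows, one per subject, each holding k repeated
    measurements (k = 2 for a test-retest design).
    """
    n = len(scores)      # subjects
    k = len(scores[0])   # measurement occasions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-subject mean square
    ms_cols = ss_cols / (k - 1)                # between-occasion mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical test and retest scores for three pupils.
    print(icc_absolute_agreement([[60.0, 62.0], [75.0, 74.0], [90.0, 93.0]]))
```

Because absolute agreement is required, a systematic shift between test and retest lowers this coefficient even when the ranking of the pupils is perfectly preserved.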
Construct validity was assessed using factor analysis, which combines observable variables into unobservable, latent variables and thereby gives insight into the theoretical model of a construct [3, 14]. This is known as factorial validity, which is assessed using exploratory factor analysis (EFA) and/or confirmatory factor analysis (CFA). The present study gave priority to CFA because a hypothesized theoretical model of the KINDL already exists and is to be confirmed as valid for QOL assessments, so it is not necessary to re-explore the latent variables using EFA. Moreover, the current perspective is to use CFA in QOL models because EFA could produce strange combinations of QOL items with unexpected latent constructs. This is mainly because QOL questionnaires often combine items that have a causal relationship with the latent variables (causal variables) with items that are dependent upon the latent variables (indicator variables), while EFA requires only the latter [3, 15, 16]. Finally, CFA provides some data on convergent validity (the extent to which similar theoretical constructs are related) and discriminant validity (the extent to which different theoretical constructs are relatively unrelated) as aspects of construct validity.
A CFA was conducted using Analysis of Moment Structures Version 7 (AMOS-7) on a model representing the items and the corresponding factors as originally assumed. The tested model, a second-order CFA model, therefore had three levels: items (24), primary factors (six subscales), and one secondary factor (QOL). The primary goal was to determine the goodness of fit between the hypothesized model and the sample data. To test the hypothesized model, the variance-covariance matrix was used and maximum likelihood (ML) estimation was employed. ML is robust to the use of non-continuous data, and there is evidence of its robustness to violations of the multivariate normality assumption [17, 18]. However, the Bollen-Stine bootstrap and the associated test of overall model fit were used to study and manage the effects of non-normality in the underlying data, since research has also demonstrated that the ML test statistic (TML) and the ML parameter standard errors may be affected when the data deviate from normality [17, 18]. The Bollen-Stine bootstrap provides more realistic standard errors if there is a serious departure from multivariate normality. Based on the recommendations, 2,000 bootstrap samples were drawn to obtain overall model fit and 250 bootstrap samples to obtain parameter estimates and associated standard errors. Model identification was established by estimating the factor variances and fixing one factor loading to 1.0 for each factor. The following statistics were used to assess the adequacy of the model (and thus, indirectly, construct validity) as the degree of fit between the estimated and observed variances: chi-square, the Tucker-Lewis Index (TLI) (>0.90 acceptable, >0.95 excellent), the Comparative Fit Index (CFI) (>0.90 acceptable, >0.95 excellent), and the root mean square error of approximation (RMSEA) (<0.08 acceptable, <0.05 excellent) [16–19].
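The fit indices listed above are all functions of the chi-square statistics of the tested model and of the baseline (independence) model. The following Python sketch uses the commonly cited textbook formulas; it is not a reproduction of the AMOS computations, software packages differ in small details (for example, N versus N - 1 in the RMSEA denominator), and all numerical inputs are made up for illustration.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 in the denominator)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative Fit Index from model and baseline chi-square values."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis Index from model and baseline chi-square/df ratios."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


if __name__ == "__main__":
    # Hypothetical values, not results of the present study.
    chi2_m, df_m, chi2_b, df_b, n = 100.0, 50, 1066.0, 66, 201
    print(rmsea(chi2_m, df_m, n))
    print(cfi(chi2_m, df_m, chi2_b, df_b))
    print(tli(chi2_m, df_m, chi2_b, df_b))
```

The resulting values can then be compared with the cut-offs given above (TLI and CFI >0.90 acceptable, >0.95 excellent; RMSEA <0.08 acceptable, <0.05 excellent).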
It was assumed that the factor loadings of the items within each subscale and the standardized coefficients of the subscales should be at least moderate to support convergent validity, while the correlations between the estimated parameters of the latent factors should be low to support discriminant validity [3, 18, 20].