Norms for the Dutch Version of the Young Schema Questionnaire – Adolescent in a Clinical Population

The Young Schema Questionnaire (YSQ; Young, 1994) is a widely used instrument to assess early maladaptive schemas in adults and older adolescents. Despite its widespread use, no norm data are available, making it difficult to evaluate when an individual's YSQ score can be considered elevated or high. Such norms can be useful for screening purposes, such as identifying those at risk for psychopathology and providing early and appropriate interventions. The aim of our study was to norm the five schema domains of the Dutch adolescent version of the Young Schema Questionnaire (YSQ-A: Van Vlierberghe et al., 2004). In addition to providing norm data for clinical practice, we also show the process of obtaining reliable and valid norm data with state-of-the-art regression analysis, which does not require splitting the norm sample into subgroups by sex or age, yet does take these variables into account in obtaining norms.


Introduction
In cognitive theory, it is assumed that negative basic core beliefs about the self, other people, and the world underlie the development and maintenance of emotional disorders. These core beliefs are called schemas (Beck, 1995). Young expanded the cognitive theory of Beck by researching the [...] scored on a Likert scale ranging from 1 (completely untrue of me) to 6 (describes me perfectly). Rijkeboer and Sterk (1997) constructed a Dutch version of the original 205-item Young Schema Questionnaire, which has been demonstrated to have good psychometric properties in referred and non-referred adult populations (Rijkeboer & van den Bergh, 2006; Rijkeboer et al., 2005). Corresponding items constituting the short version of the Young Schema Questionnaire were extracted from this Dutch translation. Van Vlierberghe et al. (2004) then incorporated 15 of the 18 schemas and rephrased them to fit the adolescent population, constructing the Dutch version of the Young Schema Questionnaire-Adolescent (YSQ-A) in agreement with the original author. Factor analytic research from this research group confirmed the taxonomy of 15 schemas and five schema domains outlined by Young (Van Vlierberghe et al., 2010).
There are two common ways of scoring the schema questionnaires: counting the number of extreme scores (5 or 6) on a schema scale, or taking the average of all scores on a particular schema scale. Research by Rijkeboer et al. (2005) showed the latter method to have the best predictive value for psychopathology. There are currently no norms for interpreting the schema questionnaires, which makes it difficult to compare an individual score with a representative population. For clinical use, it seems relevant to provide norms against which an individual's score can be evaluated as being within the normal range or not. This way, the schema questionnaire can be used for screening purposes and the selection of early interventions. Starting from the traditional way of norming, obtaining norm scores for clinical questionnaires seems a relatively easy process: one simply splits a representative sample into subgroups based on, for example, sex or age, and converts raw scores into norm scores, for example Z-scores or percentiles, based on the score distribution within the subgroup to which a person belongs. As Van Breukelen and Vlaeyen (2005) have stated, this traditional approach has serious drawbacks, as it favors simplicity at the cost of validity and reliability. Splitting a sample into subgroups reduces the sample size, which increases the standard error and thus provides less reliable norms. In addition, it is unclear which person variables should be used to split the sample (e.g. sex or educational level?). Both problems can be solved by using the regression approach for norming, which Van Breukelen and Vlaeyen (2005) demonstrated on a pain coping questionnaire and which we apply here to the five schema domains of the YSQ-A. This regression approach simultaneously helps to decide which person characteristics (such as age, sex, education) are relevant to norming, and leads to norms based on the total sample but adjusted for the relevant person characteristics.
In this paper, we first describe the sample and the statistical method used for norming. Next, the results are presented for the five schema domains, indicating which adolescent characteristics were relevant to norming and which were not, and showing how the raw score of an individual adolescent can be converted into a standardized Z-score, making it possible to compare an adolescent's score to his or her reference group. Domains were selected for norming instead of schemas because each schema consists of only five items, which for some schemas leads to distributions that are problematic for norming. Finally, the results are summarized and the strengths and limitations of this study are discussed, as well as recommendations for future research.

Participants
Raw scores per item on the Dutch version of the Young Schema Questionnaire-Adolescent and several demographic background variables were available for 840 referred adolescents, aged 12 to 25 years. These data were gathered in the period between 2010 and 2020. From this group, four adolescents were excluded because more than 15% of the items were missing, and one adolescent was excluded because all demographic background variables were missing. The remaining group of 835 consisted of 245 (29.3%) patients from two outpatient treatment centers in the southern part of the Netherlands, who completed the questionnaire on paper, and 590 (70.7%) patients from an online database covering different treatment settings in the Netherlands and Belgium. There were 252 boys (30.2%) and 583 girls (69.8%). The average age was 16.64 years (SD = 2.997), ranging from 12 to 25 years. Other background variables were only partly known. More specifically, these were not known for the patients in the online database. Table 1 summarizes the descriptive characteristics of the referred adolescents.

Procedure
Data were coded and entered anonymously into an SPSS system file. Data were part of routine clinical assessments and collected retrospectively. Therefore, no written informed consent was needed. The study protocol was evaluated and approved by the Medical Ethics Committee Zuyderland-Zuyd and was completed in accordance with the Helsinki Declaration.

Measures
The Dutch version of the Young Schema Questionnaire-Adolescent (YSQ-A: Van Vlierberghe et al., 2004) was used to measure schemas. This questionnaire consists of 75 items, divided into 15 schema scales (five items each), which in turn are divided into five schema domains (varying from two to five scales per domain), namely Disconnection/Rejection, Impaired Autonomy, Impaired Limits, Other-directedness and Overvigilance/Inhibition. Each item presents a specific personality statement and the adolescent is asked to indicate agreement or disagreement on a Likert scale, ranging from 1 (completely untrue of me) to 6 (describes me perfectly). The schema-scale scores are obtained by calculating the average score of the items that contribute to the schema. Schema domain scores are obtained by summing the scores of the schemas that belong to that particular domain. An item example is "During my childhood, nobody supported me when I was sad or scared" (schema: Emotional Deprivation; schema domain: Disconnection/Rejection). Previous research has shown that the reliability of schemas and schema domains is satisfactory (Roelofs et al., 2011, 2013; Van Vlierberghe et al., 2010). In the current study, Cronbach's alpha for the five schema domains ranged from 0.578 (Other-directedness) to 0.876 (Disconnection/Rejection).
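The scoring rule described above (average the items within a schema scale, then sum the schema-scale scores within a domain) can be illustrated with a minimal Python sketch. The function names, the schema labels `schema_1` and `schema_2`, and the item responses are hypothetical and serve only as an illustration; they are not part of the YSQ-A materials.

```python
from statistics import mean


def schema_score(item_responses):
    """Schema-scale score: average of the item responses (each 1 to 6)."""
    return mean(item_responses)


def domain_score(schema_items):
    """Domain score: sum of the schema-scale scores within the domain.

    schema_items maps a schema label to the list of its item responses.
    """
    return sum(schema_score(items) for items in schema_items.values())


# Hypothetical responses for a domain made up of two schema scales
# (five items each, answered on the 1-6 Likert scale).
example_domain = {
    "schema_1": [1, 2, 3, 4, 5],  # average 3
    "schema_2": [6, 6, 5, 5, 3],  # average 5
}

print(domain_score(example_domain))  # sum of the two averages
```

With this rule, a domain consisting of k schema scales yields domain scores between k and 6k.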

Statistical Analysis
Data were checked for missing values. If less than 15% of the items on a schema scale were missing, missing values were imputed by inserting the mean of that person on the non-missing items of that schema scale. This was the case for 56 adolescents. Multiple regression analysis was conducted with the SPSS statistical software package, version 21, to determine a fitting model for deriving norms for each schema domain, following the approach for deriving norms for clinical questionnaires by Van Breukelen and Vlaeyen (2005), who showed the advantage of the regression approach over traditional norming. A regression analysis was performed per schema domain, with the score on that schema domain as the dependent variable. Sex, age, source (paper and pencil completion versus online) and their interaction effects were included as predictors. Other variables from Table 1 were not included in view of the very large percentage of missing values on these. Dummy coding was used for the categorical predictors source (0 = online, 1 = paper & pencil) and sex (0 = girl, 1 = boy). Age (in years) and squared age were included as predictors to capture linear and nonlinear trends in domain scores. To prevent collinearity between age, age squared and interactions, age was centered by subtracting the overall average age (16.64) from each individual age before computing the squared age term and interactions. To include only relevant variables in the norming procedure, the full regression model was reduced stepwise by eliminating the least significant predictor if its two-tailed p value was larger than 0.01 (rather than 0.05, in view of multiple testing), while respecting the hierarchical nature of the model (i.e. maintaining the linear age term in the model as long as the quadratic term was still in the model, and maintaining predictors A and B as long as their interaction effect was still in the model).
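The predictor coding described above can be sketched in Python as follows. This is an illustration only: the analyses in this study were run in SPSS, the function and key names are hypothetical, and the sketch assumes that "interaction effects" refers to the two-way interactions between sex, centered age and source.

```python
MEAN_AGE = 16.64  # overall mean age of the norm sample, used for centering


def build_predictors(age, sex, source):
    """Return the predictor values for one adolescent.

    sex is "girl" or "boy"; source is "online" or "paper".
    Dummy coding follows the text: 0 = girl, 1 = boy; 0 = online, 1 = paper.
    """
    if sex not in ("girl", "boy"):
        raise ValueError("sex must be 'girl' or 'boy'")
    if source not in ("online", "paper"):
        raise ValueError("source must be 'online' or 'paper'")

    boy = 1 if sex == "boy" else 0
    paper = 1 if source == "paper" else 0
    age_c = age - MEAN_AGE  # center age BEFORE squaring and multiplying

    return {
        "sex": boy,
        "age_c": age_c,
        "age_c_sq": age_c ** 2,
        "source": paper,
        # Two-way interactions (assumed set, see lead-in)
        "sex_x_age": boy * age_c,
        "sex_x_source": boy * paper,
        "age_x_source": age_c * paper,
    }


print(build_predictors(18, "boy", "paper"))
```

Centering before forming the squared and product terms is what reduces the correlation between the age term and the terms derived from it.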
For the final regression model, residuals (prediction errors) were plotted and analyzed to check whether they satisfied the regression assumptions of normality and homogeneity of variance, which are especially important for norming. With the final model, the raw schema domain score of an individual can be converted into a standardized Z-score in three steps. First, the predicted score Ŷ of an adolescent is computed by filling in his or her values on each of the predictors (X variables, i.e. age, sex, source and their interactions, as far as these are part of the final model) and using the regression weights B as estimated from the (large) norm sample:

$$\hat{Y} = B_0 + B_1 X_1 + B_2 X_2 + \dots + B_k X_k \qquad (1)$$

Next, the raw residual or prediction error, e, is calculated, which is the difference between the adolescent's observed and predicted domain score:

$$e = Y - \hat{Y} \qquad (2)$$

Finally, the adolescent's standardized residual, Z, is computed as follows:

$$Z = \frac{e}{SD(e)} \qquad (3)$$

Here, SD(e) is the residual standard deviation (i.e. the square root of the MS(residual) in SPSS). The standardized residual is easily obtained as optional output of the regression procedure in SPSS, and it has a standard normal distribution if the model assumptions of normality and homogeneity of variance of the residuals are met, which can be checked with SPSS. If the model is correct, Z-scores can be interpreted along a standard normal distribution, where a Z-score between −2 and +2 can be considered within the normal range given the adolescent's background variables. Z-scores outside that range might be regarded as exceptional and may be reason for concern if they are on the unhealthy side of the domain range (i.e. if Z > +2 as opposed to Z < −2).

Table 1 (notes). a Of these, 245 are adolescents from two outpatient treatment centers in the southern part of the Netherlands who completed the questionnaire on paper, with all other adolescents from different treatment settings in the Netherlands and Belgium completing the questionnaire online. b Age in years, with a range of 12 to 25 years. c Information regarding ethnicity based on the 245 adolescents from the two outpatient treatment centers in the southern part of the Netherlands. d Educational information based on 163 out of 245 adolescents from the outpatient treatment centers in the southern part of the Netherlands who completed the questionnaire on paper. e Diagnostic information based on 245 adolescents from the outpatient treatment centers in the southern part of the Netherlands who completed the questionnaire on paper.

Testing Assumptions
The use of standardized residuals (Z-scores) of the final regression models to evaluate an adolescent's score requires that these residuals are normally distributed and that their variances are homogeneous. The normality assumption was tested with the Kolmogorov-Smirnov test. Because the residuals turned out to have a skewed distribution, the raw domain scores were transformed by taking their square root, after which the whole regression procedure was repeated and the residuals were rechecked. The results in Table 2 are based on the regression analyses after square root transformation of all domain scores. After this square root transformation, residual normality was sufficiently met (p > .05) for all five domains, and homogeneity of variances was met for schema domains two to five (p > .5), but not for domain 1 (p < .001). The heterogeneity of residual variance for domain 1 might invalidate the use of standardized residuals, which are based on the residual variance in the total sample, as norms for domain 1. Therefore, the agreement between those standardized residuals and an improved version was checked. The improved version of the standardized residuals was derived by dividing each residual by the estimated SD(residual) for that specific person. This estimated SD(residual) in turn was obtained by a regression analysis of the squared raw residual (which estimates the residual variance) on age, sex and source, then saving the predicted values, and then taking the square root of these predicted values as the estimate of SD(residual). The improved standardized residuals showed sufficient homogeneity of variance. Figure 1 plots the standardized residual against the improved standardized residual, showing fair agreement, as most persons are at or very near the Y = X line, but also some deviations in the neighborhood of Z = +2. A possible solution is to use the more flexible regression approach of Voncken et al. (2018), which allows for dependence of residual variance on age and sex and for non-normality. The price of that, however, is an even more complex norming procedure. For the sake of simplicity, standardized residuals (Z) can be used as norms for all five schema domains, but Z-values near +2 on domain 1 need to be assessed more carefully by checking what range of improved Z-scores that implies according to Fig. 1. Practical guidance will be given in an online supplement.

Predictors of the schema-domains
Table 2 (note). * = all regression weights have p < .01. - = predictors not included in the final model because p > .01. Age in years was centered by subtracting its overall mean (16.64). Dummy indicators for sex and source were coded as 0 = girl, 1 = boy and 0 = online, 1 = paper & pencil. ** = All schema-domain scores were square root transformed to obtain a fairly normal residual distribution, as the residual distribution was skewed after regression analysis of untransformed scores.

Example of Computing a Z-score for an Adolescent with the Regression Model
As explained in the Method section, the final regression models can be used to convert an adolescent's raw score on each schema domain into a standardized residual or Z-score, which indicates the position of that adolescent's domain score relative to those of adolescents with the same age and sex. Suppose we have a 16-year-old girl who completed the questionnaire online and who has a score of 25 on the first schema domain, Disconnection/Rejection, of which the square root is 5 (remember that the regression analyses and norming were done on the square root transformed domain scores to obtain normally distributed residuals). We want to know if this score is within or outside the normal range. Using Eq. 1 and the regression weights and coding of predictors in Table 2 (terms in the order sex, age, source, sex × age, sex × source, age × source), we obtain as the predicted square root transformed score:

$$\hat{Y} = 3.689 + (-.115 \times 0) + (-.056 \times (16 - 16.64)) + (.032 \times 0) + (.046 \times 0 \times (16 - 16.64)) + (-.362 \times 0 \times 0) + (.094 \times (16 - 16.64) \times 0) = 3.689 + 0 + 0.03584 + 0 + 0 + 0 + 0 = 3.72484$$

Using Eq. 2, we obtain as prediction error e = 5 − 3.72484 = 1.27516 (remember e = observed Y − predicted Y). Using Eq. 3, we finally obtain as Z-score 1.27516/0.559 ≈ 2.3 (remember Z = e/SD(e), where SD(e) is the square root of MS(residual), and MS(residual) is obtained as optional output of the regression procedure in SPSS). Assuming that a Z-score of 2 or higher is considered abnormal, this adolescent experiences a critical extent of disconnection and fear of rejection and abandonment in relationships with other people, which poses a serious threat to her personality development and may be reason for intervention. Because the residual variance for domain 1 was not homogeneous, however, this Z-score has to be interpreted with some caution. As Fig. 1 shows, a Z-score of 2.3 according to the original method allows for an improved Z-score between 1.7 and 2.5.
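The three-step conversion for the Disconnection/Rejection example can also be written as a short Python sketch. The regression weights, the residual standard deviation (0.559) and the mean age (16.64) are the values quoted in the worked example; the function and variable names are hypothetical, and the last function, which inverts the conversion to find the raw score belonging to a chosen Z value, is an illustrative addition that follows algebraically from Eqs. 1 to 3 and the square root transformation.

```python
import math

MEAN_AGE = 16.64      # overall mean age used for centering
SD_RESIDUAL = 0.559   # SD(e) for Disconnection/Rejection, as quoted in the example

# Regression weights for Disconnection/Rejection, as quoted in the example
WEIGHTS = {
    "intercept": 3.689,
    "sex": -0.115,
    "age_c": -0.056,
    "source": 0.032,
    "sex_x_age": 0.046,
    "sex_x_source": -0.362,
    "age_x_source": 0.094,
}


def predicted_sqrt_score(age, boy, paper):
    """Eq. 1: predicted square-root-transformed domain score.

    boy: 0 = girl, 1 = boy; paper: 0 = online, 1 = paper & pencil.
    """
    age_c = age - MEAN_AGE
    w = WEIGHTS
    return (w["intercept"]
            + w["sex"] * boy
            + w["age_c"] * age_c
            + w["source"] * paper
            + w["sex_x_age"] * boy * age_c
            + w["sex_x_source"] * boy * paper
            + w["age_x_source"] * age_c * paper)


def z_score(raw_score, age, boy, paper):
    """Eqs. 2 and 3: standardized residual of a raw domain score."""
    residual = math.sqrt(raw_score) - predicted_sqrt_score(age, boy, paper)
    return residual / SD_RESIDUAL


def raw_score_at(z, age, boy, paper):
    """Inverse of z_score: raw domain score that corresponds to a given Z."""
    return (predicted_sqrt_score(age, boy, paper) + z * SD_RESIDUAL) ** 2


# 16-year-old girl, online completion, raw domain score 25
print(round(predicted_sqrt_score(16, 0, 0), 5))
print(round(z_score(25, 16, 0, 0), 1))
```

For the girl in the example, `raw_score_at(2.0, 16, 0, 0)` and `raw_score_at(2.5, 16, 0, 0)` bracket her raw score of 25, which agrees with her Z-score lying between 2.0 and 2.5.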

Using the Regression Approach to Derive Traditional norm Tables
Although the regression approach has the advantage of greater validity and reliability, it may be unattractive for daily use by clinicians because computing a person's Z-score is time-consuming and error-prone, as the example above shows. However, we can use the reduced regression model to derive traditional norm tables, to meet the need for an instrument that is easy to use in daily practice. We first select a suitable categorization of the Z-score. With respect to the schema questionnaires we are mainly interested in positive Z-scores, because (extremely) high scores may be reason for concern. So we could use as categories for Z, for instance, Z < 0, from 0 to 0.5, from 0.5 to 1, from 1 to 1.5, from 1.5 to 2, from 2 to 2.5, from 2.5 to 3, and Z > 3. Next, we compute for each Age x Sex x Source category the raw scores that correspond to these category boundaries. Finally, for each schema domain we obtain one norm table per sex per source, which shows for each age the raw domain score boundaries corresponding to the Z-score boundaries above. Table 3 gives an example of the norm table for Disconnection/Rejection for girls who completed the questionnaire online. Referring to our example of the 16-year-old girl, the norm table shows that her schema domain score of 25 corresponds to a Z-score between 2.0 and 2.5. Similar tables were produced for all five domains and all four person categories (girls online, girls paper & pencil, boys online, boys paper & pencil). All tables are included as an online supplement to this manuscript.

Discussion
In this study we showed how to obtain valid and reliable norms for the schema domains of the Dutch version of the Young Schema Questionnaire (Van Vlierberghe et al., 2004) with the regression approach for norming, rather than the simpler but less valid and reliable traditional approach of splitting the sample by age and sex. Using this regression method and checking its assumptions of normality and homogeneity of variance of residuals, we also derived norm tables per domain that enable users to quickly look up the Z-score of a given patient, taking into account sex, age and mode of test administration (online or paper & pencil). These Z-scores make it possible to evaluate an individual's score against those of other adolescents of the same age and sex and to see whether this score falls within the range of normal scores or is exceptionally high and might be reason for concern or intervention. To the best of our knowledge, this is the first attempt to obtain norms for the YSQ-A. The YSQ-A is frequently used in clinical practice, and having norm scores allows clinicians to evaluate whether a youngster is at risk for psychopathology and supports case conceptualization for schema therapy. Elevated schema domain scores may influence how one relates to oneself or to others. It is especially relevant to discuss with the youngster which schemas within each of the domains are responsible for the elevated scores. This way, the underlying vulnerability can be understood in terms of how these schemas, together with coping responses, may influence the current mood states of adolescents (schema modes). Until now, clinicians have usually relied on their clinical judgement to consider whether schema domain or schema scores are elevated or not. Interestingly, our analyses show that age and sex are important parameters when elevated scores are interpreted. Moreover, the results show that the format used (paper & pencil or online) also plays a role. This is not surprising, as other studies using this method of analysis have also found these parameters to be of great importance (see Roelofs et al., 2010). This paper provides a sound rationale for using the empirically derived norms for this purpose.
From a clinical perspective, the YSQ-A can best be used as a screening instrument to assess elevated scores on schema domains and, as a next step, to evaluate which schemas within a domain might be important to take into account when making a case conceptualization. It should be borne in mind that one cannot rely solely on questionnaire data. When people are in a vulnerable state, overreporting of schemas and schema domains may occur and, in a similar vein, underreporting may occur when people are in an avoidant coping state. Therefore, questionnaire data should never be the only source of information. For case conceptualization in schema therapy, creating a lifeline, imagery exercises, and cognitive techniques (downward arrow technique) should be used in addition to the YSQ-A to provide a complete and more differentiated picture of the underlying schemas of an individual.
A strength of the present study is the use of the regression approach for deriving norms, which makes it possible to select relevant predictors in the norming procedure and to improve the reliability of the norms by keeping the sample together. We used age, sex and mode of test administration as possible predictors for the norming process. Although we consider these variables the most important, we cannot rule out that there might be other relevant variables that were not assessed. However, this is a general limitation of observational studies and of studies using available data, irrespective of whether norms are obtained by regression analyses or by the traditional approach. Future research in other adolescent samples may inform about differences between countries and cultures. Another limitation of this data-driven study is that we could not check all descriptive information from all patients from the different treatment centers, as the dataset with the YSQ-A was anonymous and could not be linked to the file of a specific patient. Because both the questionnaire and the online tool are only available on request and limited to psychologists familiar with schema theory, we are convinced that the data were representative for a broad area of clinical treatment centers for adolescents. The present paper shows that using the regression approach makes it possible to find an equilibrium between statistically valid and reliable norming and ease of use of these norms in clinical practice.

Author Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Dyonne Sijstermans, Marjolijn Span, Jeffrey Roelofs and Gerard van Breukelen. The first draft of the manuscript was written by Dyonne Sijstermans and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Funding
The authors did not receive support from any organization for the submitted work.

Conflict of Interests
The authors declare they have no financial (or non-financial) interests.

Consent to Participate
Informed consent was obtained from part of the individual participants included in the study (treatment centers). The online data were gathered anonymously, for which no ethical informed consent procedure was required. This study was performed in line with the principles of the Declaration of Helsinki.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.