Background

Oral health is no longer seen exclusively as an absence of oral diseases. Good oral health is considered to be having or maintaining optimal functional, social and psychological well-being [1]. Therefore, researchers have developed a number of oral health-related quality of life (OHRQoL) instruments to measure clinically relevant outcomes from the patient’s perspective, which are used to measure general oral health needs or to measure specific diseases or conditions such as the impact of malocclusion on their well-being.

Like their counterparts in other countries, many adolescents in Malaysia desire to undertake orthodontic treatment [2]. Thus, instruments that could measure the impact of the malocclusion on the adolescent’s OHRQoL would provide more objective information on the perceived needs of the adolescents. This may support treatment priority as the impact of the malocclusion on patients’ OHRQoL is considered on top of clinical oral health treatment measures.

The Psychosocial Impact of Dental Aesthetics Questionnaire (PIDAQ) is an instrument that has been developed specifically for measuring impacts related to dental aesthetics and arrangement [3]. It was improvised from previous scales [47] that were able to associate subjects’ malocclusion with perception of their dental aesthetics [3] with few added items to form 4 subscales of 23 items measuring the impact of malocclusion on the dental self-confidence (DSC), social impact (SI), psychological impact (PI) and aesthetic concern (AC). The DSC is a 6-item subscale that measures the impact of dental appearance on a positive self-concept. The other 3 subscales are negative domains: SI has 8 items that measure anxiety levels about other people’s reaction when the subjects expose their teeth, PI has 6 items that measure negative emotions about the dental aesthetics, and AC has 3 items that measure displeasure with the image of the subject’s teeth in different conditions. The questionnaire has shown good psychometric properties in cross-cultural adaptations in adults [812] as well as in adolescents [1315]. The instrument has recently been cross-culturally adapted into the Malay language and has been shown to be valid and reliable for use by Malaysian adolescents [16].

In order for the instrument to operate well, it must be validated to be used for the target population. In Malaysia, Malay (also referred to as Bahasa Malaysia, or Malaysian language) is the national language. However, English is widely used as a lingua franca among urban Malaysians. Clinical impressions suggest that communication between a clinician and an adolescent patient is commonly in the language that the patient prefers, usually either in Malay or in English. It was also noted that a good number of adolescents especially in the urban areas prefer to express themselves in the lingua franca of English regardless of their background. Exposure of English has become more widespread as it is regularly used in Malaysian media including the internet. Thus, a Malaysian English version of the PIDAQ as an alternative to the Malay PIDAQ is necessary and may greatly benefit OHRQoL studies to capture the psychometric information of adolescents who may not be comfortable to answer Malay-medium instruments. This will prevent exclusion of subjects due to language barrier. The aim of this study was to develop and test a Malaysian English version of the PIDAQ instrument for use by Malaysian adolescents and to evaluate its psychometric properties.

Methods

The study comprised both linguistic and psychometric validations. The adapted English version was intended to be provided as an alternative to the Malay PIDAQ for subjects who are less proficient in the Malay language or who prefer to use English. Therefore, the process of linguistic validation was parallel to that used for the linguistic validation process of the Malay PIDAQ, monitored closely to ensure that the conceptual, item and semantic equivalences did not differ from the Malay version. None of the authors were familiar with German, the language used for development and validation of the original [3] and adolescent [15] versions of the PIDAQ. Therefore, the professionally translated, published English versions [3, 15] were the basis for the Malaysian versions. Linguistic validation involved face and content validations by experts in orthodontics (WNWH and MZMM) and in dental public health and OHRQoL measures for children (ZYMY), pre-testing, comparison with the Malay PIDAQ and final review of the draft English PIDAQ.

Following content validation by the experts, the published English version for adolescents was pilot tested on seven patients aged 12–14 and on seven patients aged 15–17 non-randomly selected from the orthodontic waiting list. The pilot test was conducted at the Faculty of Dentistry, University of Malaya by the investigator (WNWH). The adolescents were asked to answer the questionnaire independently and the time taken to complete the questionnaire was recorded. Next, a discussion with the participants was conducted paying attention to the participants’ feedback on the general layout of the questionnaire including the instructions, questionnaire items and answer options. In the discussion, some words that were considered ambiguous were discussed and replaced with suitable words as suggested by the participants. They were encouraged to suggest words that they would find more suitable to their level of comprehension. When words that could not be understood by the adolescents could not be replaced with other suitable words, the published original version was consulted, tested verbally and discussed to see if subjects understood the meaning. Examples of words that were deemed difficult by the participants were content, reveal, fear, funny looks, self-conscious, upset, envy and ashamed. Participants also agreed that the answer option not relevant should be included for the item Don’t like own teeth on video for those who had never seen a video recording of themselves.

Following the pilot test, a meeting was held between two of the authors, one an orthodontist (WNWH) and the other an expert in OHRQoL measures (ZYMM); both were native Malay speakers and proficient in both English and Malay languages. The aim of the meeting was to discuss the outcomes of the pilot test and compare with the back translation of the Malay PIDAQ before agreeing on the draft adapted English PIDAQ. During the discussion, close attention was given to the content and wording of the adapted English PIDAQ to ensure that conceptual and item equivalence were achieved between the original instrument and the adapted English PIDAQ. Conceptual equivalence was assessed to ensure the questions reflect the same concept and the concepts are meaningful to the targeted cultures and languages. Item equivalence was achieved when the meaning of each item was maintained during the adaptation process [17].

Next, the psychometric properties of the adapted English PIDAQ were tested on non-randomly selected adolescents who had not been involved in the pilot study. Sample size calculation was done to detect mis-specified factor loadings comparing the two age-groups, using A-priori Sample Size Calculator for Structural Equation Models [18]. Given 4 PIDAQ subscales with 23 items, the recommended sample size for each age group at a power level of 0.80 and a probability level of 0.05 for model structure was 166 [19, 20]. This concurred with the rules-of-thumb of 4 to 10 subjects per variable [21]. Convenience sampling was done, which included participants who volunteered from 4 schools in the northern part of peninsular Malaysia and participants in the orthodontic waiting lists from 2 orthodontic government clinics in Kuala Lumpur. To account for variation in developmental changes in adolescents [15], participants were divided into two narrow age groups: The lower secondary school children 12 to 14 years old (Forms 1 and 2) and upper secondary school children 15 to 17 years old (Forms 3 to 5). For completeness of analysis, the participants were also analysed collectively. Due to the policy of the Ministry of Education Malaysia that does not permit involvement of students who will take the national examinations in that year, i.e., the Form 3 and Form 5 classes of secondary school, participants from schools were recruited from those in Forms 1, 2 and 4. Participants from the orthodontic clinics comprised 12- to 17- year-olds who requested orthodontic treatment. Exclusion criteria included those who were having or have had orthodontic treatment and those with craniofacial deformities. The questionnaire was self-administered. Participants completed the questionnaire either in the classroom or in the orthodontic clinic waiting area. For the re-test, questionnaires were redistributed to 25% of the participants after 2 weeks.

Responses were scored from 1 to 5 on a 5-point Likert scale: not at all (score 1), a little (score 2), somewhat (score 3), strongly (score 4) and very strongly (score 5). For each subscale, scores were tabulated from the total item scores. Total scores of the subscales SI, PI and AC and the reversed scores of the positive domain DSC were summed to provide the total PIDAQ score, which is a measure of the impact of the dental aesthetics on the psychosocial well-being of patients [22]. Low PIDAQ scores would indicate a low impact of dental aesthetics on OHRQoL while high scores would indicate high negative psychosocial impact.

Data were analysed using IBM-SPSS-AMOS v.20 and IBM-SPSS-Statistics v.20. Chi-squared test and Fisher exact test were done to compare equality of the proportions of the demographics between age-groups. Internal consistency was measured by confirmatory factor analysis (CFA) and Cronbach’s α for each subscale. The CFA calculated estimates of the maximum likelihood discrepancy. Goodness of fit of the observed data to the model was measured on a comparative fit index (CFI) ≥ 0.90 and root-mean-square error of estimation (RMSEA) < 0.08 [15]. Multiple group comparison was done to determine measurement invariance between the two age groups. Three stages of invariances where the models were further constrained at each stage were tested: In the configural (baseline) model, all free parameters were estimated separately in each group; in the measurement weights model, the paths of the factor loadings were constrained equally across groups; and in the structural covariances model, the estimated factor loadings, factor covariances and factor variances were constrained [23]. Non-invariance across age groups was assessed as ∆CFI ≥ 0.01 when compared against the baseline configural model [24]. Subscales with Cronbach’s α of between 0.70 and 0.95 were also considered to have good internal consistency [25].

The PIDAQ was developed to assess need for treatment in patients requesting orthodontic treatment [17] and to measure orthodontic-specific OHRQoL outcomes [15]. As in the previous study [15], discriminant validity was tested by comparing the relationship of the PIDAQ subscales with perceived need for orthodontic treatment based on the Malocclusion Index [15], which comprised the Aesthetic Component of the Index of Orthodontic Treatment Need (IOTN-AC) and the awareness component of the Perception of Occlusion Scale (POS). The IOTN-AC was rated using a black and white photographic 10-point-scale showing teeth with increasing severity of malocclusion [26]. The POS component comprised 6 items of malocclusion traits [27] and participants were required to evaluate their level of agreement with each item on a 5-point Likert scale from not at all to very strongly. The self-rated and investigator-rated Malocclusion Indices (MI-S and MI-D, respectively) were adapted for analysis of the severity of malocclusion where the scores of the IOTN-AC and total scores of the POS were standardized, summed up and divided by 2 to give an index value with a 0 mean value [15]. The judgments of six investigators (WNWH, SSFS, SFMA, MZMM, RB and MJG) were calibrated for the MI-D. The inter-operator intraclass correlation (ICC) at T1 was excellent [28] at 0.97 (95% CI = 0.95 to 0.98; p < 0.001). Intra-operator ICC scores were also excellent at above 0.75 (p < 0.001) and ranged from 0.85 (95% CI = 0.71 to 0.92) to 0.95 (95% CI = 0.90 to 0.97).

The construct validity of the adapted English PIDAQ was tested by comparing the relationship of the PIDAQ subscales with other measures measuring related constructs, i.e., rank of perceived dental appearance and need for braces. For criterion validity test, this was assessed by comparing the relationship of the PIDAQ subscales with perceived impact of malocclusion on daily activities using the Child-Oral Impacts on Daily Performances (Child-OIDP) index [29]. The rank of perceived dental appearance was rated from Excellent, Good, Average and Poor while the need for braces was rated as Yes, No and Don’t know. The Child-OIDP is used as a condition-specific (CS) instrument to measure the impacts of malocclusion on daily activities if impacts were attributed to the Spaces between and Position of the teeth [30]. The third CS item, which was Deformity of the mouth and face was excluded as it was not relevant due to the exclusion criteria. The performance score was tabulated by multiplying the frequency (scale from 1 to 3) and magnitude (scale from 1 to 3) of the impact that was attributed to any of the 8 daily activities, i.e., cleaning teeth, eating, emotional stability, smiling, speaking, relaxing, doing schoolwork and socialising. The instrument total score was tabulated by summing up the 8 performance scores. It was scored as 0 if there was no impact on the 8 daily activities. The range of scores for each performance was 0–9 and the index was 0–72.

An independent t-test was applied to compare the relationship between the PIDAQ subscales and total PIDAQ scores with the malocclusion index (MI-S/MI-D) scores of those with no or slight malocclusion (lower quartiles) and severe malocclusion (upper quartiles). The effect size was tabulated as 2 t/√df where t is the t test value and df is the degree of freedom [31]. Kruskal-Wallis and Mann-Whitney statistics were used to assess the relationship between PIDAQ and the other subjective measures mentioned. The Pearson correlation coefficient was calculated to assess the relationship between the PIDAQ and CS-Child-OIDP total performance scores of the eight daily activities. The CS-Child-OIDP total performance scores were tabulated only when the impact was due to malocclusion.

In terms of reproducibility test, the standard error of measurement (SEM) was calculated as the square root of the residual variance of the ANOVA analysis, and the smallest detectable change (SDC) was calculated as 1.96 × √2 × SEM [25, 32]. The paired t-test determined if there was any significant change in PIDAQ subscales of informants between the first and second tests. Limits of agreement were calculated as mean change ± 1.96 × standard deviation of the changes [33]. The ICC for absolute agreement by two-way random effects models was calculated [25].

Floor and ceiling effects within each subscale were calculated as the percentage of the achieved lowest and highest possible scores. Floor or ceiling effects were considered present if the prevalence was more than 15% [25].

Results

In total, 393 participants responded (12–14 year old age group = 203, 15–17 year old age group = 190). Table 1 shows selected demographics of the participants. Frequencies across the demographics showed no statistically significant differences between the age groups.

Table 1 Demographics of the participants

Initial analysis showed that 11.7% (n = 46) of the participants chose Not relevant for the item Don’t like own teeth on video. This demonstrated that a relatively large proportion of the participants found this item irrelevant to their circumstances. Therefore, based on the recommendation by Jokovic et al. [34] and through discussions among the authors, it was decided to remove this item from the AC subscale of the PIDAQ. Thus, the psychometric analysis in this study was based on the shortened version of the Malaysian English PIDAQ that comprised 22 items instead of 23 items.

In this study, the histogram demonstrated that the data were normally distributed. Thus, the results were described in means and standard deviations (Fig. 1). The mean PIDAQ score was 59.7 (SD = 15.5; Min = 30; Max = 100) for the younger age group, 62.3 (SD = 16.2; Min = 24; Max = 110) for the older age group and 60.9 (SD = 15.9; Min = 24; Max = 110) for the overall population. Independent t-test showed no significant difference in mean PIDAQ scores between the younger and older age groups (p = 0.10; 95% CI = −5.8 to 0.5).

Fig. 1
figure 1

Distribution of the total PIDAQ scores of the participants

The factor analysis (Table 2) shows good-fit statistics: the CFI score of Model A was at 0.90, while the RMSEA was less than 0.08, with a small confidence interval. The factor loadings were within acceptable range although one item had a factor loading of less than 0.50. Multi-group invariance test showed that the baseline configural model constrained for age group was 0.898. The ∆CFI was 0.004 (i.e., ∆CFI < 0.01). The CFI of the measurement weights model was 0.893 and structural covariance model was 0.893 (i.e., ∆CFI < 0.01 against the baseline model), confirming invariance across age groups.

Table 2 Multi-group confirmatory factor analysis showing the standardised parameter estimates and fit indices

Table 3 shows the results of internal consistency analyses of the subscales, scale statistics and inter-item correlations of the subscales. The subscales of DSC, SI and PI satisfactorily achieved the Cronbach’s α values of between 0.70 and 0.95 for all age groups. However, the subscales of the AC component was moderately satisfactory, ranging between 0.52 and 0.62, for all age groups. None of the inter-item correlations were ≥ 0.90 for all subscales or ≤0.30 for the DSC and AC subscales. For the SI subscale, the items with inter-item correlations below 0.30 were: between Hold back their smile and What others think (12–14 years = 0.23), Shy because of own teeth (15–17 years = 0.28), Stupid comments from others (12–14 years = 0.17; all ages = 0.25) and Boys/girls find own teeth ugly (all ages = 0.29); and between What others think and (12–14 years = 0.21) and Stupid comments from others (12–14 years = 0.20). For the PI subscale, the item Wish to look better had inter-item correlations below 0.30 with Unhappy about own teeth (12–14 years = 0.23; all ages = 0.29) and Feel bad about own teeth (12–14 years = 0.25). None of the item total correlations scores were < 0.30 in all subscales and age groups.

Table 3 Internal consistency, scale statistics and inter-item correlations of the subscales

Table 4 shows the results of discriminant validity analyses of the adapted English PIDAQ. There were statistically significant differences in mean scores between adolescents who rated themselves (MI-S) with no or slight malocclusion and those with severe malocclusions for all subscales and total PIDAQ, and for all age groups (p < 0.01). There were statistically significant differences in mean scores between adolescents who were rated by the investigators (MI-D) with no or slight malocclusion and those with severe malocclusions for all subscales in the older and overall age groups and the total PIDAQ scores for the older age group (p < 0.01). However, the differences were not statistically significant (p > 0.05) in the younger age group for the SI, PI and AC subscales and for total PIDAQ in the younger and overall age groups. In all three age groups, comparison with MI-S and MI-D showed that DSC scores reduced with increasing severity of the malocclusion. In contrast, SI, PI, AC and total PIDAQ (except in comparison to the MI-D in the younger age group) scores increased with increasing severity of malocclusion.

Table 4 Discriminant validity of the adapted English PIDAQ with regards to self-rated (MI-S) and interviewer-rated (MI-D) malocclusion

In terms of construct validity test analyses, rank of the participants’ perceived dental appearance showed that there were statistically significant associations between all PIDAQ subscales and total PIDAQ scores with self-rated rank of their dental appearance (p < 0.01) for all age groups (Table 5). For DSC subscale, mean scores gradually lowered as the participants rated their teeth from excellent to poor. The trend was statistically significant for all age groups. Conversely, mean scores were gradually increased in SI, PI, and AC subscales, and total PIDAQ scores, respectively, as the participants rated their teeth from excellent to poor. The trend was statistically significant for all age groups.

Table 5 Construct validity of the adapted English PIDAQ with regards to self-perceived dental appearance rank

There were also statistically significant associations between the self-perceived need for braces with all PIDAQ subscales and total PIDAQ scores in all age groups (Table 6). Those who felt they needed braces had significantly lower DSC mean scores and significantly higher SI, PI, AC and total PIDAQ mean scores compared to those who felt they did not need braces in all age groups.

Table 6 Construct validity of English PIDAQ with regards to self-perceived need for dental correction

Table 7 shows the associations between total PIDAQ scores and the prevalence of CSOIDP for all age groups. Those with CSOIDP had significantly higher total PIDAQ mean scores than did those who did not report any impact on their daily activities related to the CSOIDP in all age groups (p < 0.001).

Table 7 Criterion validity of the adapted English PIDAQ with regards to impact on daily activities attributed to malocclusion

The Pearson correlation coefficient showed statistically significant moderate associations between the total PIDAQ scores and the CSOIDP performance scores (Table 7). The association was positively correlated for all age groups.

Table 8 shows the reproducibility test analyses of the adapted English PIDAQ. The ICCs were above 0.70 for the DSC subscale for all groups and for the SI for the younger age group. The ICCs were generally moderate for the rest of the subscales in all age groups, with the lowest ICC score of 0.53 for the SI subscale in the 15–17 years age group. There were no statistically significant differences between the first and second test administrations in all subscales for all age groups.

Table 8 Tests of reproducibility for the adapted English PIDAQ

Neither floor nor ceiling effects were detected. The prevalence of the lowest or highest scores for all subscales were below the cut-off value of 15% in all age groups (Table 9). The AC subscale had the highest prevalence of the percentage lowest scores of between 13.7 to 13.8%.

Table 9 Floor and ceiling effects of the English PIDAQ

Discussion

The development of the PIDAQ instruments may take years to ensure the validity of the outcome that they purport to measure. Application of these instruments for populations for which they were not originally designed must be revalidated in some form of cross-cultural adaptation process due to the differences in the language and cultural background. Cross-cultural adaptation overcomes the time- consuming process of developing a new measure, taking advantage of the conceptual ability of the instrument to describe and evaluate health status, and allowing comparisons internationally between cultures. Herdman et al. [35] described 6 steps that should be taken into account during cross cultural validation: Conceptual equivalence, item equivalence, semantic equivalence, operational equivalence, measurement equivalence and functional equivalence.

Conceptually, the 4 domains of the PIDAQ have been analysed during the cross-cultural adaptation of the Malay PIDAQ [16]. It was found that the 4 domains of the Malay PIDAQ were as relevant to the Malaysian adolescents as the original PIDAQ had been relevant to the adolescents in Germany [15]. During the development of the adapted English PIDAQ, item and semantic equivalences were closely monitored to be as close as possible to the Malay PIDAQ. This would allow for both instruments to be deployed independently or concurrently in a bilingual format. Consequently, the operational equivalence was also found to be similar to the Malay PIDAQ. The questionnaire format, mode of administration and measuring methods were similar to the Malay PIDAQ while the instruction only differed in translation, i.e., the instructions were in the language of the instruments.

During the pilot tests, the format and instructions were found to be acceptable to the participants. In terms of response mode, the item Don’t like own teeth on video was not relevant to a few of the participants in the pilot test. As a result, a not relevant answer option was added to the item. During the psychometric analyses, a high proportion of participants (11.7%) chose the not relevant option indicating the item was not common in the Malaysian setting. Based on the literature, several ways have been recommended to deal with responses that are not within a Likert scale including to exclude subjects with such responses, using adjusted scores or to drop the item [34]. Following discussions with all authors and to prevent errors in future studies involving the instrument, it was decided to have the item removed from the AC subscale. In terms of mode of administration, the Malaysian English PIDAQ was found to be suitable for self-administration – an important consideration since that it has been intended for large sample population study in schools or for participants to answer in the waiting rooms while waiting for their orthodontic appointment. Opportunities for other modes of administration, e.g., verbal expression, were limited during the pilot test discussion. The self-administered mode was considered feasible in Malaysia as the basic literacy rate for English among secondary school children was 93.2% [36], only slightly lower than the basic literacy rate for Malay, which was 95.2% [36].

Measurement equivalence was demonstrated by the results of the psychometric analyses and comparison of the results with those of the German study [15]. Fit statistics showed that multidimensional structure of the constructs for this population was the same as that of the German study [15]. The instrument was also invariant across age groups, indicating that the latent variables have the same meaning for the population across ages. The internal consistencies of the subscales were satisfactorily within the recommended range of 0.70 and 0.95 [25] except for subscale AC where the Cronbach alpha coefficients were 0.52 for 12–14 year old age group, 0.62 for 15–17 year old age group and 0.56 for 12–17 year old age group. The relatively lower values of Cronbach alpha were expected because the AC subscale consisted of 2 items only and for a subscale with only a few items, Cronbach alpha coefficient of 0.5 is considered acceptable [37, 38].

The adapted English PIDAQ’s discriminant validity showed statistically significant differences in PIDAQ mean scores between those with slight and severe malocclusion based on self-rated malocclusion index (MI-S) for all subscales and age groups. Similar to the past study [15], the effect sizes of the differences were high with positive effect between the severity of malocclusion and the PIDAQ mean scores of the SI, PI and AC subscales, respectively, and negative effect between the severity of malocclusion and the DSC subscale mean scores. Similar results were observed when the malocclusion was rated by the investigators (MI-D) for the older (15–17 years old) and the overall (12–17 years old) age groups. When malocclusion was assessed by the investigators, the effect sizes were much lower than when assessed by the participants themselves. This pattern was also similar to that found by Klages et al. [15]. It shows that the same level of malocclusion may not be rated equally between patients and professionals. It is possible that the differences is due to the investigators assessing the malocclusion in an objective manner while the participants assessed their level of malocclusion based on self-perception of the oral impacts. This is a common observation to have patients complaining that their malocclusion is worse than that assessed by clinicians.

For the younger age group (12–14 years old), a statistically significant difference was only detected between the DSC mean scores of those who were rated by the investigators with slight or with severe malocclusion. The SI, PI and AC scores of those with slight malocclusion as rated by the investigators were lower than those rated by the investigators as having severe malocclusion. Although the trend was similar to that found in the past study [15], the differences were not statistically significant. The lack of statistical significance may be due to a higher number of investigators involved in this study, which might introduce some inconsistencies despite the calibration that was done prior to the study. However, considering that the PIDAQ mean scores of the self-rated malocclusion (MI-S) by the younger age group had very strong effect sizes and that the PIDAQ mean scores of the investigator-rated malocclusion (MI-D) by the overall age group were significant different, these have provided adequate evidence for the discriminant validity for the instrument.

Apart from discriminant validity, the study also included further evidence of the construct and criterion validity properties of this instrument by comparing the PIDAQ mean scores of participants’ self-assessed ranking of dental appearance, need for braces and impact on daily performances attributed to malocclusion scores (CS-OIDP). This was based on the argument that those with impact on their OHRQoL due to dental aesthetics would have lower ranking of self-perceived dental appearance, higher perception of the need to correct their dental alignment by braces and also some impact on their daily performances. It should be noted that in this study, the Child-OIDP instrument was based on Yusuf et al. [39]. Conceptually, the instrument has been found to be valid for young Malaysian adolescents [39]. During the pilot test, the participants understood the content of the instrument with no modifications required. This may be due to the high literacy rate of 93.2% for basic English among Malaysian adolescents, as found in a sample of 5000 secondary schoolchildren in Malaysia [36], and the language used for the instrument was not ambiguous to this population as it was geared to 10- to 11- year-old UK primary school children [29]. In terms of construct validity, regardless of age group, those with low DSC mean scores and high SI, PI, AC and total PIDAQ mean scores had lower perception on their dental appearance and felt that their dental alignment needed to be corrected by braces. In terms of criterion validity, those with high total PIDAQ scores also had impacts on their daily activities, which were attributed to malocclusion. This indicated that those with psychosocial impact due to the influence of dental aesthetics would also have their daily activities affected that was accounted by malocclusion.

In the reproducibility test analysis, the ICC scores for the DSC subscale for all age groups and the ICC scores for the SI subscale in the younger age group were excellent, with values well above the recommended minimum level of 0.70 [25]. The ICC scores for the remaining subscales in the rest of the age groups were in the range of fair to good [28]. The ICC scores of subscales in the older age group were slightly lower than those in the younger age group. A few factors may have contributed to the slightly reduced ICC scores than the recommended level. Firstly, although English is widely used as a lingua franca, the prevalence of its use among adolescents may vary. This study included participants from both urban and rural schools. Although the students were able to answer the Malaysian English version of the PIDAQ, the levels of proficiency may be slightly different between urban and rural schools as more emphasis in the use of English is often seen in urban schools. The inclusion of participants from rural schools, despite their ability to answer the English version of the PIDAQ, may have introduced some variations in interpretations that resulted in the less favourable ICC values in this study. Secondly, as the AC subscale has only 2 items and social impact perceptions of malocclusion may vary over time (assessed by SI subscale), very good reproducibility of these subscales cannot be expected at all times. To further support the reproducibility, the paired t-test demonstrated that the differences in mean scores between the first and second questionnaire administration were small and not statistically significant. As such, the relatively wider 95% CI values for ICC of AC and SI subscales in the older age group were deemed acceptable.

In terms of the floor and ceiling effects, the adapted English PIDAQ had neither significant floor effects nor ceiling effects. This suggests that the instrument was sensitive enough to discriminate those with the lowest and highest possible scores [25]. The results were better than the German PIDAQ, which demonstrated some floor effects in the SI, PI and AC domains [15].

Conclusion

Following a few modifications in the cross-cultural adaptation of the English PIDAQ for Malaysian adolescents, the shortened version of the Malaysian English PIDAQ’s functional equivalence was established. The instrument is empirically valid and reliable to assess Malaysian adolescents’ OHRQoL specific to malocclusion.

Symbols

α: alpha; ∆: Differences; χ 2 : Chi-square