Background

Oral health-related quality of life (OHRQoL) is an important patient-centered endpoint to consider when assessing the impact of oral diseases in populations and evaluating the professional interventions used in an attempt to improve oral health [1–19]. The Oral Health Impact Profile (OHIP) is a questionnaire designed to measure self-reported dysfunction, discomfort and disability attributed to oral conditions [20], and is based on a conceptual oral health model outlined by Locker [21]. The original instrument has 49 items representing 7 domains (functional limitation, physical pain, psychological discomfort, physical disability, psychological disability, social disability, and handicap) and has been shown to be reliable [22–24], sensitive to changes [5, 11, 24, 25], and to exhibit suitable cross-cultural consistency [26]. Although the OHIP is available in several languages (Chinese, Finnish, French, German, Japanese, Malaysian, Portuguese, Sinhalese, Somalian, Swedish, and Tagalog), a Spanish translation is not available and there are no suitable alternative OHRQoL tools available in Spanish. The aims of this study were to develop a Spanish version of the Oral Health Impact Profile and to evaluate its convergent and discriminative validity, and its internal consistency, for use among Chilean adolescents.

Methods

Development of a Spanish version of the Oral Health Impact Profile

One of the authors (RL), a Chilean dentist proficient in Spanish and English, translated the 49 items of the original version of OHIP [20] into Spanish. Special attention was given to develop a questionnaire conceptually equivalent to the original version in order to maintain cross-cultural equivalence. The translation was then revised independently by two bilingual dentists, fluent in both Spanish and English, who gave feedback regarding the understanding and semantics of the translation. Following revision, the Spanish version was back-translated to English by an independent bilingual dentist (PS) who had never seen the original version of the OHIP. The back translation (OHIP-Sp) and the original version of OHIP were then compared in order to identify conceptual differences.

Study group

The data used to validate the OHIP-Sp [see Additional file 1] originated in a cross-sectional study conducted among high school students from the Province of Santiago, Chile. The study group was obtained using a multistage random cluster procedure to select school classes within schools. The sample consisted of 9,203 students aged 12–21 years, distributed in 310 classes from 98 schools. Details about the sampling strategy have been provided elsewhere [27–29]. The study protocol was reviewed and approved by the local ethical committee of the University of Chile and subjects participated on the basis of informed consent. All students were invited to participate in the study and all agreed to complete a brief questionnaire containing information on socio-demographic factors; oral health-related behaviors; and self-reported oral health status (rated as good, fair or poor) [27, 29]. From the whole study group, 9,163 students agreed to answer a written questionnaire asking for detailed information on socio-economic indicators [30] and to participate in a clinical oral examination involving the recording of tooth loss [31], the presence of necrotizing ulcerative gingival lesions (NUG) [28] and clinical attachment level (CAL) at 6 sites per tooth in molars and incisors [27]. A total of 9,155 students also agreed to complete the OHIP-Sp questionnaire. Owing to the young age of the study population, the recall period considered was 'lifetime', and the response options for each question were dichotomized as 'Yes' or 'No'.

Missing values and completeness of the OHIP-Sp version

Cognitive disparity and communication problems among the participants may hamper the use of an instrument and seriously affect the results of scoring systems [32]. To circumvent this problem, subjects with more than 5 missing answers in the OHIP-Sp (n = 22) were excluded from further analysis. The burden of the OHIP-Sp and the potential difficulties in answering it were evaluated by counting the number of missing answers. In addition, we calculated the percentage of subjects responding 'No' for each of the 49 items of the OHIP-Sp in order to identify items that could be irrelevant for the young population included in this study.

Evaluation of the construct validity of the OHIP-Sp

Convergent validity

To assess the convergent validity of the OHIP-Sp, we investigated the association between self-reported oral health status (good; fair; poor) and the total unweighted OHIP-Sp score, computed by adding the number of items experienced (0–49), as well as each domain score, using Spearman rank correlation. We hypothesized that students who reported good oral health would have lower scores than students who reported fair or poor oral health.
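As a hedged illustration of the scoring and correlation step, the sketch below computes an unweighted score as the count of items answered 'Yes' and a Spearman coefficient as the Pearson correlation of tie-averaged ranks. The numeric coding of health status (for example good = 0, fair = 1, poor = 2), the treatment of missing answers as not experienced, and the function names are assumptions of this sketch, not details taken from the study.

```python
def total_score(answers):
    """Unweighted score: number of items answered 'Yes' (coded 1).

    Missing answers (None) are treated as not experienced; this is an
    assumption of the sketch.
    """
    return sum(1 for a in answers if a == 1)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

With status coded so that higher values mean poorer self-reported health, a positive coefficient corresponds to the hypothesized direction.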

Discriminative validity

Four dichotomous dental health outcomes were used: A) 'tooth loss', which was considered present if at least one molar or incisor was absent; B) 'CAL ≥ 1 mm', which was present if at least one of the sites recorded had clinical attachment level measurements ≥ 1 mm; C) 'CAL ≥ 3 mm'; and D) 'NUG', which was considered present if at least one interproximal papilla presented with necrotizing ulcerative lesions. Details on the clinical examinations and the reliability of the recordings have been previously published [27, 28, 30, 33].

To assess the validity of the OHIP-Sp in discriminating between groups with and without oral conditions, the mean OHIP-Sp scores were compared between subjects with and without each of the four oral health outcomes investigated using the Mann-Whitney test. We hypothesized that subjects with poor oral health outcomes would have higher OHIP-Sp scores. Although this is a rather standard procedure in OHIP validation studies [23, 34–38], a potential problem may arise when the assessment of discriminative validity of OHIP relies on statistical significance. The situation may be especially critical if the study group is large, because statistical significance may be obtained without the instrument being able to distinguish between groups in a real scenario. In order to explore this possibility, the 'roctab' command of the software Stata [39] was used to obtain receiver operating characteristic (ROC) curves and to calculate the values for the area under the ROC curves [40] for the ability of the total OHIP-Sp score to predict each of the four outcomes studied. The area under the curve is a proportion which can be interpreted as the probability that a randomly selected person with a positive oral health outcome has a higher OHIP-Sp value than a randomly selected person without the oral health outcome [41]. In a post-hoc analysis, ROC curves for the total OHIP-Sp score and more severe clinical attachment level outcomes (CAL ≥ 4 mm and CAL ≥ 5 mm), and more extensive tooth loss outcomes (≥ 2, ≥ 3, and ≥ 4 teeth), were used to assess whether the OHIP-Sp shows higher discriminative validity with more severe and extensive dental outcomes.
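The probabilistic interpretation of the area under the ROC curve can be made concrete with a short sketch. It is not the Stata 'roctab' implementation; it simply compares every subject with the outcome against every subject without it, counting ties as one half, which gives the same value as the Mann-Whitney U statistic divided by the number of pairs. The function and argument names are illustrative.

```python
def roc_auc(scores_with, scores_without):
    """Area under the ROC curve from pairwise comparisons.

    scores_with:    total scores of subjects who have the outcome
    scores_without: total scores of subjects who do not
    Each pair contributes 1 if the subject with the outcome scores higher,
    0.5 if the two scores are tied, and 0 otherwise.
    """
    wins = 0.0
    for s_pos in scores_with:
        for s_neg in scores_without:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    return wins / (len(scores_with) * len(scores_without))
```

A value of 0.5 means the score separates the two groups no better than chance, and 1.0 means perfect separation. Unlike a p-value, this quantity does not move toward 'significance' merely because the sample is large, which is the motivation given above for reporting it alongside the Mann-Whitney test.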

Internal consistency

'When items are used to form a scale they need to have internal consistency. The items should all measure the same thing, so they should be correlated with one another. A useful coefficient for assessing internal consistency is Cronbach's alpha' [42].

Internal consistency was assessed for the total OHIP-Sp score and for each of the seven domains, using Cronbach's reliability coefficient α [43], which is a measure of intercorrelation between possible subsets of items in the instrument. Average inter-item correlation coefficients were obtained for each of the domains of the OHIP-Sp, as well as for the total OHIP-Sp score. 'Cronbach's alpha has a direct interpretation. The items in our test are only some of the many possible items which could be used to make the total score. If we were to choose two random samples of k... (where k is the number of items)... of these possible items, we would have two different scores each of them made up of k items. The expected correlation between the scores is α' [42].
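For readers unfamiliar with the coefficient, a minimal sketch of the usual variance-based formula, α = k/(k − 1) × (1 − Σ item variances / variance of the total score), is given below. It assumes complete cases (no missing answers) and uses sample variances throughout; the function name and data layout are illustrative and do not describe the software actually used in this study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of subjects, each a list of k item scores.

    Assumes complete cases and at least two items; the total score must
    vary between subjects, otherwise the formula is undefined.
    """
    k = len(rows[0])
    # Sample variance of each item across subjects.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the summed score across subjects.
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

When all items move together the coefficient approaches 1, and when items are unrelated it approaches 0. Because the denominator is the variance of the total score in the sample at hand, the value depends on how much the subjects differ from one another.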

Results

The comparison between the original OHIP questionnaire and the back translated English version did not reveal conceptual content differences. The participation rate was high (99.9%) and the completeness of the self-answered OHIP-Sp questionnaire was high with about 99% of the students answering at least 44 items and 87.2% of the subjects answering all 49 questions.

OHIP-Sp total scores and domain scores were computed for 9,133 subjects, 12 to 21 years, and evenly distributed by gender. The oral health impacts found in this study group were low, with a mean OHIP-Sp score of 9.7 and mean domain scores ranging between 0.3 for 'social disability' and 3.0 for 'physical pain' (Table 1). The highest oral health impact was observed for the domains 'physical pain', 'functional limitation', and 'psychological discomfort' with mean OHIP-Sp scores 3.0, 2.1, and 1.9, respectively (Table 1).

Table 1 Convergence validity.

Evaluation of the construct validity of the OHIP-Sp

Convergent validity

Self-perceived oral health status was statistically significantly associated with the total OHIP-Sp score and with all domain scores (Table 1). Spearman correlation coefficients for the association between self-reported oral health status and the different domains ranged between 0.23 for 'social disability' and 0.42 for 'functional limitation'. The coefficient for the association between the total OHIP-Sp score and self-reported oral health status was 0.41 (Table 1).

Discriminative validity

As hypothesized, higher total OHIP-Sp scores were observed among subjects with each of the four oral health outcomes investigated. All differences were statistically significant (Table 2). The largest impact was found for the outcomes 'tooth loss', with a mean OHIP-Sp score of 13.5, and 'CAL ≥ 3 mm', with a mean OHIP-Sp score of 13.0 (Table 2).

Table 2 Discriminative validity.

The estimates for the area under the ROC curve obtained for each of the dental health outcomes studied and the total OHIP-Sp score ranged between 0.56 for having CAL ≥ 1 mm, and 0.66 for 'tooth loss' (Table 2).

The ROC curves obtained for the total OHIP-Sp score and increasing severity of clinical attachment loss revealed increasing values for the area under the curve, ranging from 0.57 for CAL ≥ 1 mm to 0.78 for CAL ≥ 5 mm (Table 2). A similar result was obtained for increasing extent of tooth loss, with values ranging between 0.66 for tooth loss ≥ 1 tooth and 0.76 for tooth loss ≥ 5 teeth (Table 2).

Internal consistency

Internal consistency (Cronbach's α) of the OHIP-Sp was 0.90 and α values for the different domains ranged between 0.48 and 0.76 (Table 3).

Table 3 Internal consistency for OHIP-Sp and its 7 domains

A total of 8 items (8, 9, 18, 26, 29, 30, 39, 44) were found to impact on less than 5% of the participants and were therefore considered infrequent for this young population. A closer examination of these items showed that they concern severe oral health-related impacts, such as eating/digestion impairment and the use of prostheses, which can be expected to be rather infrequent among young people.

Discussion

Cross-cultural adaptation procedures are a critical component of the validation process of an instrument to assess OHRQoL and several guidelines can be found for this purpose [32, 44, 45]. In the present study, the translation process from English to Spanish was straightforward and the comparison between the original OHIP questionnaire and the back translated English version did not reveal conceptual content differences. The equivalent words needed for translation of the questions were not difficult to find, and the grammar structure of the sentences was not difficult to build during the translation process, possibly owing to the fact that English and Spanish share a common Latin background.

Previous studies have shown a low frequency of oral health impacts for young populations such as the present one [23]. Moreover, there are drawbacks of using ordinal scales for questionnaire responses, which may make the scale not only instrument-specific, but also sample- and item-specific [46]. To the best of our knowledge, there are no studies addressing this issue in adolescents, but the results of a study on the assessment of changes in the quality of life using OHIP in adults [5] suggest that the differences found between groups may be consistent, regardless of the use of dichotomous or ordinal scoring systems. We therefore considered it best for the purpose of the present study to dichotomize the response options for each question into 'Yes' or 'No'. We realize that this approach departs from the common use of Likert-like scales ranging from 'never' to 'very often' in many OHIP studies. This, and the fact that the use of the Oral Health Impact Profile among adolescents has consistently considered only the 14-item versions of the OHIP, and rather different recall periods [23, 47–49], makes direct comparisons between studies rather difficult. We are not aware of studies of the effect of different types of response scales on estimates of validity and consistency for the same study group, but the estimates will almost certainly differ.

The interpretation of the study results should also consider the different recall periods used in different studies. To the best of our knowledge, this is the first study considering a lifetime recall period for the administration of the questionnaire among adolescents. The impact of the use of different recall periods has not been addressed in young populations. In a recent study, John et al. [50] applied a German version of OHIP to adults using 3 different recall periods (lifetime, 1 year, 1 month) and found better consistency for the shortest recall period, and a lower impact of oral health for the lifetime recall period.

The mean score values in this study suggest a relatively low impact of oral health in the population studied, similar to the impact reported previously by Soe et al. among Myanmar adolescents with low levels of dental disease [23], and considerably lower than the oral health impact reported in studies comprising minority adolescent populations with higher oral disease burden [49] and adult populations [51, 52].

Our finding that 8 items related to eating impairment, use of prostheses, general health, and inability to function were rather infrequent in this adolescent population, indicates that a number of items from the original OHIP representing severe impairment may be irrelevant for adolescents who have only experienced minor oral disease. Our observations suggest that the highest impacts concern some items from the domains representing 'physical pain'; 'functional limitation' and 'psychological discomfort' in this young adolescent population. This is in agreement with the observations by Broder et al. [49] among minority adolescents, and our findings on 'physical pain' and 'psychological discomfort' also agree with the observations by Ferreira et al. [47] in Brazilian schoolchildren, thus suggesting that some dimensions from these domains of OHRQoL frequently affect adolescents.

Construct validity of the OHIP-Sp

The OHIP-Sp exhibited adequate convergent validity, in agreement with studies conducted using other versions of the Oral Health Impact Profile among adolescents [23, 49].

A potential limitation of this study in assessing discriminative validity is the lack of inclusion of a common pain-related dental health outcome such as caries, which could be a better oral health outcome to distinguish between groups of adolescents with known differences in dental health. The oral health outcomes used in this study are usually considered in studies among adults [53, 54] but not in studies conducted among adolescents [23, 49], in which the occurrence of tooth loss and periodontal disease is expected to be low. Nevertheless, the results of the assessment of discriminative validity using Mann-Whitney statistics suggest that the OHIP-Sp is suitable to distinguish between groups with and without oral conditions such as clinical attachment loss and tooth loss among adolescents. The areas under the ROC curves for the four outcomes tested are not impressive and challenge the application of statistical testing for the assessment of discriminative validity.

The ROC curve areas for different severity levels of clinical attachment loss and increasing extent of tooth loss demonstrated that OHIP-Sp is suitable to discriminate subjects with increasing severity and/or extent of these dental outcomes.

Internal consistency of the OHIP-Sp

The values for internal consistency estimated with Cronbach's alpha relate to OHIP scores obtained for a specific study group rather than to the instrument itself [55]. This means that the numerical size of Cronbach's alpha is strongly influenced by the degree of disease variation in the study group used to test the instrument. The Cronbach's alpha coefficients for internal consistency found in this study were slightly lower than those observed by Broder et al. [49] for disadvantaged adolescents, and similar to those obtained by Soe et al. [23] for Myanmar adolescents with low oral disease experience. The population in which the OHIP-Sp was tested represents one of the most demanding situations for the instrument. Our observation that the OHIP-Sp did in fact capture oral health impacts when used in a young population with a low periodontal disease burden and very limited tooth loss testifies to the usefulness of the instrument. While it may be noted that the recommendation of Cronbach's alpha > 0.70 for sufficient internal consistency [42] was reached only for one of the domains and for the total summary score, it is also clear that most other domains were approaching this limit. Moreover, higher estimates for internal consistency are likely to be found if the instrument is applied to (older) study groups with more disease experience.

Clearly, further studies of the properties of OHIP-Sp should include testing of the questionnaire in older populations and in populations with a higher disease burden/disease variation; as well as the inclusion of caries as a dental outcome. Additional aspects of the instrument that should be assessed are the use of test-retest reliability exercises to evaluate the stability of the test; and the assessment of the responsiveness of OHIP-Sp to changes in oral health conditions.

Conclusion

The OHIP-Sp revealed suitable convergent and discriminative validity and appropriate internal consistency.