Background

A range of instruments that measure oral health-related quality of life (OHRQoL) has been developed in the last two decades [1]. One of these instruments is the Geriatric Oral Health Assessment Index (GOHAI), a frequently used questionnaire that aims to assess OHRQoL within older populations [2]. It comprises 12 items that measure three dimensions of OHRQoL: physical function (3 items), psychosocial function (5 items) and pain/discomfort (4 items).

Several studies indicate that the GOHAI is a more suitable instrument to measure OHRQoL of the elderly in Western cultures than the currently most frequently used Oral Health Impact Profile (OHIP) [3–7]. The OHIP taps more severe OHRQoL impacts than the GOHAI and is generally less sensitive to minor impairment of OHRQoL [3]. As a consequence, larger proportions of participants report no impact, i.e. have zero scores (floor effect), when using the OHIP than when using the GOHAI [3, 8]. Based on epidemiological data, this floor effect is likely to also occur for Dutch elderly [9, 10]. This effect reduces the ability of the OHIP to detect within-subject changes when compared with the GOHAI. However, no validated Dutch version of the GOHAI is available.

The aim of this study was to translate the original English version of the GOHAI into a Dutch version (GOHAI-NL), and to validate the translated instrument for use in epidemiological surveys among older people in the Netherlands. To warrant validation for a wide spectrum of older people, we chose to validate the GOHAI for both severely frail, care-dependent older people and for care-independent older people.

Although the GOHAI was originally intended as a self-administered questionnaire, this administration method is likely to generate unreliable results in severely frail and care-dependent older people who often have impairments (e.g. visual, cognitive) that affect their capacity to complete self-administered questionnaires [11, 12]. Therefore, for care-dependent older people we chose to administer the GOHAI questionnaire through a personal interview.

Methods

Translation

The original GOHAI questionnaire [2] was independently translated into Dutch by two bilingual translators whose native language was Dutch. One of them was a dental researcher experienced in the use of quality of life measures (DN), the other was a professional translator specialized in the translation of patient reported outcome measures. We adhered to the “Principles of Good Practice for the Translation and Cultural Adaptation Process for Patient-Reported Outcomes Measures” [13]. The two forward-translations were reconciled into one forward translation by an expert panel consisting of a dentist-researcher, a geriatric dentist-researcher and an oral health researcher. Competing options were discussed, and other bilingual experts were consulted when necessary, until consensus was reached. The resulting forward-translation was independently back-translated into English by two professional translators whose native language was English. The back-translations were compared for conceptual equivalence with the original GOHAI by the expert panel. Problematic items were identified and discussed among the expert panel and with the translators. Based on their comments, the forward-translation was refined. The resulting translation was tested in a purposive sample consisting of 10 older (65 years and over) people whose self-reported general health was bad (n = 3), mediocre (n = 3), or good (n = 4). The translation was tested for cognitive equivalence and comprehensibility. Based on received comments, the translation was finalized by the expert panel.

Respondent selection

In order to test and validate the proposed GOHAI-NL, participants of 65 years and over were recruited from two independent samples (group A and group B).

These two groups were recruited in order to represent distinct differences in frailty and general health within the population of elderly. Group A represented non-frail care-independent older people with expected good health and group B represented frail care-dependent, but cognitively alert people with compromised health.

Because gender and dental/prosthodontic status are known to possibly influence self-perceived oral health, an even distribution of men and women, and of (partially) dentulous (having at least one natural tooth) and edentulous (with or without complete removable dental prostheses (CRDPs)) participants was sought for both groups [14–17].

Participants of group A were recruited in the clinic of the Dental School of Radboud University Medical Center through convenience sampling, and comprised independently living, cognitively alert subjects who came for periodic check-up visits between 2013 and 2015. Since this sample was recruited from a generally healthy, independently living population with no registered health impairments according to the patients’ dental records, it was assumed that the chance of recruits being frail would be small. Upon provision of informed consent, after their clinical examination, they were asked to complete a questionnaire, including the GOHAI-NL.

Participants of group B were recruited in a total of 11 residential aged care facilities (RACFs), selected through convenience sampling, in the southern part of the Netherlands. The RACFs were included after the management’s consent to have their residents examined on a voluntary basis. The care managers of the RACFs recruited the participants for this study, based on instructions by the principal researcher (DN). These instructions included exclusion of subjects who were not cognitively alert according to the responsible ward nurse. All participants in group B had a certain level of care dependency as determined by a medical authority, based on the Dutch care-dependency classification system (Dutch National Centre for Indication of Care Need (CIZ; www.ciz.nl)). Each RACF resident is assigned a ‘Package of Care Dependency’ according to this system, indicating the level and type of care needed with reference to impairments in the physical and/or mental and/or social domain.

Upon provision of their informed consent, the participants received a clinical examination by a final year dental student or a final year dental hygiene student. Next, they were personally interviewed by the principal researcher who used the same questionnaire as the one used for group A.

Convenience quota sampling was used aiming at a total sample size of approximately 120 recruits for each group. Sample size was calculated based on the recommendation to include 5–10 subjects per questionnaire item [18], resulting in a need for 60–120 participants per group.

Data

Participants were asked to provide information regarding their oral health by answering the GOHAI questionnaire and four additional questions: 1. How do you perceive your oral health (very bad, bad, moderate, good, very good); 2. Are you satisfied with your oral health (y/n); 3. Do you think you need dental treatment at the moment (y/n); 4. How do you perceive your general health (very bad, bad, moderate, good, very good). The GOHAI questionnaire includes 12 questions (each question addressing one oral health item). Respondents were asked how often, in the previous three months, they had experienced the oral health item addressed: ‘never’, ‘seldom’, ‘sometimes’, ‘often’, or ‘very often or always’. In addition, date of birth, gender, and nationality were recorded. Socio-economic status (SES) (high, middle, low) was assessed based on last held occupation (according to the ISCO-08 classification [19]) and on level of education (high, middle, low); the highest level of either education or occupation determined SES.

Clinical data were obtained through examinations by calibrated final year dental students (all kappas > 0.82; overall κ = 0.87; agreement = 90.1 %) or calibrated final year dental hygiene students (all kappas > 0.66; overall κ = 0.74; agreement = 84.4 %). Data included number and position of 1) natural teeth, 2) caries lesions, 3) restorations (such as direct restorations or fixed dental prostheses), and 4) partial or complete removable dental prostheses. The WHO criteria for assessment of the aforementioned variables were used [20]. In addition, clinical treatment need (y/n) was recorded, based on the clinically assessed need for any professional dental treatment, including reline, rebase or replacement of removable dental prostheses.

Group A participants were examined in the clinic of the dental school while group B participants received a clinical examination at their residence, where the examiners used hand held torches and a dental mirror.

Missing data

Participants with two or more GOHAI answers missing, or with one or more answers to the additional questions missing, or with missing clinical data were excluded. In case only one GOHAI answer was missing, the missing value was replaced by mean substitution.

In case clinical data were recorded more than two weeks before or after the questionnaire was completed, the participant was excluded. This was done in order to minimize the chance that the clinical status of the participant was different from that at the moment of completing the questionnaire.
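The handling of missing GOHAI answers described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text does not state which mean was substituted, so the sketch assumes the respondent's own mean over the 11 answered items, and the function name `impute_gohai` is hypothetical. Scores are assumed to be already coded in the same direction.

```python
from statistics import mean


def impute_gohai(scores):
    """Return 12 completed item scores, or None if the participant is excluded.

    `scores` is a list of 12 item scores (1-5) with None for a missing answer.
    """
    missing = [i for i, s in enumerate(scores) if s is None]
    if len(missing) >= 2:
        return None  # two or more GOHAI answers missing: exclude
    if not missing:
        return list(scores)  # nothing to impute
    # Assumption: substitute the respondent's own mean over the answered items.
    answered = [s for s in scores if s is not None]
    completed = list(scores)
    completed[missing[0]] = mean(answered)
    return completed
```

Substituting the item mean across all respondents in the group would be an equally plausible reading of "mean substitution"; only the `answered` line would change.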

Analyses

General psychometric properties

Answer proportions (%) of each of the GOHAI-NL items and of the GOHAI-ADD (additive) score and the GOHAI-SC (simple count) score [2] were calculated. The GOHAI-ADD score is the sum of all scores (score 1 to 5 per answer; total score from 12 to 60). The GOHAI-SC score is the sum of all items with response ‘sometimes’, ‘often’ and ‘always or nearly always’ (score 0 or 1 per answer; total score from 0 to 12), where a ‘1’-score indicates impairment for that item [2]. Item scores for questions 3, 5, and 7 were reverse-coded so that all items scored in the same direction; higher values indicating better OHRQoL.
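The two scores defined above can be illustrated with a short sketch. It assumes that for items 3, 5 and 7 a frequent experience is favourable (so 'very often or always' scores 5), while for the other nine items a frequent experience is unfavourable (so 'never' scores 5). It also assumes that, for the simple count, an item counts as impaired when its recoded score is 3 or lower, which corresponds to 'sometimes' or worse. The names `item_score`, `gohai_add` and `gohai_sc` are illustrative only.

```python
RESPONSES = ["never", "seldom", "sometimes", "often", "very often or always"]
REVERSED_ITEMS = {3, 5, 7}  # items coded in the opposite direction (per the text)


def item_score(item_number, response):
    """Score one answer from 1 to 5, higher meaning better OHRQoL."""
    freq = RESPONSES.index(response)  # 0 = never ... 4 = very often or always
    if item_number in REVERSED_ITEMS:
        return freq + 1  # assumption: frequent favourable experience scores high
    return 5 - freq      # frequent problem scores low


def gohai_add(answers):
    """Additive score: sum of the 12 item scores (range 12-60)."""
    return sum(item_score(i, r) for i, r in answers.items())


def gohai_sc(answers):
    """Simple count: number of items flagged as impaired (range 0-12)."""
    # Assumption: a recoded score <= 3 marks impairment for that item.
    return sum(1 for i, r in answers.items() if item_score(i, r) <= 3)
```

Here `answers` is a dictionary mapping item numbers 1 to 12 to one of the five response labels.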

Floor and ceiling effects were assessed at dimension level (with GOHAI dimensions: physical function, pain and discomfort, and psychosocial function), and at total score level (GOHAI-ADD; GOHAI-SC). Floor and ceiling effects were considered present when 20 % or more participants had the lowest (floor) or highest (ceiling) possible total score [21].
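The 20 % criterion above amounts to a simple proportion check. The sketch below is illustrative; the function name `floor_ceiling` and the returned dictionary keys are assumptions, not part of the cited method [21].

```python
def floor_ceiling(scores, lowest, highest, threshold=0.20):
    """Share of respondents at the lowest/highest possible score, with flags."""
    n = len(scores)
    floor_share = sum(1 for s in scores if s == lowest) / n
    ceiling_share = sum(1 for s in scores if s == highest) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        # An effect is flagged when 20 % or more sit at the extreme score.
        "floor_effect": floor_share >= threshold,
        "ceiling_effect": ceiling_share >= threshold,
    }
```

For the GOHAI-ADD total score the call would use `lowest=12` and `highest=60`; for the GOHAI-SC score, `lowest=0` and `highest=12`.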

Reliability

Reliability was assessed by measuring internal consistency and stability. Internal consistency was measured through correlation between item scores and the overall GOHAI-ADD score, using the corrected item-total score correlation (Spearman’s rank correlation coefficient) and Cronbach’s alphas. Overall Cronbach’s alphas > 0.7 and > 0.9 are considered indicative of acceptable consistency for comparisons at group level and at individual level, respectively [22, 23]. The dimensional structure of the GOHAI-NL was evaluated through assessment of correlations between item scores and the GOHAI-ADD score of the related dimension (subscale). Correlation coefficients > 0.4 are considered indicative of adequate item-subscale consistency and Cronbach’s alphas > 0.7 are considered indicative of adequate subscale-overall scale (total score) consistency [22]. Inter-item correlations were calculated in order to determine the extent to which the items were related to each other (the average inter-item correlation should ideally be between 0.2 and 0.5 [23, 24]), and to detect redundancy of items (inter-item correlation > 0.7) [25].
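As an illustration of the internal consistency measures named above, the following sketch computes Cronbach's alpha from its standard formula and a corrected item-total correlation (the item against the sum of the remaining items) using Spearman's rank correlation with average ranks for ties. It is a plain-Python sketch with hypothetical function names, not the software used in the study, and it assumes every item and rest score shows some variation across respondents.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for idx in range(pos, end + 1):
            ranks[order[idx]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Pearson correlation of the ranks (undefined if either has no variation)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(rows, j):
    """Spearman correlation of item j with the sum of all other items."""
    item = [row[j] for row in rows]
    rest = [sum(row) - row[j] for row in rows]
    return spearman(item, rest)
```

Removing the item from the total before correlating is what makes the coefficient "corrected": it avoids inflating the correlation by counting the item on both sides.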

Stability was assessed by measuring test-retest reliability through calculation of intraclass correlation coefficients (ICCs) (two-way mixed, single measure) in two subsamples consisting of randomly selected respondents from group A and group B. Participants in these subsamples were either sent a second questionnaire (group A) or interviewed a second time (group B) one to two weeks after they had returned their first questionnaire or had been interviewed, as it was expected that no major differences in oral status and oral health would have occurred during this time interval. ICC values > 0.75 were considered indicative of excellent stability, 0.40–0.75 of fair to good, and < 0.40 of poor stability [26].
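A two-way mixed, single-measure ICC can be computed from the mean squares of a subjects-by-occasions table. The sketch below assumes the consistency definition, often written ICC(3,1); the text does not say whether consistency or absolute agreement was used, so this choice is an assumption, as are the function names.

```python
def icc_3_1(data):
    """data: one row per respondent, one column per occasion (test, retest)."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    # Consistency form: a constant shift between occasions is not penalised.
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def stability_label(icc):
    """Interpretation bands as stated in the text [26]."""
    if icc > 0.75:
        return "excellent"
    if icc >= 0.40:
        return "fair to good"
    return "poor"
```

For test-retest data each row would hold a respondent's first and second GOHAI-ADD (or GOHAI-SC, or single item) score.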

Validity

Validity was assessed through convergent validity, discriminant validity, and known-group validity. Convergent validity refers to the degree to which two measures that should measure the same construct are related. This was determined through assessment of the correlations between GOHAI-ADD scores and the answers to two general questions on self-perceived oral health: 1. How do you perceive your oral health; 2. Are you satisfied with your oral health.

Discriminant validity refers to the degree to which two measures that should measure two similar, but conceptually different constructs are related. This was determined through the correlation between the GOHAI-ADD scores and 1. clinical treatment need; 2. presence of caries lesions; and 3. self-perceived general health. A low to moderate correlation was expected between higher GOHAI-ADD scores on the one hand and less clinical treatment need, absence of caries lesions, and better self-perceived general health on the other.

Known-group validity refers to the degree to which a measure is sensitive to differences between subgroups that are assumed to be reflected in the scores. This was assessed by comparing GOHAI-ADD scores between subgroups with different self-perceived treatment need (y/n), different numbers of natural teeth, and different dental/prosthodontic status (natural teeth without removable dental prostheses (RDPs), natural teeth with partial or complete RDPs, or no natural teeth (with or without complete RDPs)). Participants without self-perceived treatment need, with higher numbers of natural teeth, and without removable dental prostheses were assumed to have higher GOHAI-ADD scores.

Correlations were assessed by calculating Spearman’s rank correlation coefficients (r), with values > 0.5 indicating a strong correlation, 0.35 to 0.5 a moderate correlation, and 0.2 to 0.34 a low correlation [27, 28].
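The interpretation bands above translate directly into a small lookup. In this sketch the label for coefficients below 0.2 ("negligible") is an assumption, since the text defines no band for them, and values between 0.34 and 0.35 are assigned to the low band.

```python
def correlation_strength(r):
    """Classify a Spearman coefficient by its absolute size."""
    size = abs(r)
    if size > 0.5:
        return "strong"
    if size >= 0.35:
        return "moderate"
    if size >= 0.2:
        return "low"
    return "negligible"  # assumption: the text does not name this band
```

The absolute value is used so that correlations in either direction are judged by their magnitude only.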

Ethics, consent and permissions

The study was approved by the Medical Ethics Committee (CMO) of the Radboud University Medical Center Nijmegen (CMO ref. 2012/294). All participants were informed (in writing and personally) about the study and provided written consent prior to their participation.

Results

Translation

Translation procedures and discussions among the expert panel yielded no irresolvable issues concerning semantic, experiential or conceptual equivalence. The resulting GOHAI-NL is presented in Additional file 1.

Characterization of groups and subjects

The original sample consisted of 232 participants: 111 in group A and 121 in group B. After exclusion of subjects because of two or more missing GOHAI answers (group A; n = 2) or missing clinical data (group B; n = 3), group A included 109 participants and group B 118 (Table 1). For two participants in group A who each missed one GOHAI question, the missing value was replaced by mean substitution. In group A, 47.7 % of the participants were female, 60.6 % were dentate (at least one natural tooth) and the mean age was 73.1 ± 5.4 years. In group B, 57.6 % were female, 49.2 % were dentate and the mean age was 85.6 ± 7.0 years. Group A participants had a slightly higher SES (high 31.9 %, medium 50.5 %, low 17.8 % versus high 23.9 %, medium 40.2 %, low 35.9 % in group B).

Table 1 Sample Characteristics and Analyses per Sample

General psychometric properties

Answer proportions and percentage impairment (based on the number of answers ‘sometimes’, ‘often’ or ‘nearly always or always’) for groups A and B are listed in Table 2. The mean GOHAI-ADD score was 51.5 ± 7.5 (range 29–60) for group A and 52.4 ± 6.1 (range 26–60) for group B. The mean GOHAI-SC score was 1.9 ± 2.4 (range 0–9) for group A and 1.9 ± 1.9 (range 0–9) for group B. The items that showed the highest frequency of impairment were item 9 (30.2 %), item 2 (28.5 %), and item 5 (23.8 %) for group A and item 2 (48.3 %), item 7 (39.8 %), and item 5 (26.3 %) for group B, indicating that most impairment was reported in relation to oral function (items 1, 2 and 4) and psychosocial aspects (items 6, 7, 9, 10, 11) (Table 2). The items that showed the lowest frequency of impairment were items 6, 8, and 10 for both groups, indicating that the least impairment was reported in relation to psychosocial aspects, which was emphasized by the zero scores in answer categories ‘often’ and ‘nearly always or always’ of items 6, 9, 10, and 11.

Table 2 Answer proportions and percentage of participants scoring ‘impairment’ per GOHAI item for groups A and B

No floor or ceiling effects were detected at total score level (GOHAI-ADD): 7.3 % (group A) and 12.7 % (group B) had the highest possible score of 60, none had the lowest possible score of 12. The GOHAI-SC score however did show a floor effect: 42.2 % of group A participants and 28.0 % of group B participants had a total score of zero. At dimension level, there were no floor effects. However, ceiling effects occurred in two dimensions in group A and in all 3 dimensions in group B. Maximum scores were obtained by 37.6 % (physical function), 21.1 % (pain and discomfort) and 17.4 % (psychosocial function) of group A participants; and by 28.0 % (physical function), 28.8 % (pain and discomfort), and 28.0 % (psychosocial function) of group B participants.

Reliability

Cronbach’s alphas were 0.86 for group A and 0.80 for group B, indicating good overall internal consistency. The corrected item-total score correlations were between 0.4 and 0.7, indicating adequate correlation, except for item 3 in both group A (r = 0.34) and group B (r = 0.08), and for item 12 in group A (r = 0.20) (Table 3).

Table 3 Reliability analysis based on item-total score correlation and test-retest correlation

Mean inter-item correlations were within the acceptable range of 0.2–0.5 for both groups (group A: mean r = 0.34 ± 0.11; group B: mean r = 0.33 ± 0.08). Inter-item correlations > 0.7 occurred only in group A, between items 1 and 2 (r = 0.76) and between items 10 and 11 (r = 0.74), indicating possible redundancy.

Test-retest reliability (stability) was high for both groups: mean 0.88 (GOHAI-ADD) and 0.87 (GOHAI-SC) for group A, and 0.93 (GOHAI-ADD) and 0.89 (GOHAI-SC) for group B. ICCs of 0.62–0.88 in group A and 0.64–0.91 in group B indicated overall good stability, with least stability for items 3, 6 and 7 in group A, and for items 7, 9, and 11 in group B (Table 3).

The dimensional structure of the original GOHAI was only partly supported by Cronbach’s alphas and item-subscale correlation values (Table 4). Cronbach’s alphas for subscale-overall scale consistency were around the threshold of 0.7 for the dimensions ‘physical functioning’ and ‘psychosocial functioning’, and all item-subscale correlations within these dimensions were adequate (> 0.45) except for item 4, ‘trouble speaking clearly’, in group B. Items within the dimension ‘pain and discomfort’ (items 3, 5, 8, 12) were only weakly correlated with the dimension total score (correlation coefficients between 0.13 and 0.44). This dimension showed inadequate (< 0.7) subscale-overall scale consistency in both groups A and B.

Table 4 Correlation between item - subscale (dimension) scores and between subscale- overall scale scores

Validity

Table 5 shows the main results of comparisons between assumedly construct-related variables and GOHAI-ADD scores.

Table 5 Validity assessments: Spearman’s rank correlations between selected variables and GOHAI-ADD scores

Convergent validity: moderate to high (0.42–0.68), significant correlations in the expected direction were found between GOHAI-ADD scores and both self-perceived oral health and satisfaction with oral health for groups A and B.

Discriminant validity: low to moderate (0.24–0.42), but significant correlations in the expected direction were found between GOHAI-ADD scores and self-perceived general health (group A), clinical treatment need (groups A and B) and presence of caries (group B). The correlations of GOHAI-ADD scores with self-perceived general health (group B) and with presence of caries (group A) were not significant; these correlations were, however, in the expected direction.

Known-group validity: moderate to strong, significant correlations in the expected direction (group A: r = 0.48; group B: r = 0.53) were found between GOHAI-ADD scores and self-perceived treatment need. GOHAI-ADD scores were also significantly correlated in the expected direction with dental/prosthodontic status (r = 0.29) and number of natural teeth (r = 0.39) in group A, but not in group B.

Age, gender and SES were not statistically significantly correlated with GOHAI-ADD scores; however, higher SES levels tended to go with higher GOHAI-ADD scores in both groups (Table 5).

Discussion

Study design

This study tested psychometric properties of a Dutch version of the GOHAI, including validity and reliability. The original GOHAI was validated in a population of older, well-educated Americans. Although the GOHAI has been demonstrated to also be valid for younger and for less educated population samples [29, 30], it remains important that validity problems related to differences in language or culture are ruled out. This is why we followed an evidence-based approach [13] to assure conceptual equivalence between the GOHAI-NL and the original GOHAI.

Following the vast majority of GOHAI validation studies, we calculated GOHAI-SC scores in addition to the standardly used GOHAI-ADD scores. Although the use of GOHAI-SC scores implies some loss of information because it requires dichotomization of GOHAI answers, the GOHAI-SC provides a reliable, albeit crude, estimate of perceived oral impairments.

To our knowledge, our study is the first that validates the GOHAI in two distinct groups of older people using different administration methods. This choice was prompted by the evidence that the use of self-administered questionnaires in severely frail older populations does not always yield reliable results [31]. We therefore used personal interviews as the administration method in this group. When using personal interviews, any problems related to cognitive abilities of the respondent can be detected more easily. Although the GOHAI has been used in people with mild cognitive impairments [32, 33] it has not been validated for such populations. Therefore we do not recommend the use of the GOHAI-NL in cognitively impaired subjects except when closely related informants provide support in answering questions, and with explicit reference to this fact.

Limitations

The administration method used in group B may have induced a degree of social desirability bias, which would be expected to lead to scores that are ‘too high’. Reissmann [34] showed that OHIP outcomes obtained through personal interviews were 15 % lower (indicating fewer oral health-related complaints) than outcomes derived from self-administered questionnaires in a group of older adults. In the present study, we could not examine the effect of these two administration methods on GOHAI scores because these methods were applied in different samples. It is recommended to compare the effects of different methods of administration on acquired GOHAI scores within groups of frail and non-frail older people in future research.

Our study did not measure responsiveness to change in oral health status of the GOHAI-NL and hence additional longitudinal research is recommended to assess the sensitivity of the GOHAI-NL for monitoring oral health changes.

Results

In the translation procedure, the expert panel decided to use the Dutch equivalent of ‘very often or always’ instead of ‘always’ as in the original version. This follows the reasoning used in the translation of the German GOHAI [5]: ‘always’ (‘altijd’) in Dutch is very strictly interpreted as ‘not a moment without’, and the distance between the alternative response options ‘often’ and ‘very often or always’ is expected to be more equal to the distances between other consecutive response options, as intended in a Likert scale [35], than the (presumably larger) distance between ‘often’ and ‘always’.

The double negative phrasing of item 5 of the original GOHAI, “how often were you able to eat without discomfort”, has been documented to lead to inconsistent answers [30, 36]. In our study, item 5 had a relatively low item-total correlation in group A, and around 6 % of the answers to the (self-administered) item 5 were considered to be inconsistent with reference questions. The effect of double negative phrasing may be mitigated by adding reading notes to the questionnaire, which should be considered for all international GOHAI versions.

The mean GOHAI-ADD scores of 52.4 ± 7.5 in group A and of 52.5 ± 6.1 in group B in this study are similar to those found in Northwestern Europe and the USA (53 in Germany, 49.8 in Sweden, 46.4 in France, and 52.5 in the USA) [2, 30, 37, 38] but higher than those found elsewhere in the world (mean GOHAI-ADD scores between 18 and 49 in Romania, Hong Kong, Japan, Malaysia, Jordan, Turkey, India, Spain, Mexico and Iran; see also the overview in Rezaei et al. [39]). This is considered to be due not only to differences in oral health status, but also to variations in perceptions and expectations of oral health as well as in the self-reporting of oral health impacts, which are in part explained by cultural differences.

Although GOHAI outcomes of groups A and B are not meant to be compared because of the different administration methods used, the lack of difference between GOHAI-ADD scores is striking against the differences in clinically assessed oral health status between both groups (group B having worse oral health). The relatively high GOHAI scores of group B are most likely caused by social desirability bias (as addressed above) and by the so-called ‘disability paradox’ of older people that implies that they have better self-perceived oral health despite worse oral health status [40, 41].

Contrary to the OHIP [3, 6, 8], the GOHAI did not demonstrate floor or ceiling effects for the overall GOHAI-ADD score, which is the most used outcome measure for group comparisons of the GOHAI. At subscale (dimension) level, however, ceiling effects were detected. This means that the subscales are not capturing the full range of potential GOHAI responses in the population and that the ability to detect changes over time may be compromised [42].

Regarding reliability: both overall internal consistency (Cronbach’s alphas of 0.86 (group A) and 0.80 (group B)) and overall stability (ICCs of 0.88 (group A) and 0.93 (group B)) were good and comparable with values of other GOHAI studies [5, 8, 30, 37, 39, 43–46]. Items 3 (ability to swallow) and 12 (sensitivity to hot, cold and sweets) showed low correlation with the total GOHAI scores, which is in line with several previous validation studies [5, 30, 39, 47, 48]. Both items probably refer to a different construct than that intended to be measured by the GOHAI, which is oral health-related quality of life. One respondent in our study criticized item 12, saying that any human tissue is sensitive to hot and cold. Hence, apart from the questionable conceptual relation between sensitivity of teeth and tissue and oral health, ambiguous interpretation of this item is likely to contribute to the low item-total correlation found.

The subscale (dimension)-overall scale correlation for the dimension ‘pain/discomfort’ was too low to justify distinction of this dimension. Since this finding is supported by ample evidence against the original dimensional structure of the GOHAI [5, 36, 37, 45, 49], it may be worthwhile to reconsider these dimensions or opt for a one-dimensional scale.

Regarding validity: the GOHAI-NL was in good agreement with other measures of perceived oral health and demonstrated overall good convergent and adequate discriminant and known-group validity, supporting its construct validity. The low correlation between presence of carious lesions and GOHAI-ADD scores in the care-independent elderly was probably at least partly due to the low numbers of carious lesions encountered in group A, where only 7 out of 66 dentate participants had one or more carious teeth. The low correlations between GOHAI-ADD scores on the one hand and self-perceived general health and dental/prosthodontic status on the other that were found in the group of care-dependent elderly were unexpected. Although there is some evidence indicating that the correlation between general health and oral health is weaker in populations with impaired general health than in healthy populations, generally the association between perceived oral health and perceived general health is strong [50, 51]. With regard to prosthodontic status, the lack of correlation, which is in contrast with findings from the majority of, but not all, GOHAI validation studies (e.g. [5, 30, 37, 44] vs. [39, 49]), may be due to the adaptation of frail elderly to oral discomfort caused by removable dental prostheses [52, 53].

Conclusion

This study shows that the GOHAI-NL has satisfactory reliability and construct validity and can be used to measure OHRQoL in Dutch care-dependent and care-independent older people.