Introduction

Health-related quality of life (HRQOL) is important when exploring the general wellbeing of a population, as well as when evaluating specific health states. The Short Form-36 Health Survey (SF-36) is a self-reported multi-dimensional measure widely used across countries, with uses ranging from monitoring the burden of disease to examining the cost-effectiveness of a treatment [1]. To aid interpretation of the data, the developers recommend a normative-based scoring method using US weights, which provides a standard against which scores from other populations can be compared. However, there is much discussion about whether it is appropriate to use weights that may not be culturally specific, especially when there are differences in health states.

Cyprus is the third largest Mediterranean island, with around 300,000 Turkish Cypriot and 700,000 Greek Cypriot residents. There is a lack of population-level health data from Northern Cyprus due to unresolved political circumstances [2, 3] and, as a result, this part of the island is absent from any published health statistics from the region. To date, no published studies have provided population normative values for the SF-36 in Cyprus, and it is inappropriate to use published values from other countries given the differences in healthcare. A cross-sectional study in Limassol, Republic of Cyprus, examined HRQOL in the region [4], but it did not provide normative values for the island and instead used normative values derived from the U.S.A. Another cross-sectional study in Izmir, Turkey, used the SF-36 to provide population norms for the urban population [5]. However, that study took place just after a major economic crisis, which is likely to have affected the mental health component of the quality of life scores.

This paper seeks to address the lack of normative data applicable to Northern Cyprus and also serves to add further reference values for the Eastern Mediterranean region. We use data from the Cyprus Women’s Health Research (COHERE) Initiative to provide population norms for the eight SF-36-Health-Survey version 2 (SF-36v2) health domains as well as the two higher-order summary scores: the Physical Component Summary (PCS) and the Mental Component Summary (MCS). We examine the reliability and validity of the SF-36v2 in this dataset, and test construct validity by investigating statistical relationships between each domain and a range of demographic variables that are known to be related to health outcomes, including age, ethnicity, migration status, educational attainment, region of residence, civil status and employment status.

Methods

Study sample: the COHERE Initiative

Normative values for the SF-36v2 and the physical and mental health scoring coefficients were estimated using data collected as part of the COHERE Initiative. The COHERE Initiative is a population-based cross-sectional study that has recruited 7646 consenting women aged 18–55 in Northern Cyprus. The aim of COHERE is to determine the relative burden of women’s health conditions and related co-morbidities in women living in Northern Cyprus and, in doing so, to establish a women’s health cohort for future follow-up. In short, each participant completed the baseline questionnaire, an expanded version of the Endometriosis-Phenome-and-Biobanking-Harmonization-Project (EPHect) questionnaire [6], which included the SF-36v2 generic health measurement. Data were collected between January 2018 and February 2020 through a combination of household (16%, n = 1208) and workplace (84%, n = 6438) face-to-face visits (93%, n = 7128) as well as through online (7%, n = 518) recruitment methods [7]. Women aged between 18 and 55 at recruitment who were either citizens of Northern Cyprus or had been residing there for the past 5 years, and who were able to give informed consent, were eligible to participate in the study. Women were recruited from the 6 main districts in Northern Cyprus (Nicosia, Kyrenia, Famagusta, Morphou, Trikomo and Lefke), with recruitment targets set using geographic population densities.

We compared the age, educational attainment, civil status, employment status, and city of residence structure of our sample with the projected 2019 population figures for Northern Cyprus obtained from the Northern Cyprus Statistics Institution (available on request from: http://www.stat.gov.ct.tr/). Our sample was broadly representative of these projected values with the main differences being seen in age and education. We calculated weights for age and education and present all SF-36v2 scores obtained from the sample both unweighted and weighted.
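The text does not specify how the age and education weights were derived. As a minimal sketch, assuming a simple post-stratification approach (weight = population proportion divided by sample proportion within each category), the calculation could look like the following. The function name, the category counts and the population proportions are hypothetical and are not taken from COHERE or from the census projections.

```python
def poststratification_weights(sample_counts, population_props):
    """Return one weight per category: population share / sample share.

    sample_counts: dict mapping category -> number of respondents (hypothetical)
    population_props: dict mapping category -> population proportion (hypothetical)
    """
    n = sum(sample_counts.values())
    return {
        category: population_props[category] / (count / n)
        for category, count in sample_counts.items()
    }


# Hypothetical illustration only: these are NOT the COHERE or census figures.
sample_counts = {"18-25": 100, "26-35": 300, "36-45": 400, "46-55": 200}
population_props = {"18-25": 0.25, "26-35": 0.25, "36-45": 0.25, "46-55": 0.25}

weights = poststratification_weights(sample_counts, population_props)
for category, weight in weights.items():
    print(category, round(weight, 3))
```

Under this assumption, under-represented categories receive weights above 1 and over-represented categories receive weights below 1, and the weighted sample size equals the unweighted one.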

Ethics

The study was approved by the Oxford Tropical Research Ethics Committee (OxTREC) of the University of Oxford (OxTREC reference: 37–17). The study also received local ethics approval from the Eastern Mediterranean University Ethics Committee (ETK00-2017-0240).

SF-36v2 health domain subscales

The SF-36 has been validated previously in the Turkish language [8, 9]. Within the SF-36, there are 36 items that measure 8 domains as follows: limitations in physical activity due to health problems (physical functioning, 10 items), limitations in social activities due to physical or emotional problems (social functioning, 2 items), limitations in usual role activities due to physical health problems (role physical, 4 items), limitations in usual role activities due to emotional problems (role emotional, 3 items), wellbeing and psychological distress (mental health, 5 items), energy and fatigue (vitality, 4 items), bodily pain (bodily pain, 2 items) and perceptions of general health (general health, 5 items). The 36 items also include a single additional question asking about changes in health over the past year. Using the methods set out by Ware et al. [10, 11], the items within each of the above dimensions were coded, summed, and transformed (calculated by subtracting the lowest possible raw score from the actual raw score, dividing by the possible raw score range, and multiplying this by 100). This gave a scale from 0 (worst possible health state as measured by the questionnaire) to 100 (best possible health state).
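The raw-score transformation described above can be sketched as follows. This is an illustrative sketch only: the function name and the example scale limits are hypothetical, and item recoding (which must happen before summing) is not shown.

```python
def transform_scale(raw_score, lowest_possible, highest_possible):
    """Linearly rescale a summed raw scale score to the 0-100 range.

    Subtract the lowest possible raw score, divide by the possible raw
    score range, and multiply by 100.
    """
    score_range = highest_possible - lowest_possible
    return (raw_score - lowest_possible) / score_range * 100


# Hypothetical example: a 10-item scale with each item coded 1-3,
# so summed raw scores can range from 10 to 30.
print(transform_scale(25, lowest_possible=10, highest_possible=30))  # 75.0
```

A respondent at the lowest possible raw score therefore receives 0, and one at the highest possible raw score receives 100.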

We also calculated the 8 health domains using norm-based scoring using a T-score transformation where the mean was set to 50 and the standard deviation to 10 in the current sample.
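As a minimal sketch of the norm-based T-score transformation, each 0–100 domain score can be standardised against the sample mean and standard deviation and then rescaled to mean 50 and standard deviation 10. The function name and the example scores are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption of this sketch.

```python
from statistics import mean, pstdev


def to_t_scores(scores):
    """Rescale scores so that, within this sample, mean = 50 and SD = 10."""
    sample_mean = mean(scores)
    sample_sd = pstdev(scores)  # assumption: population SD of the sample
    return [50 + 10 * (score - sample_mean) / sample_sd for score in scores]


# Hypothetical 0-100 domain scores for five respondents.
domain_scores = [40.0, 55.0, 70.0, 85.0, 100.0]
t_scores = to_t_scores(domain_scores)
print([round(t, 1) for t in t_scores])
```

With this scoring, a T-score above 50 indicates a domain score above the sample average, and each 10 points corresponds to one standard deviation.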

SF-36v2 component summary scores

Following the methods set out in the SF-36v2 manual, the data were factor analysed to produce scoring coefficients for PCS and MCS [11]. In short, we used principal components analysis to produce factor loadings, applied an orthogonal varimax rotation to these loadings and then obtained factor score coefficients. Calculating PCS involved multiplying each SF-36 scale z-score (calculated by subtracting the scale mean from the scale score and dividing the difference by the corresponding scale standard deviation) by its respective physical factor score coefficient and summing the eight products; MCS was calculated in the same way using the mental factor score coefficients. Finally, a T-score transformation was used to standardise the scores, whereby the mean was set to 50 and the standard deviation to 10. PCS and MCS are two summary measures derived from all eight scales and allow for further interpretation of the results.
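The aggregation step can be illustrated with a short sketch. It does not reproduce the principal components analysis or the varimax rotation; it assumes the factor score coefficients are already available. The coefficients and domain scores below are hypothetical placeholders, not the values estimated in this study or the US values, and rescaling the aggregate within the sample to mean 50 and SD 10 is an assumption of this sketch.

```python
from statistics import mean, pstdev

DOMAINS = ["PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH"]


def z_scores(values):
    """Standardise one domain across respondents (sample mean 0, SD 1)."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]


def summary_t_scores(sample, coefficients):
    """Weighted sum of domain z-scores per respondent, rescaled to T-scores.

    sample: dict domain -> list of 0-100 scores, one per respondent
    coefficients: dict domain -> factor score coefficient (hypothetical here)
    """
    n = len(sample[DOMAINS[0]])
    z = {d: z_scores(sample[d]) for d in DOMAINS}
    aggregate = [sum(coefficients[d] * z[d][i] for d in DOMAINS) for i in range(n)]
    m, sd = mean(aggregate), pstdev(aggregate)
    return [50 + 10 * (a - m) / sd for a in aggregate]


# Hypothetical scores for four respondents and placeholder physical coefficients.
sample = {
    "PF": [95, 80, 100, 60], "RP": [90, 75, 100, 50],
    "BP": [80, 70, 90, 40], "GH": [70, 65, 85, 45],
    "VT": [55, 60, 70, 50], "SF": [75, 80, 90, 60],
    "RE": [70, 85, 95, 65], "MH": [65, 80, 85, 60],
}
physical_coefficients = {
    "PF": 0.40, "RP": 0.35, "BP": 0.30, "GH": 0.25,
    "VT": 0.05, "SF": 0.00, "RE": -0.20, "MH": -0.20,
}
print([round(t, 1) for t in summary_t_scores(sample, physical_coefficients)])
```

An MCS-style score would be obtained from the same function by passing the mental factor score coefficients instead.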

Although no previous studies have investigated the SF-36v2 measures in Northern Cyprus, a cross-cultural adaptation of the survey was shown to be successful in Turkey and demonstrated an acceptable level of reliability and validity in Turkish-speaking individuals [5, 8]. To investigate whether the SF-36v2 questionnaire was valid in Northern Cyprus, reliability and validity were assessed. We also calculated summary measures using US-specific health domain subscale scores from 1998 and US-specific factor score coefficients from 1990 [12].

Reliability

Cronbach’s alpha was used to examine internal consistency reliability; a value of > 0.7 was considered satisfactory. To assess whether the summary measures, PCS and MCS were reliable, the reliability of each of the eight subscales, the covariances among them and the factor score coefficients were calculated.
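For illustration, Cronbach's alpha for a single subscale can be computed from item-level responses with the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the summed score), where k is the number of items. The function name and the responses below are hypothetical; this sketch covers only the subscale alpha, not the reliability of the PCS and MCS composites.

```python
from statistics import variance


def cronbach_alpha(item_responses):
    """Cronbach's alpha for one subscale.

    item_responses: list of respondents, each a list of k item scores.
    """
    k = len(item_responses[0])
    item_variances = [
        variance([respondent[i] for respondent in item_responses])
        for i in range(k)
    ]
    total_variance = variance([sum(respondent) for respondent in item_responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five respondents answering a three-item subscale.
responses = [[3, 4, 3], [5, 5, 4], [2, 2, 3], [4, 5, 5], [1, 2, 1]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3), "satisfactory" if alpha > 0.7 else "below threshold")
```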

Validity

Principal components analysis, item-subscale correlations (item-rest correlations for the subscales and their respective items) and inter-scale correlations (Spearman correlations) were used to assess construct validity. A value of > 0.4 for item-rest correlations was considered satisfactory. If the correlation between an item and the sum of the other items in its respective subscale (item-rest correlation) was shown to be significantly higher than its correlation with other subscales (item-subscale correlations), then the item’s inclusion in its subscale was supported. If the correlation between two subscales was less than their reliability coefficients (Cronbach’s alpha), this was taken as evidence of unique reliable variance measured by the respective subscales.
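An item-rest correlation correlates each item with the sum of the remaining items in the same subscale. The sketch below uses a Pearson correlation, which is an assumption (the text specifies Spearman only for the inter-scale correlations); the function names and the responses are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def item_rest_correlations(item_responses):
    """Correlate each item with the sum of the other items in its subscale."""
    k = len(item_responses[0])
    correlations = []
    for i in range(k):
        item = [respondent[i] for respondent in item_responses]
        rest = [sum(respondent) - respondent[i] for respondent in item_responses]
        correlations.append(pearson(item, rest))
    return correlations


# Hypothetical responses: five respondents answering a three-item subscale.
responses = [[3, 4, 3], [5, 5, 4], [2, 2, 3], [4, 5, 5], [1, 2, 1]]
for index, r in enumerate(item_rest_correlations(responses), start=1):
    print(f"item {index}: r = {r:.2f}", "ok" if r > 0.4 else "below 0.4")
```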

Sociodemographic characteristics

We used self-reported data from the COHERE questionnaire to obtain potential covariates. Age was calculated by subtracting birth year from the survey completion date and categorised into the following four groups: 18–25, 26–35, 36–45 and 46–55. Ethnicity was split into 3 groups: Turkish Cypriot, Turkish, and Mixed/Other (women reporting 2 ethnicities and women reporting a single ethnicity other than Turkish Cypriot or Turkish). Migration status was classified as non-migrant (born in Northern Cyprus or parents born in Northern Cyprus) or migrant (not born in Northern Cyprus and therefore a recent migrant to the island). Highest educational achievement (primary or middle school, high school or post-secondary, undergraduate and postgraduate), employment status (employed or unemployed), civil status (single, married, divorced/widowed) and city of residence (Nicosia, Famagusta, Trikomo, Lefke, Morphou, Kyrenia) were also assessed using the questionnaire data.

Missing data

As we used a large sample size (n = 7646) and the intention of this primary analysis was to obtain the factor loadings to construct the summary scores, data substitution algorithms were not used. In addition to this, there has been speculation that substitution of mean values may have either a conservative or attenuated bias [13]. Given these are the first normative scores to be calculated for women in Northern Cyprus between the ages of 18–55, it is also important to report ‘actual’ scores, so that normative scores here can be used for future analyses and are not differentially attenuated according to the amount of missing data that has been recorded [14]. This approach is consistent with previously published studies [15, 16].

Statistical analyses

Descriptive statistics for each of the eight subscales were calculated for the included sample and for subsamples defined by age, ethnicity, migration status, educational achievement, civil status, city of residence and employment status. Differences in the means of the eight health subscales between subsamples were tested using linear regression. Regressions were computed both crudely and after adjustment for age, as age is usually correlated with the covariates examined here. Regressions for ethnicity and migration status were additionally adjusted for educational achievement and employment classification, as these two characteristics are often correlated with non-native status. Statistical analyses were carried out using Stata SE version 16.0.810 (StataCorp LP, College Station, Texas, USA) and RStudio.
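The difference between a crude and an age-adjusted regression can be illustrated with a small sketch. The analyses in this study were run in Stata and R; the Python below is only an illustration, using a hand-written ordinary least squares solver and hypothetical data constructed so that the score falls by 0.5 per year of age and employment adds 4 points. It does not compute the global test p values.

```python
def ols(design, outcome):
    """Ordinary least squares via the normal equations (Gauss-Jordan).

    design: list of rows, each including a leading 1 for the intercept.
    Returns the list of fitted coefficients.
    """
    p = len(design[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(row[i] * row[j] for row in design) for j in range(p)]
        + [sum(row[i] * y for row, y in zip(design, outcome))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]


# Hypothetical data: employed women happen to be younger on average.
ages = [20, 30, 40, 50, 25, 35, 45, 55]
employed = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [90 - 0.5 * age + 4 * e for age, e in zip(ages, employed)]

crude = ols([[1, e] for e in employed], scores)
adjusted = ols([[1, e, age] for e, age in zip(employed, ages)], scores)
print("crude employment effect:", round(crude[1], 2))
print("age-adjusted employment effect:", round(adjusted[1], 2))
```

In this constructed example the crude employment coefficient is larger than the age-adjusted one because age is associated with both employment and the score, which is the reason for reporting both models.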

Results

Sample population and data completion

Our sample population consists of 7646 women with a mean age of 36.9 years. Women in our cohort were more likely to be Turkish Cypriot (73.8%), be native to the island (74.6%), have a university degree (52.7%), be married (66.5%), be in paid employment (81.2%) and reside in the capital city, Nicosia (43.2%) (Table 1). The percentage of women for whom scale scores could not be calculated due to incomplete items for a particular scale was low, ranging from 0.5 to 2.9% (data not shown).

Table 1 Comparison between various demographics of participants within the Cyprus Women's Health Research (COHERE) Initiative and the census for Northern Cyprus

Validation of the SF-36v2 questionnaire for women in Northern Cyprus

Regarding reliability, Cronbach’s alpha coefficients were found to be satisfactory (> 0.70) for all health domains (Table 2), apart from general health (0.69). Reliability of the summary measures was 0.89 for both PCS and MCS.

Table 2 SF-36v2 health domain subscales: mean 0–100 scores with 95% confidence intervals, standard deviation, percentage floor, percentage ceiling, factor score coefficients, and Cronbach's alpha

When looking at validity, item-rest correlations were all satisfactory apart from PF10 of the physical functioning perception subscale (0.38). The principal components analysis produced two factors with eigenvalues > 1, which indicates a two-factor structure. All items were satisfactory when looking at differences between item-rest correlations and inter-scale correlations, i.e. correlations were higher between individual items and their respective subscales than between individual items and the other 7 subscales. The correlations between subscales were also lower than their respective Cronbach’s alpha values, which suggests that there is unique reliable variance (Table 2). Ceiling effects were highest in the physical functioning and role physical subscales (39–42%), with the bodily pain subscale having the highest floor effect (1.20%).
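Floor and ceiling effects are conventionally the percentages of respondents scoring at the lowest (0) and highest (100) possible values of a transformed scale. A minimal sketch, with a hypothetical function name and hypothetical scores:

```python
def floor_ceiling_percentages(scores):
    """Return (percentage at floor, percentage at ceiling) for 0-100 scores."""
    n = len(scores)
    floor_pct = 100 * sum(1 for s in scores if s == 0) / n
    ceiling_pct = 100 * sum(1 for s in scores if s == 100) / n
    return floor_pct, ceiling_pct


# Hypothetical transformed scores for eight respondents.
scores = [0, 45, 60, 75, 90, 100, 100, 100]
floor_pct, ceiling_pct = floor_ceiling_percentages(scores)
print(f"floor: {floor_pct:.1f}%, ceiling: {ceiling_pct:.1f}%")
```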

Health domain subscales

Younger women had better physical health (PF, RP, BP) than older women, but mean scores for the mental health subscales (VT, RE, MH) increased with age (Table 3). Though GH increased with age, this was not statistically significant. When considering ethnicity, women who self-reported as Turkish had the lowest scores across all domains except MH, RE and SF, and all but PF remained statistically significant after adjustment for age (Table 4). Women who were not migrants had higher subscale scores for all the domains compared to women with a migration background (PF, BP, GH, VT, SF, RE, MH) (Supplementary Table 2). After adjustment for age, married women generally had the best mental health (MH, RE, SF) but the worst physical health (PF, RP, BP) (Supplementary Table 3). District of residence did not appear to significantly affect physical or mental health domains (Supplementary Table 4). As educational attainment increased, mean physical health (PF, RP, BP, GH) also increased after adjustment for age, as did the mental health domains (VT, RE and MH) (Table 5). Generally, women who were employed had significantly better physical subscale scores (PF, RP, BP, GH), with only the VT mental health score remaining significant after adjustment for age (Table 6).

Table 3 SF-36v2 health domain subscales: mean 0–100 scores with 95% confidence intervals, standard deviation, p values from linear regression (global test) according to age (18–25, 26–35, 36–45, 46–55)
Table 4 SF-36v2 health domain subscales: mean 0–100 scores with 95% confidence intervals, standard deviation, p values from linear regression (global test) without and with adjustment for age according to ethnicity (Turkish Cypriot, Turkish, Other/Mixed)
Table 5 SF-36v2 health domain subscales: mean 0–100 scores with 95% confidence intervals, standard deviation, p values from linear regression (global test) without and with adjustment for age according to educational attainment (Primary or middle school, high school or post-secondary, undergraduate degree, postgraduate degree)
Table 6 SF-36v2 health domain subscales: mean 0–100 scores with 95% confidence intervals, standard deviation, p values from linear regression (global test) without and with adjustment for age according to employment status (In paid employment, not in paid employment)

The 8 domains presented as norm-based scores can be seen in the Appendix (Supplementary Tables S4-S10).

SF-36v2 summary measures PCS and MCS

Principal components analysis with orthogonal rotation was used to factor analyse the data and led to a two-factor solution: the factor labelled PCS had an eigenvalue of 4.00 and the factor labelled MCS an eigenvalue of 1.14. We found better physical health (PCS) in younger women and generally better mental health in older women (Table 7). Higher scores were seen in women with a higher educational achievement and those in paid employment. Single women appeared to have the best physical health and the worst mental health (mean age in single women = 27.03 (SD 7.20) vs married women = 39.74 (SD 8.16)). Women residing in Morphou had the lowest PCS scores and those in Famagusta the lowest MCS scores, with women in Kyrenia having the highest PCS scores and those in Lefke the highest MCS scores. These associations were not significant after adjustment for age (mean age (SD) in each district: Famagusta 35.5 (9.8), Kyrenia 37.2 (9.4), Lefke 38.1 (9.5), Morphou 37.6 (10.2), Nicosia 37.4 (9.3) and Trikomo 36.1 (10.3)) (Table 7). Turkish Cypriot women had both the best physical (together with women of other/mixed ethnicities) and mental health scores after adjustment for age, as did women with a non-migration background. After further adjustment for education and occupation classification, the associations between migration status and the health scores were especially attenuated; the associations between ethnicity and the two health scores were also weakened but remained significant for MCS (not shown).

Table 7 PCS, MCS: mean (T-scores) with 95% CI, SD and p values from linear regression (global test) without and with adjustment for age for the overall sample (n = 7089) and subsamples according to age, ethnicity, educational achievement, civil status, city of residence, employment status*

After weighting our data by age and education we did not see great differences in the results (Supplementary Tables S11-S19).

PCS estimates calculated using US coefficients were broadly similar to those generated using our sample; however, we found that the MCS estimates were substantially lower (Supplementary Tables S20-S21).

Discussion

We have shown that the SF-36v2 questionnaire is both reliable and valid in women between the ages of 18–55 residing in Northern Cyprus in evaluating HRQOL. We found better physical health in younger women and better mental health in older women. Turkish Cypriot women and non-migrant women had both better mental and physical health, and HRQOL was highest in those in paid employment as well as in those with a higher educational achievement. Here we have provided the first normative values for women in Northern Cyprus aged between 18 and 55 and our analysis suggests that choosing culturally specific weighting coefficients is important when investigating HRQOL.

Reliability and validity

Internal consistency reliability was high (above 0.8) for five of the eight scales, with the general health scale being the only one to fall short of the accepted Cronbach’s alpha level of 0.7 (0.69). However, we believe this to be sufficiently close to argue that this scale also had adequate internal consistency reliability. The highest ceiling value in our sample was 41.8%, for the role physical scale, suggesting that the SF-36v2 was well interpreted and suited to our population in Northern Cyprus.

Physical and mental health domains

This study has shown that the highest score of the eight health domains for participants in COHERE was physical functioning and the lowest was vitality. This pattern is consistent with various other studies conducted in a number of high-income countries such as the United Kingdom [16], Switzerland [17] and the United States [11], as well as countries within the Mediterranean region such as Turkey [5] and Greece [18]. Although the scores we present here vary from those presented by other countries, this does not necessarily mean that there are international health differences; there are a number of reasons normative scores may differ between countries, such as differences in culture, expectation of health and mode of administration of the questionnaire.

We observed that as age increased, mean physical health decreased and mean mental health increased, as seen in various other studies [11, 16, 17]. Education and employment are two demographics that can be used as proxies for the sociodemographic position of people within society. As our results showed higher mean scores in those who had obtained higher educational qualifications as well as in those who were in paid employment, we can be confident that the SF-36v2 was capable of detecting these known differences amongst different socioeconomic positions.

Compared to migrant women, non-migrants had the highest mean scores after adjustment for age only, consistent with previously published studies [17, 19]. Once education and occupation type had been adjusted for, the associations were no longer statistically significant, suggesting that the observed associations may be partly explained by social disadvantage associated with the lower socioeconomic status of immigrants in this sample. When examining ethnicity, after adjustment for age, education and occupation class, the statistically significant association between ethnicity and mental health remained, with Turkish Cypriot women having the highest mean scores. Most studies examining HRQOL in non-natives show that they experience more stressors than their native counterparts, not only due to socioeconomic differences but also due to other migration-specific difficulties [20, 21].

The area most similar to Northern Cyprus both geographically and culturally to have produced normative values for the SF-36v2 is Turkey [5]. Mean health domain scores in COHERE were lower for all scales, apart from physical functioning (88.3 vs 80.6). The study in Turkey was focussed on an urban region with a small sample size (670 women) which is not representative of the population of Turkey. In addition to this, five of the eight scales had a median ceiling score of 100 (perfect) which is unusually high. We believe this is why normative HRQOL scores should be both country and culturally specific.

Strengths and limitations

The cross-sectional design of our study means that we cannot infer causality of measured variables. Gandek and Ware [22] recommend that the minimum sample size used to produce country specific normative values for HRQOL using the SF-36v2 should be between 2500 and 3000 respondents. Here our large sample size (n = 7646) means we can be confident in the accuracy of our values and in addition to this, the demographic and social background characteristics of participants in our sample are broadly representative of women between the ages of 18–55 in Northern Cyprus. However, there is some over-representation of women with university degrees and in employment compared to the general population and this should be considered when carrying out any further analyses.

Although the SF-36v2 is a self-administered questionnaire and may therefore suffer from reporting bias, it is reliable and valid as well as being a widely used tool to assess HRQOL. Our study is unique in that it provides normative values for a female population. It is well established that HRQOL is significantly lower in women than in men, and therefore using female-specific normative values from a population of premenopausal women is especially advisable when investigating the impact of women’s reproductive diseases, such as endometriosis, on HRQOL. The main aim of the COHERE Initiative is to investigate reproductive health conditions, and so our study is restricted to women between the ages of 18 and 55. It is therefore not generalisable to women outside this age range or to men, and further research in these groups would be needed to discover whether similar patterns exist and to provide normative values for all age groups.

Conclusion

Here we present the normative values for women aged 18–55 in Northern Cyprus using the SF-36v2 questionnaire. In accordance with the literature, we saw higher mean physical health scores in younger women and higher mean mental health scores in older women. Non-Turkish Cypriot women and migrants had lower mean scores for both domains, as did those who had lower educational attainments and those who were not in paid employment. This research will allow future studies to measure HRQOL as assessed by the SF-36v2 questionnaire using female-specific data in Northern Cyprus, which we argue is essential in order to investigate the impact women’s health diseases have on HRQOL.