Background

Globally, the number of people forcibly displaced by conflict is at the highest levels ever recorded. At the end of 2015, it was reported that the number of refugees had reached 21.3 million [1]; an increase of 1.7 million from 2014 [2]. A further 3.2 million were asylum seekers [1]. Conflicts, violence and human rights violations, particularly in the Middle East and North Africa, are forcing millions of people to leave their homes and to flee from destruction and persecution. A refugee has a well-founded fear of persecution for reasons of race, religion, nationality, political opinion or membership in a particular social group [1].

Resettlement to a third country may be offered to those refugees who cannot return to their home country for fear of persecution or when they cannot be offered permanent residence in the country in which they are currently living [3]. Resettling refugees is taking on added importance internationally with massive movements of people across continents due to global political and economic instability. The long term health and settlement prospects of refugees are a matter of continuing relevance for receiving nations as refugees are recognized as one of the most vulnerable groups in our society in terms of risk for poor health [4,5,6,7,8,9,10]. They often have unique health needs reflecting the epidemiology of diseases in their country of origin [11], inadequate and disrupted health care, and stressors experienced during the migration and resettlement periods, including trauma, torture and poverty. Exposure to pre-migration trauma may have a lasting impact on their psychological and physical well-being, and high levels of stress due to assimilation into a new society may also affect the health of resettled refugees [12,13,14,15,16,17,18].

The collection of refugee health data relies heavily on subjective measures of refugee health because they are less resource intensive than clinical assessments, less burdensome on participants and can capture individual perceptions of health, such as psychosocial factors, geographic location and individual characteristics [19]. A large number of empirical studies have demonstrated that a person’s own appraisal of his/her general health is a powerful predictor of future morbidity and mortality, even after controlling for a variety of physical, socio-demographic and psycho-social health status indices [20,21,22,23,24,25]. Therefore, measurement of self-rated health within resettling refugees may well serve as a surrogate for more traditional clinical assessments of health. Despite this, the selection of robust and appropriate subjective measurement tools for use among this population group is challenging due to methodological variation in the use of assessment tools across refugee health studies. This has meant that refugee health data are often conflicting and difficult to interpret and compare. Several studies have reviewed health measurement tools used in refugee populations [26,27,28,29] and all have suggested that the measurement tools being used in refugee health research often lack the validity and rigour required to assess constructs of psychological wellbeing, health and other factors that are associated with resettlement outcomes.

This review will build on these earlier reviews by providing an update on the health measurement tools being used in adult refugee research. However, in contrast with these earlier systematic studies, this review will not be limited to specific health concepts, such as trauma [26, 28], or to refugee gender [27], and we will include refugees living within Western and non-Western nations, allowing a broader investigation of the literature reporting the use of these measurement tools. The development of such a comprehensive knowledge base is required in the literature, particularly at this time when refugee movement across the globe is unprecedented and receiving nations are faced with increased pressure to provide immediate and long term health care that is appropriate, effective and comparative.

Study aim

The aim of this scoping review is to describe the self-report health measures which have been used in studies of adult refugees living in the community. For each self-report measure identified, we will note any reliability and validity testing within refugee groups as well as the settings in which the measures have been used. By doing this, we aim to gain a better understanding of how these measures are used in refugee health research. This will allow us to address the challenges of selecting appropriate assessments to measure health within refugee groups.

Methods

A scoping review was conducted between October 2016 and February 2017. Scoping reviews are rigorous, with methods that allow for replication, but findings are not synthesized or aggregated to the extent customary in systematic reviews. They allow for the inclusion of diverse study designs and involve an iterative search process where search terms may evolve during the review [30]. A scoping review can help to identify gaps in the evidence base and summarize a broader range of research findings [30]. We adopted the Arksey and O’Malley (2005) five stage methodological framework for scoping reviews:

  1. Stage 1: Identifying the research question

  2. Stage 2: Identifying the relevant studies

  3. Stage 3: Selecting studies

  4. Stage 4: Charting the data

  5. Stage 5: Collating, summarizing and reporting the results

Stage 1: Identifying the research question

Our research question was: “What is known about the measurement tools used to measure self-rated health within resettled refugee groups?”

More specifically, the present review aims to address the following questions:

  1. What settings have been described in studies that have measured self-rated health among refugee populations?

  2. What self-rated health measurement tools have been used in these studies?

  3. Which of these self-rated health measurement tools have been evaluated for validity and reliability within refugee populations?

Stage 2: Identifying the relevant studies

We searched five electronic databases (Medline, CINAHL, EMBASE, SCOPUS and PsycINFO) for English language papers published between January 2000 and March 2017. The following search terms were employed: “refugee”, “asylum seeker”, “settlement”, “humanitarian”, “self-perceived health”, “subjective health”, “mental health”, “mental disorder”, “physical health”, “health status”, “surveys and questionnaires”, “scales”, “screening”, “measures” and “instruments”. The initial searches were performed in December 2016 and subsequently re-run in Medline in March 2017 to identify additional relevant studies published in 2016 and 2017. No additional studies were identified. This search was supplemented with a general Internet search using Google and Google Scholar to make the search as comprehensive as possible.

Inclusion and exclusion criteria

For a measurement tool to be included in the review, two eligibility steps were required. First, published peer-reviewed articles were required to meet the following criteria:

  1. Published in the English language.

  2. Published between January 2000 and March 2017.

  3. Focused on the collection of self-rated physical and mental health from refugees or asylum seekers using specific health assessment tools.

  4. Focused on adults (defined as those aged 15 years and above).

  5. Focused on community living refugees and asylum seekers, including those living within refugee camps which offer long-term or permanent settlement, and located outside of the refugee or asylum seeker’s home country.

  6. Included different population groups other than those of interest (e.g. immigrants) but reported refugee and asylum seeker data separately.

  7. Studies were excluded if they were classified as an incomplete article (e.g. editorial, commentary, letter or conference abstract), were review articles or reported data already used in another included article.

Second, the measurement tool(s) reported in the articles using eligibility criteria 1–7 were required to meet the following inclusion criteria:

  8. Had been specifically tested for validity and/or reliability within refugee groups. (Given the limited volume of research in this area, the search was not limited by date of publication.)

  9. Had not been specifically tested for validity and/or reliability within refugee groups but were described in studies where:

     • the refugee sample size was ≥ 150; or

     • the tool(s) were described in 5 or more research articles.

These tools were included based on their use in research studies of what the study authors considered large sample sizes or in an acceptable number of research studies.
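The second eligibility step can be read as a simple decision rule. The sketch below is one possible encoding of criteria 8 and 9, not the procedure the reviewers actually implemented; the `ToolEvidence` record and its field names are hypothetical and were introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToolEvidence:
    """Hypothetical summary of how one tool appears in the eligible articles."""
    name: str
    tested_in_refugees: bool      # criterion 8: validity and/or reliability tested in refugee groups
    largest_refugee_sample: int   # largest refugee sample among studies describing the tool
    n_articles: int               # number of eligible articles describing the tool


def meets_tool_criteria(tool: ToolEvidence) -> bool:
    """Return True if the tool satisfies criterion 8 or criterion 9."""
    if tool.tested_in_refugees:
        return True
    # Criterion 9: untested tools qualify through sample size or frequency of use.
    return tool.largest_refugee_sample >= 150 or tool.n_articles >= 5
```

Under this reading, a tool is excluded only when it fails all three conditions at once: it is untested in refugee groups, it appears only in samples below 150, and it is described in fewer than 5 articles.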

Stage 3: Selecting studies

The screening and selection procedure is shown in Fig. 1 using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart [31]. It was difficult to distinguish between projects and papers (i.e. projects are those where multiple papers were published from the one study, as opposed to single papers from single projects) and, based on this, the decision was made to look at papers, not projects, for the review.

Fig. 1 PRISMA diagram: search and selection process. *Tools/instrument(s) were assessed as meeting the inclusion criteria if they reported being tested for validity and/or reliability within refugee groups, or if the refugee sample size was ≥150, or if described in 5 or more research articles

A total of 390 references were obtained from the initial search of which 114 studies were duplicates. One author (AD) screened the remaining 276 references, applying the inclusion criteria to the titles and abstracts where possible. Full text review was conducted on 193 articles with a further 10 articles excluded because they did not report on resettled refugees; were studies that included different population groups other than those of interest (e.g. immigrants) but did not report refugee and asylum seeker data separately; did not specify the name of the measurement tool(s) used to collect refugee health data; or were studies researching the health of internally displaced refugees. The final set of articles was 183 papers with 52 measurement tools identified. To reduce bias caused by human error, two authors (GR and JE) repeated 10% of the study selection process. Rates of agreement were consistently high between the three reviewers, with any discrepancies resolved through discussion.
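The counts in the selection process can be checked with simple arithmetic. The sketch below uses only the figures reported above; the number of references excluded at title and abstract screening is not stated in the text and is derived here by subtraction, so it should be treated as an inference rather than a reported result.

```python
# Counts reported in the text for the search and selection process.
identified = 390
duplicates = 114
full_text_reviewed = 193
excluded_at_full_text = 10

# Titles and abstracts screened after removing duplicates.
screened = identified - duplicates
# Derived by subtraction; this figure is not stated in the text.
excluded_at_screening = screened - full_text_reviewed
# Final set of included articles.
included = full_text_reviewed - excluded_at_full_text

print(f"screened={screened}, excluded at screening={excluded_at_screening}, included={included}")
```

The derived totals of 276 screened references and 183 included articles agree with the figures given in the text.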

Stage 4: Charting the data

The 183 articles were charted in Microsoft Excel 2010, using the study characteristics (e.g. author information, publication year, study design), participant characteristics (e.g. demographic data), measurement tool name and statistical outcome information (e.g. measurement tool validation and reliability data).

Stage 5: Collating, summarizing and reporting the results

We devised the following categories of focus: self-rated health measurement tool name and description; health variable (e.g. mental health, general health); study setting (e.g. clinic, community, refugee camp); study population (e.g. refugee, asylum seeker); and reliability/validation testing of the measurement tool(s) conducted within refugee or asylum seeker populations. Common themes were identified across articles, and when possible, articles were compared.

Definitions

For the purpose of this scoping review, the following definitions have been applied.

Study population

Refugees and asylum seekers included in this study are those who were resettled in a community in a country outside of their own. This included those who had been offered permanent residency in a third country (either through a United Nations High Commissioner for Refugees (UNHCR) mandate or permanent visas) or had been living in a refugee camp for over 12 months.

Self-rated health measurement

Self-rated health measures are defined as any report on the status of an individual’s health condition that comes directly from the individual, without interpretation of the individual’s response by a clinician, researcher or anyone else [32]. Modes of data collection include interviewer-administered measurement tools, self-administered measurement tools, computer-administered measurement tools or interactively administered measures [33].

Reliability

Reliability was defined as the extent to which scores are the same for repeated measurement under different conditions [34].

Validity

Validity is the extent to which a tool measures what it is supposed to measure and performs as it is designed to perform [34].

Results

The studies

A total of 183 articles were reviewed for this study (see Table 1 for a summary). Most of the studies (123 in total) were conducted among resettled refugees living in community settings within Western nations. Studies investigating the health of refugee populations living in non-metropolitan areas were limited, apart from 19 studies which were conducted in refugee camps. A large proportion of the research focused on populations from the Middle East, although Syrian refugees were under-represented with only 5 studies investigating the health status of this refugee group. Refugee mental health status was the focus of most of the reviewed studies (n = 153), particularly symptoms of depression, anxiety and post-traumatic stress disorder (PTSD). Studies were predominantly cross sectional in design and had sample sizes of fewer than 500 participants. Recruitment mainly involved both sexes, with the exception of 10 studies where only females were invited to participate and 2 studies where only males were recruited. The measurement tools were administered to participants using several methods, including face to face oral administration by trained interviewers/interpreters and self-completion.

Table 1 Self-rated health study characteristics

The measurement tools

A total of 52 tools were identified in the retrieved studies, of which 45 met the study inclusion criteria (see Tables 2 and 3 for characteristics of these measurement tools). Seven tools were excluded because they had not been specifically tested for validity and/or reliability within refugee groups and were reported only in studies where the refugee sample size was <150 and in fewer than 5 research articles.

Table 2 Overview of the self-rated health measurement tools identified in the scoping review
Table 3 Brief overview of the self-rated health measurement tools identified in the scoping review

Characteristics of measurement tools

The 45 measurement tools were used to measure general health, common mental disorders and trauma/PTSD among refugees. In terms of measurement focus, general health was measured using five tools; 19 tools assessed common mental disorders, such as anxiety and depression; and 21 tools investigated trauma/PTSD (see Tables 2 and 3). No consensus on the preferential use of different measurement tools emerged; although the most widely used tools across the studies were the Harvard Trauma Questionnaire for the assessment of trauma/PTSD and the Hopkins Symptom Checklist-25 for anxiety and depression. Sixteen of the identified measurement tools were developed specifically within samples of refugees and were used in half of the reviewed studies.

Aside from the Afghan Symptom Checklist, which was developed within a community sample in Afghanistan, the remaining 28 tools were designed for use in Western populations. Twelve of these Western developed tools were designed to be used in a clinical environment, but were frequently used in community studies of refugee populations. For example, over half of the community based studies used at least one tool that had been specifically developed for use within Western clinical populations. Translation of the Western designed tools mostly involved back-translation (85 studies). Twenty two studies reported translating English versions of measurement tools word-for-word into the refugee language. Pre-translated versions of the measurement tools were used in 28 of the articles reviewed and 15 studies reported using bilingual interpreters to verbally translate the tool items for the target population.

The pre-determined cut-off scores of measurement tools were routinely applied across most studies regardless of the conditions under which they were originally developed. Only two studies were identified that adjusted the cut-off score of their selected measurement tools to suit their target population [35, 36]. Measurement tools for use in non-clinical environments must be brief, easy to use and interpret, cost-effective and accessible, yet this was not always the case. For example, the Minnesota Multiphasic Personality Inventory is designed for use in clinical samples, yet was used in several community studies of refugees [37, 38]. This tool is 567 items long, takes 60–90 min to complete and requires specialist training to administer and score.

Statistical testing of measurement tools

More than half of the 45 tools have been evaluated statistically for reliability and validity among refugee populations. Reliability data were reported for 33 of the tools and 24 tools had published validation data (see Tables 2 and 3). Twenty tools have both published reliability and validity data among refugee populations. Prior to the year 2000 only seven of the tools described in this review had evidence of reliability and validity testing within refugee populations. A notable finding is that eight of the tools have published validity data in refugee groups, but no published reliability data. A measurement tool cannot be valid unless it is reliable [39] and reasons for the absence of reliability data for these tools could not be ascertained in this review. Six of the reviewed tools have not been tested for either validity or reliability among refugees, but meet the criterion of being used in refugee research where the sample size was 150 participants or greater. Eighteen tools were validated by one study each and six tools were validated by two or more studies (see Table 2).

The methods used to conduct reliability and validity testing within studies were highly variable and there appeared to be an emphasis on conducting reliability testing rather than validation testing of measurement tools. In terms of reliability testing, internal consistency was the most commonly reported statistic while test-retest reliability and inter-rater reliability were mostly overlooked. Test-retest reliability was only reported in studies using 11 tools and inter-rater reliability was reported in only 2 studies. Where validation testing of measurement tools was conducted, there were inconsistencies across studies in the extent of testing. Ideally, refugee health measures should have, at a minimum, confirmed content and construct validity within refugee populations; however, this was only observed in a few of the studies. Where a tool was adapted for refugee research, there appeared to be an emphasis on reliability and criterion validation. Testing only criterion validity does not assess whether the measure appropriately captures culture-specific constructs, such as depression, anxiety or PTSD.
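The reviewed studies are not described as using any one internal consistency statistic. Cronbach's alpha is a widely used choice, and the sketch below computes it under that assumption for a small table of item scores. The function name and data layout are illustrative only and are not drawn from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha, where scores[r][i] is respondent r's score on item i.

    Assumes at least two respondents, at least two items and no missing values.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Alpha approaches 1 when items rise and fall together across respondents. It says nothing about stability over time or agreement between raters, which is why test-retest and inter-rater reliability need to be assessed separately.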

Where the development of a measurement tool was described, few studies reported efforts to conduct a qualitative analysis of the concepts being measured; such efforts were described for only ten tools. These tools were all designed specifically for use in refugee research, with the exception of the Hopkins Symptom Checklist-25 (Indochinese version) and Medical Outcomes Study Short Form-12, which have been adapted for use in refugee populations.

Modification of measurement tools

Eight of the 45 measurement tools underwent some form of modification to make them more culturally appropriate. These modifications included addition/removal of items [40,41,42,43,44,46] and cut-off score modification [35, 36]. The Harvard Trauma Questionnaire was the most commonly modified tool, which is in accordance with the authors’ recommendation that the tool be modified and adapted to the characteristics of each cultural group [40]. Fourteen of the 62 studies which used the Harvard Trauma Questionnaire modified the tool to be more appropriate for their target refugee population.

The translation of measurement tools across studies was variable. Ideally, measures should undergo a standard translation/back-translation process to ensure semantic and conceptual equivalence [47] and to avoid culturally sensitive material. In fewer than half of the studies reviewed, there was evidence that the researchers had undertaken thorough back-translation of the measurement tool(s). A number of studies reported translating English versions of measurement tools word-for-word into the refugee language, which is problematic in cross cultural research as the questions or items may not be communicated correctly, especially if any idioms were used in the source language [48]. Attempts to establish conceptual and/or thorough linguistic equivalence were identified for only 11 tools.

Discussion

This scoping review identified 45 different measurement tools that were used to measure self-rated health in refugee populations. Most of the 183 studies identified in the review were cross sectional explorations of the mental health status of refugees living in community settings in Western nations. A third of the tools were designed specifically for use within refugee populations. More than half of the measurement tools have been evaluated for reliability and/or validity within refugee populations.

Apparent from this review is that no consensus exists on the use of different measurement tools amongst researchers and there are no standard criteria against which quality assessment of these tools can be made. This has resulted in a large number of tools of varying rigour being used in refugee health research. The resulting variability in structure, reliability and validity increases the potential for inaccurate conclusions to be made concerning the health of refugee populations. Several studies have previously reported large variations in prevalence rates of mental health disorders among refugees [4, 8, 14, 49] and these disparities have been attributed to inconsistencies in methods and measurement tools used for data collection, analysis, and reporting [50]. Tools used in refugee health research should have at a minimum, demonstrated reliability and validity of the construct within refugee populations to ensure that cultural concepts and constructs are able to be accurately measured [26, 51]. A number of tools identified in this review have published reliability and validity within refugee groups, but few provided evidence of testing for construct validity and the methods used to confirm reliability and validity were highly variable across studies.

Refugee self-rated health measurement tools are also frequently used out of the context for which they were designed. For example, clinical tools are developed for use among treatment seeking patients and are designed to be administered in safe and trusting environments where care can be provided for any adverse reactions. They are not specifically designed for non-clinical settings where the contact between respondents and the often non-clinical administrators is brief. Clinically developed tools also have established cut-off points for identifying individuals who screen positive for specific disorders, and these may shift as the setting changes from a clinic to a community sample. It is not known whether the application of measurement tool scores across heterogeneous populations leads to reasonable inferences concerning symptom severity and diagnoses [47, 51], but there is a body of work suggesting that using a single cut-off score may not be a valid procedure for cross cultural samples [51,52,53,54,55].

The challenges in developing and adapting health measurement tools for refugee research were evident in the reviewed literature. The methods used were inconsistent and/or limited across studies. For example, testing only reliability was common across the reviewed studies. Reliability only addresses the degree to which measurement tools produce reproducible results across different interviewers and applications [56]; it does not measure the degree to which the tool actually measures the construct of interest. Where validity testing was conducted, there was a focus on criterion validity (or ‘caseness’). Criterion validity does not assess whether the measure appropriately captures culture-specific constructs, such as depression, anxiety or PTSD. Determining the construct validity of a tool demonstrates how well the tool measures the constructs it was designed to measure [34] and ensures that inferences made using the results of such assessments, such as severity of symptoms and prevalence rates, are supported. For example, the PTSD symptom patterns of refugees may deviate from those of Western populations because of both cultural and war-related factors, as well as post-traumatic life circumstances [53, 57]. The failure to appropriately standardize or adapt existing measures for use with refugee populations may lead to incorrect generalizations about the health of refugees, which can have widespread effects, including the development and implementation of incorrect interventions and policies [58].

This review found inconsistencies in the translation of measurement tools across studies. Measurement tool(s) should be translated into the first language of those being assessed using the back-translation method, either by itself or in combination with a committee or bilingual assessment method [47, 59]. The translation of English versions of measurement tools word-for-word into the refugee language is problematic in cross cultural research as the questions or items may not be communicated correctly, especially if any idioms were used in the source language [48, 50]. Appropriate translation ensures semantic and conceptual equivalence and is one of the requirements for establishing validity [47, 60].

Health measurement tool items should also be examined for cultural relevance, in terms of the rationale behind their inclusion, what it means for a person in that culture to have the symptom or syndrome, and how the symptoms vary across cultures. For example, the way in which Western psychology describes PTSD does not fit the symptoms of people from non-Western cultures, yet tools measuring trauma in refugees often use Western trauma concepts and constructs. The exploration and establishment of equivalence of measurement tools between local indigenous constructs and symptoms is an important step in ensuring that the measurement tools are tapping into the respondents’ understanding of their health [26, 61] and was frequently overlooked in the reviewed refugee research studies.

Recommendations

Researchers would benefit from the development of guidelines to instruct proper and consistent measurement design and testing; such as the achievement of cultural equivalency across health concepts, reliability and validity across refugee population groups and settings. For example, researchers would benefit from the use of standardized procedures such as the Translation Monitoring Form [62] which provides a method for the systematic translation and adaptation of measurement tools. Evaluation of the performance of measurement tools needs to be undertaken in rural and remote settings. Given that receiving nations, such as Australia and the United States of America (USA), are resettling refugees beyond the major metropolitan regions, it is important that these populations are not overlooked so that we can understand the ongoing health needs of remote refugee populations to inform often limited, and often over-stretched, rural and regional health services.

The development of integrated and comprehensive measurement tools to assess all elements of relevance to the health of refugees would be beneficial to health service providers. The current tools are not comprehensive, but rather assess parts of experiences and/or symptoms and disorders. Measures such as the Refugee Mental Health Assessment Package hold promise as an integrated and comprehensive package of measures to assess all elements of relevance to the mental health of refugees [63].

Currently, there is an underrepresentation of tools measuring resettlement stressors and pre-war, pre-conflict and non-conflict trauma within refugees. The resettlement environment is significant to the health of refugees and the lack of measurement tools to capture this information means that the development of health interventions during resettlement is challenging. Many refugees may experience trauma prior to the war or conflict-related event that causes them to flee their home. This trauma could be due to religious or political oppression which then resulted in the outbreak of civil or international war. These refugees are therefore traumatised prior to displacement, yet there are no assessments to measure this pre-war or pre-conflict trauma. This represents a significant gap in refugee research, and research is required into the development of such measures.

Optimal tools for the measurement of refugee self-rated health are those that have reported reliability and validity testing in refugee populations. Several tools fit these criteria and include the Medical Outcomes Study Short Form and New Mexico Refugee Symptom Checklist-121 for general health assessment; the Hopkins Symptom Checklist-25 and Refugee Health Screener-15 for common mental disorders; and the Harvard Trauma Questionnaire and Comprehensive Trauma Inventory-104 for the assessment of trauma/PTSD within refugees (see Tables 2 and 3). Also, tools should not be used out of the context for which they were designed. Measurement tools designed for use in clinical settings may not be suitable for use in community environments and vice-versa.

Limitations

There are several limitations to this review. Firstly, electronic searches of the literature are not error free, and some studies and measurement tools may not have been retrieved. In addition, our review is limited by the fact that the searches were restricted to articles in English published since 2000. However, given the limited volume of research in this area, the search for validation studies of tools utilised in refugee research was not limited by date of publication. The results of the review should be interpreted with the knowledge that scoping reviews do not screen for quality of studies and, therefore, they include studies with large variations in study methodologies.

What does this study add to the literature?

The results of this review have important research, policy and practice implications, which are outlined below.

Research implications

This review study provides an up to date compilation of contemporary general health, common mental disorders and trauma/PTSD assessments that rely on client self-report. We have identified a number of measurement tools that had been evaluated for reliability and/or validity since the publishing of prior reviews and this included a number of newly developed tools as well as those used to assess general health.

Policy implications

Given the increasing importance of patient measured outcomes, our results provide (particularly in Table 2) a compendium of self-report measurements for policy makers to use for program evaluation.

Practice implications

A number of jurisdictions are seeking input from patients on the quality and accessibility of health care services. A number of the measurement tools would be highly relevant for clinicians to identify performance as part of quality improvement.

Conclusion

Tools for use in refugee health research should have demonstrated reliability and validity in refugee populations to ensure accurate measurement of the health concept(s) under investigation. Consideration should also be given to the setting in which the tool was originally designed. This review shows that there are currently a number of reliable and valid measurement tools available for use in refugee health research which can be used across a variety of settings. However, further work is required to achieve consistency in tool quality and in the use of these tools. Methodological guidelines are required to assist researchers and clinicians in the development and testing of subjective health measurement tools. In the interim, an achievable and very useful study would be a comprehensive evaluation of the most current and robust self-rated health measurement tools for use within refugee health research.