Background

The capacity to cope with new and ill-structured situations is a crucial ability in today’s world. Developing this ability, by shaping empowered citizens, challenges individuals as well as organisations and societies. This process of empowerment is usually referred to as capacity development (CD) [1]. While this term has been commonly used for years in the field of foreign aid, other societal and political domains (e.g. social work, education and health systems) are increasingly adopting the concept of CD when developing new or existing competencies, structures, and strategies for building resilient individuals and organizations [2]. In the field of health research, too, an increasing number of activities to strengthen health research competencies and to support organizations can be observed, as demanded by the three United Nations Millennium Development Goals addressing health-related issues [3–6]. Several frameworks are already in use that support a structured approach to health research capacity development (HRCD) and address competencies that are specific to health research [7–9]. These frameworks usually incorporate the individual or team, organization or institution, and society levels [8, 10, 11]. One conclusion that can be drawn from the available evidence is that, in such a structured approach to HRCD efforts, meaningful data collection is crucial. Data collection comprises, first, the HRCD needs assessment and, second, the monitoring and evaluation of activities and programs once implemented (together abbreviated as NaME). HRCD activities should therefore address the needs as assessed, and monitoring and evaluation of these activities should reflect the desired outcomes as defined beforehand [12–15]. Bates et al. [16] indicate how data collection tools and instruments are usually developed for a certain purpose in a certain context.
The context specificity of tools and instruments has to be considered and the appropriateness of these must be determined when selecting instruments for any needs assessment for a new project. This article offers a systematic review of tools and instruments for the NaME of HRCD activities at the individual or team and the organizational levels to aid HRCD initiatives in selecting appropriate tools and instruments for data collection within their respective context. For this purpose, a range of studies published between January 1, 2003, and June 30, 2013, were chosen and analysed based on different context parameters such as the level of the CD and the nature of the HRCD activities.

Methods

We followed the PRISMA checklist for reporting systematic reviews and meta-analyses [17]. Inclusion and analysis criteria were defined in advance and documented in a protocol (Tables 1 and 2).

Table 1 Description and operationalization of the five inclusion categories
Table 2 Nine aspects for further analysis of the included studies

Information sources and search strategy

We conducted the systematic literature search in July 2013. The search was done in both the literature database PubMed and the search engine Google Scholar. We applied the three search terms “capacity building” AND “research”, “capacity development” AND “research”, and “capacity strengthening” AND “research”. We checked the first 200 hits in Google Scholar for each search term. “Health” and “evaluation” were not included in the search terms as a pre-test search had revealed this would exclude relevant literature. Articles from personal bibliographies of the authors were also included.
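The three search terms can be written down as Boolean query strings. The following is a minimal sketch only: the helper name `build_query` and the constants are hypothetical, and the exact query syntax entered into PubMed and Google Scholar is not reported in the text.

```python
# Hypothetical reconstruction of the three search terms as Boolean strings.
# The exact syntax entered into PubMed or Google Scholar is an assumption.
SEARCH_TERM_PAIRS = [
    ("capacity building", "research"),
    ("capacity development", "research"),
    ("capacity strengthening", "research"),
]

# Stated in the text: the first 200 Google Scholar hits were checked per term.
GOOGLE_SCHOLAR_HITS_PER_TERM = 200


def build_query(first: str, second: str) -> str:
    """Join two quoted phrases with a Boolean AND."""
    return f'"{first}" AND "{second}"'


queries = [build_query(first, second) for first, second in SEARCH_TERM_PAIRS]

# Upper bound on the Google Scholar hits checked across all three terms.
max_scholar_hits_checked = GOOGLE_SCHOLAR_HITS_PER_TERM * len(queries)

for query in queries:
    print(query)
```

Note that neither “health” nor “evaluation” appears in any of the strings, in line with the pre-test described above.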

Inclusion categories and criteria

The inclusion process was structured along the five inclusion categories ‘capacity development’, ‘research’, ‘health profession fields’, ‘monitoring and evaluation’, and ‘level of NaME’. Table 1 gives a detailed overview of all descriptions and operationalisations used.

The category ‘capacity development’ [18] represents an exemplary definition which serves as a guideline for inclusion but should not be applied word for word. ‘Research’ was operationalized according to the categories of the ‘research spider’ [19]. Some process-related research skills as well as communicational and interpersonal skills were added to our operationalisation [20]. Main health professions were identified and grouped within different fields. NaME was operationalized according to a self-constructed NaME framework of HRCD activities (Fig. 1), which summarizes 13 HRCD/NaME frameworks [2, 5, 8, 10–13, 15, 21–25] and reflects the level of HRCD, common indicators, and the order (from needs assessment to impact evaluation) commonly used in the original frameworks.

Fig. 1 Framework for needs assessment, monitoring and evaluation (NaME) of health research capacity development (HRCD) [2, 5, 8, 10–13, 15, 21–25].

For the categories ‘research’, ‘health profession fields’ and ‘monitoring and evaluation’, at least one of the operationalisations of each category had to be addressed by the study. The category ‘level of NaME’ was operationalized referring to the ESSENCE framework ‘Planning, monitoring and evaluation framework for capacity strengthening in health research’ which describes three CD levels: individual and/or team, organizational, and system levels [10]. Only publications focussing on NaME on the individual/team and organizational levels were considered for this review.

Additionally, the following eligibility criteria were set: English or German language; publication period from January 1, 2003, to June 30, 2013; and intervention, non-intervention, and multiple design studies (Fig. 2). We excluded grey literature, editorials, comments, congress abstracts, letters, and similar publication types. Articles focussing on institutional networks with external partners were excluded as well.

Fig. 2 Categorization of the study designs. The study designs are restricted to the included studies.

Study selection

Two researchers, JH and SN, independently scanned the abstracts identified for inclusion. In case of disagreement, JH and SN discussed the abstracts in question. If consensus could still not be reached, a third reviewer, CK, was consulted. After consensus on inclusion was reached, the full-texts of all included studies were rechecked for inclusion by JH and SN.
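The stepwise consensus procedure can be sketched as a small decision function. This is an illustrative sketch, not the authors' tooling: the function name and parameters are hypothetical, and it assumes that the third reviewer's vote settles any disagreement that remains after discussion, which the text does not state explicitly.

```python
from typing import Optional


def screening_decision(
    reviewer_a: bool,
    reviewer_b: bool,
    after_discussion: Optional[bool] = None,
    third_reviewer: Optional[bool] = None,
) -> bool:
    """Return True if an abstract is included, False if it is excluded.

    reviewer_a / reviewer_b: independent votes of the two screeners.
    after_discussion: the consensus reached by discussion, or None if the
        two screeners still disagree.
    third_reviewer: vote of the third reviewer (assumed here to be decisive).
    """
    # Step 1: independent votes agree, so no further step is needed.
    if reviewer_a == reviewer_b:
        return reviewer_a
    # Step 2: disagreement resolved by discussion between the two screeners.
    if after_discussion is not None:
        return after_discussion
    # Step 3: still unresolved, so the third reviewer is consulted.
    if third_reviewer is None:
        raise ValueError("unresolved disagreement requires a third reviewer")
    return third_reviewer


# Example: the screeners disagree, discussion does not help, the third
# reviewer votes for inclusion.
print(screening_decision(True, False, after_discussion=None, third_reviewer=True))
```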

Study analysis procedure

We analysed the included articles according to nine aspects defined in Table 2.

Results

The search in PubMed revealed 700 suitable records (Fig. 3). We removed 27 duplicates, resulting in 673 records for inclusion screening. The first 200 hits for each of the three search terms in Google Scholar were considered, resulting in two additional records after removing duplicates. Furthermore, we included articles from the personal bibliographies of the authors, adding 10 more abstracts after checking for duplicates. Of the 685 records identified, 24 did not contain an abstract, but were preliminarily included for the full-text screening. JH and SN scanned the remaining 661 abstracts in terms of the inclusion criteria, thus excluding 616 records; 45 abstracts and the 24 records without abstracts were considered for full-text screening. After the full-text screening, 42 articles were finally included for further analysis; 37 articles originated from PubMed, one from Google Scholar, and four from the personal bibliographies of the authors.
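The record counts in this paragraph can be re-derived step by step. The sketch below uses only the numbers reported above; the variable names are hypothetical, and the number of articles excluded at full-text screening is derived here rather than stated in the text.

```python
# Identification (numbers as reported in the text)
pubmed_records = 700
pubmed_duplicates = 27
google_scholar_added = 2            # after removing duplicates
personal_bibliographies_added = 10  # after checking for duplicates

pubmed_unique = pubmed_records - pubmed_duplicates
records_identified = (
    pubmed_unique + google_scholar_added + personal_bibliographies_added
)

# Abstract screening
records_without_abstract = 24  # passed straight to full-text screening
abstracts_screened = records_identified - records_without_abstract
abstracts_excluded = 616
abstracts_retained = abstracts_screened - abstracts_excluded

# Full-text screening
full_texts_screened = abstracts_retained + records_without_abstract
included_by_source = {
    "PubMed": 37,
    "Google Scholar": 1,
    "personal bibliographies": 4,
}
articles_included = sum(included_by_source.values())

# Derived value, not stated in the text.
full_texts_excluded = full_texts_screened - articles_included

print(pubmed_unique, records_identified, abstracts_screened)
print(abstracts_retained, full_texts_screened, articles_included)
```

Each intermediate value matches the figure given in the text (673, 685, 661, 45, and 42), which implies that 69 full texts were screened.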

Fig. 3 Flowchart of the inclusion process.

These 42 articles were subsequently analysed along nine aspects (Table 2). The results are summarized in Table 3.

Table 3 Included studies on needs assessment, monitoring and evaluation (NaME) of health research capacity development (HRCD) at the individual and organizational level

More than half of the NaME studies on HRCD activities were conducted in high-income countries (n = 24) [26]. Six studies took place in lower-middle-income and two in upper-middle-income economies. Participants of one study were from a low-income country [27]. Two studies were performed in partnerships between a high-income and several low-, lower-middle-, and upper-middle-income economies. Mayhew et al. [28] described a partnership study between two upper-middle-income countries and Bates et al. [29] analysed case studies from two lower-middle-income and two low-income economies. Five authors did not specify the country or region of their studies.

The evaluation focus of the studies was predominantly on outcome evaluation (n = 23). In addition, six studies surveyed the current state, three studies assessed requirements, and two studies investigated needs of HRCD activities. Of the remaining eight studies, seven combined two evaluation aspects: definition of needs and outcome evaluation (n = 4), analysis of current state and outcome evaluation (n = 1), outcome evaluation and impact evaluation (n = 1), and analysis of current state and definition of needs (n = 1). Jamerson et al. [30] did not define their focus of evaluation.

Nearly half of the studies investigated HRCD on the individual/team level (n = 20); 16 studies were conducted at both the individual/team and organizational levels. The authors of six studies focused on organizational aspects of HRCD.

Almost all studies (n = 38) described and evaluated HRCD activities; 19 of these HRCD activities were training programmes of predefined duration, lasting from a few hours or days up to 2 years. Another nine HRCD activities were perpetual or of unspecified duration, and 10 studies defined and pre-assessed the setting in preparation of an HRCD activity. The authors of four studies did not specify an HRCD activity, focussing on the development or validation of tools, instruments, and frameworks.

The participants of HRCD activities represent a wide range of health professions (e.g. laboratory scientists, physiotherapists, dentists, pharmacists); 10 studies investigated staff with management tasks in health, e.g. hospital managers and clinical research managers. Nurses participated in eight studies, and another eight studies looked into ‘research staff’ and ‘scientists’ with no further description. Medical practitioners were studied in five papers. Beyond these, the background of participants was often not specified beyond general terms like ‘health professionals’, ‘ethic committee members’, ‘scholars’, ‘university faculty members’, or ‘allied health professionals’. In a different approach, Suter et al. [31] analysed reports and Bates et al. [29] investigated case studies (without specifying the material scrutinized).

A wide variety of study designs was employed by the studies included in the review. We identified 35 single-study and six multi-study approaches. Of the 35 single-study approaches, 10 were designed as intervention (three with control groups) and 25 as non-intervention studies. Four multi-study approaches combined an intervention study with a non-intervention study. Two multi-study approaches combined different non-intervention studies. Jamerson et al. [30] did not specify their study design.

Many different tools and instruments for NaME were identified and applied in quantitative, qualitative, and mixed modes of analysis. No preferred approach was observed. More than one third of the studies (n = 16) used a combination of tools for quantitative as well as qualitative analysis. In 13 studies, tools like questionnaires and assessment sheets were applied to evaluate and monitor HRCD activities quantitatively. Evaluation tools, such as interviews, focus group discussions, document analyses, or mapping of cases against evaluation frameworks, were identified in 12 studies and commonly analysed in a qualitative approach. In one study, tools for evaluation were not described at all.
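The breakdowns reported in this section can be cross-checked against the 42 included articles. The sketch below is a plain tally under stated assumptions: the category labels are paraphrased from the text, the dictionary layout is hypothetical, and the participant categories are left out because the text does not give a complete, mutually exclusive count for them.

```python
TOTAL_INCLUDED = 42

# Counts as reported in the Results text; labels are paraphrased.
breakdowns = {
    "country income group": {
        "high-income": 24,
        "lower-middle-income": 6,
        "upper-middle-income": 2,
        "low-income": 1,
        "high-income with lower-income partners": 2,
        "two upper-middle-income partners": 1,
        "lower-middle- and low-income case studies": 1,
        "not specified": 5,
    },
    "evaluation focus": {
        "outcome evaluation": 23,
        "current state": 6,
        "requirements": 3,
        "needs": 2,
        "needs + outcome": 4,
        "current state + outcome": 1,
        "outcome + impact": 1,
        "current state + needs": 1,
        "not defined": 1,
    },
    "level of HRCD": {
        "individual/team": 20,
        "individual/team and organizational": 16,
        "organizational": 6,
    },
    "HRCD activity": {
        "training of predefined duration": 19,
        "perpetual or duration not specified": 9,
        "setting pre-assessed": 10,
        "no activity specified": 4,
    },
    "study design": {
        "single study, intervention": 10,
        "single study, non-intervention": 25,
        "multi-study, intervention + non-intervention": 4,
        "multi-study, non-intervention only": 2,
        "not specified": 1,
    },
    "mode of analysis": {
        "mixed": 16,
        "quantitative": 13,
        "qualitative": 12,
        "tools not described": 1,
    },
}

# Sum each breakdown and compare it with the number of included articles.
totals = {name: sum(counts.values()) for name, counts in breakdowns.items()}
mismatches = {name: t for name, t in totals.items() if t != TOTAL_INCLUDED}

for name, total in totals.items():
    print(f"{name}: {total}")
```

In this tally every breakdown sums to 42, so `mismatches` stays empty.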

Discussion

Summary of evidence

The aim of our systematic review was to give an overview of tools and instruments for NaME of HRCD activities on the individual and organizational levels; the 42 included articles demonstrated a large variety of tools and instruments in specific settings. Questionnaires, assessment sheets and interviews (in qualitative settings) were most commonly applied and in part disseminated for further use, development and validation.

Overall, 36 studies were conducted either on the individual/team level or on both the individual/team and organizational levels. Within these studies, a well-balanced mixture of quantitative, qualitative and mixed tools and modes of analysis was applied. Judging from the depth of these studies, NaME of HRCD on the individual level appears to be quite well developed. Only six studies focused exclusively on organizational aspects, almost all with qualitative approaches, indicating that HRCD studies at this level are still mainly exploratory. The organizational level is possibly a more complex construct to measure. The fact that 13 out of 19 studies that broach organizational aspects were conducted in high-income countries might reflect the wider possibilities of these research institutions and indicates a need for more attention to NaME on the organizational level in lower-income settings. Results from these exploratory studies on the organizational level should feed into the development of standardized quantitative indicators more regularly. Qualitative approaches could be pursued for complex and specific constructs not easily covered quantitatively.

Because the primary selection of articles for this review was not limited to a specific health profession, it became apparent that staff with management tasks in health research, as well as nurses, were the cohorts most frequently targeted by NaME studies. Further research should concentrate on other health professionals to determine commonalities and differences in health-research-related skill acquisition and development between health professions. These studies could determine whether and which parts of HRCD and NaME can be considered generic across health professions. Further, we will at some point have to ask who is being left out, who is not getting access to HRCD programs, and why.

The focus of NaME throughout the studies included in this review was on outcome measurement, regardless of whether these were conducted in high-income, upper-middle, lower-middle, or low-income countries. However, there were only a few reports of needs assessment from middle- and low-income economies, while high-income countries regularly give account of current states. While this should not be over-interpreted, it still raises the question of whether needs assessment in middle- and low-income countries is being done as thoroughly as warranted but not reported in the articles, or whether these countries’ needs might not always be at the very centre of the HRCD’s attention. While the evaluation of HRCD outcomes is important, more attention should be paid to the sustainability of programs and to impact evaluation, e.g. parameters of patient care or societal aspects. Only one study, that of Hyder et al. [32], made use of one such indicator and assessed the impact of an HRCD training by considering “teaching activities after returning to Pakistan”. The development of valid impact indicators constitutes a methodological challenge. Some studies reporting impact evaluation on a system level might have been missed due to the search parameters applied.

When undertaking the review, three main methodological weaknesses of this research area became apparent. First, there is a need for common definitions and terminologies to better communicate and compare the HRCD efforts. The analysis of the studies showed that there is an inconsistent use of terms, for example, for CD activities (e.g. training, course, or workshop). Similar problems were already identified in the context of educational capacity building by Steinert et al. [33], who suggest definitions for different training settings which may also be suitable for a more precise description of CD activities. A common taxonomy for the description of health professionals (i.e. the study participants) would be just as desirable. The use of coherent terms would not only enable the accurate replication of studies but also help in determining whether tools and instruments from one setting can be easily transferred to another. A clear and coherent description of study setting and participants is thus an integral step towards scientific transparency. The incoherent categorisation of study types is probably not a new problem. It is, however, amplified by authors who choose very complex approaches to collect data at different NaME levels with deviating terms to describe these approaches [28, 34–36].

The second weakness of the research area is the varying adherence to reporting standards. While there are standards available for reporting qualitative or quantitative research (e.g. Rossi et al. [12], Downing [37], Mays & Pope [38]), it seems these or similar recommendations were not frequently considered when reporting or reviewing NaME studies. This was particularly the case in studies with a mixed-method mode of analysis, where the need for more standardised reporting became apparent. Frambach et al.’s [39] “Quality Criteria in Qualitative and Quantitative Research” could provide guidance, especially for studies with mixed-method approaches. Another important aspect of transparent reporting would be the publication of the tools and instruments used in NaME studies. Of the 42 articles scrutinized during this review, only 15 either disclosed the tools and instruments within the article itself in an appendix or volunteered to have them sent to any audience interested. Of all the tools and instruments disclosed, only two were used in two or more studies. Making the tools and instruments available to the HRCD community would not only allow for their adaptation whenever necessary but, more importantly, support their validation and enhancement.

The last point concerns the study designs implemented. The majority of articles are mainly descriptive, non-intervention studies that only allow for low evidence according to Cochrane standards [40]. While most HRCD studies conducted in high-income economies were of non-interventional nature, those from low- and middle-income countries were a mix of non-intervention, intervention and multi-study approaches, yielding higher levels of evidence. Of all interventional studies, most employed a quasi-experimental design with only one randomized controlled trial [23]. The studies reporting HRCD on the institutional level were also primarily on a descriptive level. Cook et al. [41], however, demand going beyond describing what one did (descriptive studies) or whether an intervention worked or not (justification studies). Instead, they call for analysing how and why a program worked or failed (clarification studies). An in-depth analysis of the effectiveness of different HRCD activities is, however, still lacking.

Limitations of the systematic review

This systematic review itself has some methodological limitations. The issue of deviating terminologies has been raised earlier. In most cases, we adopted the terms used in the studies themselves, e.g. when reporting the authors’ denoted study designs. In very few cases, we changed or supplemented terms to make the studies more comparable to others. One example is changing the wording from Green et al.’s [35] “case study approach” into a “multi-study approach” to match Flyvbjerg’s taxonomy [42]. Other limitations typical for reviews may also apply. Relevant sources might not have been detected due to the selected search terms, the range of the data sources, the exclusion of grey literature, and the restriction to English and German sources.

Conclusion

A systematic review of studies from the field of HRCD activities was conducted, with 42 studies being fully analysed. The analysis revealed that the variety of terms and definitions used to describe NaME efforts impedes the comparability and transferability of results. Nevertheless, insight from this review can help to inform researchers and other stakeholders in the HRCD community. A coherent overview of tools and instruments for NaME of HRCD was developed and is provided (Table 3).

Furthermore, it is time to set standards for NaME in the HRCD community. Researchers and stakeholders should develop a common research agenda to push, systematise and improve the research efforts in the field of NaME of HRCD activities. To do so, a common language and terminology are required. The conceptualizations used for the purpose of this review can inform this development. At the same time, we have to critically analyse research gaps in terms of generalizable versus context-specific theories, methods, tools, and instruments. To maximize the benefits and to incorporate different research traditions, these undertakings should be done internationally and multi-professionally within the HRCD community.