Introduction

Since the early 1990s, evaluations have been conducted according to the five criteria of relevance, effectiveness, efficiency, impact, and sustainability (Organisation for Economic Co-operation and Development, Development Assistance Committee [OECD DAC], 1991, 2019). This approach has worked rather well, especially at the project level, where most evaluation work was being done. Standard evaluation methods have included the review of project documentation, portfolio analysis, interviews at agencies’ headquarters, and field observations in a selection of project sites assessed using specialized technical expertise.

The introduction of more complex delivery modalities that started in the 2000s—sector approaches, budget support modalities, and programs—and the advent of the Millennium Development Goals (MDGs), recently replaced by the Sustainable Development Goals (SDGs), brought about a corresponding increased complexity in evaluation. The SDGs take an integrated approach that links the three pillars of sustainability: social, economic, and environmental (United Nations Department of Economic and Social Affairs, 2015). Although such integration is necessary to move toward sustainable development, it undeniably poses significant challenges in terms of identifying suitable metrics and indicators to assess achievements and results in a way that breaks the “data silos” performance measurement approach that was typical of the MDGs era (ICLEI, 2015).

The Global Environment Facility (GEF), a partnership set up as a result of the 1992 Rio Earth Summit, underwent a similar evolution. From a project-based delivery institution focusing on the environment, the GEF is increasingly moving toward more complex, programmatic, interconnected, and synergetic delivery modalities that consider the environmental with the social and economic dimensions. These GEF integrated programming modalities aim at tackling the main drivers of environmental degradation and achieving impact at scale (GEF Independent Evaluation Office [IEO], 2018a). The GEF has designed these strategies because many of these drivers extend their influence beyond national boundaries. To participate in integrated, multiple-country initiatives, governments need to find a balance between their national sustainable development priorities and their commitments to contribute to the global goals of international environmental conventions.

In the GEF, project and program evaluations are conducted by GEF partner Agencies. The GEF IEO conducts complex evaluations at levels higher than projects (GEF IEO, 2019a). To better capture the successes and challenges the GEF has faced in its move toward more complex, integrated programming, IEO evaluations increasingly consider innovative ways to address the complexity of assessing the environmental with the social and economic, including how these three dimensions play out at the national and local levels. The way GEF support is operationalized at the country level is increasingly a key IEO area of enquiry.

Challenges and Opportunities in IEO Complex Evaluations

Complex evaluations typically use mixed methods involving both quantitative and qualitative tools and analyses. In mixed-methods research, methods sequence and dominance are central concepts. The rationale for the mixed-method explanatory sequential design is often that the quantitative analysis provides a general understanding of the main research results while the qualitative data and their analysis refine and explain those results (Walker & Baxter, 2019). When this approach is applied in evaluation, aggregate quantitative analysis can also inform subsequent qualitative deep dives in specific projects/project sites to explain the main trends and provide additional insights. This is the usual approach in academic research, which, unlike evaluation, does not usually face tight deadlines to serve decision makers’ specific information needs.

In practice, tight timelines make it difficult to sequence coherently the various quantitative and qualitative components of a typical complex, higher level evaluation. Process issues take a long time. The most time-consuming tasks usually are (from most to least time consuming): (a) contracting the various firms and individual experts; (b) getting in touch and agreeing on the field mission dates and modalities with GEF national stakeholders in countries chosen for field data gathering; (c) setting up stakeholder engagement mechanisms such as peer review panels and reference groups, and keeping those mechanisms functioning; and (d) arranging the mission logistics while complying with security procedures of the institution (which in the GEF case is the World Bank). Afterwards, when the time comes to bring it all together, the evaluators must triangulate the different sets of qualitative and quantitative data and information, looking for coherence and connectedness between the various pieces of evidence.

To address this challenge, a few years ago the IEO developed a systematic approach to triangulate evidence and identify key findings in country portfolio evaluations (Carugi, 2016). This approach ensures the systematic use and analysis of all the data and information gathered, while respecting tight deadlines. Systematic triangulation can also help in addressing common challenges in evaluation, such as the scarcity or unreliability of data, or the complexities of comparing and cross-checking evidence from diverse disciplines. Although comprehensive, systematic triangulation does not allow evaluators to purposively dive deeply on a limited set of selected key themes that are common to multiple country or portfolio settings.

The Strategic Country Cluster Evaluation Concept

A way to address the challenge of assessing complex environmental and development interventions that require comparing and cross-checking evidence from diverse disciplines is to apply a sequenced, purposive approach in the conduct of an evaluation. That is what the IEO has done with strategic country cluster evaluations (SCCEs). SCCEs focus on a limited set of common themes across clusters of countries and/or portfolios that involve a critical mass of GEF investments toward comparable or shared environmental challenges and that have gained substantial experience with GEF programming over the years. Starting from aggregate portfolio analysis to identify trends and cases of positive and absent or negative change, SCCEs are designed to dive deeply into those themes and unpack them through purposive evaluative inquiry. SCCE design is based on a conceptual analysis framework, an approach the GEF IEO developed earlier at the country level,Footnote 1 to enable comparison of findings across geographic regions and/or portfolios. In addition to the aggregate portfolio analysis, SCCEs use geospatial analysis to identify change on key environmental outcome indicators over time. Targeted field verifications follow in specific hot spots selected based on the findings of the geospatial and portfolio analyses. The purpose of field verifications is to identify and understand the determinants of the observed change or lack thereof.

The identification of factors hindering and/or enabling the sustainability of GEF outcomes was one of the main themes selected by the GEF IEO for deep-dive investigation in SCCEs. In 2017, the IEO completed a desk study on the sustainability of GEF project outcomes (GEF IEO, 2019b).Footnote 2 The study analyzed the IEO datasets of terminal evaluation ratings to assess correlations among sustainability, outcomes, implementation, broader adoption, project design features, country characteristics, and other variables. The analysis took stock of projects for which field verifications were conducted by the IEO at least 2 years after project completion. According to the study, the following contributing factors were at play in those cases where past outcomes were not sustained: (a) lack of financial support for the maintenance of infrastructure or follow-up, (b) lack of sustained efforts from the national executing agency, (c) inadequate political support including limited progress on the adoption of legal and regulatory measures, (d) low institutional capacities of key agencies, (e) low levels of stakeholder buy-in, and (f) inadequate project design characterized by flaws in the theory of change of projects.

The IEO further explored these issues by applying the new SCCE purposive evaluative enquiry approach to three different clusters of country portfolios. The SCCEs’ main objectives were: (a) to provide a deeper understanding of the determinants of the sustainability of the outcomes of GEF support and (b) to assess the relevance and performance/impact of the GEF toward the main environmental challenges from the countries’ perspective. Gender, climate resilience, private sector, and GEF operations in fragile situations were also assessed as cross-cutting issues.

A unique area of SCCE research was the environment vs. socioeconomic development nexus, a concept that is central to sustainable development. This nexus is too often neglected in development interventions, by donors and developing countries alike (GEF IEO, 2020). Efforts to integrate socioeconomic development with environmental conservation/sustainable use at both national and local levels depend on the interest of country governments. Many governments in the least developed countries (LDCs) believe that achieving both at the same time is difficult, and perceive major trade-offs, rather than a nexus, between environment and socioeconomic/livelihoods objectives. Countries differ in: (a) reliance on natural resources, (b) susceptibility to natural disasters, (c) the poor’s dependence on the environment, and (d) the government’s economic development and other priorities. SCCEs investigated if and how the existence of a nexus between socioeconomic development needs and environmental conservation priorities (or lack thereof) contributed to or hindered the observed sustainability of project outcomes.

Applications of the SCCE Approach

The approach discussed in the previous sections has been applied to three clusters of countries, one covering the GEF portfolio of projects and programs in two biomes,Footnote 3 one covering LDCs, and one covering the small island developing states (SIDS) portfolios.Footnote 4 The African biomes covered by the first SCCE were the Sahel and the Sudan-Guinea Savanna. Selection of these two biomes was based on the countries’ comparable land-based environmental challenges. These countries also face challenges related to governance, demographics, migration, conflict, and fragility, which work as drivers for the environmental issues at hand. Most countries in the two selected biomes are LDCs, and half are fragile (World Bank, 2020).

The LDCs SCCE covered 47 countries that are currently designated by the United Nations as LDCs.Footnote 5 Focus on LDCs was based on these countries’ greater challenges related to sustainability of outcomes over several GEF periods (GEF IEO, 2019b) and related economic, social, and environmental challenges. Most LDCs are characterized by a low level of socioeconomic development. They have weak human and institutional capacities, low and unequally distributed income, gender inequality, and scarce domestic financial resources. LDCs often suffer from governance crises, political instability, and, in some cases, internal and external conflicts. Twenty-eight of the 47 LDCs are fragile (World Bank, 2018). The SIDS SCCE covered 39 small island developing states in the AIMS (Atlantic, Indian Ocean, Mediterranean, and South China Sea), Caribbean, and Pacific regions. The choice to evaluate the SIDS as a strategic country cluster was based on their shared geophysical constraints that result in disproportionately large economic, social, and environmental challenges.

Methodological Considerations

Selection of case study countries in the three SCCEs drew upon sustainability cohorts composed of national and regional projects completed between 2007 and 2014 and having Annual Performance Report (APR) ratings (GEF IEO, 2018b, 2019b, c) to allow for observation of the actual sustainability of outcomes 4–5 years after project completion. Projects in the African biomes and LDC cohorts were classified as: (a) having both outcome and sustainability ratings in the positive range (i.e., highly satisfactory, satisfactory, or moderately satisfactory); (b) having both outcomes and likely sustainability ratings in the negative range (i.e., highly unsatisfactory, unsatisfactory, or moderately unsatisfactory); (c) having either positive outcome and negative likely sustainability ratings, or the inverse; and (d) not having either outcome or sustainability ratings, or both (see Tables 1, 2, and 3).
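The four-way classification described above can be written down as a small lookup. The Python sketch below is illustrative only: the function name `classify_project`, the category labels "a" to "d", and the use of `None` for a missing rating are assumptions made for this example, and the rating labels are simply those quoted in the paragraph above.

```python
from typing import Optional

# Rating labels as quoted in the text; using the same scale for both
# outcome and sustainability ratings follows the wording above.
POSITIVE = {"highly satisfactory", "satisfactory", "moderately satisfactory"}
NEGATIVE = {"highly unsatisfactory", "unsatisfactory", "moderately unsatisfactory"}


def _is_positive(rating: Optional[str]) -> Optional[bool]:
    """Return True/False for a recognized rating, None if missing or unrecognized."""
    if rating is None:
        return None
    label = rating.strip().lower()
    if label in POSITIVE:
        return True
    if label in NEGATIVE:
        return False
    return None


def classify_project(outcome: Optional[str], sustainability: Optional[str]) -> str:
    """Assign a project to category (a)-(d) of the ratings matrix (hypothetical helper)."""
    out_pos = _is_positive(outcome)
    sus_pos = _is_positive(sustainability)
    if out_pos is None or sus_pos is None:
        return "d"  # one or both ratings missing
    if out_pos and sus_pos:
        return "a"  # both ratings in the positive range
    if not out_pos and not sus_pos:
        return "b"  # both ratings in the negative range
    return "c"      # mixed: one positive, one negative
```

Under these assumptions, a project rated in the positive range for outcomes and in the negative range for sustainability, the combination reported later for the SLM project in Bhutan, would fall in category (c).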

Table 1 Outcome and sustainability ratings matrix
Table 2 African biomes SCCE: Selection of countries based on APR ratings prior to missions
Table 3 LDCs SCCE: Selection of countries based on APR ratings prior to missions
Photos 1 and 2 LDCs SCCE – Finalizing sites selection and interviewing rural communities in Bhutan

Photo 1: Meeting the Gross National Happiness Commission (Thimphu, March 2019)

Photo 2: Discussing SLM measures with farmers during site visits (Zhemgang, March 2019)

Photos 3 and 4 African Biomes SCCE – Discussing environmental change with local stakeholders

Photo 3: Field visit in Kaback Commune (Guinea, March 2019)

Photo 4: Kyenjojo District technical staff reviewing Albertine Rift forest loss maps (Uganda, May 2019)

Also informing the selection of country case studies were trends over time of key environmental outcome indicators at geolocated project sites, with the aim of identifying cases of positive and absent or negative change. Country case study selection started with the identification of the main environmental challenges faced by the countries covered by the respective SCCE. These challenges were classified by biome in the case of the African biomes SCCE and by geographic country category in the case of the LDCs and SIDS SCCEs. Projects with both positive and negative outcome and sustainability ratings in each portfolio were tagged to each environmental challenge.

Guided by the mapping of countries and projects to environmental challenges, the IEO selected countries with the largest number of national and regional projects with positive and negative outcome and sustainability ratings. This method ensured the largest number of observable data points and coverage of possible factors affecting sustainability. The countries selected also included those in which projects addressed the most commonly shared environmental challenges. In the African biomes SCCE, these were deforestation and land degradation, threats to biodiversity, and desertification. In the LDCs SCCE, these were deforestation and land degradation, and biodiversity loss. Water-related challenges were also important and included water quality and quantity, threats to marine resources, and coastal and coral reef degradation.
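The selection rule described above, favoring countries with the largest number of rated projects, can be sketched as a simple count-and-rank step. This is a minimal illustration rather than the IEO's actual procedure: the function name `rank_countries`, the placeholder country names, and the sample portfolio are all hypothetical, and the category codes "a" to "d" refer to the four ratings categories listed earlier in this section.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Each record is (country, category). Categories "a", "b" and "c" have both
# an outcome and a sustainability rating; "d" lacks one or both.
Record = Tuple[str, str]


def rank_countries(records: Iterable[Record], top_n: int) -> List[Tuple[str, int]]:
    """Rank countries by the number of projects that carry both ratings."""
    counts = Counter(country for country, cat in records if cat in {"a", "b", "c"})
    # Sort by count (descending), then by name, so ties resolve reproducibly.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical portfolio, for illustration only (not data from the SCCEs).
portfolio = [
    ("Country A", "a"), ("Country A", "b"), ("Country A", "c"),
    ("Country B", "a"), ("Country B", "d"),
    ("Country C", "b"), ("Country C", "c"),
]
print(rank_countries(portfolio, top_n=2))
```

In practice the ranking would be only a first filter: as the text notes, the selected countries also had to include projects addressing the most commonly shared environmental challenges.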

The application of the pre-mission selection process based on outcomes and sustainability ratings was accompanied by typical logistics and organizational considerations such as site accessibility and seasonality. In Bhutan, evaluators did not make the final selection of project sites to visit until after discussion upon arrival in the country with stakeholders in the Gross National Happiness Commission, relevant line ministries, and technical agencies such as the National Soil Services Center. For example, in the case of sustainable land management (SLM), these discussions resulted in the LDCs SCCE evaluation team visiting a site in Zhemgang District, selected out of three possible sites to allow logistical coordination with site visits to the other projects in the sustainability cohort and in consideration of road conditions in the mountainous country. This choice was made because the SLM project sites are located in areas of high incidence of land degradation that are inhabited by most of the country’s poorest and most vulnerable communities. Although the terminal evaluation had rated the project’s outcomes in the positive and sustainability in the negative range, the evaluation team could verify that the SLM measures introduced by the project were still in operation 5 years after the project was completed. Selecting Zhemgang District for a site visit allowed the evaluation team to observe, 5 years postcompletion, the main sustainability factors fostering positive SLM results in mountainous ecosystems alongside unforeseen hindering factors. The team could verify the status of SLM measures introduced by the project and directly collect information on their continued use and maintenance from the remote rural communities living in those highly degraded lands. Photos 1 and 2 show meetings to finalize selection of sites and interview rural communities in Bhutan.

For the African biomes SCCE, once the evaluation team had selected the countries and projects based on the pre-mission selection process described above, they prepared geospatial maps for each project site prior to the missions to the country. Once in the country, evaluators used these maps to select the sites to visit in the field verification mission (see Fig. 1). This ensured the conduct of field observations in specific project locations selected both in highly degraded areas and in areas where vegetation had actually increased.

Fig. 1 African biomes SCCE – project sites geospatial maps

The evaluation team shared these maps with stakeholders (on a laptop/smartphone in the field, or on paper in local offices) to stimulate discussions and identification of the key factors at play driving the change observed in the map—see Photos 3 and 4. Local technicians, locally elected representatives, and community members all confirmed the environmental changes in the areas indicated in the maps and provided additional insights on when, how, and why those changes occurred. For example, in Tolo (Guinea), areas of increasing vegetation were subject to intense afforestation efforts accompanied by strict enforcement measures by local government forestry technicians. In Kaback, the anti-salt dikes built with GEF support in highly degraded coastal areas were insufficient in both height and width to withstand water intrusion. Attempts were made in Konimodouya and Katonko to change the approach by building a more robust dike, but these too could not withstand the rising sea-level pressure.

The purposive selection processes described above allowed for an in-depth, more granular, and comprehensive understanding of which specific factors have influenced the observed sustainability following project completion. In Tolo, the watershed identified for relocating farmers from the Bafing Lake had insufficient water for irrigation, a case of poor project design (see Box 1). In Zhemgang, the quality of project design led to the highly positive observed sustainability postcompletion (see Box 2).

Box 1: Field Visit in a Site Selected Based on Pre-mission Analysis: Tolo, Bafing Lake (Guinea)

The GEF project applied a coherent ecosystem approach to the whole watershed, working with all the stakeholders involved. Evaluators selected two sites to visit in Tolo: The first concerned a protection measure to rehabilitate the Bafing Lake banks, and the second involved community-based farming in the adjacent watershed. The lake is a source for 50% of the water going to the Senegal River. Around the lake is a community village. One of the project objectives was to reduce deforestation around the lake that leads to erosion and water loss from the lake basin. Deforestation is due to land clearing for itinerant slash-and-burn agriculture. The local forest department enforces a forest-cutting ban around the lake. The project relocated the farmer community around the lake to a watershed 2 km from the village, where communities could practice horticulture. This delocalization measure was informed by a socioeconomic study followed by intensive participatory activities and negotiations, which provided a management arrangement for the distribution of land in the watershed and included granting some compensation measures to the farmers.

Years after the delocalization of the activity from the lake, the ecosystem of the lake banks has been slowly rehabilitated through intense reforestation measures (see Photo 5). The area has become green, with no agricultural activities around the lake, favoring the establishment of a micro-climate that benefits the whole ecosystem. It was reported that years ago, one could cross the lake on foot in April due to damage from deforestation. The banks around the lake, once degraded by unsustainable agricultural activities, are now green.

Photo 5 Reforestation around Bafing Lake

Access to water remains the key impediment for agriculture in the Mamou region. The two hectares of watershed to which the farmers have been relocated have an irrigation system with canals that allows water to be spread on the field and six groundwater wells, all of which are thanks to the Community Land Management Project investments. The mission found this area underused. Farmers reported that despite the investments made, they have enough irrigation water for only 6 months of the year (see Photo 6).

Photo 6 Watershed relocation land

Box 2: Field Visit to a Site Selected Based on Pre-mission Analysis: Zhemgang District (Bhutan)

The project aimed to strengthen institutional and community capacity for anticipating and managing land degradation. SLM practices were piloted in three geogs (groups of villages), where farmers were trained in SLM techniques. The project sites were in areas of high incidence of land degradation that were inhabited by the country’s poorest and most vulnerable communities. The project resulted in an increase in farmers practicing SLM techniques, a reduction in sediment flows in selected watersheds, regeneration of degraded forest land, and improved grazing land in the pilot geogs. The postcompletion site visit to a pilot geog in a remote area in Zhemgang noted continued practice of SLM techniques such as land terracing, hedgerows, fruit orchards, tree plantations, and irrigation systems. Selling produce both in the district and in Gelephu on the border with India has provided increased income for residents. Villagers confirmed in interviews that more land is under cultivation, and 60% of households continue using SLM techniques learned from the project. The remainder of the households discontinued using SLM due to shortages of water and losses caused by wildlife such as bears and wild boars. The government has provided some electric fencing, but it is not sufficient. The continued practice of SLM techniques has also helped improve and retain soil and convert shifting land cultivation to sustainable land cover (see Photo 7).

Photo 7 Fruit orchards contributing to soil conservation, observed in Zhemgang

Among the project outcomes were the preparation and implementation of the 2007 Land Policy Act that incorporated SLM principles in programs and policies including the National Land Policy, the Forestry Policy, the National Adaptation Program of Action, and the National Biodiversity Action Plan. SLM principles have been incorporated in the government’s 12th five-year plan (2018–2023) and in plans on poverty reduction and increased food security.

Key factors driving postcompletion sustainability were good project design and government support, including highly relevant objectives in line with government priorities and relevant activities to achieve the stated objectives. The project design was guided by a bottom-up approach with participatory planning that focused on community priorities, phased implementation allowing for adjustment throughout implementation based on learning from pilots, decentralization to strengthen the role of communities and local authorities, use of knowledge and information on farmer incentives, and an integrated multisectoral approach. Before the completion of the project, institutional, financial, technical, and policy arrangements were made for sustaining its outcomes.

Geospatial Analysis Following Project Field Visits

The IEO conducted targeted geospatial analysis once teams returned from the missions, using the geographic coordinates collected with GPS tracking apps installed on the team members’ smartphones and the information gathered during field observations. In both Bhutan and Guinea, this analysis showed increased vegetation despite lower precipitation in the project sites visited, providing complementary data to shed more light on the observed changes (see Figs. 2 and 3). The reforested areas evidenced by the satellite photos taken in 2012 and 2019 of the Bafing Lake basin in Guinea (Fig. 4) are the result of GEF-induced farmer relocation and afforestation activities accompanied by strict government enforcement. This temporary project success depends on the farmers continuing to practice horticulture in the watershed where they have been relocated.
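The pattern reported here, vegetation increasing while rainfall declines, can be checked for a single site by comparing the direction of two trends. The sketch below is one simple way to do this and is not the IEO's method: the source does not say which vegetation index or statistical test was used, so the ordinary least-squares slope, the function names, and the annual series are all assumptions introduced for illustration.

```python
from typing import Sequence


def slope(years: Sequence[float], values: Sequence[float]) -> float:
    """Ordinary least-squares slope of values against years."""
    if len(years) != len(values) or len(years) < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(years) / len(years)
    mean_y = sum(values) / len(values)
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


def greening_despite_drying(
    years: Sequence[float],
    vegetation: Sequence[float],
    rainfall: Sequence[float],
) -> bool:
    """True when vegetation trends upward while rainfall trends downward."""
    return slope(years, vegetation) > 0 and slope(years, rainfall) < 0


# Hypothetical annual series for one site (NOT data from the evaluation).
years = [2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019]
vegetation_index = [0.31, 0.33, 0.32, 0.36, 0.38, 0.39, 0.41, 0.43]
rainfall_mm = [1450, 1420, 1460, 1390, 1370, 1380, 1340, 1310]
print(greening_despite_drying(years, vegetation_index, rainfall_mm))
```

A check of this kind only flags sites where vegetation gains are unlikely to be explained by wetter conditions. Attributing the change to the project still relies on the field evidence described in the text.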

Fig. 2 African biomes SCCE – vegetation increase vs. lowering annual rainfall in the Bafing region, 2012 and 2019

Fig. 3 LDCs SCCE – vegetation increase vs. lowering annual rainfall in Zhemgang

Fig. 4 African biomes SCCE – vegetation increase around the Bafing Lake

In Zhemgang, both forest and vegetation cover in pastures have increased since the onset of the project. In Fig. 5, the 2010 image clearly shows large areas of relatively bare ground, which are subsequently covered by vegetation in 2018. The findings of the post-mission geospatial analysis of the SLM project confirmed the field visit finding of improved sustainability of outcomes years after project completion.

Fig. 5 Vegetation increase in Zhemgang

Lessons from the SCCE Experience

Using the selection process described in this chapter and a combination of geospatial analysis prior to and after field visits to targeted project sites, the SCCEs revealed that most of the field-verified projects maintained or sustained their outcomes postcompletion. This was the case for 87% of the projects field verified in the African biomes SCCE (16 projects), 81% in the SIDS SCCE (24 projects) and 70% in the LDCs SCCE (25 projects). More important, the selection of projects with a combination of outcome and sustainability ratings in both the positive and negative range and tagged to the main environmental challenges faced by the country led to a diverse group of projects selected for deep-dive analysis into which specific factors contributed to these improvements in the observed postcompletion sustainability. Enhanced learning led to a better understanding of how the environment and development nexus (or lack thereof) played out in contributing to or hindering the observed sustainability. This would not have been possible to achieve with the same granularity through the usual randomized approaches to country, project, and site selections for field verification applied in parallel to the conduct of aggregate analyses in previous IEO evaluations.

A second important lesson that informs the preparation of future IEO work plans is that applying the described sequencing approach from aggregate analysis to detailed observation took a long time. This investigation was possible because the three SCCEs were conducted in the 2 years following the completion of the Sixth Comprehensive Evaluation of the GEF, corresponding to a slightly lower intensity in the GEF decision makers’ demand for evaluative evidence from the office. To minimize the long timeframes that may result from sequencing, the most time-demanding activities should be conducted first. In the case of SCCEs, aggregate geospatial analysis was the most time-consuming and complex component, followed by making the arrangements for the missions in selected countries.

At times, pre-mission analysis needs adjustment to account for country-specific logistics and other organizational considerations influencing the final site visit selections. In the case of the LDCs SCCE, site selection had to account for the remoteness and challenges of traveling to several sites during a visit in a mountainous country. When this happens, care should be taken in adjusting site selection to allow as much compliance as possible with the results of the pre-mission aggregate analysis, while accounting for variation due to the site changes in the final deep-dive analysis, as was done in Bhutan.

Applying a purposive evaluative enquiry approach to evaluation encompasses sequencing the evaluation data-gathering and analysis components so that each component informs the following one. This approach has the potential to produce a deeper, more granular and comprehensive understanding of the issues being evaluated. This was achieved by introducing the new SCCE approach, in which evaluators used geospatial analysis with aggregate portfolio analysis and review of project documentation to design the case studies’ deep dives in terms of issues to focus on, data and information to gather, and exact locations for gathering those data.

The project selection method based on projects’ positive and negative outcomes and sustainability ratings proved very useful in generating new findings. Among these, field visits to 36 completed projects in 12 LDCs by the three SCCEs found that 25 projects sustained or progressed further in achievement of their outcomes after project completion. Teams found that these improvements were mainly attributable to two factors: the quality of project design and positive changes in the context taking place postcompletion. Although previous analyses already indicated the importance of good project design for fostering the sustainability of project outcomes (GEF IEO, 2019b), less was known about the different ways in which various contextual factors come progressively into play 4–5 years after a project is completed. This understanding sheds new light on how to best take advantage of the country- and site-specific context factors that enable the sustainability of GEF interventions, a lesson that further contributes to improving project design.