Global patterns of disaster and climate risk—an analysis of the consistency of leading index-based assessments and their results

Indices assessing country-level climate and disaster risk at the global scale have experienced a steep rise in popularity both in science and international climate policy. A number of widely cited products have been developed and published in recent years, argued to contribute critical knowledge for prioritizing action and funding. However, it remains unclear how their results compare, and how consistent their findings are on country-level risk, exposure, vulnerability and lack of coping, as well as adaptive capacity. This paper analyses and compares the design, data, and results of four of the leading global climate and disaster risk indices: the World Risk Index, the INFORM Risk Index, the ND-GAIN Index, and the Climate Risk Index. Our analysis clearly shows that there is a considerable degree of cross-index variation regarding countries’ risk levels and comparative ranks. At the same time, there is above-average agreement for high-risk countries. In terms of risk sub-components, there is surprisingly little agreement in the results on hazard exposure, while strong inter-index correlations can be observed when ranking countries according to their socio-economic vulnerability and lack of coping as well as adaptive capacity. Vulnerability and capacity hotspots can hence be identified more robustly than risk and exposure hotspots. Our findings speak both to the potential and to the limitations of index-based approaches. They show that a solid understanding of index-based assessment tools, and their conceptual and methodological underpinnings, is necessary to navigate them properly and to interpret and use their results in triangulation.

1 Introduction

[...] also raises questions for decision-makers, risk practitioners, and academics, the very target audiences of these products: It has become difficult for potential users to grasp the methodological differences, evaluate the validity of the different products, and juxtapose their results (Leiter et al. 2019). To put it simply, if risk indices are used, for instance, for prioritizing funding towards "high-risk" or "most vulnerable" countries, the resulting funding streams can look very different, depending on which index one picks.
This study addresses this issue by comparing four of the leading global multi-hazard risk indices which are not only highly cited in the scientific domain but have been taken up and used heavily in the media as well as in national and global policy debates, e.g., adaptation finance: The World Risk Index (WRI) (Welle and Birkmann 2015), the INFORM Global Risk Index (INFORM) (UN OCHA 2020), the ND-GAIN Country Index (GAIN) (Chen et al. 2015), and the Global Climate Risk Index (CRI) (Eckstein et al. 2019). This research analyses the design and results in a comparative manner. It evaluates the overall consistency of risk index information provided by these indices, not only in terms of overall risk but also regarding the major sub-components of exposure, vulnerability or susceptibility, lack of short-term coping capacity, and lack of longer-term adaptive capacity, where available in the different indices (WRI, INFORM, and GAIN). In doing so, the study helps to analyze whether indices produce similar patterns and results so that global risk hotspots can be robustly identified. Three research questions guide and structure our analysis:
1. How do the different global risk indices differ conceptually and in terms of their underlying indicators and construction?
2. How consistent are the results of different global risk indices, and are there differences in the level of agreement or disagreement when comparing overall risk vs. its sub-components exposure, vulnerability/susceptibility, and lack of short-term coping as well as long-term adaptive capacity?
3. How high is the agreement on global risk patterns and do the different indices in combination allow for the robust identification of hotspot countries and regions?
The paper is structured into six main parts: Section 2 provides a short introduction into global risk indices and reviews the limited previous work on evaluating and comparing their approaches and results. Section 3 explains the methods and data used for our analysis. Section 4 presents our results. The section is structured into three main parts, each responding to one of the three research questions. Section 5 discusses the results and Section 6 provides key conclusions and an outlook on future research needs.

Background: global risk indices and previous work on their evaluation
Over the past two decades, many researchers have argued for the academic and practical usefulness of indicator-based composite vulnerability and risk indices (Birkmann 2007; Cutter et al. 2003; Fekete 2009; de Sherbinin et al. 2017). Approaches have covered local to global scales (see, e.g., recent reviews of index-based assessments by Beccari 2016 and de Sherbinin et al. 2019). Nevertheless, alongside the proponents of such indices, others have raised epistemological and methodological concerns (e.g., Hinkel 2011).
Others have pushed back on such critique, aiming to defend the index approach (de Sherbinin et al. 2017).
However, the purpose of this paper is not to review and discuss the strengths and weaknesses of such indices or indicators overall. Irrespective of the critique, they have become part of the toolbox in risk research and are increasingly being used in policy making, for example for legitimizing policy claims or guiding priority-setting (Kreft and McKinnon 2014). Hence, our paper rather concentrates on assessing how consistent the information on risk scores and ranks is when comparing the different available global risk indices. The argument is that with the rising usage of such information in policy making, the lack of such consideration becomes a growing concern that needs to be addressed.
While not all of the global assessments mentioned above use indices to represent the multidimensional nature of risk (cf. Ward et al. 2020 for a recent review of global scale risk assessments), several of these assessments aggregate individual indicators into a final risk index to compare and rank countries. Most indices draw on a modular approach in which they define and combine different components of risk. In line with the framing first developed in the Special Report on Managing Extreme Events (SREX) of the Intergovernmental Panel on Climate Change (IPCC 2012), most indices differentiate between hazards (i.e., potentially threatening environmental and climatic events and stressors such as floods, storms, or sea level rise), exposure (i.e., the spatial and/or temporal placement of people, assets, and ecosystems within the reach of hazards), and vulnerability (i.e., the predisposition or susceptibility of elements to suffer harm when being exposed to hazards). In addition, many indices include proxies on the capacity to deal with risk, including the short-term coping and the long-term adaptive capacity of people, sectors, and systems. While the naming and precise conceptual taxonomies can vary between different indices, making comparisons challenging (see Section 4.1), the philosophy and general interpretation of these different building blocks of risk are mostly accepted and shared by the different assessment approaches.
For our analysis, we focus on four index-based global multi-hazard risk assessments (i.e., the WRI, INFORM, GAIN, and CRI) which (i) have been published in recent years, (ii) provide country rankings, (iii) are frequently updated in order to track changes in risk over time, and (iv) are widely used to inform action and policy.

Existing comparisons and evaluations of risk indices
With a sharp increase in the number of risk assessments addressing climate-related and natural hazards across spatial scales (local to global), sectors, and systems over the past decades, scholars have recently started to ask questions on the validity of these tools. A new body of literature has begun to compare the underlying concepts and approaches as well as their findings in order to validate their results (Gall 2007; Preston et al. 2011; Beccari 2016; Ford et al. 2018; Hagenlocher et al. 2019; de Sherbinin et al. 2019; Ward et al. 2020; see Online Supplementary Material for an overview of their main findings). At the same time, a growing number of studies have looked into the effects of vulnerability and risk index construction methods and choices, such as indicator selection, choice of source data, normalization, weighting, and aggregation, using sensitivity and/or uncertainty analysis (Schmidtlein et al. 2008; Tate 2013; Willis and Fitton 2016; Feizizadeh and Kienberger 2017; Machado and Ratick 2018; Il Choi 2019).
A number of recent studies have specifically compared index-based vulnerability and risk assessments for specific regions or scales. For sub-national risk indices, key contributions include, for example, Tate (2012), Rufat et al. (2015), and Anderson et al. (2019). At the global scale, Pelling (2013) compared the Disaster Risk Index (DRI; Peduzzi et al. 2009), the Natural Disaster Hotspots global risk analysis (Dilley et al. 2005), the Global Risk Analysis (GRA; United Nations 2011), and the World Risk Index (WRI) in terms of their conceptual orientation, methods, and key results at the risk score level. While Pelling (2013) provides key recommendations for sub-national and local approaches to assessing vulnerability and risk, no in-depth comparison of the convergence or divergence of these global risk indices and the implications for policy and action is provided. Garschagen et al. (2016a) compared six risk indices focusing on Africa and Asia exclusively. Their results showed that inter-index agreement is stronger at the high end of the risk spectrum than in the average range of the risk spectrum. Leiter et al. (2017) compared the top 20 country risk rankings of the WRI, INFORM, ND-GAIN, and CRI. They find rather strong disagreement in the rankings. Feldmeyer et al. (in this issue) also compare the vulnerability components of two risk indices (WRI and INFORM) but do so at the resolution of climate regions used within global circulation models. This approach is relevant for some contexts in that it allows a direct integration of climate and vulnerability data at the level of aggregation used within climate models. However, we argue that it is of limited relevance not only for the social and political sciences but also for potential usage in climate policy, e.g., the prioritization of climate finance at the country level (Garschagen and Doshi, under review).
In sum, we argue that our country-level approach adds very relevant information to the existing body of literature. Also, the differentiation into the different risk sub-components adds to this knowledge and is highly relevant, as climate policy now oftentimes talks about prioritizing adaptation resources toward the most vulnerable countries, calling for robust knowledge. At the same time, providing detailed and statistically robust assessments over the entire set of countries globally, not just the 20 highest risk countries as in Leiter et al. (2017), is of increasing relevance, we argue, e.g., as one piece of information contributing to the upcoming Global Stocktake under the Paris Agreement but also global adaptation and risk reduction tracking more generally.

Comparison of index components, indicators, and design
As laid out in Section 2.1, our analysis uses data from four of the leading global multi-hazard risk indices: WRI, INFORM, GAIN, and CRI. We analyzed similarities and differences between these four global risk indices on two levels: first, the conceptual design including the modular structure of sub-components and, second, the underlying indicators. Figure 1 shows the modular structure of each index and its sub-components and which of these components were compared in the correlation analysis across indices (see Section 3.2). In order to understand indicator overlaps and since some of the indicators measure the same effects but rely on different data sources, we binned conceptually similar indicators (out of the 113 total indicators across the indices) into 104 items. This was done for the conceptual comparison only and was not taken up in the following statistical analysis. For example, we combined the World Health Organization's indicator "People using at least basic drinking water services" with the World Bank indicator "Access to reliable drinking water" into the new indicator "Access to drinking water".

Fig. 1 Comparisons between four major global multi-hazard risk indices and their modules as well as their indicators, underlying data and method of index construction, including treatment of missing data, outliers and multicollinearity, the approach for normalization, indicator weighting, aggregation, and validation. The risk scores of WRI, INFORM, and GAIN consist of several sub-components next to the overall risk scores which align conceptually and were used for the later correlation analysis, namely exposure (EXP), vulnerability (VUL), susceptibility (SUSC), lack of coping capacity (LCC), and lack of adaptive capacity (LAC). * Some risk components were labelled differently in the original risk index methodology (e.g., what we call susceptibility was labelled as sensitivity in the ND-GAIN index). These deviations are explained in the light blue boxes.
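The binning of conceptually similar indicators and the counting of pairwise overlaps can be illustrated with a minimal sketch. The mapping below contains only the drinking water example given in the text; the index names `index_x` and `index_y`, the placeholder indicators, and the function names are hypothetical and do not reproduce the indicator sets of the actual indices.

```python
from itertools import combinations

# Mapping from source indicator names to a shared, binned item.
# Only the drinking water example comes from the text; a full mapping
# would have to be compiled from the indicator tables of each index.
BINS = {
    "People using at least basic drinking water services": "Access to drinking water",
    "Access to reliable drinking water": "Access to drinking water",
}


def binned(indicators):
    """Replace each indicator by its binned item; unmapped names stay as they are."""
    return {BINS.get(name, name) for name in indicators}


def pairwise_overlaps(index_indicators):
    """Return the set of binned items shared by each pair of indices."""
    items = {name: binned(inds) for name, inds in index_indicators.items()}
    return {(a, b): items[a] & items[b] for a, b in combinations(sorted(items), 2)}


# Hypothetical toy input: two placeholder indices with two indicators each.
toy_indices = {
    "index_x": ["People using at least basic drinking water services", "Placeholder indicator P"],
    "index_y": ["Access to reliable drinking water", "Placeholder indicator Q"],
}
toy_overlaps = pairwise_overlaps(toy_indices)
```

In this toy case the two differently sourced drinking water indicators fall into the same item, so the pair shares exactly one binned indicator even though no raw indicator name is identical.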

Correlation analysis of index results
In order to allow for a comparison between the results of the indices, which all have slightly different metrics and numbers of countries covered (Fig. 1), we first normalized the ranked country scores for risk and its sub-components on a scale from 0 to 1 using linear min-max normalization. In the next step, we conducted pairwise correlation analysis between the normalized ranks for all risk indices and their sub-components (Fig. 1). Spearman's rank correlation coefficient was used for the analysis since we used ranked values and not all the underlying data was normally distributed.
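The two steps described above (min-max normalization of ranks, then pairwise Spearman correlation) can be sketched as follows. This is a minimal illustration using only the standard library, not the implementation used for the study: the function names and toy scores are hypothetical, and restricting each pair of indices to the countries covered by both, as well as ranking within that shared set, is an assumption.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (smallest) to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def min_max(values):
    """Linear min-max normalization onto [0, 1]; assumes the values are not all equal."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def pearson(x, y):
    """Pearson correlation of two equally long sequences with non-zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(scores_a, scores_b):
    """Spearman's rho between two indices given as {country: score} dictionaries."""
    shared = sorted(set(scores_a) & set(scores_b))  # assumption: shared countries only
    ranks_a = min_max(average_ranks([scores_a[c] for c in shared]))
    ranks_b = min_max(average_ranks([scores_b[c] for c in shared]))
    # Pearson correlation of ranks equals Spearman's rho; min-max scaling leaves it unchanged.
    return pearson(ranks_a, ranks_b)


# Hypothetical toy scores; country "E" is covered by only one index and is ignored.
toy_a = {"A": 10, "B": 20, "C": 30, "D": 40}
toy_b = {"A": 1, "B": 3, "C": 2, "D": 4, "E": 5}
rho = spearman(toy_a, toy_b)  # 0.8 up to floating-point rounding
```

Because min-max normalization is a linear rescaling, it does not alter the correlation coefficient itself; its role is to put the ranks of indices with different country coverage on a common 0 to 1 scale for the later analysis of ranges and means.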

Analysis of ranges and means across country groups
An important question for the analysis was whether, and to what extent, a country's score and eventually rank differs between different indices. We therefore analyzed the range of normalized country scores between different indices and their sub-components (see Section 4.3). Countries for which only one index score is available were excluded from the analysis. Another question for the paper was how consistent the index results are when looking into specific country groups of particular relevance for climate and disaster risk policy decisions, such as least developed countries (LDCs), small island developing states (SIDS) (ITU 2017), landlocked developing countries (LLDCs), continental groups, income groups (UN DESA 2019), and the so-called Vulnerable 20 (V20) (V20 2018). We therefore assessed the range and mean values of risk and its sub-components in each of these country groups, using boxplots for graphical illustration (see Supplementary Material for results).
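The computation of inter-index ranges and means per country can be sketched as below. The input layout, the function name, and the toy values are assumptions for illustration; only the rule of excluding countries with a single available index score is taken from the text.

```python
def inter_index_summary(normalized):
    """Compute (range, mean) of normalized values per country across indices.

    normalized: {index_name: {country: value in [0, 1]}}
    Countries covered by fewer than two indices are excluded.
    """
    by_country = {}
    for index_scores in normalized.values():
        for country, value in index_scores.items():
            by_country.setdefault(country, []).append(value)
    return {
        country: (max(values) - min(values), sum(values) / len(values))
        for country, values in by_country.items()
        if len(values) >= 2
    }


# Hypothetical toy data: country "Z" appears in only one index and is dropped.
toy_normalized = {
    "index_a": {"X": 0.9, "Y": 0.20, "Z": 0.5},
    "index_b": {"X": 0.7, "Y": 0.25},
    "index_c": {"X": 0.8},
}
toy_summary = inter_index_summary(toy_normalized)
```

A small range indicates that the indices place a country at a similar position, while a large range signals disagreement; the same per-country values can then be grouped (for example by region or income group) to draw boxplots.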

Consistency in terms of index components, indicators, and design
Out of the four globally leading risk indices analyzed, three share a large overlap in terms of their conceptual understanding of risk and its elements (WRI, INFORM, GAIN), while the CRI differs markedly (see Fig. 1). WRI, INFORM, and GAIN all provide modular sub-components of risk, more or less closely in line with the standard risk framing developed over the past two decades within disaster risk and climate change adaptation research (IPCC 2012). That is, all three provide separate sub-indices for (1) the amount of exposed population, (2) a country's level of vulnerability or susceptibility, and (3) a country's level of coping and adaptive capacity, in terms of the readiness to deal with acute disaster situations (i.e., coping capacity) and/or the long-term adaptive capacity. CRI does not follow such a modular approach and only provides aggregate risk scores. In addition, it does not aim to assess hypothetical risk in the future but draws exclusively on data of past impacts (see Table 1 of the Supplementary Material for additional information). It uses past fatalities and economic losses to represent risk and hence is very different from the future-oriented, probabilistic, and hypothetical assessments of the other indices. Nevertheless, the CRI is considered in the subsequent comparison of aggregate risk results (though not the sub-components). This is because the CRI is taken up so heavily in the political space, media, and public discourse, where it is mostly used as a general measure of country-level climate risk. Hence, analyzing how its risk-level results compare to those of the other three indices is of great interest and relevance, despite the fact that CRI conceptually and methodologically differs quite heavily from the other three approaches. WRI and INFORM are the most similar indices conceptually. GAIN also uses risk sub-components but composes them in a quite different manner.
Still, as it uses core concepts of climate and disaster risk debates (exposure, vulnerability, susceptibility), including GAIN in the further comparison with WRI and INFORM is of high relevance, we argue. In terms of indicator overlap, a mixed picture emerges (see the Euler diagrams in Figure SM2 of the Supplementary Material). Overall, INFORM and WRI show the largest pairwise overlaps, followed by WRI and GAIN and then INFORM and GAIN. The only two indicators used by CRI are not shared by any other index. The WRI shares two-thirds of its indicators and has the greatest overlap with INFORM, whereas GAIN has 73% unique indicators. The large relative overlap of WRI indicators needs to be interpreted against the comparatively small number of indicators used (n = 27) compared to INFORM (n = 54) and GAIN (n = 45). In terms of overlaps in different risk components, the largest overlaps can be observed in the domains of exposure (6 indicators overlapping pairwise out of 28 used in total across WRI, INFORM, and GAIN) and vulnerability (11 pairwise overlaps out of 66 indicators used in total). The exposure indicators of WRI overlap strongly with the natural hazard exposure of INFORM, as both use earthquakes, floods, cyclones, and droughts as hazards and the population as the exposed element. The overlap could be even higher, but the epidemics indicator in INFORM consists of ten sub-indicators, thus leading to a higher divergence from WRI. In contrast, GAIN uses the projected impacts of climate change as indicators and has little in common with the other indices. The WRI has a high overlap in the vulnerability and susceptibility sub-components as well, while there is little to no overlap between INFORM and GAIN. The overlaps in lack of coping capacity and lack of adaptive capacity are much smaller (3 out of 26 and 1 out of 19, respectively).
Each risk index shares one indicator with each of the others regarding the lack of coping capacity (medical staff, ICT infrastructure, and the corruption index) and the lack of adaptive capacity (protected biomes habitats). Not surprisingly, the overlap in risk aggregates is highest due to the fact that risk is the aggregate of the previously mentioned components.
Next to their modular design and indicator choice, the indices also differ in terms of their approach to index construction, including the weighting, aggregation, and validation of indicators and the normalization of data as well as the treatment of missing data, outliers, and multicollinearity (see Fig. 1), all of which have a potentially large influence on the overall index results.

Correlations between risk indices and their different components
When the results of the different indices are compared for consistency, pairwise correlations can be observed, but the strength of correlation varies greatly between different indices and sub-components. In terms of overall risk aggregates, significant pairwise correlations can be observed between the WRI, INFORM, and GAIN indices, with the strongest correlation between GAIN and INFORM. The CRI is not significantly correlated with any of the other three risk indices (Table 1).
In terms of the exposure sub-components, the results show with high significance that no considerable pairwise correlation exists between the WRI, INFORM, and GAIN. For the vulnerability and susceptibility components, by contrast, pairwise correlations are much higher: they reach a coefficient of +0.9 between GAIN and WRI. The vulnerability results of INFORM and WRI are also strongly positively correlated (+0.82). The correlation between GAIN and INFORM is somewhat weaker (+0.7) but still strong in comparison to exposure. Even stronger positive correlations can be observed in the capacity components, i.e., the lack of short-term coping capacity (LCC) and longer-term adaptive capacity (LAC). In terms of LCC, all three indices show a very strong, statistically significant, positive correlation coefficient of +0.92. A similar coefficient can be observed in terms of the long-term adaptive capacity, even though only two indices (WRI and GAIN) provide such information.

Table 1 Correlation coefficients between the different global risk indices overall and in their different components on exposure, vulnerability, susceptibility, lack of coping capacity or readiness and lack of adaptive capacity, grouped by sub-components. The significance levels are indicated by the asterisks: ** correlation is significant at the 0.01 level (2-tailed) and * correlation is significant at the 0.05 level (2-tailed). The green color shading indicates the strength of correlation with a grouping into moderate (> 0.5), medium-strong (> 0.7), strong (> 0.8), and very strong (> 0.9) correlations.
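The strength classes named in the caption of Table 1 can be expressed as a simple threshold function. This is a minimal sketch: the function name, the use of the absolute value, and the label "unshaded" for coefficients at or below 0.5 are assumptions, while the four thresholds are those stated in the caption.

```python
def correlation_strength(rho):
    """Map a correlation coefficient to the shading classes of the Table 1 caption."""
    magnitude = abs(rho)  # assumption: classes refer to the size of the coefficient
    if magnitude > 0.9:
        return "very strong"
    if magnitude > 0.8:
        return "strong"
    if magnitude > 0.7:
        return "medium-strong"
    if magnitude > 0.5:
        return "moderate"
    return "unshaded"  # hypothetical label for coefficients at or below 0.5


# Two coefficients reported in the text, plus two hypothetical values.
labels = {value: correlation_strength(value) for value in (0.92, 0.82, 0.75, 0.3)}
```

Since the thresholds are strict inequalities and the coefficients in the text are rounded, values reported exactly at a boundary (such as +0.9 or +0.7) may fall into a different class than their unrounded counterparts.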
In sum, the strongest correlations can be observed in the capacity sub-components, followed by the vulnerability and susceptibility components and the overall risk scores. The exposure component is least strongly correlated. A possible explanation of these findings is provided in the discussion (Section 5). Further, the correlations between the WRI and the INFORM as well as between the WRI and GAIN are the strongest when comparing the entire package of different index sub-components. The overall correlations between INFORM and GAIN are slightly weaker. The CRI does not show significant correlations with the other indices overall (see Table SM3 of the Supplementary Material).

Agreement and disagreement on global risk patterns and regional hotspots
In addition to the correlation analysis, a key question was whether the consistency of index results differs across world regions and along the spectrum of countries from high to low risk. This question is of particular relevance for the robust identification of hotspots of risk, exposure, vulnerability, and lack of capacity, i.e., the question of whether different indices agree strongly on their ranking, particularly of countries with the highest risk. In order to answer this question, a detailed look into cross-index means and ranges is necessary. Figure 2, in combination with Figures SM3, SM4, SM5, and SM6 in the Supplementary Material, shows that a quite mixed picture emerges.
On the global scale, clear regional clusters with highest and lowest mean vulnerability and lack of capacity can be observed across the indices. The highest mean vulnerability and susceptibility occur in sub-Saharan Africa, South Asia, and parts of Southeast Asia (Fig. 2, panels e and g). A very similar pattern can be observed in terms of the lack of adaptive capacity (Fig. 2k) and, to a less clear extent, the lack of coping capacity (Fig. 2i). Interestingly, the right-hand side of Fig. 2 very clearly shows that the identified clusters of highest mean vulnerability and lack of adaptive capacity are also amongst the regions with the lowest inter-index range, i.e., with the highest agreement across different indices. In other words, identifying countries with highest vulnerability and lack of adaptive capacity seems to be possible in a quite robust fashion when considering different indices. Identifying clear regional clusters of countries with highest exposure is much more difficult according to the considered indices and their national level assessments. Here, the clearest regional pattern can be observed in Southeast Asia, however, with considerable differences between neighboring countries (Fig. 2c) and a quite high range (i.e., limited agreement) between the different indices, e.g., in Laos, Cambodia, and Myanmar (Fig. 2d). South Asia, East Asia, Australasia, Central America, and the western parts of South America also show high mean exposure, but with mixed agreement across the different indices. Sub-Saharan Africa does not emerge as a very clear cluster. Despite the less distinct regional clusters, some countries clearly combine a high mean exposure with a low range across the different indices, i.e., a high agreement across different indices, e.g., Bangladesh, Indonesia, and Ecuador.
Interestingly, no clear picture emerges in terms of overall risk metrics. Here, some global regions with high mean risk emerge (sub-Saharan Africa, Central and South Asia, parts of Southeast Asia) (Fig. 2a), yet with quite high ranges (in other words, low agreement) across the four different indices (Fig. 2b). The most consistent picture here seems to emerge for some countries on the lower end of the risk spectrum, e.g., those in Scandinavia but also Kazakhstan or Belarus.
These findings are confirmed by additional analysis (presented in detail in the Supplementary Material), which shows that the inter-index ranges are widest for overall risk and exposure (ranging all the way from 0 to almost 1), while they are much smaller for vulnerability and susceptibility (with the majority of countries having inter-index ranges smaller than 0.4) and are smallest for lack of coping and adaptive capacity (most of the countries stay below 0.2) (Figure SM3). In addition, we also conducted a statistical analysis on whether the ranges of results across different indices vary along particular country groups, i.e., different world regions, income levels, development status according to UN classification and V20 membership. The detailed findings of this analysis are provided in the Supplementary Material (Figure SM5). Overall, the ranges of inter-index results are not significantly different in any of the groups. In other words, the disagreement or agreement between the different indices and their sub-indices is not significantly higher or lower in any particular country group when dividing countries by the above-indicated markers.
In addition, when zooming into the inter-index ranges, i.e., the level of agreement/disagreement between different indices, a number of patterns can be observed in terms of the actual ranks of countries (see Figure SM4 in the Supplementary Material for a detailed depiction). In terms of overall risk, there is a cluster of countries with a high ranking in WRI, GAIN, and INFORM but a much lower ranking in CRI, e.g., the Central African Republic. Conversely, there are countries which are ranked with low risk in all indices but the CRI, e.g., Germany. Similar divergence can also be observed in terms of exposure between WRI, GAIN, and INFORM, while the analysis confirms the overall high consistency in terms of the other risk sub-components.

Integrated interpretation of the different steps of analysis
Triangulating between the results from the three different steps of analysis (Section 4) allows us to discuss and better explain the observed patterns overall. In particular, a looming question is to what extent the correlations shown in Section 4.2 as well as the ranges in Section 4.3 can be explained by the differences and similarities in terms of underlying concepts and data, as analyzed in Section 4.1. The low correlation of "risk indices" (WRI, GAIN, INFORM) on the one hand with the Climate Risk Index (CRI), which also claims to assess climate "risk," might be surprising at first. All these indices claim to provide a valid representation of risk and a comparative overview of different countries' risk levels. However, the variation can be explained to a large extent by the very different conceptual design and data base of the CRI when compared to the other three indices (WRI, INFORM, GAIN). Still, the fact that CRI produces starkly different results from WRI, INFORM, and GAIN, which are much more aligned, is troubling as all four indices often seem to be used interchangeably in academia, policy, media, and the general public. In addition, there is even a high divergence in the risk results of WRI, INFORM, and GAIN when considering the entire spectrum from low to high risk.
When zooming into the sub-components of the three indices, which are in principle more similar conceptually (WRI, INFORM, GAIN), the weak correlations in exposure might surprise at first. One would expect that the assessment and indicator-based expression of exposure is quite straightforward and hence yields similar results. However, triangulating this finding with the little overlap in terms of underlying exposure information (see Fig. 1 and the extensive table of indicators provided in the Supplementary Material), it becomes clear why no stronger correlations are observed. Still, the finding presents some challenges for the use of exposure data in policy making and other contexts. Whether a country is considered to have high or low exposure varies greatly, depending on which index is picked. Hence, the use of exposure index information, and therefore eventually risk index data, requires high familiarity with and technical expertise in the underlying data and indicator framework. It is questionable whether all actors potentially using this information will have such expertise and familiarity. In addition, the disagreement in exposure measures contributes significantly to the comparatively high disagreement that can be observed in the overall risk values.
Interestingly, the results of the different indices increasingly converge when moving on to the vulnerability and susceptibility sub-components and especially to lack of coping and adaptive capacity. Strong or very strong inter-index correlation can be observed for vulnerability, lack of coping capacity and lack of adaptive capacity in particular. Likewise, the inter-index ranges for these sub-components are much smaller than for risk and exposure. This strong convergence cannot exclusively be explained by the overlap in underlying indicators since the overlap is, at first sight, not much stronger than for risk or exposure. In fact, it is weaker. However, the underlying indicators in the different indices (such as income, poverty, food or health data) are correlated amongst each other. In any case, the comparatively strong convergence in the socio-economic vulnerability and capacity domains is a key finding for policy making which often targets the "most vulnerable." It suggests that the results in these domains are in fact more consistent than in the hazard and exposure domains.

Our findings in the light of previous studies
In contrast to the increasing number of studies that have provided an in-depth analysis of the robustness of vulnerability and risk indices toward changes in underlying concepts (e.g., Anderson et al. 2019) and methodological choices for index construction (e.g., Schmidtlein et al. 2008; Tate 2013; Willis and Fitton 2016; Feizizadeh and Kienberger 2017; Machado and Ratick 2018), we compare the results of four leading global risk indices, assuming that policy and decision makers who are using these indices likely focus on the results of these indices and often might not care too much about the methodological consistency. Although the analysis presented here goes beyond existing attempts to compare global risk indices (e.g., Feldmeyer et al., in this issue; Leiter et al. 2017) by either considering more indices or diving deeper into the different sub-components of risk, we also identify a number of similar findings. First, our study also confirms that there are stark conceptual differences in how risk is defined and ultimately operationalized in the assessments through indicators in each of the four global risk indices considered here, with implications for the consistency of their results, as postulated by Leiter et al. (2017). Second, our analysis also reveals a strong correlation in the vulnerability sub-components at the country level, confirming the results of Feldmeyer et al. (in this issue) who compared vulnerability aggregated into climate regions. However, by diving deeper into the sub-components of vulnerability (i.e., susceptibility, lack of coping and adaptive capacity), we were able to identify even stronger correlations in some of these sub-components for the WRI, INFORM, and GAIN.
At the same time, the analysis presented here also leads us to carefully balance some of the statements put forward by Leiter et al. (2017) on the use of such indices for international climate policy. Based on the comparison of the top 20 ranked countries alone, Leiter et al. (2017) conclude that these indices "cannot be used to reliably determine the most vulnerable countries" (Leiter et al. 2017, p. 2). In contrast to that, our analysis actually reveals a quite strong correlation for the vulnerability and lack of capacities sub-components of risk, and enabled us to identify a small number of countries that actually emerge in several of the four indices as "high-risk," "high-vulnerability," or "high lack of capacities" countries ( Fig. 2 and Fig. SM6 in the Supplementary Material).
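The idea of countries emerging as hotspots in several indices can be sketched as a simple counting exercise. The snippet below is only an illustration under stated assumptions: it assumes that higher scores mean higher risk, uses a top-n cut-off, and works with hypothetical index names, country labels, and scores. It is not necessarily the procedure behind Fig. 2.

```python
from collections import Counter


def top_n(scores, n):
    """Return the n countries with the highest scores (higher is assumed to be worse)."""
    return set(sorted(scores, key=scores.get, reverse=True)[:n])


def shared_hotspots(indices, n, min_indices):
    """Count in how many indices each country is in the top n; keep frequent ones."""
    counts = Counter(
        country for scores in indices.values() for country in top_n(scores, n)
    )
    return {country: k for country, k in counts.items() if k >= min_indices}


# Hypothetical scores for five placeholder countries in three placeholder indices.
indices = {
    "index_1": {"A": 0.9, "B": 0.8, "C": 0.3, "D": 0.2, "E": 0.1},
    "index_2": {"A": 0.7, "B": 0.2, "C": 0.8, "D": 0.3, "E": 0.1},
    "index_3": {"A": 0.6, "B": 0.5, "C": 0.9, "D": 0.1, "E": 0.2},
}

# Countries that are among the top two in at least two of the three indices.
print(shared_hotspots(indices, n=2, min_indices=2))
```

Here country A is in the top two of all three hypothetical indices and country C in two of them, while B appears only once and is therefore not reported. Both the cut-off n and the required number of indices are free choices that affect which countries are flagged.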

Remaining question marks
Although our approach adds knowledge on global risk index data, some key questions remain open. We reflected above on how to explain the difference between CRI results and those of WRI, INFORM, and GAIN. Yet, on a different level, it is still interesting to observe that impact data on past events, as used in the CRI (here: mortality and economic losses), are so weakly correlated with the overall risk levels assumed by the other three risk indices. A possible explanation is the comparatively short time-frame of data used in the CRI, which captures disasters and extreme events and might therefore generate different patterns than assessments focusing on the underlying bio-geophysical hazard potential and overall vulnerability conditions of countries more generally. Still, observing the correlations, or non-correlations, between both in the future will be of high relevance as extreme events are expected to rise in frequency and intensity along with climate change (IPCC 2018). In addition, the merger of fatalities and economic losses into one risk score might be problematic in this context. It is well known empirically that poorer and less developed countries are more likely to suffer fatalities in disasters, while rich countries can expect more economic losses (CRED and UNDRR, 2020). Further, as highlighted by the most recent annual report of the Emergency Events Database (EM-DAT), which publishes data on disaster impacts (UCLouvain et al. 2019), and by the scientific literature (Osuteye et al. 2017; Panwar and Sen 2020), it has to be noted that, despite major improvements in making event and impact data available in a standardized manner, there are still prevalent issues with underreporting of impacts for specific hazard types (e.g., heat waves) and in developing countries. Lumping impact data on mortality and economic losses into an overall risk score hence produces data which might not be very informative when comparing very different country groups, and might therefore correlate only weakly with other risk assessments.

Reflections on our approach
The approach chosen here brings many advantages, e.g., in regard to the transparency and simplicity of the individual analysis steps and the fact that it is based on nation-state resolution (a level of key concern in international climate finance and policy). It hence allows for country-by-country comparisons. However, our approach also comes with a number of limitations, calling for further research. First, while we analyzed the similarities and differences of different indices in terms of their modular design and underlying indicators, we did not dive deep into assessing the role of indicator choice and different normalization, weighting, and aggregation options and the implications those can have for index results and overall cross-index consistency. Previous work on comparing sub-national risk indices in delta regions suggests that the choice of aggregation method has a considerable effect on overall results. This finding is also shared by further previous studies that used sensitivity analysis to evaluate the robustness of index scores toward changes in construction steps (e.g., Tate 2013; Tate 2012). In the case of the WRI, INFORM, and GAIN, a quite similar aggregation and weighting approach has been applied. Still, given the high relevance of understanding the consistency of the four risk indices considered here toward changes in the input parameters and index construction steps, a detailed analysis in the future would be worthwhile, we argue, in order to characterize the effect of different approaches more precisely. Second, the indices compared here were developed based on data from different years, which might somewhat distort the findings of the comparison. However, given that key social and economic factors (e.g., poverty or education levels) usually do not change tremendously over short time periods, we consider this effect to be negligible.
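Why the aggregation step matters can be illustrated with a minimal sketch that ranks the same hypothetical countries under two common aggregation rules, the arithmetic and the geometric mean of equally weighted components. The component values are invented and assumed to be already normalized to the range 0 to 1. Neither rule is claimed here to be the one used by any of the four indices.

```python
from math import prod


def arithmetic_mean(components):
    """Additive aggregation: a low component can be fully offset by a high one."""
    return sum(components) / len(components)


def geometric_mean(components):
    """Multiplicative aggregation: a low component pulls the score down more strongly."""
    return prod(components) ** (1 / len(components))


def ranking(countries, aggregate):
    """Order country labels from highest to lowest aggregated score."""
    scores = {name: aggregate(parts) for name, parts in countries.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical, already normalized (exposure, vulnerability) values per country.
countries = {
    "A": (0.9, 0.1),  # very high exposure, very low vulnerability
    "B": (0.4, 0.5),  # moderate on both components
    "C": (0.2, 0.3),  # low on both components
}

print("arithmetic:", ranking(countries, arithmetic_mean))
print("geometric: ", ranking(countries, geometric_mean))
```

With these values, country A ranks first under the arithmetic mean but drops behind country B under the geometric mean, because the multiplicative rule does not let high exposure compensate for low vulnerability to the same extent. A full sensitivity analysis would additionally vary indicator choice, normalization, and weights.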

Conclusions and outlook
This analysis was motivated by the observation that the recent rise in the number and use of index-based approaches to assess countries' climate and disaster risk at the global scale is met with a lack of clarity on the consistency of assessment results across different indices. This gap, we argue, is troubling not only from an intellectual, but also from a practical and policy point of view. Risk information, as provided by global scale risk indices, is increasingly being used to inform international climate policy and finance decisions. It has the potential to exert considerable influence, e.g., when funding decisions aimed at targeting the "most vulnerable" or "most at risk" countries are based on the findings from such indices.
We, therefore, asked how the leading global climate and disaster risk indices differ in terms of their design and data, how their results differ, and whether their results, in combination, allow for the identification and ranking of global hotspots in climate and disaster risk, exposure, vulnerability, and lack of coping as well as adaptive capacity. Our results reveal a mixed picture. Out of the four leading global multi-hazard risk indices which were analyzed and compared, i.e., the World Risk Index (WRI), the INFORM Risk Index, the ND-GAIN Index, and the Climate Risk Index (CRI), three (i.e., WRI, INFORM, GAIN) are built more or less closely around the current mainstream conceptual thinking in risk and climate research. They all use sub-components on hazard, exposure, socio-economic vulnerability, and the (lack of) short-term as well as long-term capacity to cope with and adapt to hazards. They also have a rather similar approach to utilizing data. This sets them apart from the fourth index considered, the CRI, which is centered around a simpler outcome-oriented assessment of past hazard impacts. Not surprisingly, therefore, the former three indices show comparatively strong correlations with each other, but not with the CRI. Surprisingly, the correlations in the sub-components related to social, economic and institutional parameters (i.e., vulnerability and capacity issues) are much higher than those for the hazard exposure and overall risk components. This means that global vulnerability and lack of capacity hotspots can be identified and ranked much more consistently than hotspots of overall risk or hazard exposure.
Our analysis does not intend to resolve the general chasm in the literature regarding the perceived usefulness of index-based approaches to assess risk and vulnerability (Section 2), nor does it provide all the necessary knowledge or arguments to do so. In fact, our findings speak to the potential as well as limitations of index approaches. Yet, we observe that such tools are increasingly being used outside of the academic debate. We therefore hope that our findings can contribute to fostering a careful reflection and use of available index-based approaches at the global scale. If our analysis shows only one thing, it is that a solid understanding of modular index-based assessment tools, and their conceptual and methodological underpinnings, is necessary to navigate them properly and interpret as well as use their results in triangulation.
Author contributions MG and MH designed the concept for the analysis. DD and JR were responsible for data collection. JR analyzed the overlaps of indicators. MG conducted the correlation analysis and the analysis of means vs. ranges. DD created the boxplots, and MH conducted the global hotspot analysis. All authors have contributed to drafting the manuscript and the interpretation of the results under the lead of MG, and approved the final manuscript.
Funding Open Access funding enabled and organized by Projekt DEAL. This research has received funding from the LAKARI project and has been supported by the German Federal Ministry of Education and Research (BMBF; grant no. D/396/67223147).

Declarations
Competing interests The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.