1 Introduction

Since Balassa (1965), comparative advantages have been measured through Revealed Comparative Advantage (RCA) indexes. RCA indexes are calculated using trade data under the assumption that trade flows can “reveal” comparative advantages. To the question “how can trade data be used to calculate an RCA index?”, Balassa (1965) suggests a simple and intuitive answer: if a product has greater weight in some country’s total exports than in a given trade area, then this divergence can be interpreted as the consequence of comparative advantage. In this regard, the RCA index from Balassa (1965), hereafter referred to as the B index, is calculated as the share of that product in the country’s total exports divided by the share of the product in total exports from all countries in the trade area. Thus, the B index reveals comparative advantages (disadvantages) if its value is greater (less) than 1.

The widespread use of the B index conveys the idea that comparative advantages should be measured according to that index. Recent examples of the use of the B index include Barattieri (2014), Nath et al. (2015), Nath and Goswami (2018), Bagci (2016), Carrère and Strauss-Kahn (2017), Kathuria (2018), Lectard and Rougier (2018), and Bahar et al. (2019). In addition, various international organizations seem to support the B index, such as the World Bank in its World Integrated Trade Solution database, the World Trade Organization in its Practical guide to trade policy analysis (Bachetta et al., 2012), and the United Nations Conference on Trade and Development in its RCA radar plots.

However, doubts about the B index have been expressed in the literature. These doubts primarily concern the “theoretical” properties of the B index, namely the nature of the variables used by the B index and the method used to incorporate these variables into a formula. For instance, the B index is based on export data but ignores import data. However, using both export and import data would increase the precision of measurements of comparative advantages by better capturing both the demand and supply dimensions (Vollrath, 1991). Furthermore, the B index is subject to a size bias: a country can be associated with high values of the B index and, consequently, strong comparative advantages even if the country represents a relatively small share of exports (i.e., a small “size”) in a given trade area (Leromain & Orefice, 2014). In addition, the B index reveals comparative advantages if its value is greater than 1 and comparative disadvantages if its value belongs to [0, 1). Because the range of values is bounded for comparative disadvantages but unbounded for comparative advantages, the B index does not measure comparative advantages and disadvantages in the same way (Laursen, 2015). Lastly, the B index is not additive, whereas with an additive index, “EU’s comparative advantage in producing vehicles can be measured by the sum of its membership countries’ comparative advantage in producing vehicles; and China’s comparative advantage in the various labor-intensive products as a whole can be measured by the sum of China’s comparative advantage in each specific labor-intensive product” (Yu et al. 2009, p. 273).

In addition to doubts about the theoretical properties of the B index, concerns have arisen regarding its “empirical” properties, namely the properties of the values of the B index when applied to a given set of countries, products and periods. Three main shortcomings have been reported. First, the B index proves to be unstable over time even though comparative advantages are supposed to be sticky. Second, its statistical distribution is asymmetric, whereas a symmetric distribution would better reflect the fact that comparative disadvantages counterbalance comparative advantages, and it exhibits fat tails even though observations suggest that strong comparative (dis)advantages are relatively rare. Third, the B index is subject to an ordinal ranking bias as the values of the B index may not rank countries in a consistent way (Yeats, 1985; Hoen & Oosterhaven, 2006; Yu et al., 2009; Leromain & Orefice, 2014; Liu & Gao, 2019).

This paper focuses on these three empirical properties—time stationarity, symmetric shape with thin tails, and minimization of ordinal ranking bias—and suggests a new set of measures to evaluate the ability of the B index and its alternatives to fit these properties. These new measures represent revisions of the measures already used in the literature, with the aim of increasing their reliability. Ultimately, these improved measures enhance the selection of the most appropriate RCA index to apply to a given set of countries, products and periods, based on a discussion of the empirical properties of RCA indexes together with a consideration of their theoretical properties. This is especially relevant for the design of economic policies on the basis of comparative advantages (on this point, see Gallardo 2005). We discuss different methods to rank RCA indexes according to all proposed measures, and we apply these methods to a data set of more than 595 million values of 33 RCA indexes encompassing different trade areas and product classifications.

The rest of the paper is organized as follows. Section 2 briefly reviews the most representative RCA indexes in the literature. Sections 3, 4 and 5 present our new tools to evaluate the time stationarity, shape and ordinal ranking bias of RCA indexes, respectively. Section 6 describes our ranking of RCA indexes according to all measures, and Sect. 7 presents our empirical assessment of RCA indexes. Concluding remarks are given in Sect. 8.

2 An overview of RCA indexes

Let J be a set of countries that form a trade area, K a set of products (or sectors) and T a set of time periods. An RCA index is a (many-to-one) function from \(J\times K\times T\) to \(\mathbb {R}\) or a subset of \(\mathbb {R}\) such that every triplet \((i,k,t) \in J\times K\times T\) is associated with a unique real number, which we denote as \(\text {RCA}_{ikt}\). There exists a comparative-advantage “neutral” value \(\phi\) such that the inequality \(\text {RCA}_{ikt}>\phi\) reveals comparative advantages for country i in J with respect to product k in time period t. The inequality \(\text {RCA}_{ikt}<\phi\) reveals comparative disadvantages, and the equality \(\text {RCA}_{ikt}=\phi\) reveals the absence of comparative advantages and disadvantages. In this regard:

  • If \(\text {RCA}_{i_1kt}>\text {RCA}_{i_2kt}>\phi\), country \(i_1\) has a higher comparative advantage than country \(i_2\) for product k in period t. If \(\phi>\text {RCA}_{i_1kt}>\text {RCA}_{i_2kt}\), \(i_1\) has a lower comparative disadvantage than \(i_2\) for k in t.

  • If \(\text {RCA}_{ik_1t}>\text {RCA}_{ik_2t}>\phi\), i has a higher comparative advantage for product \(k_1\) than for product \(k_2\) in t. If \(\phi>\text {RCA}_{ik_1t}>\text {RCA}_{ik_2t}\), i has a lower comparative disadvantage for \(k_1\) than for \(k_2\) in t.

  • If \(\text {RCA}_{ikt_1}>\text {RCA}_{ikt_2}>\phi\), i has a higher comparative advantage for k in period \(t_1\) than in period \(t_2\); and if \(\phi>\text {RCA}_{ikt_1}>\text {RCA}_{ikt_2}\), i has a lower comparative disadvantage for k in \(t_1\) than in \(t_2\) (Danna-Buitrago, 2017).

We define \(\text {B}_{ikt}\) as the B index calculated for (ikt): \(\text {B}_{ikt}=(X_{ikt}/X_{i.t})/(X_{.kt}/X_{..t})\) where \(X_{ikt}\in \mathbb {R}_{+}\) denotes the exports of k from i to the other countries among J in t; \(X_{i.t}=\sum _{l\in K}X_{ilt}\) denotes the exports from i to the other countries among J for all products among K (in t); \(X_{.kt}=\sum _{j\in J}X_{jkt}\) denotes the exports of k from every country to another among J; and \(X_{..t}=\sum _{j\in J}\sum _{l \in K}X_{jlt}\) denotes the exports from every country to another among J for all products among K. \(\text {B}_{ikt}>1\) implies that k has more weight in the exports of i in t relative to a “typical” country among J (French, 2017) and therefore should reveal comparative advantages for (ikt). On the contrary, \(0\le \text {B}_{ikt}<1\) reveals comparative disadvantages, and \(\text {B}_{ikt}=1\) is the comparative-advantage neutral value.
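The definition of \(\text {B}_{ikt}\) can be illustrated with a minimal Python sketch for a single period t. The nested-dictionary layout, the function name `balassa_index` and the two-country toy data are assumptions made for illustration only; they do not come from the paper's data set.

```python
def balassa_index(exports):
    """Return B_ikt for one period t.

    `exports[i][k]` is X_ikt; the nested-dict layout is an assumption
    made for this sketch, not a format used in the paper.
    """
    countries = list(exports)
    products = sorted({k for flows in exports.values() for k in flows})
    # Row, column and grand totals: X_{i.t}, X_{.kt}, X_{..t}
    x_i = {i: sum(exports[i].get(k, 0.0) for k in products) for i in countries}
    x_k = {k: sum(exports[i].get(k, 0.0) for i in countries) for k in products}
    x_all = sum(x_i.values())
    index = {}
    for i in countries:
        for k in products:
            # B is left undefined when one of the shares has a zero denominator
            if x_i[i] > 0 and x_k[k] > 0:
                share_in_country = exports[i].get(k, 0.0) / x_i[i]
                share_in_area = x_k[k] / x_all
                index[(i, k)] = share_in_country / share_in_area
    return index


# Hypothetical two-country, two-product trade area
toy_exports = {"A": {"wheat": 80.0, "cars": 20.0},
               "B": {"wheat": 20.0, "cars": 80.0}}
b = balassa_index(toy_exports)
print(b[("A", "wheat")], b[("A", "cars")])
```

In this toy area, wheat accounts for 80% of A's exports but 50% of the area's exports, so the index exceeds 1 for (A, wheat) and falls below 1 for (A, cars).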

As summarized in Tables 1, 2 and 3, various alternatives to the B index have been proposed in the literature. Some of these alternatives stem from modifications of the B index. The “additive” version of the B index (Hoen & Oosterhaven, 2006), referred to as the AB index, calculates the difference between \(X_{ikt}/X_{i.t}\) and \(X_{.kt}/X_{..t}\) instead of the ratio, namely \(\text {AB}_{ikt}=X_{ikt}/X_{i.t}-X_{.kt}/X_{..t}\). The “symmetric” version of the B index (Dalum et al., 1998; Laursen, 2015), which we denote as SB, is an approximation of the log of the B index around the neutral value of the B index (i.e., 1):Footnote 1 \(\text {SB}_{ikt}=(\text {B}_{ikt}-1)/(\text {B}_{ikt}+1)\). Furthermore, in “weighted” versions of the B index, \(\text {B}_{ikt}\) is normalized by the average B index either across products (K) for a given country (Proudman & Redding, 1998, 2000) or across countries (J) for a given product (Amador et al., 2011). We write these weighted versions of the B index as \(\text {WB}^{K}\) and \(\text {WB}^{J}\): \(\text {WB}_{ikt}^{K}=\text {B}_{ikt}/\left( \frac{1}{\#K}\sum _{l\in K}\text {B}_{ilt}\right)\) and \(\text {WB}_{ikt}^{J}=\text {B}_{ikt}/\left( \frac{1}{\#J}\sum _{j\in J}\text {B}_{jkt}\right)\).
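The AB, SB and weighted variants can be sketched in the same way. The code below is only an illustration under stated assumptions: the function name `b_variants`, the rectangular dictionary layout and the toy figures are hypothetical.

```python
def b_variants(exports):
    """Return AB, SB, WB^K and WB^J for one period.

    `exports[i][k]` is X_ikt and is assumed to be rectangular
    (every country reports every product).
    """
    countries = list(exports)
    products = list(next(iter(exports.values())))
    x_i = {i: sum(exports[i].values()) for i in countries}
    x_k = {k: sum(exports[i][k] for i in countries) for k in products}
    x_all = sum(x_i.values())
    # Plain B index, needed by SB and by both weighted versions
    b = {(i, k): (exports[i][k] / x_i[i]) / (x_k[k] / x_all)
         for i in countries for k in products}
    out = {}
    for (i, k), value in b.items():
        mean_over_products = sum(b[(i, l)] for l in products) / len(products)
        mean_over_countries = sum(b[(j, k)] for j in countries) / len(countries)
        out[(i, k)] = {
            "AB": exports[i][k] / x_i[i] - x_k[k] / x_all,  # difference of shares
            "SB": (value - 1.0) / (value + 1.0),            # bounded in [-1, 1)
            "WB_K": value / mean_over_products,
            "WB_J": value / mean_over_countries,
        }
    return out


# Hypothetical data: B equals 1.5 for (A, wheat)
toy_exports = {"A": {"wheat": 60.0, "cars": 40.0},
               "B": {"wheat": 20.0, "cars": 80.0}}
variants = b_variants(toy_exports)
print(variants[("A", "wheat")])
```

The neutral value differs across variants: it is 0 for AB and SB, whereas WB^K and WB^J rescale B by an average of B values.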

Table 1 The B index and alternatives (1)
Table 2 The B index and alternatives (2)
Table 3 The B index and alternatives (3)

Other RCA indexes are based not only on exports but also on imports. For this purpose, let us denote as \(M_{ikt}\) the imports of k by i in t, and as \(M_{i.t}\), \(M_{.kt}\) and \(M_{..t}\) the counterparts of \(X_{i.t}\), \(X_{.kt}\) and \(X_{..t}\), respectively. The RCA index suggested by Balassa (1986) and referred to as the B2 index consists of the trade balance associated with (ikt), that is, \(X_{ikt}-M_{ikt}\), normalized by the total trade associated with (ikt), namely \(X_{ikt}+M_{ikt}\). Donges and Riedel (1977) and Gnidchenko and Salnikov (2015) proposed modifications of B2. Donges and Riedel (1977), whose RCA index we denote as B2D, calculate the ratio of \(\text {B2}_{ikt}\) to the same type of ratio calculated at the level of K, that is, \((X_{i.t}-M_{i.t})/(X_{i.t}+M_{i.t})\), before subtracting 1 from that ratio and ultimately multiplying by 1 or -1 depending on the sign of the total trade balance \(X_{i.t}-M_{i.t}\) associated with (it). Gnidchenko and Salnikov (2015) take into account GDP in addition to exports and imports. Their RCA index, which we denote as B2G, weights \(\text {B2}_{ikt}\) by the relative openness rate, defined as the ratio of trade \(X_{ikt}+M_{ikt}\) to the GDP of i in t divided by the same ratio at the level of J.
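As a rough illustration, the B2 index and the Donges and Riedel (1977) modification can be written as two small functions. The implementation of the second function follows our reading of the verbal description above (ratio of the two normalized balances, minus 1, signed by the total trade balance); the function names and the numbers are hypothetical.

```python
def b2(exports, imports):
    """Trade balance normalized by total trade: (X - M) / (X + M)."""
    total = exports + imports
    return (exports - imports) / total if total else float("nan")


def b2_donges_riedel(x_ik, m_ik, x_i, m_i):
    """(B2_ikt / B2_i.t - 1) signed by the total trade balance X_i.t - M_i.t.

    Sketch based on the verbal description in the text; undefined when the
    total trade balance of the country is zero.
    """
    balance = x_i - m_i
    if balance == 0:
        return float("nan")
    sign = 1.0 if balance > 0 else -1.0
    return (b2(x_ik, m_ik) / b2(x_i, m_i) - 1.0) * sign


# Hypothetical flows: product-level surplus larger than the overall surplus
print(b2(30.0, 10.0))                             # product-level B2
print(b2_donges_riedel(30.0, 10.0, 120.0, 80.0))  # relative to the country-level ratio
```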

Michaely (1962) proposes an RCA index that, like the B2, B2D and B2G indexes, takes into account both exports and imports. Contrary to the AB index, this RCA index calculates the difference between \(X_{ikt}/X_{i.t}\) and \(M_{ikt}/M_{i.t}\) instead of the difference between \(X_{ikt}/X_{i.t}\) and \(X_{.kt}/X_{..t}\). Consequently, the share of k in i’s imports substitutes for the share of k in J’s exports. Another solution is suggested by Algieri et al. (2022) and Danna-Buitrago and Stellian (2022) and is the difference between the SB index and the same kind of metric in relation to imports, that is, the log-approximation of \((M_{ikt}/M_{i.t})/(M_{.kt}/M_{..t})\). This RCA index is denoted as RC and is rather similar to the RCA index labelled “Revealed Competitiveness” suggested by Vollrath (1991). Vollrath (1991) proposes removing the trade flows involving k and/or i when exports are added up across products and/or countries. For example, \(X_{i.t}-X_{ikt}\) substitutes for \(X_{i.t}\); therefore, \(X_{ikt}/X_{i.t}\) becomes \(X_{ikt}/(X_{i.t}-X_{ikt})\). This would enable a clear distinction between k and the other products, and between i and the other countries. Danna-Buitrago and Stellian (2022) explain that preserving these flows is necessary because otherwise the values of the corresponding RCA index cannot be computed if a country is the sole exporter or importer of some product.Footnote 2 In addition, Vollrath (1991) applies the log to the corresponding export-based and import-based ratios instead of the approximation of the log. Again, as argued by Danna-Buitrago and Stellian (2022), using the log prevents the calculation of an RCA index throughout a given set of countries, products and time periods.Footnote 3
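The Michaely index and the RC index lend themselves to a short sketch. The scalar-argument signatures and the figures below are assumptions made for illustration; the formulas follow the verbal definitions given in this section.

```python
def symmetric(value):
    """Log-approximation around 1 used by the SB index: (v - 1) / (v + 1)."""
    return (value - 1.0) / (value + 1.0)


def michaely(x_ik, x_i, m_ik, m_i):
    """Share of k in i's exports minus share of k in i's imports."""
    return x_ik / x_i - m_ik / m_i


def rc(x_ik, x_i, x_k, x_all, m_ik, m_i, m_k, m_all):
    """SB-type export metric minus the same metric computed on imports."""
    export_ratio = (x_ik / x_i) / (x_k / x_all)
    import_ratio = (m_ik / m_i) / (m_k / m_all)
    return symmetric(export_ratio) - symmetric(import_ratio)


# Hypothetical flows: k is over-represented in exports and under-represented in imports
print(michaely(30.0, 100.0, 10.0, 100.0))
print(rc(30.0, 100.0, 50.0, 500.0, 10.0, 100.0, 100.0, 500.0))
```

Because the symmetric transformation is defined at zero, RC remains computable when a flow is zero, which a log-based version would not allow.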

Further improvements to the RC index have subsequently been suggested, thus making RC the baseline RCA index of a whole class of RCA indexes. First, the RC index can be adjusted by multiplying it by a coefficient whose value is based on the respective GDPs of all countries in J. The purpose of this coefficient is to better measure comparative advantages, as GDP differentials between countries are a proxy for differences in the availability of economic resources for creating comparative advantages or enhancing existing comparative advantages. The coefficient is given by \(m^{-Y_{it}/\bar{Y}_{t}+1}\in [0,m)\) where m is strictly positive and \(\bar{Y}_{t}\) denotes the across-country average GDP in t. The corresponding RCA index is denoted \(\text {RC}^{Y}\) if the aforementioned coefficient is based on GDP, and \(\text {RC}^{y}\) if GDP per capita is used instead (Danna-Buitrago & Stellian, 2022; Stellian & Danna-Buitrago, 2022).
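The GDP-based coefficient can be computed directly from its formula. In the sketch below, the value m = 2 and the GDP figures are purely hypothetical; the paper's choice of m is not stated in this section.

```python
def gdp_coefficient(gdp, m=2.0):
    """Return m ** (1 - Y_it / mean(Y_t)) for every country.

    `gdp[i]` is Y_it for one period; m = 2.0 is an arbitrary
    illustrative value, not the paper's calibration.
    """
    mean_gdp = sum(gdp.values()) / len(gdp)
    return {i: m ** (1.0 - y / mean_gdp) for i, y in gdp.items()}


# Hypothetical GDPs with an across-country average of 100
coefficients = gdp_coefficient({"A": 50.0, "B": 100.0, "C": 150.0})
print(coefficients)
```

With m greater than 1, a country whose GDP equals the average gets a coefficient of 1, smaller economies get a coefficient above 1 and larger economies a coefficient below 1.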

Second, following the RCA indexes in terms of contribution to the trade balance, which are presented below, every trade flow can be adjusted so that the share of k in total trade is the same in each period and corresponds to the share of a given reference period r. For this purpose, every export/import flow associated with (kt), namely \(X_{ikt}\) and \(M_{ikt}\) \(\forall i \in J\), is multiplied by \(w_{kr}/w_{kt}\) where \(w_{kt}=(X_{.kt}+M_{.kt})/(X_{..t}+M_{..t})\) is the share of k in the total trade of J in t.Footnote 4 The corresponding RCA index is denoted as \(\text {RC}^{r}\). Third, GDP adjustment and trade-flow adjustment can take place together. The corresponding RCA indexes are denoted as \(\text {RC}^{Y,r}\) and \(\text {RC}^{y,r}\), respectively.
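The trade-flow adjustment amounts to rescaling each flow by \(w_{kr}/w_{kt}\). The following sketch, with hypothetical function names and aggregate figures, computes these factors and checks that the adjusted product shares in t match those of the reference period r.

```python
def product_shares(trade_by_product):
    """w_k: share of each product in the total trade (X + M) of the area."""
    total = sum(trade_by_product.values())
    return {k: value / total for k, value in trade_by_product.items()}


def adjustment_factors(trade_t, trade_r):
    """Factor w_kr / w_kt applied to every export and import flow of k in t."""
    w_t = product_shares(trade_t)
    w_r = product_shares(trade_r)
    return {k: w_r[k] / w_t[k] for k in w_t}


# Hypothetical area-wide trade X_.k + M_.k in the reference period r and in period t
trade_r = {"wheat": 40.0, "cars": 60.0}
trade_t = {"wheat": 50.0, "cars": 50.0}
factors = adjustment_factors(trade_t, trade_r)

# Applying the factors to the aggregate flows of t restores the shares of r
adjusted_t = {k: trade_t[k] * factors[k] for k in trade_t}
print(factors, product_shares(adjusted_t))
```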

Another RCA index is suggested by Leromain and Orefice (2014). This RCA index, which we denote as Z, is based on the variable \(z_{ikt}\), which is a proxy for the Ricardian fundamental productivity level of i with respect to k in t. In line with the micro-founded Ricardian model of Costinot et al. (2012), \(z_{ikt}\) satisfies \(z_{ikt}=\exp {(\delta _{ikt}/\theta )}\) where \(\delta _{ikt}\) is derived from the OLS estimation of \(\ln (x_{ijkt})=\delta _{ijt}+\delta _{ikt}+\delta _{jkt}+\varepsilon _{ijkt}\). According to this equation, the log of \(x_{ijkt}\), which is the trade flow of k from i to another country j in t, is decomposed additively into an exporter-importer fixed effect (\(\delta _{ijt}\)), an exporter-product fixed effect (\(\delta _{ikt}\)) and an importer-product fixed effect (\(\delta _{jkt}\)); \(\varepsilon _{ijkt}\) is the residual term. The parameter \(\theta\) captures productivity dispersion across varieties of the same k. \(z_{ikt}\) is divided by i’s average productivity across products, and the same ratio is calculated at the level of J. These two ratios are analogous to \(X_{ikt}/X_{i.t}\) and \(X_{.kt}/X_{..t}\) in the B index. Instead of dividing the former ratio by the latter ratio, we can also compute the difference, similar to the AB index with respect to the B index. AZ denotes this additive version of Z. The AZ index may show more interesting empirical properties than the Z index or other RCA indexes. Consequently, it is useful to include the AZ index among the set of RCA indexes that should be evaluated for a given space \(J\times K \times T\).

Lastly, the RCA index from Yu et al. (2009) and the RCA indexes in terms of Contribution to the Trade Balance (CTB) should be mentioned. Yu et al. (2009) assume that i would have neither comparative advantages nor comparative disadvantages if the total exports of i are distributed according to the share of each product in the total exports of all countries in J. In this respect, \((X_{.kt}/X_{..t})X_{i.t}\) is the comparative-advantage neutral value of exports. Actual exports higher (lower) than this neutral value should reveal a comparative advantage (disadvantage) for i with respect to k in t. The RCA index from Yu et al. (2009), which we denote as NY, consists of calculating the difference between actual exports and the comparative-advantage neutral value of exports; this difference is then normalized by the total exports of J.
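The NY index can be sketched as follows, using a hypothetical dictionary layout and toy data. The example also checks numerically that the values sum to zero across the products of a country and across the countries for a product, which follows from the definition of the neutral value.

```python
def ny_index(exports):
    """(X_ikt - (X_.kt / X_..t) * X_i.t) / X_..t for one period.

    `exports[i][k]` is X_ikt, assumed rectangular for this sketch.
    """
    countries = list(exports)
    products = list(next(iter(exports.values())))
    x_i = {i: sum(exports[i].values()) for i in countries}
    x_k = {k: sum(exports[i][k] for i in countries) for k in products}
    x_all = sum(x_i.values())
    return {(i, k): (exports[i][k] - x_k[k] / x_all * x_i[i]) / x_all
            for i in countries for k in products}


# Hypothetical two-country, two-product trade area
toy_exports = {"A": {"wheat": 80.0, "cars": 20.0},
               "B": {"wheat": 20.0, "cars": 80.0}}
ny = ny_index(toy_exports)
print(ny[("A", "wheat")])

# Sums over products (for A) and over countries (for wheat)
sum_over_products = ny[("A", "wheat")] + ny[("A", "cars")]
sum_over_countries = ny[("A", "wheat")] + ny[("B", "wheat")]
print(sum_over_products, sum_over_countries)
```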

The CTB indexes are similar to the NY index. The first difference is that CTB indexes do not calculate a comparative-advantage neutral value of exports but a comparative-advantage neutral value of the trade balance associated with (ikt). In addition, this value is calculated by weighting the trade balance of i (instead of the sole exports of i) by \(w_{kt}\), which has been defined previously as the share of k in the total trade of J and is calculated as \((X_{.kt}+M_{.kt})/(X_{..t}+M_{..t})\). Consequently \(w_{kt}(X_{i.t}-M_{i.t})\) is the comparative-advantage neutral value of \(X_{ikt}-M_{ikt}\). The difference between the actual trade balance and its neutral value can be normalized either by the total trade of J (\(\text {CTB}^{W}\)) or by the GDP of i (\(\text {CTB}^{Y}\)). In addition, every trade flow can be adjusted so that the share of k in total trade is the same in each period and corresponds to the share of a given reference period r (as explained before in relation to the class of RCA indexes arising from RC). The corresponding CTB index, also normalized by GDP, is denoted as \(\text {CTB}^{Y,r}\).
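A minimal sketch of \(\text {CTB}^{W}\) follows. It assumes that "the total trade of J" means \(X_{..t}+M_{..t}\), which is our reading rather than something stated explicitly above; the function name and the figures are hypothetical.

```python
def ctb_w(x_ik, m_ik, x_i, m_i, trade_k, trade_all):
    """Actual minus neutral trade balance, normalized by the area's total trade.

    trade_k   = X_.kt + M_.kt   (trade of product k within J)
    trade_all = X_..t + M_..t   (assumed meaning of "total trade of J")
    """
    w_k = trade_k / trade_all        # share of k in the total trade of J
    neutral = w_k * (x_i - m_i)      # comparative-advantage neutral balance
    actual = x_ik - m_ik
    return (actual - neutral) / trade_all


# Hypothetical flows: surplus of 20 on k against a neutral surplus of 4
value = ctb_w(30.0, 10.0, 120.0, 80.0, 100.0, 1000.0)
print(value)
```

Replacing `trade_all` in the final division by the GDP of i would give the \(\text {CTB}^{Y}\) normalization described above.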

As in the case of RC-inspired indexes, a whole class of CTB indexes can be conceptualized starting from the difference between the actual trade balance and the theoretical trade balance. First, once this difference is computed on the basis of adjusted trade flows, it can be normalized by total trade calculated from the adjusted trade flows instead of a country’s GDP. The corresponding CTB index is denoted as \(\text {CTB}^{W,r}\). Second, the standard CTB index with normalization by (unadjusted) total trade can be weighted by the coefficient \(m^{-Y_{it}/\bar{Y}_{t}+1}\) or \(m^{-y_{it}/\bar{y}_{t}+1}\). The corresponding CTB indexes are denoted as \(\text {CTB}^{W,Y}\) and \(\text {CTB}^{W,y}\), respectively. The combination of normalization by total trade and the GDP-related coefficient can be further augmented by the use of adjusted trade flows (Stellian & Danna-Buitrago, 2022). The corresponding CTB indexes are denoted as \(\text {CTB}^{W,Y,r}\) and \(\text {CTB}^{W,y,r}\), respectively.Footnote 5

3 Time stationarity

Comparative advantages arise from structural factors like technology, factor endowments and institutions, so comparative advantages tend to change only in the medium/long term. However, because an RCA index is calculated on the basis of trade data, short-term fluctuations in trade flows might induce variations in an RCA index even though comparative advantages might not change. Consequently, an RCA index that varies substantially over time might not adequately reveal comparative advantages. Ultimately, an RCA index is preferable to another if its time stationarity is higher, as an RCA index with higher time stationarity is better able to reveal comparative advantages despite the short-term fluctuations inherent to trade flows.

Our review of the literature reveals that time stationarity has been evaluated in the following ways:

  • Unconditional standard deviation through time of the values of an RCA index calculated across products and possibly countries (Leromain & Orefice, 2014; Danna-Buitrago, 2017; Stellian & Danna-Buitrago, 2019).

  • The OLS estimation of \(\beta _i\) in \(\text {RCA}_{ikt_1}=\alpha _i+\beta _i\text {RCA}_{ikt_0}+\varepsilon _{ik}\) (Dalum et al., 1998; Laursen, 2015; Danna-Buitrago & Stellian, 2022). According to this equation, time stationarity between initial period \(t_0\) and final period \(t_1\) is higher if \(\beta _i\) is closer to 1 and \(\alpha _i\) is closer to 0. This measure of time stationarity concerns the values of an RCA index calculated across products for a single country (hence the index i in \(\beta _{i}\) and \(\alpha _i\)) on the basis of \(\# K\) observations, each one corresponding to a product k.

  • The OLS estimation of \(\beta\) in \(\text {RCA}_{ikt_1}=\alpha +\beta \text {RCA}_{ikt_0}+\gamma _i+\varepsilon _{ik}\) (Stellian & Danna-Buitrago, 2019, 2022). According to this equation, time stationarity between \(t_0\) and \(t_1\) is higher if \(\beta\) is closer to 1 and \(\alpha\) is closer to 0. This method of measuring time stationarity concerns the values of an RCA index calculated across products and countries instead of a single country (hence \(\# J\times \# K\) observations, each one corresponding to a country-product pair (ik)). Country heterogeneity is captured by the fixed effect \(\gamma _i\).

  • \(\chi ^2\)-tests to determine whether the values of an RCA index calculated across products for a single country in \(t_0\) and the same kind of set in \(t_1\) are significantly different (Hoen & Oosterhaven, 2006). These tests can be extended to the values of an RCA index calculated across products and countries instead of considering a single country.

  • The Harris-Tzavalis unit-root test based on the OLS estimation of \(\text {RCA}_{ikt}=\rho \text {RCA}_{ikt-1}+\gamma _{ik}+\varepsilon _{ikt}\) (Stellian & Danna-Buitrago, 2019), which has the same interpretation as the unit-root test presented in this paper (see below).

  • The OLS estimation of \(\beta _{ik}\) in \(\text {RCA}_{ikt}=\alpha _{ik}+\beta _{ik}t+\varepsilon _{ikt}\) (Yu et al., 2010). According to this equation, time stationarity is higher if \(\beta _{ik}\) is closer to zero. This method of measuring time stationarity concerns the values of an RCA index calculated for a single product and a single country but can be extended to multiple products and countries, possibly with fixed/random effects associated with countries and/or products.

  • Calculation of transition probabilities that measure the likelihood that the value of an RCA index belonging to a given interval in \(t_0\) remains in the same interval in \(t_1\). If these transition probabilities are closer to 1, then time stationarity is higher (De Benedictis & Tamberi, 2004).
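To make the two-period regression approach concrete, the sketch below fits \(\text {RCA}_{ikt_1}=\alpha _i+\beta _i\text {RCA}_{ikt_0}+\varepsilon _{ik}\) by ordinary least squares for one country. The function name and the four data points are hypothetical and only serve to show how \(\alpha _i\) and \(\beta _i\) would be read.

```python
def ols_line(x, y):
    """Intercept and slope of the least-squares line y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    beta = cov_xy / var_x
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical RCA values of one country across four products
rca_t0 = [0.5, 1.0, 1.5, 2.0]   # initial period
rca_t1 = [0.6, 1.0, 1.4, 1.8]   # final period
alpha, beta = ols_line(rca_t0, rca_t1)
print(alpha, beta)
```

A slope below 1 combined with a positive intercept, as in this toy case, indicates that high values shrink and low values rise between the two periods.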

We propose a more robust method in which the starting point is the panel data set arising from the collection of RCA indexes calculated for every element of \(J\times K\times T\). In this set, each individual corresponds to a country-product pair \((i,k)\in J \times K\); hence the cross-sectional dimension comprises \(\#J\times \#K\) individuals. The evaluation of the time stationarity of an RCA index is based on the Arellano-Bond GMM estimation of the following AR(1) process:

$$\begin{aligned} \text {RCA}_{ikt}=\beta t +\rho \text {RCA}_{ikt-1}+\gamma _{ik}+\varepsilon _{ikt} \end{aligned}$$
(1)

In this equation, \(\gamma _{ik}\) is a specific intercept for each country-product pair, which is useful to control for heterogeneity in the estimation. \(\varepsilon _{ikt}\) is the residual term. Using Eq. 1, we can implement the unit-root test according to which an RCA index shows time stationarity if the null hypothesis \(\rho =1\) is rejected and thus the alternative hypothesis \(|\rho |<1\) is accepted. Indeed, if \(|\rho |<1\), then the value of the RCA index associated with each country-product pair tends to fluctuate around the mean calculated as \((\beta t+\gamma _{ik})/(1-\rho )\). We can reject an RCA index as a suitable measure of comparative advantages for \(J\times K\times T\) if \(\rho =1\) cannot be rejected. Thereafter, for the RCA indexes for which \(|\rho |<1\) is accepted, it is possible to apply three measures of time stationarity:

  1. The distance between their respective estimations of \(\beta\) and 0. A smaller distance implies higher time stationarity.

  2. The distance between their respective estimations of \(\rho\) and 0. By construction, a smaller distance implies higher time stationarity.

  3. The conditional standard deviation of an RCA index, calculated as \(\sigma _{\varepsilon }/\sqrt{1-\rho ^2}\), where \(\sigma _{\varepsilon }\) denotes the standard deviation of the residuals. A lower conditional standard deviation implies lower deviations of \(\text {RCA}_{ikt}\) around the mean, giving rise to higher time stationarity.
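Once Eq. 1 has been estimated, the three measures reduce to simple arithmetic on the estimates. The sketch below does not perform the Arellano-Bond estimation itself; it takes \(\beta\), \(\rho\) and \(\sigma _{\varepsilon }\) as given, and the numerical values are hypothetical rather than taken from Table 4.

```python
import math


def stationarity_measures(beta, rho, sigma_eps):
    """Distances of beta and rho from zero, and sigma_eps / sqrt(1 - rho^2).

    Only meaningful when the unit-root hypothesis has been rejected,
    i.e. when |rho| < 1.
    """
    if abs(rho) >= 1.0:
        raise ValueError("measures are defined only for |rho| < 1")
    return {
        "beta_distance": abs(beta),
        "rho_distance": abs(rho),
        "conditional_sd": sigma_eps / math.sqrt(1.0 - rho ** 2),
    }


# Hypothetical estimates for one RCA index
measures = stationarity_measures(beta=0.001, rho=0.6, sigma_eps=0.4)
print(measures)
```

For a given residual standard deviation, the third measure grows without bound as \(\rho\) approaches 1 in absolute value, which is why it is only applied after the unit-root test.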

Equation 1 has four valuable features compared with already available tools. First, instead of an unconditional standard deviation, Eq. 1 rests the calculation of standard deviation on an AR(1) process. Second, the GMM estimation according to the Arellano-Bond method is more robust than the OLS estimation, especially for “large N - small T” panel data sets like ours. GMM estimation provides consistent estimators in the case of panel models with the presence of endogeneity or lagged dependent variables, whereas OLS estimation does not. Under the usual moment condition in which instrumental variables and errors are orthogonal, the two-step GMM provides the most efficient estimator. Even if regressors are strongly exogenous, two-step GMM is more efficient than pooled OLS in the case that lagged or squared independent variables are added to the estimation (Cameron & Trivedi, 2005). Third, Eq. 1 takes into account all periods rather than only the initial period \((t_0)\) and the final period \((t_1)\). Fourth, Eq. 1 comprises the linear time trend as suggested by Yu et al. (2010).

Remark 1

The GMM estimation by Arellano and Bond (1991) assumes that errors are not serially correlated. These authors devised an autocorrelation test using the residuals of the estimation. They also recommend applying a Sargan test for over-identifying restrictions, which is analogous to the test proposed by Hansen (1991) within a general GMM regression. However, the autocorrelation test is more important than the Sargan test since the absence of serial correlation is an assumption of the GMM estimation. By contrast, the test for over-identifying restrictions is only an early warning of possible misspecification problems in the estimation.

We use a few options to guarantee a more reliable estimation of Eq. 1. First, the variance-covariance matrix is computed to be robust to the presence of heteroskedasticity. Second, the GMM estimation is optimal since it uses two steps to compute the coefficients. Third, we restrict the number of lags to be used as instrumental variables to improve the asymptotic behavior of the estimation.

Example 1

Equation 1 is estimated for the \(\text {CTB}^{W,y}\), NB, Z and \(\text {RC}^{Y,r}\) indexes according to the 259 items in the 3-digit Standard International Trade Classification (SITC) for each year from 1995 to 2020. The trade area comprises Germany and its main trading partners,Footnote 6 of which there are 34. Hence, the panel data set comprises 26 time periods and \(35\times 259=9065\) country-product pairs for each RCA index. Estimates are reported in Table 4. For each RCA index, the 95% confidence interval of \(\rho\) does not include 1, allowing us to reject the null hypothesis of unit root (\(\rho =1\)). Consequently, all RCA indexes can be considered stable over time, and the next step is to determine the extent of time stationarity for each RCA index. According to Table 4, \(\text {CTB}^{W,y}\) is the most stable RCA index over time from the vantage point of \(\beta\) because \(\text {CTB}^{W,y}\) minimizes the distance of \(\beta\) from zero. However, the p-value associated with the estimate of \(\beta\) with NB is greater than 10%, and hence we cannot reject \(\beta =0\) with NB. In addition, from the vantage point of \(\rho\), Z is the most stable RCA index over time because Z minimizes the distance of \(\rho\) from zero. Finally, \(\text {CTB}^{W,y}\) minimizes the conditional standard deviation and therefore would be the most stable RCA index over time from that third vantage point.

Table 4 Estimation of Eq. 1: An illustration

Table 5 ranks the four RCA indexes according to each criterion and then computes the mean rank. Rank 1 is attributed to both \(\text {CTB}^{W,y}\) and NB for the first criterion because the corresponding estimates of \(\beta\) are roughly equal and very close to zero. Overall, \(\text {CTB}^{W,y}\) achieves the lowest mean rank across the three criteria. In this example, \(\text {CTB}^{W,y}\) is therefore the most stable RCA index over time.

Table 5 Ranking of RCA indexes in Table 4 according to time stationarity: an illustration

4 Shape

The distribution of an RCA index across countries and products should be as symmetric as possible in a given period. If the distribution is symmetric, the values that are greater than the mean offset the values that are smaller than the mean. Assuming that the mean is close to the neutral value of the RCA index under consideration, symmetry implies that the values that reveal comparative advantages tend to counterbalance the values that reveal comparative disadvantages. This is consistent with the relative nature of comparative advantages: a country that has comparative advantages for a product has comparative disadvantages for at least one other product, and at least one other country has comparative disadvantages for the former product. Ultimately, an RCA index is preferable to others if its asymmetry is lower. In addition, the distribution of an RCA index should contain few outliers and therefore have thin tails. Such a distribution is necessary to reflect the fact that, generally speaking, a country has high comparative advantages and disadvantages for only a few products. Consequently, fat tails should be avoided because they imply that outliers are more frequent than they should be.

Leromain and Orefice (2014) compare the respective asymmetries of the B and Z indexes based on skewness and mean-minus-median. Both statistics provide evidence of a right-skewed distribution if positive, or a left-skewed distribution if negative. Asymmetry is smaller if the respective distances of skewness and mean-minus-median from zero are smaller. However, it is difficult to compare the mean-minus-median of heterogeneous RCA indexes. The comparison is possible only if two RCA indexes have the same scale. For instance, the neutral value is 1 for both the B and Z indexes, and both indexes reveal comparative advantages within the range \((1,+\infty )\) and comparative disadvantages within the range [0, 1). Consequently, if the B index has a higher mean-minus-median than the Z index, it is possible to say that the former is less symmetric. Nonetheless, if a third RCA index enters the comparison and has another scale (the AB index for example), mean-minus-median should be interpreted with caution.Footnote 7

Ultimately, to compare the asymmetry of heterogeneous RCA indexes, we replace the mean-minus-median with the coefficient known as Pearson’s second coefficient of skewness, which is calculated as \(3(\text {mean}-\text {median})/\sigma\). Similar to skewness, the normalization by standard deviation gives a measure in a dimensionless unit and therefore enables comparisons between RCA indexes with different scales.Footnote 8 This coefficient provides evidence of a right-skewed distribution if positive, and a left-skewed distribution if negative.
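The two asymmetry statistics can be computed with the Python standard library. The sketch below uses the population standard deviation for \(\sigma\), which is an assumption on our part, and a hypothetical right-skewed sample. It also shows why mean-minus-median depends on scale while Pearson's coefficient does not.

```python
import statistics


def moment_skewness(values):
    """Third central moment divided by the 1.5th power of the second."""
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    m3 = sum((v - mean) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5


def pearson_second_skewness(values):
    """3 * (mean - median) / sigma, with sigma the population standard deviation."""
    gap = statistics.fmean(values) - statistics.median(values)
    return 3.0 * gap / statistics.pstdev(values)


# Hypothetical right-skewed sample and the same sample on another scale
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
rescaled = [100.0 * v for v in sample]

gap_sample = statistics.fmean(sample) - statistics.median(sample)
gap_rescaled = statistics.fmean(rescaled) - statistics.median(rescaled)
print(gap_sample, gap_rescaled)                      # scale-dependent
print(pearson_second_skewness(sample),
      pearson_second_skewness(rescaled))             # dimensionless
print(moment_skewness(sample))
```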

Example 2

Assume that the B, AB, SB and NY indexes are computed for the year 2020 according to the 65 items of the 2-digit SITC for the 30-country trade area comprising Japan and its main trading partners. The sets of values of each RCA index are represented in the scatter plots in Fig. 1. Each set comprises 1950 values, one per item and per country. The skewness and Pearson’s skewness associated with each set are reported. The numbers in parentheses are the respective ranks of the RCA indexes for each coefficient. For example, B ranks fourth for each coefficient because B gives rise to the longest distance between each coefficient and zero. Consequently, the B index is the least symmetric RCA index in this example. SB ranks first in terms of skewness because SB minimizes the distance of skewness from zero (0.390), whereas NY ranks first in terms of Pearson’s skewness because NY minimizes the distance of Pearson’s skewness from zero. If we compute the mean rank, NY is the most symmetric RCA index in this example (mean rank equal to 1.5), followed by SB (2), AB (2.5) and B (4).
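The mean-rank arithmetic of Example 2 can be reproduced in a few lines. The individual ranks below are not copied from Fig. 1; they are inferred from the ranks and mean ranks stated in the example (B fourth on both criteria, SB first for skewness, NY first for Pearson's skewness), which pin down the remaining ranks.

```python
# Ranks inferred from the text of Example 2 (1 = closest to zero)
ranks = {
    "B":  {"skewness": 4, "pearson": 4},
    "AB": {"skewness": 3, "pearson": 2},
    "SB": {"skewness": 1, "pearson": 3},
    "NY": {"skewness": 2, "pearson": 1},
}

# Mean rank across the two asymmetry criteria
mean_rank = {name: sum(r.values()) / len(r) for name, r in ranks.items()}

# Indexes ordered from most to least symmetric
ordering = sorted(mean_rank, key=mean_rank.get)
print(mean_rank)
print(ordering)
```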

Fig. 1

Scatter plots and asymmetry statistics of B, AB, SB and NY; trade area comprising Japan and its main trading partners, 2020. Source: Authors’ calculations. Note: All points are assigned a random y-value to allow a better visual representation

Kurtosis is the conventional statistic for measuring tail fatness. According to the standard interpretation, a higher kurtosis implies fatter tails. The reference point is 3, which is the kurtosis of a normal distribution. A kurtosis higher than 3 implies fatter tails than if the distribution were normal. However, Fig. 2 shows that this might not be true, similar to Westfall (2014) who gives several counterexamples showing that kurtosis cannot always be considered a reliable measure of the concentration of a distribution about its mean (“peakedness”). Figure 2 presents the distributions of the values computed for the \(\text {CTB}^{Y}\) and RC indexes for 2020 according to the 65 items of the 2-digit SITC for the 32-country trade area comprising Ghana and its main trading partners. We compare the distributions of these RCA indexes with normal distributions with the same mean and standard deviation. On the one hand, the left graph shows that the values of the \(\text {CTB}^{Y}\) index tend to remain below the probability density function of the corresponding normal distribution, and hence the distribution of the \(\text {CTB}^{Y}\) index shows thinner tails than in the normal case. On the other hand, the right graph shows a fat right tail for the RC index, as values higher than 1 are more frequent than if the distribution were normal. Nonetheless, the kurtosis associated with the \(\text {CTB}^{Y}\) index is higher than the kurtosis associated with the RC index, namely 280.09 versus 3.83. According to the conventional interpretation of kurtosis, both distributions would have fat tails, but this is not the case for the \(\text {CTB}^{Y}\) index, even if its distribution has higher kurtosis than that of the distribution of the RC index.

Fig. 2

Distributions of the \(\text {CTB}^{Y}\) index (left) and the RC index (right) for the trade area comprising Ghana and its main trading partners, 2020. Source: Authors’ calculations. Note: The area of each bar is proportional to the number of data points in the corresponding bin. Moreover, the histogram is rescaled so that the total area of the bars is equal to 1. The red curve represents the probability density function of a normal distribution whose mean and standard deviation are taken from the distribution of the corresponding RCA index

To avoid possible misinterpretations of kurtosis, we suggest another measure of tail fatness. Let \(\mu _{it}\) and \(\sigma _{it}\) be the mean value and standard deviation of an RCA index calculated for the \(\#K\) values associated with a given country i and period t (one value per k), and let \(\text {rca}_{it}^{p}=[\mu _{it}-p\cdot \sigma _{it},\mu _{it}+p\cdot \sigma _{it} ]\) be the range given by \(p\in \mathbb {N}_{*}\) standard deviation(s) of the mean. We define an outlier of order p as a value of an RCA index that does not belong to that range. A higher number of outliers implies fatter tails. Let \(\text {out}_{it}^p\) be the number of outliers of order p with respect to (it):

$$\begin{aligned} \text {out}_{it}^{p}=\#\left\{ k: \text {RCA}_{ikt} \notin \text {rca}_{it}^{p}\right\} \end{aligned}$$
(2)

Our measure of tail fatness is the average of \(\text {out}_{it}^{1}\) and \(\text {out}_{it}^{2}\) throughout T for each country in J. These averages are written as \(\overline{\text {out}}_{i}^{1}\) and \(\overline{\text {out}}_{i}^{2}\):

$$\begin{aligned} \overline{\text {out}}_{i}^{p}=\frac{1}{\#T}\sum _{t\in T}\text {out}_{it}^{p}, \quad p = 1,2 \end{aligned}$$
(3)

Smaller values of \(\overline{\text {out}}_{i}^{1}\) and \(\overline{\text {out}}_{i}^{2}\) imply thinner tails. \(\overline{\text {out}}_{i}^{p}\) is an unambiguous measure of tail fatness. Its consistency arises from the fact that it is calculated on the basis of the standard deviation of each distribution. Lastly, it is possible to compare two RCA indexes on the basis of their respective values of \(\overline{\text {out}}_{i}^{p}\).
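The outlier-based measure of Eqs. (2) and (3) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the dictionary layout (one list of RCA values per period for a given country) and the numbers are hypothetical, and the population standard deviation is assumed.

```python
from statistics import mean, pstdev

def count_outliers(values, p):
    """Eq. (2): number of values outside [mu - p*sigma, mu + p*sigma]."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    low, high = mu - p * sigma, mu + p * sigma
    return sum(1 for v in values if v < low or v > high)

def average_outliers(values_by_period, p):
    """Eq. (3): average of the outlier counts across periods for one country."""
    counts = [count_outliers(vals, p) for vals in values_by_period.values()]
    return sum(counts) / len(counts)

# Hypothetical RCA values for one country, one list per period.
values_by_period = {
    2019: [1, 1, 1, 1, 1, 1, 1, 1, 1, 11],
    2020: [-8, 2, 2, 2, 2, 2, 2, 2, 2, 12],
}
print(average_outliers(values_by_period, 1))  # 1.5
print(average_outliers(values_by_period, 2))  # 1.5
```

In this toy case the first period has one outlier and the second has two, at both orders, hence an average of 1.5.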

Example 3

Figure 3 represents the values of the AZ and RC indexes in 2020 according to the 65 items of the 2-digit SITC for Mexico in the 13-country trade area comprising this country and its main trading partners. The graph shows the location of the mean, the interval \([\mu -\sigma ,\mu +\sigma ]\) and the interval \([\mu -2\sigma ,\mu +2\sigma ]\) for both RCA indexes. Therefore, it is possible to observe that, for example, one value of AZ is below \(\mu -2\sigma\) and three other values are above \(\mu +2\sigma\). Consequently, AZ has four second-order outliers. According to the same logic, both RCA indexes present nineteen first-order outliers, whereas RC has fewer second-order outliers than AZ (three instead of four). Ultimately, RC performs better than AZ in terms of tail fatness in this example because RC gives rise to fewer second-order outliers than AZ.

Fig. 3

Measure of outliers: Mexico in the trade area comprising this country and its main trading partners, 2020. Source: Authors’ calculations

5 Ordinal ranking bias

Assume that country i is ranked as the \(x^{\text {th}}\) country according to the B index for some product k (in time period t). There is an ordinal ranking bias if for another product \(h \ne k\) the B index is higher but i has a lower rank than x, or is lower but i has a higher rank than x. Ordinal ranking bias should be avoided as much as possible to avoid misleading interpretations of RCA indexes.

Leromain and Orefice (2014) measure ordinal ranking bias by means of Spearman’s rank correlation coefficients. Assume that in a three-country trade area product k has the fourth highest RCA index among the \(\# K\) indexes (one per product) associated with i. In addition, the RCA index associated with i for product k is the highest RCA index compared with the RCA indexes associated with the other countries. We thus have a first pair of integers (4, 1) where the first integer is the ranking of RCA indexes in the corresponding country’s comparative advantage distribution and the second integer is the ranking of RCA indexes across countries. Assume that for the two other countries the respective pairs are (14, 2) and (7, 3). Spearman’s rank correlation coefficient measures the extent to which two vectors of discrete data place the distinct items in the same order. In the case of RCA indexes, the coefficient captures the magnitude of match/mismatch between countries’ respective distributions of RCA indexes and the ranking of countries according to RCA indexes. In our example, the Spearman’s rank correlation coefficient is calculated for the rank vectors (4, 14, 7) and (1, 2, 3) and equals 0.292 when computed as Pearson’s correlation coefficient (Kvam & Vidakovic, 2007). A coefficient closer to 1 implies a higher consistency between product classification and country classification, which leads to a lower ordinal ranking bias.
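The value of 0.292 in the three-country example can be checked with a short sketch. Following the wording of the text, Pearson's correlation is applied directly to the two rank vectors; the function name is hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

intra_country_ranks = [4, 14, 7]  # rank of product k within each country
inter_country_ranks = [1, 2, 3]   # rank of each country for product k
print(round(pearson(intra_country_ranks, inter_country_ranks), 3))  # 0.292
```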

Leromain and Orefice (2014) calculate Spearman’s rank correlation coefficient for each product under consideration (and for a given period); hence the number of these coefficients can be large. To obtain a synthetic measure, we calculate a unique Spearman’s rank correlation coefficient across all products (and periods). Formally, for each (ikt), the computation of Spearman’s rank correlation coefficient requires the calculation of i’s rank relative to the other countries with respect to k in t. This rank is written as \(\hat{\mathcal {K}}_{ikt}\in \{1,2,\ldots ,\#J\}\):

$$\begin{aligned} \hat{\mathcal {K}}_{ikt}=\#\{x \in J: \text {RCA}_{xkt}>\text {RCA}_{ikt}\}+1 \end{aligned}$$
(4)

i is ranked as the \(x^{\text {th}}\) country if the value of the RCA index is higher for \(x-1\) countries in relation to (kt). Similarly, the computation of Spearman’s rank correlation coefficient requires the calculation of k’s rank relative to the other products with respect to i in t. This rank is written as \(\check{\mathcal {K}}_{ikt}\in \{1,2,\ldots ,\#K\}\):

$$\begin{aligned} \check{\mathcal {K}}_{ikt}=\#\{x \in K: \text {RCA}_{ixt}>\text {RCA}_{ikt}\}+1 \end{aligned}$$
(5)

k is ranked as the \(x^{\text {th}}\) product if the value of the RCA index is higher for \(x-1\) products in relation to (it). If two countries/products share the same value of the RCA index, they have the same rank. For example, for 4 countries, assume that \(\text {RCA}_{i_1kt}=1.2\), \(\text {RCA}_{i_2kt}=\text {RCA}_{i_3kt}=1\) and \(\text {RCA}_{i_4kt}=0\). \(i_1\) is ranked first with respect to k in t, both \(i_2\) and \(i_3\) are ranked second, and \(i_4\) is ranked fourth. No country is ranked third (the RCA index values of three countries are higher than the value of \(i_4\), and therefore \(i_4\) is ranked fourth). The same logic applies to products.
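The ranking rule of Eqs. (4) and (5), including its treatment of ties, can be sketched in a few lines. The function name is hypothetical; the input reproduces the four-country example above.

```python
def competition_rank(values):
    """Rank of each value as 1 + the number of strictly greater values,
    so tied values share a rank and the following rank is skipped."""
    return [1 + sum(1 for other in values if other > v) for v in values]

# RCA values of countries i1, i2, i3 and i4 for product k in period t.
print(competition_rank([1.2, 1.0, 1.0, 0.0]))  # [1, 2, 2, 4]
```

The same function yields the intra-country ranking when it is applied to the values of one country across products.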

Ultimately, the Spearman’s rank correlation coefficient is calculated as Pearson’s correlation between the paired ranks from \(\hat{\mathcal {K}}\) and \(\check{\mathcal {K}}\) over \(\#K\times \#T\) for each country.

Example 4

We compute \(\hat{\mathcal {K}}_{ikt}\) and \(\check{\mathcal {K}}_{ikt}\) for the trade area comprising Mexico and its main trading partners for a single year, 2020. Figure 4 describes each pair \((\hat{\mathcal {K}}_{ikt}, \check{\mathcal {K}}_{ikt})\) with \(i=\text {United States}\) (and \(t=2020\)), namely one of Mexico’s main trading partners, in the case of the B index and the AB index. The inter-country ranking (i.e. \(\hat{\mathcal {K}}_{ikt}\)) is represented on the horizontal axis, and the intra-country ranking (i.e. \(\check{\mathcal {K}}_{ikt}\)) is represented on the vertical axis. Because the Mexico-related trade area comprises 13 countries, the inter-country ranking ranges from 1 to 13; because we use the 65 items in the 2-digit SITC as in the previous example, the intra-country ranking ranges from 1 to 65. Spearman’s rank correlation coefficient is greater for the B index than for the AB index. In this example, B is better able than AB to avoid ordinal ranking bias.

Fig. 4

Spearman’s rank correlation coefficient between intra- and inter-country product rankings, United States in the Mexico-related trade area, 2020. Source: Authors’ calculations

We complement the computation of Spearman’s rank correlation coefficient by the generalization of the measure of ordinal ranking bias by Stellian and Danna-Buitrago (2019). Indeed, Spearman’s rank correlation coefficient might not adequately capture ordinal ranking bias, especially if the number of countries in J is small, as this would imply a few possible ranks for each country and therefore might distort the coefficients. The non-parametric measure suggested by Stellian and Danna-Buitrago (2019) was initially designed for a four-country trade area, namely the Pacific Alliance (Chile, Colombia, Mexico and Peru). Here we provide formulas that enable the application of this measure to any trade area [see also the formalization aspects of ordinal ranking bias in Danna-Buitrago and Stellian (2022) and Stellian and Danna-Buitrago (2022)]. For this purpose, we implement a four-step procedure. First, for each (it), we distribute K into \(\#J\) subsets \(K^{1}_{it}, K^{2}_{it}, \ldots , K^{\#J}_{it}\). If \(k \in K_{it}^{j}\) then k implies that i is ranked as the \(j^{\text {th}}\) country according to the value of its RCA index compared with the values associated with the other countries and the same product. This means that a set of \(j-1\) countries other than i have a higher rank than i. These countries satisfy \(\text {RCA}_{xkt}>\text {RCA}_{ikt}\) for the given k and t, with \(x \ne i\). Ultimately:

$$\begin{aligned} K^{j}_{it}=\{k: \#\{x: \text {RCA}_{xkt}>\text {RCA}_{ikt}\}=j-1\} \end{aligned}$$
(6)

Second, we calculate the mean value of the RCA index that leads i to be ranked as the \(j^{\text {th}}\) country (in t). We denote this mean as \(\text {RCA}_{it}^{j}\):

$$\begin{aligned} \text {RCA}_{it}^{j} = \frac{1}{\# K^{j}_{it}}\sum _{k \in K^{j}_{it}}\text {RCA}_{ikt} \end{aligned}$$
(7)

This second step is the use of a function from \(\mathbb {R}^{\#K^{j}_{it}}\) to \(\mathbb {R}\) such that the set \(\{RCA_{ikt}: k\in K^{j}_{it}\}\) is associated with a unique value that is representative of the values in that set. We choose the function that computes the mean, but other descriptive statistics such as the median or percentiles, among others, are potential candidates for \(\text {RCA}_{it}^{j}\). Further research should explore this subject.

For the third step, we introduce the following definition:

Definition 1

An RCA index implies an ordinal ranking bias if a country i, a time period t, a product k and two ranks \(j_1, j_2\) exist such that \(k\in K_{it}^{j_1}\), \(j_1>j_2\) and \(\text {RCA}_{ikt}>\text {RCA}_{it}^{j_2}\), or \(k\in K_{it}^{j_1}\), \(j_1<j_2\) and \(\text {RCA}_{ikt}<\text {RCA}_{it}^{j_2}\).

Assume that i has rank \(j_1\) with respect to k in t, namely \(\#\{x: \text {RCA}_{xkt}>\text {RCA}_{ikt}\}=j_1-1\). There is an ordinal ranking bias if the value of the RCA index associated with (ikt) is higher than the mean value that leads i to have a higher rank than \(j_1\) in t (i.e. a rank \(j_2<j_1\)). Conversely, there is an ordinal ranking bias if the value of the RCA index associated with (ikt) is lower than the mean value that leads i to have a lower rank than \(j_1\) in t (i.e. a rank \(j_2>j_1\)). Definition 1 implies the following two properties. First, an RCA index is exempt from ordinal ranking bias if \(\text {RCA}_{ikt}<\text {RCA}_{it}^{j_2}\) \(\forall t\), i, \(\# J\ge j_1>j_2\) and \(k\in K_{it}^{j_1}\), and \(\text {RCA}_{ikt}>\text {RCA}_{it}^{j_2}\) \(\forall t\), i, \(j_2>j_1\ge 1\) and \(k\in K_{it}^{j_1}\). Second, it is possible that \(K_{it}^{j_1}=\emptyset\), namely i does not have rank \(j_1\) in t. If so, rank \(j_1\) does not imply ordinal ranking biases in relation to (it). Similarly, rank \(j_1\) does not imply ordinal ranking biases in relation to (it) and \(j_2 \ne j_1\).

The third step consists of counting the number of ordinal ranking biases for every (ikt). We denote this number as \(\text {orb}_{ikt}\in \mathbb {N}\):

$$\begin{aligned} \text {orb}_{ikt}&=\sum _{j=1}^{\# J}\#\left\{ k\in \mathop {\bigcup }\limits _{x=j+1}^{\# J}K_{it}^{x}:\text {RCA}_{ikt}>\text {RCA}_{it}^{j}\right\} \nonumber \\&\quad +\sum _{j=1}^{\# J}\#\left\{ k\in \mathop {\bigcup }\limits _{x=1}^{j-1}K_{it}^{x}:\text {RCA}_{ikt}<\text {RCA}_{it}^{j}\right\} \end{aligned}$$
(8)

As a last step, we compute the average value of \(\text {orb}_{ikt}\) across products and time. This average is written as \(\overline{\text {orb}}_{i}\):

$$\begin{aligned} \overline{\text {orb}}_{i}=\frac{\sum _{k\in K}\sum _{t\in T}\text {orb}_{ikt}}{\#K \times \# T} \end{aligned}$$
(9)

A smaller value of \(\overline{\text {orb}}_{i}\) implies a smaller ordinal ranking bias. For replication purposes, an algorithm to compute \(\overline{\text {orb}}_{i}\) is available in an online repository.
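The four-step procedure can be sketched for a single period as follows. This is a minimal sketch and not the algorithm from the authors' repository. It assumes a hypothetical layout in which `rca` maps each country to a dictionary of product values, and it reads the count per product as the number of ranks whose mean value contradicts the rank of that product, so that summing over products reproduces the two sums in Eq. (8). Country names, product names and values are illustrative only.

```python
from collections import defaultdict

def orb_counts(rca, country):
    """Number of ordinal ranking biases per product for one country, one period."""
    own = rca[country]
    # Step 1 (Eq. 6): rank of `country` among all countries for each product.
    rank = {k: 1 + sum(1 for x in rca if rca[x][k] > v) for k, v in own.items()}
    # Step 2 (Eq. 7): mean RCA value per rank; ranks never obtained are absent.
    groups = defaultdict(list)
    for k, j in rank.items():
        groups[j].append(own[k])
    mean_by_rank = {j: sum(vals) / len(vals) for j, vals in groups.items()}
    # Step 3 (Definition 1): a value above the mean of a better rank, or
    # below the mean of a worse rank, counts as one bias.
    counts = {}
    for k, v in own.items():
        too_high = sum(1 for j, m in mean_by_rank.items() if j < rank[k] and v > m)
        too_low = sum(1 for j, m in mean_by_rank.items() if j > rank[k] and v < m)
        counts[k] = too_high + too_low
    return counts

# Hypothetical three-country, four-product universe.
rca = {
    "A": {"p1": 5.0, "p2": 0.5, "p3": 2.0, "p4": 1.0},
    "B": {"p1": 1.0, "p2": 0.2, "p3": 3.0, "p4": 4.0},
    "C": {"p1": 2.0, "p2": 0.1, "p3": 1.0, "p4": 6.0},
}
counts = orb_counts(rca, "A")
# Step 4 (Eq. 9), restricted to a single period: average across products.
orb_bar = sum(counts.values()) / len(counts)
print(counts, orb_bar)
```

Here country A is ranked first for p2 with a value of 0.5, which lies below the mean values associated with its second and third ranks (2 and 1), hence two biases for p2 and none for the other products.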

Example 5

Figure 5 describes the values taken by the AZ index in 2020 for the countries in the trade area comprising Mexico and its main trading partners for a subset of 2-digit SITC items. Figure 5 shows that Mexico is ranked third for two items, that is, the first and fifty-seventh items (“Live animals” and “Furniture and parts thereof”, respectively). Indeed, for both items, there are only two values greater than the value associated with Mexico. This third rank is associated with AZ = 0.10905e-3 for the first item and 0.34806e-3 for the second item. Consequently, an average value of 0.22855e-3 leads Mexico to rank third. This leads to five ordinal ranking biases in this example:

  • On the left of the graph, items 7, 56 and 64 imply an ordinal ranking bias. Indeed, for these three items, Mexico is ranked first or second, but the corresponding value of AZ is less than 0.22855e-3. Mexico is expected to rank second or first if the value of AZ is greater than the mean value that ranks Mexico third. This occurs for items 6, 11 and 65, which therefore do not imply ordinal ranking bias, but not for items 7, 56, and 64.

  • On the right, items 47 and 61 also imply an ordinal ranking bias. Indeed, for these two items, Mexico is ranked fourth, but the corresponding value of AZ is greater than 0.22855e-3. Mexico is expected to rank fourth if the value of AZ is smaller than the mean value that leads Mexico to rank third. This occurs for items 2, 5, 46 and 50, which therefore do not imply ordinal ranking bias, but not for items 47 and 61.

Fig. 5

Ordinal ranking bias, Mexico and its main trading partners, 2020. Source: Authors’ calculations

6 Ranking of RCA indexes according to all measures

In summary, sections (3), (4) and (5) give rise to nine measures for evaluating the empirical properties of an RCA index for a given universe \(J\times K\times T\). The first empirical property is time stationarity, and its three measures are \(\beta\) (trend), \(\rho\) (persistence) and the conditional standard deviation (\(\sigma\)) associated with the AR(1) process in which the RCA index computed for (ikt) is the dependent variable and the RCA index computed for \((i,k,t-1)\) is the independent variable (see Eq. 1). The process is estimated once for all values of \(J\times K\times T\), so each RCA index is associated with a single triplet \((\beta ,\rho ,\sigma )\). This gives rise to three rankings, each one associated with a measure in that triplet. Provided that a value of \(\beta\) closer to zero implies higher time stationarity, rank x is given to an RCA index in terms of \(\beta\) if \(x-1\) RCA indexes achieve a lower distance of \(\beta\) from 0. RCA indexes can be ranked in the same manner using \(\rho\) and \(\sigma\).

The second empirical property is shape, which can be divided into asymmetry and tail fatness. The two measures of asymmetry are skewness and Pearson’s second coefficient of skewness. These shape statistics are calculated for all countries and products in each period; namely these asymmetry metrics are calculated \(\#T\) times with all values in \(J\times K\). Provided that asymmetry is lower if skewness is closer to zero, RCA indexes can be ranked \(\#T\) times according to the distance of their respective values of skewness from zero, one rank per period, and then the across-period average rank is calculated. Another solution is to rank RCA indexes a single time according to the distance of their respective values of across-period average skewness from zero. The same kind of ranking applies to Pearson’s second coefficient of skewness.

The two measures of tail fatness are the numbers of outliers based on one standard deviation of the mean—\(\overline{\text {out}}_{i}^{1}\)—and two standard deviations of the mean—\(\overline{\text {out}}_{i}^{2}\)—given by Equations (2) and (3). \(\overline{\text {out}}_{i}^{1}\) and \(\overline{\text {out}}_{i}^{2}\) are calculated for each country in J. Provided that a smaller value of \(\overline{\text {out}}_{i}^{p}\) implies lower extreme value frequency, RCA indexes are ranked according to their ability to minimize \(\overline{\text {out}}_{i}^{p}\) for each country before computing the across-country average rank.

Lastly, the two measures of ordinal ranking bias are Spearman’s rank correlation coefficient (see Eqs. 4 and 5) and the non-parametric measure given by Eqs. (6)–(9). The two metrics are calculated for each country in J. Values of Spearman’s rank correlation coefficient closer to 1 imply a lower ordinal ranking bias. RCA indexes are ranked for each country according to the distance of their respective values of this coefficient from 1 before computing the across-country average rank. Likewise, RCA indexes are ranked for each country according to their ability to minimize the non-parametric measure of ordinal ranking bias before computing the across-country average rank.

To obtain more synthetic rankings, a solution is to compute the mean rank of each RCA index for each of the three empirical properties. The underlying assumption is that all measures associated with a given empirical property have the same weight, which is theoretically plausible. In this regard, let \(R_{u,v}\in \mathbb {N}_{*}\) be the rank of RCA index u with respect to measure \(v\in V\). The nonuple \(\langle R_{u,v}\in \mathbb {N}_{*}\rangle _{v\in V}\) is converted into the triplet \((\bar{R}_{u,1},\bar{R}_{u,2},\bar{R}_{u,3})=(\nicefrac {1}{3}\sum _{v\in V_1}R_{u,v},\nicefrac {1}{4}\sum _{v\in V_2}R_{u,v},\nicefrac {1}{2}\sum _{v\in V_3}R_{u,v})\) where:

  • \(V_1=\{\beta , \rho , \sigma \}\);

  • \(V_2=\{\text {Skewness, Pearson's 2}^{\rm nd} \text { skewness}, \overline{\text {out}}^1, \overline{\text {out}}^2 \}\); and

  • \(V_3= \{{\text {Correlation coefficient between }} {\hat{\mathcal{K}}} \text { and } {\check{\mathcal{K}}},\overline{\text {orb}}\}\)

This triplet enables the identification of the most accurate RCA indexes in relation to time stationarity, shape and ordinal ranking bias for a given universe \(J\times K\times T\). Note that an RCA index is not necessarily the best or among the best for each empirical property. Going further, one could calculate a “global” rank as a unique rank arising from the values of \((\bar{R}_{u,1},\bar{R}_{u,2},\bar{R}_{u,3})\). The simplest global rank is the simple arithmetic mean of \((\bar{R}_{u,1},\bar{R}_{u,2},\bar{R}_{u,3})\). Let \(\bar{\bar{R}}^{0}_u\) be this global rank. Therefore:

$$\begin{aligned} \bar{\bar{R}}^{0}_{u}=\frac{1}{3}(\bar{R}_{u,1}+\bar{R}_{u,2}+\bar{R}_{u,3})=\frac{1}{3}\left( \frac{1}{3}\sum\limits_{v\in V_1}R_{u,v}+\frac{1}{4}\sum\limits _{v\in V_2}R_{u,v}+\frac{1}{2}\sum\limits_{v\in V_3}R_{u,v} \right) \end{aligned}$$
(10)

This method of calculation assumes that the mean ranks associated with each of the three empirical properties have the same weight in determining the global rank. However, one empirical property might not be as important as the two others for a specific analysis. In this regard, using the simple arithmetic mean is not necessarily relevant. Consequently, two other solutions are suggested. The first solution is to maintain the ranking of RCA indexes for each empirical property separately instead of converting \((\bar{R}_{u,1},\bar{R}_{u,2},\bar{R}_{u,3})\) into a (single) global rank. The second solution consists of the two following weighted arithmetic means denoted as \(\bar{\bar{R}}_u^1\) and \(\bar{\bar{R}}_u^2\):

$$\begin{aligned} \left\{ \begin{array}{l} \bar{\bar{R}}^{1}_{u}=\dfrac{1}{4}\left( \bar{R}_{u,1}+\bar{R}_{u,2}\right) +\dfrac{1}{2}\bar{R}_{u,3} \\[9pt] \bar{\bar{R}}^{2}_{u}=\dfrac{1}{6}\bar{R}_{u,1}+\dfrac{1}{3}\bar{R}_{u,2}+\dfrac{1}{2}\bar{R}_{u,3} \end{array}\right. \end{aligned}$$
(11)

Regarding \(\bar{\bar{R}}_u^1\), ordinal ranking bias alone determines 50% of the global rank, whereas the other empirical properties—time stationarity and shape—determine equally the other 50% of the global rank. \(\bar{\bar{R}}_u^1\) implies that ordinal ranking bias is twice as important as time stationarity or shape. On the one hand, ordinal ranking bias concerns the informational content of RCA indexes. The measurement of ordinal ranking bias evaluates the ability of an RCA index to provide reliable information about the strengths and weaknesses of a country relative to others in international trade. In this regard, the measurement of ordinal ranking bias evaluates the ability of an RCA index to provide useful guidance for economic policy. On the other hand, time stationarity and shape are empirical properties that are more concerned with how closely an RCA index resembles the ideal of an RCA index that is stable over time, symmetric and without fat tails. Contrary to ordinal ranking bias, time stationarity and shape are not concerned with what an RCA index tells us for economic policy purposes. Consequently, calculating \(\bar{\bar{R}}_u^1\) amounts to a policy-oriented standpoint according to which the informational content of RCA indexes is more important than the compatibility of RCA indexes with the ideal of an RCA index that is stable over time, symmetric and without fat tails.

With respect to \(\bar{\bar{R}}_u^2\), the respective weights of \(\bar{R}_{u,1}\), \(\bar{R}_{u,2}\) and \(\bar{R}_{u,3}\) (1/6, 1/3 and 1/2) imply not only that ordinal ranking bias alone determines 50% of the global rank, as with \(\bar{\bar{R}}_u^1\) and the corresponding policy-oriented standpoint, but also that shape is twice as important as time stationarity. Indeed, trade patterns are changing more rapidly now than before because of factors such as geopolitics-induced trade tensions (e.g. rising prices of various commodities), shifting consumer preferences, new developments in digital technology and the COVID-19 pandemic. Provided that trade patterns arise from comparative advantages, more rapidly changing trade patterns imply that comparative advantages are less sticky over time than before. Ultimately, this shift reduces the importance of time stationarity relative to shape when considering how closely an RCA index resembles the ideal RCA index. Simultaneously, the informational content of RCA indexes still determines 50% of the global rank.
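The conversion of the nine ranks into the triplet of mean ranks and then into the three global ranks of Eqs. (10) and (11) can be sketched as follows. The dictionary keys and the rank values are hypothetical and serve only to illustrate the arithmetic.

```python
V1 = ("beta", "rho", "sigma")                  # time stationarity
V2 = ("skewness", "pearson2", "out1", "out2")  # shape
V3 = ("spearman", "orb")                       # ordinal ranking bias

def mean_ranks(ranks):
    """Triplet of mean ranks, one per empirical property."""
    return tuple(sum(ranks[v] for v in group) / len(group) for group in (V1, V2, V3))

def global_ranks(r1, r2, r3):
    """Global ranks: simple mean (Eq. 10) and the two weighted means (Eq. 11)."""
    return {
        "R0": (r1 + r2 + r3) / 3,
        "R1": (r1 + r2) / 4 + r3 / 2,
        "R2": r1 / 6 + r2 / 3 + r3 / 2,
    }

# Hypothetical ranks of one RCA index for the nine measures.
ranks = {"beta": 3, "rho": 6, "sigma": 3,
         "skewness": 2, "pearson2": 4, "out1": 6, "out2": 8,
         "spearman": 1, "orb": 3}
r1, r2, r3 = mean_ranks(ranks)
print((r1, r2, r3), global_ranks(r1, r2, r3))
```

With mean ranks of 4, 5 and 2, the weighted means pull the global rank toward the ordinal ranking bias component: 3.25 and about 3.33, against about 3.67 for the simple mean.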

Example 6

Thirty-three RCA indexes are ranked in the case of the 44-country trade area comprising Italy and its main trading partners. The corresponding product classification consists of the 26 sectors arising from the EORA database (agriculture, fishing, mining and quarrying, food and beverages, etc.). The RCA indexes are those described in Sect. 2 (see Tables 1, 2 and 3). The RC and CTB indexes based on adjusted trade flows are calculated twice: first using the first available year as the reference year for adjusting trade flows (1990) and then using the last available year (2017).

Table 6 shows the ranking of these RCA indexes (variable \(R_{u,v}\)) after calculating the nine measures evaluating their empirical relevance as illustrated in Examples 1–5. Variables \(\bar{R}_{u,1}\), \(\bar{R}_{u,2}\) and \(\bar{R}_{u,3}\), which are the mean ranks associated with each of the three empirical properties, namely time stationarity (TS), shape (SH) and ordinal ranking bias (ORB), respectively, are also reported. Table 7 shows the five RCA indexes with the highest ranks (i.e. the RCA indexes with the lowest values of \(\bar{R}\) or \(\bar{\bar{R}}\)) according to the four different methods discussed previously. In this example, CTB indexes have a clear advantage regarding time stationarity and shape but not ordinal ranking bias. Z, AZ and the weighted versions of B are better suited to avoiding ordinal ranking bias. When the three empirical properties are combined as simple or weighted arithmetic means, RC is the best option \((\bar{\bar{R}}^{0})\) or the second best option \((\bar{\bar{R}}^{1} \text { and } \bar{\bar{R}}^{2})\) after AZ. The trade-adjusted versions of RC are ranked third and fourth, respectively. There is only one CTB index represented in these three global rankings, and both Z and SB appear a single time with rank five.

Table 6 Ranking of RCA indexes: 44-country trade area comprising Italy and its main trading partners, EORA26 product classification, 1990–2017
Table 7 Top 5 RCA indexes according to different ranking methods, trade area comprising Italy and its main trading partners, EORA26 product classification, 1990–2017

The next section provides a general empirical assessment of RCA indexes on the basis of \(\langle R_{u,v}\in \mathbb {N}_{*}\rangle _{v\in V}\), then \((\bar{R}_{u,1},\bar{R}_{u,2},\bar{R}_{u,3})\) and ultimately the global ranks \(\bar{\bar{R}}_{u}^{0}\), \(\bar{\bar{R}}_{u}^{1}\) and \(\bar{\bar{R}}_{u}^{2}\).

7 An empirical assessment

We compute the RCA indexes in Sect. 2, apply the set of measures in Sects. 3–5 and then rank RCA indexes as discussed in Sect. 6 for different trade areas (i.e. different values of J) and different product classifications (i.e. different values of K); the time span (T) is the longest available depending on the database employed to collect the trade data. The objective of working with a sample of trade areas instead of a single trade area is to obtain more general results about the empirical relevance of RCA indexes. According to the same logic, working with different product classifications provides an opportunity to analyze the influence of product classification on the empirical relevance of RCA indexes.

To build a sample of trade areas, we follow the path suggested by Stellian and Danna-Buitrago (2022). We conceptualize a trade area by gathering a given country (the “reference” country) and its “main trading partners”. Using trade data supplied by COMTRADE and then refined by UNCTADstat, all countries are ranked according to their respective shares in the reference country’s total trade over the 26-year time span of that database, namely 1995–2020. The set of main trading partners is the smallest set of countries with the highest shares that together represent at least 90% of the reference country’s total trade (see Footnote 9). Table 8 illustrates this kind of trade area with Nepal as the reference country (see Footnote 10). Nepal has fifteen main trading partners, so the corresponding trade area comprises sixteen countries. From 1995 to 2020, 61.28% of Nepal’s total trade with the rest of the world consists of exports to India or imports from this country. No other country has a higher share, so India is the first main trading partner of Nepal. China and the United States of America represent 10.19% and 3.02% of Nepal’s total trade, respectively, and thus are the second and third main trading partners of Nepal, respectively. Together, India, China and the US represent 74.49% of Nepal’s total trade. Twelve other countries must be taken into account so that this cumulative share reaches at least 90%: the United Arab Emirates, Singapore, Indonesia, … and South Korea.
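The selection rule for main trading partners can be sketched as follows. The function name, the partner labels and the shares are hypothetical (they are not the Nepal figures), and shares are expressed in whole percentage points to keep the arithmetic exact.

```python
def main_trading_partners(trade_shares, threshold=90):
    """Smallest set of highest-share partners whose cumulative share of the
    reference country's total trade reaches the threshold (in percent)."""
    ranked = sorted(trade_shares.items(), key=lambda item: item[1], reverse=True)
    partners, cumulative = [], 0
    for partner, share in ranked:
        if cumulative >= threshold:
            break
        partners.append(partner)
        cumulative += share
    return partners, cumulative

# Hypothetical shares (in %) of six partners in a reference country's total trade.
shares = {"A": 50, "B": 25, "C": 10, "D": 8, "E": 4, "F": 3}
print(main_trading_partners(shares))  # (['A', 'B', 'C', 'D'], 93)
```

The first three partners account for 85% only, so a fourth is added; the resulting trade area would comprise five countries, including the reference country.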

Table 8 Main trading partners: the example of Nepal

This kind of trade area is conceptualized for 67 reference countries from each main region of the world and with diverse levels of development. Table 9 describes the set of reference countries. The minimum number of countries in a trade area is 13 (Mexico as the reference country with a set of main trading partners comprising 12 countries), the maximum is 48 (Turkey), and the mean is 29.41; 137 countries are represented in at least one trade area either as a reference country or as a main trading partner. Table 10 describes these countries and the number of trade areas they belong to. This number ranges from 1 (e.g. Uzbekistan) to 67 (China, which belongs to all trade areas under consideration). On average, a country belongs to 13.91 trade areas. Ultimately, Fig. 6 shows the average trade share that makes a country the j-th main trading partner of a reference country. Specifically, on average the first main trading partner represents approximately 23% of a reference country’s total trade, the second main trading partner approximately 11%, and so on for the other main trading partners. On average, half of total trade is associated with the first five main trading partners.

Table 9 List of reference countries for the conceptualization of trade areas
Table 10 Number of trade areas each country belongs to in the sample of trade areas
Fig. 6

Trading partner’s average share of reference country’s 1995–2020 total trade. Source: Authors’ calculations

Regarding K, we use three different product classifications. The first two classifications are the 2-digit SITC (65 items) and the 3-digit SITC (259 items), respectively. SITC is employed by UNCTADstat to report trade flows, and we use this classification with two different levels of disaggregation to compute RCA indexes according to this database. These trade flows are “gross” trade flows, that is, exports and imports as reported by customs officials (and adjusted or complemented if necessary). However, gross trade flows may not provide sufficient information about international trade because of the international fragmentation of the production process and the subsequent global value chains. Indeed, “the export of a computer, for example, is in a fragmented world no longer reflecting the production of that computer from start to finish. The country involved might only contribute a (small) fragment of the production process, or in other words, add only a part of the total value added of the final product” (Brakman and Van Marrewijk 2017, p. 62).

For that reason, as the third product classification, we use the “value-added” trade flows provided by the UNCTAD-EORA Global Value Chain (GVC) database (Casella et al., 2019). In contrast to the SITC classifications, working with this database requires a simplified 26-sector classification to enable the computation of RCA indexes for all trade areas under consideration. Indeed, the EORA sector classification differs across reporting countries. Value-added trade flows are reported according to these 26 sectors for various countries (e.g. Algeria, Egypt, Mali or Pakistan, among others), whereas for other countries these flows are reported according to a larger number of more disaggregated sectors, ranging from 49 (Canada) to 512 (UK). Consequently, to combine different countries into a trade area, sectors must be aggregated for countries reporting more than 26 sectors (see Footnote 11).
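
The aggregation step can be pictured with a minimal sketch that sums detailed-sector flows into common sectors through a concordance table. The sector names, the concordance and the values below are hypothetical illustrations; the actual mapping used for the EORA data is not reproduced here.

```python
from collections import defaultdict


def aggregate_to_common_sectors(flows, concordance):
    """Sum detailed-sector flows into the common sectors given by a concordance."""
    aggregated = defaultdict(float)
    for sector, value in flows.items():
        # Each detailed sector is assumed to map to exactly one common sector
        aggregated[concordance[sector]] += value
    return dict(aggregated)


# Hypothetical detailed sectors reported by one country
detailed_flows = {"Wheat": 12.0, "Dairy": 8.0, "Coal mining": 5.0, "Metal ores": 7.5}

# Hypothetical concordance from detailed sectors to common sectors
concordance = {
    "Wheat": "Agriculture",
    "Dairy": "Agriculture",
    "Coal mining": "Mining and Quarrying",
    "Metal ores": "Mining and Quarrying",
}

common_flows = aggregate_to_common_sectors(detailed_flows, concordance)
print(common_flows)
```

A one-to-one mapping is the simplest case; a detailed sector that straddles two common sectors would require splitting weights, which this sketch does not cover.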

To sum up, the use of value-added trade flows (EORA) instead of gross trade flows (SITC) implies a lower level of product/sector disaggregation (26 versus 65 or 259). We work with the aforementioned three classifications to account for different trade-offs between information accuracy (“gross” versus “value-added” trade flows) and the level of disaggregation among products/sectors.

For the RCA indexes that take into account GDP and GDP per capita (B2G, some CTB indexes and some RC indexes), we use the GDP data provided by UNCTADstat. As in Stellian and Danna-Buitrago (2022), the value of m to adjust an RCA index depending on GDP differentials in a trade area is set to 2; that is, GDP differentials at most double the level of comparative advantages or comparative disadvantages. For the period r that serves as the reference period to adjust trade flows for the calculation of some CTB indexes and some RC indexes, we use two different years: the first available year and the last available year. The corresponding values of r are written f (for “first”) and l (for “last”). For the SITC classifications, the first and last available years are 1995 and 2020, respectively. For the EORA-26 classification, the first and last available years are 1990 and 2017, respectively. Using the first available year for the adjustment of trade flows can be labelled a “forward-looking adjustment”, and using the last available year can be labelled a “backward-looking adjustment” (Stellian & Danna-Buitrago, 2017). Finally, regarding the Z and AZ indexes, the value of \(\theta\) is set to 6.534 as suggested by Costinot et al. (2012).
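
To illustrate the role of m, the sketch below evaluates the GDP-related coefficient in the form \(m^{-Y_{it}/\bar{Y}_{t}+1}\) that appears later in this section. It assumes that \(Y_{it}\) is the GDP of country i in year t and that \(\bar{Y}_{t}\) is the mean GDP in the trade area; the function name and the sample values are hypothetical, and the way the coefficient enters each RCA index is not reproduced here.

```python
def gdp_coefficient(gdp, mean_gdp, m=2.0):
    """Evaluate m ** (1 - gdp / mean_gdp), the GDP-related coefficient as written in the text."""
    return m ** (1.0 - gdp / mean_gdp)


# Hypothetical GDP levels relative to a trade-area mean of 1.0
near_zero = gdp_coefficient(0.0, 1.0)    # GDP far below the mean: coefficient reaches m
at_mean = gdp_coefficient(1.0, 1.0)      # GDP equal to the mean: no adjustment
twice_mean = gdp_coefficient(2.0, 1.0)   # GDP at twice the mean: coefficient is halved

print(near_zero, at_mean, twice_mean)
```

Under these assumptions the coefficient never exceeds m, which is consistent with the statement that GDP differentials at most double the measured level when m = 2.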

Note that an RCA index may face numeric exceptions. The online appendix explains how these numeric exceptions can be handled. In the end, a full set of RCA indexes is computed. This set comprises:

  • 33 RCA indexes: the nine RCA indexes in Table 1 (the standard B index and eight other RCA indexes related to the standard B index); nine RC indexes (the six RC indexes in Table 2, three of which are computed with two different values of reference period r); Z and AZ; NY; and 12 CTB indexes (the eight CTB indexes in Table 3, four of which are computed with two different values of reference period r).

  • 67 different values of J. Each value is a trade area centered on the main trading partners of a given country among a representative sample of 67 countries (see Table 9).

  • 2 values of K associated with \(T=\{1995, 1996, \ldots , 2020\}\), namely 2-digit SITC (65 items) and 3-digit SITC (259 items); and a third value of K, EORA-GVC (26 items), associated with \(T=\{1990, 1991, \ldots , 2017\}\).

Given the number of countries in each possible value of J (see Table 9), the number of items in each possible value of K (26, 65 or 259) and the number of years in each possible value of T (26 or 28), our empirical assessment is based on more than 595 million RCA index values.
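
As a rough plausibility check on this order of magnitude, the following sketch multiplies the figures reported in this section. It assumes that every index is computed for every country, item and year, and it approximates the total number of country memberships by 67 times the rounded mean of 29.41 countries per trade area, so the result is an approximation rather than the exact count. All variable names are hypothetical.

```python
n_indexes = 33            # RCA indexes
n_trade_areas = 67        # values of J
mean_countries = 29.41    # rounded mean number of countries per trade area

# (number of items, number of years) for each product classification
classifications = {
    "2-digit SITC": (65, 26),   # 1995-2020
    "3-digit SITC": (259, 26),  # 1995-2020
    "EORA-26 GVC": (26, 28),    # 1990-2017
}

# Item-year cells per country, summed over the three classifications
item_years = sum(items * years for items, years in classifications.values())

# Approximate number of RCA index values
approx_total = n_indexes * n_trade_areas * mean_countries * item_years

print(item_years)
print(f"{approx_total / 1e6:.1f} million")
```

The result is close to 595 million, in line with the figure stated above; the small gap reflects the rounding of the mean.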

Table 11 reports the ten best RCA indexes according to the mean values of \(\bar{R}_{u,1}\), \(\bar{R}_{u,2}\) and \(\bar{R}_{u,3}\), and then \(\bar{\bar{R}}_{u}^{0}\), \(\bar{\bar{R}}_{u}^{1}\) and \(\bar{\bar{R}}_{u}^{2}\). These mean values are computed for the whole sample of 67 trade areas and are reported separately for each of the three product classifications (see Footnote 12), namely the 2-digit SITC, the 3-digit SITC and the EORA-26 GVC. The corresponding calculations are described in Sect. 6 and illustrated in Tables 6 and 7.

Table 11 Top 10 RCA indexes, mean values for the sample of trade areas

For the three product classifications, the class of CTB indexes has a clear advantage regarding time stationarity and shape. A CTB index is always ranked first, and at least six CTB indexes belong to the top 10. There is an even sharper contrast regarding shape: the top 10 is fully occupied by CTB indexes in relation to the 3-digit SITC, and all positions except one in the top 10 are occupied by CTB indexes in relation to the EORA-26 GVC. NB is in the second position in the top 10 regarding time stationarity for each product classification. However, NB does not remain in the top 10 in terms of shape in relation to the 3-digit SITC and EORA-26 GVC and is last in the top 10 in relation to the 2-digit SITC. Consequently, our empirical assessment suggests that CTB indexes are most able to generate empirical measurements that fit the ideal of an RCA index that is stable over time, symmetric and without fat tails.

In addition, as argued by Stellian and Danna-Buitrago (2022), CTB indexes present theoretical robustness, namely robustness before any empirical assessment. Indeed, the formulas defining CTB indexes present the following features independently of the universe \(J\times K \times T\) under consideration:

  1. Compatibility with the Kunimoto-Vollrath principle, according to which the magnitude of the trade balance or another trade-related variable is compared to its theoretical (“expected”) value to reveal comparative advantages;

  2. Calculation on the basis of the overall structure of exports and imports to adapt an RCA index to the relative nature of comparative advantages and both the supply- and demand-side dimensions of comparative advantages;

  3. Symmetry around its neutral point to ensure that comparative advantages and comparative disadvantages are measured in a homogeneous way (see Footnote 13);

  4. Use of GDP data to ensure a more precise measurement of comparative advantages;

  5. In some cases, additivity across products and even countries to make measurements of comparative advantages independent of product/country classifications.

Ultimately, CTB indexes warrant attention because they most closely resemble the ideal RCA index (stable over time, with minimal asymmetry and tail fatness) and because of their theoretical robustness. However, with respect to what an RCA index tells us for economic policy purposes, Z, AZ and the class of RCA indexes based on RC tend to be more reliable RCA indexes because they are more able than CTB indexes to avoid ordinal ranking bias. For each product classification:

  • Z and AZ belong to the top 10, or at least AZ belongs to the top 10 and therefore places regression-based indexes among the RCA indexes most able to avoid ordinal ranking bias. In addition, the first rank is occupied by Z in relation to the 3-digit SITC, while AZ and Z rank first and second, respectively, in relation to the EORA-26 GVC.

  • At least RC, \(\text {RC}^l\), and \(\text {RC}^f\) belong to the top 10. These three indexes even occupy the first three ranks in the case of the 2-digit SITC. Other RCA indexes of the same class are also present among the ten best RCA indexes: \(\text {RC}^{Y}\) and \(\text {RC}^{y}\) in relation to the 2-digit SITC and 3-digit SITC and \(\text {RC}^{Y,l}\) and \(\text {RC}^{y,l}\) in relation to the 3-digit SITC.

No other RCA index appears so frequently in each top 10, which makes the RC index, its further improvements and the regression-based RCA indexes most able to avoid ordinal ranking bias and therefore the most useful RCA indexes for economic policy purposes. In addition, as explained by Danna-Buitrago and Stellian (2022) and Stellian and Danna-Buitrago (2022), the theoretical robustness of these RCA indexes is close to the theoretical robustness of CTB indexes. Ultimately, CTB indexes are less able than regression-based indexes or RC-inspired indexes to avoid ordinal ranking bias, but their informational content should not be ignored because CTB indexes are most compatible with the ideal of an RCA index that is stable over time with the most desirable shape.

When the ranks arising from time stationarity, shape and ordinal ranking bias are combined into a global rank, each product classification gives rise to a specific situation. In the case of the 2-digit SITC, the highest positions are occupied by the same three CTB indexes (\(\text {CTB}^{W}\), \(\text {CTB}^{W,y}\) and \(\text {CTB}^{W,Y}\)) and three RC indexes (RC, \(\text {RC}^{l}\) and \(\text {RC}^{f}\)). In summary, with the 2-digit SITC and under different weights given to time stationarity, shape and ordinal ranking bias, the best RCA indexes are the CTB indexes with normalization by total trade or normalization by both total trade and the GDP-related coefficient \(m^{-Y_{it}/\bar{Y}_{t}+1}\) or \(m^{-y_{it}/\bar{y}_{t}+1}\), and the standard RC index with or without adjustment of trade flows. Interestingly, none of these RCA indexes combines the GDP-related coefficient and the use of adjusted trade flows. This kind of combination confers greater theoretical robustness but is absent from the top 10, which suggests a trade-off between theoretical robustness and empirical accuracy. Finally, the last positions are occupied by AZ and B2, as well as two RCA indexes among SB, NB, \(\text {RC}^{Y}\) and \(\text {RC}^{y}\).

In the case of the 3-digit SITC, compared with the 2-digit SITC, RC, \(\text {RC}^{l}\) and \(\text {RC}^{f}\) remain classified among the first six positions. Nonetheless, these positions are no longer shared exclusively with \(\text {CTB}^{W}\), \(\text {CTB}^{W,y}\) and \(\text {CTB}^{W,Y}\). Instead, Z is now classified as fifth regarding \(\bar{\bar{R}}_u^0\) and \(\bar{\bar{R}}_u^1\) and fourth in terms of \(\bar{\bar{R}}_u^2\); B2G occupies the fifth position when the global rank is calculated as \(\bar{\bar{R}}_u^2\). \(\text {CTB}^{W}\) still occupies the best position with respect to \(\bar{\bar{R}}_u^0\) and \(\bar{\bar{R}}_u^1\) but falls to sixth regarding \(\bar{\bar{R}}_u^2\). \(\text {CTB}^{W,y}\) occupies the third position according to \(\bar{\bar{R}}_u^0\) and \(\bar{\bar{R}}_u^1\) but falls to eighth regarding \(\bar{\bar{R}}_u^2\). \(\text {CTB}^{W,Y}\) occupies the seventh position in terms of \(\bar{\bar{R}}_u^0\) and \(\bar{\bar{R}}_u^1\) but no longer appears among the top 10 regarding \(\bar{\bar{R}}_u^2\). In summary, with the 3-digit SITC and under different weights given to time stationarity, shape and ordinal ranking bias, CTB indexes with normalization by both total trade and the GDP-related coefficient are less relevant from an empirical standpoint, and Z gains empirical relevance. Interestingly, other CTB indexes appear among the lowest positions of the top 10, namely \(\text {CTB}^{W,l}\) and \(\text {CTB}^{W,f}\).

In the case of the EORA-26 GVC, the same classification arises from \(\bar{\bar{R}}_u^0\) and \(\bar{\bar{R}}_u^1\), where RC, \(\text {CTB}^{W}\) and AZ occupy the first three positions, followed by \(\text {RC}^l\), NB and \(\text {RC}^f\). If the weights underlying a global rank are those of \(\bar{\bar{R}}_u^2\), with less importance given to time stationarity than shape, AZ occupies the first position followed by RC with or without adjustment of trade flows.
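
The sensitivity of a global rank to the weights can be illustrated with a minimal sketch that combines three ranks through a weighted mean. This is only an illustration under stated assumptions: the weighted-mean rule, the weights, the index labels and the ranks below are hypothetical and do not reproduce the definitions of \(\bar{\bar{R}}_u^0\), \(\bar{\bar{R}}_u^1\) and \(\bar{\bar{R}}_u^2\) given in Sect. 6.

```python
def global_ranking(ranks, weights):
    """Order indexes by the weighted mean of their three ranks (lower is better).

    ranks maps an index label to (time stationarity, shape, ordinal ranking bias) ranks.
    """
    total_weight = sum(weights)
    scores = {
        name: sum(w * r for w, r in zip(weights, criterion_ranks)) / total_weight
        for name, criterion_ranks in ranks.items()
    }
    # Ties are broken alphabetically to keep the ordering deterministic
    return sorted(scores, key=lambda name: (scores[name], name))


# Hypothetical ranks for three hypothetical indexes
hypothetical_ranks = {
    "index_a": (1, 2, 8),
    "index_b": (5, 6, 1),
    "index_c": (3, 3, 3),
}

equal_weights = global_ranking(hypothetical_ranks, (1, 1, 1))
bias_heavy = global_ranking(hypothetical_ranks, (1, 1, 4))

print(equal_weights)
print(bias_heavy)
```

With equal weights the balanced index comes first, whereas a heavier weight on ordinal ranking bias promotes the index that performs best on that criterion, which mirrors how the top positions shift across the global ranks discussed above.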

As a general result, the standard RC index with or without adjustment of trade flows tends to provide relevant measures of comparative advantages independently of product classification; CTB indexes without adjustment of trade flows and with normalization by total trade, combined or not with GDP adjustment, are more reliable RCA indexes in relation to the 3-digit SITC; and regression-based indexes can usefully complement RC and CTB indexes. All these RCA indexes present greater theoretical robustness than the B index and the other RCA indexes based on exports only. Indeed, the main theoretical strength of regression-based RCA indexes is that they are supported by a Ricardian model of international trade, whereas CTB indexes and RC indexes arise from the Kunimoto-Vollrath principle, are calculated on the basis of the overall structure of exports and imports while being symmetric around their neutral point, and can be adapted to GDP differentials in a given trade area. In addition, some CTB indexes are additive across countries and/or products. Therefore, more theoretical robustness seems to confer greater empirical relevance to an RCA index. However, as suggested before, this positive relationship between theoretical robustness and empirical accuracy has some limits because no global rank includes among the ten best RCA indexes the RCA indexes that combine adjusted trade flows and a GDP-related coefficient.

8 Conclusion

The conventional RCA index from Balassa (1965) has been questioned multiple times but remains widely used along with many alternative RCA indexes. This puzzling situation suggests that there is still no clear consensus on which RCA index should be used to analyze the comparative advantages of a given set of countries, products and periods. This paper contributes to filling this gap by suggesting a consistent set of measures to evaluate the empirical properties of RCA indexes based on their time stationarity, shape and ordinal ranking bias. This set comprises nine measures that together rank different RCA indexes according to stability over time, the presence of a symmetric and thin-tailed distribution, and the absence of ordinal ranking bias. These measures combine GMM estimation, descriptive statistics and non-parametric measures.

We compute 33 RCA indexes for 67 different trade areas comprising different sets of countries centered on a reference country and its main trading partners. These computations are repeated for three different product classifications and, in relation to these classifications, two sets of time periods. In the end, a database of more than 595 million values of RCA indexes is used as the input to rank RCA indexes according to their empirical accuracy. Different ranking methods can support a discussion in which the empirical properties of RCA indexes are balanced against their theoretical properties. Contribution-to-the-Trade-Balance (CTB) indexes, the “Revealed Competitiveness” index and its further improvements, and regression-based indexes tend to provide the most relevant measures of comparative advantages from an empirical standpoint. In addition, these RCA indexes tend to have the greatest theoretical robustness.

To conclude, three future lines of research are worth mentioning. The first line of research is the use of alternative dynamic panel data methods to measure the time stationarity of RCA indexes, with the aim of working with more flexible serial correlation assumptions for the residuals. The second line of research concerns other methods of evaluating ordinal ranking bias. Specifically, there is an ordinal ranking bias if a country has a lower (higher) rank than x with a value of an RCA index greater (lower) than the mean value leading the same country to have rank x. Instead of the mean value, other representative values like the median or quartiles might be used to measure ordinal ranking bias. The third line of research is to analyze the average ranking of RCA indexes for different subsamples in which reference countries and/or their respective main trading partners present some specific characteristics, instead of analyzing the average ranking of RCA indexes throughout the sample of 67 trade areas. This kind of exercise would strengthen the analysis of RCA indexes from the vantage point of their empirical relevance. Ultimately, further examining the empirical properties of RCA indexes will help academics and practitioners select the most adequate RCA index without ignoring the theoretical foundations of RCA indexes.