1 Introduction

The cross-country variegation of growth experience has been one of the pivotal topics in economics for decades. A special place in this research agenda have computational models supporting the analysis of the paths of long-run growth of countries at different stages of development. The persistent interest on this debate stems from the direct link between convergence hypothesis and decision making in regards to economic policy. As indicated by Johnson and Papageorgiou (2020), a confirmation of a global convergence gives support to policy interventions on a small-scale which could speed-up transition of the poor countries toward a common steady-state. On the other hand, an existence of diverging clubs with different steady-states, calls for large-scale policy interventions, which could contradict unfavorable initial conditions, and could enable poor economies to escape from the development traps.

In the present paper, we aim to contribute to the research on computational models applied to the analysis of convergence phenomena in economies. In particular, we attempt to discover distinct growth trajectories and to find clusters of economies which follow similar development paths. For this purpose, we propose a novel, data-driven approach to mine patterns of transitional growth with the use of concept-based models.

Concepts in information sciences are abstract entities that represent information chunks. They can be instantiated with the use of various information representation formalisms such as real intervals or sets. A collection of concepts makes a concept-based model. Depending on the assumed information representation scheme, we can obtain models with different properties.

In this research, we focus on fuzzy set-based concepts, thus, we work with fuzzy concept-based models. The choice of fuzzy sets was motivated by their flexibility and versatility. They allow expressing knowledge in the terms of belongingness measured on a scale from 0 to 1. One instance may belong to multiple fuzzy sets to different degrees. Thus, we can express the diverse properties of the given data. What is more, we can capture uncertainty in the description of phenomena. Stressed properties make fuzzy modeling especially valuable when we wish to represent the inner-workings of complex systems, such as economic phenomena.

The modeling framework introduced in this paper provides comprehensive means for transforming raw multidimensional data concerning multiple problem instances into a joint fuzzy concept-based model, which is then used as a base for similarity evaluation of various problem instances. The proposed framework is focused on providing user-centred phenomena representation. With that in mind, it is of the utmost importance that the model is intuitive and easy to visualize. Therefore, in the application discussed in this paper, we relate to the problem of absolute convergence conceptualised as a relationship between income and growth rate. A focus on multivariate data with two variables enables to directly represented the set of available information using two-dimensional plots.

Let us reiterate the key novelty elements introduced in this paper:

  • We propose a novel procedure to assess similarity of multivariate time-series on the ground of a concept-based model.

  • We apply the proposed framework to identify similarities in the growth-income paths. That, in turn, let us to cluster economies on the basis of their development trajectories. The ability to detect distinct group of countries allows to pinpoint patterns of transitional growth specific to a given group.

  • The obtained results inform the researchers and policy makers which countries have followed similar development paths, and which countries—given their development trajectories—has been stuck in development traps.

The remainder of the paper is structured as follows. Section 2 addresses a brief review of the relevant literature. Section 3 presents the proposed method. In Sect. 4, we attempt to identify the main patterns of the income-growth relationship. The paper ends with concluding remarks in Sect. 5.

2 Literature review

One of the most important hypotheses developed within growth economics states that the initial conditions have no consequences on the long-run per capita income. Such a hypothesis is built upon the neoclassical growth models, pioneered by the work of Solow (1956) and Swan (1956), which predict countries to converge to a single global steady-state, due to declining marginal product of capital. The converging process is described with transitional dynamics (Barro and Sala-i Martin 2004, p. 37), and thus a growth rate that deviate from the steady-state level is referred to as transitional growth (Gundlach and Paldam 2020). In practice, as countries differ in the timing of reaching a steady-state, we ought to observe that economies with lower output per capita (further below the common steady-state) should grow faster than others, which would eventually lead to a global convergence of incomes.

In opposition to neoclassical growth models lies the reasoning based on models allowing for increasing marginal returns, which can produce multiple basins of attraction of the growth processes. Introducing feedback loops from output to the production factors in the growth models helped to theorize the existence of multiple steady-states, into which economies are sorted based on their initial conditions (Azariadis and Drazen 1990; Mankiw et al. 1992; Galor and Weil 2000; Galor 2011). Around these two views evolved a vast literature. Given the number of significant contributions to this debate (see Battisti et al. 2020; Bergeaud et al. 2020; Comin and Mestieri 2018 for the latest), we aim not to comprehensively review the whole discussion, we rather send a reader to the recent survey of the literature of Johnson and Papageorgiou (2020). Nevertheless, we review selected recent works, which have the closest connection with our paper.

As we demonstrate in the next section, our approach leads to the identification of groups of countries with similar growth-income trajectories. Up to date, researchers have extensively used many techniques aimed at splitting countries into distinct groups to study growth patterns, the behaviour of income distribution, and the formation of convergence clubs.

One stream of research in this domain is focused on the identification of countries that are similar in terms of their initial income or other proxies of initial conditions. Early contributions were based on exogenously defined groups (Baumol 1986). More recently, researchers attempt to form clusters endogenously, i.e. directly from the data. For instance, Tan (2009) uses the regression tree algorithm to identify the threshold level of predefined variables which groups countries into distinct growth-regimes. More recently, Fiaschi et al. (2019) detect different regimes based on an algorithm designed to compare Akaike information criterion (AIC) from a set of semi-parametric Generalized Additive Models. Both, the set of determinants and the thresholds levels which split the sample in the most informative way are derived from the AIC comparisons. The approach of Tan (2009), Fiaschi et al. (2019) has its advantages, as it enables comparing development paths of economies with different initial conditions. Nevertheless, we see merit in an alternative: i.e. in clustering based on the growth experiences during the whole time horizon. Such a procedure eventually gives space to the investigation of the underlying forces of a given class membership.

Another application of the procedures of endogenous partitions is the detection of convergence clubs. Phillips and Sul (2007, 2009) propose a data-driven clustering algorithm that tests for convergence within a certain group. As the main focus of the algorithm lies in the relationship between idiosyncratic transitions and a common growth component of income, the resulting clusters tend to contain countries at very different levels of development, which are expected to converge in the future. Results with similar interpretations are obtained by Beylunioglu et al. (2020). The authors apply maximum and maximal clique algorithms to identify clubs based on unit root tests of pairwise income differences. The methods of Phillips and Sul (2007) and Beylunioglu et al. (2020) differ from ours in the sense of the economic meaning of the clusters. While the convergence clubs contain countries with a variety of transition paths and at a different level of development, we aim to seek clusters of countries that have experienced similar growth rates at a given stage of their development. Therefore, our focus lies in the similarity of transitional growth, rather than cross-country convergence. The research which is the closest to our paper is the analysis of growth dynamics by Brida et al. (2011). They propose to cluster economies based on the membership to one of four regimes—the regime of low income/low growth, low income/high growth, high income/high growth, and high income/low growth. The thresholds of low/high income and growth are set as the global averages of the corresponding variables. The membership to a given regime is changing over time, so the authors represent these memberships for a given country as a vector and propose a distance measure to assess their similarity. Our main idea is akin, i.e. we also represent the dynamics of development in a two-dimensional space of income and growth rates, and we propose to cluster the countries based on a distance measure capturing similarity of the growth paths. However, we propose two important extensions. First of all, we see the evaluation of membership to four regimes as unsatisfactory. Such a procedure effectively collapses the information contained in the value of output and growth rate to be above or below its averages. In the aftermath, a country with output per-capita close to the subsistence levels and with no growth is described as similar to a country with income and growth rates just below the world’s averages. We see such simplification to be too extensive. Thus, we propose a method that allows assessing membership to any number of concepts representing the distribution of the data. As we will show in the next sections, this step is crucial to get reliable measures of (dis)similarity for economies that are not on the opposite ends of world income (growth) distributions. Furthermore, since concepts are abstract, we use fuzzy set memberships to describe the evolution of GDP per capita.

Finally, we shall mention the recent work in the area of assessing non-linearity in the transitional growth models. Most notably, Gundlach and Paldam (2020) with the use of kernel regression present empirical regularities in favour of a humped-shaped long-run development path. The authors justify such a shape of the growth-income relationship with the predictions of Lucas (2009) two-sector model. The model predicts that poor countries’ take-off is delayed due to their low capability to absorb technology. As countries move along the development path their initially low growth rates accelerate thanks to positive externalities of human capital. The growth-miracles are thus observed at low-to-middle income countries that simultaneously benefit from the advantage of backwardness and which developed sufficient stock of human capital. At the later stage of development, the growth rates fall again, as economies approach “modern” steady-state. Importantly, the model can predict a parabolic income-growth relationship of a different shape, depending on model parameters that reflect countries’ internal characteristics. In sum, the development path is thought of as a “great-transition” between traditional and modern steady-state, with the highest rates of growth observed during the transition stage, i.e. around the middle of the income distribution. As Gundlach and Paldam (2020) note, the humped-shaped transition path is expected for any variable generated by a sigmoid function. We build upon these findings, and we evaluate the shape of the development paths in sub-samples of similar economies.

3 The method

3.1 Preliminary notions

Empirical evidence on economic growth is available in the form of multivariate time series. That is, given is a sequence of M time series

$$\begin{aligned} \left[ \begin{array}{ccccc} z_{1_1} &{} z_{1_2} &{} z_{1_3} &{} \ldots &{} z_{1_N} \\ z_{2_1} &{} z_{2_2} &{} z_{2_3} &{} \ldots &{} z_{2_N} \\ \vdots &{}&{}&{}\\ z_{M_1} &{} z_{M_2} &{} z_{M_3} &{} \ldots &{} z_{M_N} \\ \end{array} \right] \end{aligned}$$

where \(z_{i_j}\) is the value of the ith variable time series at the jth moment in time. N is the length of the time series we analyse.

Let us introduce labelling of the data which allows us to differentiate between multivariate data sets of various countries. For example, Eq (2) presents multivariate data set concerning a certain country L.

$$\begin{aligned} \begin{array}{l}L\\ L\\ \vdots \\ L\\ \end{array} \left[ \begin{array}{ccccc} z_{1_1} &{} z_{1_2} &{} z_{1_3} &{} \ldots &{} z_{1_N} \\ z_{2_1} &{} z_{2_2} &{} z_{2_3} &{} \ldots &{} z_{2_N} \\ \vdots &{}&{}&{}\\ z_{M_1} &{} z_{M_2} &{} z_{M_3} &{} \ldots &{} z_{M_N} \\ \end{array} \right] \end{aligned}$$

Available data, as in Eqn. (1), forms a cloud of points in an M-dimensional space, where dimensions correspond to variables. Each point in this space can be denoted as \(\mathbf{z} _i = [z_{1_i}, z_{2_i}, \ldots , z_{M_i}]\). \(\mathbf{z} _i\) is an M-element vector corresponding to the moment in time i. Data points in this space are labeled with their respective country labels.

In more general terms, labels allow distinguishing between samples coming from different classes of phenomena. Since this paper is devoted to analyzing patterns of transitional growth, one class corresponds to one country. Thus, the narration refers to the term country not to obfuscate the narration. In the wider context, the approach described in this paper can be altered to cluster and classify multivariate data that is an essential machine learning task (Ren et al. 2016). Furthermore, in the framework introduced in this paper time flow is not represented in the model itself. Thus, the method can be applied to data that is not temporal.

To enhance the temporal data representation ability of the model, we can introduce variables that encode time flow. The most straightforward solution is to introduce first-order differences of a given temporal variable. The series of first-order differences for the ith variable is computed as the differences \(z_{i_j} - z_{i_{j-1}}\).

3.2 Extraction of concepts

The first phase of the procedure aims at extracting meaningful concepts to represent the underlying data set concerning multiple countries. That is, data as in Eq. (2) concerning all countries that we wish to analyze is at first concatenated.

The role of concepts is to (i) aggregate and (ii) generalize knowledge present in the empirical evidence based on which the concepts are created. The most straightforward procedure that can be applied to extract concepts is centroid-based clustering (Askari et al. 2017; Yang and Nataliani 2017). Not only does it partition the data into subsets, but also it produces centroids that can be treated as “average” representatives of each cluster. To align the choice of algorithms to the methodology put forward in this paper, we employ the well-known fuzzy c-means algorithm delivered by Dunn (1973) and Bezdek (1981). It is a fuzzy variant of the k-means clustering algorithm.

In the fuzzy c-means algorithm, empirical evidence is associated with each centroid using a membership value. Membership values, in contrast to standard clustering algorithms, are not crisp, but fuzzy, which means that one data point may belong to more than one cluster. The sum of all membership values for a single data point adds up to 1.

The fuzzy c-means algorithm creates a partition matrix \(\mathbf{U} = [\mu _{ij}], \mu _{ij} \in [0,1], j=1, \ldots , C, i=1, \ldots , P\), where C is the number of clusters, P is the number of data points in the data set. An element \(\mu _{ij}\) represents the degree of membership of the ith data sample i to the jth cluster. The clustering procedure is regulated by a fuzzification coefficient \(m \in {\mathbb {R}}\), such that \(1.0< m < \infty \). The higher the value of m, the more fuzzified the solution. m close to 1 makes the fuzzy c-means behave like the binary k-means clustering. Hathaway and Bezdek (2001) recommend setting \(m = 2.0\).

Fuzzy partitioning is carried out by an iterative procedure that aims at minimization of the following objective function:

$$\begin{aligned} J_m = \sum _{i=1}^{P} \sum _{j=1}^C \mu _{ij}^m\Vert \mathbf{z} _i - \mathbf{v} _j\Vert ^2 \end{aligned}$$

\(\mathbf{z} _i\) is an ith tuple of the clustered data (ith observation). Each datum \(\mathbf{z} _i\) is located in the M-dimensional space, thus, technically speaking \(\mathbf{z} _i\) is an M-element vector. \(\mathbf{v} _j\) is an M-element vector with coordinates of the jth cluster (coordinates of the jth centroid). m is the aforementioned fuzzification coefficient. \(\mu _{ij}\) is the degree of membership of \(\mathbf{z} _i\) to \(\mathbf{v} _j\). \(\Vert *\Vert \) is any norm expressing similarity of two vectors.

The iterative procedure adjusts membership values \(\mu _{ij}\) from the matrix \(\mathbf{U} \) and the cluster centres \(\mathbf{v} _j\) by:

$$\begin{aligned} \mu _{ij} = \frac{1}{\sum _{k=1}^C \Bigg ( \frac{\Vert \mathbf{z} _i - \mathbf{v} _j \Vert }{\Vert \mathbf{z} _i - \mathbf{v} _k \Vert } \Bigg )^{2/(m-1)} } \end{aligned}$$


$$\begin{aligned} \mathbf{v} _j = \frac{\sum _{i=1}^P \mu _{ij}^m \cdot \mathbf{z} _i}{\sum _{i=1}^P \mu _{ij}^m} \end{aligned}$$

The procedure stops when the greatest change in any value in the matrix \(\mathbf{U} \) from kth to \((k+1)\)th iteration was smaller than a given \(\varepsilon \) or when a predefined number of algorithm’s iterations was exceeded (Zhang et al. 2016).

After applying the fuzzy c-means, we obtain concepts that are used to represent the given data set. The key parameter is the number of concepts to be extracted that has to be given before the procedure is launched. The more concepts we have, the more fine-grained phenomena description we obtain. The downside of having many concepts is that the more concepts we have, the less intuitive the interpretation of the model. Furthermore, a data-driven concept extraction procedure which is advocated in this paper has the property that the more concepts we produce, the more likely it is to obtain concepts representing outlying observations. On one hand, it may seem like an advantage, to be able to pinpoint outliers. On the other hand, the model will become highly biased towards the data that was used to train it. Thus, there is a need for a sensible balance between model generality and specificity (Kim et al. 2018).

3.3 A modification of the concept extraction procedure

The proposed modeling framework will be applied to represent economic phenomena. A frequently seen property in such data is that in different regions (subspaces) of the assumed M-dimensional space variance of observations differs and the clusters are far from being spherical. An example of such a situation is plotted in Fig. 1. The plot concerns a two-dimensional space. The first considered variable is GDP per capita (constant 2000 U.S. dollars). The second variable is the net primary income in current U.S. dollars. The data was downloaded from the World Bank API and it concerns the years 1970–2019. Plotted data points concern five large economies in the European Union: France (FR), Germany (DE), Italy (IT), Spain (ES), and the United Kingdom (UK). The interval between the smallest and the largest value found in observations in each dimension was split into three even parts and we plotted horizontal and vertical lines to designate nine regions.

Fig. 1
figure 1

Illustration of frequently seen properties of clouds of points of variables describing economic phenomena. Dashed lines partition the region occupied by the data scatter into nine sub-regions. Data points do not form spherical shapes. Densities of data scatter differ in sub-regions

Let us at the moment steer clear from the interpretation of the plot, since it was given only to illustrate certain challenges in data processing that reappears in many data sets in economics.

With a bare eye, we notice that the density of data points varies in designated regions. Furthermore, it is hard to describe the shape of the formed clouds of data points. From the qualitative point of view, we are aware that interesting patterns may be represented with clusters of unequal variance and odd shapes. Yet, the fuzzy c-means algorithm is most suitable to detect spherical clusters. This stems from the inner-working of this method which measures distances of data points to each centroid. Since the relation of distance is symmetric, the method is particularly well-suited when the data forms spherical clusters.

Thus, instead of running the fuzzy c-means algorithm directly in M-dimensional data, we propose to employ it separately for each variable in the given multivariate data set. We perform fuzzy c-means to extract \(u_1\) centroids for the first variable, \(u_2\) centroids for the second variable, and so on. Next, we apply the Cartesian product to the obtained centroids to generate M-dimensional centroids.

A comparison of the results achieved with the direct and the Cartesian product-based procedure for the aforementioned data set is illustrated in Fig. 2.

Fig. 2
figure 2

Comparison of the properties of the direct (the left plot) and the Cartesian product-based (the right plot) concept extraction procedure

Cartesian product-based procedure forces the concepts to spread out more in the given space. In contrast, the direct method tends to position centroids within the densest regions. Since the data scatter is not spherical, more than one centroid will likely be used to represent one crescent-shaped cluster. The described situation can be observed in the top-right sub-region in the left-hand-side plot in Fig. 2.

The downside of the Cartesian product-based method is that it may produce superfluous concepts. This case can be observed in the top-left sub-region in the right-hand-side plot in Fig. 2. However, we can easily envision introducing a measure for concept evaluation that could be used to evaluate and eliminate superfluous concepts. At the same time, our previous investigations into concept-based models suggested that superfluous concepts do not deteriorate prediction accuracy, but rather cause issues in model interpretation (Homenda et al. 2014). In the cited paper, the authors propose several strategies to evaluate concepts.

3.4 Linking concepts with countries and comparing countries

The introduced concept-based data representation allows transforming the data previously present in the M-dimensional space into a C-dimensional space. In the new space, each dimension corresponds to cluster membership, thus, we have C dimensions. The data in this space is normalized to the [0, 1] because the membership function values are in [0, 1] (see Eq. 4). The multivariate data set concerning a certain country L is now in the form as in Eq. (6).

$$\begin{aligned} \begin{array}{l}L\\ L\\ \vdots \\ L\\ \end{array} \left[ \begin{array}{ccccc} \mu _{1_1} &{} \mu _{1_2} &{} \mu _{1_3} &{} \ldots &{} \mu _{1_N} \\ \mu _{2_1} &{} \mu _{2_2} &{} \mu _{2_3} &{} \ldots &{} \mu _{2_N} \\ \vdots &{}&{}&{}\\ \mu _{C_1} &{} \mu _{C_2} &{} \mu _{C_3} &{} \ldots &{} \mu _{C_N} \\ \end{array} \right] \end{aligned}$$

\(\mu _{i_j}\) is the membership value of the jth data point to the ith cluster.

In the next step, we aim at aggregating knowledge concerning each country. We propose to represent each country with a C-element vector of averaged membership values computed as follows:

$$\begin{aligned} \Bigg [ \frac{\sum _{i=1}^{N}\mu _{1_i}}{N}, \frac{\sum _{i=1}^{N}\mu _{2_i}}{N}, \ldots , \frac{\sum _{i=1}^{N}\mu _{C_i}}{N}\Bigg ]. \end{aligned}$$

The introduction of the averaging makes data representation much more succinct than it was before. We can interpret a jth entry in the vector from the Eq. (6) as an average membership obtained for one country to a certain concept. We compute this vector for each country separately. Since the values are normalized (in [0, 1]), we can directly compare vectors corresponding to different countries.

The subsequent step of the proposed modelling framework aims at evaluating the similarity of countries. We seek a mapping operating on a collection of vectors in the form as in Eq. (6). We propose to use the Jaccard index, which is given as:

$$\begin{aligned} {J(A,B)={{|A\cap B|} \over {|A\cup B|}}.} \end{aligned}$$

To adapt Eq. (8) to fuzzy sets, we assume that \(| * |\) denotes the cardinality of the fuzzy set computed via summation, \(\cap \) and \(\cup \) can be replaced with a t-norm and an s-norm respectively (Hwang et al. 2018; Ramli and Mohamad 2009). We propose to use minimum and maximum as a t-norm and an s-norm respectively. In particular, for a given two countries L and K, the computations of the nominator in Eq. (8) look as follows: \(\sum _{i=1}^{C} \min (l_i,k_i)\). Computations in the denominator rely on \(\sum _{i=1}^{C} \max (l_i,k_i)\). In both cases, \(l_i\) and \(k_i\) are ith entries in a vector as in Eq. (6) obtained beforehand for the countries L and K respectively.

3.5 Similarity-based clustering of countries

As a result of the application of the procedure leading from the extraction of concepts to the similarity evaluation for any pair of countries liked to the extracted concepts, we can construct a similarity matrix for any given set of countries. This matrix can be, in turn, used to split the set of countries into disjoint subsets—clusters. We propose to apply hierarchical clustering, because of its intuitive model representation form (Murtagh and Contreras 2017; Xu et al. 2016).

Hierarchical clustering creates a tree-based model, where leaves contain single data samples (here countries) and the root holds the whole set of countries we wish to partition. The procedure can start either on top or on the bottom. Depending on the assumed strategy, the procedure is called either divisive or agglomerative (Bouguettaya et al. 2015; Zhang et al. 2013). In the research presented in this paper, we employ the latter one, where the procedure starts with tree leaves containing single observations and merges nodes until a root with all observations is reached. An important choice regarding the tree-building process is the method of similarity/difference evaluation for a given candidate pair of intermediate nodes to be merged. In this paper, we utilized the popular Ward’s clustering criterion (Murtagh and Legendre 2014) to decide which pair gets merged. The decision about merging certain intermediate clusters is made based on a similarity matrix. Here, we use a similarity matrix computed using the steps addressed in the previous sections.

A hierarchical clustering outcome can be presented in the form of a dendrogram. Based on a dendrogram a human expert can select the proper number of clusters. Sub-trees in the formed structure correspond to the clusters. After obtaining clusters, further processing can be employed. In particular, we may look at and evaluate certain parameters of observations that were grouped and interpret the overall outcome from the point of view of the applications area.

4 Data and results

As our goal was to identify patterns of long-run transitional growth, our primary source of information on GDP was the Maddison database (Bolt et al. 2018), which contains historical reconstructions of incomes for a relatively big number of countries. In our first sample, we used annual observations from 1870 to 2016, for all countries with at least 100 observations. We excluded extreme values from the sample, i.e. we cut off 1-st and 99-th percentile of our variables. Such a procedure gave 5506 observations on 41 countries. Furthermore, we proceeded with an analysis of a broader sample of 144 countries with at least 50 observations in the period 1950–2016, ending with 9228 observations. As described by Bolt et al. (2018), variable cgdppc gives a better estimation of income for cross-country comparisons, while variable rgdpnapc is more suitable to study variation in growth rates. Therefore, we used the logarithm of the former variable as a measure of income, and we used the differences of logarithms of the latter one to calculate growth rates. We applied our concept-based model to evaluate the similarity of transitional growth paths in the following manner: we set the number of concepts to 25 and 49 for smaller and larger samples respectively. Then, we applied the Cartesian product to centroids extracted with fuzzy c-means. Further, we matched observations for a given country with each of the concepts and we use arithmetic averaging to aggregate membership values for a given country. As a next step, we construct a similarity matrix, based on the Jaccard index. Finally, we employed hierarchical clustering on the similarity matrix. As we are interested in obtaining relatively compact clusters, we chose to use Ward’s method.

4.1 Long-run growth patterns (1870–2016)

At first, we turn our attention to the long time horizon and we start with the sample for the years 1870–2016. Clusters obtained for this sample are presented in Fig. 3. Having used a hierarchical clustering algorithm, we can observe the groupings at different levels of aggregation. In Fig. 3, we observe three distinct clusters, which can be divided into smaller, and more similar groups as depicted with different colours representing six clusters.

Fig. 3
figure 3

Groups of countries with similar growth experiences in 1870–2016

Starting from the left-hand side of Fig. 3, we note that cluster No. 1 consists of somehow divergent countries, both in terms of geography and the current level of development. A common characteristic of these countries is a late take-off of their incomes. As a result of being initially poor, these countries form a group of contenders to catch-up with the global leaders. Cluster No. 2 is the first candidate for further splits. A decision to cut a whole sample into six parts divides cluster No. 1 into the following groups:

  • 1a: India, Indonesia, Sri Lanka, Venezuela, and Brazil;

  • 1b: Italy, Japan, Finland, Portugal

  • 1c: Poland, Mexico, Panama, Bolivia, Peru, Philippines, Colombia,cuador.

Next, we note that cluster No. 2 (in the middle of Fig. 3) consists of advanced economies, which at the first sight form a compact group. Splitting cluster No. 2 into two groups divides it into the group of European and non-European countries with an exception of France. Looking at even smaller clusters, we grasp a reaffirmation of domain knowledge—the highest similarities are found between Germany and Austria, Norway and Sweden, Denmark and Belgium, etc. We consider this fact as a confirmation of the adequacy of the proposed algorithm.

Finally, cluster No. 3 groups countries, of which development paths similarity is non-obvious at a first glance. Yet, we notice a common characteristic of these countries, which is a relatively turbulent history of their economic systems such as long periods of centrally-planned communist economies followed by rapid and sharp transformation (Former USSR, Romania, Taiwan), as well as long periods of political unrest spanning throughout the twentieth century (Chile, Greece, Uruguay, Argentina, South Africa). In consequence, cluster No. 3 is characterised with the highest standard deviation of the growth rates as documented in Table 1. Romania and Taiwan fit cluster No. 3 to the smallest extent and thus they form cluster No. 3a. The remaining countries are gathered up in cluster No. 3b.

Table 1 Summary statistics of growth rates by cluster, 1870–2016

Having identified groups of countries with similar transitional growth paths, we turn to the analysis of the relationship between growth rates and income levels. To do so, we fit a generalised additive model of a growth rate (with cubic spline smoothing function) for each cluster, as presented in Figs. 4 and 5. In line with Gundlach and Paldam (2020) we observe a humped-shaped relationship between income and growth rates within clusters Nos. 1 and 2 (in the latter case we leave aside a vague shape of the function for the lowest level of income, with very wide confidence intervals). Figure 4 visualises some differences between the estimated path of transitional growth between catching-up countries (cluster 1) and economic leaders (cluster 2). In particular, the fitted growth curve of the former is shifted leftwards, as compared to the curve of the latter one. The acceleration of the growth rates of advanced economies occurred at a relatively high level of their development. Such differences can be interpreted as resulting from the global technological shock which allowed the “grand transition” and which paved the way to a rapid development to countries at different levels of income. Turning to cluster 3, we note a constant growth-income path, with growth rates just below the global median. In this case, as we mentioned above, we observe rather volatile growth, which additionally turned out to be—on average—sustained at the same rate at different levels of incomes.

Fig. 4
figure 4

Income-growth relationship in 1870–2016 within the three main clusters. Gray area represents 90% confidence intervals. The horizontal red-dotted line represents the median growth rate in the whole sample, the vertical one indicates the median income. (Color figure online)

Fig. 5
figure 5

Income-growth relationship in 1870–2016 within six clusters. Gray area represents 90% confidence intervals. The horizontal red-dotted line represents the median growth rate in the whole sample, the vertical one indicates the median income. (Color figure online)

As a high level of aggregation may disguise the variegation of income-growth paths, we move towards the analysis of smaller clusters. Transitional growth paths in clusters 1a and 1b show important dissimilarities. Although in both cases, a hump-shaped pattern prevails, countries in 1a seem to converge to a lower level steady-state. A drop in their growth rates occurred much faster than in the case of countries in cluster 1b (which can serve as an example of economies that caught-up with the global leaders successfully). As a result, Venezuela, Brazil, India, Sri Lanka, and Indonesia, although experienced relatively high growth rates, tend to slow down too early to converge with Japan, Italy, Finland, and Portugal. In cluster 1c, in turn, we observe only the increasing function of the growth rate, which can be a part of a humped-shaped function with a relatively late slowdown. Given the wide confidence intervals, the investigated relationship seems to be vague in this group. Finally, the average growth rates in cluster 3b remained at constantly low levels, showing no sign of a humped-shaped curve.

Table 2 presents a further description of the growth-income paths. It reports the outcomes of growth regressions for each cluster. In particular, we regress the growth rates in year t on income in year \(t-1\) with the use of pooled OLS (Eq. 9):

$$\begin{aligned} {\textit{growth}}_{i,t} = \beta _0 + \beta _1y_{i,t-1}+\epsilon _{i,t}. \end{aligned}$$

where \(y_{i,t-1}\) is the income of country i in year t-1, and \(\epsilon _{i,t}\) stands for residuals at a country-time level. The results confirm, that a humped shaped relationship is dominant in cluster 1a, 1b, and 2. A positive, linear relationship is found in cluster 1c. Coefficients of \(y_{i,t-1}\) for cluster 3a and 3b are not significantly different from zero.

Table 2 Pooled OLS regression by clusters, 1870–2016

In sum, we find evidence of a humped-shaped long-term growth, as in Gundlach and Paldam (2020) for a big part of the sample. However, the growth patterns differ substantially across clusters. The discrepancies in these patterns add to our understanding of long-term convergence processes. Given the general hump-like shape of the growth-income relationship, we can expect the poor countries to temporally diverge from the middle-income ones. Moreover, countries experiencing a transition from traditional to modern steady-state seem to be on the convergence path with advanced economies. Importantly, such convergence would occur only if a slow-down corresponding with an approaching modern steady-state happened relatively late. In other words, seemingly converging economies may indeed converge to other equilibrium than developed countries. On top of it, we identified a group of countries that tend to follow a different development path. Finally, we notice that the sample for years 1870–2016 contains only countries that are relatively well developed. These economies could have followed different growth paths than underdeveloped ones. Such a sampling issue turns our attention to the analysis of growth patterns in a shorter time dimension but among a broader set of countries.

4.2 Growth patterns in 1950–2016

An application of the concept-based model and hierarchical clustering algorithm to the GDP series spanning from 1950 to 2016, led us to distinguish clusters as illustrated in Fig. 6. For further analysis, we decided to split the sample into four and eight groups (see Table 5 in the appendix for the list of countries).

Fig. 6
figure 6

Groups of countries with similar growth experiences in 1950–2016

We repeat the analysis presented in the previous steps, i.e., we first turn to the groups at a higher level of aggregation. Next, we show how disaggregation into smaller and more compact groups reveals dissimilarities in the transitional growth paths. Looking at the fitted growth curves in Fig. 7, we note that the observed curves of the growth rates reflect the much shorter history of development, as the sample starts in 1950. Thus, the inverted U-shaped curve, dominant in 1870–2016 sample, is no longer visible. In the case of cluster 1, 2, and 3, the relationship between income and growth rates is negative, i.e. the higher the level of income the lower the growth rate. The differences across these clusters are non-negligible and depict the high growth rates of the richest countries in the world (gathered-up in cluster 1), and a low dynamics of development in cluster 3, which groups countries exhibiting a growth slowdown at relatively early stages of development. The most striking discrepancy in the growth experiences is revealed in cluster 4, which bring together the poorest economies, with the lowest growth rates.

Fig. 7
figure 7

Income-growth relationship in 1950–2016 within four clusters. Gray area represents 90% confidence intervals. The horizontal red-dotted line represents the median growth rate in the whole sample, the vertical one indicates the median income. (Color figure online)

The disaggregation of the sample into eight clusters uncovers another important dissimilarity, as it splits countries at a comparable level of development accordingly with their growth dynamics. In particular, cluster 2b has higher growth rates than cluster 2a. Growth rates differ drastically between cluster 3a and cluster 3b. Cluster 4a consists of initially poor countries that show signs of a take-off. In contrast, economies in 4b are characterised by a consistently low level of growth. Finally, countries in 4c, despite having a similar level of output as in cluster 4a, tend to stagnate as their income grows (Table 3).

Table 3 Summary statistics by cluster, 1950–2016

We see the above findings as having important economic meaning. First of all, we can identify distinct growth paths, which relate directly to hypotheses present in the current literature. For example, in cluster 3b we find a signal of the medium-income trap (Eichengreen et al. 2013), while cluster 3a points to the examples of countries that were able to avoid this trap. Concurrently, in clusters 4b and 4c we identify countries that can be classified as trapped in poverty (Bowles et al. 2016). Secondly, the findings presented in Fig. 8, relate to the unconditional convergence hypothesis. Although our methods do not allow to directly test this hypothesis, the disparities across clusters, such as between the groups 2a and 2b, 3a and 3b, 4a and 4b or 4c, indicate that countries are heading towards different steady-states, and tend to be a part of distinct basins of attractions of their growth processes. Moreover, the within-cluster negative relationship between income and growth rate (f.e. in cluster 1, 2a, 3b, 4c) suggest the convergence within these clusters. However, the negative slope of the growth function may be an artifact of longitudinal, within-countries patterns of development, rather than cross-country convergence. We provide further insight into this issue with the within-between model (Bell et al. 2019), which we propose to modify as in Eq. (10):

$$\begin{aligned} {\textit{growth}}_{i,t} = \beta _0 + \beta _1(y_{i,t-1}-y_{i,1950})+ \beta _2 y_{i,1950} + \mu _{i} + \epsilon _{i,t} \end{aligned}$$

where \(y_{i,t-1}\) is the income of country i in year \(t-1\), \(y_{i,1950}\) is the initial income of a country i, \(\epsilon _{i,t}\) stands for residuals at country-time level, and \(\mu _{i}\) represents random effect attached to intercept. Note, that model 10 is a special case of multilevel models, where observations are clustered within countries. We propose to separate the variable y into its value in 1950 (initial income) and its value centred around initial income (income diff). Such a procedure allows us to simultaneously assess the cross-country and longitudinal relationships between income and growth rates. We interpret the former one as consequences on growth stemming from cross country differences in the initial conditions. The latter reflects the within-country relations between growth and output. The results are reported in Table 4.

Fig. 8
figure 8

Income-growth relationship in 1950–2016 within eight clusters. Gray area represents 90% confidence intervals. The horizontal red-dotted line represents the median growth rate in the whole sample, the vertical one indicates the median income. (Color figure online)

Table 4 Within-between models of economic growth in 1950–2016

In general, cross country and longitudinal relationships within clusters are akin, i.e. the coefficients of initial income and income diff have the same signs. The coefficient of the initial income variable indicates a cross-country convergence within each cluster, except 4a and 4b. Interestingly, the estimations on the whole sample indicate no cross-country convergence (given the positive coefficient of initial income). Such results stand in line with the overall picture emerging from Fig. 8.

5 Conclusion

In the present paper, we have proposed the modeling framework to evaluate the similarity of multidimensional time-series data. The back-bone of the proposed method consists of a set of abstract concepts, which are linked to the data with fuzzy membership values. These, in turn, serve as a tool to calculate pairwise similarity matrix.

Subsequently, we have applied the concept-based model to identify main patterns of transitional GDP growth. The proposed procedure has revealed a variegation of development paths. In particular, for a number of economies we have found evidence of a humped-shaped pattern of the long-run growth, and we have demonstrated how these patterns differ across groups of countries. Furthermore, analysing growth rates in 1950–2016, we have identified groups of countries that follow distinct transitional growth paths despite being at a relatively similar stage of development. In sum, the obtained results help to pin-point main transition processes, which facilitates explaining cross-country inequalities of income.

The key novelty of the paper, i.e. the concept-based model, led us to identify groups of countries with similar growth trajectories. In our view, such a procedure enables a balance between over-simplification related to the analysis of pooled data, and over-specificity of the individual country time-series explorations. Given the variety of the development histories, and a need of formulation of some generalisations, such balance seems to be of the uttermost importance. Moreover, the proposed method opens a room for the discussion on the determinants of the membership to a given cluster. Following the domain literature (e.g. Fiaschi et al. 2019; Tan 2009), we envision that the potential drivers are the initial conditions, which constitute ’deep’ determinants of economic growth and which are not captured by initial income variable. Thus, determinants such as culture, geography, institutions and social capital may serve as a potential explanation of the diversified path of transitional growth. Nevertheless, we also expect these variables to be correlated with the initial income, and to be a weak explanation of the different patterns of growth of countries at a similar level of development. Thus, we see a potential in the qualitative research on the policy measures introduced in these economies. A promising research constitute also an qualitative investigation of growth shocks, and associated structural breaks within individual countries. Last but not least, one could also use tools typical to growth econometrics to seek for within-clusters relationships between growth and its proximate determinants. Efforts to explain class membership is the direction of follow-up studies, which we aim to undertake in the future research.