1 Introduction

The 2019 novel coronavirus disease (COVID-19) remains a global challenge. From the onset of the outbreak in December 2019 to March 2020, the spread of the virus and the associated deaths increased at an unprecedented rate. On 11 March 2020, given the contagion rate and the widespread infectivity of the virus, the World Health Organization (WHO) declared COVID-19 a pandemic (Ghebreyesus 2020). As of October 2021, over 249 million people have been affected, and more than 5 million people have died due to COVID-19 related complications (WHO 2021).

To better understand the pandemic, several studies have focused on examining factors that influence the spread of COVID-19. This literature has examined factors such as temperature, environmental conditions, population density, social isolation and government policy interventions including lockdowns and information campaigns (see, e.g. Xie and Zhu 2020; Prata et al. 2020; Sy et al. 2021; Baser 2021; Di Domenico et al. 2020; Atalan 2020; Travaglio et al. 2021). These studies have enhanced our understanding of factors that contribute to the spread of COVID-19 and, thus, have been useful in informing policy on targeted interventions. However, to better inform policy, there is also a need to understand the spatial distribution dynamics of COVID-19.

Understanding the spatial distribution patterns of COVID-19 is important for policymakers, as these patterns provide indications of potential factors affecting the spread of the virus and the mortality associated with it. Importantly, understanding the spatial trends of the infection and death rates of COVID-19 can help policymakers to identify the most effective strategies that can be prioritised to address the ongoing pandemic. Such an analysis provides indications of the success or failure of policy interventions that have been implemented in certain countries and sheds light on trends in geographic areas that might be relevant for policy. Put differently, understanding the convergence patterns of COVID-19 provides insights into the effectiveness of existing policies and, as a consequence, helps to determine whether to continue with or change existing policies, such as lockdowns and social isolation, that are aimed at curbing the spread of the virus and associated mortality rates.

We contribute to the burgeoning literature that has examined various aspects of the COVID-19 pandemic by examining the spatial distributional patterns of infection and death rates of COVID-19 using the club convergence methodology proposed by Phillips and Sul (2007, 2009). Despite the well-known, increasing trend of COVID-19 cases and deaths across countries globally, significant transitional dynamics and heterogeneities have been observed across countries with regard to policy, growth rates of infections, and frequency of deaths associated with COVID-19. Thus, we model COVID-19 infection rates and death rates accommodating these heterogeneities and transitional dynamics using the Phillips and Sul (2007, 2009) methodology. This methodology has the advantage of offering unique insight into the spread of COVID-19 by detecting multiple equilibria related to the groupings of countries while allowing for different convergence paths within these groupings. Additionally, the time-varying properties of the model we use allow us to detect any disequilibria arising from the insurgence of COVID-19 across countries.

We focus on the number of daily new confirmed cases and deaths, and examine the convergence of COVID-19 cases and deaths for 155 countries starting from the date COVID-19 was declared a pandemic to August 2021. We find evidence of COVID-19 converging worldwide, although this convergence has not been equal across countries. Specifically, we find evidence of countries or regions that are clustered into several groups. We extend our analysis to explore the potential determinants of the various convergence groups to explain how countries converge into various COVID-19 steady states. Notably, our results suggest that the probability of belonging to a group with higher COVID-19 death intensity has no association with the stringency of government policies such as lockdowns, business closures and social isolation.

Our study contributes to a small body of literature that has examined the spread patterns of the COVID-19 pandemic. A subset of this literature has focused on the spread patterns of COVID-19 within single countries using different spatial econometric methods and Geographic Information System (GIS) mapping (Huang et al. 2020; Han et al. 2021; Martellucci et al. 2020; Cordes and Castro 2020). Our work is closely related to those that focus on the spatial distributions across countries (see, e.g. Meng 2021; Dehghan Shabani and Shahnazi 2020; Ismail et al. 2020; Katul et al. 2020). For instance, Dehghan Shabani and Shahnazi (2020) examine the spatial distribution patterns of the spread of COVID-19 in 40 Asian countries using data on reported cases from February to July 2020. Ismail et al. (2020) apply multiple time-series models to forecast the spread of the COVID-19 virus. The closest in the literature to our study is Meng (2021), who adopts similar club convergence methods but focuses only on G20 countries. Our study also differs from Meng (2021) given that we model the determinants of convergence across the 155 countries that we study.

A related strand of literature has also examined the impact of COVID-19 on the convergence of income and other economic indicators. For instance, in line with the established literature on convergence in income per capita, Martinho (2021a) examines how convergence in income per capita in OECD countries has been influenced by the COVID-19 pandemic. Other studies have discussed how convergence in factors such as unemployment rates, inflation and mortality rates has been influenced by COVID-19 (Fedajev et al. 2021; Horton 2021). We differ from these studies in that our focus is not on how COVID-19 has influenced convergence of economic, health and social indicators but on convergence in COVID-19 cases and mortalities related to COVID-19.

2 Nonlinear time-varying factor model

The identification of convergence patterns in the COVID-19 global pandemic utilises the club convergence methodology proposed by Phillips and Sul (2007, 2009) (henceforth PS). The novelty in the PS approach centres on several key aspects. First, the framework is a nonlinear time-varying factor model that allows for transitional dynamics and captures heterogeneity across countries and over time. Second, it identifies groups of countries with similar convergence patterns, irrespective of whether all countries in the panel converge. In other words, if the club convergence test fails to identify convergence across the complete set of countries in the panel dataset, an algorithm detects sub-convergent groups or clusters, each of which converges to its own steady-state equilibrium. Third, the club convergence test is tailored explicitly to the properties of the data, thus classifying convergence clusters endogenously. Finally, the PS framework is superior to standard panel unit root tests since it is not sensitive to the stationarity properties of the time series and thus not affected by the small sample properties of conventional stationarity tests. In addition, the club convergence approach calculates relative convergence of cross-sectional means in contrast to the concept of absolute level convergence.

PS express their methodology as a time-varying common factor for observable series \({Y}_{it}\) for country \(i\) at time \(t\) given by:

$${Y}_{it}={\beta }_{i}{\mu }_{t}+{\varepsilon }_{it}$$
(1)

where \({Y}_{it}\) is the log-transformed value of our COVID-19 measures (discussed in detail in Sect. 3), \({\beta }_{i}\) denotes the country-specific component, \({\mu }_{t}\) represents the time-varying common trend component (which can either be a non-stationary stochastic trend with drift or a trend-stationary process), and \({\varepsilon }_{it}\) denotes the error term. Note that, in Eq. (1), \({Y}_{it}\) can be decomposed into a common trend component, \({\mu }_{t}\), and an individual element, \({\varphi }_{it}\), given by:

$${Y}_{it}=\left({\beta }_{i}+\frac{{\varepsilon }_{it}}{{\mu }_{t}}\right){\mu }_{t}={\varphi }_{it}{\mu }_{t}$$
(2)

Although we cannot directly estimate \({\varphi }_{it}\) as a result of over-parameterisation, PS modify Eq. (2) to eliminate the common trend component by rescaling each series by the panel average. This generates a relative measure of the loading coefficients, given by:

$${h}_{it}=\frac{{Y}_{it}}{(1/N){\sum }_{i=1}^{N}{Y}_{it}}=\frac{{\varphi }_{it}}{(1/N){\sum }_{i=1}^{N}{\varphi }_{it}}$$
(3)

where \({h}_{it}\) represents the relative transition parameter (or path) of country \(i\) to the panel average at time \(t\). This allows us to trace each country's individual path relative to the panel average. In addition, the relative transition parameter has two unique properties: (i) \({h}_{it}\) has a cross-sectional mean of 1; (ii) if \({\varphi }_{it}\) converges to a common \(\varphi\), the relative transition parameter, \({h}_{it}\), will then converge to 1 for all \(i\) as \(t\to \infty\). As a result, the long-term cross-sectional variance of \({h}_{it}\), denoted by \({H}_{t}=(1/N){\sum }_{i=1}^{N}{\left({h}_{it}-1\right)}^{2}\), tends to zero as \(t\to \infty\).
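As a minimal sketch of Eq. (3) and the cross-sectional variance used later in Eq. (5), the following Python computes the relative transition paths and their variance for a small panel. The function names and the toy panel are hypothetical illustrations, not the authors' code or data.

```python
def relative_transition(panel):
    """Return h[i][t] = Y_it divided by the cross-sectional mean of Y at time t.

    panel[i][t] holds Y_it for country i at time t (all rows the same length).
    """
    n = len(panel)
    periods = len(panel[0])
    means = [sum(panel[i][t] for i in range(n)) / n for t in range(periods)]
    return [[panel[i][t] / means[t] for t in range(periods)] for i in range(n)]


def cross_sectional_variance(h):
    """Return H_t = (1/N) * sum_i (h_it - 1)^2 for every period t."""
    n = len(h)
    periods = len(h[0])
    return [sum((h[i][t] - 1.0) ** 2 for i in range(n)) / n for t in range(periods)]


# Toy panel (hypothetical values): three countries whose levels draw together.
toy_panel = [
    [2.0, 3.0, 4.0, 5.0],
    [4.0, 4.5, 5.0, 5.5],
    [6.0, 6.0, 6.0, 6.0],
]
h = relative_transition(toy_panel)
H = cross_sectional_variance(h)
```

In this toy panel the cross-sectional mean of each column of `h` is 1 by construction, and `H` shrinks from period to period because the three series move closer to their common average.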

It may be the case that the number of observations is smaller than the number of unknown parameters in a time-varying factor model. Hence, specific functional forms for \({\varphi }_{it}\) and \({\mu }_{t}\) are required to undertake parameter estimation. PS propose that \({\varphi }_{it}\) can be expressed as a semiparametric function given by:

$${\varphi }_{it}={\varphi }_{i}+{\sigma }_{i}{\xi }_{it}L{(t)}^{-1}{t}^{-\alpha }$$
(4)

where both \({\varphi }_{i}\) and \({\sigma }_{i}\) are constant over time, \({\xi }_{it}\) is a residual term that is \(iid(0,1)\), \(L(t)\) is a function whose value changes slowly over time, and \(\alpha\) captures the speed of convergence. The representation in Eq. (4) guarantees that \({\varphi }_{it}\) converges to \({\varphi }_{i}\) for all values \(\alpha \ge 0\). Thus, we can implement the convergence test by testing the following hypotheses:

$${H}_{0}:{\varphi }_{i}=\varphi \quad \text{and} \quad \alpha \ge 0$$

against the alternative of:

$${H}_{1}:{\varphi }_{i}\ne \varphi \quad \text{or} \quad \alpha <0$$

If we fail to reject \({H}_{0}\), the sample countries converge as a whole. Otherwise, either convergence clubs are formed or the sample countries are non-convergent (or divergent).
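As a brief sketch of why a regression on \(\mathrm{log}t\) is informative here, consider the large-\(t\) behaviour of the cross-sectional variance of the relative transition paths under the null hypothesis and the decay model in Eq. (4). This follows the asymptotic argument in PS; the constant \(A\) is notation introduced only for this illustration.

$${H}_{t}\sim \frac{A}{L{\left(t\right)}^{2}{t}^{2\alpha }}, \quad A>0$$

Taking logs and rearranging gives

$$\mathrm{log}\left(\frac{{H}_{1}}{{H}_{t}}\right)-2\mathrm{log}L\left(t\right)\approx \left(\mathrm{log}{H}_{1}-\mathrm{log}A\right)+2\alpha \mathrm{log}t$$

The slope on \(\mathrm{log}t\) is therefore approximately \(2\alpha\), so a significantly negative slope is evidence against convergence.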

We can test \({H}_{0}\) by estimating the following log-t regression given by:

$$\mathrm{log}\left(\frac{{H}_{1}}{{H}_{t}}\right)-2\mathrm{log}L\left(t\right)=\widehat{a}+\widehat{b}\mathrm{log}t+{\widehat{u}}_{t}$$
(5)

where \(t=\left[rT\right], \left[rT\right]+1,\dots ,T\) with \(r>0\), and \(r\) is set on the \(\left[0.2, 0.3\right]\) interval. The t-statistic, \({\widehat{t}}_{b}\), in Eq. (5) is estimated with heteroskedasticity and autocorrelation consistent standard errors. Since \(\widehat{b}=2\widehat{\alpha }\), the null hypothesis can be constructed as a one-sided test of \(\widehat{b}\ge 0\) against the alternative of \(\widehat{b}<0\). If \({\widehat{t}}_{b}<-1.65\), we reject the null hypothesis of convergence at the 5% level of significance. Finally, we implement the robust club clustering algorithm proposed by PS to identify convergence clubs in the panel of countries, as detailed in “Appendix B”.
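The log-t regression in Eq. (5) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: it assumes \(L(t)=\mathrm{log}(t+1)\), uses a Bartlett-kernel (Newey-West) variance with a rule-of-thumb lag length, and is run on synthetic series; the function name `log_t_regression` is hypothetical.

```python
import math


def log_t_regression(H, r=0.3):
    """Regress log(H_1/H_t) - 2*log(L(t)) on log(t) for t = [rT], ..., T.

    H is the list H_1, ..., H_T. Assumes L(t) = log(t + 1).
    Returns (b_hat, t_stat), where t_stat uses a Newey-West (HAC) standard error.
    """
    T = len(H)
    start = max(int(r * T), 1)  # first retained period, 1-based
    ts = list(range(start, T + 1))
    y = [math.log(H[0] / H[t - 1]) - 2.0 * math.log(math.log(t + 1)) for t in ts]
    x = [math.log(t) for t in ts]
    n = len(ts)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    # Score contributions (x_t - x_bar) * u_t used in the sandwich variance.
    g = [(xi - x_bar) * (yi - a - b * xi) for xi, yi in zip(x, y)]
    lags = int(4 * (n / 100.0) ** (2.0 / 9.0))  # assumed rule-of-thumb bandwidth
    s = sum(gi * gi for gi in g)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)  # Bartlett kernel weight
        s += 2.0 * weight * sum(g[j] * g[j - lag] for j in range(lag, n))
    se = math.sqrt(s) / sxx
    return b, b / se


# Synthetic series (not the paper's data): one decaying H_t and one growing H_t.
T = 100
H_conv = [(1 + 0.05 * math.sin(t)) / (math.log(t + 1) ** 2 * t ** 1.0)
          for t in range(1, T + 1)]
H_div = [(1 + 0.05 * math.sin(t)) * t ** 0.5 for t in range(1, T + 1)]
b_conv, t_conv = log_t_regression(H_conv)
b_div, t_div = log_t_regression(H_div)
```

For the decaying series the slope is close to 1 (that is, \(2\alpha\) with \(\alpha =0.5\)) and the test does not reject convergence; for the growing series the slope is negative and the t-statistic falls below the critical value of -1.65.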

3 Data

The COVID-19 pandemic panel dataset is obtained from Roser et al. (2020), who provide country profiles of coronavirus statistics. The main variables of interest in this study are the number of daily new confirmed COVID-19 cases (\({C19}^{C}\)) and the number of daily new confirmed COVID-19 deaths (\({C19}^{D}\)). To test the sensitivity of the results for these two variables, we also utilise the intensity of these measures; that is, daily new confirmed COVID-19 cases per million people (\({C19}^{CI}\)) and daily new confirmed COVID-19 deaths per million people (\({C19}^{DI}\)). We also utilise a host of country-specific variables contained within the COVID-19 pandemic panel dataset for a determinants analysis using the results from club formation (discussed in detail in Sect. 5).
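The intensity measures are simple rescalings of the raw counts. A minimal sketch follows; the population figure is hypothetical, and the use of `log1p` to handle zero-count days is an assumption, since the text states only that the measures are log-transformed.

```python
import math


def per_million(daily_count, population):
    """Convert a daily count into a count per million people."""
    return daily_count / population * 1_000_000


def log_measure(value):
    """Log-transform a non-negative measure.

    Assumption: log(1 + value) is used so that zero-count days are defined.
    """
    return math.log1p(value)


# 2,541 daily cases in a hypothetical country of 10 million people.
example_intensity = per_million(2541, 10_000_000)
example_log = log_measure(example_intensity)
```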

Based on country availability, we investigate the convergence of the COVID-19 pandemic for 155 countries that span all continents. Although the start of the pandemic has been debated, we begin the sample period on 11 March 2020, the date on which the WHO declared COVID-19 a global pandemic. Thus, our sample period spans 11 March 2020 to 1 August 2021.

Table 1 reports the descriptive statistics of the variables used in the analysis. The mean value and the maximum number of new cases in a day are reported. The maximum \({C19}^{C}\) is 414,188, which was recorded in one day in India during the sample period. Across the sample, the daily average of \({C19}^{C}\) is 2,541. The average \({C19}^{D}\) is about 54 deaths across the sample, while the maximum on any given day again occurred in India, where 7,374 deaths were reported. \({C19}^{CI}\) averages 90 per day and \({C19}^{DI}\) averages 4 per day.

Table 1 Descriptive statistics

We provide statistics for our COVID-19 variables based on ranking order for the 10 countries with the highest/lowest \({C19}^{C}\) in Table 2. Based on the sample period, the USA, India, Brazil, France, Russia, United Kingdom, Argentina, Turkey, Colombia, and Spain recorded the highest \({C19}^{C}\) (in descending order). In contrast, Mali, Belize, Lesotho, Burkina Faso, Congo, China, Hong Kong, Djibouti, South Sudan and Timor recorded the lowest \({C19}^{C}\). Apart from Hong Kong, the lowest case counts are recorded in developing countries and occur across all regions. The 10 countries reporting the highest \({C19}^{D}\) (in descending order) were the USA, Brazil, India, Mexico, Peru, Russia, United Kingdom, Italy, Colombia, and France. In contrast, the lowest \({C19}^{D}\) are recorded in Burkina Faso, Gabon, Djibouti, Togo, Andorra, Tajikistan, South Sudan, Seychelles, Singapore and Timor. Apart from Singapore, the lowest numbers of deaths occur in developing countries, and the majority of these countries are in the African region. Similar statistics are reported for \({C19}^{CI}\) and \({C19}^{DI}\). For instance, although \({C19}^{CI}\) is lowest in developing countries, some developing countries, such as Andorra and Seychelles, report a high intensity of cases. Figures 1, 2, 3 and 4 illustrate the trending behaviour of all four COVID-19 variables for the 10 countries reporting the highest numbers of cases and deaths during the sample period.

Table 2 High vs. low countries

Fig. 1 10 highest countries (\({C19}^{C}\)) (rolling seven-day average). Source: Johns Hopkins University CSSE COVID-19 Data and https://ourworldindata.org/coronavirus (data explorer)

Fig. 2 10 highest countries (\({C19}^{D}\)) (rolling seven-day average). Source: Johns Hopkins University CSSE COVID-19 Data and https://ourworldindata.org/coronavirus (data explorer)

Fig. 3 10 highest countries (\({C19}^{CI}\)) (rolling seven-day average). Source: Johns Hopkins University CSSE COVID-19 Data and https://ourworldindata.org/coronavirus (data explorer)

Fig. 4 10 highest countries (\({C19}^{DI}\)) (rolling seven-day average). Source: Johns Hopkins University CSSE COVID-19 Data and https://ourworldindata.org/coronavirus (data explorer)

4 Club convergence findings

Table 3 reports the results from the club convergence test for all four variables under investigation across the entire panel of countries. Based on the log-t test results, the null hypothesis of convergence of the panel of countries is rejected for all variables. In particular, we find no evidence of full panel convergence in \({C19}^{C}\) since \({\widehat{t}}_{b}=-56.009<-1.65\). Similarly, the null hypothesis of full panel convergence is rejected given that the values of \({\widehat{t}}_{b}\) for \({C19}^{D}\) (-33.532), \({C19}^{CI}\) (-43.694), and \({C19}^{DI}\) (-32.663) are all below the 5% critical value of -1.65.
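The one-sided decision rule can be checked directly against the full-panel statistics quoted above. A minimal sketch; the dictionary keys are shorthand labels introduced here for the four variables.

```python
CRITICAL_VALUE = -1.65  # one-sided 5% critical value used in the text

# Full-panel log-t statistics as reported in the text.
reported_t = {
    "C19_C": -56.009,
    "C19_D": -33.532,
    "C19_CI": -43.694,
    "C19_DI": -32.663,
}

# Convergence is rejected whenever the t-statistic falls below the critical value.
rejects_convergence = {name: t < CRITICAL_VALUE for name, t in reported_t.items()}
```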

Table 3 Club convergence results (log-t test)

With panel convergence rejected for all four variables, we now consider the possibility of sub-convergent clubs. Because the club convergence test may overestimate the actual number of club clusters, we undertake club merging analysis to determine if the merging of adjacent clubs can form larger club clusters. The results for the final club merging analysis for \({C19}^{C}\), \({C19}^{D}\), \({C19}^{CI}\), and \({C19}^{DI}\) are reported in Tables 5, 6, 7, and 8 of “Appendix A”. For ease of interpretation, we provide a visual map of the final club classifications for these variables.
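The idea behind merging adjacent clubs can be sketched as follows: the log-t test is re-run on the union of two neighbouring clubs, and the clubs are merged when convergence of the union is not rejected. This is a generic illustration, not a reproduction of the procedure in “Appendix B”; the function `merge_adjacent_clubs`, the stub test statistic, and the country labels are all hypothetical.

```python
def merge_adjacent_clubs(clubs, t_stat_of, critical=-1.65):
    """Merge neighbouring clubs whose union passes the log-t test.

    clubs: ordered list of lists of country labels.
    t_stat_of: callable returning the log-t t-statistic for a list of countries.
    """
    if not clubs:
        return []
    merged = []
    current = list(clubs[0])
    for next_club in clubs[1:]:
        candidate = current + list(next_club)
        if t_stat_of(candidate) > critical:
            current = candidate  # union still converges: keep growing it
        else:
            merged.append(current)  # close the current club, start a new one
            current = list(next_club)
    merged.append(current)
    return merged


def stub_t_stat(countries):
    """Stand-in for a real log-t test: only subsets of {A, B, C, D} converge."""
    return 0.5 if set(countries) <= {"A", "B", "C", "D"} else -5.0


initial_clubs = [["A", "B"], ["C", "D"], ["E"]]
final_clubs = merge_adjacent_clubs(initial_clubs, stub_t_stat)
```

With the stub statistic, the first two clubs merge and the third remains separate. In practice `t_stat_of` would compute the log-t statistic of Eq. (5) from the panel data for the listed countries.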

Regarding \({C19}^{D}\), we identify two convergent clubs and one non-convergent club as illustrated in Fig. 5. The first club consists of 105 countries. Most of these countries are in North and South America, North and Southern Africa, the Middle East, Asia, Russia, Mongolia, Pakistan, and India. The second club consists of 46 countries, including Australia, China, Kazakhstan, and countries in West Africa and Northern Europe. The two countries that are non-convergent are the Dominican Republic and Israel.

Fig. 5 Final club convergence map (\({C19}^{D}\)). Source: Authors’ own calculations

For \({C19}^{DI}\), we identify three convergent clubs as illustrated in Fig. 6. The first club consists of 83 countries, mainly in North and South America, Southern Africa, Middle East, Asia, Libya, Russia, Mongolia, Pakistan, India, and some East European countries. The second club consists of 10 countries (black coloured): Canada, El Salvador, Iraq, Israel, Kosovo, Luxembourg, Netherlands, Spain, Switzerland, and the United Kingdom. The third club consists of 60 countries (white in Fig. 6), which are low death intensity countries and tend to be based in Africa and Northern Europe. This club also includes Australia, China, and Kazakhstan.

Fig. 6 Final club convergence map (\({C19}^{DI}\)). Source: Authors’ own calculations

In relation to \({C19}^{C}\), we find four convergent clubs and one non-convergent club as illustrated in Fig. 7. The first club consists of 125 countries across all regions of the world. The second club consists of 13 countries: Bahamas, Bosnia and Herzegovina, China, Cote d'Ivoire, El Salvador, Ghana, Guinea, Lesotho, Luxembourg, Madagascar, Mali, Oman, and Sweden. The third club consists of 11 countries: Australia, Burkina Faso, Djibouti, Gabon, Haiti, Hong Kong, Kosovo, Singapore, Somalia, South Sudan, and Sudan. The fourth club consists of three countries (Cameroon, Congo, and Tajikistan), while one country belongs to the non-convergent group (Pakistan).

Fig. 7 Final club convergence map (\({C19}^{C}\)). Source: Authors’ own calculations

Concerning \({C19}^{CI}\), we find four convergent clubs and one non-convergent club as illustrated in Fig. 8. The first club consists of 115 countries, located in North America, South America, Southern Africa, Asia, Libya, Russia, Mongolia, India, and some European countries. The second club consists of Oman and Peru. The third club consists of 15 countries: Afghanistan, Algeria, Bosnia and Herzegovina, Djibouti, Egypt, El Salvador, Ethiopia, Gabon, Kosovo, Lesotho, Pakistan, Saudi Arabia, Senegal, Sweden, and Uzbekistan. The fourth club comprises 18 countries: Australia, Burkina Faso, Cameroon, Congo, Cote d'Ivoire, Democratic Republic of Congo, Ghana, Haiti, Hong Kong, Madagascar, Mali, Nigeria, Singapore, Somalia, South Sudan, Sudan, Taiwan, and Tajikistan. The last group consists of three non-convergent countries: China, Guinea, and Luxembourg.

Fig. 8 Final club convergence map (\({C19}^{CI}\)). Source: Authors’ own calculations

Finally, to further comprehend the behaviour of the COVID-19 pandemic across the globe, we sort the club convergence results into groups that show convergence across all four COVID-19 variables. These results are reported in Table 9. We group the countries that show club convergence in all four variables (including the possible combinations). For example, we identified 73 countries around the world that converge in \({C19}^{C}\), \({C19}^{D}\), \({C19}^{CI}\), and \({C19}^{DI}\). As illustrated in Fig. 9, the virus has had a significant effect around the world.

Fig. 9 Convergence map for countries that converge in all four COVID-19 variables. Source: Authors’ own calculations

5 On the determinants of COVID-19

5.1 Ordered logit model

Having identified convergence in the COVID-19 variables across countries, we now investigate the underlying factors behind the formation of high virus incidences for some countries. Since the explained variable is an ordinal number (a discrete variable), we utilise an ordered logit model developed by McKelvey and Zavoina (1975) to analyse the influencing factors, which is estimated via the maximum likelihood method, given by:

$${y}^{*}={\lambda }_{0}+{\lambda }_{1}{x}_{1}+{\lambda }_{2}{x}_{2}+\dots +{\lambda }_{i}{x}_{i}+\nu ={\lambda }^{^{\prime}}{\varvec{X}}+\nu$$
(6)

where \({y}^{*}\) is defined as the latent variable, \({\lambda }^{^{\prime}}\) is an \(i\times 1\) vector of estimated coefficients, and \(\nu\) is the white noise error term such that \(\left.\nu \right|{\varvec{X}}\sim Normal(0,1)\). The focus of this analysis is on the coefficients \({\lambda }_{i}\), whose signs indicate the direction of each determinant's influence on club convergence. Since the latent variable \({y}^{*}\) is not directly observable, it is measured through the explicit variable \(y\). In this case, \(y\) takes the values assigned to the observed clubs. Thus, given that the club algorithm identified various clubs, we assign variable \(y\) as follows:

(i) Club convergence for \({C19}^{DI}\):

\(y=1\) if country \(i\) belongs to club 3; \(y=2\) if country \(i\) belongs to club 2; \(y=3\) if country \(i\) belongs to club 1.

(ii) Club convergence for \({C19}^{CI}\):

\(y=1\) if country \(i\) belongs to club 4; \(y=2\) if country \(i\) belongs to club 3; \(y=3\) if country \(i\) belongs to club 2; \(y=4\) if country \(i\) belongs to club 1.
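The coding of \(y\) above, together with the category probabilities implied by an ordered model, can be sketched as follows. This is an illustration only: it assumes a logistic link, as the ordered logit label suggests, and the cutpoints and linear index are hypothetical values rather than estimates from the paper.

```python
import math

# Club number -> ordinal outcome y, following the assignment listed above.
Y_FROM_CLUB = {
    "C19_DI": {3: 1, 2: 2, 1: 3},
    "C19_CI": {4: 1, 3: 2, 2: 3, 1: 4},
}


def logistic_cdf(z):
    """Cumulative distribution function of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(index, cutpoints):
    """Return P(y = 1), ..., P(y = K) for a linear index and K-1 increasing cutpoints."""
    cumulative = [logistic_cdf(c - index) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


# Hypothetical cutpoints for the three death-intensity categories.
probs_low = ordered_logit_probs(0.0, [-1.0, 1.0])
probs_high = ordered_logit_probs(2.0, [-1.0, 1.0])
```

A larger linear index shifts probability mass towards the highest category, which under this coding corresponds to club 1, the high-intensity club.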

5.2 Determinants: variable selection

Regarding the vector of determinants, \({\varvec{X}}\) in Eq. (6), we utilise a host of explanatory variables to help explain the driving forces of club formation for the COVID-19 pandemic across countries. As with the main measures used in the club convergence analysis, all determinant variables are obtained from Roser et al. (2020).

Stringency: The government response stringency index, which is a composite variable based on nine response indicators including school closures, workplace closures, and travel bans, rescaled to a value from 0 to 100 (100 = strictest response). Research on infectious diseases finds that closures effectively reduce community transmission of infection (Rashid et al. 2015).

Population density: The number of people divided by land area, measured in square kilometres. During the COVID-19 pandemic, almost all countries imposed physical distancing interventions to limit the spread of the coronavirus. Evidence has found that implementation of any physical distancing intervention is associated with an overall reduction in COVID-19 incidence (Islam et al. 2020). However, how far people can be spatially separated is constrained by population density. Recently, Wong and Li (2020) find that population density effectively predicts cumulative infection cases.

Poverty: Share of the population living in extreme poverty. Several factors may increase exposure to COVID-19 for people from a low socio-economic background. For instance, economically disadvantaged people are more likely to live in overcrowded accommodation with limited access to personal outdoor space, and overcrowding will reduce compliance with social distancing. Poorer people are often employed in occupations that do not provide opportunities to work from home. In addition, people from low socio-economic groups may be more vulnerable to severe disease once infected because of higher levels of pre-existing illness. Also, cardiovascular disease, obesity, diabetes, and hypertension are risk factors for death from COVID-19. Individuals with low socio-economic status may not have easy access to healthcare to test for these underlying conditions (Whitehead et al. 2021).

Age65: Share of the population that is aged 65 years and older. From the beginning of the pandemic, it was evident that the elderly were at a higher risk of COVID-19 complications with higher hospitalisation rates, intensive care unit admissions, intubation, and death (Garg et al. 2020). Indeed, recent evidence now indicates that older people are at higher risk of COVID-19 mortality (Ho et al. 2020).

Female/Male smokers: Share of females/males who smoke. While the current evidence suggesting that people who smoke are at higher risk of COVID-19 is inconsistent, smoking increases the incidence, duration, and severity of viral respiratory infections and has also been found to increase the risk of pneumonia. Moreover, the hand-to-mouth action of smoking and e-cigarette use means that people who smoke may be more vulnerable to COVID-19, as they touch their face and mouth more often. Indeed, emerging evidence has shown that coronavirus may be spread via aerosols (Fennelly 2020), suggesting that the virus could be transmitted through exhaled tobacco smoke and e-cigarette aerosols. On the other hand, recent studies indicate that active smokers are underrepresented among patients with COVID-19, leading to claims that a ‘smoker’s paradox’ may exist in COVID-19, wherein smokers are protected from infection and severe complications of COVID-19 (Usman et al. 2020).

Handwashing: Share of the population with basic handwashing facilities on premises. Since the beginning of the pandemic, handwashing has been an important defence against the virus and one of the most effective forms of COVID-19 prevention (CDC 2020).

Gross domestic product per capita (GDPPC): GDPPC at purchasing power parity (constant 2011 international dollars). GDPPC is an important indicator that reflects the level of economic development and the income and consumption of residents (Barro 1991). However, the COVID-19 pandemic has brought new challenges for individuals, businesses, and governments worldwide, which may compromise the efforts to promote balanced development. Indeed, an extensive literature has shown that the Global Financial Crisis impacted economic growth worldwide. The economic growth effects of COVID-19 may pose even more significant challenges, particularly in weaker economies, where the problems of development and the capacity of response to unexpected challenges are more accentuated. Early research has shown that the COVID-19 pandemic has eliminated the signs of convergence in GDPPC across OECD nations and thus has potentially inhibited a globally balanced development process (Martinho 2021b).

Diabetes: Diabetes prevalence (% of population aged 20–79 years) in 2017. Recent evidence has shown that having either type 1 or 2 diabetes increases the risk that an individual will become more severely ill from COVID-19 and have worse outcomes, including higher mortality, than those without diabetes (Lim et al. 2021).

5.3 Results on the determinants of club convergence

The estimated results from Eq. (6) are reported in Table 4. For ease of interpretation, we report the average marginal effects with robust standard errors. Columns (1)–(4) report the determinants analysis for \({C19}^{DI}\), while columns (5)–(7) report the determinants analysis for \({C19}^{CI}\). In addition, the significance of the variables is examined by employing the baseline specification and the full model, which includes Diabetes prevalence. Given that endogeneity might bias the estimates, we employ the conditional mixed process estimator (CMP) as developed in Roodman (2011) to overcome this bias. The mixed process allows the estimation of both continuous and categorical dependent variables. Thus, an instrumental variable model is estimated, using the GDPPC of the USA as an instrument for the GDPPC of the entire sample in the ordered probit specification. The results are reported in columns (4) and (7). The results are similar across the models, but the magnitudes of the coefficients are smaller in the models addressing endogeneity.
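To make the notion of an average marginal effect concrete, the sketch below computes it analytically for a single regressor in an ordered logit. It assumes a logistic link and uses hypothetical values for the coefficient, cutpoints, and regressor; it is not the estimation behind Table 4, which also involves robust standard errors and the CMP estimator.

```python
import math


def logistic_pdf(z):
    """Density of the standard logistic distribution."""
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)


def average_marginal_effects(x_values, coef, cutpoints):
    """Average over the sample of dP(y = k)/dx for k = 1, ..., K.

    For an ordered logit with index coef * x, the derivative for category k is
    coef * [f(cut_{k-1} - coef * x) - f(cut_k - coef * x)], where the density f
    is taken to be zero at the outer (infinite) cutpoints.
    """
    k = len(cutpoints) + 1
    totals = [0.0] * k
    for x in x_values:
        index = coef * x
        dens = [0.0] + [logistic_pdf(c - index) for c in cutpoints] + [0.0]
        for cat in range(k):
            totals[cat] += coef * (dens[cat] - dens[cat + 1])
    return [total / len(x_values) for total in totals]


# Hypothetical sample, coefficient, and cutpoints.
ame = average_marginal_effects([-1.0, 0.0, 1.0, 2.0], coef=0.5, cutpoints=[-1.0, 1.0])
```

Because the category probabilities always sum to one, the marginal effects across categories sum to zero; with a positive coefficient the effect is negative for the lowest category and positive for the highest.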

Table 4 Determinants of COVID-19 club convergence

Based on the results from columns (1)–(4), the probability of belonging to a group with a higher death intensity rate (\({C19}^{DI}\)) increases with Poverty, Age65, and Female smokers. However, this probability is reduced for Male smokers. Death intensity falls with GDPPC and Population density. The provision of Handwashing facilities does not reduce the probability of belonging to a group with higher death intensity. Also, death intensity does not rise with Diabetes. However, at a higher level of Handwashing (95th percentile; column 3), the results show that death intensity falls with Handwashing and rises with Diabetes and Male smokers.

The results differ markedly between the death intensity (\({C19}^{DI}\)) and case intensity (\({C19}^{CI}\)) series. The results reveal that the probability of belonging to a group with higher case intensity increases with Stringency, Population density, GDPPC, and Female smokers. This probability is reduced for Age65 and Diabetes. The provision of Handwashing facilities does reduce the probability of belonging to a group with higher cases.

The results show that Stringency has a significant effect: the probability of belonging to a group with higher case intensity increases with it. Thus, the expected dampening effect of stringent anti-spread measures is not evident in the results. A percentage increase in policy stringency leads to a 0.39% increase in case intensity. Hence, while the stringency index is a strictness measure of ‘lockdown style’ policies, the panel results suggest that restricting people’s spatial behaviour may not have been as effective in curtailing the spread of the virus. Nonetheless, this result is based on a global sample that takes into account both developed and developing nations, including countries where lockdowns were not implemented or followed.

The results suggest that death intensity falls with Population density, but case intensity rises with it. A percentage increase in population density reduces death intensity by 0.2% but increases case intensity by 0.09%.

The effect of Age65 is statistically significant. A percentage increase in the share of the population aged 65 years and above leads to a 1.39% increase in death intensity. However, the same increase leads to a 0.41% decrease in case intensity. Poverty increases the probability of belonging to a group with higher death intensity. A percentage increase in poverty leads to a 0.34% increase in death intensity.

For a percentage increase in the proportion of Female smokers, death intensity and case intensity rise by 0.26% and 0.19%, respectively. Handwashing reduces case intensity by 0.07%, while death intensity rises with Handwashing. In contrast, death intensity falls by 0.12% when Handwashing is observed at a higher level. Death intensity falls by 0.18% under a percentage increase in GDPPC, but case intensity rises by 0.25%. Diabetes prevalence increases death intensity but reduces case intensity.

6 Conclusion and implications

In late 2019 and early 2020, the world witnessed the rapid spread of the COVID-19 virus, which infected millions of people worldwide and resulted in many deaths. COVID-19 was declared a pandemic on 11 March 2020 by the WHO and remains an ongoing pandemic at the time of writing. COVID-19 has caused economic and financial disruptions to global economies, consistent with those experienced during previous episodes of economic or financial crises. Alongside the research into the COVID-19 pandemic that is underway and that will continue for years to come, we offer a critical perspective into the spread of the virus by investigating the convergence patterns of COVID-19 around the world.

The empirical approach implements the Phillips and Sul (2007, 2009) clustering algorithm to examine the convergence patterns of new infection rates, new death rates, and their intensity measures (i.e. infection rates and death rates per million people) for a panel of 155 countries that have witnessed a rapid spread of the virus. This framework has the advantage of detecting multiple equilibria related to the groupings of countries while allowing for different convergence paths within these groupings, thus offering a unique insight into the spread of COVID-19. Equally important is that the club convergence approach is a nonlinear model with time-varying properties that can detect any disequilibria arising from the emergence of COVID-19 across countries.
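The log t regression that underlies this clustering algorithm can be illustrated with a minimal sketch. The code below is not the implementation used in this study; it assumes the standard form of the test, in which the cross-sectional variance of the relative transition paths, \(H_t\), enters the regression \(\log (H_1/H_t) - 2\log (\log t) = a + b\log t + u_t\) over the last \((1-r)T\) observations, and convergence is rejected at the 5% level when the one-sided t-statistic on \(b\) falls below -1.65. The trimming fraction \(r = 0.3\), the Bartlett-kernel bandwidth rule, the function name `log_t_test` and the two synthetic panels are all assumptions made for illustration.

```python
import numpy as np


def log_t_test(panel, r=0.3, hac_lags=None):
    """Sketch of a log t convergence regression.

    panel: array of shape (T, N); rows are time periods, columns are countries.
    Returns (b, t_b): the slope on log t and its HAC t-statistic.
    """
    X = np.asarray(panel, dtype=float)
    T, _ = X.shape

    # Relative transition paths: each country relative to the panel mean.
    h = X / X.mean(axis=1, keepdims=True)
    # Cross-sectional variance of the transition paths at each date.
    H = ((h - 1.0) ** 2).mean(axis=1)

    # Discard the first r*T periods; log(log t) also requires t >= 2.
    t = np.arange(1, T + 1)
    keep = t >= max(int(np.floor(r * T)), 2)
    y = np.log(H[0] / H[keep]) - 2.0 * np.log(np.log(t[keep]))
    Z = np.column_stack([np.ones(keep.sum()), np.log(t[keep])])

    # OLS estimates and residuals.
    beta = np.linalg.lstsq(Z, y, rcond=None)[0]
    u = y - Z @ beta

    # Newey-West (Bartlett kernel) covariance; the bandwidth rule is an assumption.
    n = len(y)
    if hac_lags is None:
        hac_lags = int(np.floor(4.0 * (n / 100.0) ** (2.0 / 9.0)))
    g = Z * u[:, None]
    S = g.T @ g
    for j in range(1, hac_lags + 1):
        w = 1.0 - j / (hac_lags + 1.0)
        G = g[j:].T @ g[:-j]
        S += w * (G + G.T)
    bread = np.linalg.inv(Z.T @ Z)
    V = bread @ S @ bread

    b = beta[1]
    return b, b / np.sqrt(V[1, 1])


# Synthetic panels (illustrative only): four series over 100 periods.
periods = np.arange(1, 101)[:, None]
gaps = np.array([-0.5, -0.25, 0.25, 0.5])
converging = 1.0 + gaps / periods          # gaps shrink over time
diverging = 1.0 + 0.001 * gaps * periods   # gaps widen over time

b_conv, t_conv = log_t_test(converging)
b_div, t_div = log_t_test(diverging)
print("converging panel: b =", b_conv, "t =", t_conv)
print("diverging panel:  b =", b_div, "t =", t_div)
```

In the clustering procedure, a regression of this kind is applied first to the full panel and then to candidate subgroups, so that countries for which the null of convergence is not rejected are collected into the same club.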

We utilised four important COVID-19 measures to assess convergence in the spread of the virus across countries: the number of new COVID-19 cases, death rates, and intensity measures based on new cases and death rates per million people. The results indicate that the null hypothesis of full (panel) convergence across the sample countries is rejected for all four COVID-19 measures, suggesting that the convergence process of the virus is unequal across countries.

However, the results identify sub-convergent clubs. In particular, we identify four convergent clubs, which span almost all continents across the globe, and one non-convergent club for new COVID-19 cases. For new death rates, we find two convergent clubs and one divergent club. The first club comprises countries mainly from North and South America, North and Southern Africa, the Middle East, Asia, Russia, Mongolia, Pakistan, and India. The second club comprises Australia, China, Kazakhstan, West African countries, and a few North European countries. Similar results are reported when we utilised the intensity measures. For instance, case intensity is highest in North America, South America, Southern Africa, Asia, Libya, Russia, Mongolia, India, and some European countries. In sum, while we find evidence of COVID-19 converging worldwide, the convergence process has been inequitable, with countries or regions clustered into several groups.

Given that we identified sub-convergent groups in the COVID-19 measures across countries, we extended the analysis to explore the potential drivers of the observed club formations by undertaking a determinants analysis. We linked the club convergence results to key variables to model this process, which helps to explain how countries converge into higher steady states of COVID-19.

The results from the determinants analysis reveal that the probability of belonging to a group with higher death intensity of COVID-19 increases with the share of the population aged over 65 years and with the share of female smokers. The provision of handwashing facilities reduces the likelihood of belonging to a group with higher death intensity. Also, death intensity falls with economic development (GDPPC) and population density. In addition, a larger share of the population living in extreme poverty raises the probability of belonging to a group with higher death intensity. However, the findings differ in several respects when the case intensity series is considered. The results reveal that the probability of belonging to a group with higher case intensity increases with policy stringency, population density, economic development, and female smokers. In comparison, the likelihood of contracting the virus is reduced for those aged 65 years and over and for diabetic patients. The provision of handwashing facilities reduces the probability of belonging to a group with higher case intensity.