Introduction

Human societies have repeatedly suffered from pandemics, such as the Justinian Plague of 541, the Black Death of 1347, and the Third Pandemic of 1894, all associated with catastrophic demographic consequences. At least three main factors account for the recurrence of epidemics in human history:

First, climate-related factors, such as temperature cooling (Lee, 2014; Lee et al., 2017; Tian et al., 2017) and hydro-climatic extremes (Lee et al., 2016; Schmid et al., 2015; Xu et al., 2011, 2014; Yue & Lee, 2020a), disrupted agricultural practices and the ecological environment, triggered the migration of people and rodents in the affected regions, and thereby facilitated the transmission of epidemics. Second, the downturn in economic well-being due to changes in grain prices, which affected the supply of subsistence resources, reduced the nutritional level of the human population and subsequently increased its susceptibility to epidemics (Duncan et al., 1997; Pei et al., 2015; Yue & Lee, 2020b). Third, there was mutual reinforcement among positive checks such as famines, wars, and epidemics during overpopulated periods (Lee, 2018, 2019; Malthus, 1798). These population checks interacted with and reinforced one another, aggravating the spread of epidemics and their associated demographic impacts. Previous empirical studies have quantitatively verified the above factors individually. Yet no study has incorporated all of them and quantitatively analyzed their relative importance for the outbreak of epidemics in history. It remains uncertain which factor was most important in accounting for past outbreaks of epidemics.

The period of late imperial China overlaps partly with the Third Pandemic, which also resulted in substantial demographic catastrophes (Xu et al., 2011). Based on fine-grained historical datasets and statistical methods, this study seeks to test the validity of the above-mentioned hypotheses by analyzing the country-level outbreak of epidemics in China during that time. The associated findings may provide insight into the factors responsible for the long-lasting and catastrophic social impact of the Third Pandemic.

Methodology and Data

Study Area and Study Period

To fulfill this study’s research scope and data requirement, I took China as the study area. Also, I set the study period as 1841–1911 to align with all datasets’ time-span and geographic coverage.

Data

I employed the following datasets, which are the best available data in academia to date, in this study. Figure 1 and Table 1 present the employed datasets, and their descriptions are provided below:

Fig. 1

Time series of the variables employed in this study: A Hydro-climatic extreme, B famine, C war, D economic fluctuation, and E epidemics

Table 1 Descriptive statistics of the variables employed in this study

Epidemics data – I collected the epidemics data from the following sources: (1) Collection of Meteorological Records in China over the Past Three Thousand Years (Zhang, 2013), (2) Historical Records of Infectious Diseases in China (Li, 2004), and (3) Epidemic Records in Historical China (Zhang, 2007). The above data sources are derived from official dynastic histories and local chronicles but compiled independently by different scholars. Each of the data sources provides detailed information of historical Chinese epidemics incidents such as the year of onset (in the calendar year), the location of the outbreak, quotations describing the epidemics incidents, and the sources from historical documents.

This study only considers those contagious human endemics and epidemics that could infect a massive population in a short period via breathing, food, or human contact. In historical documents, those infectious diseases are usually termed yi or yili. I took the following steps to process the epidemics data. First, I selected the records containing the word yi or yili from each of the epidemics data sources. Second, I cross-checked the onset year and the outbreak location of the epidemics in the selected records to identify duplications. Any duplicated records were removed. Third, I counted the epidemics by the number of counties affected by epidemics in a year, which better reflects the geographic coverage and the magnitude of the epidemics (Lee et al., 2016, 2017). There are 1402 epidemic incidents in the study period (i.e., 1841–1911).
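The three processing steps can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (source, year, county, text), the romanized keyword matching, and the sample records are all hypothetical, and the original sources would be matched on Chinese characters rather than on romanized tokens.

```python
from collections import defaultdict

# Hypothetical record layout: (source, year, county, text). Not the study's data.
records = [
    ("Zhang2013", 1850, "County A", "da yi"),
    ("Li2004", 1850, "County A", "yi"),          # same event reported by a second source
    ("Zhang2007", 1850, "County B", "yili"),
    ("Zhang2013", 1851, "County A", "drought"),  # not an epidemic record
    ("Li2004", 1851, "County C", "yi"),
]

KEYWORDS = {"yi", "yili"}

def count_epidemic_counties(records, start=1841, end=1911):
    """Return {year: number of distinct counties with an epidemic record}."""
    affected = defaultdict(set)
    for _source, year, county, text in records:
        if not (start <= year <= end):
            continue
        # Step 1: keep records that mention yi or yili (whole-token match).
        if not any(token in KEYWORDS for token in text.split()):
            continue
        # Step 2: a set per year drops duplicates of the same (year, county).
        affected[year].add(county)
    # Step 3: the yearly count is the number of affected counties.
    return {year: len(counties) for year, counties in sorted(affected.items())}

counts = count_epidemic_counties(records)
print(counts)
```

Keying duplicates on the (year, county) pair is a simplification; the cross-checking described above may have used additional information from the quotations.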

Other scholars have employed other data sources of historical epidemics (Chen, 1986; Sun, 2004) to examine climate change's influence on epidemics in historical China. However, those sources contain far fewer epidemic records than the ones I employed in this study. Hence, they were not considered here. On the other hand, given the limited medical knowledge of the past, the historical literature could not provide full descriptions of the nature, magnitude, root cause, and duration of the diseases concerned. Hence, the records could not be directly compared with modern epidemiological records (Pei et al., 2015). Due to the inherent constraints of the historical epidemics data sources, I can only assume that all epidemic records (yi or yili) in the literature are major epidemics that caused significant loss of life.

Hydro-climatic extreme data – I collected the hydro-climatic extreme (flood and drought) data from the Yearly Charts of Dryness/Wetness in China for the Last 500-year Period (Chinese National Meteorological Administration, 1981) and the Yearly Charts of Dryness/Wetness in Northwestern China for the Last 500-year Period (AD1470–2008) (Bai et al., 2010). The former contains the dryness/wetness grade series of 120 sites in China in AD1470–1979. The latter provides the updated dryness/wetness grade series of the 12 sites and the additional dryness/wetness grade series of seven new sites in northwestern China. Both of the yearly charts follow a 5-point grading system to describe local hydro-climatic conditions, in which value one represents extremely wet, and value five represents extremely dry. The grading system is mainly based on the statistical evaluation of the hydro-climatic conditions recorded in local gazettes and other historical documents to assign each site’s dryness/wetness grades. Flood and drought could facilitate the transmission of epidemics significantly (Xu et al., 2011). Hence, I counted the number of sites with data whose dryness/wetness grade is one or five in any given year (Lee, 2018) to quantify the hydro-climatic extreme incidents. I found a total count of 1140 hydro-climatic extreme incidents in the study period.
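The counting rule for hydro-climatic extremes can be illustrated as follows. The nested-dictionary layout and the site names are assumptions made for this sketch; `None` stands for a site-year without data.

```python
def count_extremes(grades_by_site):
    """Count, per year, the sites whose dryness/wetness grade is 1 or 5.

    grades_by_site: {site: {year: grade or None}} (hypothetical layout).
    """
    counts = {}
    for series in grades_by_site.values():
        for year, grade in series.items():
            counts.setdefault(year, 0)
            if grade in (1, 5):  # 1 = extremely wet, 5 = extremely dry
                counts[year] += 1
    return dict(sorted(counts.items()))

# Toy grades for three sites over two years (not the study's data).
grades = {
    "site_1": {1841: 1, 1842: 3},
    "site_2": {1841: 5, 1842: None},  # missing observation in 1842
    "site_3": {1841: 2, 1842: 5},
}
extreme_counts = count_extremes(grades)
print(extreme_counts)
```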

Famine data – Famines are always hard to objectively define. Therefore, I took cannibalism caused by starvation as a famine’s indicator (Lee, 2018; Xiao et al., 2015). I collected cannibalism data from the Collection of Meteorological Records in China over the Past Three Thousand Years (Zhang, 2013), a multi-volume compendium containing comprehensive historical records of various natural disasters and human catastrophes. Like the epidemics data, I counted the number of counties with cannibalism in a year to measure the frequency of famines (Lee, 2018, 2019). There is a total of 145 famine incidents during the study period.

War data – I collected the war data from a multi-volume compendium entitled Tabulation of Wars in Historical China (Editorial Committee of Chinese Military History, 1985). The compendium contains very detailed information on the armed conflicts (including the onset year, location, involved parties, and so on) in Chinese history between 800 BC and AD 1911. In this study, I measured the frequency of wars by the number of battles (Lee, 2018, 2019). There is a total count of 583 battles during the study period.

Economic fluctuation data – The ratio of the annual change of GDP deflator derived from Ma and de Jong (2019) was used to reflect the general price level fluctuations. The deflator was compiled based on wheat and rice prices, the export and import prices of coal, domestic gold prices, raw cotton, raw silk, and tea, covering the economy’s main pillars, such as agriculture, industry, and services. The price of gold represents both industrial products (manufacturing, mining, and construction) and services (commercial activities, finance and housing, other professional services, and government) (AD1912 = 1). To date, Ma and de Jong’s (2019) work provides the most fine-grained indicator of the general price level of historical China in academia.

Statistical Analysis

I applied two statistical methods in this study, which are introduced below:

Multiple linear regression – This method explains the variance in an interval dependent variable based on the linear combinations of interval, dichotomous, or dummy independent variables. The main functions of the multiple linear regression are: (1) assess the strength of the association between one dependent variable and multiple independent variables; (2) measure the proportion of variance in the dependent variable explained by the regression; and (3) estimate the relative importance of the various independent variables to the solution (De Vaus, 2002). Here I employed this method to investigate the association between epidemics outbreaks and hydro-climatic extremes, famines, wars, and economic fluctuations. Before the regression analysis, I log-transformed the epidemics, hydro-climatic extreme, famine, war, and economic fluctuation data to stabilize their variance as required (Curran-Everett, 2018).
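As a rough illustration of the regression step, the sketch below log-transforms synthetic count series and fits an ordinary least squares model with NumPy. It is not the study's actual workflow: the data are simulated, the function name `fit_ols` is hypothetical, and the use of log(1 + x) to keep zero-count years defined is an assumption, since the text does not say how zeros were handled.

```python
import numpy as np

def fit_ols(y, X):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    centred = y - y.mean()
    r_squared = 1.0 - (resid @ resid) / (centred @ centred)
    return beta, r_squared

# Synthetic yearly counts standing in for the real series (NOT the study's data).
rng = np.random.default_rng(0)
n_years = 71  # 1841-1911
extremes = rng.poisson(16, n_years)
famines = rng.poisson(2, n_years)

# Assumption: log(1 + x), so that years with a zero count remain defined.
log_extremes = np.log1p(extremes)
log_famines = np.log1p(famines)
log_epidemics = (0.5 + 0.8 * log_extremes + 0.3 * log_famines
                 + rng.normal(0, 0.1, n_years))

beta, r_squared = fit_ols(log_epidemics,
                          np.column_stack([log_extremes, log_famines]))
print("intercept, b_extremes, b_famines:", np.round(beta, 3))
print("R^2:", round(float(r_squared), 3))
```

The fitted coefficients describe the strength of each association, and R^2 gives the proportion of variance explained, which corresponds to functions (1) and (2) listed above.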

Wavelet coherence analysis – Wavelet analysis is robust at analyzing non-stationary systems (i.e., the frequency content in the systems changes over time), which is implemented via the local time-scale decomposition of the signal embedded in time-series data (Cazelles et al., 2007, 2008; Torrence & Compo, 1998). The continuous wavelet transform can decompose the time series into both time and frequency components. The wavelet power spectrum can graphically show the time-series variance in the time–frequency domain (Cazelles et al., 2008; Grinsted et al., 2004). Wavelet coherency is a kind of correlation that detects common time-localized oscillations at various frequencies (periodic cycles) between a pair of time series. Here I employed wavelet coherence analysis to reveal the possible non-stationary association between epidemics and various driving factors in their periodic cycles.
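To make the idea of a time-localized phase relationship concrete, the sketch below computes Morlet wavelet coefficients of two synthetic series at a single period and reads the phase of their cross-wavelet product. It is deliberately simplified and is not full wavelet coherence: there is no smoothing in time or scale, no significance test against red noise, and no cone of influence. The function name, the two test signals, and the sign convention (a positive phase meaning the first series leads) are assumptions of this sketch.

```python
import numpy as np

def morlet_coefficients(x, period, dt=1.0, omega0=6.0):
    """Complex Morlet wavelet coefficients of x at one Fourier period."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Wavelet scale for the requested Fourier period (Morlet, omega0 = 6).
    scale = period * (omega0 + np.sqrt(2.0 + omega0 ** 2)) / (4.0 * np.pi)
    t = np.arange(len(x)) * dt
    coeffs = np.empty(len(x), dtype=complex)
    for i in range(len(x)):
        u = (t - t[i]) / scale
        psi = np.pi ** -0.25 * np.exp(1j * omega0 * u) * np.exp(-0.5 * u ** 2)
        coeffs[i] = np.sqrt(dt / scale) * np.sum(x * np.conj(psi))
    return coeffs

# Two synthetic 6-year cycles; the second repeats the first half a year later.
years = np.arange(1841, 1912)
period, true_lead = 6.0, 0.5
first = np.cos(2 * np.pi * years / period)
second = np.cos(2 * np.pi * (years - true_lead) / period)

# Cross-wavelet product: its angle is the phase difference at this period.
cross = morlet_coefficients(first, period) * np.conj(morlet_coefficients(second, period))
interior = slice(15, -15)  # skip the edges, where the wavelet is truncated
mean_phase = float(np.angle(cross[interior].sum()))
print("mean phase (radians):", round(mean_phase, 3))
print("expected 2*pi*lead/period:", round(2 * np.pi * true_lead / period, 3))
```

In practice a dedicated wavelet package would be used instead of this loop; the sketch only shows where a phase angle between two series comes from.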

Results

The outbreak of a new epidemic incident may be the extension of past epidemic incidents (Pei et al., 2015). Such a temporal association may cause auto-correlated errors in statistical analysis. It is necessary to include an autoregressive term in the regression models to address this issue (Hamilton, 1994). I employed the autocorrelation function (ACF) and the partial autocorrelation function (PACF) to check the autocorrelation of the epidemics data (Harvey, 1993). The ACF and PACF plots and autoregressive (AR) modeling show that the epidemics data have a significant positive correlation with their first lag (p < 0.05) (Fig. A1 and Tables A1–A5). Therefore, I adopted an AR(1) model and included the frequency of epidemics in the previous time step (epidemics at t−1) in my regression to control for the autocorrelated errors in the epidemics data. The ACF and PACF plots of the AR(1) model's residuals, which show no significant correlation (Fig. A2), confirm the model's validity. My basic multiple regression results show that epidemics were positively correlated with hydro-climatic extremes (p < 0.05), famines (p < 0.05), and economic fluctuations (p < 0.01), while wars were not correlated with epidemics (Tables A6–A9).
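The autocorrelation check can be sketched as below. The series is simulated from an AR(1) process with an arbitrary coefficient of 0.6, so the numbers say nothing about the real epidemics data; the approximate 95% bound of 1.96/sqrt(n) is the usual large-sample rule of thumb, and the PACF is not computed here.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = d @ d
    return np.array([1.0] + [(d[k:] @ d[:-k]) / denom
                             for k in range(1, max_lag + 1)])

# Simulated AR(1) series standing in for the epidemics counts (NOT real data).
rng = np.random.default_rng(1)
n = 71
series = np.zeros(n)
for t in range(1, n):
    series[t] = 0.6 * series[t - 1] + rng.normal()

r = acf(series, 5)
bound = 1.96 / np.sqrt(n)  # approximate 95% band for a white-noise series
for lag in range(1, 6):
    flag = "outside" if abs(r[lag]) > bound else "inside"
    print(f"lag {lag}: r = {r[lag]: .3f} ({flag} the +/-{bound:.3f} band)")

# AR(1) coefficient: slope from regressing the series on its own first lag.
phi, intercept = np.polyfit(series[:-1], series[1:], 1)
print("estimated AR(1) coefficient:", round(float(phi), 3))
```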

Fig. 2

Wavelet coherency analyses of the phase and the frequency between epidemics outbreak and various factors in China in 1841–1911. A Hydro-climatic extreme and epidemics. B Famine and epidemics. C War and epidemics. D Economic fluctuation and epidemics. The black contour shows significant periodicities (p < 0.05) against red noise. The legend indicates coherence values ranging from dark blue (low) to dark red (high). The statistically significant phase relationships are indicated by the black arrows. The region outside the cone of influence, where edge effects might distort the picture, is shaded

As the mean of epidemics frequency may have increased or decreased over time, I also included the time-trend variable (year) and its squared term (year²) in the regression models to verify whether the above statistical findings were still valid (Galloway, 1986). Results show that hydro-climatic extremes (p < 0.05), famines (p < 0.05), and economic fluctuations (p < 0.01) were still positively correlated with epidemics, while wars were not associated with epidemics (Tables A10–A13).

Between 1851 and 1880, China was extremely overpopulated in terms of subsistence pressure (Lee, 2014), and the population subsequently fell from 436.1 to 364.5 million (Cao, 2000). Therefore, I included a dummy variable, Crisis, in the multivariate epidemic regression and checked whether the relationship would be distorted during the crisis period. The Crisis dummy is equal to one for 1851–1880 and zero for the remainder of the study period, indicating the presence or absence of a crisis. The results were that hydro-climatic extremes (p < 0.05), famines (p < 0.05), and economic fluctuations (p < 0.01) were still positively correlated with epidemics, while wars were not correlated with epidemics (p > 0.05) (Tables A14–A17).

I then included the AR(1) term, the time trends, and the Crisis dummy together in the regression models and checked whether hydro-climatic extremes, famines, wars, and economic fluctuations still significantly affected epidemics. The results were that hydro-climatic extremes (p < 0.05), famines (p < 0.05), and economic fluctuations (p < 0.01) were still positively correlated with epidemics, while wars were not correlated with epidemics (p > 0.05) (Tables 2, 3, 4 and 5).
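A minimal sketch of how the controls could be assembled into one design matrix is given below. The helper `build_design` is hypothetical, the data are simulated, and shifting the year so that it starts at zero is an assumption made for numerical stability; the shift changes only the intercept and trend coefficients, not the coefficient on the driver.

```python
import numpy as np

def build_design(years, epidemics, driver, crisis=(1851, 1880)):
    """Response and design matrix with AR(1), trend, and Crisis controls."""
    years = np.asarray(years)
    epidemics = np.asarray(epidemics, dtype=float)
    driver = np.asarray(driver, dtype=float)
    y = epidemics[1:]                          # epidemics at time t
    lagged = epidemics[:-1]                    # epidemics at t-1 (AR(1) term)
    t = (years[1:] - years[0]).astype(float)   # shifted year index (assumption)
    crisis_dummy = ((years[1:] >= crisis[0]) &
                    (years[1:] <= crisis[1])).astype(float)
    X = np.column_stack([np.ones(len(y)), lagged, driver[1:],
                         t, t ** 2, crisis_dummy])
    return y, X

# Simulated series standing in for the log-transformed data (NOT real data).
years = np.arange(1841, 1912)
rng = np.random.default_rng(2)
driver = rng.normal(size=len(years))
epidemics = np.zeros(len(years))
for i in range(1, len(years)):
    epidemics[i] = 0.5 * epidemics[i - 1] + 0.4 * driver[i] + rng.normal(scale=0.2)

y, X = build_design(years, epidemics, driver)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
names = ["intercept", "epidemics_lag1", "driver", "year", "year^2", "crisis"]
for name, value in zip(names, coef):
    print(f"{name:>15}: {value: .4f}")
```

The first year is dropped because it has no preceding observation for the lagged term, which leaves 70 usable years out of 71.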

Table 2 Estimates of the effect of hydro-climatic extreme on epidemics in late imperial China, with the influence of autocorrelation, time trends, and crisis controlled
Table 3 Estimates of the effect of famine on epidemics in late imperial China, with the influence of autocorrelation, time trends, and crisis controlled
Table 4 Estimates of the effect of war on epidemics in late imperial China, with the influence of autocorrelation, time trends, and crisis controlled
Table 5 Estimates of the effect of economic fluctuation on epidemics in late imperial China, with the influence of autocorrelation, time trends, and crisis controlled

Moreover, I included hydro-climatic extremes, famines, wars, and economic fluctuations as independent variables in a single regression model. I have also run collinearity statistics to check for any problems with multi-collinearity in the model. The tolerance and variance inflation factor (VIF) values of the major independent variables are > 0.2 and < 5 (Table A18). No problem with multi-collinearity has been identified (De Vaus, 2002). Regression results show that only economic fluctuations had a significant positive effect on epidemics (p < 0.05), while the influence of other factors (i.e., hydro-climatic extremes, famines, and wars) on epidemics was no longer significant (p > 0.05) (Table 6).
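The collinearity check can be illustrated as follows: each predictor is regressed on the others, tolerance is 1 − R² of that auxiliary regression, and VIF is its reciprocal. The function name and the simulated predictors are assumptions of this sketch; the thresholds are the ones quoted above (tolerance > 0.2, VIF < 5).

```python
import numpy as np

def vif_and_tolerance(X):
    """Return a list of (VIF, tolerance) pairs, one per column of X."""
    X = np.asarray(X, dtype=float)
    results = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        centred = target - target.mean()
        tolerance = (resid @ resid) / (centred @ centred)  # equals 1 - R^2
        results.append((1.0 / tolerance, tolerance))
    return results

# Simulated, mildly correlated predictors (NOT the study's data).
rng = np.random.default_rng(3)
a = rng.normal(size=71)
b = 0.5 * a + rng.normal(size=71)
c = rng.normal(size=71)
stats = vif_and_tolerance(np.column_stack([a, b, c]))
for name, (vif, tol) in zip(["a", "b", "c"], stats):
    ok = vif < 5 and tol > 0.2
    print(f"{name}: VIF = {vif:.2f}, tolerance = {tol:.2f}, acceptable = {ok}")
```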

Table 6 Estimates of the effect of hydro-climatic extreme, famine, war, and economic fluctuation on epidemics in late imperial China, with the influence of autocorrelation, time trends, and crisis controlled

I further looked into the relative phase relationship between two time series. The direction of the arrows on the wavelet coherence plots indicates the relative phase relationship between the two time series. When a pair of time series are in-phase or positively correlated, the arrows point to the right. When a pair of time series are anti-phase or negatively correlated, the arrows point to the left. The arrows also show the phase lead-lag relationship. The arrows point down when the first variable leads the second variable. The arrows point up when the first variable lags the second variable (Grinsted et al., 2004). Between the hydro-climatic extreme and the epidemics time series, I found coherence in ~ 4-year periodicities in 1873–1881 and in ~ 8–16 year periodicities in 1892–1911 (Fig. 2A). The arrows point right, meaning that the hydro-climatic extremes and the epidemics are in-phase and positively correlated. There were also out-of-phase ~ 4-year periodicities in 1905–1911. There was coherence in ~ 3–6 year periodicities between the famine and the epidemics time series in 1873–1885 (Fig. 2B). Wars cohered with epidemics in ~ 4–8 year periodicities in 1861–1868. However, there was a negative association in ~ 4-year periodicities in 1900–1907 (Fig. 2C).

The economic fluctuations cohered with epidemics in ~ 4–8-year periodicities in 1841–1870, 1877–1884, and 1899–1901. There was coherence between economic fluctuations and epidemics in ~ 2–4 year periodicities in 1857–1868. The coherence in ~ 12–16-year periodicities was continuous throughout the study period, except 1878–1887. The 1878–1887 period coincides with the population trough – the sudden liberation from the Malthusian trap. There was also coherence in > 16-year periodicities in 1841–1882 (Fig. 2D).

Briefly, wavelet coherence analysis results show that economic fluctuations are more important than natural disasters and positive checks in driving the epidemic outbreak (Fig. 2), which matches the regression results (Tables 2–6). To further address the temporality of the linkage (i.e., the cause must happen before the effect) between economic fluctuations and epidemics, I also computed the phase relationship for two specific periodic cycles (~ 6 years and ~ 12 years) of their coherency. For the ~ 6-year cycles, the average phase angle is 0.402 θ, in which economic fluctuations led epidemics by 0.374 years. For the ~ 12-year cycles, the average phase angle is 1.594 θ, in which economic fluctuations preceded epidemics by 2.965 years.
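The conversion from a phase angle to a lead time can be written out explicitly. The sketch assumes the reported angles are in radians, which the text does not state, and uses the standard relation lead = (phase / 2π) × period. With nominal periods of exactly 6 and 12 years this gives roughly 0.38 and 3.04 years, close to but not identical with the reported 0.374 and 2.965 years; the reported values are reproduced with periods of about 5.85 and 11.69 years, which is an inference of this sketch rather than something the source states.

```python
import math

def phase_to_lead_years(phase_rad, period_years):
    """Time by which one series leads the other, given a phase angle in radians."""
    return phase_rad / (2.0 * math.pi) * period_years

def implied_period(phase_rad, lead_years):
    """Period that makes a given phase angle correspond to a given lead time."""
    return lead_years * 2.0 * math.pi / phase_rad

# Phase angles and leads as reported in the text; the radian unit is assumed.
for label, phase, reported_lead, nominal in [("~6-year", 0.402, 0.374, 6.0),
                                             ("~12-year", 1.594, 2.965, 12.0)]:
    lead_nominal = phase_to_lead_years(phase, nominal)
    period = implied_period(phase, reported_lead)
    print(f"{label}: lead at nominal period = {lead_nominal:.3f} years, "
          f"period implied by reported lead = {period:.2f} years")
```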

As natural disasters and social turmoil could have a fundamental impact on the economy, I also checked whether various positive checks drove the economic fluctuations. The results show that the influence of positive checks on economic fluctuations was sporadic. Hydro-climatic extremes cohered with economic fluctuations in ~ 6–8-year periodicities in 1877–1886 (Fig. 3A). Famines cohered with economic fluctuations in ~ 5–8-year periodicities in 1877–1890 and in < 4-year periodicities in 1898–1905 (Fig. 3B). Wars cohered with the economic fluctuations in ~ 8-year periodicities in 1868–1875 and ~ 4-year periodicities in 1849–1852 (Fig. 3C).

Fig. 3

Wavelet coherency analyses of the phase and the frequency between economic fluctuation and various social turmoil in China in 1841–1911. A Hydro-climatic extreme and economic fluctuation. B Famine and economic fluctuation. C War and economic fluctuation. The black contour shows significant periodicities (p < 0.05) against red noise. The legend indicates coherence values ranging from dark blue (low) to dark red (high). The statistically significant phase relationships are indicated by the black arrows. The region outside the cone of influence, where edge effects might distort the picture, is shaded

Discussion and Conclusion

The key findings from the multiple regression and wavelet coherence analyses are as follows. For China in 1841–1911, with climate-related factors, economic well-being, and Malthusian checks taken into account simultaneously, I found that economic fluctuations drove the waves of epidemic outbreaks at the inter-annual and decadal time-scales. My findings are consistent with previous research showing that economic factors were critical catalysts in triggering epidemic outbreaks in the past, whether in historical China (Pei et al., 2015) or pre-industrial Europe (Yue & Lee, 2020b). On the other hand, epidemics could also undermine the agrarian economy's stability (Voigtländer & Voth, 2013). Therefore, the tie between economic systems and epidemics might have reinforced itself to form a vicious cycle, making their association invariable. This mechanism might have sustained the repeated waves of epidemic outbreaks during the Third Pandemic period.

Nevertheless, it is worth mentioning that there was substantial and significant interannual and decadal coherency between economic fluctuations and epidemics in 1857–1868 (Fig. 2D), which should be explained further. The amplitude of economic fluctuations in 1857–1868 is the strongest throughout the study period (Fig. 1D), marked by the huge fluctuations of general price levels, especially grain prices (Peng, 1965). In the 1850s–1860s, significant setbacks were also happening in the industrial and service sectors (Ma & de Jong, 2019). Wavelet coherence results show that the economic fluctuations are not related to hydro-climatic extremes, famines, or wars in this specific period (Fig. 3). The possible reason for the fluctuations is the expansionary monetary policies adopted during the reign of Xianfeng (c. 1850–1861) (i.e., the issue of big coins, paper notes, and iron money), which bumped up the money supply to overcome the government’s fiscal deficit. As a result, hyperinflation took place and drastically diluted the wealth of the common people (Chen, 1958; He, 2018). On the other hand, the third plague pandemic started in China in 1855. The paper note scheme was abolished in 1862 when a new emperor was enthroned (He, 2018), and the inflation slowed down afterward. Given that a large portion of the Chinese population was still living at the subsistence level during the time, those economic fluctuations might have engendered severe hardship and malnourishment across societies (Lee, 2014; Ma & de Jong, 2019). Such a situation further increased the underprivileged social groups’ susceptibility to epidemics (Yue & Lee, 2020b).

Limited by scanty historical economic data, previous studies only used specific grain prices, such as rice or wheat, as a proxy for economic well-being to explain epidemics (Duncan et al., 1993, 1997; Pei et al., 2015). However, in the late imperial era, which coincides with the Third Pandemic period, the influence of grain prices on economic well-being was weakened by industrialization. Solitary use of grain prices may not fully capture the impact of economic well-being on outbreaks of epidemics. The present study has overcome this limitation by using a more holistic indicator of economic well-being (i.e., fluctuations in price indices in the agricultural, industrial, and service sectors). It has further supported the impact of economic well-being on facilitating the Third Pandemic in late imperial China.

Previous studies indicate the long-term impact of climate-related variables on epidemics (Lee et al., 2017; McMichael, 2012; Tian et al., 2017; Xu et al., 2011; Yue & Lee, 2020a). However, I found that the hydro-climatic extremes did not significantly affect the epidemic outbreak when other factors' influence was controlled (Table 6). The disparity in findings may be attributed to the fact that the climatic effects on epidemics have been indirect and sporadic and have materialized through economic changes (Pei et al., 2015; Yue & Lee, 2020b; Zhang et al., 2011). Also, the different time-spans of this and the previous studies matter because the ranges of ecological and social responses to climate change are subject to the length of the study periods (Tian et al., 2017) and their associated temporal scales (Orlove, 2005; Pei et al., 2020). The time-span of this study may be too short to reveal the full climatic influence on outbreaks of epidemics, which is found to be apparent at long-term scales (McMichael, 2012). Besides, this study focuses on the late imperial era, in which the impact of climate on human societies could have been weakened by improved social resilience and institutional measures for alleviating harvest failures (Hao et al., 2020; Lee & Zhang, 2010; Xiao et al., 2018).

Thomas Malthus (1798) observed, with reference to historical China, that famines, wars, and epidemics, termed positive checks, were mutually reinforcing during a subsistence crisis. Such a conception has become a traditional view. Yet, as shown in Fig. 2, in China during the crisis period (1851–1880), only famines were associated with epidemics, while there was no coherence between wars and epidemics. Perhaps the interconnection among positive checks cannot be entirely applied to late imperial China.

Although economic fluctuation is the primary driver of the epidemic outbreaks, my wavelet coherency results show that economic fluctuations come with different cycles. They are driven by various factors over different periods (Fig. 3), suggesting that the underlying causes of economic fluctuations may be complicated. Thus, instead of merely correlating economic fluctuations with epidemics, the question of what caused economic fluctuations should not be overlooked. It should be noted that this study takes the whole of China as an aggregate. Subject to local and regional influences, there may be different catalysts of epidemic outbreaks at lower geographic levels (Lee et al., 2017). Furthermore, as the globe has become more interconnected over time, the role of regional environmental factors (Xu et al., 2019) as well as trade and transport routes (Schmid et al., 2015; Yue et al., 2016, 2017) in facilitating the geographic transmission of diseases should not be overlooked. Besides, how economic fluctuations shape individual and collective human behaviors, and subsequently the human population's vulnerability to pandemics, should be fully addressed in future research.

Finally, COVID-19 has brought global and regional economic recession. Some countries adopt expansionary monetary policies to revitalize their national economic well-being. However, according to my findings, if such monetary policies cause drastic price fluctuations and if the effective vaccine for COVID-19 is still unavailable, such a situation may further reinforce the spread of the disease in the global village.