Background

Mumps, caused by the mumps virus (MuV), is an acute respiratory infectious disease characterized by swelling of the parotid gland, localized pain, and fever [1]. It is transmitted mainly by droplets and direct contact, and patients are infectious from 7 days before to 9 days after parotid gland enlargement [2]. Many countries have reported cases of mumps, and China is one of the areas with a high incidence [4]. From 2004 to 2013, the overall incidence of mumps in China reached 24.2/100,000, and the reported incidence among children aged 5–9 years was 118.2–281.4/100,000 [4]. Although the incidence has been decreasing in recent years, some groups remain at high risk of infection, for example children in gathering places and people living in crowded conditions with poor sanitation [5].

The incidence of mumps differs markedly by region and season. In tropical regions, mumps occurs all year round, whereas in temperate regions it is common mainly in spring and winter [3]. The global climate is changing dramatically, and extreme weather is becoming more frequent. Exploring the impact of climate factors on the incidence of mumps and predicting its incidence therefore help to control outbreaks and epidemics of mumps [6]. The autoregressive integrated moving average (ARIMA) model is often used in medicine to explore the temporal pattern of disease development. The ARIMAX model, a multivariate time series model, adds other variables related to the series under study as input variables in order to make more accurate predictions [7,8,9,10]. In this study, ARIMA models and ARIMAX models with meteorological factors were established, and the relationship between mumps and meteorological factors in different regions of China was explored. The comparison of the two models provides a reference for the prevention and control of mumps and for the effective prediction of other infectious diseases in China.

Methods

The sources of data

Epidemic surveillance data on mumps from January 2006 to December 2016 for the provinces, municipalities directly under the Central Government, and autonomous regions of China were collected from the Public Health Science Data Center of the China Disease Prevention and Control Information System (http://www.phsciencedata.cn/Share). The surveillance coverage of mumps was unchanged from 2006 to 2016, and all case reports were based on a confirmed diagnosis of mumps [4]. In China, all medical institutions, Centers for Disease Control and Prevention (CDC), and blood collection and supply institutions are responsible reporting units for infectious diseases. A doctor in a school, community or hospital who diagnoses a case of mumps for the first time should submit a report card; the staff of the infectious disease management department should then report the case directly on the network or send the report card to the local CDC within the prescribed time. Mumps is legally classified as a Class C infectious disease in China, which means that the report should be submitted within 24 h of a confirmed diagnosis.

The demographic data of the different regions were obtained from the National Bureau of Statistics. National meteorological monitoring data, including average precipitation (mm), average air pressure (hPa), average temperature (°C), average relative humidity (%), minimum and maximum temperature (°C), days with daily precipitation ≥0.1 mm and maximum wind speed (m/s), were obtained from the National Meteorological Information Center (http://data.cma.cn).

Data preprocessing

Considering the great climatic differences across the vast territory of China, this study divided the Chinese mainland into seven regions according to the administrative division criteria: North China (includes Beijing, Tianjin, Hebei, Shanxi, Inner Mongolia), Northeast China (includes Liaoning, Jilin, Heilongjiang), East China (includes Shanghai, Jiangsu, Zhejiang, Anhui, Fujian, Jiangxi, Shandong), Central China (includes Henan, Hubei, Hunan), Southwest China (includes Chongqing, Sichuan, Guizhou, Yunnan, Tibet), Northwest China (includes Shaanxi, Gansu, Qinghai, Ningxia, Xinjiang), and South China (includes Guangdong, Guangxi, Hainan) (Supplement 1).

The incidence of mumps was calculated by dividing the number of reported cases per month by the average population of the same period in the region, and the meteorological factors in different regions were described by the average monthly data of all observation points in the region.
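The calculation described above can be illustrated with a minimal sketch. The function names, the scaling to cases per 100,000 population, and all numbers are assumptions made for illustration; they are not taken from the study data.

```python
from statistics import mean


def monthly_incidence(cases, population, per=100_000):
    """Reported cases in one month divided by the region's average
    population, scaled to cases per `per` people (the scale is an assumption)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * per


def regional_monthly_mean(station_values):
    """Average one meteorological variable over all observation
    points in a region for a given month."""
    return mean(station_values)


# Hypothetical figures, for illustration only
rate = monthly_incidence(cases=1_500, population=60_000_000)
temperature = regional_monthly_mean([12.0, 14.5, 13.1])
```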

ARIMA model

The mumps and meteorological data of each region were divided into two parts by time. The first part, from January 2006 to December 2015, was used to fit the model of mumps incidence; the second part, from January to December 2016, was used to evaluate the prediction effect of the optimal model.

The model can be expressed as ARIMA (p, d, q) × (P, D, Q)s, where d and D are the orders of ordinary and seasonal differencing, the data conversion methods used to transform the original time series into a stationary time series; p and q are the orders of autoregression and moving average in the non-seasonal part of the model, respectively; P and Q are the orders of autoregression and moving average in the seasonal part of the model, respectively; and the subscript "s" is the length of the seasonal period. In this research, s = 12.
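For reference, the multiplicative seasonal model is commonly written in backshift notation as shown below. This is the standard textbook form rather than an equation given in this paper; the symbols B (backshift operator), x_t (the possibly transformed incidence series) and ε_t (white noise) are introduced here for illustration, and the sign convention for the moving average coefficients varies between software packages.

```latex
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,x_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t ,
\qquad s = 12,

\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}, \qquad
\theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^{q},

\Phi_P(B^{s}) = 1 - \Phi_1 B^{s} - \dots - \Phi_P B^{Ps}, \qquad
\Theta_Q(B^{s}) = 1 - \Theta_1 B^{s} - \dots - \Theta_Q B^{Qs}.
```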

The process of building ARIMA model of mumps was as follows [11]:

  1. Model stationarization: A time series graph of the mumps incidence data was drawn. When the series was non-stationary, it was made stationary by data conversion such as ordinary differencing, seasonal differencing, or logarithmic or exponential transformation of the original data.

  2. Model recognition: The autocorrelation function (ACF) and partial autocorrelation function (PACF) of mumps incidence were plotted, and the order of the model and the range of p, d, P, D were estimated from the characteristics of the graphs.

  3. Parameter estimation: The least squares method was used to estimate the parameters of the autoregressive and moving average processes, and the significance of the estimated parameters was tested at α = 0.05.

  4. Model testing: First, the residuals between the observed values and the fitted values were calculated to form the residual sequence. Second, the Ljung-Box Q test (LBQ test) was used to determine whether the residual sequence was white noise, which would mean that the model had sufficiently extracted the information in the data. Third, the optimal model was determined according to the Schwarz Bayesian criterion (SBC): the smaller the SBC value, the better the fit of the model.

  5. Model prediction: The incidence of mumps in different regions from January to December 2016 was predicted, and the prediction effect of the model was judged by comparison with the actually reported incidence.
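Some of the building blocks of steps 1, 2 and 4 can be sketched in plain Python. This is an illustrative sketch only: the analysis in this study was carried out in SPSS, the function names are hypothetical, and the series is synthetic. The Ljung-Box statistic uses the usual definition Q = n(n + 2) Σ r_k² / (n − k), which is then compared with a chi-square distribution.

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag]; lag=1 is an ordinary
    difference, lag=12 a seasonal difference for monthly data."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(series)
    m = sum(series) / n
    dev = [x - m for x in series]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("series is constant")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic of a residual sequence."""
    n = len(residuals)
    r = acf(residuals, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k)
                             for k, rk in enumerate(r, start=1))


# Synthetic monthly series with trend and seasonality (not study data)
incidence = [2.0 + 0.01 * t + math.sin(2 * math.pi * t / 12)
             for t in range(36)]
logged = [math.log(x) for x in incidence]                    # log transform
stationary = difference(difference(logged, lag=1), lag=12)   # d = 1, D = 1
autocorrelations = acf(stationary, max_lag=6)
```

In practice, `ljung_box_q` would be applied to the residuals of a fitted model, and a Q value below the chi-square critical value (P > 0.05) would be read as consistent with white noise.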

ARIMAX model

The ARIMAX model is an extension of ARIMA modelling that incorporates explanatory independent variables. An ARMAX model can simply be regarded as a multiple regression with one or more AR and MA terms [12]. A univariate time series model was first established for each individual meteorological variable to obtain the corresponding white noise residual sequence. Based on the cross-correlation function (CCF) of the white noise residuals, the meteorological factors that affect the incidence of mumps were identified, and the optimal lag time was obtained. The selected meteorological factors were then incorporated into the previously determined time series model to construct the multivariate time series ARIMAX model. The optimal ARIMAX model was determined according to the minimum SBC. Lastly, the prediction effects of the ARIMA and ARIMAX models were compared using the relative error between the actual and predicted incidence.
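The lag selection step can be sketched as follows. This is a simplified illustration, not the procedure implemented in SPSS: the function names are hypothetical, the inputs are assumed to be already filtered (prewhitened) series of equal length, and the bound 1.96/√n is the usual large-sample approximation of the 95% confidence limit.

```python
import math


def cross_correlation(x, y, lag):
    """Sample correlation between x[t - lag] and y[t] for lag >= 0,
    where x is the meteorological series and y the mumps series."""
    n = len(x)
    if len(y) != n:
        raise ValueError("series must have equal length")
    if not 0 <= lag < n:
        raise ValueError("lag out of range")
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("series is constant")
    cov = sum((x[t - lag] - mx) * (y[t] - my) for t in range(lag, n))
    return cov / (sx * sy)


def significant_lags(x, y, max_lag=12):
    """Lags 0..max_lag whose |CCF| exceeds the approximate 95% bound."""
    bound = 1.96 / math.sqrt(len(x))
    result = {}
    for k in range(max_lag + 1):
        r = cross_correlation(x, y, k)
        if abs(r) > bound:
            result[k] = r
    return result


# Toy series in which y repeats x two steps later (not study data)
x = [0, 0, 1, 0, 0, 0]
y = [0, 0, 0, 0, 1, 0]
lags = significant_lags(x, y, max_lag=5)
```

In this toy example only lag 2 exceeds the bound, which mirrors how a lag would be read off a CCF chart.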

Statistical software

The original database was established in Microsoft Excel 2010, and statistical analysis was performed with IBM SPSS Statistics 23.0.

Results

Epidemiological characteristics of mumps in China from 2006 to 2016

From 2006 to 2016, 3.2 million cases of mumps were reported in China. The average monthly incidence rate peaked in 2012 and then declined year by year. The peak seasons of mumps in China were spring and winter, which accounted for 30.00% and 30.06% of the yearly total, respectively. Mumps occurred in all provinces of China every year, and cases were concentrated in regions other than North China (Table 1).

Table 1 Epidemiological characteristics of mumps in China from 2006 to 2016

ARIMA model

The original series of mumps incidence in the different regions were non-stationary time series (Supplement 2). They were converted into stationary time series by logarithmic transformation, ordinary differencing or seasonal differencing (Fig. 1). The model parameters for each region were first estimated from the characteristics of the time series, ACF and PACF diagrams (Figs. 1 and 2). Models were then fitted with p, q, P and Q set to 0, 1 and 2 in turn, and the model with the smallest SBC was taken as the optimal ARIMA model for each region (Table 2). The LBQ test showed that the residuals of the models in all regions met the white noise condition (P > 0.05), which means that the optimal models extracted the information in the data sufficiently (Table 2).

Fig. 1 The stationary time series of mumps incidence in different regions after conversion

Fig. 2 The ACF and PACF diagrams of models in different regions

Table 2 The optimal ARIMA model in different regions of China

When the average monthly incidence of mumps from January to December 2016 was predicted with the ARIMA model, the predicted values were broadly in agreement with the actual incidence values. All the actual values were within the 95% confidence interval of the predicted values, and the average relative error was 15.57% (Table 3).

Table 3 The prediction effect of ARIMA and ARIMAX model in different regions

ARIMAX model

The optimal ARIMA model of each meteorological factor series was used to filter both the differenced meteorological series and the differenced mumps series, and the cross-correlation function (CCF) between each meteorological factor and the mumps incidence series was then calculated (Fig. 3). A CCF value beyond the confidence interval at a given lag indicates that the incidence of mumps is related to that meteorological factor at that lag. Considering lags of 0–12, we found that the incidence of mumps was correlated with average precipitation (lag 5 or 8), average air pressure (lag 2) and minimum temperature (lag 0) in North China; with average precipitation (lag 6) in East China; with maximum wind speed (lag 10) in Central China; with average air pressure (lag 10 or 11), average relative humidity (lag 10), minimum temperature (lag 8) and maximum temperature (lag 3) in Southwest China; with maximum wind speed (lag 9) in Northwest China; and with average precipitation (lag 6), average air pressure (lag 10) and maximum wind speed (lag 1, 11 or 12) in Northeast China. No meteorological factor was related to the incidence of mumps in South China (Fig. 3, Table 4).

Fig. 3 Cross correlation analysis between meteorological factors and mumps incidence in different regions

Table 4 Relationship between mumps and meteorological factors in different regions

The ARIMAX models were established by incorporating the statistically significant meteorological factors into the ARIMA models. The LBQ test showed that the residuals of the ARIMAX models in all regions met the white noise condition (P > 0.05) (Supplement 3). When the incidence of mumps from January to December 2016 was predicted with the ARIMAX model, the predicted values were broadly consistent with the actual incidence values. All the actual values were within the 95% confidence interval of the predicted values, with an average relative error of 10.87%, which was lower than that of the ARIMA model (Table 3).
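The comparison of average relative errors can be illustrated with a short sketch. It assumes that the relative error of each month is |predicted − actual| / actual, averaged over the 12 months and expressed as a percentage; the paper does not spell out the formula, and the numbers below are hypothetical rather than values from Table 3.

```python
def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / actual, as a percentage.
    The exact definition used in the study is an assumption."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(a <= 0 for a in actual):
        raise ValueError("actual incidence must be positive")
    errors = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors) * 100


# Hypothetical monthly incidence values (not study data)
actual = [2.0, 4.0]
arima_forecast = [2.2, 3.0]
arimax_forecast = [2.1, 3.6]

arima_error = mean_relative_error(actual, arima_forecast)
arimax_error = mean_relative_error(actual, arimax_forecast)
```

A lower value for one model on the same hold-out months would be read, as in the text above, as a better prediction effect.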

Discussion

With rising global temperatures and more frequent extreme weather events, it is important to study and predict the impact of meteorological factors on the incidence of diseases. Mumps is an acute infectious disease that has a great impact on the physical and mental health of adolescents in China. In this study, we found that precipitation, air pressure, temperature, and wind speed had an effect on the incidence of mumps in most regions of China, which is consistent with studies in Japan and Taiwan [12, 13]. Temperature and precipitation may affect the survival environment and transmission routes of pathogens, as well as the exposure opportunities and sensitivity of susceptible populations [12, 14, 15]. Warm and humid weather is conducive to virus reproduction and evolution, and in warm weather children are more likely to go outdoors, which may increase the possibility of infection [12, 15]. In most regions of China, air pressure was negatively correlated with the incidence of mumps. A possible explanation is that low air pressure means thin air, which lowers the partial pressure of oxygen in the blood and in turn reduces the resistance of the human body [16]. Higher wind speed may accelerate the flow of virus aerosols and expand the area over which mumps spreads; at low pressure and high wind speed, the virus spreads easily and causes infection [12, 15, 16].

Mumps occurred in all provinces of China every year, and cases were concentrated in regions other than North China. However, the incidence of mumps in North and Southwest China was more susceptible to climate factors, which is related to the meteorological characteristics of the different regions. In East, South and Central China, rainfall is abundant and the weather is warm and humid; in the west of China, the air pressure is low and the air is thin, so these areas were more susceptible to mumps [17]. In North and Southwest China, temperature and wind speed usually change rapidly and extreme weather is more frequent, so the incidence there was more susceptible to the influence of meteorological factors.

When the incidence of mumps from January to December 2016 was predicted with the ARIMA model, the predicted values were broadly in agreement with the actual incidence values, with an average relative error of 15.57%. When the ARIMAX model was established by incorporating the statistically significant meteorological factors into the ARIMA model, the predicted values were also broadly in agreement with the actual incidence values, with an average relative error of 10.87%, which was lower than that of the ARIMA model. By taking meteorological factors into account, the ARIMAX model could better simulate and predict the incidence of mumps in China, which has reference value for the prevention and control of mumps.

Although the two models in this study (ARIMA and ARIMAX) showed a good predictive effect on the incidence of mumps in China, they did not take economic and demographic factors into account. In future studies, we will try to integrate more factors that may affect the incidence of mumps into the models for a comprehensive analysis. In addition, the ARIMA and ARIMAX models rely on historical epidemic data of mumps. As new cases occur, the parameters of the models should be adjusted continually to improve the sensitivity and accuracy of prediction, so that the research results stay close to the actual work of prevention and control [14, 18].

Conclusions

Precipitation, air pressure, temperature and wind speed had an impact on the incidence of mumps in most regions of China, and the incidence of mumps in North and Southwest China was more susceptible to climate factors. With meteorological factors taken into account, the average relative error of the ARIMAX model was lower than that of the ARIMA model. The ARIMAX model could better simulate and predict the incidence of mumps in China, which has reference value for the prevention and control of mumps.