Periodicity of outbreak phenomena
Because only three outbreak sites are located in the Southern Hemisphere, and the source data record 0 cases for them, all outbreak sites can be treated as lying in the Northern Hemisphere. The annual trend of global H5N1 outbreaks was therefore compiled according to the Northern Hemisphere seasons (Fig. 2). As shown in Fig. 2, during the 15-year period from 2005 to 2019, the number of global H5N1 outbreaks rose and then fell three times: in 2005–2009, in 2010–2014, and in 2015–2019. These data indicate that global H5N1 outbreaks recur approximately every 5 years. To further characterize the number of H5N1 outbreaks in each season, a box plot of the seasonal outbreak counts was drawn (Fig. 3). Spring has the highest number of outbreaks, followed by winter, summer, and autumn, accounting for 46.72%, 37.55%, 8.98%, and 6.75% of the total, respectively. The median number of outbreaks in spring and winter is also greater than in the other seasons, indicating that spring and winter are the peak seasons for H5N1 outbreaks. In addition, the outliers and the 1.5 × IQR range are larger in spring and winter, and the upper and lower limits of the outbreak counts differ widely, indicating that the number of outbreaks in these two seasons is highly variable.
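The seasonal shares and the box plot limits described above can be reproduced with a few lines of Python. The following is a minimal sketch under stated assumptions: the month-to-season mapping uses meteorological seasons (the text does not define the season boundaries), and the month list and yearly counts are hypothetical values, not the study data.

```python
from collections import Counter
from statistics import quantiles

# Assumption: meteorological seasons of the Northern Hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}

def seasonal_shares(months):
    """Return the percentage of outbreaks that fall in each season."""
    counts = Counter(SEASON_BY_MONTH[m] for m in months)
    total = sum(counts.values())
    return {season: 100.0 * n / total for season, n in counts.items()}

def box_plot_fences(values):
    """Return the (lower, upper) fences given by the 1.5 * IQR rule."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical outbreak months (one entry per outbreak).
months = [1, 2, 3, 3, 4, 4, 5, 7, 10, 12]
shares = seasonal_shares(months)

# Hypothetical spring outbreak counts for six years.
yearly_spring_counts = [10, 12, 14, 16, 18, 120]
low, high = box_plot_fences(yearly_spring_counts)
outliers = [v for v in yearly_spring_counts if v < low or v > high]
```

Values outside the fences are the points a box plot draws as outliers, so a season with a wide fence interval and distant outliers is one whose counts are unstable.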
Spatial distribution
The temporal analysis above identified three outbreak periods in the 15 years from 2005 to 2019, so the spatial distribution of global H5N1 outbreaks was analyzed separately for each of these periods (Fig. 4a, b, c). Global H5N1 outbreaks are distributed across four continents (Africa, Asia, Europe, and the Americas) and occurred mainly in Africa, Asia, and Europe.
The number of outbreaks on each continent differed markedly across the three periods (Fig. 5). In the first period (Fig. 5a), the number of H5N1 outbreaks in Asia, Europe, and Africa was very high. In 2006, H5N1 appeared in Africa, and the number of outbreaks increased sharply to a 15-year peak of 765. In the second period (Fig. 5b), almost all H5N1 outbreaks occurred in Asia: over the 5 years, the number of outbreaks in Asia was 26 (89.66%), 178 (100%), 98 (100%), 108 (100%), and 46 (90.20%), respectively. In the third period (Fig. 5c), H5N1 broke out mainly in Africa, where the number of outbreaks was very high in 2015 and 2016, with a combined total of 494 outbreaks in the two years. Overall, the outbreak ranking statistics (Fig. 5d) show that Africa had the highest number of outbreaks over the 15-year period, followed by Asia, Europe, and finally the Americas. These four continents accounted for 50.39%, 37.15%, 9.70%, and 0.06% of all outbreaks, respectively.
Spatial autocorrelation analysis of the number of H5N1 outbreaks was performed for the three periods, and a Moran’s I scatter plot was obtained for each period (Fig. 6). As shown in Fig. 6, the Moran’s I values of all three periods are greater than 0, indicating that global H5N1 outbreaks are positively spatially autocorrelated and that the outbreak sites tend to cluster. The Moran’s I values differ among the three periods, indicating that the degree of clustering is not stable and changes over time. In addition, to check the reliability of the autocorrelation results, a significance test of Moran’s I was performed. The standardized Z statistics for 2005–2009, 2010–2014, and 2015–2019 are 1.9883, 4.9683, and 2.0221, respectively, and all p values are less than 0.05, indicating that the Moran’s I results are statistically significant.
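For readers who wish to see how a global Moran’s I and its Z statistic are obtained, a minimal sketch follows. It assumes a user-supplied spatial weight matrix and the analytical variance under the normality assumption; the text does not state which weights or which variance formula were used, and the four-site example is hypothetical.

```python
import math

def morans_i(values, weights):
    """Return (I, z) for global Moran's I; z uses the normality assumption."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    i_stat = (n / s0) * cross / sum(d * d for d in dev)

    expected = -1.0 / (n - 1)  # expectation of I under no autocorrelation
    s1 = 0.5 * sum((weights[i][j] + weights[j][i]) ** 2
                   for i in range(n) for j in range(n))
    s2 = sum((sum(weights[i]) + sum(weights[j][i] for j in range(n))) ** 2
             for i in range(n))
    variance = ((n * n * s1 - n * s2 + 3 * s0 * s0)
                / ((n * n - 1) * s0 * s0)) - expected ** 2
    return i_stat, (i_stat - expected) / math.sqrt(variance)

def two_sided_p(z):
    """Two-sided p value of a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical example: four sites on a line, neighbours share a weight of 1.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_stat, z = morans_i([1, 2, 3, 4], chain_weights)

# Two-sided p values for the three Z statistics reported in the text.
reported_p = [two_sided_p(v) for v in (1.9883, 4.9683, 2.0221)]
```

In the toy example the values increase steadily along the chain, so neighbouring sites resemble each other and I is positive. Applying `two_sided_p` to the three reported Z statistics gives values below 0.05, consistent with the significance stated above.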
To further characterize the aggregation of H5N1, multidistance spatial agglomeration analysis was used to determine the scale of H5N1 aggregation in the three periods. The actual K and expected K were calculated with Formula 2. As the spatial scale increases, the aggregation of H5N1 first strengthens and then weakens (Fig. 7). In all three periods there is a range of scales over which the actual K exceeds the expected K: 0–500 km during 2005–2009, 0–7500 km during 2010–2014, and 0–6500 km during 2015–2019. These results indicate that the distribution characteristics vary with both the time period and the spatial scale. Overall, H5N1 outbreaks still exhibit weak aggregation at scales close to 5000 km.
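The comparison of actual and expected K values can be illustrated with a short sketch. It assumes that the analysis corresponds to Ripley’s K function in its common L(d) transformation, for which the expected value under complete spatial randomness equals the distance d itself; Formula 2 is not reproduced here, so this may differ from the formula actually used. The sketch also assumes planar coordinates, no edge correction, and hypothetical points.

```python
import math

def observed_l(points, distances, area):
    """Observed L(d) = sqrt(A * (ordered pairs within d) / (pi * n * (n - 1)))."""
    n = len(points)
    result = []
    for d in distances:
        pairs = sum(
            1
            for i in range(n)
            for j in range(n)
            if i != j and math.dist(points[i], points[j]) <= d
        )
        result.append(math.sqrt(area * pairs / (math.pi * n * (n - 1))))
    return result

# Hypothetical sites in a 100 x 100 study area: one tight cluster, one remote site.
points = [(10, 10), (11, 10), (10, 11), (11, 11), (90, 90)]
distances = [2, 150]
observed = observed_l(points, distances, area=100 * 100)

# Aggregation at scale d is indicated when observed L(d) exceeds the expected value d.
aggregated = [obs > d for obs, d in zip(observed, distances)]
```

In this toy layout the observed value exceeds the expected value at the small scale but not at the large one, which mirrors the pattern of aggregation that holds only up to a limiting distance.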
Correlation between the outbreaks of H5N1 and environmental factors
In this paper, 12 environmental factors are selected: elevation (DEM), leaf area index (LAI), air pressure (AP), specific humidity (SH), surface radiation (SR), rainfall (RA), snow (SN), air temperature (AT), wind speed (WS), road distance (ROD), railway distance (RAD), and water distance (WAD). Here, the road distance, railway distance, and water distance are the distances from each “outbreak” or “nonoutbreak” site to the nearest road, railway, and water body (rivers, lakes, and reservoirs), respectively. The factor values at the “outbreak” and “nonoutbreak” sites were then entered into a binary logistic regression model to analyze the relationship between H5N1 outbreaks and environmental factors. The collinearity test (the resulting figure can be found in supplementary material S1) showed that the tolerance between surface radiation and air temperature is less than 0.1 and the variance inflation factor (VIF) is greater than 10, so surface radiation was excluded. The remaining 11 factors were first analyzed by regression, and the significant factors were then entered into a multifactor binary logistic regression. The multifactor regression results (Table 1) show that elevation, rainfall, and snowfall are unrelated to H5N1 outbreaks. Although air pressure is significant, its odds ratio (OR) is 1 and its coefficient is 0, so air pressure is also regarded as unrelated to outbreaks. The other factors are significantly related to H5N1 outbreaks, and the strength of their association ranks as follows: WS > LAI > AT > SH > WAD > ROD > RAD.
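The two screening rules used above, the tolerance/VIF threshold and the interpretation of an odds ratio of 1, can be sketched as follows. This is a simplified illustration: it computes the tolerance for a single pair of predictors (1 minus the squared Pearson correlation), whereas a full VIF regresses each predictor on all the others, and the input values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def tolerance_and_vif(x, y):
    """Pairwise tolerance (1 - r^2) and VIF (1 / tolerance)."""
    tolerance = 1.0 - pearson_r(x, y) ** 2
    return tolerance, 1.0 / tolerance

def odds_ratio(coefficient):
    """Odds ratio of a logistic regression coefficient: OR = exp(beta)."""
    return math.exp(coefficient)

# Hypothetical, strongly correlated predictor values.
surface_radiation = [100, 150, 200, 250, 300, 350]
air_temperature = [5.1, 10.0, 14.9, 20.2, 25.0, 30.1]
tolerance, vif = tolerance_and_vif(surface_radiation, air_temperature)

# A pair is flagged as collinear when tolerance < 0.1, equivalently VIF > 10.
collinear = tolerance < 0.1 and vif > 10

# A coefficient of 0 gives OR = 1: the odds of an outbreak do not change.
neutral_or = odds_ratio(0.0)
```

Because OR = exp(β), a coefficient of 0 always corresponds to an odds ratio of exactly 1, which is why a factor can be statistically significant yet have no practical effect on the outbreak odds.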
Table 1 Multiple factor regression
Global H5N1 outbreak risk prediction
Before prediction, the Omnibus test of model coefficients and the Hosmer–Lemeshow test were applied to the model. The Omnibus test is significant (p = 0.02 < 0.05), indicating that the selected factors jointly improve the model. The Hosmer–Lemeshow test is not significant (p = 0.35 > 0.05), indicating no significant difference between the predicted and observed values. Thus, the model fits the data well, and the two tests together indicate that it is reliable. In addition, the outbreak risk (probability) was divided into eight classes by the natural breaks method, which places class boundaries where the differences between values are relatively large and groups similar values together. The differences between classes are thereby maximized, so the classes effectively represent the outbreak probability at each prediction site.
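The natural breakpoint classification (commonly called Jenks natural breaks) can be sketched with a small dynamic program that minimizes the within-class sum of squared deviations. This is an illustrative implementation, not the GIS tool used for Fig. 8; the probabilities are hypothetical, and three classes are used instead of eight for brevity.

```python
def natural_breaks(values, n_classes):
    """Return the upper bound of each class, minimizing within-class variance."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and the number of values")

    # Prefix sums allow the squared deviation of any slice in constant time.
    prefix, prefix_sq = [0.0], [0.0]
    for v in data:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations of data[i:j] from its own mean."""
        s = prefix[j] - prefix[i]
        return prefix_sq[j] - prefix_sq[i] - s * s / (j - i)

    inf = float("inf")
    # cost[k][j]: minimal total deviation when data[:j] is split into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = cost[k - 1][i] + ssd(i, j)
                if candidate < cost[k][j]:
                    cost[k][j] = candidate
                    split[k][j] = i  # last class is data[i:j]

    # Walk back through the stored split points to recover the class bounds.
    bounds, j = [], n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[k][j]
    return bounds[::-1]

# Hypothetical predicted outbreak probabilities.
probabilities = [0.01, 0.02, 0.03, 0.40, 0.42, 0.45, 0.90, 0.95]
breaks = natural_breaks(probabilities, 3)
```

The class boundaries fall in the large gaps between groups of similar probabilities, which is the behavior described above.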
As seen from the prediction map (Fig. 8) and the results presented in Fig. 4, in addition to China, Southeast Asia, South Asia, Nigeria, Ghana, Burkina Faso, Côte d’Ivoire, Egypt, Sudan, Romania, Greece, Germany, and France, where H5N1 has previously occurred, there is a high risk of H5N1 outbreaks in the southeastern USA and in Paraguay in South America. A high risk of H5N1 outbreaks is also predicted in Uganda, Congo, Tanzania, South Africa, and the western and southeastern regions of Oceania.