Plant Phenology Supports the Multi-emergence Hypothesis for Ebola Spillover Events

Ebola virus disease outbreaks in animals (including humans and great apes) start with sporadic host switches from unknown reservoir species. The factors leading to such spillover events are little explored. Filoviridae viruses have a wide range of natural hosts and are unstable once outside hosts. Spillover events, which involve the physical transfer of viral particles across species, could therefore be directly promoted by conditions of host ecology and environment. In this report, we outline a proof of concept that temporal fluctuations of a set of ecological and environmental variables describing the dynamics of the host ecosystem are able to predict such events of Ebola virus spillover to humans and animals. We compiled a data set of climate and plant phenology variables and Ebola virus disease spillovers in humans and animals. We identified critical biotic and abiotic conditions for spillovers via multiple regression and neural network-based time series regression. Phenology variables proved to be overall better predictors than climate variables. African phenology variables are not yet available as a comprehensive online resource. Given the likely importance of phenology for forecasting the likelihood of future Ebola spillover events, our results highlight the need for cost-effective transect surveys to supply phenology data for predictive modelling efforts. Electronic supplementary material The online version of this article (10.1007/s10393-017-1288-z) contains supplementary material, which is available to authorized users.


INTRODUCTION AND PURPOSE
Ebola virus disease can emerge simultaneously at geographically distant locations throughout the West African tropical rainforest belt biome. For example, the 1996 wave of outbreaks in both animals and humans occurred within just a few months in remote forest locations in Gabon and the Democratic Republic of the Congo (Georges et al. 1999;Milleliri et al. 2004;Yamagiwa and Basabose 2006;Lahm et al. 2007). This fact evoked migrating species such as bats as possible natural reservoirs of the virus (Leroy et al. , 2009). However, most emergences of Ebola virus disease have not been linked to direct human exposure to bats (Leendertz et al. 2016), and global predictions do not identify the African tropical rainforest belt biome as hot spots for viral spillovers from bats (Olival et al. 2017). Since current evidence for migrating bats being the primary natural reservoir of Ebola virus is inconclusive, alternative explanations for the sporadic emergence of Ebola virus disease in different places at the same time should be explored. Phylogenetic distance between susceptible potential host species has recently been used to predict additional susceptible taxa (Weber et al. 2017;Olival et al. 2017). Mononegavirales, the supergroup containing Filoviridae such as the Ebola virus group, includes a broad spectrum of natural reservoirs, including many mammals and other Figure 1. a Map of localities of Ebola virus disease outbreaks (where exact coordinates could be obtained, small squares), climate data (large squares) and phenology data (circles). b Box plots for climate and phenology variables with significant inter-annual variation (see Table 1 for statistical results) between years with recorded Ebola human + animal spillover events and years without known Ebola virus spillover events spanning the period of 1953-2016. Boxes represent 25-75% interval, whiskers represent non-outlier range, circles represent outliers, and stars represent extreme values. Lines connect medians of the same variables between years with spillover events and years without spillover events. Climate variables are shown in filled boxes, and phenology variable is shown in open boxes. (a) 10-Year average rainfall in Kibale National Park, (b) number of days with 0.254-mm rainfall per month (DP01), (c) number of days with 12.7-mm rainfall per month (DP05), (d) number of days with 25.4-mm rainfall per month (DP10), (e) highest daily total of precipitation per month in mm (EMXP), (f) lowest daily minimum temperature for the month in°C (EMNT), (g) monthly mean minimum temperature in°C (MMNT), (h) total precipitation, in mm/10, for the month (TPCP), (i) second independent component of Normalized Difference Vegetation Index (IC2 NDVI) anomaly between July and December.
tetrapods (Taylor et al. 2010), so that no potential reservoir species can be a priori excluded.
Simultaneous infections of humans and animals in the Gabon-Congo border region between 2001 and 2005 originated from viral strains with different amino acid sequences of the viral glycoprotein (GP), indicating that these originated from at least five independent, simultaneous spillover events Wittmann et al. 2007). Based on this evidence, and the fact that viruses cannot exist outside of their natural host organisms, previous studies highlighted possible ecological and environmental (''eco-environmental'') dimensions to Ebola virus spillover events. Wittmann et al. (2007) hypothesised that ''particular, but unknown environmental conditions'' may have caused the multiple spillover events from the unknown host to the human and animal populations recorded to be affected (Peterson et al. 2004). Leroy and Gonzalez (2012) also called this the ''multi-emergence'' hypothesis, where multiple, episodic and simultaneous outbreaks are caused by ecological or environmental conditions.  An environmental niche model has been generated for the Ebola virus based on environmental parameters in its localities of emergence, showing the African tropical rainforest biome as clearly defined region with highest probability of occurrence (Pigott et al. 2014). Besides seasonal fluctuations, and despite annual bat migration and high rates of human consumption of bushmeat, Ebola virus disease does not emerge annually. The 1990s Ebola outbreaks were related to short-term changes in regional precipitation and Normalized Difference Vegetation Index (NDVI) (Tucker et al. 2002). Evapotranspiration and Enhanced Vegetation Index are predictive of the spatial environmental niche of Ebola virus (i.e. being located in the African tropical rainforest biome), and monthly rainfall and rainfall anomaly can predict Ebola virus spillovers (Pinzon et al. 2004;Schmidt et al. 2017).
Intra-specific or inter-specific competition for resources such as food, or environmental contamination within the virus-harbouring ecosystem, can serve as a hypothetical functional link between ecological and environmental factors and spillover events from the unknown natural reservoir to other species. Rouquet et al. (2005) hypothesised that direct competition for resources between susceptible species in periods of food scarcity leads to increased inter-species transmission (Rouquet et al. 2005), and the annual fruit bat migration into the Democratic Republic of the Congo between April and May could be related to human disease emergence events in June (Leroy et al. 2009). A new mechanistic hypothesis invokes ''viral rain'', the shedding of infectious viral particles from the natural reservoir into the environment, such as bats spreading Hendra viruses around the trees they roost on (McFarlane et al. 2011).
Current efforts to identify Ebola outbreaks exclusively rely on monitoring on the level of animal species commonly hunted as bushmeat and first incidences in humans (Rouquet et al. 2005). This practice is costly and associated with complicated logistics. Designing novel methods for forecasting the likelihood of future Ebola virus disease spillover events before the human-to-human transmission stage is reached, would be a cost-effective method of upstream mitigation (Schar and Daszak 2014).
In this article, we evaluate support for the multiemergence hypothesis from ecological and environmental data sourced from the zoonotic niche of Ebola virus (Pigott et al. 2014), as being predictive for Ebola virus disease spillover events. Our working hypothesis is that local climate and phenology variables (greening, flowering and fruiting) will be better predictors for Ebola virus disease spillovers than remotely sensed climatic variables currently used for predictive modelling.

METHODS
Data on the occurrence of Ebola virus spillover events were obtained from published reports (for localities see Fig. 1a). Therefore, the true number of spillover events (especially in animals) might be higher, and we have to consider the data set we compiled as a sample of this true number. Data for spillover events involving humans were obtained from the website of the United States Centers for Disease Control andPrevention (cdc.gov, accessed 2014-2017). As the CDC website only lists reports for human outbreaks, we searched online literature resources for information on other species' Ebola infection records to obtain data on outbreaks in nonhuman animals (International Commission 1978;Heymann et al. 1980;Le Guenno et al. 1995;Muyembe and Kipasa 1995;Amblard et al. 1997;Formenty et al. 1999;Georges et al. 1999;Khan et al. 1999;Okware et al. 2002;Leroy et al. 2004;Lamunu et al. 2004;Milleliri et al. 2004;Rouquet et al. 2005;Pourrut et al. 2005;Lahm et al. 2007;Wittmann et al. 2007;Onyango et al. 2007;Towner et al. 2008;Leroy et al. 2009;Wamala et al. 2010;MacNeil et al. 2010;Nkoghe et al. 2011;Shoemaker et al. 2012;Bausch Figure 3. Seasonality of Ebola virus spillover events (humans + animals, standardised values, closed circles), the standardised average of climate PCs 1-3 (open circles) and the standardised average of phenology PCs 1-6 (closed squares). Standardised values for single phenology PCs 1-6 are shown as background lines (no symbols). Both September and December spikes in recorded human + animal Ebola virus spillover events (virus symbols) are associated with lower values of the average of phenology PCs 1-6 (fruit/leaf symbols), but only for the September spike with average climate variable changes (weather symbols). Curves are spline-fitted.  Fig. 1a). Data were obtained for monthly summaries between 1953 and 2015 for the weather station GHCND-GB000004556 in Makokou, Gabon (Lat 0.578004°, Long 12.889751°), and for the years 2016-2017 for the location FIPS:GB 9 (Libreville, Gabon, Lat 0.457088°, Long 9.409372°). We obtained data for the following variables: DT90, number of days with maximum temperature 32.2°C; DP01, number of days with 0.254-mm rainfall per month; DP05, number of days with 12.7-mm rainfall per month; DP10, number of days with 25.4-mm rainfall per month; CLDD, cooling degree days, computed when daily average temperature is more than 18.3°C; CDD, days per month with mean daily temperature -18.3°C; EMNT, lowest daily minimum temperature for the month in°C; EMXP, highest daily total of precipitation per month in mm; EMXT, highest daily maximum temperature per month in°C; MMNT, monthly mean minimum temperature in°C; MMXT, monthly mean maximum temperature in°C; MNTM, monthly mean temperature in°C; and TPCP total precipitation, in mm/10, for the month. From this monthly data set, we calculated arithmetic means for each year, as well for each month over the 1953-2017 period. We added additional climate data from other published sources that were computationally collected from published graphs using the WebPlotDigitizer software v 3.12 (Rohatgi 2016). Average annual values and for average rainfall , and average monthly minimal and maximal temperatures  for Kibale National Park, Uganda (Lat 0.486621°, Long 30.390715°), were obtained from Chapman et al. (2005). From Polansky and Boesch (2013), we extracted monthly inter-annual linear trends of rainfall, average monthly rainfall in Central Africa (in mm), between 1979 and 2010 in Tai National Park, Ivory Coast (Lat 5.690020°, Long -6.939547°). We also extracted temperature for African humid tropics in°C (Fisher et al. 2013), as well as the averaged normalized departure (April-October rainfall departure) between 1979 and 2010 (Boyd et al. 2013).
Currently, no comprehensive database exists for African plant phenology and vegetation data (Adole et al. 2016). The African tropical rainforest belt, which has high occurrence probability for Ebola virus disease emergence (Pigott et al. 2014), is characterised by a regional, seasonal pattern of greening, fruiting and flowering linked to the cyclical West African monsoon (Cornforth 2013). We used WebPlotDigitizer (Rohatgi 2016) to collect quantitative plant phenology data sets in published reports from the area of high occurrence probability for the Ebola virus (Fig. 1a). From Plumptre et al. (2012), we extracted data for the 1986-2012 flowering anomalies in Lope, Gabon (Lat -0.441156°, Long 11.524798°). Average monthly values were obtained from the same source for the proportion of trees flowering and the proportion of trees with ripe fruit in Goualougo, Nouabale Ndoki National Park, Republic of the Congo (Lat 2.263195°, Long 16.591688°); the proportion of trees flowering and the proportion of trees bearing ripe fruit in Lope, Gabon; and the proportion of trees flowering and the proportion of trees bearing ripe fruit in Okapi National Park, Democratic Republic of the Congo (Lat 1.400931°, Long 28.577619°). We also obtained values for the annual proportion of trees fruiting in Kibale National Park, Uganda, 1970-2001(Chapman et al. 2005. With an average monthly resolution, we extracted data for the percentage of trees with ripe fruit in Kibale National Park, Uganda, between August 1990 and September 2002 (Chapman et al. 2005). Arithmetic means were calculated for each year. From Philippon et al. (2007), we extracted data for the NDVI anomaly for West Africa (between August and December and between July andDecember, 1982-2002). From Polansky and Boesch (2013), we also extracted the percentage of the forest community bearing fruit and the count of number of species with peak fruiting times per month. Additional variables, extracted (monthly, between 1994 and 2002, and annual averages) from graphs generated by Yamagiwa et al. (2008), were a ''fruit index'' for fruit species consumed by chimpanzees and gorillas in both primary and secondary forests, in Kahuzi-Biega National Park, Democratic Republic of the Congo (Lat. -1.963260°, Long 28.018609°). We further extracted data from the same publication, for the percentage of different plant species bearing fruit that contributed to the generation of the fruit indices (monthly data between 1998 and 2002, and annual arithmetic means). These include: Allophylus spp., Bridelia bridelifolia, Cassipourea spp., Diospyros honleana, Ekebergia capensis, Ficus oreodryadum, Ficus thonningii, Maesa lanceolata, Myrianthus holstii, Newtonia buchananii, Psychotria palustris and Syzygium parvifolium.
All data are available as supplementary Appendix. We analysed differences in climate and phenology variables between years without any spillover events and years with at least one spillover event. The climate variables included were 10-year average rainfall in Kibale, average monthly maximum temperature in Kibale-annual average, DT90, DP01, DP05, DP10, CLDD, EMNT, EMXP, EMXT, MMNT, MMXT, MNTM and TPCP. Phenology variables included were first and second independent components of NDVI (see Philippon et al. 2007 for details). First, we performed Kolmogorov-Smirnov tests on each quantitative variable to test for normality. For variables that were distributed normally, we subsequently performed univariate ANOVAs. For variables not normally distributed, we performed Kruskal-Wallis tests between years with and without spillover events. We subsequently boxcox transformed non-normally distributed variables. We then tested whether climate and phenology variables can predict the number of human and other animal Ebola spillover events. To this purpose, we performed a multiple regression in Statistica (Dell, Inc.) with number of spillover events as dependent variable, and climate and phenology variables as independent predictors. To further test whether phenology or climate variables are better predictors for the recorded human + other animal spillover events, we generated and compared neural networks in Statistica (Statistica automated neural networks, SANN). First, principal components (PCs) were generated for 16 climate variables (Supplementary Tables 1, 2) and for two phenology variables (Supplementary Tables 3, 4, for the time frame be tween 1970 and 2002). Subsequently, the data set was partitioned into a modelling data set (years with even numbers) and a cross-validation data set (years with odd numbers). On the modelling data set (70% training, 15% test and 15% validation data), we generated three models based on neural network time series regression, with different input variables, and the number of spillover events as target variable. First, a model was generated with climate + phenology PCs. Second, we generated a model using only phenology PCs. Third, we generated a model using only climate PCs as input. For model generation, we used the MLP (multilayer perceptron, Rumelhart et al. 1986) method with 500 networks, and the five best networks were retained. We applied weight decay to prevent overfitting and used one step for forecasting. For each analysis, the best model was chosen and subsequently deployed to the cross-validation data set. We then compared the model estimates for the number of spillover events with the recorded number of spillover events in the cross-validation data set. Best fit to the recorded number of spillover events was determined via correlations and t tests for dependent samples. Monthly averages across years were available from a set of climate and phenology variables to determine seasonal patterns in the data. To reduce dimensionality in the data for analysis, we performed a principal component analysis. Thirteen climate variables were reduced to three PCs (Supplementary Tables 5, 6). Twenty-five phenology variables were reduced to six principal components (Supplementary Tables 7, 8). To explore the seasonal progression of these variables relative to one another, we plotted the number of spillovers per month, together with the average of climate PC values, and the average of phenology PC values. Between 1994 and 2002, monthly data were extracted for fruiting phenology, additionally to climate variables. For analysis, the monthly data partition was therefore cropped to only include monthly values from the years 1994-2002. To reduce dimensionality of climate and phenology variables, three principal component analyses were performed. Firstly, the eight climate variables were reduced to three principal components that together explained 81.3% of the overall variance in this data set (see Supplementary Tables 9, 10). From the set of four fruit index variables of Yamagiwa et al. (2008), two principal components were extracted (Supplementary Tables 11, 12). A principal component analysis for 12 single species variables recovered five principal components (Supplementary Tables 13, 14). To test whether phenology and climate variables are predictive for Ebola spillover events in the monthly data partition, we performed a multiple regression.

RESULTS
Results of tests for normality for annual climate and phenology variables are shown in Table 1. Univariate ANOVAs (for normally distributed variables) and Kruskal-Wallis tests (for non-normally distributed variables, respectively) revealed significant differences in the climate variables 10year average rainfall in Kibale, DP01, DP05, DP10, EMNT, EMXP, MMNT, and TPCP between years with versus years without spillover events. Of the two phenology variables, IC2 NDVI significantly differed between the years with versus without spillover events. Figure 1b shows box plots for these variables. Regression of climate and phenology variables against the number of human + other animal spillover events was overall significant (ANOVA results: R 2 = 0.3624, F(13,63) = 2.7546, P < 0.00374, Table 2). The residuals were distributed normally (not shown). Significant univariate predictors for this model were EMXT (P = 0.049) and IC2 NDVI (P = 0.012). To further determine the effectiveness of phenology versus climate variables in predicting spillover events, we generated and compared neural networks on principal components for the two types of predictor variables separately (phenology only, climate only) and combined (phenology and climate). Performance of each of the best models is shown in Supplementary  Table 15. We do not aim for predictive quality of these models here-instead, we aimed to test for best fit between modelled data using different sets of input variables and observed data. For this purpose, we employed a paired t test. When deploying each best model for the three sets of input variables, no model significantly differed from the observed values of Ebola spillover events, indicating an overall good fit (in Fig. 2a, results of t test are shown in Supplementary Table 16). The model with only phenology PCs as input variables performed best when deployed to the cross-validation data set, with the best correlation coefficient between predicted and observed spillover events (Fig. 2b, r = 0.8860, P 0.000001).
Using average values of phenology and of climate principal components, line plots for seasonality (Fig. 3) show a (previously reported) pattern of seasonality of Ebola spillover events. Our strategy to record both human and other animal spillover events, as well as recording independent epidemic chains as separate spillover events, reveals a clear seasonal dynamics of recorded spillovers, phenology and climate variables (Fig. 3).
The Yamagiwa et al. (2008) data set allowed a detailed investigation into which kinds of plants are contributing to the inter-annual and seasonal variation in phenology that we found to be associated with Ebola spillover events. Multiple regression revealed that for the investigated time period (which includes a total of 57 spillover events), climate variation does not significantly predict the number of monthly spillovers. While the ''fruit index'' variables also were not significant predictors, three of the five plant species PCs significantly predicted spillover events (Table 3). The overall regression model was significant (AN-OVA for overall goodness of fit: R 2 = 0.2368, F(11,93) = 2.6238, P < 0.00583, SE of estimate, 0.9740). Variables with significant factor loadings for these three PCs were the proportion of trees bearing fruit: Ekebergia capensis, Ficus oreodryadum, Ficus thonningii and Newtonia buchananii (Supplementary Table 13

DISCUSSION
We found significant differences in climate and vegetation parameters between years with and years without Ebola virus spillover events. Years with spillover events were characterised by generally drier conditions (DP01, DP05, EMXP and TPCP), as well as by higher minimum temperatures (EMNT, MMNT). In contrast, the anomaly of the vegetation index in the months from July to December was higher, combined with a higher average of rainfall (measured in Kibale, Uganda). An overall model of climate and phenology variables as predictors for the number of human + other animal spillovers in a given year was significant. EMXT, the highest monthly daily maximum temperature, was a significant predictor for the number of Ebola spillover events in humans and animals. These results align to previous studies that have found a climatic dimension for Ebola spillover events (Tucker et al. 2002;Pigott et al. 2014;Schmidt et al. 2017). In our study, however, we could additionally show that plant phenology variables [including the anomaly of Normalized Difference Vegetation Index (NDVI) between July and December, the proportion of population fruiting in Kibale National Park, Uganda, and the flowering anomalies in Lope, Gabon] informed neural network models with a superior fit to the data than when climate or climate in conjunction with phenology variables were used as inputs. Previous studies using Normalized Difference Vegetation Index or Enhanced Vegetation Index have not yet considered their predictive power for Ebola spillover events independently from climate variables (see Schmidt et al. 2017).
Aligned to previous reports (Alexander et al. 2015), we found a pronounced seasonality in recorded human and other animal spillover events. This seasonal association between Ebola spillover events and a multitude of phenology variables describing seasonality of flowering and fruiting was not previously reported and corroborates the newly found close association between inter-annual Ebola spillover events and inter-annual variation in vegetation and phenology variables. The seasonality in spillover events is thought to be associated with transitions between wet-todry and dry-to-wet conditions (Tucker et al. 2002;Pinzon et al. 2004;Groseth et al. 2007;Altizer et al. 2013;Schmidt et al. 2017), and thus may be associated with spatiotemporal fluctuations of the African monsoon (Cornforth 2013). While the September spike in human + other animal spillover events was associated with such a transition of climatic variables, the December spike was not. Other time points with shifts in climatic conditions were vice versa not associated with increases in Ebola spillover events. In contrast, both spikes in Ebola spillovers are associated with extreme values in plant phenology variables, lending additional support to the multi-emergence hypothesis. Previous studies have found seasonal and yearly oscillations of fruiting status within specific localities, with a trend that is generated by differences in peak fruiting times among individual species (Anderson et al. 2005;Polansky and Boesch 2013). Most species fruit only for 1 month per year (Anderson et al. 2005). Fruit abundance of species within the Tai forest (Ivory Coast) plant community has been found to increase over the past decade, with no apparent correlation with local climatic patterns (Polansky and Boesch 2013). Basabose (2002) also reported significant differences in the mean percentage of trees fruiting at Kahuzi-Biega National Park, between 1994 and2000, both between the dry and the wet season, as well as between years. Data from both these localities were part of our analysis. Local flowering and fruiting patterns consequently might reflect components of the local environment of the Ebola virus reservoir that differ from climatic variables, represented by climate layers.
Without in-depth knowledge of the proximate mechanism of Ebola spillovers, we could here show a covariation between ecological variables related to food resources and Ebola spillovers to animal hosts. For example, ''virus shedding' ' (McFarlane et al. 2011;Plowright et al. 2015), the process of bats transmitting virus particles to plant matter which is then taken up by other animals, provides a possible functional link between phenology and virus spillover. Plant phenology variables that significantly predicted monthly spillover events included Ficus spp., fruit, leaves and stems which are consumed by gorilla and chimpanzee. Ekebergia capensis fruit and Newtonia buchananii seed are likewise consumed by gorilla and chimpanzee in Kahuzi-Biega National Park, Democratic Republic of the Congo (Yamagiwa and Basabose 2006). Ficus thonningii constitutes an important food source for African frugivores (Bleher et al. 2003;Kirika et al. 2008). Seasonally, dry climate has been shown to cause synchronicity for flowering, but not fruiting in this species (Damstra et al. 1996). It has been proposed as a keystone species for frugivores, as it bears fruit during periods of fruit scarcity (Bleher et al. 2003). We could here show an association between Ebola spillovers and phenology of plants that constitute a seasonally varying food source for susceptible species. Plant species like F. thonningii could serve as preliminary focal species to test the hypothesis that competition for fruit or other plant matter in dry periods increases the likelihood of spillover events. Our results on the link between Ebola spillovers and forest phenology are mirrored by results from another viral haemorrhagic fever in Belgium; therefore, dynamics of vegetation phenology and their alteration under climate change have been found to influence the dynamics of Puumala virus (causing the disease Nephropathia epidemica, Barrios et al. 2010). In conclusion, both Barrios et al. (2010) and our present study support the multi-emergence hypothesis for virus spillover events. The phenology source data analysed in this study, however, were far from comprehensive. Clearly, to improve the understanding of the importance of phenology for predicting disease emergence and the quality of predictive models, more data are needed that spatially match locations of Ebola virus emergence. Phenology can vary locally and seasonally (Bleher et al. 2003), so that this information cannot be replaced by remote sensing data. We therefore highlight the importance of ecological field work additional to remote sensing in providing important data on phenology variables related to the ecosystem ecology of the unknown natural reservoir of Ebola virus.

CONCLUSION
Data on local fruiting patterns, collected locally, could constitute a useful indicator for the likelihood of impending Ebola spillover events and could readily be classified through cost-effective transect surveys. The most recent Ebola outbreak in May 2017 in Likati (Bas-Uele Province, Democratic Republic of the Congo) suggests that the emergence of Ebola virus is a recurring event that necessitates the generation of upstream models for disease management that is sensitive to economic cost and cost of human lives (Schar and Daszak 2014). We therefore encourage researchers to make unpublished African phenology data available to the public domain to aid in understanding this dimension of ecological and environmental variables related to spillover events of emerging infectious diseases.