Stochastic flowering phenology in Dactylis glomerata populations described by Markov chain modelling

Understanding the relationship between flowering patterns and pollen dispersal is important in climate change modelling, pollen forecasting, forestry and agriculture. Enhanced understanding of this connection can be gained through detailed spatial and temporal flowering observations on a population level, combined with modelling simulating the dynamics. Species with large distribution ranges, long flowering seasons, high pollen production and naturally large populations can be used to illustrate these dynamics. Revealing and simulating species-specific demographic and stochastic elements in the flowering process will likely be important in determining when pollen release is likely to happen in flowering plants. Spatial and temporal dynamics of eight populations of Dactylis glomerata were collected over the course of two years to determine high-resolution demographic elements. Stochastic elements were accounted for using Markov chain approaches in order to evaluate tiller-specific contribution to overall population dynamics. Tiller-specific developmental dynamics were evaluated using three different RV matrix correlation coefficients. We found that the demographic patterns in population development were the same for all populations with key phenological events differing only by a few days over the course of the seasons. Many tillers transitioned very quickly from non-flowering to full flowering, a process that can be replicated with Markov chain modelling. Our novel approach demonstrates the identification and quantification of stochastic elements in the flowering process of D. glomerata, an element likely to be found in many flowering plants. The stochastic modelling approach can be used to develop detailed pollen release models for Dactylis, other grass species and probably other flowering plants.


Introduction
Grass ecology is of considerable importance to human society and culture, mainly through cereal cultivation, animal fodder and forestry but also through grass pollen allergy. Grass pollen is arguably the most important outdoor aeroallergen with the highest amount of sensitization (D'Amato et al. 2007;Heinzerling et al. 2009). Studies have shown that there is a strong relationship between airborne pollen concentrations and the flowering phenology of grasses (Ghitarrini et al. 2017;Kmenta et al. 2016;Romero-Morte et al. 2020). The identification and modelling of demographic and stochastic processes in grass flowering should increase understanding of this relationship.
To fully understand the ecology of grasses, a population-based approach is often needed (Bolnick et al. 2003;Watkinson and Ormerod 2001). Spatial and temporal observations of flowering of multiple populations are often used to understand demographic elements in the study of grasses in agriculture (Rossignol et al. 2014;Smith 1944), ecology (Eagles 1972;Lindner and Garcia 1997) and aerobiology (Rojo et al. 2017;Romero-Morte et al. 2018). Combining this approach with a stochastic Markov chain modelling approach (Balzter 2000) and a model grass species may provide deeper multifaceted understanding with respect to underlying flowering processes. Markov chain approaches have previously been used to accurately model demographic and stochastic elements in grass populations (Nakaoka 1996) (e.g. Canales et al. 1994;Silva et al. 1991).
The species Dactylis glomerata, being (probably) the only species in its genus, has been investigated for more than a century regarding its growth and flowering processes (Wolfe 1925;Wolfe andKipps 1925, 1926). This has been due to its widespread use as a common pasture grass (Eagles and Williams 1971;Mizianty 1986) and its cross-pollination capacity (Jones and Newell 1948). It occurs abundantly throughout its extensive biogeographic distribution, with a native range between the USA and Japan, and from northern Africa to northern Sweden (Clayton et al. 2002). The species can be divided into a few different sub-species based on genetic and morphological divergencies (Yan et al. 2016). However, D. glomerata subsp. glomerata is by far the most common sub-species, defining the standard of the species (Mizianty 1990). Cebrino et al. (2018) found that D. glomerata starts flowering around the same time in Spain, regardless of habitat. This holds true between years and locations (Cebrino et al. 2016). Observations in cultivated gardens have shown that earlier starting times promote longer flowering in individual flowers (Tormo-Molina et al. 2015), common in wild grass species and important for climate change scenarios (Cleland et al. 2006;Munson and Long 2017). The higher pollen production in D. glomerata compared to other grasses (Aboulaich et al. 2009;Prieto-Baena et al. 2003) heightens the relevance of these processes. All the above-mentioned considerations make D. glomerata a model species for phenological investigation. Information about the detailed flowering dynamics in populations will enable climatic and mechanistic modelling of D. glomerata and other grasses with similar behaviour, a key aspect for phenological modelling of grasses and their impact on the grass pollen seasonality (García-Mozo et al. 2009). The inclusion of stochastic elements in the phenological development will be able to account for intrinsic attributes of the flowering process, which has been suggested to have implications for pollen and seed dispersal patterns (Soons et al. 2004;Tufto et al. 1997), processes-based range modelling (Evans et al. 2016) and cellular differentiation modelling (Davila-Velderrain et al. 2016). Our objective in this study was to investigate demographic and stochastic elements in grass flowering phenology by using Dactylis glomerata as a model species. This was done by incorporating population dynamics through detailed spatial and temporal demographic observations and by using Markov chain modelling to explain the stochastic behaviour.

Locations
Eight locations across the surroundings of Worcester, UK ( Fig. 1), were monitored over two years (15 May to 31 August 2017 and 2018) to investigate D. glomerata population dynamics ( Table 1). The population at each location were considered as biologically distinct unit since it was spatially isolated from other specimens of the same species within the immediate surroundings. The locations are distinguished between two main populations at St Johns and Lakeside campuses belonging to the University of Worcester and six other unique locations each containing one secondary population. The populations at both campus sites were chosen as main populations due to increased accessibility and lower chances of disturbance. All locations were selected based on three criteria: 1) should contain more than 150 tillers (grass flowering stems) of D. glomerata within a 10 m x 10 m area, 2) should not be a road verge and 3) should be easily accessible. A large minimum population size would limit the variation contributed from each tiller and produce a smaller and more stable standard error for the overall population development . The microclimate contributed to road verges would produce uncertainty regarding the stability of population dynamics (Jantunen et al. 2007), coupled with aspects of disturbances such as  (Brock et al. 1996). Easily accessible locations were due to simplicity and reproducibility of the study.

Grass flowering phenology
The flowering phenology was investigated using the 'Biologische Bundesanstalt, Bundessortenamt und Chemische Industrie' (BBCH) scale (Meier 2018). The scale, originally designed to monitor phenological changes in agricultural crops, functions by systematically observing a primary growth characteristic and a secondary growth characteristic, resulting in a twonumber sequence. By observing changes to the sequence, detailed phenological change can be monitored (Cornelius et al. 2011(Cornelius et al. , 2014Meier et al. 2009). The primary growth characteristic focused on in the study is the flowering stage, which starts at tiller heading and ends in tiller senescence. The secondary growth characteristics are focused on the percentage of visible anthers. Each tiller is assigned one of six phases, representing an amount range of visible anthers ( Table 2). The percent of visible anthers is the proxy of flowering progression due to the increased potential of pollen release, the ultimate evolutionary and ecological goal of Poaceae-specific flowering (Khanduri 2011). The tiller is assigned phase p0 until sign of first flowering and phase p5 when the last anther has detached. The reasoning behind last anther detachment is due to the slight theoretical possibility that pollen could still be emitted from desiccated anthers. Phases p1-p4 represent fractions of flowering anthers in steps of 25%. Examples of Dactylis glomerata tillers in each phase can be found in the supplementary material (Figs. S1-S6). By observing each tiller within a population, it is possible to measure the phenological progression of the population. Regular observations will allow for a detailed progression profile and reveal inter-(between population) and intra-(within population) demographic variation into flowering phenology through population dynamics of the species.

Observational strategy
Each population is defined as all tillers of the species within two to four 1.5 m 9 1.5 m plots with less than 3 m between each plot within each location. The  reasoning is to allow a population size large enough to capture statistically unlikely events in regard to population demographic stochasticity . Denser populations utilized the lower number of plots and sparser populations the higher number of plots. Phenological progression of all tillers in each population was observed from the first flowering tiller to the last tillers' senescence. The reasoning is to allow the full range of population dynamics within the main population locations, as adopted from previous grass population studies (e.g. Liddle et al. 1982).

Markov chain model
The Markov chain stochastic modelling approach was utilized to analyse tiller-specific dynamics within each of the two main populations (in a similar way as Tseng et al. (2020)) for the years 2017 and 2018. The stochastic model uses the records of the temporal phase change for each tiller, by summarizing the overall phase change of all tillers into a six-by-six transitional matrix. The R package markovchain (Spedicato 2017) was used to create the transitional matrices. Not all phase changes are possible due to the tiller not being able to revert in phase or go directly from non-flowering to end-flowering (within the daily/ bi-daily observational approach). This matrix describes the likelihood for flowering progression (or phase change) for the full season for all tillers in each of the main populations (see approach in Meyn and Tweedie (1993)). These transitional matrices can be translated into Markov chain diagrams, which display the directional change between phases and the likelihood expressed between zero and one of transitioning for all possible transitions within each population (Grinstead and Snell 1997). Note that this analysis was not possible for the secondary locations, in the same manner, due to the limited number of visits, thereby not being able to observe tiller-specific progression.

Correlations
The general flowering progression for each year was tested for statistical normality using the Shapiro nonnormality test (Shapiro and Wilk 1965). The nonparametric Spearman rho correlation was utilized to analyse the general flowering progression due to non-normality expressed by the data. The rank-order correlation produces a more reliable estimate in cases of violated parametric assumptions (Spearman 1904).
Correlation was tested without significance testing due to the strong autocorrelations found in accumulated time series. The R package MatrixCorrelation (Indahl et al. 2018) was utilized to analyse the similarity between the Markov chain transitional matrices for the three main locations. Three different types of related matrix correlation coefficients were compared: RV (Robert and Escoufier 1976), RV2 (Smilde et al. 2009) and adjusted RV (Mayer et al. 2011). Three coefficients were used to avoid potential bias using only one type of correlation. All statistical analyses were employed and analysed in the statistical software R (R Core Team, 2020; R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available online at https://www.R-project.org/).

Main population dynamics
The main populations at St Johns had 317 and 355 tillers in 2017 and 2018, respectively, while Lakeside had 535 ( Table 1). The flowering started in late May (May 27 ± 2 days) in both years at St Johns and Lakeside (Fig. 2). During the first few days (May 29 ± 2 days), 10-20% of the tillers in the population reached the flowering phases p1, p2 and p3; hence, each of these may have up to 75% of their anther extruded. Most of these, augmented by non-flowering, tillers reached the full flowering in p4 during the first few days of June (June 1 ± 1 day). At this point, about 40% of the population was in full flowering (p4) and two days later an additional 20% of the population reached full flowering (p4) (June 3 ± 1 days). Most tillers in lower flowering phases progressed into full flowering over the coming few days (June 6 ± 2 days). At this time, most tillers were either in full flowering (p4) or had not yet started (p0). This continued until the middle of June (June 12 ± 3 days) when the first tillers reached senescence (p5), showcased by the detachment of the last anther. Around the same time (June 12 ± 1 day), the populations reached peak flowering. Peak flowering is the event where the maximum number of tillers in the season occupies the phase full flowering (p4). The phenological progression of the populations continued as an increasing number of tillers reached senescence and where the last tillers took considerable time to reach phase p5, illustrated by the long tail in Fig. 2. This progression continued until the last tiller had reached senescence, which happened on July 17, August 9 and July 24 for each main population, respectively (Fig. 2).

Secondary population dynamics
The six secondary populations contained between 160 and 427 tillers ( Table 1). The populations started flowering before the beginning of June (June 1-3-4 days) (Fig. 3). All of them had reached more than 60% full flowering (p4) in early June (June 7). By the middle of June (June 19), all populations had started senescence with at least some tillers in the senescence phase (p5). To check for spatial differentiation between the populations, phase ratios were compared. There was no discernible spatial differentiation found between the most western/eastern or northern/southern locations.

General flowering progression
The development of the 2017 population was mostly two days ahead of the 2018 population, with the full flowering phase being reached with only one day difference (Fig. 4). The main difference between the years was the amount of time needed to reach senescence: it was reached 20 days later in the 2018 population. In terms of development between the two years, the 2017 population was on average one-fifth of one phase earlier than the 2018 population. For example, if on day X the 2018 population was average phase 3.5, then the 2017 population was average phase 3.7. The standard error in the 2018 population ranged between 0.70 and 0.22 in the beginning of the season (June 1 to June 6), never reaching more than 0.15 for the rest of the season and reducing to almost zero towards the late season. See supplementary data (Fig. S7) for the contribution of each population to the general progression of 2018. A nonparametric correlation was used to compare the years due to both general populations expressing non-normality measured in the Shapiro test (p \ 0.001 for both years). The Spearman rho correlation showed that there was a high correlation between the two years (Spearman r s = 0.98).

Tiller-specific phase dynamics
The Markov chain diagrams ( Fig. 5a and Table S1 for the transitional matrices) show that the 2017 St Johns to 0.25 chance for p1 and p2 but almost a 0.50 chance for p3. Tillers in p4 had only a 0.05 chance of transitioning to senescence (p5). Also note that all possible transitions were present in this population, even the less prevalent ones such as p2 to p5, with only  (Fig. 5b). Most of the remaining chance was to stay in their respective phases. The exception was pre-flowering (p0), which had a 0.10 chance to transition into p1. Tillers in p1, p3 and p4 had chances of 0.02, 0.03 and 0.06 chance of transitioning into senescence (p5). One of the possible transitions was not observed. In the 2018 Lakeside population, tillers were most likely to transition into full flowering (p4), with chances ranging from 0.64 to 0.76 (Fig. 5c). The second most likely transition for the lower flowering phases was to stay in their respective phase, with chances of 0.225, 0.243 and 0.263 for p1, p2 and p3, respectively. The senescence rate for p4 into p5 was 0.074. Four of the possible transitions were not observed. The RV correlation coefficients for all the population comparisons ranged from 0.824 to 0.999 ( Table 3). The RV coefficients ranged between 0.995 and 0.999, 0.824 and 0.932 and 0.847 and 0.950 for the spatial, temporal and spatiotemporal correlations, respectively.

Demographic flowering progression
The phenological progressions of the Dactylis glomerata populations illustrate the same general pattern of development. The general flowering progression shows that populations in a larger area will display uniformity in average phenological progression between years. The phase distributions show small differences in day-to-day variation between all populations. Key phenological events were clearly distinguished, with start of flowering, peak flowering and start of senescence all overlapping, in both time and space. This supports previous findings by León-Ruiz et al. (2011) that Dactylis flowering development is synchronistic between populations over a larger region and between years. The only main difference between populations was the senescence rates, likely attributed to local conditions responsible for drying (Atzema 1992;Hayhoe and Jackson 1974). In this study, full flowering of the first 60% of the tillers was reached within roughly 14 days, while the remaining 40% took more than a month. This demonstrates that the flowering of a population follows a non-normal distribution. We also found that tillers that start flowering later are more likely to flower for a shorter time (results not shown), as previously demonstrated by Tormo-Molina et al. (2015). The synchronal phenological progression of all populations indicates that Dactylis pollen release is likely to happen simultaneously over a larger area. This is supported by findings from Italy (Ghitarrini et al. 2017), Spain (Cebrino et al. 2016) and Austria (Kmenta et al. 2017).
If pollen release will be proportional to the percent of extruded anthers in each tiller, then higher flowering phases will likely release more pollen (Tormo et al. 2011) and that the largest amount of pollen will be released faster than the remaining 40%. This complements a previous study by Comtois (2000), suggesting that aerobiological studies should utilize gamma distributions instead of a normal distribution when conducting a hypothesis test. Furthermore, our results illustrate that average full flowering in a population follows an uneven distribution with a mixture of some non-flowering (p0) tillers, most in full flowering (p4) and some senescent (p5). This highlights a distinct difference between peak flowering and full flowering, with peak flowering of the season being the day of maximum numbers of tillers in full flowering. Therefore, peak flowering of populations will likely be more connected with pollen release, than the average full flowering. If pollen has the potential to disperse evenly during the flowering progression, then peak flowering will also explain maximum pollen release potential (Frenguelli et al. 2010;Romero-Morte et al. 2018). This study therefore suggests that the daily or bi-daily resolution of Dactylis phenology is needed to illustrate the narrow window of maximum pollen release potential that occurs during the peak flowering of Dactylis glomerata populations. This information can increase the certainty of species pollen release estimates, which can be of great importance in dispersion modelling, forecasting and the usage of regional pollen calendars (e.g. Adams-Groom et al. 2020;Lo et al. 2019).

Stochastic tiller dynamics
Our findings illustrate that each population is a heterogeneous distribution of tillers occupying varying phases throughout the flowering development. The population matrix correlations are high between populations, which suggests an inherently uniform developmental dynamics shared between populations. However, the correlations also indicate that the spatial correlation is stronger than the temporal and spatiotemporal correlations in the developmental stochasticity between populations. This suggests that differences are larger between years than between separate locations, probably due to regional phenomena that vary on a temporal basis. Although no other studies have been performed on the phase-based developmental dynamics in Dactylis (or any other grasses for that matter), at least two connected studies (Calder 1964a, b) propose that stage-based development is the result of physiological processes caused by photoperiod and temperature. Three stages were proposed: juvenile stage (irresponsive to photoperiod and temperature), inductive stage (responsive to photoperiod and temperature) and postinductive stage (responsive to long days). Calder (1964a) further states the continuation of studies into Dactylis stage-based differentiation is essential in determining the full extent of the flowering process. Later studies confirm the differentiated physiological response to photoperiod and temperature (Broué and Nicholls 1973;Heide 1987). The likelihood of tillers to stay in lower flowering phases (as seen in 2017) is consistent with a lack of external growth stimuli to progress to full flowering (Calder 1963;Heide 1994).
On the other hand, tillers expressed high likelihoods of transitioning directly from pre-flowering (p0) to full flowering (p4) (as seen in 2018). This complements studies from Spain as it confirms Dactylis tillers' capacity to quickly reach full pollen release potential from one day to the next (Cebrino et al. 2016). Such rapid responses might be a contributing factor to the large daily variation in grass pollen concentrations that have previously been observed within the urban environment (Hjort et al. 2016;Simoleit et al. 2017;Werchan et al. 2017). The observed small variation in the stochastic element implies the presence of a consistent developmental progression within the species, likely to be seen in other species using the same approach. Although this study has not included a meteorological component, we suggest that small variations in meteorological factors such as solar radiation or temperature would likely explain the differential in tiller-specific flowering dynamics between 2017 and 2018. This has previously been shown using Markov chain approaches to be the case for yearly variation in Birch masting behaviour (Tseng et al. 2020). The novel aspect of including stochasticity in flowering dynamics demonstrates that grass populations have defined probabilistic developmental patterns that can be applied to populations over a larger region and potentially utilized to investigate pollen release scenarios. In contrast, the behaviour of small populations or individual plants can be difficult to predict due to the nature of probability.

Connection to grass pollen concentrations
The findings in this study increase knowledge on airborne grass pollen concentrations considerably and may have direct application in the development and improvement of numerical forecasting models for grass pollen using regional scale atmospheric transport models. In Europe, models have been developed for trees (e.g. Kurganskiy et al. 2020;Sofiev et al. 2015;Zhang et al. 2014;Zink et al. 2013) and ragweed (e.g. Prank et al. 2013;Zink et al. 2017), partly due to availability of large-scale source maps (Pauling et al. 2012;Skjøth et al. 2019) and their tendency to long distance transport (de Weger et al. 2016;Š ikoparija et al. 2013;Siljamo et al. 2008;Skjøth et al. 2007). This makes regional-scale models particularly useful for simulating pollen concentrations for these species. For grasses, this may be different. Firstly, pollen consists of a range of species all contributing to the overall pollen concentration (Ghitarrini et al. 2017). This can make current emission models developed for trees difficult to implement, depending on the local characteristics. In Australia, first attempts have been made with the CSIRO Chemical Transport Model, but with limited success (C-CTM) using mechanistic emission modelling (Emmerson et al. 2019). In the UK, the different grass species have been found to flower in a non-unified pattern throughout the country (Brennan et al. 2019), suggesting that flowering of individual species responds to local environmental variables. A recent study by Rowney et al. (2021) detecting D. glomerata and other grass species within the UK network found that 83% of the catch of D. glomerata in 2017 in Worcester was captured during the period 4-21 June. This corresponded to the period dominated by p4 and partly p5. In addition, on days before 4 June, when no or limited amounts of D. glomerata pollen were present in Worcester, the period was dominated by lower flowering phases (p1-p3). During this period, larger amounts of D. glomerata pollen were found at other sites in the UK (Rowney et al. 2021). This suggests that the first part of the flowering sequence (p1-p3) contributes only small amounts of pollen to the airborne pollen concentration. It is likely that other grass species will show a similar behaviour. It also suggests limited or no long-distance transport of grass pollen to Worcester of D. glomerata outside the local flowering period. This complements previous findings that grass pollen concentrations are much more localised phenomena compared to trees and ragweed with significant variations at the urban scale (e.g. Hugg et al. 2017;Skjøth et al. 2013). It has been suggested that such variations are caused by variations in emission patterns between different grass species (Peel et al. 2014) and localised emission sources (e.g. Skjøth et al. 2013). The Markov chain modelling approach can take this stochastic emission pattern into account, including potential rapid changes in emission potential. This concept is currently not implemented in state-of-theart emission models designed for regional scale models (e.g. Zhang et al. 2014;Zink et al. 2013) and conceptually introduces a numerical approach to quantify the element of chance (or risk) of high pollen emission. This approach may be needed for fine-scale modelling of grass pollen concentrations at the urban scale.

Conclusion
This study demonstrates that the flowering progression of Dactylis glomerata populations follows a nonnormal distribution, with a skewness towards the beginning of the season. The study shows that several populations are synchronal within an entire region and between years, with key phenological events only differing by a few days. Populations share similar phase distributions between timesteps, highlighting the uniformity of development dynamics throughout each season. Tiller-specific flowering development revealed that stochastic elements are critical in explaining inherent population demographic properties of the flowering process within grass species and that rapid development of the entire population can happen from day to day. The development may be important in forecasting. Markov chain modelling approaches utilized in the study can be further combined with local grass maps and meteorology to develop grass pollen emission models for use by numerical models.
Acknowledgements Thanks are due to Dr Geoffrey Petch for providing assistance during the fieldwork. This research was funded by the European Commission through a Marie Curie Career Integration Grant (Project ID CIG631745 and Acronym SUPREME to CAS) and by the University of Worcester.
Author contributions CAF was involved in conceptualization, methodology, validation, formal analysis, investigation, resources, data curation and writing-original draft. CAF, BAG and CAS were involved in writing-review and editing and visualization. CAS was involved in supervision, project administration and funding acquisition.
Funding This research was funded by the European Commission through a Marie Curie Career Integration Grant (Project ID CIG631745 and Acronym SUPREME to CAS) and by the University of Worcester.

Compliance with ethical standards
Conflicts of interests The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.