Abstract
A data set consisting of volunteered geographical information (VGI) and data provided by expert researchers monitoring the first bloom dates of lilacs from 1956 to 2003 is used to investigate changes in the onset of the North American spring. It is argued that care must be taken when analysing data of this kind, with particular focus on two issues: the lack of experimental design and Simpson’s paradox. Approaches used to overcome these issues make use of random coefficient modelling and bootstrapping. Once the suggested methods have been employed, the results of the analysis suggest a gradual advance in the onset of spring. A key lesson learned is that the appropriateness of the model calibration technique, given the process of data collection, needs careful consideration.
Notes
Centred on 1980 (close to the midpoint of the 1956 to 2003 time interval), as this reduces rounding error when calibrating the model.
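As a rough illustration of why centring can help, the sketch below compares the condition number of X'X for an intercept-plus-year design using raw years and years centred on 1980. It assumes the centred quantity is calendar year entering the model as a linear predictor; the function name `condition_number` is hypothetical, and the 2-by-2 case is chosen only because its eigenvalues have a closed form. A larger condition number indicates greater sensitivity to rounding error.

```python
def condition_number(xs):
    """Condition number of X'X for a design matrix with columns [1, x]."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    trace = n + sxx
    det = n * sxx - sx * sx  # exact for integer inputs
    # Eigenvalues of a symmetric 2x2 matrix from its trace and determinant.
    lam_max = (trace + (trace * trace - 4 * det) ** 0.5) / 2
    lam_min = det / lam_max  # avoids cancellation in the small eigenvalue
    return lam_max / lam_min


raw_years = list(range(1956, 2004))            # 1956 to 2003 inclusive
centred_years = [y - 1980 for y in raw_years]  # centred on 1980

cond_raw = condition_number(raw_years)
cond_centred = condition_number(centred_years)
print(f"raw: {cond_raw:.3g}, centred: {cond_centred:.3g}")
```

The centred design gives a condition number many orders of magnitude smaller, which is consistent with the note's point about rounding error.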
References
Appleton D, French J, Vanderpump M (1996) Ignoring a covariate: an example of Simpson’s paradox. Am Stat 50(4):340–341
Baayen R, Davidson D, Bates D (2008) Mixed-effects modeling with crossed random effects for subjects and items. J Mem Lang 59:390–412
Caprio J (1957) Phenology of lilac bloom in Montana. Science 126:1344–1345
Carr DB (1991) Looking at large data sets using binned data plots. In: Buja A, Tukey P (eds) Computing and graphics in statistics. Springer, Berlin
Cayan D, Kammerdiener S, Dettinger M, Caprio J, Peterson D (2001) Changes in the onset of spring in the western United States. Bull Am Meteorol Soc 82(3):399–415
Cohn JP (2008) Citizen science: can volunteers do real research? BioScience 58(3):192–197. doi:10.1641/B580303
Coleman D (2010) The potential and early limitations of volunteered geographic information. Geomatica 64(2):27–39
Cooper CB, Dickinson J, Phillips T, Bonney R (2007) Citizen science as a tool for conservation in residential ecosystems. Ecol Soc 12(2):11
Davison A, Hinkley D (1997) Bootstrap methods and their application. Cambridge University Press, Cambridge
Emery W, Baldwin D, Schlüssel P, Reynolds R (2000) Accuracy of in situ sea surface temperatures used to calibrate infrared satellite measures. J Geophys Res 106:2387–2405. doi:10.1029/2000JC000246
Gelfand A, Banerjee S, Sirmans C, Tu Y, Eng Ong S (2007) Multilevel modeling using spatial processes: application to the Singapore housing market. Comput Stat Data Anal 51:3567–3579
Goldstein H (1986) Multilevel mixed linear model analysis using iterative generalised least squares. Biometrika 73:43–56
Goldstein H (1987) Multilevel covariance component models. Biometrika 74:430–431
Goldstein H (1987) Multilevel models in educational and social research. Griffin, London
Goodchild M (2007) Citizens as sensors: the world of volunteered geography. GeoJournal 69(4):211–221. doi:10.1007/s10708-007-9111-y
Haklay M (2010) How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets. Environ Plann B, Plann Des 37(4):682–703
Hand E (2010) Citizen science: people power. Nature 466(7307):685–687. doi:10.1038/466685a
Lister AM, the Climate Change Research Group (2011) Natural history collections as sources of long-term datasets. Trends Ecol Evol 26(4):153–154
Longford N (1993) Random coefficient models. Clarendon Press, Oxford
McCaffrey RE (2005) Using citizen science in urban bird studies. Urban Habitats 3(1):70–86
Menzel A, Fabian P (1999) Growing season extended in Europe. Nature 397:659
Miller-Rushing A, Primack R, Primack D, Mukunda S (2006) Photographs and herbarium specimens as tools to document phenological changes in response to global warming. Am J Bot 93:1667–1674
Myers R, Montgomery D, Vining G, Borror C, Kowalski S (2004) Response surface methodology: a retrospective and literature survey. J Qual Technol 36:53–78
Myers JL, Well A, Lorch RF (2010) Research design and statistical analysis, 3rd edn. Routledge, New York
R Development Core Team (2011) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria
Rayner N, Parker D, Horton E, Folland C, Alexander L, Rowell D, Kent E, Kaplan A (2003) Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century. J Geophys Res 108:4407. doi:10.1029/2002JD002670
Robbirt K, Davy A, Hutchings M, Roberts D (2010) Validation of biological collections as a source of phenological data for use in climate change studies: a case study with the orchid Ophrys sphegodes. J Ecol 99(1):235–241
Schwartz M (1997) Spring index models: an approach to connecting satellite and surface phenology. In: Lieth H, Schwartz M (eds) Phenology in seasonal climates. Backhuys, Netherlands, pp 23–38
Schwartz MD (1994) Monitoring global change with phenology—the case of the spring green wave. Int J Biometeorol 38(1):18–22
Schwartz M (1998) Green-wave phenology. Nature 394(6696):839–840
Schwartz M, Caprio J (2003) North American first leaf and first bloom lilac phenology data. IGBP PAGES/World Data Center for Paleoclimatology Data; Contribution Series # 2003-078; NOAA/NGDC Paleoclimatology Program, Boulder CO, USA
Schwartz M, Reiter B (2000) Changes in North American spring. Int J Climatol 20(8):929–932
Simpson E (1951) The interpretation of interaction in contingency tables. J R Stat Soc, Ser B Stat Methodol 13(2):238–241
The Guardian (2011) Spring’s here: skylarks overhead, moles in the garden, moths in the bathroom. URL http://www.guardian.co.uk/environment/2011/mar/27/spring-wildlife-black-mountains-wales
The Guardian (2011) Weatherwatch: phenology in the UK. URL http://www.guardian.co.uk/news/2011/apr/11/weatherwatch-phenology
Tobler WR (1970) A computer movie simulating urban growth in the Detroit region. Econ Geogr 46:234–240
USA National Phenology Network (2011) History of lilac and honeysuckle phenological observations in the USA. http://www.usanpn.org/?q=node/36
van Oort P, Zhang T, de Vries M, Heinemann A, Meinke H (2012) Correlation between temperature and phenology prediction error in rice (Oryza sativa L.). Agric For Meteorol 151(12):1545–1555
Wagner CH (1982) Simpson’s paradox in real life. Am Stat 36(1):46–48. URL http://www.jstor.org/stable/2684093
Appendix: Computational considerations
In this section, some more detail is supplied about the software tools and techniques that were used to carry out this analysis. All of the statistical modelling was carried out using the R statistical programming language [25]. In particular, the random coefficient models were calibrated using the lme4 package.
The functions supplied in the R base library and lme4 were sufficient for all of the computations, except for the standard errors associated with the $\tau_j$ values and $\Delta$. For these, a regression bootstrap approach as set out in [9] was used. Briefly, this estimates the sampling variation of parameters of interest by simulating data sets drawn from the model that is being fitted to the data (in this case the model given by Eq. 8). The sampling variation simulated is just that due to the variability in $\varepsilon_{ij}$: rather than randomly assigning new values for the $\tau_j$'s and $\upsilon_i$'s for each simulated sample, it is assumed they are fixed at the estimated values. By simulating a large number of data sets in this way (say 1000, as in this paper), and applying the random coefficient estimation function supplied by lme4 to each simulated data set, an estimate of the sampling variability of the $\tau_j$'s is obtained.
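The bootstrap logic described above can be sketched in a few lines. The example below is only an illustration under simplifying assumptions, not the analysis code used in the paper: ordinary least squares for a straight line stands in for the lme4 random coefficient fit, the data are synthetic, and the names `fit_line` and `bootstrap_slope_se` are hypothetical. What it retains from the description is the structure: the fitted coefficients are held fixed at their estimates, only the residual term is re-simulated, the model is refitted to each of 1000 simulated data sets, and the spread of the refitted estimates is taken as the standard error.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, (rss / (n - 2)) ** 0.5


def bootstrap_slope_se(x, y, n_sim=1000, seed=1):
    """Model-based bootstrap: simulate from the fitted model and refit."""
    rng = random.Random(seed)
    a, b, sd = fit_line(x, y)
    # Fitted values stay fixed; only the residual noise is redrawn.
    fitted = [a + b * xi for xi in x]
    slopes = []
    for _ in range(n_sim):
        y_sim = [f + rng.gauss(0.0, sd) for f in fitted]
        slopes.append(fit_line(x, y_sim)[1])
    # Standard deviation of the refitted slopes estimates the standard error.
    return b, statistics.stdev(slopes)


# Synthetic example (assumed values): bloom day of year against centred year.
data_rng = random.Random(0)
years = [yr - 1980 for yr in range(1956, 2004)]
bloom = [130.0 - 0.1 * t + data_rng.gauss(0.0, 5.0) for t in years]

slope, slope_se = bootstrap_slope_se(years, bloom)
print(f"slope = {slope:.3f}, bootstrap SE = {slope_se:.3f}")
```

For this simple stand-in model the bootstrap standard error can be checked against the usual analytic formula; for quantities such as year effects in a random coefficient model no equally convenient formula is available from the fitting software, which is the situation in which the simulation approach is needed.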
Cite this article
Brunsdon, C., Comber, L. Assessing the changing flowering date of the common lilac in North America: a random coefficient model approach. Geoinformatica 16, 675–690 (2012). https://doi.org/10.1007/s10707-012-0159-6
DOI: https://doi.org/10.1007/s10707-012-0159-6