International Journal of Biometeorology, Volume 56, Issue 4, pp 707–717

Comparison of regression methods for phenology

  • Adrian Mark Ikin Roberts
Original Paper


Several methods exist for investigating the relationship between phenological records and weather data. These can be broadly classified into models that attempt to incorporate information about underlying biological processes, such as those based on the concept of thermal time, and linear regression methods. The latter are less driven by the biology but have the advantages of ease of use and flexibility. Regression can be used where there is no obvious mechanistic model, or to suggest the form of a mechanistic or empirical model where there are several to choose from. Stepwise regression is commonly used in phenology. However, it requires aggregation of the weather records, resulting in loss of information. Penalised signal regression (PSR) was recently introduced to overcome this weakness. Here, we introduce a further method to the phenology context, called fusion, which is a sparse version of PSR. In this paper, we compare the performance of these three regression methods based on simulations from two types of mechanistic models, the spring warming and sequential models. Given a suitable choice of temperature days as regression covariates, PSR and fusion performed better than stepwise regression for the spring warming model, and PSR performed best for the sequential model. However, if a large number of redundant temperature days were included as covariates, the performance of PSR deteriorated, whereas fusion was quite robust to this change. For this reason, it is best to use PSR and fusion methods in tandem, and to vary the number of covariates included.
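To make the thermal-time idea behind the spring warming model concrete, the following is a minimal sketch rather than the simulation code used in the paper. It assumes the common degree-day form, in which daily warmth above a base temperature is accumulated from a start day until a threshold is reached. The function name `spring_warming_day`, the parameter names and all numerical values are hypothetical and chosen purely for illustration.

```python
def spring_warming_day(daily_temps, base_temp=5.0, threshold=100.0, start_day=0):
    """Return the 0-based index of the first day on which accumulated
    thermal time reaches `threshold`, or None if it is never reached.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    total = 0.0
    for day in range(start_day, len(daily_temps)):
        # Only warmth above the base temperature contributes to development.
        total += max(0.0, daily_temps[day] - base_temp)
        if total >= threshold:
            return day
    return None


# Toy temperature series (made-up values): 30 cold days, then 60 mild days.
temps = [2.0] * 30 + [10.0] * 60
event_day = spring_warming_day(temps)

# A warmer spring should bring the simulated event forward.
warmer_temps = [2.0] * 30 + [12.0] * 60
warmer_event_day = spring_warming_day(warmer_temps)
```

In a simulation study of the kind described, event dates generated in this way for many years of temperature data could serve as the response variable, with the daily temperatures themselves as the regression covariates.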


Keywords: Stepwise regression; Fusion; Fused lasso; Phenology; Smoothing; P-spline regression



This work was funded by the Scottish Government. I am most grateful to Professor Fred Last, who introduced me to phenology. I thank the referees for the valuable comments and suggestions that have helped to improve this paper.



Copyright information

© ISB 2011

Authors and Affiliations

  1. Biomathematics & Statistics Scotland, Edinburgh, UK
