Weighted averaging partial least squares regression (WA-PLS): an improved method for reconstructing environmental variables from species assemblages

Hydrobiologia

Abstract

Weighted averaging regression and calibration form a simple, yet powerful method for reconstructing environmental variables from species assemblages. Based on the concepts of niche-space partitioning and ecological optima of species (indicator values), it performs well with noisy, species-rich data that cover a long ecological gradient (>3 SD units). Partial least squares regression is a linear method for multivariate calibration that is popular in chemometrics as a robust alternative to principal component regression. It successively selects linear components so as to maximize predictive power. In this paper the ideas of the two methods are combined. It is shown that the weighted averaging method is a form of partial least squares regression applied to transformed data that uses the first PLS-component only. The new combined method, weighted averaging partial least squares, consists of using further components, namely as many as are useful in terms of predictive power. The further components utilize the residual structure in the species data to improve the species parameters (‘optima’) in the final weighted averaging predictor. Simulations show that the new method can give a 70% reduction in prediction error in data sets with low noise, but only a small reduction in noisy data sets. In three real data sets of diatom assemblages collected for the reconstruction of acidity and salinity, the reduction in prediction error was zero, 19% and 32%.
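The weighted averaging step that the abstract takes as its starting point can be sketched in a few lines. The following is a minimal illustration of classical weighted averaging regression and calibration with an inverse deshrinking regression, that is, only the one-component case; it does not implement the further WA-PLS components, whose algorithm is not given in the abstract. The function names and the toy data are hypothetical, and the sketch assumes every species occurs in at least one training site and every sample contains at least one species.

```python
def wa_optima(abundances, env):
    """WA regression: each species optimum is the abundance-weighted
    mean of the environmental variable over the training sites."""
    n_species = len(abundances[0])
    optima = []
    for k in range(n_species):
        total = sum(site[k] for site in abundances)
        weighted = sum(site[k] * x for site, x in zip(abundances, env))
        optima.append(weighted / total)
    return optima


def wa_infer(abundances, optima):
    """WA calibration: each sample's initial estimate is the
    abundance-weighted mean of the species optima."""
    return [
        sum(a * u for a, u in zip(sample, optima)) / sum(sample)
        for sample in abundances
    ]


def fit_deshrinking(initial, env):
    """Inverse deshrinking: least-squares line of the observed values
    on the initial estimates, which undoes the shrinkage towards the
    mean caused by averaging twice."""
    n = len(env)
    mean_i = sum(initial) / n
    mean_x = sum(env) / n
    sxx = sum((i - mean_i) ** 2 for i in initial)
    sxy = sum((i - mean_i) * (x - mean_x) for i, x in zip(initial, env))
    slope = sxy / sxx
    return mean_x - slope * mean_i, slope


# Hypothetical toy training set: 4 sites x 3 species, with observed pH.
train_abund = [[10, 0, 0], [5, 5, 0], [0, 5, 5], [0, 0, 10]]
train_env = [4.0, 5.0, 6.0, 7.0]

optima = wa_optima(train_abund, train_env)
initial = wa_infer(train_abund, optima)
b0, b1 = fit_deshrinking(initial, train_env)

# Reconstruct the variable for a new (e.g. fossil) assemblage.
fossil = [[2, 6, 2]]
inferred = [b0 + b1 * v for v in wa_infer(fossil, optima)]
print(inferred)
```

In this toy example the initial estimates span a narrower range than the observed values, which is why a deshrinking step is applied before the estimates are used.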

Cite this article

ter Braak, C.J.F., Juggins, S. Weighted averaging partial least squares regression (WA-PLS): an improved method for reconstructing environmental variables from species assemblages. Hydrobiologia 269, 485–502 (1993). https://doi.org/10.1007/BF00028046

  • DOI: https://doi.org/10.1007/BF00028046
