Introduction

Ecosystems can be thought of as an almost infinite network of interactions among biotic and abiotic components balanced between internal and external driving factors. In a stable ecosystem the interactions are in balance, but when they become unbalanced the character of the ecosystem will change. The change may be small or substantial and may occur suddenly in a short time or slowly over an extended period. A rapid change occurring in the present may be monitored by regular observations. However, many changes have been proceeding over a long period before observation was possible, and some rapid and extensive changes have occurred far back in the past. In order to study the dynamics of these ecosystems we have to look back into the past by using the record of changes in fossil organisms and sediment characteristics (‘proxy’ data) to reconstruct past ecosystems and biotic responses. Because of the complex network of interactions throughout the ecosystem, it is desirable to study as many proxies as possible in order to gain a wider overview of the situation than could be acquired from a single proxy (Smol 2002; NRC 2005). Such an investigation is called a multi-proxy study. In this essay about multi-proxy studies we shall concentrate on lake-sediment studies (palaeolimnology) in temperate areas, although one should be aware that successful multi-proxy studies have been carried out on peats (e.g. Booth and Jackson 2003; Pancost et al. 2003; Booth et al. 2004; Chambers and Charman 2004; Charman and Chambers 2004; Mighall et al. 2004), dendrochronological series (e.g. McCarroll et al. 2003), archaeological sites (e.g. Clark 1954; Wasylikowa et al. 1985; Davies et al. 2004; Selby et al. 2005), salt-marsh sediments (e.g. Gehrels et al. 2001), freshwater-marsh sediments (e.g. Finkelstein et al. 2005) and marine sediments (e.g. Andersson et al. 2003; Oldfield et al. 2003a; Risebrobakken et al. 2003; Haug et al. 2005), and in tropical (e.g. Verschuren et al. 2000; Vélez et al. 2005) and extreme polar (e.g. Birks et al. 2004; Hodgson et al. 2005) environments.

The earliest multi-proxy studies, reviewed by Wright (1966) and Birks and Birks (1980), used the palaeolimnological record to test ideas of lake ontogeny and biotic responses over time to external perturbations and internal processes. Although these studies used selected taxa and proxies and there was little or no statistical or numerical analysis, they provided elegant and carefully argued narratives, emphasising limnological processes and the role of catchment changes on lake dynamics. They are major contributions and in many ways they present a challenge to palaeolimnologists today to make further advances in our understanding of lake development and dynamics (Deevey 1984; Likens 1985). In palaeolimnological studies these days, a multi-proxy approach is the norm, but the aims of investigating ecosystem dynamics have turned more towards the reconstruction of past environments and climate changes (Lotter 2003). The synthesis of multi-proxy results in successful studies exceeds the sum of the component parts. However, as knowledge and experience expand, problems have become apparent in the use of some of these component parts for ecosystem reconstruction.

Extensive and detailed reviews of multi-proxy studies in palaeolimnology and palaeoecology include Wright (1966), Birks and Birks (1980), Delcourt and Delcourt (1991), Smol (2002), Cohen (2003), Lotter (2003), Pienitz et al. (2004) and NRC (2005). The four volumes on palaeolimnological methods edited by Last and Smol (2001a, b) and Smol et al. (2001a, b) provide detailed accounts of the full range of field and analytical techniques currently available in palaeolimnology.

The essential aspect of any multi-proxy study is that several proxies are used simultaneously to address the aims of the project. The methods used will, of course, be related to the research question under investigation. The study of lake sediments can be directed towards reconstructions of the aquatic environment and/or of the terrestrial catchment of the lake, even including the regional landscape beyond the catchment. The factors or processes behind the reconstructed changes (patterns) in the lake ecosystem can be sought in terms of causal processes such as changes in climate, both temperature and precipitation, or human activity that affect most aspects of lake ecosystem functioning. Often, more specific questions are asked concerning both natural and human-induced changes in lake-water quality and catchment characteristics, especially changes in vegetation and the catchment that affect the lake either directly or indirectly (Birks et al. 2000; Lotter and Birks 2003).

The results of a multi-proxy study are usually presented and discussed in a descriptive or narrative way (Birks 1993a), using all the lines of evidence to reconstruct various aspects of the past ecosystem and to deduce the range of changes it has undergone. The value of any multi-proxy study clearly rests on the reliability of the proxies used to reconstruct the past environmental conditions. Different proxies reflect different environmental factors at a range of spatial scales and consequently show different strengths and weaknesses. By combining proxies, strengths can be exploited and weaknesses can be identified (Mann 2002). However, weaknesses exposed by multi-proxy studies should not be ignored. They demonstrate shortcomings in methodology and resolution, limitations in taxonomic identifications, lack of understanding of the taphonomy of fossils, and gaps in our knowledge of the relationships of proxies, both biological and physical, to environmental factors. Thus important new lines of research may be stimulated.

There have been many major advances in palaeolimnology in the last 25 years, as reviewed by Smol (2002). In the context of multi-proxy studies discussed here there have been at least six major areas of development: (1) the study of new proxies such as stable isotopes, near-infrared spectroscopy, organic chemistry and bio-markers, chironomids and organic contaminants; (2) improved chronological tools including the discovery of lakes with annually laminated sediments, improvements in 14C dating, 14C calibration and 210Pb dating, and the development of other dating techniques; (3) increasing use of quantitative methods for summarising patterns in complex stratigraphical data and for deriving transfer functions to reconstruct quantitatively past environmental variables from biological proxy data; (4) an increase in fine-resolution studies, often utilising laminated sediments; (5) increasing concern for careful and rigorous project design, site selection, and hypothesis testing; and (6) an emphasis, perhaps an over-emphasis, on palaeoenvironmental reconstructions, with a corresponding neglect of lake biotic responses to changing internal or external factors, of lake dynamics and processes, and of the underlying biology and ecology of the organisms preserved as proxy records in lake sediments.

The aim of this essay is to outline some of the methodological and conceptual aspects and challenges of multi-proxy studies in palaeolimnology. It makes no attempt to be exhaustive and inevitably reflects our personal interests and biases, particularly towards quantitative approaches and to recent (last 100–300 years), Holocene, and late-glacial palaeolimnology. It also reflects our research experiences in temperate areas, and our collaborations with colleagues in the UK, Fennoscandia, USA, Canada, The Netherlands, and Switzerland.

Basic requirements and challenges for a multi-proxy study

  1.

    As in any scientific investigation, clear research questions are needed at the outset that the study aims to address. This is especially important in multi-proxy studies as they inevitably involve several scientists collecting a large amount of data. This process is often very time-consuming and therefore expensive in effort and resources.

  2.

    A good leader is required, with effective communication and co-ordination skills, a broad knowledge, a flexible approach, and an enthusiasm and determination to synthesise and publish the results. Multi-proxy studies accumulate large amounts of data (e.g. an estimated 25,000 data points were collected in the Kråkenes Project; Birks et al. 1996, 2000; Birks and Wright 2000). Thus the project has to be carefully planned and co-ordinated from the outset so that all the data are available to all the participants at the synthesis and writing-up stages. A major benefit from a well co-ordinated study is that all the participating scientists are involved and cross-disciplinary links and collaboration can be established.

  3.

    Because so much work goes into a multi-proxy study it is vital that the site or sites for investigation are chosen in locations that will potentially provide answers to the original aims of the project. Once a site is chosen, the collection of the sediments must be done in the most careful and precise way possible from an appropriate place in the lake. It is vastly preferable to undertake all the analyses on one core, as precise correlations can then be made between proxy records. It is therefore worth spending time on site selection, establishing the basic morphometry and sediment stratigraphy of the basin, and on obtaining continuous large-diameter (10–11 cm) cores (e.g. Nesje 1992) or a series of overlapping large-diameter cores (e.g. Cushing and Wright 1965). Such cores usually provide enough material for the majority of analyses to be performed, but perhaps not enough for studies of fossil beetles or some organic bio-markers. If more than one core is required, for example a central core in deep water and a littoral core in shallow water, or a transect of cores, then the cores should be correlated as precisely as possible. This can be done using sediment lithology and comparison of percent loss-on-ignition or magnetic susceptibility measurements. If several lakes are to be investigated, the cores can be correlated using dating techniques (14C, 210Pb) or by correlating events in a regional record such as anemophilous pollen, tephra, or atmospheric contaminants such as sphaeroidal carbonaceous particles.

    In practice there are three sampling and analytical situations in a multi-proxy study: (1) the ‘ideal’ situation where all the analyses of the various proxies are made at the same levels in the same core, (2) the ‘worst’ situation where the analyses are made at different levels in two or more cores from the same part of the basin, and (3) the ‘compromise’ situation where different proxies are studied at different levels but in the same core. To help alleviate the ‘worst’ situation, reliable core correlation can often be achieved using sequence-slotting procedures (Birks and Gordon 1985; Thompson and Clark 1989) or other numerical procedures (Kovach 1993) with percent loss-on-ignition [percentage weight loss after burning at 550 or 900°C], magnetic susceptibility, and other sedimentary variables as the basis for core comparison and correlation. In the ‘compromise’ situation of sampling different levels in the same core, it may be necessary to interpolate the different data sets to a constant sampling interval or temporal resolution to permit various types of time-series analysis (Birks 1998) and to allow comparisons between different proxies. A wide range of interpolation procedures is available (Davis 2002; Weedon 2003). They all inevitably result in some loss of information and temporal resolution. The interpolation approach adopted depends very much on the research questions under study.
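The ‘compromise’ situation can be illustrated with a minimal sketch, assuming simple linear interpolation onto a shared depth grid. The function name, the depth grid, and the two records are hypothetical; linear interpolation is only one of the many procedures referred to above and is not necessarily the one best suited to a given research question.

```python
from bisect import bisect_right

def interpolate_to_grid(depths, values, grid):
    """Linearly interpolate one proxy record onto a shared depth grid.

    depths must be strictly increasing. Grid points outside the sampled
    range are returned as None rather than extrapolated.
    """
    out = []
    for g in grid:
        if g < depths[0] or g > depths[-1]:
            out.append(None)
            continue
        i = bisect_right(depths, g) - 1
        if i == len(depths) - 1:  # g coincides with the deepest sample
            out.append(values[-1])
            continue
        frac = (g - depths[i]) / (depths[i + 1] - depths[i])
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out

# Hypothetical records from the same core, sampled at different levels (cm).
loi_depth, loi = [0, 4, 8, 12], [10.0, 14.0, 12.0, 20.0]
chir_depth, chir_temp = [1, 6, 11], [11.0, 12.0, 9.5]

grid = [2, 4, 6, 8, 10]
loi_on_grid = interpolate_to_grid(loi_depth, loi, grid)
chir_on_grid = interpolate_to_grid(chir_depth, chir_temp, grid)
```

Both records now share the same levels and can be compared directly, at the cost of smoothing away any variation between the original samples.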

  4.

    Because so many data are collected, it is important to store and co-ordinate them efficiently. A multi-proxy relational data-base (e.g. Juggins 1996) ensures compatibility and consistency between data types and provides a rapid and effective means of bringing together, comparing, and cross-correlating different proxy records within and between cores. It provides archival and research tables of, for example, basic core data, physical and chemical variables, biological data, chronological information, age-depth model results, and correlations. A data-base allows rapid retrieval of data and provides the basis for subsequent data manipulation and output for further analysis.
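As a rough illustration of what such a relational structure might look like, the sketch below builds a three-table data-base in memory with Python's built-in sqlite3 module. The table names, column names, and values are all hypothetical and are not taken from any published data-base design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE core (core_id TEXT PRIMARY KEY, site TEXT, water_depth_m REAL);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    core_id   TEXT NOT NULL REFERENCES core(core_id),
    depth_cm  REAL NOT NULL,
    UNIQUE (core_id, depth_cm)
);
CREATE TABLE measurement (
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    proxy     TEXT NOT NULL,   -- e.g. 'sediment', 'chironomids'
    variable  TEXT NOT NULL,   -- e.g. 'loi_550_pct'
    value     REAL NOT NULL
);
""")

# Hypothetical core, samples, and measurements.
conn.execute("INSERT INTO core VALUES ('K1', 'Example Lake', 12.5)")
conn.executemany("INSERT INTO sample VALUES (?, 'K1', ?)", [(1, 10.0), (2, 20.0)])
conn.executemany("INSERT INTO measurement VALUES (?, ?, ?, ?)", [
    (1, "sediment", "loi_550_pct", 12.0),
    (1, "chironomids", "head_capsules", 85.0),
    (2, "sediment", "loi_550_pct", 18.5),
])

# Bring all proxies from one core together on a common depth basis.
rows = conn.execute("""
    SELECT s.depth_cm, m.proxy, m.variable, m.value
    FROM measurement AS m JOIN sample AS s USING (sample_id)
    WHERE s.core_id = 'K1'
    ORDER BY s.depth_cm, m.proxy
""").fetchall()
```

Keeping every measurement keyed to a sample, and every sample keyed to a core, is what allows different proxy records to be retrieved and compared level by level.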

  5.

    For almost all multi-proxy studies a reliable chronology is essential. This is usually provided by high-resolution radiocarbon dating, preferably AMS 14C dating of carefully determined terrestrial plant material (e.g. Gulliksen et al. 1998). Recent sediments can be dated by the 210Pb method and associated radiometric techniques involving 137Cs and 241Am (e.g. Appleby 2004) and age-depth models of recent peat profiles have been made by 14C dating (Goslar et al. 2005). In rare instances, lake sediments may be annually laminated and an absolute, or at least a ‘floating’ absolute, chronology can be established (Bradbury and Dean 1993; Anderson et al. 1995, 1996; Ralska-Jasiewiczowa et al. 1998, 2003; Lotter 1999, 2001; Smith et al. 2004) and used to establish rates of compositional change in different proxies (e.g. Lotter et al. 1992), to detect decadal or even annual environmental changes (e.g. Smith et al. 2004), and to infer catchment–lake interactions at a decadal scale (Anderson et al. 1995, 1996).

  6.

    Clear presentation of the wealth of results from a multi-proxy study is necessary. An important first step, essential if the data-sets are from different cores, is to establish age-depth models for each core, so that all the data can be plotted on a comparable age basis. Calibration of radiocarbon dates into calendar years is needed to provide a linear age scale into which other chronologies (e.g. 210Pb) can be combined. Techniques for radiocarbon calibration (e.g. Buck and Millard 2004) and the underlying radiocarbon calibration data-sets (Reimer et al. 2005) are continually evolving. There are many approaches to age-depth modelling (e.g. Bennett 1994; Telford et al. 2004b; Heegaard et al. 2005), all with strengths and weaknesses. The limiting factor of all age-depth models is the number and reliability of the available radiocarbon or other types of dates (Telford et al. 2004b).

    Once a robust and realistic age-depth model is established, the variables from the core(s) can be plotted stratigraphically using computer software such as TILIA, TILIA.GRAPH, and TGView (Grimm 1991–2004) or PDP (Palaeo Data Plotter, Juggins 2002), now superseded by C2 (Juggins 2003). These programs allow stratigraphical variables with different sampling intervals to be plotted on a common depth or age basis (e.g. Oldfield 1996; Oldfield et al. 2003a) or stratigraphical variables with different sampling intervals and from different sites or cores to be plotted on a common age basis (Oldfield 1996).
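The simplest possible age-depth model, a straight line fitted by least squares to calibrated dates, can be sketched as follows. This is an illustration only: the dated levels and ages are hypothetical, the function name is ours, and the fit ignores the dating uncertainties that the more realistic approaches cited above take into account.

```python
def fit_linear_age_depth(depths, ages):
    """Least-squares straight line: age = intercept + slope * depth."""
    n = len(depths)
    mean_d = sum(depths) / n
    mean_a = sum(ages) / n
    sxx = sum((d - mean_d) ** 2 for d in depths)
    sxy = sum((d - mean_d) * (a - mean_a) for d, a in zip(depths, ages))
    slope = sxy / sxx                      # years per cm
    intercept = mean_a - slope * mean_d    # modelled age at 0 cm
    return intercept, slope

# Hypothetical dated levels (cm) and calibrated ages (cal yr BP).
dated_depths = [0, 50, 100, 150]
dated_ages = [0, 1100, 1900, 3000]

intercept, slope = fit_linear_age_depth(dated_depths, dated_ages)

# Assign modelled ages to undated sample levels so that proxies from
# different cores can be plotted on a common age axis.
sample_depths = [20, 80]
sample_ages = [intercept + slope * d for d in sample_depths]
```

With only four dates the fitted line is entirely at the mercy of each one, which is the practical meaning of the statement that the number and reliability of dates limit any age-depth model.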

  7.

    Numerical techniques for detecting the major patterns of variation in a range of stratigraphical data, often consisting of a large number of variables, and for summarising the main stratigraphical patterns are an invaluable tool in synthesising data in multi-proxy studies. The major numerical techniques are reviewed by Birks (1998). A valuable philosophical concept is the principle of parsimony and hence the statistical concept of the ‘minimal adequate model’ in numerical analysis and model selection (Crawley 1993).

    There are three classes of numerical techniques that are useful for analysing multivariate multi-proxy stratigraphical data. Independent zonations of different stratigraphical proxies (e.g. pollen, diatoms, chironomids, sediment geochemistry) using Gordon's (1982) optimal, non-hierarchical partitioning (see also Birks and Gordon 1985) and subsequent comparisons of the various partitionings with the broken-stick model (Bennett 1996) will detect the minimal number of potentially ‘significant’ zones. Zonation schemes based on different proxies can then be compared visually (e.g. Lotter and Birks 2003) or statistically (e.g. Gardiner and Haedrich 1978).

    Sequence-splitting (Walker and Wilson 1978; Walker and Pittelkow 1981) is a potentially valuable tool for summarising multi-proxy data. It was developed for pollen-stratigraphical data and it has not, as far as we know, been applied in palaeolimnology. It ‘zones’ each stratigraphical variable (like individual pollen taxa) into sections with distinct but homogeneous means and standard deviations. The statistical significance of each split is tested (Walker and Wilson 1978) and the occurrence of all splits in time can be tested statistically (Gardiner and Haedrich 1978). Birks and Gordon (1985) discuss the approach in detail and Birks and Line (1994) present a palaeoecological application involving statistical testing within and between sequences. The procedure requires statistically independent stratigraphical variables (like accumulation rates). A simple way of transforming relative percentage stratigraphical data into independent variables is to represent them as principal component or correspondence analysis axes that are, by definition, orthogonal and uncorrelated. Each axis can be used as a variable in sequence splitting. The temporal occurrence of significant splits in the data can then be compared with the occurrence of splits in other data-sets from the same core, thereby identifying consistent periods of change in different individual proxy variables.

    Ordination techniques (e.g. principal components analysis, correspondence analysis) provide valuable summaries of the major stratigraphical patterns in a particular palaeolimnological variable (e.g. diatoms), particularly when the sample scores on ordination axes 1, 2, etc. are plotted stratigraphically. Such plots (e.g. Ammann et al. 2000; Birks et al. 2000; Birks and Birks 2001; Lotter and Birks 2003) highlight the major patterns of variation, and illustrate and summarise the nature of the temporal changes. It is most parsimonious to consider only those ordination axes that are statistically significant, namely that have eigenvalues larger than expected under the broken stick model (Jolliffe 2002; Jackson 1993). For biological data, detrended correspondence analysis (DCA) (Hill and Gauch 1980) is preferable because the sample scores are scaled in ‘standard deviation’ units of compositional change or turnover (β-diversity). It is thus possible to obtain a graphical summary of the magnitude of compositional change within a stratigraphical proxy (like chironomids) and between stratigraphical proxies (e.g. chironomids, pollen, diatoms) from the same stratigraphical sequence (e.g. Birks et al. 2000; Birks and Birks 2001). When interest is focussed on the magnitude of compositional change in a group of organisms over a specific time interval between sites, a series of constrained DCCAs (= detrended canonical correspondence analysis with detrending by segments and non-linear rescaling: ter Braak 1986) using sample age as the constraining variable can be made and the estimates of compositional change for the time interval at each site can be mapped and compared (e.g. Smol et al. 2005).
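The broken-stick comparison used above, both for zonations and for ordination axes, can be sketched in a few lines. Under the broken-stick model the expected proportion of total variance for the k-th of p components is (1/p) Σ 1/i, summed from i = k to p. The eigenvalues below are hypothetical, and the rule of stopping at the first axis that fails the test is one common convention that we assume here.

```python
def broken_stick(p):
    """Expected proportions of total variance for p components."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]

def n_significant_axes(eigenvalues):
    """Count leading axes whose share of variance exceeds the expectation."""
    total = sum(eigenvalues)
    expected = broken_stick(len(eigenvalues))
    count = 0
    for ev, exp in zip(eigenvalues, expected):
        if ev / total <= exp:
            break  # stop at the first axis that does no better than chance
        count += 1
    return count

# Hypothetical eigenvalues from an ordination of one proxy record.
eigenvalues = [4.0, 2.5, 0.8, 0.4, 0.3]
n_axes = n_significant_axes(eigenvalues)
```

Here the first two axes account for 50% and 31% of the variance against broken-stick expectations of about 46% and 26%, so only their sample scores would be plotted stratigraphically.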

  8.

    Interpretation and publication of the large amounts of data resulting from a multi-proxy study are major challenges.

Firstly, there are often too many data to assimilate readily and it is here that numerical techniques for data summarisation are at their most useful and powerful (see above, Birks 1998; Bradshaw et al. 2005a).

Secondly, there is the challenge to avoid the natural tendency to believe that one type of proxy is, in some way, more reliable or more informative than another proxy record, and hence to give subconsciously greater weight to some proxies than to others.

Thirdly, it is a major challenge to avoid the ‘reinforcement syndrome’ (Watkins 1971; Thompson and Berglund 1976; Bennett 2002) or the tendency to adopt a ‘confirmatory’ approach where data interpretation is forced to fit into a particular favoured paradigm or stratigraphical sequence of environmental changes. This syndrome was articulated in the field of palaeomagnetism when Watkins (1971) wrote “It is infinitely more difficult, if not impossible, to prove that a given magnetic field behaviour has not taken place, than to ‘show’ it has occurred. Superimposed on this is an important human element: it is far more reasonable to generate the energy and the belief (? faith) required for publication of data confirming a discovery than to publish more negative data of a pedestrian nature. Thus the initial discovery is reinforced.” In palaeolimnology there is a tendency to try to match small changes in proxy data-sets (‘signal’) to fit or to confirm the current paradigm or model and to ignore other, perhaps equally large, changes as ‘noise’. To avoid the reinforcement syndrome it is important to let the data speak for themselves. Lotter et al. (1995) and Ammann et al. (2000) provide striking examples where numerical techniques helped the data to speak for themselves. The relative sensitivities of different proxies were revealed and the presence or absence of lags in biotic response to rapid climate change could be assessed. An invidious effect of the reinforcement syndrome is so-called publication bias (Möller and Jennions 2001; Meiri et al. 2004) where only confirmatory results are published, especially in so-called ‘high-impact journals’ and non-confirmatory results are published in other journals or, worst of all, are never published. 
As Watkins (1971) noted, “it would be instructive to compile examples of other applications of this ‘reinforcement syndrome’ to see if there are any natural laws governing the blossoming or survival of possibly spurious, or at least only partially correct, observations or ideas.” Examples of this syndrome may exist in the palaeoclimatological literature concerning, for example, cycles or periodicities in Holocene climatic change, the global extent of rapid and short-lived climatic changes in the Late-glacial and early Holocene, and the early Holocene thermal maximum.

Fourthly, a potentially rewarding approach to the interpretation of large amounts of multi-proxy data is so-called ‘data-splitting’. One proxy (e.g. pollen) may be used to reconstruct mean July air temperature and this reconstruction is then used to interpret stratigraphical changes in another independent proxy (e.g. chironomids, diatoms) in terms of biotic responses to climate change (e.g. Ammann 1989a, b, 2000). Lotter and Birks (2003) adopted this approach in their interpretation of Holocene multi-proxy data at Sägistalsee. Plant macrofossil data were used to reconstruct catchment vegetation, and these reconstructions were then used, along with insolation and other independent climate proxies, as ‘predictors’ in statistical modelling to see which ‘predictors’ best explained, in a statistical sense, the observed changes in five different types of limnological variables (chironomids, cladocera, sediment geochemistry, sediment magnetics, and sediment grain-size). This hypothesis-testing approach is relatively new and has great potential for future research development. It is a powerful way of testing ideas and it should be undertaken more widely in the future (Birks 1993a, b, 1996, 1998), in an attempt to test hypotheses about the possible processes driving biotic and lake-ecosystem changes. Ammann et al. (2000) used the oxygen-isotope stratigraphy from late-glacial sediments as a record of climate change against which observed biotic changes (pollen, chironomids, cladocera, beetles, plant macrofossils) could be compared and evaluated in terms of lags in response to rapid climate change. Other examples of this ‘data-splitting’ approach as an effective means of using one or more proxy types to help interpret the observed changes in another proxy type include Seppä and Weckström (1999), Seppä et al. (2002), Heiri et al. (2003), and Shuman et al. (2004).

Fifthly, it is a major challenge not only to interpret and synthesise the results from a multi-proxy study in as fair and as objective a way as possible, but also to write up the results and to publish synthesis papers, which are, by their very nature, often rather complex and long. There is a tendency today towards the publication of more and more short papers. This has the disadvantage that papers can easily be overlooked because of the ever-increasing number of publications and ‘information-overload’ for readers. A reader can become frustrated when, for example, environmental reconstructions from the same core but based on different numerical methods or calibration data-sets or on different types of proxies are published in different journals, and presented and plotted in different ways and on different scales. If the potential of multi-proxy studies is to be maximised, it is essential that the results be synthesised in a common format presenting points of similarity and points of difference, the potential strengths and weaknesses, and the different potential sensitivities of different proxies, so that their contribution to the conclusions can then be evaluated. It will be vastly more interesting to discuss apparent contradictions in interpretation as these will raise important questions about the proxies, what aspects of the environment they may be reflecting and responding to, and how to interpret them. Contradictions or anomalies also raise important and productive research questions concerning the appropriate use of calibration data-sets and the limitations of our existing ecological, environmental and limnological understandings (see Bigler et al. 2002; Rosén et al. 2003). 
Palaeoceanographers now recognise that different proxies (diatoms, planktonic foraminifera, benthic foraminifera, sediment grain-size, chemical ratios and stable isotopes) reflect different aspects of the ocean system in terms of stratification, currents, and rates of overturn (Andersson et al. 2003; Risebrobakken et al. 2003). Palaeolimnologists could, with profit, adopt a similar approach in their interpretation of multi-proxy data.

The complex nature of proxy data

The essential feature of multi-proxy studies is that several stratigraphical proxies are used to investigate a common aim. Each proxy takes its own unique place in the ecosystem network and may be used to reconstruct different facets of the ecosystem. Besides the standard much-used proxies, new techniques and proxies are continually being developed, often for specific purposes. Rather than trying to discuss all the various types of proxies available in palaeolimnology, we illustrate the complexities of deriving reliable and robust palaeoenvironmental inferences in multi-proxy studies by focusing on the interpretation of a commonly used physical proxy, namely sediment loss-on-ignition, and on the interpretation of biological proxies using transfer functions.

Physical proxies

Percent loss-on-ignition (% LOI) is the most widely used and perhaps the most useful, simple, physical proxy in palaeolimnology. It reflects the proportion of organic carbon, carbonate, and mineral matter in the sediment (Dean 1974; Boyle 2004). Loss-on-ignition at 550°C (Heiri et al. 2001) has been found to be a remarkably good summarising proxy for many changes in a lake ecosystem (e.g. Levesque et al. 1994; Birks et al. 2000; Battarbee et al. 2001, 2002). However, it is a percentage, and thus an increase can reflect an absolute increase in organic matter or an absolute decrease in mineral matter, or some combination of both. In addition, organic and mineral matter can both originate in the lake (bioproduction, biogenic silica and carbonate) and/or in the catchment (bioproduction, humus or mineral inwash due to catchment instability). Thus % LOI is a simple measurement that can have a complex interpretation (Shuman 2003). Livingstone et al. (1958) were the first to realise this, but few absolute estimates of organic accumulation have been made. Recently, Velle et al. (2005a) estimated the rates of accumulation of organic and mineral matter at Råtåsjoen, central Norway, and were able to interpret changes in % LOI as processes related to early Holocene increased lake productivity and decreased mineral inwash resulting from stabilisation and vegetation of the catchment. Maximum organic matter deposition occurred around 5000 cal b.p. and was related to the climate-induced loss of trees from the catchment. The organic matter stored in the soils was released and washed into the lake. Velle et al. (2005a) were also able to show that % LOI was not related to diatom productivity of biogenic silica. It was slightly correlated with Holocene temperature changes as deduced from the chironomid record, but the organic matter accumulation rate was not. 
The absolute amount of carbon in the sediments was related much more strongly to changes in catchment vegetation, as deduced from the plant macrofossil and pollen records. The predominant catchment origin of organic matter in sediments in upland lakes was proposed long ago by Mackereth (1965, 1966), and has been elegantly confirmed by whole-lake additions of 13C (Pace et al. 2004).
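The ambiguity of % LOI as a percentage can be made concrete with a small worked example. The sketch assumes that the accumulation rate of a component is the product of sediment accumulation rate, dry bulk density, and the component's mass fraction, and that everything not lost on ignition at 550°C is treated as ‘mineral’ (so it includes any carbonate). The function name and all numbers are hypothetical and are not taken from any of the studies cited.

```python
def accumulation_rates(sed_rate_cm_yr, dry_bulk_density_g_cm3, loi_percent):
    """Return (organic, mineral) accumulation rates in g cm^-2 yr^-1."""
    bulk = sed_rate_cm_yr * dry_bulk_density_g_cm3  # total dry-mass flux
    organic = bulk * loi_percent / 100.0
    mineral = bulk - organic                        # residue after ignition
    return organic, mineral

# Hypothetical older interval: 10% LOI, faster sedimentation.
org_a, min_a = accumulation_rates(0.10, 0.50, 10.0)

# Hypothetical younger interval: 20% LOI, slower sedimentation.
org_b, min_b = accumulation_rates(0.05, 0.50, 20.0)
```

In this example % LOI doubles from 10 to 20 although the organic accumulation rate is identical in both intervals (0.005 g cm^-2 yr^-1); the whole change arises from the fall in mineral accumulation from 0.045 to 0.020 g cm^-2 yr^-1.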

% LOI has also been interpreted more directly as a climate signal (e.g. Willemse and Törnqvist 1999). At Lochan Uaine, Scotland, changes in the chironomid assemblage could be related to small temperature changes coinciding with changes in the % LOI curve (Battarbee et al. 2001), suggesting that the % LOI was reflecting greater bioproduction and preservation during times of either warm or cool temperatures. In the Jotunheim mountains of central Norway, Nesje and Dahl (2001) found sharp decreases in % LOI in several lakes at around 8200 cal b.p. that were related to times of glacier re-advance, the so-called Finse event. This cool and/or wet event is correlated in time to a major cool period in the Greenland ice cores (Alley et al. 1997). It is unlikely that the dips in % LOI in the Jotunheim lakes were caused by changes in bioproduction, as the sediments are visibly more silty, suggesting that the % LOI is reflecting minerogenic inwash from the catchment. At Lake Tsuolbmajavri in northern Finland, the % LOI (Seppä and Weckström 1999) follows the annual precipitation reconstruction more closely than the summer temperature curve reconstructed from the pollen data (Seppä and Birks 2001), suggesting that precipitation effects on the catchment may have influenced the minerogenic input and thus the % LOI in this sub-arctic lake. Shuman (2003) emphasises that changes in % LOI in a single core may be difficult to interpret because of within-lake processes and thus multiple cores increase the interpretability of the % LOI record.

Various other chemical and physical proxies have been measured in lake sediments, most notably stable isotopes of H, O, C and N, and carbonate content, chemical composition and magnetic properties. Developing proxies include near-infrared spectroscopy (Rosén et al. 2000, 2001) and bio-markers in sediment organic geochemistry. The last is particularly useful as it is a record of organic compounds produced by organisms that leave no visible remains, such as algal groups, bacteria and cyanobacteria (e.g. Fritz 1989; Lotter 2001). Long-chain lipids from leaf cuticles have been used to characterise terrestrial vegetation changes in response to changes in precipitation and run-off into near-shore marine sediments in Venezuela (Hughen et al. 2004). A new approach by Huang et al. (2004) has shown the potential of studying isotopes in specific lipid biomarkers preserved in lake sediments as a record of environmental change.

Biotic proxies

Environmental reconstructions

Biotic proxies are as numerous as the organisms that leave a record in lake sediments (Smol et al. 2001a, b). As specialist knowledge is needed to identify the fossil material, the organisms are usually studied as groups, such as diatoms, pollen, plant macrofossils, chironomids etc. If enough is known about the biology and ecological tolerances of a taxon, that taxon may be used as an indicator species for the reconstruction of past habitat, community, and environment, including climate (Birks and Birks 1980). Similarly if an assemblage of taxa resembles a modern community that lives in a defined ecological range today, that assemblage may be used to infer past conditions. The indicator species and assemblage approaches rely on modern analogy and assume that the limiting conditions in the past were the same as they are today (Birks and Birks 1980; Birks 2003). The assemblage approach has been quantified as the Mutual Climatic Range Method (MCRM) used with Coleoptera (Atkinson et al. 1987), with molluscs (Moine et al. 2002) and with plant macrofossils (Sinka and Atkinson 1999; Pross et al. 2000). It is also the basis of probability density functions used with plants (Kühl et al. 2002; Kühl 2003; Kühl and Litt 2003) and modern analogue techniques, often used on marine assemblages (e.g. Telford et al. 2004a; Telford and Birks 2005), but also on terrestrial pollen assemblages (e.g. Bartlein and Whitlock 1993; Davis et al. 2003). These methods are designed to reconstruct past environments from fossil assemblages of taxa whose environmental limits have been either determined or assumed by correlation of taxon distributions and abundance with climate or other environmental data.
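The logic of a mutual-range reconstruction can be sketched in one dimension: the reconstructed interval is the overlap of the present-day ranges of all taxa found together in the fossil assemblage. The taxon names and temperature limits below are hypothetical, and published applications may work with climate envelopes defined by more than one variable rather than the single temperature range assumed here.

```python
def mutual_climatic_range(taxa, ranges):
    """Overlap of the (low, high) ranges of all taxa, or None if they do not overlap."""
    lows = [ranges[t][0] for t in taxa]
    highs = [ranges[t][1] for t in taxa]
    low, high = max(lows), min(highs)
    return (low, high) if low <= high else None

# Hypothetical present-day mean July temperature limits (deg C).
july_ranges = {
    "taxon_a": (8.0, 16.0),
    "taxon_b": (10.0, 18.0),
    "taxon_c": (6.0, 13.0),
    "taxon_d": (14.0, 20.0),
}

estimate = mutual_climatic_range(["taxon_a", "taxon_b", "taxon_c"], july_ranges)
no_overlap = mutual_climatic_range(["taxon_c", "taxon_d"], july_ranges)
```

An assemblage with no mutual range, as in the second call, points either to a mixed assemblage or to past conditions in which the modern limits did not apply, which is exactly the assumption of modern analogy noted above.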

Another approach to environmental and climate reconstruction is the transfer function approach (Birks 1995, 1998, 2003). Within a group of organisms, taxa from surface-sediment samples are related numerically to environmental parameters by means of a quantitative transfer function. Using the transfer function, past environmental parameters are reconstructed from fossil assemblages. The most widely used transfer functions are between diatoms and lake-water pH, salinity, and total P, pollen and mean July and January temperature and annual precipitation, chironomids and mean July air temperature and water temperature, and Cladocera and mean July air temperature. The use of transfer functions to reconstruct past climate has often been an aim of multi-proxy studies, but surprisingly few multi-proxy studies have compared the resulting reconstructions. When the mean July temperature reconstructions using various methods (transfer functions for pollen, chironomids, and cladocera; MCRM for Coleoptera; indicator species and assemblages for plant macrofossils) were compared for the Late-glacial and early Holocene at Kråkenes (Birks and Ammann 2000) the results were somewhat surprising. Although the patterns of the temperature curves were all the same, as one might expect given the temperature-driven changes through the Late-glacial, the estimated temperature values of the reconstructions were different. The reasons for the discrepancies need to be sought in a more detailed examination of the performance of the numerical reconstruction methods and the representativity of the training sets, especially near the limits of biological existence that prevailed during the Younger Dryas in western Norway.
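The text does not tie the transfer-function approach to one numerical method. As an illustration only, the sketch below uses simple weighted averaging, one widely used option. The optimum of each taxon is estimated as the abundance-weighted mean of the environmental variable across the modern samples, and the value for a fossil sample is inferred as the abundance-weighted mean of the optima of the taxa it contains. The training data and taxon names are invented, and the deshrinking step normally applied after weighted averaging is omitted.

```python
from typing import Dict, List

Sample = Dict[str, float]  # taxon name -> abundance


def wa_optima(training: List[Sample], env: List[float]) -> Dict[str, float]:
    """Estimate each taxon's optimum as the abundance-weighted mean of env."""
    totals: Dict[str, float] = {}
    weighted: Dict[str, float] = {}
    for sample, value in zip(training, env):
        for taxon, abundance in sample.items():
            totals[taxon] = totals.get(taxon, 0.0) + abundance
            weighted[taxon] = weighted.get(taxon, 0.0) + abundance * value
    return {t: weighted[t] / totals[t] for t in totals if totals[t] > 0}


def wa_reconstruct(fossil: Sample, optima: Dict[str, float]) -> float:
    """Infer the environmental value of one fossil sample from taxon optima."""
    shared = {t: a for t, a in fossil.items() if t in optima and a > 0}
    total = sum(shared.values())
    if total == 0:
        raise ValueError("fossil sample shares no taxa with the training set")
    return sum(a * optima[t] for t, a in shared.items()) / total


# Invented modern training set: three surface samples and their July temperatures.
training = [
    {"cold_taxon": 80, "mid_taxon": 20},
    {"cold_taxon": 30, "mid_taxon": 50, "warm_taxon": 20},
    {"mid_taxon": 20, "warm_taxon": 80},
]
july_temp = [8.0, 12.0, 16.0]

optima = wa_optima(training, july_temp)
print(round(wa_reconstruct({"cold_taxon": 50, "mid_taxon": 50}, optima), 2))
```

Applying `wa_reconstruct` to the first modern sample returns about 9.7°C against an observed 8°C. Averaging twice pulls estimates towards the centre of the training set, which is one reason why reconstructions from different proxies can share a pattern and still differ in absolute values.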

Transfer functions perform well when the environmental changes are large and are within the central range of the modern training set (Birks 1998). In the Late-glacial, the large temperature changes are well reconstructed. However, reconstructions become less reliable when the values of the environmental variables are near the limits of the training set (Birks 1998). In cold climates, diversity is reduced and the same cold-adapted assemblage of, for example, chironomids may exist over a wide temperature range. The same restriction applies to pollen, but there is the additional complication of the presence of long-distance-transported pollen from trees in warmer regions in the pollen assemblages deposited beyond the arctic or alpine tree-lines (Birks and Birks 2003). Thus, the reconstruction of cold temperatures and associated precipitation levels from pollen assemblages is difficult (Birks et al. 2000; Larsen and Stalsberg 2004). A similar imbalance is present in diatom/total phosphorus reconstructions; diatoms are sensitive to low and medium total-P concentrations but relatively insensitive to high total-P situations. Related problems of insensitivity can arise when inferring Holocene temperature changes from fossil chironomid assemblages. Temperature changes in the Holocene are smaller and more subtle than late-glacial changes. Reconstructed changes are nearly always within the inherent prediction error range of reconstruction, although trends may be apparent (Birks 2003), and the reconstructions may also be rendered insensitive by the overall predominance of common species with wide ecological tolerances. Small reconstructed environmental changes may result from the chance occurrence of species with narrow tolerance ranges (Velle et al. 2005b).

Apparent discrepancies in quantitative environmental reconstructions based on transfer functions and a range of organisms raise important and critical questions about transfer functions and their robustness. There are several assumptions behind the transfer-function approach (Birks 1995). The most relevant here are the assumptions that (1) the environmental variable(s) to be reconstructed is, or is linearly related to, an ecologically important determinant in the ecosystem of interest; and (2) environmental variables other than the one of interest have negligible influence, or their joint distribution with the environmental variable of interest in the past is the same as in the modern calibration data-set (Birks 1995). Transfer functions are, by necessity, correlative in character; they model numerically the relationship between the observed occurrence and abundance of organisms in surface-sediment samples and modern environmental variables, for example the relationships between chironomid assemblages and mean July air temperature. It is probable that chironomids respond to water temperature rather than directly to air temperature (Brooks and Birks 2001; Brooks 2003). Although there is a strong correlation today between lake-water and air temperatures (Livingstone and Lotter 1998; Livingstone et al. 1999), and transfer functions for modern chironomid assemblages and air temperature perform well as assessed by statistical criteria in cross-validation using modern samples, the critical question is whether the relationship between lake-water and air temperature would be the same if winter precipitation as snow increased by 100–200% or more, as it probably did in parts of the Holocene in the Norwegian mountains (Nesje et al. 2001; Bjune et al. 2005; Bakke et al. 2005a). Large amounts of snow melt-water would result in cool lake-water even though the mean summer air temperature may be the same as in the periods with less winter precipitation. 
Brooks and Birks (2001) discuss two present-day lakes in Norway with cold-water modern chironomid assemblages but with high summer air temperatures. Both lakes are ‘outliers’ when chironomids are used to infer modern summer air temperatures, giving estimates of air temperature 4°C cooler than the observed values. Observed differences between reconstructed values of mean July air temperature based on pollen and plant macrofossils and on chironomids in the Holocene (Brooks and Birks unpublished) may be, in part, a result of the relationship between mean July air temperature and July water temperature not having the same joint distribution in the past. A further complication in the use of chironomid transfer functions for inferring past climate is the strong covariance between modern temperature and lake trophic conditions (Brodersen and Anderson 2002). Velle et al. (2005b) discuss possible additional confounding variables in chironomid-inferred air temperatures for the Holocene in western Norway. A similar problem may arise in the use of diatom–climate transfer functions as several limnological variables (e.g. alkalinity, pH, conductivity) may covary with temperature (Anderson 2000).

Although there are several numerical procedures for evaluating transfer function models (e.g. Birks 1995; Telford and Birks 2005), the most powerful means for assessing the reliability and sensitivity of a particular transfer function is to compare palaeolimnological reconstructions using transfer functions with known historical records (e.g. Renberg and Hultberg 1992; Fritz et al. 1994; Bennion et al. 1995; Lotter 1998; Teranes et al. 1999; Bradshaw and Anderson 2001). In general the environmental reconstructions based on transfer functions parallel the trends in the historical records but do not always match the absolute values.
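Agreement in trend and agreement in absolute value can be separated with a few summary statistics. The sketch below is one simple way to do this, not a procedure taken from the studies cited: it reports the Pearson correlation (trend), the mean offset (bias), and the root-mean-square error between a reconstructed series and a historical record. The pH values are invented.

```python
import math
from typing import List, Tuple


def trend_and_offset(
    reconstructed: List[float], observed: List[float]
) -> Tuple[float, float, float]:
    """Return (Pearson r, mean bias, RMSE) for two equally long series."""
    n = len(reconstructed)
    if n != len(observed) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_r = sum(reconstructed) / n
    mean_o = sum(observed) / n
    cov = sum((r - mean_r) * (o - mean_o) for r, o in zip(reconstructed, observed))
    var_r = sum((r - mean_r) ** 2 for r in reconstructed)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_r == 0 or var_o == 0:
        raise ValueError("both series must vary")
    pearson = cov / math.sqrt(var_r * var_o)
    bias = mean_r - mean_o  # positive when the reconstruction reads too high
    rmse = math.sqrt(sum((r - o) ** 2 for r, o in zip(reconstructed, observed)) / n)
    return pearson, bias, rmse


# Invented example: the reconstruction tracks the record but sits 0.4 pH units high.
observed_ph = [6.0, 6.2, 5.8, 5.1, 4.7]
reconstructed_ph = [value + 0.4 for value in observed_ph]
r, bias, rmse = trend_and_offset(reconstructed_ph, observed_ph)
print(f"r={r:.2f} bias={bias:.2f} rmse={rmse:.2f}")
```

A high correlation with a non-zero bias corresponds to the situation described above, in which the reconstruction parallels the historical trend without matching its absolute values.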

Discrepancies emerging from multi-proxy studies (e.g. Birks and Ammann 2000; Rosén et al. 2003) encourage researchers to ask what particular transfer functions really reflect – air temperature, water temperature, length of growing season, trophic status, pH, lake habitat, or a complex interaction of these and other variables? Recent work by Heegaard et al. (2006) indicates that there are significant differences between modern chironomid, cladoceran, and diatom assemblages along an altitudinal gradient in the Swiss Alps in terms of where major compositional changes occur. There appear to be no consistent ‘aquatic ecotones’ among the three groups of organisms. This suggests that each is responding to different environmental variables or complexes of variables that may influence the rates of compositional change between the taxonomic groups with altitude. Thus different proxies and their responses to different aspects of the environment can be utilised to demonstrate varying degrees of inertia and different thresholds (Smith 1965; Maslin 2004). This adds to the challenges of interpreting multi-proxy data and illustrates its potential to differentiate a range of biotic responses to environmental change.

Environmental reconstructions using transfer functions may depend on a surprisingly small number of taxa (e.g. Racca et al. 2003). If there is a preponderance of abundant taxa with wide ecological tolerances in Holocene fossil assemblages and taxa with narrow tolerances are rare or absent, transfer functions may be rather insensitive, as they appear to be in several reconstructions of Holocene climate (e.g. Brooks and Birks 2001; Rosén et al. 2001, 2003; Bigler et al. 2002; Korhola et al. 2002; Velle et al. 2005b).

Given current uncertainties about what environmental variables are the major determinants of the occurrence and abundance of different groups of organisms, it is advisable to avoid any attempts to derive ‘consensus’ reconstructions based on different groups of organisms. Given the hidden biases and assumptions in different numerical reconstruction procedures (Telford et al. 2004a; Telford and Birks 2005), ‘consensus’ reconstructions based on the same group of organisms but involving different numerical techniques may conceal important differences in the behaviour of the numerical procedures and are similarly not recommended (cf. Birks 1995, 1998).

A further problem associated with environmental reconstructions in multi-proxy studies is distinguishing between ‘signal’ and ‘noise’ (Birks 1998). The SiZer smoothing procedure of Chaudhuri and Marron (1999) helps to assess which features in a smoothed time-series are statistically significant and hence which features may represent ‘signal’. Korhola et al. (2000) provide a palaeoecological application of SiZer. The approach could be extended to consider several stratigraphical records from a multi-proxy study to help distinguish ‘signal’ from ‘noise’.

There have been considerable advances in the theory, methodology, and development of quantitative transfer functions in the last 20 years (Birks 1995, 1998, 2003). However, as a result of recent multi-proxy studies, problems in some transfer functions are emerging. There is thus the need to ‘return to basics’, in particular to study the environmental requirements and niche parameters of species commonly found as fossils (e.g. Brodersen et al. 2004). There is considerable scope for incorporating ecological knowledge into environmental reconstructions and interpretations of multi-proxy studies within a Bayesian framework for inference and prediction – see Ellison (2004) and Clark (2005) for recent lively discussions about why ecologists (and thus palaeoecologists) are becoming or should become Bayesians.

A wider-based multi-proxy approach is now developing where transfer functions, involving whole groups of organisms, are being used in combination with indicator species information. The interest is shifting away from climate or pH reconstruction as ends in themselves and more towards whole lake ecosystem reconstructions and the causes behind the changes. To do this, one has to look inside the proxy group and seek reliable indicator species. This was the original approach to palaeolimnology. It is particularly appropriate for aquatic macrophytes, where individual species ecology has always been important (e.g. Iversen 1954; Watts 1978; Birks et al. 1976, 2001; Birks 2000, 2001). Less is known about the ecology of freshwater algae, including diatoms, but ecological studies of arctic and Antarctic lakes and ponds (e.g. Douglas et al. 2004) are contributing much to our knowledge of diatom and chrysophyte ecology. The modern ecology of chironomid taxa has recently been used to help to rationalise anomalous temperature reconstructions made from chironomids (Brodersen et al. 2004; Velle et al. 2005b). Cladoceran ecology has always been of more interest than climate reconstruction from the whole group (e.g. Hofmann 1996, 2000; Duigan and Birks 2000; Milecka and Szeroczyńska 2005). Coleopteran ecology has also always played a large role in palaeoecological investigations although climate reconstructions using MCRM have now become dominant (e.g. Elias 1994, 1997, 2001; Elias et al. 1999). The ordination method detrended correspondence analysis (DCA) can be used to summarise compositional turnover for groups of organisms that can then be directly compared among groups (e.g. Birks et al. 2000). Individual species changes can then be investigated to seek the reasons for rapid changes in turnover and ecological factors can be inferred to explain the changes (e.g. Birks and Birks 2001).
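How an ordination summarises compositional turnover as sample scores on a first axis can be sketched with plain correspondence analysis computed by reciprocal averaging. This is a substitute for, not an implementation of, DCA: the detrending and rescaling steps that distinguish DCA are omitted, so the scores are not in standard-deviation units of turnover. The count matrix is invented.

```python
import math
from typing import List


def ca_axis1_sample_scores(counts: List[List[float]], iterations: int = 200) -> List[float]:
    """First-axis sample scores from correspondence analysis by reciprocal averaging."""
    n_samples, n_taxa = len(counts), len(counts[0])
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(counts[i][j] for i in range(n_samples)) for j in range(n_taxa)]
    if min(row_totals) <= 0 or min(col_totals) <= 0:
        raise ValueError("every sample and every taxon needs a positive total")
    grand_total = sum(row_totals)
    scores = [float(i) for i in range(n_samples)]  # arbitrary non-constant start
    for _ in range(iterations):
        # Taxon scores are weighted averages of sample scores, and vice versa.
        taxon_scores = [
            sum(counts[i][j] * scores[i] for i in range(n_samples)) / col_totals[j]
            for j in range(n_taxa)
        ]
        scores = [
            sum(counts[i][j] * taxon_scores[j] for j in range(n_taxa)) / row_totals[i]
            for i in range(n_samples)
        ]
        # Centre and rescale (weighted by sample totals) to remove the trivial solution.
        mean = sum(w * s for w, s in zip(row_totals, scores)) / grand_total
        scores = [s - mean for s in scores]
        spread = math.sqrt(sum(w * s * s for w, s in zip(row_totals, scores)) / grand_total)
        if spread == 0:
            raise ValueError("no compositional gradient found")
        scores = [s / spread for s in scores]
    # The sign of an ordination axis is arbitrary; orient it so the first sample is lowest.
    if scores[0] > scores[-1]:
        scores = [-s for s in scores]
    return scores


# Invented stratigraphical sequence: five samples (rows) by four taxa (columns).
sequence = [
    [10, 2, 0, 0],
    [6, 6, 1, 0],
    [2, 8, 4, 0],
    [0, 4, 8, 3],
    [0, 1, 5, 10],
]
axis1 = ca_axis1_sample_scores(sequence)
print([round(s, 2) for s in axis1])
```

Plotting such scores against depth or age for each group of organisms gives the kind of curve that can be compared among groups, with large steps between adjacent samples marking rapid turnover.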

Pollen analysis and plant macrofossils

As palaeolimnology has made considerable methodological and conceptual advances in the last 20 years (e.g. Battarbee 2000; Smol 2002; Brooks 2003; Fritz 2003; Mackay et al. 2003), it has increasingly developed its own identity, with its own journal, meetings and research agenda. Pollen analysis has not, however, played a major part in the recent development of palaeolimnology (Birks 2005) even though pollen analysis and the associated study of plant macrofossils can provide the main evidence for catchment vegetation over long time periods. Pollen and plant macrofossil analysis (e.g. Wick et al. 2003) are becoming increasingly important in multi-proxy palaeolimnological studies as the role of the lake's catchment and its vegetation and soils is so important in understanding lake biotic and sedimentary changes (e.g. Anderson et al. 1995; Korsman and Segerström 1998; Seppä and Weckström 1999; Lotter 1999, 2001; Birks et al. 2000; Bradshaw et al. 2000, 2005a; Bradshaw 2001; Lotter and Birks 2003; Oldfield et al. 2003b). Limnologists are exploring links between water chemistry and nutrient status and catchment vegetation (Maberly et al. 2003; van Breemen and Wright 2004). There is also a resurgence of interest in biogeochemistry (Jackson and Hedin 2004). Palaeolimnological techniques such as sediment geochemistry can also be used to address critical questions in understanding changes in vegetation history by providing information about catchment soil development and change (e.g. Engstrom and Hansen 1985; Ford 1990; Willis et al. 1997; Ewing 2002; Ewing and Nater 2002). There is thus an increasing need for close collaboration and interaction between pollen analysts, vegetation historians, and palaeolimnologists in multi-proxy studies.

Another area where close collaboration is needed is the analysis of plant macrofossils. Besides providing unique evidence for the local presence of taxa in or near the study lake, macrofossils of aquatic macrophytes are a record of a major component of the lake ecosystem, namely the macrophyte flora (Birks 2000, 2001). Aquatic macrophytes are a major habitat for other aquatic biota, are sensitive to changes in lake level and nutrient status, and represent one alternative equilibrium state in shallow lakes. Interest in plant macrofossils is greatly increasing, not only to provide terrestrial material for 14C AMS dating, but also to help understand changes in aquatic biota in multi-proxy studies (e.g. Sayer et al. 1999, 2006; Birks et al. 2001; Brodersen et al. 2001; Odgaard and Rasmussen 2001; Bradshaw et al. 2005b; Davidson et al. 2005).

Pollen and plant macrofossils represent different but indeterminate spatial scales. The regional pollen rain reflects vegetation at a regional scale, but pollen may also be derived more locally, such as from lake-side and aquatic vegetation (Birks 2005). Plant macrofossils are usually not dispersed far from their source. However, they can be carried long distances by water and by wind. For example, it has been difficult to determine the local significance of isolated Betula fruits and small fragments of Pinus bark in sites above the tree-line (Eide et al. 2006). Within a lake, aquatic macrofossils are usually related closely to the parent vegetation. Consequently, they are better represented in shallow water where the macrophytes were growing (Birks 2001). Thus a core from deep water in the centre of a lake, ideal for pollen, is not always so suitable for macrofossil representation. Central cores appear to contain a good representation of the chironomid community (Heiri 2004), whereas marginal cores can give a biased record of chironomids (Brooks 2000). Central cores contain diatoms from all the available lake habitats (plankton, mud, sand, stones, macrophytes). However, few comparisons have been made of central and littoral cores in multi-proxy studies, mainly because of the large amount of work involved (e.g. Digerfeldt 1971, 1986; Anderson et al. 2005). The multi-proxy study at Lobsigensee led by Brigitta Ammann is an impressive example of using both central and littoral cores to study a wide range of late-glacial proxies and their responses to climatic changes (Ammann et al. 1983, 1985; Ammann and Tobolski 1983; Chaix 1983; Eicher and Siegenthaler 1983; Elias and Wilkinson 1983; Hofmann 1983; Ammann 1989b).

Recent examples of multi-proxy studies

The Kråkenes Project (Birks and Wright 2000) is an example of how a variety of proxies can be used to reconstruct the lake ecosystem, including the catchment, and climate changes over the Late-glacial and early Holocene. Plots of DCA sample scores on axis 1 of the groups, together with the % LOI and Pediastrum curves, all showed synchronous changes at the end of the Allerød interstadial, the inception of the glacier in the catchment during the Younger Dryas stadial, and its melting and the temperature rise at the beginning of the Holocene. A similar synchroneity was observed at Pine Ridge Pond in eastern Canada (Levesque et al. 1994), indicating that temperature changes were the over-riding forcing factor in late-glacial ecosystem change. During the early Holocene at Kråkenes, however, major changes in turnover of the various groups were not synchronous and different groups reached compositional stability at different times. This suggests that internal ecosystem factors were playing an important role, such as the development of macrophyte communities, cessation of mineral inwash from the catchment, natural acidification and reduction of nutrients in the lake water, and catchment vegetation and soil development culminating in the immigration of birch trees and the development of birch forest (Birks et al. 2000).

A good chronology can be used to estimate rates of biotic change, as at Kråkenes (e.g. Birks et al. 2000). Here, rapid rates of change in the Late-glacial coincided with the major temperature changes. In the early Holocene rates of change were variable among proxies, reflecting major stages in the successions of the different groups, related particularly to catchment vegetation and soil development and to lake nutrient status. In contrast, the chronology of the recent sediment sequences in the CASSARINA project in North Africa was often poor because of low 210Pb accumulation (Appleby et al. 2001). However, all the sequences covered about 100–150 years. Plots of the DCA sample scores on axis 1 of the organism groups (aquatic and terrestrial macrofossils, pollen, zooplankton, diatoms; Birks and Birks 2001) showed very large amounts of compositional turnover, quantifying the enormous changes in aquatic ecosystems that had occurred within decades under strong forcing imposed by human activity, in this case freshwater withdrawal or continuous freshwater supply in the Nile Delta.
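A rate of biotic change can be expressed as the compositional dissimilarity between adjacent samples divided by the time between them. The sketch below assumes chord distance as the dissimilarity measure and calendar years as the time unit; neither choice is taken from the studies cited, and the assemblages and ages are invented.

```python
import math
from typing import List


def chord_distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two assemblages after scaling each to unit length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("assemblages must contain at least one non-zero count")
    return math.sqrt(sum((x / norm_a - y / norm_b) ** 2 for x, y in zip(a, b)))


def rates_of_change(assemblages: List[List[float]], ages: List[float]) -> List[float]:
    """Dissimilarity per unit time between each pair of adjacent samples."""
    if len(assemblages) != len(ages):
        raise ValueError("one age is needed per assemblage")
    rates = []
    for k in range(len(ages) - 1):
        interval = abs(ages[k + 1] - ages[k])
        if interval == 0:
            raise ValueError("adjacent samples must differ in age")
        rates.append(chord_distance(assemblages[k], assemblages[k + 1]) / interval)
    return rates


# Invented sequence: counts of three taxa in four samples with ages in years BP.
assemblages = [[40, 10, 0], [38, 12, 0], [10, 30, 10], [8, 30, 12]]
ages = [11600.0, 11550.0, 11500.0, 11400.0]
print([round(rate, 4) for rate in rates_of_change(assemblages, ages)])
```

Because the interval appears in the denominator, any error in the chronology passes directly into the estimated rate, which is why rates are hard to compare where dating is poor.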

When palaeolimnological data are available from many sites and for the same time period (e.g. last 150 years), the amount of compositional change or biotic turnover for the time interval of interest can be estimated for each site and compared between sites (Smol et al. 2005). This approach was applied to 55 palaeolimnological records from lakes in the circumpolar Arctic and it demonstrated widespread changes in algal and invertebrate communities that are consistent with recent climate warming (Smol et al. 2005). The observed palaeolimnological changes in diatoms, chrysophytes, chironomids, and cladocera are interpreted as reflecting increases in arctic lake primary production (Smol et al. 2005). This hypothesis has been tested for six lakes on Baffin Island by using reflectance spectroscopy to infer changes in lake sediment chlorophyll a concentrations and hence change in lake primary productivity (Michelutti et al. 2005). The inferred changes in chlorophyll a are paralleled by changes in total organic carbon reflecting the balance between the production and decomposition of organic carbon, in biogenic silica, and in C:N ratios. The changes in these four biogeochemical proxies are all consistent with the hypothesis of increased primary production since AD 1850. Similarly, a multi-proxy study of Svalbard lakes has illustrated how lake development has responded to climate change over the last century (Birks et al. 2004).

A similar multi-proxy approach involving a range of biological, biogeochemical, and stable isotope variables, and numerical techniques has been used to test hypotheses about recent (last 100 years) changes in diatom assemblages in alpine lakes in the Colorado Front Range. The changes appear to be a response to anthropogenic nitrogen deposition from agricultural and industrial sources to the east of the Rockies (Wolfe et al. 2001, 2002, 2003; Das et al. 2005). The effects of recent human impact have also been demonstrated by a multi-proxy study of Upper Klamath Lake, USA (Bradbury et al. 2004) and the impact of lake pollution and subsequent recovery were traced by Hynynen et al. (2004). Human impact has also been studied in an archaeological context (Davies et al. 2004).

A recent development in multi-proxy studies is the statistical testing of alternative hypotheses about the causes of the observed or reconstructed changes. This has already been mentioned at Sägistalsee (Lotter and Birks 2003). There is great potential for developing this approach more specifically in future multi-proxy studies (Lotter and Birks 1997; Birks 1998, 2003) as a means of evaluating multiple alternative hypotheses.

Future directions

Multi-proxy studies are making major contributions to palaeoecology and palaeolimnology. Our knowledge about the history of past climate change and past ecosystem development and lake ontogeny is steadily increasing. Each proxy reflects the environment at its own spatial scale, taking its place in the network of interactions that comprise an ecosystem, thus providing insights into different facets of an ecosystem.

New proxies are continually being recognised, applied and evaluated. Some of the most promising and diverse include biogeochemistry (e.g. Meriläinen et al. 2001; Fisher et al. 2003; Hynynen et al. 2004; Das et al. 2005; Sayer et al. 2006) and stable isotopes (e.g. Finney et al. 2000, 2002; Hammarlund et al. 2002; Veski et al. 2004; Wooller et al. 2004; Seppä et al. 2005). As well as the development of under-utilised fossil proxies (e.g. animal hairs Hodgson et al. 1998, phytoliths Carnelli et al. 2004, fish scales Davidson et al. 2003), well-known proxies are being used in new ways, using newly developed analytical techniques and improved chronologies to estimate amounts and rates of change through time, and using new approaches to detect morphological or genetic changes in response to environmental change (e.g. Weider et al. 1997; Kerfoot et al. 1999; Cattaneo et al. 2004; Hairston et al. 2005).

An important new direction in multi-proxy studies is a shift in the approaches to the interpretation of palaeolimnological data. Many multi-proxy studies today are focusing on palaeoecological questions as well as environmental reconstructions, invoking ecological indicator species and assemblages to provide new insights into past ecosystem functioning and pushing proxies further in interpretations of possible causal processes and driving factors. Data interpretation used to be primarily descriptive in terms of the reconstruction of past populations, communities, environments, and ecosystems (Birks and Birks 1980) but it can become more ecologically focused on the potential causes of the observed patterns of change or stability (Bennett and Willis 2001). It is here that well-designed multi-proxy studies can make a great contribution in the future because they can provide several potentially independent lines of evidence that can help to evaluate and resolve alternative competing hypotheses set up as explanations for a given stratigraphical pattern in the data (Bennett and Willis 2001). Multi-proxy studies can thus explore ‘the geological record of ecological dynamics’ (NRC 2005) and use ‘the geological record as an ecological laboratory’ (NRC 2005) to study critical research problems concerning biological diversity, community structure, the role of biogeochemistry, ecological impacts of climatic variability, habitat alteration, and the dynamics of biotic invasions. The resolution of such problems requires the fourth dimension of time that can only be provided by palaeolimnological or other palaeoecological data. 
Carefully designed and rigorously implemented multi-proxy studies have the potential to provide unique records of ecological dynamics over time and thus to contribute to our understanding of the natural variability of populations, communities, and environments, and of the responses of biological assemblages to a range of different environmental changes and forcing functions. Statistical techniques that take account of the inherent properties of multi-proxy data (Birks 1993b, 1996, 1998) can play an important role in testing competing hypotheses concerning possible causal factors and will allow a fuller exploitation of ‘the geological record as an ecological laboratory’ (NRC 2005). Deevey (1964) proposed over 40 years ago the idea of ‘coaxing history to conduct experiments’ as a way of exploiting the palaeoecological record as a long-term ecological experiment. The available analytical and statistical tools have expanded greatly and become increasingly refined and will no doubt continue to do so. They can be used in the future to focus multi-proxy studies on ecological interpretations and causal factors and to exploit the palaeolimnological record as a unique source of information on biotic changes and responses over a wide range of environmental changes at many temporal scales.

A further exciting development in multi-proxy studies is the involvement of ecological dynamic models. For example, a forest succession model has been used to simulate tree-line dynamics and forest composition over long time periods. Climatic parameters derived from palaeolimnological proxies that are independent of the vegetation proxies (e.g. chironomid-inferred temperatures) were used to drive the model (Heiri et al. 2006). In this elegant study the model results were compared with pollen and plant macrofossil reconstructions of the catchment vegetation, making it possible to disentangle the effects of climate and human impact on long-term vegetation dynamics. The combination of ecological models and palaeolimnological proxies (e.g. Keller et al. 2002; Lischke et al. 2002; Heiri et al. 2006) is a powerful means of interpreting observed patterns of ecological changes and dynamics in terms of several causal processes, and highlights the future potential of multi-proxy studies in the modelling and understanding of palaeoecological patterns and processes.

Conclusions

Multi-proxy studies are deceptively simple, highly seductive, and seemingly full of promise. In practice, they are a huge amount of work, they are never simple, they are full of surprises, even shocks, and they are rarely neat, tidy, or simple to interpret. In terms of multi-proxy reconstructions of past climates, we may be near the resolution of current data and predictive abilities of our transfer-function models. The sample-specific errors of prediction estimated by bootstrapping or some form of statistical cross-validation of about 0.8–1.5°C for July temperatures (Birks 2003) encompass the likely range of summer temperature change within the Holocene. In Norway the major changes in glaciers during the Holocene appear to be a response to changes in winter precipitation rather than to changes in summer temperature (e.g. Bjune et al. 2005). Reconstruction of the full picture of Holocene climate change here thus requires a major multi-proxy combination of biological, geological, and sedimentological data (e.g. Dahl et al. 2003; Bakke et al. 2005a, b; Bjune et al. 2005). Multi-proxy studies are not really ‘safe’ science. It is relatively easy and ‘safe’ to develop modern organism-environment calibration data-sets and associated transfer functions. However, complexities can and do arise when these transfer functions are applied to stratigraphical data in multi-proxy studies (e.g. Rosén et al. 2003; Velle et al. 2005a, b).

Despite these problems, multi-proxy studies are important research activities as they provide the means to study lake and biotic responses to environmental change which may have social implications. For example, the CASSARINA project in North Africa (Birks and Birks 2001; Birks et al. 2001; Flower 2001) revealed alarming amounts of biotic change in the last 100 years in response to human impacts. In the Egyptian Nile delta lakes, hydrological and salinity modifications resulted from the year-round inflow of fresh irrigation water controlled by the Nile dams and the rise in the freshwater table due to inadequate drainage in the flat delta. Azolla nilotica recently became extinct in these lakes (Birks 2002), probably as a result of eutrophication and salinity changes. Without the evidence provided by the analysis of plant macrofossils, pollen, diatoms, mollusca, foraminifera, ostracods, and other animal remains from the same cores, the extinction of A. nilotica would not have been recorded and the likely causes would have remained obscure.

Multi-proxy studies are challenging. Projects are usually expensive because of the labour involved, so they have to be carefully designed and coordinated and suitable sites must be chosen to provide the maximum amount of useful information in relation to the aims of the project. It is a major challenge to synthesise the large amount of diverse data and to prepare it for publication. Although we now have vast computing resources, a diverse range of numerical techniques, and large numbers of modern calibration data-sets and transfer functions, the real challenge is to improve on the classical pioneer studies and to argue as logically and as rigorously as was done in the early multi-proxy studies (e.g. Livingstone 1957; Livingstone et al. 1958; Cowgill et al. 1966; Wright 1966; Hutchinson 1970; Deevey 1984; Likens 1985). There has been a tendency in some aspects of palaeolimnology to get too pre-occupied with the minutiae of reducing modern prediction errors from 0.91 to 0.89°C when the environmental data themselves have inherent variability of 1 or 2°C, or with the details of a particular ordination or time-series technique. As a result there is a danger that we can lose sight of the important research questions, of the research hypotheses we are trying to test, of the long-term trends we are trying to detect, and of the limitations of our data, methods, and approaches. Carefully designed and critically implemented multi-proxy studies have the potential to contribute greatly to our understanding of how lakes and their biota respond to internal and external forcing, and to our appreciation of the sensitivities, strengths, and weaknesses of different proxies. They will enable us to test specific hypotheses about lake development and biotic responses to specific factors. The interpretation of multi-proxy data raises many important research questions involving new approaches such as ecological modelling and statistical testing. Much has been achieved in such studies, but much more remains to be done.