Introduction

Measures of soil respiration (RS, the flux of CO2 between the soil surface and atmosphere) extending over the last 60+ years constitute a geographically and ecologically rich dataset that has improved our understanding of soil carbon (C) cycling. For example, meta-analyses have used these data to draw inferences about the correlation between RS and its biotic and abiotic drivers (Raich and Schlesinger 1992); the autotrophic and heterotrophic sources of RS (Bond-Lamberty et al. 2004; Subke et al. 2006); the importance of CO2 production and biophysical lags to RS patterns (Vargas et al. 2010a); the response of the global RS flux to climate change (Carey et al. In review; Bond-Lamberty and Thomson 2010; Wang et al. 2014; Zhou et al. 2016); the temperature sensitivity of soil heterotrophic respiration (Zhou et al. 2009; Wang et al. 2014); and the response of RS flux to experimental changes in precipitation patterns (Thomey et al. 2011; Vicca et al. 2014). For specific sites, RS has also been used to calculate biometric-based estimates of carbon exchange in comparison with eddy covariance data (Curtis et al. 2002; Gough et al. 2008; Goulden et al. 2011; Giasson et al. 2013), providing a powerful constraint on carbon-cycle measurements and modeling.

In spite of this progress, the terrestrial carbon cycle remains a remarkably uncertain component of Earth system models (ESMs), an uncertainty that has been primarily attributed to the soil component (Friedlingstein et al. 2006; Todd-Brown et al. 2013; Anav et al. 2013; Friedlingstein et al. 2014; Hoffman et al. 2014). Many fundamental questions about climate change depend on the sensitivity of RS to global change factors, including whether soils will gain or lose C over the next century (Cox et al. 2013; Todd-Brown et al. 2014), how the residence time of terrestrial C will change (Schuur et al. 2009; Giardina et al. 2014; Carvalhais et al. 2014), and whether mitigation actions can sequester soil C in meaningful quantities, and over meaningful timeframes for human societies (Paustian et al. 2016). Although researchers recognize the central importance of soils to C-climate feedbacks (Ciais et al. 2013), and are aware of the thousands of extant data on seasonal and annual fluxes (Bond-Lamberty and Thomson 2010), RS data are rarely used for these models’ parameterization or assessment (Shao et al., 2013).

There are several reasons why RS data are infrequently used. First, many recent efforts to improve the representation of soil C turnover in ESMs have focused on simulations at decadal to millennial timescales, and therefore on improving model fidelity of variables that integrate long time periods, such as soil C stocks and 14C ages (Todd-Brown et al. 2012; Koven et al. 2013; Wieder et al. 2013; Todd-Brown et al. 2014). In contrast, when daily to annual C fluxes are simulated, there is often a bias towards utilizing eddy covariance (EC) flux data to the exclusion of other data streams, partly because EC data are readily accessible (Williams et al. 2009; Richardson et al. 2010; Kuppel et al. 2012; Keenan et al. 2013; Collalti et al. 2016). Second, while RS seasonal and annual sums have been compiled (Bond-Lamberty and Thomson 2010; Hashimoto 2012), instantaneous RS measures from autochambers and survey campaigns have rarely been synthesized (Bahn et al. 2010; Vargas et al. 2010a; Lavoie et al. 2014; Cueva et al. 2015), and are not easily available to modelers because they are not in organized data repositories (see Vargas and Leon 2015 for an exception). These small but geographically widespread datasets are generally in the hands of individual investigators, and are part of what Dietze et al. (2013) refer to as the ‘long tail’ of data. Third, many scientists are not sure how to use RS data effectively. The relationship between RS and soil C stocks and turnover is not always conceptually clear, particularly given uncertainties in partitioning RS sources. RS also does not match one-to-one with soil heterotrophic respiration (RH), which is the variable generally desired for modeling soil C turnover, but rather integrates RH and belowground autotrophic respiration (RA). The fact that RS integrates multiple processes (root and microbial activity, along a vertical gradient of soil activity) can make RS a challenging flux to interpret and represent in mathematical models (Vargas et al. 2011b).

Given the rapid development of new soil C cycle representations in ESMs (Luo et al., 2016), these RS data represent an underutilized resource, as RS is an information-rich data stream that occupies an important mid-level spatial scale bridging soil pore-scale processes to ecosystem-scale fluxes (Fig. 1). Other ecosystem-scale measures, such as net primary production (NPP) and net ecosystem production (NEP), similarly integrate multiple processes (e.g., respiration and photosynthesis, from both subcanopy and canopy layers), but nonetheless have been used effectively for model validation and improvement (Hudiburg et al. 2009; Williams et al. 2009; Schwalm et al. 2010; Keenan et al. 2012).

The goal of this review is to highlight the opportunities for utilizing RS data, justify continued collection of these data, and inspire novel synthesis and modeling activities. We highlight three challenges for utilizing and interpreting RS, and identify potential solutions. We then discuss how RS data are useful for model development, and outline approaches for using RS for data-model fusion (i.e. selecting models and parameters that are ecologically realistic and consistent with a range of data streams) and for model benchmarking (ranking models by how well they match observations). We also discuss quality control and database activities that will make RS data more robust, useful, and broadly available to the scientific research community.

Challenge 1: reconciling soil and tower fluxes

The RS-RECO mismatch

At many EC tower sites, particularly in forests, studies have reported periods when RS is consistently higher than ecosystem respiration (RECO) (Phillips et al. 2010; Thomas et al. 2013; Giasson et al. 2013; Speckman et al. 2015). This is biophysically impossible: RS is a major component of whole-ecosystem respiration in forests, but it cannot be higher than RECO, which also includes stem, leaf, and other aboveground autotrophic respiration (Bolstad et al., 2004; Gough et al. 2008; Ohkubo et al., 2007; Tang et al. 2008). This irregularity indicates either inconsistent measurement footprints, or persistent measurement biases in RECO, RS, or both (Fig. 2).

This challenge was first identified nearly 20 years ago, with RS measurements reported to be higher than RECO in boreal forests (Lavigne et al. 1997). While RS has measurement errors and scaling issues (see Table 1 and Challenge 3), systematic underestimation of RECO is now believed to affect practically all EC sites (Schimel et al. 2008; Aubinet et al. 2012). For instance, Luyssaert et al. (2007) conducted a global synthesis of forest C fluxes and pools, and showed that non-closure of ecosystem C budgets was the rule rather than the exception. As much as 60 % of the CO2 thought to be taken up by forests based on EC-based GPP was unallocated to plant biomass. Indirect evidence also comes from EC energy balance studies, which have shown that more energy enters EC sites than can be accounted for by latent and sensible heat losses, suggesting that some CO2 emissions might also be missed (e.g. Falge et al. 2001; Barr et al. 2006; Papale et al. 2006; Franssen et al. 2010; Foken et al. 2011; Stoy et al. 2013).

One common explanation is the influence of advection on net ecosystem exchange (NEE) measurements using EC. This implies that we should have more confidence in high RECO values, while lower values should be rejected as they have a higher probability of being affected by advection or other systematic errors (Van Gorsel et al. 2007). Several studies in forest ecosystems have attributed RECO-RS mismatch to EC measurements that are affected by topography, which influences advection and thus calculation of NEE (Kutsch et al. 2008; Phillips et al. 2010). In tall plant canopies, nocturnal fluxes can also be difficult to measure because of poorly constrained storage of CO2 within canopies, assumptions for low turbulence filtering (i.e., u* threshold), and decoupling of eddy covariance systems from subcanopy processes (e.g. Goulden et al. 1996; Aubinet and Feigenwinter 2010; Thomas et al. 2013; Alekseychik et al. 2013).

Mismatch may also occur when RECO is much larger than RS, for example as reported by Giasson et al. (2013) during winter at Harvard Forest. In this case, the large difference between RECO and RS indicated high rates of aboveground respiration, approximately twice as large as soil respiration. This is improbable, given that deciduous trees, which are leafless in winter, dominate the site. Again, the difficulty in reconciling EC and soil measures may result from poor vertical air mixing. For quality assurance, time periods with low wind speed or low turbulence are generally filtered from NEE time series, but if high winds ventilate the snowpack where CO2 from soil respiration has been accumulating, then such windy periods could produce over-estimates of RECO.

The tendency for nocturnal NEE (and hence RECO based on common approaches) to be underestimated results in artificially high estimates of gap-filled annual net ecosystem production (NEP), and can also cause ecosystem responses to global change factors to be missed. In a dramatic example, Speckman et al. (2015) compared chamber and tower-based estimates of RECO over seven growing seasons during a bark beetle infestation that reduced live wood biomass by 85 % at the GLEES Ameriflux site in Wyoming (Fig. 3). Over that period, EC-based estimates of RECO did not change, while chamber-based estimates of RECO declined by 35 %. Importantly, the mismatch between tower and chamber-based measurements was not dependent on turbulence, and did not diminish when more restrictive turbulence filtering criteria (u*) were used. However, the mismatch did diminish over time with increasing tree mortality, leading the authors to conclude that the loss of canopy improved coupling of flows between the subcanopy and within-canopy air space.

In addition to measurement limitations, there are algorithmic limitations to estimating RECO with EC. Importantly, RECO and GPP are not directly measured; they are inferred by partitioning NEE. Most partitioning approaches rely on the assumption that NEE at night, when photosynthetically active radiation is low or zero, includes no GPP, and that the drivers of RECO (primarily temperature) and GPP (primarily light) differ. Most partitioning methods also assume that nighttime RECO is a good proxy for daytime RECO, or vice versa. However, linkages between photosynthesis rate and soil autotrophic and heterotrophic activity (Tang et al. 2005; Carbone et al. 2007; Bahn et al. 2009; Vargas et al. 2012), diurnal patterns of soil moisture available to soil heterotrophs (Baker et al. 2008), and diurnally shifting tower footprints (Xu et al. 2017) all lead to potential mismatch between RS and RECO. Furthermore, recent methods for estimating daytime respiration using light-response relationships (Lasslop et al. 2010) or stable-isotope approaches (Wehr et al. 2016) have shown that RECO is lower during the day than at night, due to photoinhibition of leaf respiration. By providing daytime measurements of RECO, these methods also allow partitioning of GPP and RECO in situations with low nocturnal turbulence or with advection, and will hopefully yield a better fit between RS and RECO in the future.
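The nighttime-based partitioning logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm of any particular flux-processing package: the Q10 temperature response, the 10 °C reference temperature, the PAR threshold used to define night, the sign convention NEE = RECO - GPP, and all function names are introduced here for illustration only.

```python
import numpy as np

def fit_q10(temp_c, resp, t_ref=10.0):
    """Fit R = r_ref * q10 ** ((T - t_ref) / 10) by log-linear least squares.

    The Q10 form is an assumed temperature response, chosen for simplicity.
    """
    temp_c = np.asarray(temp_c, dtype=float)
    resp = np.asarray(resp, dtype=float)
    # The log transform requires finite, positive fluxes.
    ok = np.isfinite(temp_c) & np.isfinite(resp) & (resp > 0)
    x = (temp_c[ok] - t_ref) / 10.0
    slope, intercept = np.polyfit(x, np.log(resp[ok]), 1)
    return float(np.exp(intercept)), float(np.exp(slope))  # r_ref, q10

def partition_nee(nee, temp_c, par, par_night=5.0, t_ref=10.0):
    """Partition NEE into RECO and GPP from nighttime data.

    Assumes NEE = RECO - GPP, that GPP is zero at night (PAR < par_night),
    and that the nighttime temperature response also holds during the day.
    """
    nee = np.asarray(nee, dtype=float)
    temp_c = np.asarray(temp_c, dtype=float)
    par = np.asarray(par, dtype=float)
    night = par < par_night
    r_ref, q10 = fit_q10(temp_c[night], nee[night], t_ref)
    reco = r_ref * q10 ** ((temp_c - t_ref) / 10.0)  # extrapolated to all periods
    gpp = reco - nee
    return reco, gpp, (r_ref, q10)
```

Because RECO is extrapolated from nighttime NEE, any low bias in nocturnal fluxes propagates directly into both RECO and GPP, which is the core of the concern raised in this section.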

The preceding examples demonstrate that EC, while a powerful approach, has particular limitations and biases. Franks et al. (1997) cautioned against calibrating models exclusively with EC flux data, a point reiterated by Medlyn et al. (2005) and Keenan et al. (2013). Nevertheless, there is a heavy reliance on EC estimates of GPP and RECO for model development and benchmarking, a practice that amounts to fitting models to models.

Opportunities to close the RS-RECO gap

Given that simultaneous chamber-based RS measurements are already collected at many EC sites, it would be relatively easy to use comparisons of RECO and RS to flag potentially low-quality EC data (Tang et al. 2008; Phillips et al. 2010; Giasson et al. 2013). Based on Ameriflux sites contributing RS data to the Ameriflux database or to the Soil Respiration Database (Bond-Lamberty and Thomson 2010), we estimate that RS data are available for at least 37 sites.
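A flagging step of this kind could be as simple as the sketch below. It assumes that RS has already been upscaled to the tower footprint and aligned to the same time base and units as RECO; the function name and the `tolerance` parameter (an allowance for measurement uncertainty) are hypothetical and not taken from any existing quality-control protocol.

```python
import numpy as np

def flag_rs_exceeds_reco(rs, reco, tolerance=0.0):
    """Return a boolean mask that is True where RS exceeds RECO by more
    than `tolerance` (same flux units). Periods with missing RS or RECO
    are left unflagged rather than treated as suspect."""
    rs = np.asarray(rs, dtype=float)
    reco = np.asarray(reco, dtype=float)
    valid = ~(np.isnan(rs) | np.isnan(reco))
    flags = np.zeros(rs.shape, dtype=bool)
    flags[valid] = rs[valid] > reco[valid] + tolerance
    return flags

# Hypothetical fluxes (e.g. umol CO2 m-2 s-1) for four measurement periods.
rs = [2.1, 3.0, np.nan, 4.2]
reco = [3.5, 2.4, 3.0, 4.0]
flags = flag_rs_exceeds_reco(rs, reco, tolerance=0.5)
```

In this example only the second period is flagged: the fourth has RS above RECO but within the assumed tolerance, and the third has no RS measurement.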

While continuous measurements are ideal, even intermittent data provide valuable constraints on RECO. First, RS data constitute an independent measurement, which is important for methodology validation. Second, RS measures are not subject to systematic biases at night due to low turbulence conditions (although RS chamber bias can occur under high turbulence conditions, e.g. Takle et al. 2004). Third, RS is directly measured rather than inferred. Fourth, the accuracy of soil chamber measurements is technically straightforward to assess using a sand column with a known flux rate, a type of calibration check that is not possible for EC (Martin et al. 2004; Pumpanen et al. 2004; Risk et al. 2011). Laboratory calibration has the potential to make RS data, at least in theory, more accurate than RECO. For instance, Pumpanen et al. (2004) compared twenty different RS chambers on a sand column through which a known CO2 flux rate was established. They found substantial variation based on chamber design, ranging approximately ±35 % from the true flux rate, indicating that there is unrealized potential to reduce RS uncertainty by using sand columns more widely for routine soil chamber calibration. Such systems can also be used to test chamber responses to wind and changes in soil moisture under laboratory conditions. Sand column calibration cannot account for all sources of error and bias, such as soil collar damage to roots and hyphae (Heinemeyer et al. 2011), but it provides considerable constraint on instrumentation uncertainty (Table 1).

Table 1 Sources of uncertainty for chamber-based RS and EC-based RECO measurements

We suggest using RS-RECO comparison as an alternative evaluation of EC data quality, in addition to energy balance closure and more traditional metrics such as u*. (One of the largest uncertainties associated with this, upscaling chamber measurements to the EC tower footprint, is discussed below in Challenge 3.) There are several examples of utilizing RS to detect low-quality nocturnal EC fluxes, and to construct bottom-up, chamber-based estimates (combining soil, stem and foliage respiration) to gap-fill RECO under periods of flow decoupling and advection (Van Gorsel et al. 2009; Zeeman et al. 2012; Thomas et al. 2013; Alekseychik et al. 2013; Speckman et al. 2015). Alternatively, Thomas et al. (2013) were able to salvage a large portion of nocturnal EC measures at a site prone to advection by summing above-canopy fluxes with fluxes inferred from a sub-canopy system and soil chambers.

Ultimately, uncertainty analyses are likely to reveal sites and periods where EC RECO fluxes are difficult to reconcile with other techniques. Therefore, it may be necessary to move away from the customary approach of gap-filling and partitioning EC flux data, toward more use of the best observations—EC data from optimal conditions, or chamber-based data—for model evaluations.

Challenge 2: beyond autotrophic and heterotrophic partitioning

Terrestrial ecosystem models rarely simulate RS explicitly, but instead simulate components of RH and RA. In some cases, aboveground and belowground components of RA may be pooled into a single flux (e.g. CLM: Oleson et al. 2010), roots may be lumped with aboveground biomass, as is the case in some simple models (e.g., DALEC: Williams et al. 2005 or SIPNET: Braswell et al. 2005), or respiration flux components may be output in ways difficult to combine into an estimate of modeled RS that is truly comparable to observed RS. For this reason, and because RH is used in conjunction with NPP to estimate annual terrestrial C storage (NEP, see Fig. 1), total RS is often not as useful to modelers as its partitioned components. But partitioning RS is non-trivial, requiring experimental manipulations that are prone to artifacts (Hanson et al. 2000; Kuzyakov 2006; Subke et al. 2006). Most partitioning experiments, whether they utilize isotopes or bulk fluxes, require physically separating roots from native soil, and doing so alters moisture, temperature, oxygen levels, and C source-sink dynamics from the undisturbed system, all of which likely influence RA and RH estimates (Nickerson and Risk 2009; Phillips et al. 2013; Snell et al. 2014). Furthermore, there is ample evidence that microbial ‘priming’, i.e. enhancement of microbial decomposition due to root growth, is likely the rule rather than the exception, making RH and RA conjoined fluxes that can only be artificially disentangled (Zhu and Cheng 2011; Dijkstra et al. 2013; Qiao et al. 2014; Finzi et al. 2015).

Fig. 1

Schematic of ecosystem C budget components. Gross primary production (GPP) is allocated to above and belowground plant growth (net primary production, NPP) and respiration (RA). Total soil respiration (RS) is the sum of belowground RA and heterotrophic respiration (RH). Total ecosystem respiration (RECO) includes RS plus aboveground RA. Net ecosystem exchange (NEE) is the CO2 flux measured by the eddy covariance method, and generally reported on a 30 min basis. Cumulative annual NEE is called net ecosystem production (NEP) and represents the net increase or decrease in ecosystem carbon storage. NEP can also be estimated from biomass-based measures of plant growth and chamber-based measures of RH

Fig. 2

Comparison of RECO (Level 4 FLUXNET estimates) and RS at four forested eddy covariance sites. RS > RECO for significant periods at Harvard Forest, Wind River, and Willow Creek. By contrast, at Mary’s River site a subcanopy flux system was used to identify and gap-fill nights with decoupling, improving RECO estimates (Thomas et al. 2013)

Fig. 3

Comparison of chamber and tower based estimates of RECO at the GLEES Ameriflux site during a massive bark beetle die-off. a Comparison of RECO measured by EC (REC), and by summed chamber measures of stems (RW), foliage (RF), and soil (RS). Chamber and tower mismatch correlated with canopy leaf area index (LAI). Reprinted with permission from Speckman et al. (2015)

Given these obstacles, are there alternatives to traditional partitioning approaches that can be used as model constraints? One well-established approach for using aggregate, unpartitioned RS to constrain component processes is the Total Belowground Carbon Flux (TBCF) approach, originally proposed by Raich and Nadelhoffer (1989). This method indirectly estimates plant belowground C allocation (RA + belowground NPP, see Fig. 1) using measurements of RS and litter fluxes. The TBCF approach makes several major assumptions to justify indirect estimation of plant allocation, namely that most C leaves the soil through respiration (i.e. dissolved C losses and erosion are negligible), and that net changes in standing litter and SOM are negligible. The TBCF approach has been criticized for poor correlation with direct observations in managed forest systems (Gower et al. 1996), but nevertheless has generated important insights into plant allocation when applied at a global scale (Litton et al. 2007; Litton and Giardina 2008). In addition to employing TBCF, we argue there are more opportunities to use measured aggregate RS to validate the sum of the separately modeled RH and RA. Furthermore, while partitioning studies will continue to be important, there are also opportunities to improve on them to obtain much more detail, for instance by applying techniques that allow more than two sources of RS to be resolved.
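The TBCF mass balance and its simplifying assumptions can be written compactly. The symbols below are introduced here for illustration and are not the original authors' notation: F_litter is the aboveground litterfall C flux, F_export covers dissolved and erosional C losses, and ΔC_soil is the net change in C stored in litter, roots, and SOM over the measurement period.

```latex
\begin{align}
\mathrm{TBCF} &= R_S + F_{\mathrm{export}} - F_{\mathrm{litter}} + \Delta C_{\mathrm{soil}} \\
              &\approx R_S - F_{\mathrm{litter}}
  \quad \text{when } F_{\mathrm{export}} \approx 0 \text{ and } \Delta C_{\mathrm{soil}} \approx 0
\end{align}
```

The first line is a soil C balance (inputs from litterfall and belowground allocation, outputs through RS and export); the second line follows only when the two assumptions named in the text hold, which is why the approach is most defensible for systems near steady state.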

Partitioning RS – what has been learned?

Researchers have attempted to partition RS into discrete sub-fluxes for decades (Coleman 1973), with a few studies almost a hundred years old (Lundegårdh 1927; Turpin 1920). Attributing RS to its source components furthers our mechanistic understanding of the many processes underlying and controlling this large C flux (Luo and Zhou 2006). In addition, after ecosystem-scale theories of C and energy dynamics were developed (Kira and Shidei 1967; Odum 1969), isolating the heterotrophic component of RS allowed, in conjunction with a measure of net primary production, computation of ecosystem carbon gain or loss (NEP = NPP – RH, see Fig. 1). Finally, once it was realized that RS components might respond differently to environmental drivers and climate change (Boone et al. 1998), partitioning experiments were critical to draw generalized inferences in these areas (Wang et al. 2014).

A number of meta-analyses and syntheses have attempted to distill what has been learned from decades of partitioning experiments. Hanson et al. (2000) gave a qualitative assessment of partitioning methods and found that published studies exhibited wide variability, but on average about half (46 %) of the annual RS flux was root-derived. Using significantly larger data sets, Bond-Lamberty et al. (2004) and Subke et al. (2006) found that as RS increases, autotrophic contributions increase more than heterotrophic contributions; this pattern is generally consistent across biomes and different ecosystem ages; and there are no significant differences between partitioning methods (Kuzyakov 2006). Finally, two recent meta-analyses have explored how climate warming (Wang et al. 2014) and the interactive effect of multiple global change factors (Zhou et al. 2016) affect RS and its components. Importantly, these studies have found differences in response timing and magnitude, with RH exhibiting a sustained response to warming versus a non-significant RA response. They have also shown that complex interactive effects between factors such as warming, elevated CO2, and nitrogen addition can even reverse the direction of response (Zhou et al. 2016).

An interesting question is how and when global relationships for partitioning RS can be used, particularly as more than a decade has passed since the original global analyses of RH, and almost an order of magnitude more studies have been published partitioning RS. To examine this question, we took data from a global soil respiration database (Bond-Lamberty and Thomson 2010), computed RH via two well-known published equations (Bond-Lamberty et al. 2004; Subke et al. 2006), and computed the resulting error relative to the known, measured RH flux in the database (Fig. 4). From 386 studies, only 121 reported RH fluxes within ±10 % of the equation-derived value, while large errors occurred in ecosystems with anomalously low RH values (but note that these high percentage errors represent very low absolute numbers); no relationship with biome, ecosystem type, or time since disturbance was evident. This emphasizes the caution researchers must use when applying such generalized equations to singular locations, in particular in areas with high nitrogen deposition (Janssens et al. 2010) or suspected imbalances (from e.g., recent disturbance) in the soil carbon cycle (Harmon et al. 2011). Because direct measurements of RA and RH have not been collected for many C-budget study sites (Luyssaert et al. 2007; Litton and Giardina 2008), there may be temptation to downscale global equations where no better measurements are available. However, C-budget relationships derived from large aggregates of data can add considerable error when applied at local scales (Kicklighter et al. 1994; Gower et al. 1996).
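The error calculation described above can be sketched as follows. This is an illustration only: it assumes a log-log functional form for the global relationship, and the `intercept` and `slope` arguments are placeholders to be supplied by the user, not the published coefficients of either cited equation. All function names are hypothetical.

```python
import numpy as np

def predict_rh(rs, intercept, slope):
    """Predict RH from RS with an assumed log-log relationship:
    ln(RH) = intercept + slope * ln(RS)."""
    rs = np.asarray(rs, dtype=float)
    return np.exp(intercept + slope * np.log(rs))

def percent_error(rh_predicted, rh_measured):
    """Signed error of predicted RH relative to measured RH, in percent."""
    rh_predicted = np.asarray(rh_predicted, dtype=float)
    rh_measured = np.asarray(rh_measured, dtype=float)
    return 100.0 * (rh_predicted - rh_measured) / rh_measured

def fraction_within(errors_pct, band=10.0):
    """Fraction of studies whose absolute percent error is within `band`."""
    return float(np.mean(np.abs(np.asarray(errors_pct)) <= band))
```

Reporting the signed percent error alongside the absolute flux difference is useful here, since the text notes that the largest percentage errors occur where RH is very small in absolute terms.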

Fig. 4

Error arising from application of global relationships to partition sources of soil respiration. Data from a global soil respiration database, filtered to studies reporting both RS and RH from unmanipulated ecosystems. RH was computed from RS following Bond-Lamberty et al. (2004) and compared to reported RH; dashed lines show ±10 % error. Data based on Subke et al. (2006) are almost identical and not shown. These global relationships predict that the RH fraction of RS decreases as RS increases, but do not allow for sites with high RS and very low RH (at upper left; the graph cuts off a few very high-error sites)

Opportunity: advancing partitioning to consider more than two sources

Partitioning experiments have generated important insights, but designing experiments around a two-source model of RS is both a conceptual and physical limitation. There are many cases in which it is useful to partition RS into three or more sources, for instance to distinguish decomposition of recent soil C additions from that of native soil C and RA (Whitman and Lehmann 2015), or to separate out the contribution of mycorrhizae from roots and free-living microbes (Heinemeyer et al. 2007; Heinemeyer et al. 2011; Phillips et al. 2012; Barba et al. 2016). As climate and land-use change disrupt equilibrium soil conditions, it becomes more important to partition microbial decomposition of older, stored C in addition to labile C (Wutzler and Reichstein 2007). It becomes possible to characterize more than two components of RS by applying multiple partitioning approaches simultaneously: for instance, physically separating roots as well as monitoring the isotopic composition of each component (Czimczik et al. 2006; Hopkins et al. 2013; Phillips et al. 2013), using different-sized micropore mesh treatments (Moyano et al. 2007; Heinemeyer et al. 2007), using two gas tracers simultaneously (Risk et al. 2013b), or combining single tracer experiments in which different C sources are isotopically labelled in each experiment (Whitman and Lehmann 2015).

Until recently, many researchers simply equated RA with being from ‘recent’ C sources, and RH with being from ‘older’ C sources, as depicted in Fig. 5a (Högberg et al. 2001; Hahn et al. 2006; Ogle and Pendall 2015). However, a number of studies have demonstrated that the C age of RA and RH varies at seasonal to interannual timescales as depicted in Fig. 5b. For instance, compiling root-respired 14CO2 data from three temperate forests, Hopkins et al. (2013) showed a tendency for RA to include substantial C from previous years early in the growing season, transitioning to current-year C by the end of summer (Fig. 6). Similarly, Lynch et al. (2013) showed that after the cessation of a free air CO2 enrichment experiment, carbohydrates stored by trees during the experiment continued to contribute to root respiration for at least two years after the isotopically distinct CO2 source was gone. Czimczik et al. (2006) found that as boreal forest stands matured following wildfire, root respiration tended to contain more stored C. Examining soil-respired 14CO2 in a deciduous forest in Northern Wisconsin, Phillips et al. (2013) found evidence for gradual microbial transition from previous-year to current-year substrates over the course of a growing season. Fresh root exudates appeared to be a preferred microbial substrate. In an arctic tundra ecosystem, Hartley et al. (2012) also found evidence for seasonal microbial switching, but found that mid-summer plant activity stimulated microbial decomposition of older soil organic matter (i.e. positive priming).

Fig. 5

Combining 14C monitoring or 13C pulse-chase labeling with physical partitioning provides observations that are more useful for testing model C pool structures. a An isotope mixing model for partitioning RS, which assumes new C is from RA and old C is from RH. b By using physical partitioning methods such as root exclusion while also monitoring the isotopic composition of each component, dynamic ages have been observed at seasonal to interannual timescales. In this example, both RH and RA include more ‘new’ C as the growing season progresses. c Age determinations of RA and RH can be used to test models simulating C as discrete pools with characteristic turnover times. In this example, five soil C pools and two plant C pools are represented, and turnover of both very old and new C increased during summer

Fig. 6

Growing season patterns for the difference in Δ14C between root respiration and the atmosphere (ΔΔ14C), which is used to infer C age, for three temperate Ameriflux forest sites: Howland Forest (solid triangles), Harvard Forest deciduous site (open squares), and Harvard Forest evergreen site (open diamonds). Error bars are the larger of the standard deviation of replicates or the propagated error of ΔΔ14C calculations. Reprinted with permission from Hopkins et al. (2013)

Collectively, these studies suggest that RS should be represented by more than two sources, with both RA and RH containing varying levels of fresh photosynthetic products and stored C pools throughout the year (Hopkins et al. 2013; Phillips et al. 2013), and over timescales of ecological succession (Czimczik et al. 2006). Using conventional assumptions for partitioning RS with 14C (RH = ‘old’ and RA = ‘recent’, Fig. 5a) could lead to overestimates of RH early in the growing season when root respiration utilizes older stored C, and underestimates of RH late in the growing season, if microbes have switched to recent C substrates.
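The bias described above can be made concrete with the conventional two-source mixing model of Fig. 5a, which is a linear mass balance between a 'new' and an 'old' end member. The sketch below is illustrative only; the function name and all isotope values are hypothetical and chosen for easy arithmetic, not taken from any cited study.

```python
def old_fraction(delta_rs, delta_new, delta_old):
    """Fraction of RS attributed to the 'old' end member under a two-source
    linear mixing model: delta_rs = f * delta_old + (1 - f) * delta_new.
    All arguments are isotope signatures in the same units (e.g. per mil)."""
    if delta_old == delta_new:
        raise ValueError("End members must differ to solve the mixing model.")
    return (delta_rs - delta_new) / (delta_old - delta_new)

# Hypothetical end members: 'new' C = 20 per mil, 'old' C = 100 per mil.
# Suppose the true RH share is 0.4, but roots are respiring partly stored C,
# so the actual RA signature is 40 per mil rather than the assumed 20.
true_rh_share = 0.4
delta_rs = true_rh_share * 100.0 + (1.0 - true_rh_share) * 40.0
apparent_rh_share = old_fraction(delta_rs, delta_new=20.0, delta_old=100.0)
```

With these numbers the apparent RH share is 0.55 against a true share of 0.4, an overestimate of the kind expected early in the growing season when root respiration draws on older stored C.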

Determination of RA and RH ages by combining physical partitioning and isotopic monitoring also provides observations to test models with more than two discrete C pools (e.g. Parton et al. 2010; Sierra et al. 2012). For instance, Fig. 5c represents hypothetical contributions of five discrete soil carbon pools ranging in turnover time from less than one year to millennial timescales, and two plant C pools with annual and decadal turnover times, that could give rise to the C ages shown in Fig. 5b. Such large ranges in ages of forest soil and root C have been established by measurements of plant, litter, and soil Δ14C (Gaudinski et al. 2000; Trumbore 2000), and have been used to develop soil C models with excellent predictive power. For instance, Sierra et al. (2012) showed that C turnover times inferred from plant, litter, and soil Δ14C measurements in 1996 were able to accurately predict gradual changes in Δ14C measured in RS over the following decade.

Figure 5 also demonstrates why the relationship between RS and soil C stocks can be unclear, and why modelers have found it difficult to use bulk (unpartitioned) RS data to test soil C pool models. Currently, the most abundant and widely used datasets for soil C model testing do not include in situ RS, but instead are from litter bag decay experiments (Bonan et al. 2013), laboratory incubations, soil C stock sizes, and 14C ages of solid soil C (Jenkinson et al. 2008; Koven et al. 2013). Laboratory incubations and litter bag experiments have demonstrated that a large fraction of soil C turns over slowly, and solid soil 14C data indicate that the slowest C pools turn over on timescales of centuries to millennia (Trumbore 2009). But RS is dominated by relatively young C that turns over in a year or less (Gaudinski et al. 2000), and that may contribute very little to the soil organic C pool. Collectively this suggests RS can vary substantially from year to year with little impact on soil carbon aggradation or decay (Sierra et al. 2012). It is impossible to know from bulk RS data, or even from estimated RH, when a change in the contributions of long-lived ‘old’ soil C pools has taken place, unless age estimates of the fluxes can also be determined.

However, isotopic age estimates of RH can allow changes in SOC turnover to be detected years to decades before they would be measurable from sampling solid SOC (Gaudinski et al. 2000; Hartley et al. 2012). For instance, Sierra et al. (2012) provided compelling 14C evidence that global change factors, including warming and N-deposition, resulted in a transfer of C from long-lived to rapid-turnover pools in soil experiments at Harvard Forest. Moving forward, it would be advantageous for more studies to combine 14C or pulse-chase 13C monitoring with physical partitioning approaches, to provide a baseline for detecting climate change impacts on organic matter decomposition. Furthermore, there is some urgency to apply 14C measurements soon, given the anticipated loss of 14C bomb-C determinations in coming decades (Graven 2015).

Opportunity: evaluate abiotic sources of soil CO2

There is also a relatively unexplored opportunity to use partitioning experiments to evaluate abiotic sources of soil CO2. The general assumption in most soil flux studies is that only biological communities near the soil surface are responsible for the observed CO2. This is of course a simplification: CO2 in the soil profile may originate from a variety of sources, including soil pedogenic or liquid water carbonates (Cerling et al. 1991; Serrano-Ortiz et al. 2010; Shanhun et al. 2012; Risk et al. 2013b), methane oxidation (Romanak et al. 2012; Mills et al. 2013), or volcanic activity (Werner and Brantley 2003; Viveiros et al. 2008; Beulig et al. 2016). Soil sinks for CO2 may also exist, for example from carbonate dissolution (Shanhun et al. 2012) and C-fixation by chemoautotrophic microbes (Beulig et al. 2016). Though these abiotic and anaerobic mechanisms are probably quite common and contribute at some scale to RS, they are often studied only in specialized settings, and we lack widespread estimates of the abiotic flux contribution. At individual sites, however, the contribution can be large: for instance, Rey et al. (2014) determined that volcanic CO2 emissions accounted for almost 50 % of net ecosystem carbon balance at an arid steppe site in Spain.

Various noble gas tracers and isotopic methods are available for tracing CO2 origin (Wilkinson et al. 2010), but there are also simpler approaches for abiotic partitioning that can readily be integrated into existing study sites. Researchers can 1) increase their use of geochemistry and gas ratios, 2) mine continuous temporal records more effectively, and/or 3) add gradient sensors or measurements. For example, the ratios of two common gases can help infer source contributions based on known stoichiometric relationships (Fig. 7).

Fig. 7

Mixing lines for biological respiration and methane oxidation (left) and example data for different idealized environments, each with strong source dominance (right) except for oil and gas thermogenic seeps, which are shown here as being mixed with biological contributions. Modified from Romanak et al. (2012)

What would a generic biotic-abiotic RS measurement system look like? It would measure CO2 plus another useful accessory gas such as O2 at the surface (flux), and potentially other gases at a limited number of depths (concentrations), along with in situ diffusivity measurements. Because aerobic respiration involves a theoretically constant ratio of CO2 production to O2 consumption, measuring deviations from this ratio can allow quantification of anaerobic respiration, or of wetting events that preferentially store or advect CO2 over O2 (Turcu et al. 2005; Angert et al. 2015). Oxygen sensors are widely available, and although high accuracy is required for this application, they could presumably be added to autochambers (Liptzin et al. 2011). The system would likely need to run as often as possible, both to avoid bias from sparse data (Creelman 2015) and to capture periods when biotic activity is negligible, so that other interesting background processes (e.g. geologic seepage, carbonate dissolution, wind ventilation of soil CO2) are brought into focus.
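The ratio-based screening described above can be sketched in a few lines. This is only an illustration: the function names, the expected ratio of 1.0, and the tolerance of 0.2 are assumptions chosen for the example, not values taken from the cited studies, and a real application would need to account for sensor accuracy and differences in gas transport.

```python
def co2_o2_ratio(co2_efflux, o2_uptake):
    """Molar ratio of CO2 efflux to O2 uptake (same molar flux units).

    o2_uptake is expressed as a positive consumption rate.
    """
    if o2_uptake <= 0:
        raise ValueError("O2 uptake must be positive")
    return co2_efflux / o2_uptake


def flag_non_aerobic(co2_efflux, o2_uptake, expected_ratio=1.0, tolerance=0.2):
    """Label a paired measurement by its deviation from an assumed aerobic ratio.

    expected_ratio and tolerance are hypothetical defaults for illustration.
    """
    ratio = co2_o2_ratio(co2_efflux, o2_uptake)
    deviation = ratio - expected_ratio
    if abs(deviation) <= tolerance:
        label = "consistent with aerobic respiration"
    elif deviation > 0:
        # More CO2 than O2 uptake explains, e.g. anaerobic or abiotic sources
        label = "excess CO2 relative to O2 uptake"
    else:
        # Less CO2 than expected, e.g. CO2 stored or advected in soil water
        label = "CO2 deficit relative to O2 uptake"
    return ratio, label


if __name__ == "__main__":
    # Made-up paired fluxes (umol m-2 s-1)
    for co2, o2 in [(2.0, 2.0), (3.0, 2.0), (1.0, 2.0)]:
        print(co2, o2, flag_non_aerobic(co2, o2))
```

Flagged periods would then be candidates for closer inspection rather than definitive evidence of any single process.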

Additionally, RS studies do not often make full use of spatiotemporal possibilities to evaluate abiotic drivers. Non-growing season data are particularly well suited for identifying the presence of continuous abiotic activity, such as volcanic contributions, that may otherwise be masked (Risk et al. 2013a; Rey et al. 2014). Also, with the growth of temporally rich data from autochamber systems, spectral or frequency-based analytical techniques (e.g., frequency filtering, wavelet) become useful (Vargas et al. 2010b; Heinemeyer et al. 2012). While not currently common, these techniques may also help identify abiotic background drivers of RS that vary on timescales unusual for biological processes (i.e., independently of moisture, temperature, and phenology). Finally, from a spatial perspective, subsurface concentration gradient studies can be extremely useful in helping identify layer-by-layer processes (Tang et al. 2003; Davidson et al. 2006; Vargas and Allen 2008; Phillips et al. 2012; Maier and Schack-Kirchner 2014), and basal fluxes that originate from below the soil profile. It would not be surprising if much of the new understanding in RS over the coming years comes from gradient-based techniques, because of their ability to vertically resolve respiration across gradients of soil age, substrate quality, and O2 concentration, and because of their potential for parameterizing physical-biogeochemical models.

Challenge 3: better upscaling and downscaling

The final challenge we address is how to bridge RS measurements to spatial and temporal scales matching those of other ecosystem C measurements and model outputs. One of the largest uncertainties associated with comparisons between RECO and RS (Challenge 1, above) is the upscaling of chamber measurements to the footprint of the EC tower. The difference in ground measurement areas between individual soil collars (0.01–0.1 m2) and a 30 m forest EC tower (~1 km2) gives a scaling factor of 10^5–10^6. Similarly, gap-filling monthly survey measurements to infer an annual total requires estimating over 99 % of the time period. These numbers suggest that high spatial variability of RS is an additional plausible reason that RS is often not matched well with RECO.

Upscaling often entails taking a simple mean of measurement locations distributed across a tower footprint, and some studies have performed power calculations to derive the minimum acceptable number of RS measurements (Rodeghiero and Cescatti 2008). More rarely, regression models relating RS to the spatial distribution of its main environmental drivers, including roots, microbial substrates, soil temperature and soil moisture, or soil depth (Tang and Baldocchi 2005; Saiz et al. 2006; Savage et al. 2008; Martin and Bolstad 2009), are used. Often, however, other easily measured soil variables have limited capability to predict spatial variation in RS (Stoyan et al. 2000; Allaire et al. 2012; Kreba et al. 2013). For instance, at Harvard Forest, Giasson et al. (2013) found no more correlation among neighboring collars than among distant collars. In fact, they showed that variation due to spatial variability was as great as the variation due to experimentally imposed manipulations, such as soil heating, rain exclusion, selective harvest, and N additions.

Multiple studies have discussed the challenge of representing spatial heterogeneity of RS (e.g. Stoyan et al. 2000), and geostatistical techniques have been used to demonstrate the difficulty of measuring RS from the plot to the landscape scale (Kosugi et al. 2007; Teixeira et al. 2013). One of the reasons for this lack of predictive ability is the tendency of RS to have “hot spots” and “hot moments” (sensu McClain et al. 2003), locations or periods of time with disproportionately high RS emissions compared to surrounding conditions (Kim et al. 2012; Leon et al. 2014). McClain et al. (2003) suggested that these occur when episodic hydrologic flow paths deliver substrates and nutrients in abundance to locations of high biological activity. However, Leon et al. (2014) showed that hot spots may not be predictable, as they may appear in areas where no evidence of high metabolic activity was present during antecedent measurements. Furthermore, Leon et al. (2014) showed that the location and intensity of hot spots can change seasonally, and can be correlated with different environmental variables in different seasons (Fig. 8).

Fig. 8

Spatial patterns of RS generated by ordinary kriging for the dry season (a) and wet season (b) in a Mediterranean shrubland. The figure shows the emergence of a hot spot (i.e., area of high RS) during the wet season (orange area in panel b). Reprinted with permission from Leon et al. (2014)

This spatial variability makes measuring RS representatively and robustly, and using these measurements to compare and evaluate RECO derived from EC, more challenging. One way researchers can help to identify the presence of hot spots and hot moments to other data users is to report the full probability distribution function (or at least the second and third moments—variance and skewness) of observed soil respiration in addition to site means (Lavoie et al. 2014). Ideally, complete data sets that include full time series of raw data and all sampling locations would be made available in archived repositories, allowing users to define hot spots and hot moments using different algorithms, and re-analyze older data as new techniques or insights become available.
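As a minimal sketch of the reporting suggested above, the following computes the mean, sample variance, and skewness of a set of collar fluxes using only the standard library. The function name and the example values are hypothetical, and the skewness shown is the simple population (moment-based) form, which is one of several conventions.

```python
def summarize_fluxes(fluxes):
    """Return n, mean, sample variance, and moment-based skewness of fluxes."""
    n = len(fluxes)
    if n < 3:
        raise ValueError("need at least three measurements")
    mean = sum(fluxes) / n
    # Central moments about the mean (population form)
    m2 = sum((x - mean) ** 2 for x in fluxes) / n
    m3 = sum((x - mean) ** 3 for x in fluxes) / n
    variance = m2 * n / (n - 1)  # unbiased sample variance
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {"n": n, "mean": mean, "variance": variance, "skewness": skewness}


if __name__ == "__main__":
    # Made-up collar fluxes (umol m-2 s-1); one collar sits on a "hot spot"
    print(summarize_fluxes([1.0, 1.0, 1.0, 1.0, 6.0]))
```

A strongly positive skewness alongside the site mean signals to other data users that a few locations or times dominate the total, which the mean alone would hide.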

Opportunity: ‘smarter’ quantification of site-level RS

While we do not offer a blanket approach for upscaling, several techniques can provide better site- or ecosystem-level estimates than simple means of chamber locations. If RS spatial variance and its drivers are well quantified, stratified (as opposed to random) sampling can allow for more efficient measurements and robust upscaling (Rodeghiero and Cescatti 2008). For instance, soil chambers in distinct vegetation types can be weighted by the fractional extent of vegetation types within the EC tower footprint (Giasson et al. 2013). Taking this approach a step further, footprint analysis can be employed to account for movement of the EC tower footprint through time, i.e. applying a time-varying source weight function to individual chambers (Budishchev et al. 2014). In addition, random variability of RS across space and scales can be simulated with geostatistical techniques such as kriging (Herbst et al. 2012; Allaire et al. 2012; Leon et al. 2014).
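The vegetation-weighted approach can be illustrated with a short sketch. The stratum names, flux values, and fractions below are invented for the example, and the function is a hypothetical helper rather than a method from the cited studies.

```python
def weighted_site_flux(stratum_means, stratum_fractions, tol=1e-6):
    """Area-weighted site RS from per-stratum mean fluxes.

    stratum_means: dict of stratum name -> mean chamber flux
    stratum_fractions: dict of stratum name -> fraction of tower footprint
    """
    if set(stratum_means) != set(stratum_fractions):
        raise ValueError("strata must match between means and fractions")
    total = sum(stratum_fractions.values())
    if abs(total - 1.0) > tol:
        raise ValueError("footprint fractions must sum to 1")
    return sum(stratum_means[k] * stratum_fractions[k] for k in stratum_means)


if __name__ == "__main__":
    # Made-up values: mean RS (umol m-2 s-1) and footprint cover by vegetation type
    means = {"conifer": 2.0, "wetland": 4.0}
    fractions = {"conifer": 0.75, "wetland": 0.25}
    print(weighted_site_flux(means, fractions))
```

The same function could be called once per time step with fractions taken from a footprint model, which is one simple way to represent a time-varying source weight.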

In complex terrain, in contrast to typically flat EC sites, empirical models relating RS to hydrologic factors may provide ‘smart’ upscaling to watershed scales. For instance, Riveros-Iregui and McGlynn (2009) demonstrated that upland accumulated area was a robust predictor of RS, because both soil moisture and RS increased at valley bottoms relative to ridges. However, the opposite pattern was found at a different Western U.S. site, where elevation-related cooling increased soil moisture, and thus RS, at high elevation positions (Berryman et al. 2015). Ultimately, researchers must discern effects that covary with topography, such as soil moisture, soil organic carbon (as available substrate for RS), temperature, and vegetation type, to fully understand the biophysical controls on RS in complex terrain.

RS measurements frequently must be scaled in time as well as space, as most site-level estimates are derived from sporadic manual measurements, not continuous ones. While seasonal differences in RS can be large (e.g. Khomik et al. 2006), seasonal patterns tend to be relatively straightforward to account for in gap-filling regression models, absent the effects of disturbances or drought (Borken et al. 2006; Martin et al. 2012; Barba et al. 2016). More difficult to simulate are RS ‘bursts’ after rewetting or thawing (Kim et al. 2012). This phenomenon, known as the “Birch effect”, is thought to be due to some combination of increased microbial populations, higher metabolism, and changes in the physical protection and hydrological connectivity of the soil (Xu and Baldocchi 2004; Tang and Baldocchi 2005; Oikawa et al. 2014). These bursts present challenges similar to those of other types of ‘hot moments’, but have more predictable occurrence, and in some cases can constitute an appreciable fraction of annual fluxes. Also difficult to represent are seasonal or diurnal hysteresis of RS responses to temperature (Kicklighter et al. 1994; Riveros-Iregui et al. 2007; Phillips et al. 2011), and linkages and lags between photosynthesis and respiration (reviewed by Carbone and Vargas 2007; Davidson and Holbrook 2009). Oikawa et al. (2014) proposed a model that could simulate wetting-related pulses and temperature hysteresis, and Zhang et al. (2015) used a CO2 gas transport model to simulate temperature hysteresis. However, there are presently no standardized gap-filling algorithms or approaches for soil respiration, unlike for EC (Moffat et al. 2007). In the most comprehensive work to date, Gomez-Casanovas et al. (2013) assessed a variety of gap-filling RS algorithms and found that different methods exhibited different levels of skill with data gaps of various lengths, but rarely produced large systematic biases.
Obviously, continuous data are preferable to periodic campaign-style measurements for constructing annual carbon budgets, and systematic biases can be introduced when measurements are made always at the same time of day (Savage et al. 2008; Savage et al. 2009; Gomez-Casanovas et al. 2013). Nevertheless, supplemental periodic measurements with a portable instrument are generally necessary to capture site spatial variability.
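To make the idea of a gap-filling regression concrete, the sketch below fits a temperature-only Q10 model by log-linear least squares and uses it to predict RS at unmeasured times. The Q10 form is assumed here as one common choice; it is not claimed to be the method of any study cited above, and the function names, reference temperature, and data are illustrative.

```python
import math


def fit_q10(temps_c, fluxes, t_ref=10.0):
    """Fit RS = R_ref * Q10 ** ((T - t_ref) / 10) by least squares on ln(RS).

    Returns (R_ref, Q10). Fluxes must be positive.
    """
    if len(temps_c) != len(fluxes) or len(temps_c) < 2:
        raise ValueError("need at least two paired observations")
    if any(f <= 0 for f in fluxes):
        raise ValueError("fluxes must be positive to take logarithms")
    x = [(t - t_ref) / 10.0 for t in temps_c]
    y = [math.log(f) for f in fluxes]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("temperatures must not all be identical")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx                  # equals ln(Q10)
    intercept = y_mean - slope * x_mean  # equals ln(R_ref)
    return math.exp(intercept), math.exp(slope)


def predict_rs(temp_c, r_ref, q10, t_ref=10.0):
    """Predict RS at a given soil temperature from fitted parameters."""
    return r_ref * q10 ** ((temp_c - t_ref) / 10.0)


if __name__ == "__main__":
    # Made-up survey data: soil temperature (deg C) and RS (umol m-2 s-1)
    temps = [2.0, 8.0, 14.0, 19.0, 11.0]
    rs = [0.9, 1.6, 2.9, 4.4, 2.1]
    r_ref, q10 = fit_q10(temps, rs)
    print(round(r_ref, 2), round(q10, 2), round(predict_rs(16.0, r_ref, q10), 2))
```

A model of this kind captures the smooth seasonal signal but, by construction, would miss rewetting pulses, hysteresis, and photosynthesis-linked lags, which is why the more mechanistic models mentioned above were proposed.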

In addition to estimating a site average RS, it is necessary to estimate the associated uncertainty (Reichstein and Beer 2008). While uncertainty of individual RS chambers can be quantified readily (Pumpanen et al. 2004), describing site-level uncertainty is more difficult, as it includes uncertainty due to spatial variability in RS, gap-filling assumptions, and the goodness of fit of the statistical model used for scaling (Lavoie et al. 2014). A growing body of research is indicating consistencies in the statistical features of RS in both time and space, work that may provide simplifying solutions for uncertainty quantification. For example, at the Willow Creek AmeriFlux site (Lavoie et al. 2014), at Harvard Forest (Savage et al. 2008), and at multiple sites around the world (Cueva et al. 2015), the standard deviation of RS measured at multiple locations within the ecosystems was heteroscedastic, increasing linearly with site average RS. Similarly, in the temporal domain, the random error of RS (defined as the deviation between measures made at one location under as-similar-as-possible conditions) was found not only to scale with RS, but to scale with the same slope across a large number of sites, and even in different ecosystems (Savage et al. 2008; Lavoie et al. 2014; Cueva et al. 2015).
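One way to check a site for the linear scaling of spatial standard deviation with mean RS is sketched below. The helper name and the data layout (a dictionary of collar fluxes per sampling date) are assumptions made for the example, and the values are invented.

```python
import statistics


def sd_vs_mean(fluxes_by_date):
    """Regress spatial SD of RS on mean RS across sampling dates.

    fluxes_by_date: dict of date label -> list of collar fluxes on that date.
    Returns (slope, intercept) of the ordinary least-squares line.
    """
    means = [statistics.mean(v) for v in fluxes_by_date.values()]
    sds = [statistics.stdev(v) for v in fluxes_by_date.values()]
    n = len(means)
    if n < 2:
        raise ValueError("need at least two sampling dates")
    mean_x = sum(means) / n
    mean_y = sum(sds) / n
    sxx = sum((m - mean_x) ** 2 for m in means)
    if sxx == 0:
        raise ValueError("site means must differ between dates")
    sxy = sum((m - mean_x) * (s - mean_y) for m, s in zip(means, sds))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Made-up collar fluxes (umol m-2 s-1) on three survey dates
    data = {
        "spring": [1.0, 3.0],
        "summer": [2.0, 6.0],
        "autumn": [3.0, 9.0],
    }
    print(sd_vs_mean(data))
```

If the fitted intercept is near zero, the slope approximates a constant coefficient of variation, which could then be used to attach an uncertainty to a site mean on dates with few collars.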

Opportunity: downscaling tower measurements

In addition to scaling up RS to the EC site level, there are emerging opportunities for decomposing EC fluxes to capture spatial variability within the tower footprint, perhaps more in line with the spatial scale of RS. Flux-tower-based RECO estimates may suffer from systematic bias because the flux footprint source area changes constantly with time, as a function of wind direction, surface structural properties, and atmospheric stability (Schmid 2002). However, techniques that take advantage of this footprint variability can be used to map spatial variation and provide gridded estimates of RECO.

The Environmental Response Function (ERF) approach (Metzger et al. 2013) is one such technique for directly attributing the measured flux to its actual source area, and for using that information to appropriately scale tower fluxes to other measurements such as chambers. As applied to towers (Xu et al. 2017), the ERF technique estimates minute-to-minute variations in the flux footprint to compare measured fluxes against covarying spatial and temporal drivers. The comparison allows a statistical model to be built to map fluxes and their uncertainty across space, using a machine learning approach. With ERF, down-scaled NEE observations can then be used to derive a re-upscaled, gridded RECO that can be directly compared to point-level RS chamber measurements. To date, ERF has only been applied to direct fluxes (NEE), but a new opportunity arises to use ERF in partitioning and in improving the comparison of daytime RECO estimates and RS.

Using RS for model development

Aside from the impetus to use as many data sources as possible for data-model synthesis, field RS data have the tantalizing potential to provide a critical bridge across scales and domains. In RH models, there is a significant scale gap between lab incubations and litter decomposition experiments, and the field- and global-scale questions that are being asked of these models. RS measurements scaled to site and ecosystem levels (see Challenge 3) could reduce some of this gap and bridge key spatial-temporal heterogeneities in the RH model. In addition, NEE is currently the only measurement routinely used for model development which crosses both soil and plant domains. Having additional information from RS for model benchmarking would allow modelers to better identify poorly-performing sections of land carbon models (Luo et al. 2012).

The few studies that have directly tested the value of RS for model fidelity also suggest the importance of these data. Keenan et al. (2013) rated the utility of data streams for data-model synthesis, and found RS data were one of the most valuable measurements for reducing uncertainty of forest productivity estimates. They concluded that most model improvement is seen with a limited number of measurement types, and speculated that the most valuable measurements provide information about turnover of C pools at both fast and very long timescales. Similarly, Zobitz et al. (2008) tested whether RS could be simulated by a forest C model using only NEE and soil C stock data for parameterization, and found that the model could not be satisfactorily constrained, implying that RS brings distinct and unique information to any model-data assimilation exercise.

We see three types of model improvement activities that could utilize RS data: (1) data-model synthesis where the aim is reducing uncertainty of parameter estimates, (2) preventing models from predicting biologically implausible states, and (3) benchmarking a range of models to common sets of observations (Kelley et al. 2013). In Table 2 and below we give examples using RS data for each of these.

Table 2 Uses of RS data for terrestrial model improvement

Data-model synthesis techniques attempt to identify and select models, parameters, and uncertainty that are more parsimonious or consistent with the entire range of collected observations. These approaches, which can be Bayesian in nature, attempt to propagate a probabilistic set of consistent parameters forward, allowing for direct estimation and partitioning of sources of uncertainty (LeBauer et al. 2013; Dietze et al. 2013). One advantage of these approaches is that simple assumptions often made on scaling, gap-filling, and single variable regression can be removed if model processes are able to simulate components or model outputs can be “sampled” like the original observation. For RS, this means that infrequent survey measurements can be assimilated without requiring separate gap-filling procedures.

Particularly relevant for RS, data-model synthesis allows observations to be used even when they do not match one-to-one with modeled variables. For instance, incorporating RS in data-model syntheses with a forest C model was found in one instance to improve estimation of NEE (Keenan et al. 2013), and in another instance to reduce the uncertainty in heterotrophic respiration and the size of soil carbon pools (Keenan et al. 2012). Direct model-data integration would require terrestrial C models to simulate belowground RA, which many do not. Autotrophic respiration is usually split into growth and maintenance respiration processes, but rarely into separate above- and below-ground components (Medvigy et al. 2009; Oleson et al. 2010; Parton et al. 2010; Dunne et al. 2012). Pushing for the development of allocation schemes that distinguish belowground RA in global scale models would provide an elegant solution to the comparison problem, but it is by no means the only option.

In addition to reducing parameter uncertainty, another approach for data-model synthesis is to focus on preventing parameters and states that are not biologically plausible (Bloom and Williams 2015). A common problem for C models is equifinality: they can produce similar annual fluxes with different underlying parameters and process representations. Poor model parameterization is likely to translate to weak predictive power under changing environmental conditions. Perhaps the most powerful use of RS data is as a diagnostic tool to identify implausible parameterization of RH or RA, as a form of ecological dynamic constraint (Bloom and Williams 2015). For instance, to check on the parameterization of RA, one could use RS as a co-constraint with aboveground measures of plant growth, such as biometric or remotely-sensed measures of NPP. To check on the parameterization of RH, one could use RS as a co-constraint with C stock or 14C data, thereby encompassing turnover processes at immediate to long timescales.

Comparing modeled and observed seasonal dynamics of RS provides another means for bounding plausible conditions. There are some situations (e.g. winter dormancy) where RA is all but shut down and we would expect RS to be roughly equal to RH. In these cases, relatively straightforward direct comparison of RS and RH is likely reasonable. More creative comparisons could be done by examining relative shifts of RS across an environmental gradient where variations in either RH or RA are expected to dominate, or where belowground RA is expected to be proportional to some other measurement like leaf RA. Ideally, competing hypotheses would be codified in model formulations, lending themselves to formal data-model integration and model selection. Clearly, this is a study-specific approach that requires a deep understanding of the mechanisms being investigated, but it has the potential to stimulate creative uses of data-model synthesis techniques.

Building on these ideas, RS data can also play a valuable role in benchmarking multiple models against common sets of observations. Luo et al. (2012) summarized several key qualities that benchmark data should have. First, benchmark data should reflect properties that are fundamental to all ecosystems, such as C uptake and residence time. As discussed above (see section 2.3), RS provides a critical data source for evaluating both soil and plant C residence time. It is particularly powerful for evaluating residence time at the timescale of manipulative experiments, and when measured in combination with isotopic dating methods. Second, benchmark data should be selected to reduce equifinality. RS provides a focus on soil processes that may be obscured in EC data or remotely-sensed data products, and is thus particularly useful for detecting weaknesses in belowground parameterization. RS data are also abundant and geographically widespread, another important quality for reducing equifinality in global-scale models. Third, benchmark data should be reliable. As discussed above (Challenge 1), RS measurement systems have the benefit of validation under laboratory conditions, and are likely more reliable than EC-derived estimates of RECO.

Luo et al. (2012) also noted that land models can be evaluated on the basis of seasonal patterns rather than absolute values. RS data are particularly useful for calibrating soil models at seasonal timescales, since soil C stock and even litterbag data integrate decomposition processes occurring over timescales ≥1 year. For example, Fig. 9 shows seasonal patterns in RH and RECO simulated by the Community Land Model (CLM v.4.0) in comparison to observed RS and RECO from a pine forest site in Oregon (US-ME2, model data from Hudiburg et al. 2013). Even after site-specific tuning of the model, seasonal patterns had considerable mismatch with observations. Modeled fluxes were strongly suppressed by seasonal summer drought. As a result, RH peaked in winter rather than during the growing season, and modeled RECO peaked two to three months early, demonstrating fundamental problems in representation of soil and plant activity.

Fig. 9

A comparison of RH and RECO modeled with CLM v.4.0 to chamber RS and EC-based RECO (Level 4 data) at the AmeriFlux Metolius intermediate pine site (US-ME2). Modeled RECO peaked two to three months earlier than observations, missed a large portion of the growing season, and is decoupled from RH, which peaked in winter rather than during the growing season. Model output is from Hudiburg et al. (2013)
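A seasonal-timing comparison of the kind described above can be reduced to a simple diagnostic: the offset, in months, between the modeled and observed flux peaks. The sketch below is a hypothetical helper with made-up monthly values; it does not reproduce the data in Fig. 9.

```python
def peak_month(monthly_values):
    """Return the calendar month (1-12) with the largest flux."""
    if len(monthly_values) != 12:
        raise ValueError("expected 12 monthly values")
    return max(range(12), key=lambda i: monthly_values[i]) + 1


def peak_lag_months(modeled, observed):
    """Modeled minus observed peak month, wrapped to the range -6..5.

    Negative values mean the model peaks earlier than the observations.
    """
    lag = peak_month(modeled) - peak_month(observed)
    return (lag + 6) % 12 - 6


if __name__ == "__main__":
    # Made-up monthly mean fluxes, January to December
    modeled = [1.0, 1.5, 2.5, 3.5, 3.0, 2.0, 1.5, 1.2, 1.0, 1.0, 1.0, 1.0]
    observed = [0.5, 0.6, 1.0, 1.8, 2.6, 3.2, 3.8, 3.4, 2.5, 1.5, 0.8, 0.5]
    print(peak_lag_months(modeled, observed))
```

Because the lag is wrapped around the calendar, a model peaking in January against an observed December peak is reported as one month late rather than eleven months early.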

Conclusions and future directions

RS data are a rich source of information on soil responses to land use and climate factors, and are important for advancing our understanding of the terrestrial C cycle. However, to a large extent RS data have not been integrated with modeling efforts nor with analyses of related C fluxes and pools. We argue that in field studies, RS data are particularly useful for:

  1) Identifying and potentially gap-filling low-quality nighttime EC data. RS data are completely independent of EC, and are not subject to the same nighttime biases. They therefore provide a lower plausible bound for RECO and, more generally, an additional EC data quality metric. RS measurements can also be independently validated, for example with sand-column flux generators. For these reasons, RS measurements should be considered important components of EC site instrument arrays, particularly at forest sites with complex canopies.

  2) Determining soil responses to non-equilibrium conditions over experimental time frames. Given the slow turnover of the majority of soil C pools and high spatial variability, it is often difficult to detect statistically significant changes in soil C stocks in response to manipulations of land use or climate variables. Measurements of RS and partitioned RH components therefore provide an important means to detect experimental treatment effects on soil C turnover.

  3) Identifying sources of RS. While partitioning RS into autotrophic and heterotrophic components is valuable in its own right, applying physical partitioning in combination with isotopic monitoring makes it possible to evaluate contributions from recent and old C by both plants and microbes, as well as abiotic CO2 sources, under field conditions.

  4) Quantifying spatial variability within tower footprints. Soil flux measurements can be used to identify hot spots and hot moments, evaluate spatial variables that control RS, and also evaluate the representativeness of EC tower measurements within the larger domain.

In addition, RS data can be used in the following ways to improve terrestrial C cycle models:

  1) Improving data reliability. RS data provide directly measured respiration fluxes, in contrast to inferred EC fluxes based on partitioning NEP, and can be used to reduce the weight given to low-quality EC data during data assimilation procedures.

  2) Testing multiple-pool soil C models. When measured in combination with partitioned fluxes (RH and RA), or 14C abundance, RS data can be utilized for calibration of complex soil C models that represent ‘new’ and ‘old’ C contributions to both RH and RA, or additional C sources, such as geologic sources.

  3) Preventing biologically implausible states. When used as a joint constraint with soil C stock data, RS can improve modeled soil C turnover across immediate to very long timescales. When used as a joint constraint with plant growth data, RS can provide a plausibility test for evaluating RA and plant allocation.

  4) Addressing model equifinality. RS data can help to ensure that belowground processes are well-constrained in complex terrestrial models. Because RS crosses scales and plant and soil domains, it can reduce the uncertainty of several related variables in data-model synthesis techniques, including NEE, RA, RH, and soil C stocks.

  5) Providing data for model benchmarking. Because RS data are abundant, geographically widespread, and generally reliable, they have qualities that make them well-suited for benchmarking models.

Taking advantage of the full potential of RS measurements will require a concerted effort to make RS data more accessible and intercomparable among sites. Many measurement ‘best practices’ and data processing QA/QC procedures have been suggested, but need to be more broadly adopted by the terrestrial ecology community (Bahn et al. 2008; Heinemeyer et al. 2011; Gomez-Casanovas et al. 2013; Lavoie et al. 2014; Maier and Schack-Kirchner 2014; Cueva et al. 2015). While the Global Soil Respiration Database (Bond-Lamberty and Thomson 2010) has made an extensive amount of seasonally and annually averaged RS data available, similar efforts are needed to gather and make publicly available instantaneous flux measurements from survey campaigns and autochambers (McFarlane et al. 2014). Ideally, efforts to catalog both raw and processed RS data would allow comparison with contemporaneous EC tower fluxes and other soil C data, including SOC stocks, isotope measurements, and related biogeochemistry data. These efforts can help to usher in new global syntheses, and progress in both measurement and modeling realms.