Introduction

Given their often large and complex microbiomes, soils can be considered as hotspots for microbial biodiversity on Earth. As a result, soils provide a large number of biological services that are essential for life on Earth, which are considered as life support functions (LSF). These LSF include:

  (1) The provision of “fertile ground” as a basis for a sustainable bio-economy, including the growth of food, feed, fibers, and bioenergy crops;

  (2) The maintenance of a natural unthreatened plant biodiversity at sites which are not used for agricultural production;

  (3) The safeguarding of drinking water, by filtering and degrading pollutants in soil before they enter the groundwater body;

  (4) The protection from erosion;

  (5) The potential to act as a sink for atmospheric CO2.

Moreover, soils have major roles as sources—or sinks—of greenhouse gases such as CH4, methyl bromide, and N2O, and land management practices have great influence on the underlying processes (Van Elsas et al. 2007).

However, this multi-functionality of soil is highly endangered as a result of the ongoing global change. Climate change induces not only an increase in temperature in the long run, but it is also associated with increased frequencies of extreme weather events, like prolonged drought periods or heavy flooding. Next to these, increased land use intensities (causing, for instance, overgrazing, and agricultural production declines), mining and pollution pose additional challenges. Furthermore, land use conflicts often reduce areas with high-quality soils due to urbanization and the associated sealing. Thus, the persistent threat of soil degradation driven by climatic and anthropogenic forces prioritizes the development of strict directives for the protection of soils, as recently promoted by the European Union (www.ec.europa.eu/environment/soil/index_en.htm). In this respect, the importance of developing robust, reliable and resilient biological indicators for monitoring of soil quality has been emphasized, in order to establish an early warning system of potential losses of the multi-functionality of soils.

By definition, such indicators should allow easy measurement and be accurate for the purpose they were developed for. In addition, it would be advantageous if costs were kept low. Thus, there have been several past efforts to define biological indicators of soil quality, mainly focusing on the “visible” parts of the soil biota, i.e., the soil macrofauna (Stott et al. 2009). Whereas such indicators have indeed become well established and found their way into several guidelines (e.g., the abundance and/or diversity of earthworms or of nematodes (ISO 11268)), indicators that describe the status of the soil microbiome (or of microbial key players) are still rare. The currently existing indicators are mostly based on so-called sum or black-box parameters, as exemplified by traditional parameters like microbial biomass, global or potential microbial activity patterns or assays that determine potential enzymatic activities (Nannipieri et al. 2003; see also Table 1). For example, of the ten indicators used to evaluate soil quality given by Andrews et al. (2004), only two (microbial biomass C and potential N mineralization) address microbiological soil properties. Recently, an alternative has been proposed, based on the use of specific ratios that report on function. Thus, the metabolic quotient (C-CO2/microbial biomass C) or the ratio between enzyme activity and microbial biomass have come into focus (Nannipieri et al. 2003, 2012). However, there is still a lack of emphasis on soil microbiological processes, which also becomes apparent when methods to analyze soil quality, as standardized by ISO, are considered (https://www.iso.org/committee/54366/x/catalogue/completed). Of the 52 methods proposed, most use aspects of the soil macrofauna and/or of plants as quality indicators. Also, 13 new methods that are currently under development do not include tools to analyze the soil microbiome and its functional traits.
It is encouraging, though, to note that ISO standard 17601:2016, of December 2016, now describes methods that aim to determine the abundance of selected microbial gene sequences by quantitative PCR using soil microbiome DNA (https://www.iso.org/committee/54366/x/catalogue/completed). A generic account of such methods, including their intricacies, is presented in Table 1. However, the selection of traits is lagging behind this development. Hence, taking into account the importance of the soil microbiome for most soil LSFs, there is a strong need to implement new trait-based indicators that monitor such LSF.
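
The ratio-based indicators mentioned above (the metabolic quotient and the enzyme activity per unit of microbial biomass) can be illustrated with a minimal Python sketch. The function names, the units given in the comments and the requirement of a positive biomass value are assumptions made for illustration only; they are not taken from the cited studies.

```python
def metabolic_quotient(co2_c_rate, microbial_biomass_c):
    """Return respired CO2-C per unit of microbial biomass C.

    Assumed units (hypothetical): co2_c_rate in ug CO2-C per g soil per hour,
    microbial_biomass_c in ug biomass C per g soil.
    """
    if microbial_biomass_c <= 0:
        raise ValueError("microbial biomass C must be positive")
    return co2_c_rate / microbial_biomass_c


def specific_enzyme_activity(enzyme_activity, microbial_biomass_c):
    """Return a potential enzyme activity normalized to microbial biomass C."""
    if microbial_biomass_c <= 0:
        raise ValueError("microbial biomass C must be positive")
    return enzyme_activity / microbial_biomass_c
```

Both ratios only make sense when numerator and denominator refer to the same soil sample and the same mass basis, which is why the sketch leaves unit handling to the caller.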

Table 1 Overview of methods for studying soil microbiomes (optimized from van Elsas and Boersma 2011)

In this opinion paper, we address how methodological developments in the last decades have revolutionized our view of the “living soil” and how this knowledge can be used for the development of improved indicators of soil quality. We define soil quality as “the capacity of a soil to function within ecosystem boundaries to sustain biological productivity, maintain environmental quality and promote plant and animal health” (Doran and Parkin 1994). Based on this definition, we will present steps to develop a framework in which the concept of a “normal operating range” (Pereira e Silva et al. 2012; Semenov et al. 2014) of soils is addressed, as governed by the microbiome and soil conditions.

The soil microbiome–an ignored majority

The reasons why soil bacteria and archaea, as well as fungi and other micro-eukaryotes, have to a great extent been ignored in the discussion on indicators for soil LSF are diverse. For a long time, the phenomenon denoted as the “Great Plate Count Anomaly” (Staley and Konopka 1985) has hampered our efforts to characterize the soil microbiome in depth. Moreover, as first evidenced by the hallmark soil DNA-based work of Torsvik et al. (1996), the soil microbiome is extremely complex with respect to its diversity. For instance, recent estimates consider that one gram of soil may harbor more than 10,000 different bacterial species, which are strongly interconnected and form dense network structures (Nesme et al. 2016). Furthermore, using “-omics” approaches, soil microbiomes can be assessed at different levels (Emmerling et al. 2002; Nannipieri et al. 2014), including the analysis of potential (by measuring all genes present in a given sample) or actual activities (by focusing on gene expression). Finally, the soil microbiome is very dynamic in time and space, which has resulted in the concept of “hot spots,” next to “hot moments,” in soil (Blagodatskaya and Kuzyakov 2013). Consequently, if parts of the soil microbiome should serve as indicators for ecosystem services from soils, there is a need not only to address the issue “how to analyze microbial communities in soil” but also “what is the required frequency of sampling” and “what is the optimal size of a sample.” Besides these technical issues, the lack of conceptual frameworks, like missing ecological concepts (e.g., with respect to the effect of diversity on the resilience of soil or on the stability of function), the absence of threshold values that define soil quality, and the lack of well-defined standard methods, have hampered developments in this area.

Today, a wealth of studies has been dedicated to microbial community composition in soil (Lauber et al. 2009; Vestergaard et al. 2017). Typically, ribosomal genes like the 16S rRNA gene for bacteria and archaea, or the ITS2 region for fungi, have been used to assess the diversity patterns of the respective microbial groups (Dini-Andreote et al. 2017; Poulsen et al. 2013). Whereas in the beginning of the “molecular age” in soil microbial ecology, mostly fingerprinting techniques, such as PCR-DGGE (Muyzer and Smalla 1998), were used, the progress in sequencing technologies has enabled us to (1) enhance the throughput, and (2) phylogenetically group and name major taxa in the soil microbiome. The current development of sequencing technologies has indeed resulted in the generation of “big data,” which poses a challenge to our analysis methodologies (bioinformatics). Taking only the keywords “bacteria” and “soil,” almost 50,000 publications were recently found in the NCBI database (February 2017), and this number is increasing very rapidly. The data could provide a perfect playground that allows one to identify major responders to soil environmental conditions or stresses. Here, “metadata” like major soil properties, climatic conditions or soil use are clearly needed, as they provide the ecological “context” that underlies the response data. We here propose relevant microbial groups and/or specific genes of these, that depict important steps of the aforementioned LSF, to be used as straightforward indicators for soil quality. As indicated in the foregoing, the abundance of such indicator organisms or genes can be easily monitored by quantitative PCR (qPCR). On top of that, the activity patterns of the respective microbes can be assessed by reverse transcription (RT)-qPCR. A partly empirical, partly theoretical framework will then have to be developed that fits the data and describes the potential for efficient functioning of the identified function populations. 
Adding to this conceptual framework, the application of selected macro-ecological concepts, like the “functional response groups” concept (Nunes et al. 2016), for analyzing the soil microbiome will result in identifying groups of soil bacteria that respond similarly to challenges under comparable conditions (Lynch et al. 2004). However, critical evaluation of the extent to which such groups are indeed strongly linked to LSF or ecosystem functions is required, and so additional criteria for the selection of robust indicators are needed.
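
To make the qPCR-based monitoring of indicator genes more concrete, the following Python sketch shows one common way of converting quantification cycle (Cq) values into gene copy numbers, namely via a standard curve fitted to a dilution series of known copy number. It is a sketch under stated assumptions and not a prescribed protocol: the function names are hypothetical, a simple least-squares fit of Cq against log10(copies) is assumed, and no correction for DNA extraction efficiency or PCR inhibition is included.

```python
import math


def fit_standard_curve(standard_copies, standard_cq):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in standard_copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(standard_cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, standard_cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency derived from the slope; 1.0 means doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate copies in the reaction."""
    return 10 ** ((cq - intercept) / slope)
```

The same functions could be applied to RT-qPCR data, in which case the estimate refers to transcript (cDNA) copies rather than gene copies.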

Soil microbiome functions that support plant growth

Only a small percentage of the available volume or surface in soil harbors microorganisms (Nannipieri et al. 2003; Van Elsas et al. 2007). Moreover, microorganisms are usually quite inactive in bulk soil, yet show raised activities in soil “hot spots,” such as the rhizosphere, mycosphere, drillosphere, and/or detritusphere. Hereunder, we will examine the rhizosphere as a model hot spot in soil. Microorganisms in the rhizosphere form complex communities, which are strongly driven by influences from the plant root. In particular, those members of the soil microbiome that play major roles in the promotion of the growth and health of plants are important, as they may need stimuli in their microhabitat to exert their function. Some examples are plant-beneficial microorganisms, like the symbiotic nitrogen-fixing rhizobia or the plant-associative nitrogen fixers such as azospirilli and paenibacilli, the phosphate-solubilizing bacteria, and the pathogen-suppressing organisms such as diverse pseudomonads and bacilli (Salek-Lakha and Glick 2007; Berendsen et al. 2012). In addition, microorganisms that incite systemic induced resistance in plants, and arbuscular mycorrhizal (AM) and ectomycorrhizal (EM) fungi that form beneficial symbioses with host plants are important. Particular key genes of such organisms may be taken to serve as indicators for the conduciveness of soil for production of high-quality plants. In contrast, markers for plant pathogens like Fusarium will enable an assessment of the adverse effects plants may sense in their quest to develop in a given soil.

Examples of microbial traits that may yield gene proxies as markers, and that assist plants in (1) warding off pathogens and (2) promoting growth, are the production of anti-pathogen compounds and of plant growth hormones such as indole acetic acid (IAA) (Brimecombe et al. 2007; Berendsen et al. 2012). Moreover, certain bacteria in the rhizosphere produce ACC deaminase, an enzyme that degrades ACC, the precursor of the plant stress hormone ethylene (Brimecombe et al. 2007; Glick 2004). In this way, such bacteria strongly influence plant physiology. Also, microbiome traits that assist the plant in nutrient mobilization, e.g., by nitrogen fixation or phosphorus solubilization, are important (Brimecombe et al. 2007; Sørensen and Sessitsch 2007; Pii et al. 2015).

However, we still have a poor understanding of the dynamics of, and interactions in, rhizosphere microbiomes. Given the fact that conditions in the rhizosphere are very dynamic (e.g., as a result of the day/night cycles of photosynthesis and assimilation, and the dynamic growth of roots), plant-associated microbial communities are also considered to be highly variable in time and space. This implies that different types of organisms and functions are important at different time points during plant development. Hence, particular functions that are in high demand at certain points in time may be almost irrelevant at other time points. This facet of the rhizosphere microbiome is often overlooked, yet it needs to be taken into consideration to come to a balanced view of rhizosphere community function. Clearly, on the one hand, plant species type can affect the activity, abundance and composition of the local microbial communities through rhizodepositions (Brimecombe et al. 2007; Sørensen and Sessitsch 2007). On the other hand, the huge microbial diversity of the bulk soil may be more important as the resource library for the rhizosphere, and hence the effect of plant roots (or, specifically, plant species) is often temporary. For instance, when Carex arenaria, a non-mycorrhizal plant species, was grown in 10 different soils, the bacterial diversities in the respective rhizospheres were more similar to those of the bulk soil than to those of rhizosphere communities from other soils (De Ridder-Duine et al. 2005).

Soil microbiome functions that drive nitrogen transformations

With the ongoing metagenomics-based analyses of soil microbiomes, a progressively higher number of genes encoding key enzymes that drive relevant LSF processes in soils has been described recently (Dini-Andreote et al. 2017; Nelson et al. 2016; Vestergaard et al. 2017). This includes steps in the transformation of carbon, nitrogen, sulfur and phosphorus. Table 2 lists a number of gene proxies for these processes. Nitrogen (N) transformation processes constitute a nice showcase, as primer systems for genes encoding proteins that drive inorganic nitrogen turnover are available, in particular for the key processes nitrogen fixation, nitrification and denitrification. Whereas such processes occur, broadly speaking, under a wide range of conditions, other transformation steps like nitrate ammonification or anaerobic ammonia oxidation (anammox) only occur under restricted conditions (Ollivier et al. 2011) and therefore metadata should be coupled to the PCR-generated data. The primers that amplify the genes used as proxies (Table 2) can, thus, be used for quantifying gene or transcript copy numbers (the latter following reverse transcription). They may also assess diversity patterns, as related to “conditions” described by the metadata. Although the genes that are amplified may constitute good markers for the related processes and it seems straightforward to use them as proxies for N turnover processes, biases introduced by the primer systems used need to be taken into account (Wei et al. 2015). Furthermore, for some of the genes, functionally-redundant forms exist in nature (e.g., the nitrite reductase genes nirS and nirK) (Philippot 2002). Thus, if the complete transformation step should be quantified or the diversity pattern described, all of these forms must be measured to avoid biases. Furthermore, in particular the genes that encode steps in denitrification and nitrogen fixation may be expressed only under certain environmental conditions.
Thus, analysis of the respective gene pool does not, a priori, permit inferences about metabolic fluxes. Hence, only potential function is indicated and not actual activity.
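
Two of the points made above can be illustrated with a short Python sketch: functionally-redundant gene forms (such as nirS and nirK) are summed so that the complete transformation step is covered, and transcript copies are related to gene copies to separate expression from mere genetic potential. The function names are hypothetical, and the transcript-to-gene ratio is only one possible summary, chosen here for illustration; it is assumed that all inputs are copy numbers expressed on the same basis (for example per gram of dry soil).

```python
def total_gene_copies(copies_by_form):
    """Sum copy numbers over all functionally redundant forms of one step.

    copies_by_form maps a gene name (e.g. "nirS", "nirK") to its copy number.
    """
    if not copies_by_form:
        raise ValueError("at least one gene form is required")
    return sum(copies_by_form.values())


def transcript_to_gene_ratio(transcript_copies, gene_copies):
    """Relate measured transcripts to gene copies (expression vs. potential)."""
    if gene_copies <= 0:
        raise ValueError("gene copies must be positive")
    return transcript_copies / gene_copies
```

Even such a ratio remains a proxy: it reports on expression at the time of sampling and does not by itself quantify process rates.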

Table 2 Some molecular markers proposed for evaluating nutrient cycling functions in soil

The amount of plant available N in soil depends on N immobilization-mineralization processes. Both processes are carried out by diverse microbiota and depend on several reaction steps, and often we lack knowledge on the respective genes and organisms. On the basis of these considerations, one could place a focus on the microbial proteome, thus considering the enzymes involved in the rate-limiting steps of each process. For instance, N immobilization (the conversion of ammonium-N to amino acid-N), is catalyzed by glutamine synthetases. These enzymes have higher affinity for the substrate (lower Km value) than glutamate dehydrogenase (Reitzer and Magasanik 1987). In the case of N mineralization, this approach is more problematic, as organic N is made up of different compounds, and so diverse genes are likely needed to describe the process. For example, for the transformation of protein-N into ammonium-N, the activity of several exo- and endopeptidases is needed, as these split proteins into oligopeptides and subsequently amino acids. Further, amino acid oxidases and amino acid dehydrogenases convert the latter compounds into ammonium-N (Nannipieri and Paul 2009). The existence of several proteases that can hydrolyse proteins makes the development of a single robust marker difficult. Thus, although several primers have been developed that report on the relative abundance of protease-encoding genes and the protease activity of soil can be related to the presence of these genes (Mrkonjic Fuka et al. 2008a, b; Baraniya et al. 2016), we still lack an overall picture on proteolysis in soil.

Soil functions that allow filtering and clean-up of percolating water

The filtering function of soil has been thought to be mainly a physical process, as pollutants adsorb to the soil matrix during their flow through the soil system. Thus, a non-living soil can in principle serve as an efficient filter provided that the forces involved in adsorption/desorption inhibit the release of such compounds. However, processes like freeze/thaw cycles and drying/rewetting often lead to the release of bound chemicals in soil. Thus, processes that induce a complete degradation of xenobiotics, driven by the soil microbiome, are advantageous for a sustainable clean-up of soils. There is indeed a huge genetic potential in soil microorganisms to degrade key anthropogenic substances. Such (mainly aerobic) activities of microbes, in terms of pollution removal, have already been described from the 1980s onwards (Cheng et al. 1983; Wilson and Jones 1993; Xu and Zhou 2017). Recent molecular approaches have confirmed most of these old data, and have even shown new important traits of the soil microbiome with respect to the degradation of pollutants. For instance, using metagenomics of microbiomes from industrially-contaminated soil sites in Gujarat (India), Shah et al. (2013) identified over 100 different genes for enzymes predicted to be involved in xenobiotic-degradation pathways. Based on these studies, a huge number of potential indicators has become available. However, testing of these for robustness regarding the biodegradation potential and activity, and subsequent validation in the context of well-defined soils and metadata, is needed (Shah et al. 2013; Sukul et al. 2017).

The soil mobilome

Whereas most of the functions addressed so far are chromosomally-encoded, and so their detection by molecular markers indicates the presence of trait-carrying organisms, other traits are typically found in the “volatile gene pool” in soil, i.e., the mobilome. The mobilome includes plasmids, bacteriophages and even extracellular DNA that can be re-acquired via transformation. In particular, plasmids are of prime importance, as they are carriers of a range of genes that provide “help” to organisms in particular conditions (Van Elsas and Bailey 2002). Thus, several studies have shown that the frequency and diversity of plasmids increase in soil microbiomes as a response to the stresses induced by, e.g., antibiotics or other compounds (You and Silbergeld 2014). Given this response, appropriate markers that describe the soil mobilome backbone (i.e., representing non-accessory genes of plasmids that typically carry response functions) are of high value, as they report on the occurrence of such stresses in soil. Besides, the relevant plasmid-carried functional genes (which are indicative of the functional potential and reflect soil history), and their expression levels, are of importance, as these allow investigating the actual status of a soil. There is a need to define clear markers that stand for these key responsive genes on such plasmids. Mobile genetic elements also provide important traits for microbe-host interactions. The classical example is given by the Ti plasmid of agrobacteria (currently renamed rhizobia) and the Sym plasmid in rhizobia. Thus, there is no doubt that horizontal plasmid transfer occurs at high frequency in the rhizosphere (Van Elsas et al. 1988; Sørensen et al. 2005). But this is not restricted to plant-microbe interactions. The soil-dwelling bacterium Burkholderia terrae, with excellent capacity to colonize fungal hyphae, is a good example of this.
Its huge genome was found to be highly adaptable due to a very high number of genomic islands interspersed in a genomic backbone (Haq et al. 2014). Several of the genetic systems of this organism were found to have key roles in the interaction with soil fungi, an example of this being a five-gene cluster - predicted to encode a response to fungal-released carbon compounds next to a toxic oxygen radical dissipation mechanism - with raised ad-fungus activity (Haq et al. 2017). This latter gene cluster may constitute an excellent candidate that reports on bacteria interacting with fungi in the soil, a key facet of microbiome connectedness. Hence, monitoring genes like the latter will provide key information on the tightness of bacterial-fungal interactomes in the soil.

Brave new world

In line with the foregoing considerations, we conclude that each of the identified soil LSF, instead of being encoded by all members of soil microbiomes, is typically driven by a subset of these; this may be soil- or condition-specific. Hence, taking into account soil metadata, a smart selection of key functions may enable us to define markers for the multi-functionality of soils. With the information that is now within reach, as offered by the state-of-the-art molecular technologies, we will even be able to go beyond the level of single microbes. For instance, the co-occurrence of certain microbial taxa (Barberán et al. 2012; Uksa et al. 2015) will give rise to the unlocking of “co-occurrence networks”; however, the co-occurrence should be distinguished experimentally as being causal versus coincidental. Although still purely statistics-based, such analyses yield information on positive as well as negative microbial interaction patterns, facilitating the development of hypotheses with respect to the actual mechanistic interactions that sustain the system. On the one hand, the total number of network nodes might be of interest as it reports on network connectedness. On the other hand, the presence or absence of certain interactive links might be strong indicators for a particular tightness of the interactive system.
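
As a minimal sketch of how such a statistics-based co-occurrence network could be derived, the following Python code links two taxa whenever the rank correlation of their abundances across samples exceeds a threshold. Several choices are assumptions made for illustration: Spearman correlation is used as the association measure, the threshold of 0.8 is arbitrary, no correction for multiple testing or for compositional data is applied, and all names are hypothetical. As stressed above, an edge in such a network indicates co-variation only and not a causal interaction.

```python
import math
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cooccurrence_edges(abundance, threshold=0.8):
    """Return (taxon_a, taxon_b, rho) for all pairs with |rho| >= threshold.

    abundance maps a taxon name to its abundances across the same samples.
    """
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        rho = spearman(abundance[a], abundance[b])
        if abs(rho) >= threshold:
            edges.append((a, b, rho))
    return edges


def connected_nodes(edges):
    """Return the set of taxa that take part in at least one edge."""
    return {taxon for a, b, _ in edges for taxon in (a, b)}
```

The sign of rho separates positive from negative association patterns, and the number of connected nodes and edges gives a first, crude impression of network connectedness.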

Also, the analysis of actual activities based on the assessment of transcripts is an issue that will become of paramount importance in the future, because it allows correlations to be established between the “potential quality” of soil, as defined by the collective presence and prevalence of DNA-based indicators, and temporally-defined activity measurements (which report on the actual achievable functional quality). The aforementioned issues related to sampling, time and space may be even more critical than the assessment of potentials based on soil microbiome DNA.

Conclusions and the way forward–how to assemble a framework for soil quality assessment?

A framework for soil quality assessment is of major value to global sustainable food/feed/fiber production, as well as forestry. Clearly, soil LSF are the main drivers of this development and these are strongly dependent on the soil microbiome, with special importance for the rhizosphere when it comes to plant health and growth. However, in spite of the long-standing discussion on the selection of indicators for soil quality (e.g., Bloem et al. 2006), there is currently no consensus as to what would constitute a reasonable set of proxies that together constitute a good framework for soil quality assessments (Pereira e Silva et al. 2012). We here present tangible arguments for the contention that a reasonable way forward consists in “accepting” existing tools and broadening the scope on the basis of the accepted value of novel proxies for functions that underpin soil quality.

A key aspect in the assessment of parameters that define the quality of soils is the integration of the appropriate variables into one framework or even a single unit, which, ideally, should be able to characterize the soil quality status. Parameters describing the soil microbiome, grouped into such a unit, should thus enable discerning “normal” (range) situations from stressed (out-of-normal-range) situations.

In an optimized framework for soil quality assessments, both traditional (black-box) and novel (more process-specific) indicators should be combined. The more traditional indicators have been based on assessments of “visible” soil attributes, as well as chemically/physically measurable ones. These should now be combined with advanced novel tools that come about from our progressively-increasing understanding of soil microbial processes, as supported by molecularly-based soil analyses, such as via metagenomics. The latter set of proxies would, for the first time in history, include our much-brightened vision of the genes and proteins that underpin the key soil microbiome-based life support functions. Thus, the construction of a novel framework for soil quality analysis, encompassing the “best of two worlds,” is envisioned. The key argument, however, is that, with respect to the traditional methods, long-term data already exist which will allow a cross-validation when combined with novel indicators.

The bioindicators that report on key LSF in soil need to follow several criteria, i.e., (1) they should be as universal for the different soil microbiomes as possible, (2) they should represent the function addressed in a representative and accurate manner, and (3) they should adequately report on soil disturbances that may cause deterioration or harm. In a well-designed monitoring system that assesses soil quality, the suite of indicators selected on the basis of such criteria should be sufficient to allow pinpointing situations or conditions in the local microbiomes that are out of a pre-defined range that determines (acceptable) quality. Ideally, a suitable framework should include markers that report on the key LSF processes in soil, as discussed in the foregoing. These include, minimally, important processes for nutrient cycling (C, N, P, and S), important beneficial microbial groups and their traits (for example, mycorrhizae, plant beneficials, pathogens, and pollutant degraders) and markers for mobilomes.

How would a novel framework, taking on board the “best of two worlds,” be conceptualized? First, a selection of proxies for key functions needs to be made, including the molecular ones, to establish a minimal data set. The indicators should each work well, reporting in a robust manner on soil (stress) status, and indicating the range of values that define acceptable quality, discerning it from unacceptable quality. Then, the indicators can be combined in order to establish a modeling approach (resulting in a normal operating range (NOR) for a particular soil type in a given region under a certain land use type), much like what was proposed by Pereira e Silva et al. (2012) and Semenov et al. (2014). In this approach, the NOR of a soil is captured into a multi-dimensional model that may encompass a large number (n) of variables in n-dimensional space; the model needs border values that define the acceptable range of each parameter. An example of the application of the model with three out of 22 parameters (nifH, nosZ, and ammonia oxidizing archaea [AOA]) visualized is shown in Fig. 1. Although mathematically viable, such a model is complex. A drawback of such a model is the current lack of accepted “border” (critical) values that delineate what is “inside” the NOR and what is “outside.” However, such “border” values can be filled in progressively, as the model is applied and fed with data, using a range of soils in different states of degradation. In such a process, levels of acceptability/unacceptability need to be fed into the system and correlations need to be drawn with the data inside the model.
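
The idea of a NOR with border values can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the model of Pereira e Silva et al. (2012) or Semenov et al. (2014): the NOR is assumed to be an axis-aligned ellipsoid centered on the mean of a set of reference soils, with semi-axes set to k times the standard deviation of each variable. The multiplier k, the function names and the use of log10-transformed gene abundances as input are all assumptions.

```python
import math
from statistics import mean, stdev


def fit_nor(reference, k=2.0):
    """Derive center and semi-axes of a simplified NOR from reference soils.

    reference maps a variable name (e.g. "nifH") to values measured in
    reference soils; each variable needs at least two values.
    """
    center = {name: mean(values) for name, values in reference.items()}
    semi_axes = {name: k * stdev(values) for name, values in reference.items()}
    return center, semi_axes


def normalized_distance(sample, center, semi_axes):
    """Distance from the NOR center in units of the semi-axes.

    Values <= 1 lie inside the ellipsoid, values > 1 lie outside.
    """
    return math.sqrt(
        sum(((sample[n] - center[n]) / semi_axes[n]) ** 2 for n in center)
    )


def distance_outside_nor(sample, center, semi_axes):
    """Distance from the sample to the NOR border along the center-sample line.

    Returns 0.0 for samples inside the NOR.
    """
    d = normalized_distance(sample, center, semi_axes)
    if d <= 1.0:
        return 0.0
    euclid = math.sqrt(sum((sample[n] - center[n]) ** 2 for n in center))
    return euclid * (1.0 - 1.0 / d)
```

Note that the distance is measured along the line from the NOR center to the investigated soil, which is simple to compute but is not necessarily the shortest distance to the border. Correlated variables would require a full covariance matrix instead of independent semi-axes.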

Fig. 1

Representative example of a NOR of soils showing three of 22 variables (adapted from Semenov et al. 2014). The upper ellipsoid characterizes the NOR for clayey soils while sandy soils are represented by the lower ellipsoid. The ellipsoids represent the borders of the NOR for three variables (abundance (q–quantity) of ammonia oxidizing archaea (AOA), nifH, and nosZ). Red crosses are observed values which characterize the NOR. The blue line is the distance between the center of the NOR (blue dot) and the investigated soil (green dot). It is important to mention that the distance that reflects how much the selected soil (green dot) is outside the NOR is the distance between the green dot and the edge of the three-dimensional sphere