Introduction

Despite their generally inconspicuous nature, terrestrial arthropods constitute one of the most prominent components of terrestrial ecosystems. They account for a large amount of biomass and represent a substantial proportion of all terrestrial biodiversity (Adis 1988; 1990; Stork 1988; Basset et al. 2004; Nakamura et al. 2007). The diversity and composition of terrestrial arthropod communities have been widely used as bio-indicators for a variety of processes and habitat characteristics, including vegetation properties, river flooding regime, land use and management practices, ecosystem restoration, and soil contamination (e.g., Basset et al. 2004; Cartron et al. 2003; Gardner 1991; Irmler 2003). However, because of their high abundance and richness, considerable time and taxonomic expertise are required for sorting terrestrial arthropod samples and identifying individuals to the species level (Basset et al. 2004; Caruso and Migliorini 2006; Gardner et al. 2008; Lawton et al. 1998; Moreno et al. 2008). Common alternatives proposed to reduce time and economic efforts include shortening the sampling period (Biaggini et al. 2007; Caruso and Migliorini 2006), using morpho-species (Basset et al. 2004), selecting specific indicator species (Beccaloni and Gaston 1995), and using data of higher taxonomic levels as surrogates for species (Andersen 1995).

In general, there is no agreement on the feasibility of higher taxonomic level surrogates. Several studies point out that relatively coarse taxonomic data may give outcomes comparable to results obtained at the species level. For example, family richness was shown to be a good predictor of species richness for a variety of taxonomic groups, including plants, birds, and bats, in different regions (Williams and Gaston 1994). In Victoria (Australia), stream classifications based on aquatic macro-invertebrates showed similar results for family, genus and species level data (Hewlett 2000). Likewise, the discriminatory power of oribatid mites in a Mediterranean area for pollution and fire disturbance was similar at the levels of family, genus and species (Caruso and Migliorini 2006). In contrast to these findings, however, several other studies indicate that the species level is most appropriate for biological monitoring. For example, an investigation of Australian ant fauna revealed only a weak relation between genus richness and species richness, indicating that genera provide a poor surrogate for species (Andersen 1995). Similarly, species level data were considered indispensable for assessing the riparian quality of three rivers in South Africa (Smith et al. 2007). A European water type characterization based on aquatic macro-invertebrate communities revealed that the species (or ‘best available’) taxonomic level was more informative than the family level, as the latter led to a less distinct separation of sites (Verdonschot 2006). It has been concluded that further studies are needed to reveal whether results are merely region- or system-specific, or may reflect more generic patterns (Biaggini et al. 2007; Moreno et al. 2008).

Floodplains of large rivers are among the most fertile and richest ecosystems on earth, characterized by very high landscape and biological diversity (Robinson et al. 2002; Ward et al. 2002). Nevertheless, these systems have been poorly investigated with respect to the taxonomic level most appropriate for monitoring biotic properties. Using a lowland floodplain area along the river Rhine for data collection, the present study aimed to compare four arthropod datasets of different taxonomic detail with respect to their discriminatory power for various environmental factors. The arthropod datasets comprised ground-dwelling arthropods at class-order level, beetle families, ground beetle genera and ground beetle species. Beetles and ground beetles were chosen because they are relatively easy to identify and because they tend to show clear responses to a variety of environmental characteristics (Biaggini et al. 2007; Irmler 2003; Pohl et al. 2007; Uehara-Prado et al. 2009). The environmental conditions investigated included vegetation characteristics, hydro-topographic setting, physical–chemical soil properties and soil contamination levels. To relate the arthropod assemblages to these environmental characteristics, the method of variance partitioning was used. This is a multivariate statistical approach designed to attribute variation in community composition to specific explanatory variables and is thus particularly suited to assessing the importance of different environmental factors relative to each other (Borcard et al. 1992; Peeters et al. 2000).

Methods

Study area

The river Rhine is one of the longest and most important rivers in Europe, flowing from the Swiss Alps via Germany and The Netherlands to the North Sea. Shortly downstream of the border between Germany and The Netherlands, the Rhine splits into three main distributaries, i.e. the Waal, the Nederrijn and the IJssel (Fig. 1). The floodplains along these distributaries are generally embanked and cultivated. During the past century, large amounts of contaminated river sediment have been deposited in these areas (Middelkoop 2000). This has resulted in elevated concentrations of several contaminants, notably heavy metals, in the floodplain soils.

Fig. 1

Location of the study area ‘Wolfswaard’

The ‘Wolfswaard’ floodplain area (51°57′19″N; 5°39′3″E) is located south of the city of Wageningen along the Nederrijn distributary (Fig. 1). The study area is embanked by a winter dike. In addition, there is a minor embankment parallel to the river at a distance of approximately 200 m from the middle of the channel. Land use in the study area comprises mainly extensive agriculture, with semi-natural grasslands in use for cattle grazing. A small part of the grassland area, which is surrounded by a hedgerow, is employed for sheep grazing and contains some scattered fruit trees. The banks of the river are covered by willow pollards. Sampling sites were selected at 30 locations, based on differences in vegetation and hydro-topographic setting (distance to the river, elevation) that were apparent in the field.

Investigation of environmental characteristics

The coordinates of the sampling sites were recorded with an accuracy of 1 m using a hand-held GPS (Garmin Vista HCx) and the European Geostationary Navigation Overlay Service (EGNOS). The elevation of each sampling site was derived from The Netherlands’ 5 × 5 m digital elevation model (www.ahn.nl). The average yearly flooding duration (days per year) was derived from daily river water level data covering the period 1999–2008 (www.waterbase.nl). River water levels at the study area were based on measurements obtained at a gauging station approximately 10 km upstream, assuming an average water level drop of 3.8 cm km−1. This water level drop was calculated from linear interpolation of the average water levels measured at the upstream gauging station and at a gauging station approximately 20 km downstream. The unembanked sampling sites and the sites higher than the minor embankment were assigned the duration of river water levels exceeding their elevation; the embanked sites were assigned the duration of water levels exceeding the height of the embankment (9.10 m).
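The flooding-duration rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the input layout (a flat series of daily gauge levels in metres) and the use of exactly 10 km as the distance to the upstream gauging station are assumptions made for illustration; only the 3.8 cm km−1 water level drop and the 9.10 m embankment height come from the text.

```python
import numpy as np

DROP_CM_PER_KM = 3.8   # average water level drop reported in the text
DISTANCE_KM = 10.0     # assumed: "approximately 10 km" taken as exactly 10 km
EMBANKMENT_M = 9.10    # height of the minor embankment reported in the text


def local_levels(gauge_levels_m):
    """Translate daily water levels at the upstream gauge (m) to the study area."""
    drop_m = DROP_CM_PER_KM * DISTANCE_KM / 100.0
    return np.asarray(gauge_levels_m, dtype=float) - drop_m


def flooding_days_per_year(gauge_levels_m, site_elevation_m, embanked, n_years):
    """Average number of days per year on which a site is flooded.

    Unembanked sites, and embanked sites lying higher than the embankment,
    flood when the water exceeds their own elevation. Lower embanked sites
    flood only when the water exceeds the embankment height.
    """
    if embanked:
        threshold = max(site_elevation_m, EMBANKMENT_M)
    else:
        threshold = site_elevation_m
    levels = local_levels(gauge_levels_m)
    return np.count_nonzero(levels > threshold) / n_years
```

With ten years of daily data, `n_years` would be 10 and `gauge_levels_m` would hold roughly 3,650 values.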

The 0–5 cm upper soil layer was sampled in August 2007. Within a radius of 1 m from the centre of each site, three soil samples were collected. The samples were pooled per site, mixed, and air-dried for 48 h at ambient room temperature. The pH was measured in a suspension of 10 g air-dried soil mixed with 25 ml deionized water (<10 μS cm−1), mixed 24 h before the measurement. Air-dried samples were oven-dried for determining the soil moisture content, based on the weight loss upon 24 h at 105°C. Soil organic matter content (%) was determined by the weight loss upon ignition (4 h at 550°C) of ~10 g oven-dried samples. The particle size distribution of the soil was analyzed by means of laser diffraction (Malvern Master Sizer 2000 with Hydro 2000 G), performed on oven-dried samples sieved over 2000 μm. Prior to this analysis, samples were treated with 30% H2O2 and 10% HCl for detaching coagulating particles and dissolving organic matter. To determine the soil metal concentrations, 0.2 g dw soil of each sample was weighed on a Sartorius LA310S mass balance and digested in a mixture of 4 ml 65% HNO3 and 1 ml 30% H2O2 using a Milestone Ethos-D microwave. Total soil concentrations of arsenic (As), cadmium (Cd), chromium (Cr), copper (Cu), nickel (Ni), lead (Pb) and zinc (Zn) were determined with ICP-MS (X Series; Thermo Electron Corporation).

Vegetation characteristics were investigated in May 2008. Using 3 × 3 m plots, vascular plant species covers were estimated according to a modified scale of Braun-Blanquet (Barkman et al. 1964). Nomenclature of the species followed Van der Meijden (2005). In addition, the total coverage and the average height of the herb layer were assessed. The 30 vegetation recordings, encompassing 73 plant species, were classified with TWINSPAN, a hierarchical divisive classification program (Hill and Šmilauer 2005). To account for differences in coverage, five pseudospecies cut levels were distinguished: 0, 5, 26, 51, and 76% (Hill and Šmilauer 2005). The classification resulted in seven vegetation types, comprising river bank vegetation, four types of grassland, herbaceous floodplain vegetation, and hedgerow vegetation (Table 5).

Arthropod collection and identification

Soil-dwelling arthropods were collected monthly from April 2007 to April 2008. Sampling took place with pitfall traps with a diameter of 11 cm. The traps were filled with ~3.7% formalin and a drop of detergent lotion to reduce surface tension. Each trap was sheltered by a square or octagonal wooden tile raised approximately 3 cm above the soil surface. Prior to each sampling event, the traps were opened for a period of 14 days. Pitfall samples were stored in ~3.7% formalin. Arthropods were first identified at the level of class (Chilopoda, Diplopoda), intra-class (Acari), or order (Araneae, Coleoptera, Dermaptera, Hemiptera, Hymenoptera, Isopoda, Opiliones). Because of the focus on soil-dwelling arthropods, the order of Hymenoptera was confined to the ants (Formicidae). These ten groups, hereafter called ‘arthropod groups’, comprised the dataset at the coarsest taxonomic level. After this first identification stage, the beetles (Coleoptera) were further identified to family level. Of the beetle families, the ground-beetles (Carabidae) were selected for identification of genera and species. The beetle families were identified after Unwin (1988); identification of the ground-beetles followed Boeken et al. (2002) and Müller-Motzfeld (2004). To obtain consistency in the classification across the different taxonomic levels, the taxa identified were compared to the taxa included in the Dutch Species Catalogue (www.nederlandsesoorten.nl). In case of dissimilar names, the names of the Dutch Species Catalogue were adopted.

Data analysis

In order to correct for occasionally missing arthropod samples, total arthropod numbers per sampling site were determined by calculating average numbers per site and multiplying by the total number of sampling events (13). Based on these total numbers per sampling site, the taxonomic richness (R), the Shannon index (H′; Eq. 1) and the evenness (E; Eq. 2) were calculated across the study area for each of the four datasets.

$$ H' = - \sum\limits_{i = 1}^{R} P_{i} \ln P_{i} $$
(1)
$$ E = \frac{H'}{H_{\max}} \quad \text{with} \quad H_{\max} = \ln R $$
(2)

where H′ = Shannon index; R = taxonomic richness (i.e., the number of taxa); P_i = the relative abundance of taxon i, calculated as the proportional contribution of the number of individuals of that taxon to the total number of individuals within the dataset; E = evenness.
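As an illustration of the missing-sample correction and of Eqs. 1 and 2, the following Python sketch computes estimated totals, the Shannon index and the evenness for one site. The data layout (one dictionary of taxon counts per available sampling event) and all function names are assumptions made for this example; the number of sampling events (13) is taken from the text.

```python
import math

N_EVENTS = 13  # total number of sampling events stated in the text


def estimated_totals(counts_per_event):
    """Estimate total numbers per taxon for one site.

    counts_per_event holds one {taxon: count} dict per available sample;
    missing samples are simply left out. The average count per available
    sample is multiplied by the total number of sampling events.
    """
    n_available = len(counts_per_event)
    taxa = set().union(*counts_per_event)
    return {
        taxon: N_EVENTS * sum(ev.get(taxon, 0) for ev in counts_per_event) / n_available
        for taxon in taxa
    }


def shannon(totals):
    """Shannon index H' (Eq. 1), using natural logarithms."""
    n_total = sum(totals.values())
    return -sum(
        (c / n_total) * math.log(c / n_total) for c in totals.values() if c > 0
    )


def evenness(totals):
    """Evenness E = H' / ln(R) (Eq. 2); undefined (NaN) when fewer than two taxa occur."""
    richness = sum(1 for c in totals.values() if c > 0)
    if richness < 2:
        return math.nan
    return shannon(totals) / math.log(richness)
```

For two equally abundant taxa, H′ equals ln 2 and E equals 1, the maximum evenness.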

The environmental variables flooding duration, median grain size (d50) and average herb height showed right-skewed distributions and were log-transformed before further analyses. The relations between the arthropod assemblages and the different environmental variables (Table 1) were assessed with variance partitioning (Borcard et al. 1992; Peeters et al. 2000). Prior to the variance partitioning, the total amount of variation in each arthropod dataset was assessed by determining the sum of all canonical eigenvalues with detrended correspondence analyses (DCA; CANOCO 4.0; Ter Braak and Šmilauer 1998). DCA was also used to assess whether the arthropod assemblages followed linear or unimodal response models. The DCA was based on logarithmically transformed arthropod numbers (log (N + 1)) and revealed short to moderate gradients for each of the four arthropod datasets (gradient length <3 SD). Hence, the variance partitioning was based on the linear method of redundancy analysis (RDA; CANOCO 4.0; Ter Braak and Šmilauer 1998). For each environmental variable in a canonical analysis, a so-called variance inflation factor (VIF) is calculated which expresses the (partial) multiple correlation with other environmental variables. A VIF >20 indicates that a variable is almost perfectly correlated with other variables, which results in an unstable canonical coefficient for this variable (Ter Braak and Šmilauer 1998). Initial analyses revealed high VIFs for the grain size distribution parameters, i.e. clay fraction, silt fraction, sand fraction and median grain size. Of these, the median grain size was selected as representative grain size distribution parameter and the others were excluded from further analysis. Similarly, the total soil concentrations of As, Cd, Cr, Cu, Ni, Pb, and Zn were characterized by high VIFs in the initial ordinations. 
A principal component analysis (PCA; SPSS 16.0) was executed on the soil metal concentrations in order to reduce the number of variables while preserving the main part of the variation. As the first principal component accounted for over 92% of the variation in the soil metal concentrations, the remaining components were discarded and for each sampling site the soil metal concentrations were replaced by the site score on the first component (Schipper et al. 2008b). Thus, the eventual variance partitioning analyses were performed with 10 environmental variables divided into four groups: vegetation characteristics (vegetation type, total herb layer coverage, average herb layer height), physical–chemical soil characteristics (pH, soil organic matter, soil moisture content, median grain size), hydro-topographic setting (elevation, flooding duration) and soil metal contamination (site scores on the first principal component of the soil metal concentrations). Monte Carlo permutation tests were performed to test the significance of each set of environmental variables for structuring the arthropod assemblages (Ter Braak and Šmilauer 1998).
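The idea behind RDA-based variance partitioning can be sketched in a few lines of Python. This is a simplified illustration, not a reproduction of the CANOCO 4.0 procedure used in the study: it assumes that the explained fraction is the share of the total sum of squares captured by a multivariate linear regression of the (log-transformed) taxon matrix on the environmental variables, and that the unique contribution of a variable group is the drop in explained fraction when that group is left out. Function names and the input layout are hypothetical, categorical variables such as vegetation type are assumed to be dummy-coded beforehand, and the permutation test is not shown.

```python
import numpy as np


def explained_fraction(Y, X):
    """Fraction of the total variation in Y explained by a linear model on X."""
    Yc = Y - Y.mean(axis=0)
    if X.shape[1] == 0:
        return 0.0
    Xc = X - X.mean(axis=0)
    coef, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ coef
    return float((fitted ** 2).sum() / (Yc ** 2).sum())


def partition_variance(Y, groups):
    """Partition the variation in Y over groups of explanatory variables.

    Y      : sites x taxa array, e.g. log(N + 1) transformed counts
    groups : dict mapping a group name to a sites x variables array
    """
    total = explained_fraction(Y, np.hstack(list(groups.values())))
    unique = {}
    for name in groups:
        others = [g for key, g in groups.items() if key != name]
        rest = explained_fraction(Y, np.hstack(others)) if others else 0.0
        unique[name] = total - rest  # variation explained by this group alone
    return {
        "explained": total,
        "unique": unique,
        "shared": total - sum(unique.values()),
        "unexplained": 1.0 - total,
    }
```

In this sketch the unique fractions, the shared fraction and the unexplained fraction add up to 1.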

Table 1 Mean, standard deviation (SD) and range of the environmental characteristics across the sampling sites (n = 30)

Results

In total, 42,096 arthropods were collected (Tables 6, 7). The most abundant groups comprised the spiders (Araneae; 26%), beetles (Coleoptera; 21%), mites (Acari; 18%), ants (Formicidae; 14%), and isopods (Isopoda; 8%). For the beetles, 32 families and 9,009 individuals were identified. The most abundant families were the Staphylinidae (35%) and the Carabidae (29%), followed by the Curculionidae (9%), Hydrophilidae (6%), Elateridae (4%), Cryptophagidae (4%), Chrysomelidae (3%) and Leiodidae (3%). All other families made up less than 2% of the total number of individuals. The ground beetles (Carabidae) comprised 2,600 individuals belonging to 30 genera and 68 species. Pterostichus melanarius accounted for 33% of the total number of individuals. Other frequently encountered species were Nebria brevicollis (17%), Harpalus rufipes (8%), Anchomenus dorsalis (4%), Bembidion gilvipes (3%), Bembidion properans (3%), Harpalus affinis (3%), Carabus monilis (3%), and Poecilus cupreus (3%). Remaining species made up less than 2% of the total number of individuals.

On average, the taxonomic richness was higher for the beetle families and ground beetle species than for the other datasets, whereas the evenness was highest for the arthropod groups (Table 2). According to the coefficients of variation, the spatial variation in abundance, richness, diversity, and evenness was lowest for the arthropod groups and tended to increase towards the ground beetle species (Table 2). Similarly, the total variation in the arthropod datasets, as expressed by the sum of canonical eigenvalues generated by the DCA, clearly increased with increasing taxonomic detail (arthropod groups: 0.068; beetle families: 0.650; ground beetle genera: 1.238; ground beetle species: 2.355). The variance partitioning for the different arthropod datasets showed comparable results (Fig. 2; Table 3). For all datasets, the major part of the variation (i.e., 66–78%) could be explained by the environmental variables investigated, leaving 22–34% of stochastic or unexplained variance (Fig. 2). In general, vegetation characteristics were most important in explaining variance in taxonomic composition, accounting for 31–38% of the total variation in the datasets (Fig. 2; Table 3). Monte Carlo permutation tests revealed that the effect of vegetation was significant (P < 0.05) for each dataset (Table 3). Soil characteristics were responsible for 7–10% of the variation in taxonomic composition. The contribution of the soil characteristics was significant (P < 0.05) for the arthropod groups, but not for the three beetle datasets. Hydro-topographic setting accounted for another 3–7% of the variation and was significant (P < 0.05) for the ground beetle genera. Soil heavy metal contamination explained only a minor part of the variance (2–4%), with a slightly higher contribution for the ground beetles than for the other two datasets. Its contribution was significant for the ground beetle genera (P < 0.05) and approached significance for the ground beetle species (P = 0.05).

Table 2 Number of individuals (n), richness (R), evenness (E) and Shannon index (H′) averaged across the sampling sites (n = 30) for the different arthropod datasets
Fig. 2

Variance partitioning for different arthropod datasets based on redundancy analysis (RDA)

Table 3 Results of the variance partitioning for the four arthropod datasets

Ordination of the sampling sites based on all 10 environmental variables showed that the hedgerow sites could be clearly discriminated from the other sampling sites (Fig. 3). The sites surrounded by the hedgerow (i.e., grassland with scattered fruit trees) could also be easily distinguished, although for the arthropod groups this cluster showed somewhat more overlap with other sampling sites than for the other datasets. In contrast, the arthropod group dataset was more distinctive for the river bank vegetation than the three beetle datasets. In none of the four datasets could the sites located within the different floodplain grassland types or the herbaceous floodplain vegetation be clearly distinguished from each other. The so-called indicator value method of Dufrêne and Legendre (1997) was used to identify indicator arthropod taxa for the vegetation types. The indicator value is a composite measure of a taxon’s relative abundance (specificity) and relative frequency of occurrence (fidelity) within a specific vegetation type. The value reaches its maximum of 100% if a taxon is present in only one vegetation type (maximum specificity) and in all sampling sites belonging to this type (maximum fidelity). Significant indicator taxa for the hedgerow could be found for all datasets (Table 4). The beetle family dataset contained indicators for two more vegetation types, i.e., grassland with scattered fruit trees and herbaceous floodplain vegetation. Indicator taxa for river bank vegetation were found within the ground beetle datasets only. Numbers of taxa occurring in only one vegetation type were 0, 1, 1, and 3 for the arthropod groups, beetle families, ground beetle genera and ground beetle species, respectively.
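The indicator value described above can be illustrated with a short Python sketch. It assumes the commonly cited formulation of the method, in which specificity is the mean abundance of a taxon in one vegetation type divided by the sum of its mean abundances over all types, fidelity is the proportion of sites of that type in which the taxon occurs, and the indicator value is their product times 100. The function name and input layout are hypothetical, and the permutation test used to assess significance is not shown.

```python
import numpy as np


def indicator_values(abundance, site_groups):
    """Indicator value (%) of one taxon for each vegetation type.

    abundance   : counts of the taxon per sampling site
    site_groups : vegetation type label of each sampling site
    """
    abundance = np.asarray(abundance, dtype=float)
    site_groups = np.asarray(site_groups)
    labels = np.unique(site_groups)

    # Mean abundance of the taxon within each vegetation type
    means = {g: abundance[site_groups == g].mean() for g in labels}
    sum_of_means = sum(means.values())

    result = {}
    for g in labels:
        in_group = abundance[site_groups == g]
        specificity = means[g] / sum_of_means if sum_of_means > 0 else 0.0
        fidelity = np.count_nonzero(in_group) / in_group.size
        result[str(g)] = 100.0 * specificity * fidelity
    return result
```

A taxon found only in hedgerow sites, and in every one of them, would obtain an indicator value of 100% for the hedgerow and 0% for all other types.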

Fig. 3

Ordination of the sampling sites with respect to the first two RDA axes for the different arthropod datasets. Different symbols indicate different vegetation types: ♦ = hedgerow; ■ = grassland with scattered fruit trees; ▲ = river bank vegetation; × = herbaceous floodplain; □ = floodplain grassland (1); ∆ = floodplain grassland (2); + = floodplain grassland (3). The ellipses emphasize the sites within the hedgerow vegetation, river bank vegetation and grassland with scattered fruit trees vegetation

Table 4 Significant (P < 0.05) indicator taxa for the vegetation types

Discussion

Limitations of the present analysis

The present study compared four arthropod datasets of different taxonomic detail with respect to their discriminatory power for various environmental characteristics in a lowland floodplain area along the river Rhine. The datasets comprised arthropod groups at class-order level (n = 10), beetle families (n = 32), ground beetle genera (n = 30) and ground beetle species (n = 68). The variance partitioning showed similar results for the different datasets, suggesting that their discriminatory power for floodplain characteristics is comparable. The focus on beetles and ground beetles, however, inevitably raises the question whether the results are specific to these groups or of a more generic nature. More specifically, one may wonder whether genera and species of, for example, ants, isopods, harvestmen or other beetle families would actually have shown larger discriminatory power for the environmental variables investigated. One way to consider this question is to examine typical ratios among numbers of orders, families, genera, and species. The lower these ratios, the larger will be the similarities between responses and properties across different taxonomic levels (Lenat and Resh 2001). Conversely, high ratios could then indicate that a higher degree of taxonomic detail would increase the discriminatory power of the taxa. Considering the taxonomic diversity specific for The Netherlands, the order of the beetles (Coleoptera) is rather rich in both families and species in comparison to most of the other groups investigated (Dutch Species Catalogue; www.nederlandsesoorten.nl). For example, the order of isopods (Isopoda) comprises 27 families including 306 species. The family of ants (Formicidae) includes 66 species; the 4 families of harvestmen (Opiliones) together comprise only 30 species. 
The order of beetles (Coleoptera) is divided into 112 families including no less than 4,116 species, with the ground beetle family (Carabidae) representing the third most species-rich beetle family in The Netherlands (390 species), after the Staphylinidae and the Curculionidae. Although it cannot be excluded that certain species and genera within other arthropod families will be more discriminative with respect to the environmental characteristics investigated, the rather high ratios of family: order (112) and species: family (390) indicate that the influence of these floodplain characteristics on arthropod assemblages is not severely underestimated by the choice for beetles and ground beetles.

Taxonomic level required for biomonitoring

The present body of knowledge is ambiguous with respect to the taxonomic level most suited for biological monitoring. A number of studies have concluded that investigations of higher taxonomic levels give outcomes comparable to results obtained at the species level (Biaggini et al. 2007; Cardoso et al. 2004; Hirst 2008; Sánchez-Moyano et al. 2006), whereas several others indicate that species data are most appropriate (Andersen 1995; Nahmani et al. 2006; Verdonschot 2006). One explanation for these seemingly conflicting findings might be that the taxa investigated in the different studies show a different degree of taxonomic bifurcation. The extent to which species assemblages are mirrored by higher taxonomic level assemblages depends upon the diversity of the fauna being considered (Andersen 1995; Marshall et al. 2006). Where only a few species are present per higher level taxon and higher level taxa are numerically dominated by a single species, higher level data can adequately represent species patterns. Where diversity is higher, it may be necessary to actually investigate genera or species, because higher taxa may have undergone adaptive radiation and the species within for example one family are less likely to share common ecological tolerances and preferences (Marshall et al. 2006; Sánchez-Moyano et al. 2006; Verdonschot 2006). The degree of taxonomic bifurcation might actually explain why several studies performed in marine environments emphasize the feasibility of higher taxonomic level investigations (Olsgard et al. 1998; Sánchez-Moyano et al. 2006; Stark et al. 2003; Warwick 1988), as there are on average substantially fewer species per higher taxon in the marine environment than on land (Vincent and Clarke 1995; Williams and Gaston 1994).

Another explanation for the ambiguity in the literature might relate to the range of environmental characteristics covered by the respective studies. Higher taxonomic units may aggregate species with different ecological tolerances and preferences, resulting in a wider variety of ecological response and thus wider distribution ranges. Hence, higher taxonomic units are less likely to reflect responses to subtle environmental gradients. However, distinctly different environmental conditions might require such different physiological or ecological adaptation strategies that tolerance ranges might become exceeded not only for species, but also for aggregated taxonomic groups. Indeed, studies pertaining to sites that are distinctly different with respect to for example land use or the degree of human disturbance showed that relatively coarse taxonomic arthropod data were sufficient to discriminate between the sites, despite a relatively large degree of taxonomic bifurcation (Biaggini et al. 2007; Nakamura et al. 2007). The lowland floodplains along the Rhine river in The Netherlands are characterized by considerable environmental heterogeneity, due to both natural processes and human influences (Schipper et al. 2008a). On a small spatial scale, relatively large differences can be found with respect to e.g., elevation, flooding, soil characteristics, and vegetation types. Such a wide range of environmental conditions might require such different physiological or ecological adaptations that arthropod assemblages show clear spatial variation not only at low, but also at higher taxonomic levels. This likely explains why indicator taxa for a distinct vegetation type like the hedgerow were found not only among the ground beetles and beetles, but even among the rather coarse arthropod groups at class–order level.

In addition to the degree of taxonomic bifurcation and the degree of environmental heterogeneity, differences in research goals might explain why the literature is inconclusive concerning the taxonomic level most suited for biological monitoring. If a study aims to detect the influence of perturbations or distinct environmental characteristics on organism distribution, identification to family or maybe even order level can be sufficient. However, if the goal is to detect small between-site differences in environmental characteristics and to provide an interpretation of the ecological consequences, it might be necessary to perform identification at lower taxonomic levels (Basset et al. 2004; Lenat and Resh 2001). The lower the taxonomic level, the more specific and thus informative a taxon’s distribution becomes (Williams and Gaston 1994). Indeed, the ground beetle family as a whole (Carabidae) was not a significant indicator for any of the vegetation types, whereas ten of the species within this family were significant indicators for four different vegetation types (Table 4). The higher specificity of taxa at lower taxonomic levels may also explain why the ground beetle genera and species showed a significant relation to soil heavy metal contamination, whereas no significant relations with soil contamination could be detected for the beetle families and the arthropod groups (Table 3).

In summary, the question concerning the most appropriate taxonomic level for biological monitoring cannot be answered by rigidly recommending one level of taxonomy (Lenat and Resh 2001). The level required depends on the taxonomic bifurcation of the focal taxa, the range of environmental characteristics covered, and the objectives of the study. The results of the present investigation suggest that in clearly heterogeneous environments such as lowland floodplains, relatively coarse taxonomic data can provide a sound indication of the relative importance of different environmental factors for structuring arthropod communities. Hence, if sorting and identification to species level is not possible due to limited resources or taxonomic knowledge, investigations at the family or order level can provide valuable insight into the importance of, for example, soil pollution relative to the influence of other environmental characteristics. However, for investigating the consequences of environmental pollution or vegetation characteristics in terms of taxonomic diversity or community composition, a higher degree of taxonomic detail will be beneficial.