Introduction

Freshwater ecosystems harbour a rich diversity of species and habitats, despite their comparatively small share of the world’s surface (less than 1%). There is evidence that the decline in freshwater biodiversity has been greater during the last few decades than that of the marine and terrestrial counterparts (Darwall et al., 2009; Garcia-Moreno et al., 2014). A current estimate states that freshwater ecosystems provide suitable habitats for at least 126,000 plant and animal species (Balian et al., 2007) with a huge species diversity in the group of macroinvertebrates.

The insect order Trichoptera (caddisflies) includes more than 14,500 described species (Morse, 2015), more than 1,700 of which occur in Europe (Malicky, 2005a; Graf et al., 2008). Caddisflies inhabit a variety of habitats from springs to large rivers as well as all types of standing water bodies (e.g. Wiggins, 2004; Holzenthal et al., 2007; Graf et al., 2008; Malicky, 2014). Accordingly, they cover a diverse range of biological and ecological traits. Thus, caddisflies are well suited as bioindicators for stressors ranging from organic pollution (e.g. Zelinka & Marvan, 1961; Barbour et al., 1999; Barbour & Yoder, 2000; Wright et al., 2000; Dohet, 2002; Graf et al., 2002; Hering et al., 2006a), hydromorphological degradation (e.g. Statzner et al., 2001; Lorenz et al., 2004; Jähnig et al., 2010), acidification (e.g. Townsend et al., 1983; Sandin et al., 2004) and pesticides (e.g. Schulz, 2004) to climate change (e.g. Hering et al., 2009; Conti et al., 2013). Knowledge about their ecological preferences has increased over the past years and is openly accessible (Graf et al., 2008; Schmidt-Kloiber & Hering, 2015). Furthermore, this knowledge is often integrated into various bio-assessment systems (e.g. Ofenböck et al., 2004; Chovanec et al., 2005; Hering et al., 2006b; Meier et al., 2006).

While the Fauna Europaea consortium has made valuable efforts to collect species occurrence information at country level (de Jong et al., 2014), information about the actual distribution (point occurrence) of caddisfly species is rather limited or obsolete (e.g. Illies, 1978). Malicky published numerous papers (e.g. 1983, 1984, 1986, 1988, 1992, 1996, 2000, 2005b) analysing distribution patterns of species. Additional information is scattered and mostly held in (access-restricted) collections of individual researchers and caddisfly experts; summarising results are largely missing. Frequently, national species lists are consulted instead, which mostly fail to reflect the species’ actual eco- or bioregional distributions.

Detailed knowledge of the distribution patterns of species is an indispensable prerequisite for robust species conservation and management strategies, e.g. for the establishment of IUCN Red Lists of threatened species (Ferrier, 2002; IUCN, 2012) or the identification of critical areas for biodiversity conservation (Carrizo et al., 2017). Furthermore, it also serves as a valuable basis for quality control of species determinations. While the global collection of occurrence data of terrestrial fauna and flora has a long tradition, and conservation assessments of various animal groups already exist (e.g. mammals: Ceballos & Ehrlich, 2006; birds: Stattersfield & Capper, 2000; amphibians: Stuart et al., 2004), the knowledge about occurrences of freshwater species is less advanced (Clausnitzer et al., 2009; Darwall et al., 2011). In contrast to fish (Brosse et al., 2012) or groundwater crustaceans (Zagmajster et al., 2014), comprehensive collections on the distribution of freshwater insects are rare. Most global distribution information exists for Odonata (damsel- and dragonflies; Clausnitzer et al., 2009; Boudot & Kalkman, 2015).

Several national and regional Red Lists of caddisflies exist (e.g. Austria: Malicky, 2009; Germany: Gruttke, 2016), but there is no Red List assessment on a continental, pan-European scale because a comprehensive compilation of occurrence records as a basis for evaluating conservation status has been lacking so far. With this in mind, the aim of our investigation was to compile a comprehensive database of the current distribution of European caddisflies that allows for generating (digital) distribution maps, analysing distribution changes through time and exploring autecological species preferences. Finally, such a database enables the creation of a Europe-wide Red List of caddisflies, including the indication of national responsibilities in the case of endemic species.

In this paper, we summarise the data collection and quality control processes, and present ecoregional patterns of species richness and endemism of caddisflies.

Methods

Data collection and compilation

The foundation for the “Distribution Atlas of European Trichoptera” (short: DAET) was laid within the BioFresh research project, funded by the EU from 2010 to 2014 (http://freshwaterbiodiversity.eu). The project aimed at investigating the status, trends, pressures and conservation priorities of freshwater biodiversity and its related ecosystem services. The project included a contingency fund for mobilising freshwater-related datasets that were not yet digitally available. Parts of this fund and an additional grant from the University of Duisburg-Essen supported the DAET data compilation and management work. DAET data providers were offered an honorarium based on the number of records supplied.

In a first step, we developed an electronic template for data collection. The template included a species list and relevant parameters that should be queried in the course of data collection (e.g. sex and life stage of the species, name, coordinates and altitude of the location, collection date and method, name of collector/identifier, as well as a variety of environmental parameters). With this template, 129 European Trichoptera experts were contacted in the first instance. Additionally, we set up a specific website (as part of the BioFresh website) to spread the word and make all necessary template documents (including a data provider agreement) publicly available.

In parallel, we conducted a literature review searching for published geo-located Trichoptera data, which were then digitised either by us or by the BioFresh data team. A substantial share of the data was taken from the database of the Upper Austrian State Museum, Biology Center (ZOBODAT).

All collected data were compiled in a MS Access database; the database structure can be seen in Table 1.

Table 1 Database structure showing the four main tables and their fields including explanations

Quality control

The collection of Trichoptera data focused on occurrence records of adult specimens. This seemed to be the only way to exclude misidentifications and to get a clear and relatively unambiguous picture of the current distribution patterns of this insect order (see “Discussion” section below).

Quality control prior to including data in the database comprised a nomenclatural check (we followed Malicky, 2005a), a harmonisation and rough validation of coordinates (see below), the correction of typos and the harmonisation of character sets. If necessary, the species were marked as “doubtful species” (i.e. species which were only mentioned once in literature and for which no type material is available) or “possible synonym”, if the status of the species was not entirely clear. If species comprising sub-species were submitted with species names only, the sub-species were geographically allocated according to their actual distribution range, if unequivocally possible.

Though defined in the template, coordinates of occurrence records were delivered in very different formats. Using the software Franson CoordTrans and MSP GeoTrans 3.5, we translated data to the World Geodetic System 1984 (WGS84). For big raster data (10 × 10 km), a radius was included into the database indicating that the precision of the sampling location lies within this circle. Data delivered without coordinates (around 8,000 species records) were allocated to indicated locations using tools like Google Earth, Google Maps, Getamap, Open Street Map or Open Topo Map. Historic data that only included the names of regions or administrative districts were added to the database using a coordinate within the region and a radius.
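As an illustration of the coordinate harmonisation described above, the following Python sketch converts degree-minute-second strings to decimal degrees and derives a radius for a square raster cell. It is not the procedure used for the DAET (the conversions were done with Franson CoordTrans and MSP GeoTrans). The function names, the accepted input formats and the choice of the half-diagonal as radius are assumptions made for this example, and no datum transformation is performed.

```python
import math
import re

# Hypothetical pattern: degrees, minutes, seconds and a hemisphere letter,
# separated by the usual symbols or by blanks (e.g. 48°12'30"N or 16 22 21 E).
_DMS = re.compile(
    r"\s*(\d+)[°\s]+(\d+)['′\s]+(\d+(?:\.\d+)?)[\"″]?\s*([NSEW])\s*"
)

def dms_to_decimal(text):
    """Convert a degree-minute-second string to signed decimal degrees."""
    match = _DMS.fullmatch(text.upper())
    if match is None:
        raise ValueError(f"unrecognised coordinate format: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return -value if hemisphere in "SW" else value

def cell_radius_km(side_km):
    """Radius of the circle through the corners of a square raster cell.

    Assumption: the record is placed at the cell centre, so the true
    location lies within half the cell diagonal of that point.
    """
    return side_km * math.sqrt(2) / 2

print(round(dms_to_decimal("48°12'30\"N"), 6))
print(round(cell_radius_km(10), 2))
```

For a 10 × 10 km cell this assumption gives a radius of roughly 7.1 km around the cell centre.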

We conducted a final visual quality control of plotted distribution maps per species. Occurrence records were visualised with QGIS (versions 1.7.2 to 2.18). As data were continuously added (also after the finalisation of the BioFresh project), this step was repeated at least four times.

Finally, all data will be freely accessible to other researchers and all interested audience through the Freshwater Biodiversity Data Portal (http://data.freshwaterbiodiversity.eu).

Data evaluation methods

For all analyses in this paper, we used occurrence records of the European continent only. The ecoregions defined by Illies (1978) served as a base for our evaluations of Trichoptera diversity, but we excluded ecoregion X (North Africa) and ecoregion Y (Middle East)—though they form a faunistic entity with Europe—due to lack of data in these regions. Ecoregions are widely used in aquatic ecology such as in lake or river typologies (e.g. Moog et al., 2004). Additionally, they are adopted for applied purposes and serve as typological entities for the assessment of European running waters according to the Water Framework Directive (WFD; Directive 2000/60/EC). Based on this approach, species exclusively distributed in one ecoregion were defined as “endemic”. Species with unclear taxonomic status, not identified sub-species or species groups, were excluded from analyses. The spatial analyses, intersecting ecoregions and occurrence points, were conducted in ArcGIS 10.4 (ESRI, 2016).
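The ecoregion-based definition of “endemic” can be expressed in a few lines of Python. This is a minimal sketch under stated assumptions, not the ArcGIS workflow used for the analyses: it assumes each occurrence record has already been assigned to an ecoregion, and the function name and the toy records (placeholder species, arbitrary ecoregion labels) are hypothetical.

```python
from collections import defaultdict

def richness_and_endemics(records):
    """Summarise (species, ecoregion) pairs per ecoregion.

    Returns two dictionaries keyed by ecoregion: the number of species
    recorded there, and the species recorded in that ecoregion only.
    """
    regions_by_species = defaultdict(set)
    for species, ecoregion in records:
        regions_by_species[species].add(ecoregion)

    richness = defaultdict(int)
    endemics = defaultdict(list)
    for species, regions in regions_by_species.items():
        for region in regions:
            richness[region] += 1
        # "Endemic" in the sense used here: found in exactly one ecoregion.
        if len(regions) == 1:
            (only_region,) = regions
            endemics[only_region].append(species)
    return dict(richness), {r: sorted(s) for r, s in endemics.items()}

# Toy input, invented for illustration only.
toy_records = [
    ("species A", "ER4"),
    ("species A", "ER3"),
    ("species B", "ER4"),
    ("species B", "ER4"),  # repeated records do not inflate richness
    ("species C", "ER3"),
]
richness, endemics = richness_and_endemics(toy_records)
print(richness)
print(endemics)
```

In the toy input, species A occurs in two ecoregions and is therefore not endemic, while species B and species C each count as endemic to a single ecoregion.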

Results

Database content

The Trichoptera taxa list underlying the database currently comprises 1,706 taxa: 1,580 species and 126 sub-species (for readability, we refer to both taxonomic units as “species” within this manuscript). In addition to the authors, 82 persons working with caddisflies in Europe (listed in the acknowledgements) provided species occurrence data. Currently (January 2017) the database holds 601,702 records, a record being defined as a species at a site on a unique date. Some data providers also submitted larval data, which were likewise included in the database. However, for evaluations, we only used the 441,226 records of adult specimens. The data were collected at more than 55,000 different sites in 50 countries covering the European continent. A few sites from Asia (1,866) and Africa (353) complement the database. Records from these sites are valuable contributions to a clearer picture of actual species distributions, but were not included in the data analyses because of their fragmentary coverage. The occurrence records cover the temporal range from 1793 to 2017 and are available for 1,694 adult species (Schmidt-Kloiber et al., 2015). In total, 1,292 species distributed (exclusively) in the European ecoregions were used for the analyses.
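The record definition used here (a species at a site on a unique date) corresponds to counting distinct combinations of three fields. The Python sketch below illustrates that counting rule and the restriction to adult specimens. The field names and the toy rows are assumptions for this example and do not reflect the actual DAET table structure.

```python
def count_records(rows, life_stage=None):
    """Count distinct (species, site, date) combinations.

    rows: iterable of dicts with the hypothetical keys "species", "site",
    "date" and "life_stage". If life_stage is given, only rows with that
    life stage are considered.
    """
    keys = set()
    for row in rows:
        if life_stage is not None and row["life_stage"] != life_stage:
            continue
        keys.add((row["species"], row["site"], row["date"]))
    return len(keys)

# Toy rows, invented for illustration only.
toy_rows = [
    {"species": "species A", "site": "S1", "date": "2001-06-01", "life_stage": "adult"},
    {"species": "species A", "site": "S1", "date": "2001-06-01", "life_stage": "adult"},
    {"species": "species A", "site": "S1", "date": "2001-07-15", "life_stage": "adult"},
    {"species": "species B", "site": "S2", "date": "2001-06-01", "life_stage": "larva"},
]
print(count_records(toy_rows))
print(count_records(toy_rows, life_stage="adult"))
```

Two rows for the same species, site and date collapse into one record, which is why the toy input yields three records overall and two adult records.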

Figure 1 presents a map of occurrence records of all valid species in the database in European ecoregions revealing a relatively good data coverage in central Europe, but also depicting areas with data deficiencies in northern Europe (parts of Germany, Denmark, Sweden and Norway), eastern Europe (Poland, Baltic countries, Belarus, Russia and Ukraine), Balkan countries (parts of Romania, Serbia, and Moldova) and the south-western parts of Europe (Spain, Portugal and parts of France).

Fig. 1 Total number of occurrence records of adult Trichoptera species and sub-species related to European ecoregions (N = 419,942)

To illustrate different species distribution patterns, we selected the genus Rhyacophila and show two widespread species, Rhyacophila dorsalis sensu lato (i.e. including all sub-species) and Rhyacophila nubila (Zetterstedt, 1840) (Fig. 2), as well as one species with a restricted distribution range, Rhyacophila bonaparti Schmid, 1947 (Fig. 3).

Fig. 2 Distribution of Rhyacophila dorsalis sensu lato (blue dots; N = 5,768) and Rhyacophila nubila (Zetterstedt, 1840) (yellow triangles; N = 2,952)

Fig. 3 Distribution of Rhyacophila bonaparti Schmid, 1947 (N = 44; including overlapping points)

Trichoptera biodiversity in European ecoregions

For an overall impression of Trichoptera diversity across Europe, we summarised the number of species in each ecoregion as described by Illies (1978) (available at http://www.eea.europa.eu/data-and-maps/data/ecoregions-for-rivers-and-lakes). By definition, these regions were delineated based on biogeographical characteristics of more than 50 different groups of aquatic biota. Because they are based on actual occurrences, species diversity patterns per ecoregion give ecologically more meaningful information than summaries at country level.

Figure 4 shows areas with high Trichoptera species richness in the Ibero-Macaronesian Region (ER1), Italy and Corsica (ER3), the Alps (ER4) and the Hellenic Western Balkan (ER6) with more than 350 species each. Relatively species-poor ecoregions with fewer than 50 species are located in the North (Iceland, ER19; Tundra, ER21; Taiga, ER23) and the East (Caspic Depression, ER25). In addition to Fig. 4, Table 2 presents the numbers of species per ecoregion using a correction factor according to Heino (2002), which takes the area of the ecoregion into account. According to this approach as well, the Alps (ER4) and Italy and Corsica (ER3) are the most species-rich regions, followed by the Hellenic Western Balkan (ER6).

Fig. 4 Number of species aggregated per ecoregion. For ecoregion numbers, see Table 2

Table 2 Ecoregion name and number, land area; number of sampling records per ecoregion; number of overall and endemic species per ecoregion compared between freshwaterecology.info (“fwe”; Graf et al., 2008, 2016) and the DAET database; italicised values indicate ecoregions where species numbers of the DAET database exceed those of the freshwaterecology.info database (and vice versa); number of species per ecoregion calculated after Heino (2002); number of species in a 25-km buffer area around each ecoregion

The European freshwater species traits database (available at www.freshwaterecology.info) also offers information on species distributions in ecoregions (Schmidt-Kloiber & Hering, 2015). In this database, Trichoptera occurrences were compiled based on an extensive literature review (Graf et al., 2008, 2016). In Table 2, we extracted data from this database to highlight the differences between the literature review and the actual occurrence records of the DAET database. The table reveals differences in species numbers for all 25 ecoregions.

The table shows that in 13 cases, the number of species per ecoregion actually recorded in the DAET database exceeds the number of species known from literature in the freshwaterecology.info database. This is mainly true for ecoregions where we successfully mobilised data from caddisfly experts. For other regions, such as the Tundra, the Taiga or the Caucasus, the comparison shows that actual occurrence records are still missing in the DAET database.

Endemic species

The unique DAET data compilation also makes it possible to highlight another important aspect of species ranges, namely the distribution of rare or restricted as well as endemic caddisfly species in Europe. Table 2 additionally shows a comparison between large-scale endemic species, i.e. species occurring in only one ecoregion, according to freshwaterecology.info (Graf et al., 2008, 2016) and according to the Trichoptera occurrence database at hand. The numbers of endemic species compiled in the DAET database exceed those of freshwaterecology.info in 13 (out of 25) ecoregions. In four cases, both data sources show a similar number of endemic species.

Generally, the evaluation of species with restricted distribution ranges reveals centres of endemism in the Ibero-Macaronesian Region, Italy, the Hellenic Western Balkan and the Caucasus with more than 50 endemic species each (Fig. 5). Other regions known to host a variety of endemic species from literature, like the Alps, the Carpathians or the Balkans, are classified in category 2 (21–50 endemic species).

Fig. 5 Number of endemic species aggregated per ecoregion. For ecoregion numbers, see Table 2

Discussion

Challenges of data compilation

When the idea of a pan-European Trichoptera database received support through the contingency fund of the BioFresh project, we had one clear vision in mind: to compile as many (georeferenced) occurrence records of adult caddisfly species from as many European researchers as possible. However, such undertakings always face a variety of challenges. In the following, we discuss the four main problems: data availability and reliability, motivation to contribute, time and funding to compile data, and quality control.

Data availability and reliability

Reliable determination at species level is essential to establish robust distribution maps. To date, the adult is the only life stage of caddisflies that can be identified with near certainty (Malicky, 2004, 2005a). Additionally, identification keys covering not only national areas are crucial (Malicky, 2004). Some larval keys follow this approach (like the one by Waringer & Graf, 2011), but especially the share of undescribed micro-endemic species is high. During the past decades, the use of Trichoptera as biological quality elements in bio-assessments has produced huge and valuable sets of larval data. However, the need for extensive and time-consuming quality checks hampered the inclusion of such information in our database.

Our approach of limiting the data collection to information on adult specimens not only decreased the number of available data sources, but also considerably reduced the number of contributing experts, in exchange for more reliable results. In some countries, national databases exist (mainly based on larval records), but even data of museum collections are often not freely accessible. Our data consequently contain collections that Trichoptera experts compiled out of private interest. Thus, density of sites and frequency of sampling dates do not follow any sampling design, but are rather biased by experts’ places of residence or preferred sampling regions. While Austria, Germany, Great Britain, Italy, the Netherlands and Greece are covered relatively well, France, Poland, Spain, the Balkans and Russia reveal deficits.

Besides the ongoing process of new species descriptions as well as new recordings of species, several Trichoptera groups are in high need of taxonomical revision, which includes synonymisation and splitting of species into sub-species as one inherent dynamic in taxonomy. The collected data therefore can only reflect the state of the art at the given publication date.

Motivation to contribute data

Even though the idea of open access publishing is already well established in the science community in general, in biodiversity and ecological research efforts in this direction are still required (e.g. Costello, 2009; Whitlock, 2011; Costello & Wieczorek, 2013). The publication of data per se (e.g. point records of species) with free access to everyone is neither seen as good scientific practice nor considered as necessity yet. Though it is critical that past and recent biodiversity data, i.e. occurrence records, are made readily available to researchers and policy makers to enable the best possible conservation decisions (Costello & Wieczorek, 2013), the success of data collection projects like ours is dependent on the altruism and will of every single expert. To overcome the reluctance to contribute, we had the opportunity to offer a small honorarium to data providers, even though this by far did not cover their real work efforts. More importantly, we tried to pursue a clear, visible and comprehensible data publishing policy and citation rules for the contributed data. This includes, first and foremost, the clear visibility of the data provider on the Freshwater Biodiversity Data Portal (http://data.freshwaterbiodiversity.eu). Hence, the data provider is visible for each species record and clear citation rules are supplied.

Another incentive for contributing data is the possibility of future collaborations, inclusion in research projects or publications that might be initiated based on the published data. Already during the compilation process, all data providers of the DAET database had access to all data upon request, which allowed them to conduct more comprehensive analyses (see e.g. Previšić et al., 2014a, b; Graf et al., 2015; Ibrahimi et al., 2015, 2016; Vitecek et al., 2015a, b; Graf & Vitecek, 2016). Generally, papers connected to publicly available data are cited significantly more often because the data become available for inclusion in broad-scale analyses (Piwowar et al., 2007).

Time and funding for compiling data

Data compilation is often a tedious job, and even if the data are already stored in a database, it always takes more time than expected to deliver them accurately. The time needed for such efforts can rarely be compensated with money (e.g. Costello et al., 2013). As one of the intentions of our project was to establish a Trichoptera assessment according to the IUCN Red List criteria, we set up a rather comprehensive data collection template, which also focused on environmental details. Entering all the required information turned out to be too time consuming for most of the data providers, so that we finally also accepted short versions of the template including only species names, specifications of the life stage, date and coordinates. However, not only data provision, but also data management of submitted material is a time-consuming, funding-intensive and complex challenge (see next point).

Quality control of submitted data

For reliable further use of a database as large as the DAET, a thorough and accurate quality control is indispensable (see Chapman, 2005a, b). A first step of quality control included the elimination of typing errors or errors that resulted from the variety of countries in the region, all using different languages and character sets (ISO coding), or a variety of database formats. Even though we defined a specific format for the geographic coordinates in our template, a huge diversity of different formats was submitted, which led to extensive harmonisation efforts. Coordinates located outside the continental limits (i.e. located in the sea) were eliminated or corrected. We tried to circumvent nomenclatural issues by adding a reference species list to the template. Still, we had to deal with taxonomical problems like synonyms or obscurely described taxa in the delivered datasets.

Trichoptera biodiversity in Europe

Pan-European species data collections—allowing for an evaluation of the conservation status of the group under consideration—already exist for several freshwater vertebrate groups (e.g. fish: Freyhof & Brooks, 2011; selected groups of molluscs: Cuttelod et al., 2011; amphibians: Temple & Cox, 2009; reptiles: Cox & Temple, 2009). For invertebrates, this information is only available for Odonata so far (Kalkman et al., 2010). Our project, for the first time, compiled Trichoptera occurrence data from all over Europe in one single database. Although the database comprises the high number of more than 600,000 records, it also reveals data-deficient regions (Fig. 1). This data deficiency mainly has two reasons: (1) low or missing investigation frequency, or (2) low species numbers due to missing specific aquatic habitats. Associated with (1) is the issue of unevenly distributed sampling sites, showing for example a clearly higher number of species in regions where experts are located (compare Table 2, “number records”). Further, most of the routine monitoring programmes—which normally seek to evenly cover an area—could not be taken into consideration because they often are based on the assessment of larval invertebrate stages (e.g. WFD-compliant monitoring), and extensive investigations of adults are missing. The difficulty in evaluating the difference between real absences and data deficiency hampers the statistical evaluation of the entire dataset (Elith et al., 2006).

Biodiversity, expressed as species richness, in a given area can reflect habitat heterogeneity or gradients of habitat diversity. Highest diversity and heterogeneity of habitats can be expected along (ecoregional) transitional zones like zoogeographical ecotones (Naiman et al., 1988). To evaluate this hypothesis, we calculated the number of species within 25-km buffer areas around each ecoregion (Table 2). In line with this hypothesis, all inland ecoregions showed higher species numbers in these buffers than in the respective ecoregions.

In general, overall species diversity declines with increasing latitude, which reflects historic events like glaciation. This can be seen in various biota groups (e.g. Reyjol et al., 2007; Hof et al., 2008; Heino, 2009; Zagmajster et al., 2014). A second significant trend is a reduction of species diversity from the West to the East as (species-rich) mountainous areas are missing in the eastern plains. Both trends are also reflected in our dataset. Areas with high caddisfly biodiversity, with more than 350 species, are located in the southern as well as mountainous ecoregions (Fig. 4). In several cases, we see a distinctive separation between Atlantic species and Siberian elements (Fig. 2) overlapping in Central Europe. This pattern is mainly a result of fluctuations in continental ice cover during the Pleistocene, which in turn caused several range extensions and regressions of Trichoptera species (Malicky, 2000; Pauls et al., 2006). During glacial periods in northern Europe, species retreated to the South or to ice-free parts at the southern margins of the glacial shield. Isolation of populations resulted in speciation processes and increased diversity in mountainous ranges and in the Mediterranean region (Pauls et al., 2006; Previšić et al., 2014a). Our data further summarise and illustrate glacial refugia already identified by Malicky (2000, 2006), and show distribution patterns that offer insight into biogeographical history.

Similar patterns are reflected for endemic species, with species-rich areas in the Mediterranean and mountainous regions (Fig. 5). These regions were also identified to host a high percentage of Trichoptera endangered by climate change (Hering et al., 2009; Conti et al., 2013), which makes them focus areas for future conservation in view of increasing anthropogenic pressures (Zarfl et al., 2014; Vitecek et al., 2015a, b). From a political perspective, these disjunctly distributed and endemic species fall within national responsibilities for conservation. Based on detailed distribution maps like ours, areas of high speciation processes can be identified and presented to stakeholders and policy makers to support environmental decisions.

Source-based differences in ecoregional distributions

Differences in ecoregional occurrence records between literature (as compiled in the freshwaterecology.info database) and observed distribution data (DAET at hand) may be caused by the fact that many occurrence records are either not published, especially if compiled by lay person collectors, or inclusion is hampered through language barriers (mainly true for Russia and other eastern European countries). Other data—especially older ones—often were published without exact georeferenced sites. While these data could be assigned to an ecoregion during the literature review for freshwaterecology.info relatively easily, they mostly could not be taken into account for the Trichoptera distribution atlas presented here. Further, some regions might have seen a high density of data collection through experts (e.g. Greece; Malicky, 2005b), whereas in other regions experts might be missing (see section “Challenges of data compilation” above).

Mismatches between the observed (DAET) and expected (freshwaterecology.info) number of species including endemic ones within ecoregions are more pronounced in small ecoregions like the Alps or the Pyrenees. Especially in the case of the latter, the difference is highly evident (139 observed versus 226 expected). This can be related to the spurious precision of ecoregion borders in GIS analyses: the borders are drawn as sharp lines, a sharpness that is not reflected in nature. Records from field samplings or museum records, which are often georeferenced by location names, are not aligned with ecoregion borders. Hence, the probability that a species record accidentally lies outside the considered ecoregion increases with decreasing ecoregion size.

In turn, the spatially explicit ecoregion boundary does not consider whether the location of a record is close to the ecoregion border or not. Thus, we tested a 25-km lateral buffer around the ecoregion Pyrenees to check for the difference in the number of observed species in the “original” ecoregion and the ecoregion including this transitional zone. Taking this small extra area into account (including the area of the 25-km buffer, the Pyrenees are still the smallest ecoregion by far; approximately only one-third of the area of the ecoregion Alps), the number of recorded species increases markedly from 139 to 199, which strongly supports our assumption that the borders of very small ecoregions represent a spurious precision.
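The effect of such a buffer on species counts can be sketched in plain Python. The example below is a simplified illustration, not the ArcGIS analysis used in this study: it assumes planar coordinates in kilometres (i.e. already projected data), a single polygon without holes, and invented toy records, and all function names are hypothetical.

```python
import math

def point_in_polygon(x, y, polygon):
    """Ray-casting test for a point against a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def distance_to_segment(x, y, x1, y1, x2, y2):
    """Shortest planar distance from a point to a line segment."""
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((x - x1) * dx + (y - y1) * dy) / length_sq))
    return math.hypot(x - (x1 + t * dx), y - (y1 + t * dy))

def in_buffered_region(x, y, polygon, buffer_km):
    """True if the point is inside the polygon or within buffer_km of its border."""
    if point_in_polygon(x, y, polygon):
        return True
    n = len(polygon)
    return any(
        distance_to_segment(x, y, *polygon[i], *polygon[(i + 1) % n]) <= buffer_km
        for i in range(n)
    )

def species_counts(records, polygon, buffer_km):
    """Number of species inside the polygon, and inside polygon plus buffer."""
    strict = {sp for sp, x, y in records if point_in_polygon(x, y, polygon)}
    buffered = {
        sp for sp, x, y in records if in_buffered_region(x, y, polygon, buffer_km)
    }
    return len(strict), len(buffered)

# Toy ecoregion (100 km x 50 km rectangle) and toy records, invented for illustration.
toy_region = [(0, 0), (100, 0), (100, 50), (0, 50)]
toy_points = [
    ("species A", 50, 25),   # inside the region
    ("species A", 110, 25),  # 10 km outside, same species
    ("species B", 50, 60),   # 10 km outside the border
    ("species C", 50, 100),  # 50 km outside, beyond the buffer
]
print(species_counts(toy_points, toy_region, buffer_km=25))
```

In the toy data, species B is recorded only just outside the border and is therefore counted only once the 25-km buffer is included, which mirrors the kind of increase described for the Pyrenees.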

Conclusions

The “Distribution Atlas of European Trichoptera” constitutes a unique database of caddisfly occurrence records of Europe. Though the compilation of these data reveals data-deficient regions, caused either by a lack of experts or sampling points in a region or by our inability to mobilise data from it, the dataset ranks among the most comprehensive distribution data collections of freshwater invertebrates (besides Odonata). It may be used as a base for a variety of modelling ideas (e.g. climate change, land use change etc.) or might serve as a valuable source to facilitate future conservation decisions.

Data collection and quality control are still ongoing, and new data will, depending on funding, also be added in future to close knowledge or information gaps. A database like ours depends on contributions from a large number of experts and we therefore invite all caddisfly collectors of Europe to add their records. Our long-term aim is the generation of an IUCN Red List of threatened caddisfly species of Europe that then might support policy decisions regarding protected regions, freshwater key biodiversity areas or Natura 2000 areas.