Introduction

Two decades ago, DNA barcoding has been proposed as a fast and objective way of identifying species, free from problems such as the intraspecific phenotypic diversity related to sex and developmental stage, poor preservation of identified organisms or imperfect taxonomic skills of people who identify the collected material (Hebert et al. 2003, 2004). DNA barcoding has been proven as an efficient and handy method for fast and preliminary biodiversity surveys, ecological forensics and, importantly, for quantifying species richness and evolutionary diversity within and among communities (e.g., Kress et al. 2015). Moreover, through the years, the idea evolved into a set of advanced tools, such as DNA metabarcoding or environmental DNA detection, that greatly enhance the assessment and biomonitoring of biotic communities, and ecosystem health as well as provide an invaluable insight into interactions between organisms (Lacoursière-Roussel et al. 2018; Gold et al. 2021). DNA-based biomonitoring is being developed and aimed to become the golden standard for the bioassessment of aquatic ecosystems within the European Union Water Framework Directive (Bruce et al. 2021; Thalinger et al. 2021). Public access to reliable, DNA-barcode reference libraries for all taxa in interest, generated by faunistic studies and well-curated by experts in their taxonomy, is a prerequisite for achieving such an aim (Kvist 2013). However, so far, apart from some emblematic taxa such as, e.g. Odonata, reference libraries are far from being completed even for fresh waters of the EU member states, and are just nascent for less emblematic and for European biodiversity hotspots, such as the Mediterranean Basin (Weigand et al. 2019).

The Mediterranean Basin, including the Balkan Peninsula, is considered among the World’s top important biodiversity hotspots, with at least 60% of the Palearctic and 6% of the World’s freshwater species present in the area (Tierno de Figueroa et al. 2013). Several recent molecular studies revealed the complex phylogeographical structures and cryptic diversity patterns within numerous morphologically defined species of freshwater organisms inhabiting the Mediterranean Basin (e.g. Previšić et al. 2014; Bisconti et al. 2016). Such patterns, driven by the dynamic geological history of the area and alterations of its hydrological network, are particularly prominent in the case of obligatorily aquatic organisms with no airborne dispersal stages, such as fishes and malacostracan crustaceans (Geiger et al. 2014; Sworobowicz et al. 2015; Mamos et al. 2016; Grabowski et al. 2017; Buj et al. 2019). Nevertheless, they were also observed in the amphibiotic aquatic insects, including skilled flyers, such as odonates (Galimberti et al. 2021). In the case of heteropterans, the variable dispersal skills combined with the patchiness and isolation of local freshwater ecosystems may promote genetic diversification, at least in some species. The very few studies that have tackled this issue involving molecular markers, also DNA barcodes, revealed already some taxonomic inconsistencies as well as cryptic diversity within some genera of true water bugs in Europe, pointing out the need for more detailed biogeographical and taxonomic studies using integrative approach (Berchi et al. 2018; Havemann et al. 2018). Nevertheless, the genetic diversity of heteropterans in the Mediterranean, particularly in the Balkan Peninsula, remains scarcely studied and there are no DNA barcode libraries for the local species.

Heteroptera is the most diverse hemimetabolous insect order represented in inland waters, including both aquatic and semi-aquatic species. The water bugs include 20 families, 326 genera and ca. 4700 species of the infraorders Nepomorpha, Gerromorpha and Leptopodomorpha inhabiting all types of inland waters on all continents, excluding Antarctica (Polhemus and Polhemus 2008). Among them, the Gerromorpha and Nepomorpha are considered primarily aquatic (Polhemus and Polhemus 2008; Lancaster and Downes 2013; Gullan and Cranston 2014; Henry 2017). Most of them are able to fly at the imago stage, however, their airborne dispersal skills vary depending on species (Fernando and Galbraith 1973; Savage 1989; Boda and Csabai 2013). Although they are most diverse in the tropics, some 340 species are known to occur in the Palearctic. The vast majority of species are predatory, with an exception of Corixidae, which are omnivorous. On the other hand, they are important prey items for a variety of other animals, predominantly fish and birds (Peckarsky 1982; McCafferty 1983; Zimmermann and Spence 1989; Hutchinson 1993; Klecka 2014; Boda et al. 2015). The diversity and distribution of Heteroptera in Europe have been the subject of numerous studies and, generally, the taxonomy of the group is considered well-established (Jansson 1986; Aukema and Rieger 1995; Aukema et al. 2013). Still, the particular number of species occurring in Europe is hard to confirm, according to the inability to verify older records. Summarizing available literature (Aukema and Rieger 1995; Fent et al. 2011; Aukema et al. 2013; Protić 2016), at least 147 species of aquatic and semi-aquatic Heteroptera were reported from Europe. Given their diversity, abundance and ecology, presence in various types of water bodies, habitat specialisation and sensitivity to pollutants, aquatic true bugs can be used in biomonitoring as important indicators of aquatic habitat quality and pollution impact (reviewed by Bakonyi et al. 2022).

Aquatic heteropterans of the Balkan Peninsula have remained largely understudied. Most information on Gerromorpha and Nepomorpha comes from the general works on Heteroptera by Josifov (1986); Aukema and Rieger (1995); Protić (1998); Protić (2001); Aukema et al. (2013), where they listed 83 species from the Balkan Peninsula, which makes 56% of European fauna. Only most recently the ecology of true aquatic bugs of Montenegro was studied by Gligorović et al. (2016) and the checklist was provided by Protić (2016), who reported the presence of 13 species of Gerromorpha and 21 species of Nepomorpha in the country (23% of European fauna, 41% of Balkan Peninsula). According to the Catalogue of the Heteroptera of the Palearctic Region, there are reports about 23 species of Nepomorpha and 16 species of Gerromorpha from Albania. Data about species lists occurring in the Skadar Lake can be found only in the two original papers of Kment et al. (2005) and Gligorović et al. (2016) and in the summary of the book about Skadar Lake zoobenthos by Pešić et al. (2018a). So far, 15 species of water bugs are known to occur in the Lake Skadar basin, however, considering the diversity of this group in the Balkans and the local habitat mosaic, we can expect much higher number of species to be present in the area.

Aims

Our study aims to initiate a comprehensive DNA-barcode library for aquatic Heteroptera (Gerromorpha, Nepomorpha) of the Balkan biodiversity hotspot, with a focus on the fauna of the ancient basin of Lake Skadar. Such DNA-barcode library for the local biota is an indispensable basis for setting up reference conditions and future biomonitoring/conservation efforts in the basin and is a response to challenges of the anthropogenic threat posed to Lake Skadar. Pešić et al. (2018b) underlined the need for employment of novel and efficient, DNA-based methods, in surveying the threatened biodiversity of Lake Skadar. Our study, performed within the project “DNA-Eco: DNA barcode reference library as a tool for sustainable management of freshwater ecosystems in the highly threatened Lake Skadar Basin” funded by the Ministry of Science, Montenegro, is among the first to address this need. Additionally, we verify the genetic identity of the local populations of widespread species in reference to those from other parts of Europe, available in the publicly accessible Barcode of Life Datasystems (BOLD). By this, we verify whether a simple survey of the collected material, employing DNA barcodes is efficient at taxonomic assignment using the currently available reference library and if such a survey hints at pronounced phylogeographic structure or hidden diversity within any species. We assume the latter may be expected in species of wide geographic distribution but limited dispersal abilities, i.e. due to wingless or poorly flying imagines. Finally, we provide the first thorough molecular study of the true water bugs of Skadar Lake and adjacent regions.

Study area

Lake Skadar is located in the northern Mediterranean Region, western Balkans, and extends in an NW–SE direction, parallel to the coastline of the Adriatic Sea, and separated from it by the Rumija Massif. The northern part of the basin is situated on the Niksić karst field at an altitude of 600–630 m a.s.l. The karst field is intersected by the Zeta River and its major tributaries. The altitudes of the karst polje generally decrease from north to south and the southernmost part forms cryptodepression, which is flooded by the lake waters. Skadar Lake is the biggest lake on the Balkan Peninsula, approximately 44 km long and up to 18–20 km wide, with a surface area fluctuating seasonally from ca. 350 to ca. 600 km⁠2, depending on the seasonal water level. The lake is generally shallow, with an average depth of about 5 m. Skadar Lake is fed by numerous sublacustrine karst springs deep down to 60 m, as well as by a few rivers, such as Morača, Crnojevića and Crmnica. Bojana River constitutes the lake's water outlet and connects Skadar Lake to the Adriatic Sea. The climate of Lake Skadar basin is Mediterranean, with hot summers and winter temperatures rarely going down below 0 °C. Thus, Lake Skadar is referred to as a subtropical water body (Lasca et al. 1981). The shallowness of the lake, together with the thermal conditions provides an opportunity for lush submerged vegetation, while the periodically flooded areas at the margin of the lakes are an extensive wetland system. The highly heterogeneous ecosystem encompassing a combination of warm lacustrine and wetland as well as cold spring and stream habitats is known to support very rich, yet still largely understudied fauna with more than 20 endemic species described so far and, most probably, many more awaiting discovery (Crnobrnja-Isailović et al. 2018; Pešić et al. 2018a; Gadawski et al. 2022b). Another peculiarity of the Skadar basin is related to its quite mysterious geological history; the basin with its karst spring systems and wetlands is most probably very old (of Pliocene, ca. 3 mya, or even earlier origin), while the present lake formed only about 1200 years ago (for review see Grabowski et al. 2018).

The unique ecosystem, high biodiversity (at least 1900 species) with number of endemics as well as the ecological importance of Lake Skadar have been recognized and appreciated by establishing a National Park (IUCN Management Category II) on the Montenegrin side, and a “Managed Natural Reserve” (IUCN Category IV) covering the Albanian part of the lake, while at the international level, the lake is recognized as a Ramsar site (a wetland area of international significance) and one of the Key Biodiversity Areas across the Mediterranean biodiversity hotspot (Darwall et al. 2014). Despite its peculiar value, international recognition and protection, Lake Skadar basin is highly endangered by degradation. As summarised by Pešić et al. (2018b), the main threats to the environment of Lake Skadar and its basin are: (1) pollution (including industrial, municipal, solid, and liquid waste), (2) poaching, (3) lakeshore development, and (4) water management measures, with additional challenges posed by: (a) increasing tourism pressure, (b) intensification of the agricultural sector, (c) introduction of the invasive species, (d) using the sublacustrine springs of Lake Skadar for the regional water supply of the Montenegrin coastal area, (e) hydropower development of the Drin River in Albania and the Morača Valley in Montenegro, and (f) dredging of the Bojana River to reduce flooding problems. One of the initial countermeasures to these threats requires a definition of “reference conditions” leading to the establishment of protocols for use in biomonitoring programs (Pešić et al. 2018b).

Material and methods

Field collections and taxonomic identification

The material for the study was collected during several field expeditions to Lake Skadar in April/May and September/October 2014, July/August 2015 and May/June 2018 organised jointly by the Department of Invertebrate Zoology & Hydrobiology, University of Lodz, the Department of Invertebrate Zoology and Limnology, University of Szczecin, and the Department of Biology, University of Montenegro. The material was collected with qualitative methods from a variety of aquatic habitats (Fig. 1) in and around the lake, including kick-sampling in shallow inshore habitats, dredging of underwater meadows, submerged light traps in the open lake, kick-sampling in karst springs, streams and wetlands. This material was supplemented by water bugs attracted to light in the open areas, ensuring that the light was visible over a long distance above the lake's surface. For the latter purpose, we used a 500W white-light bulb connected to a power generator, placed in front of a white screen (2 m × 3 m). The expositions started ca. 45 min before sunset and continued for three hours. All the collected individuals were preserved in 96% ethanol for further examination. The distribution of the collection sites is presented on the map (Fig. 2) and the collection details are summarised in Table S1, S2.

Fig. 1
figure 1

Examples of habitats sampled in the Lake Skadar basin: a Lake Skadar at Biotsko Oko (7 m asl), b Lake Skadar at Ostros (9 m asl), c Zajčina spring (165 m asl), d Crnojevića river (36 m asl), e alpine river in Lukavica (1495 m asl), f) Babino Sicalo spring (1606 m asl)

Fig. 2
figure 2

Sampling sites for aquatic heteropterans in the Lake Skadar basin

The material was identified by GT, using a stereomicroscope Nikon SMZ800, to the lowest possible taxonomic level according to the available taxonomic literature on the European aquatic Heteroptera (Jansson 1986; Strauss and Niedringhaus 2014). Vouchers were deposited in the invertebrate collection of the Department of Invertebrate Zoology and Hydrobiology, University of Lodz.

DNA extraction, amplification, sequencing, data depository

Molecular studies were done either in the Canadian Center for DNA Barcoding (CCDB), Guelph, Canada, or in the Department of Invertebrate Zoology and Hydrobiology (UniLodz), University of Lodz (see Table S1 for details).

Two DNA extraction protocols were implemented for samples analysed in the CCDB. For larger specimens, the standard procedure involved sampling a leg for DNA extraction at a later stage (Ivanova et al. 2006). Conversely, for smaller individuals measuring less than 3 mm, the entire specimen was placed in a 96-well microplate and used for DNA extraction. An additional step was incorporated into the procedure to recover exoskeletal remains following non-destructive lysis (Porco et al. 2010).

For samples analysed at UniLodz, total DNA was extracted using either the Chelex procedure (Casquet et al. 2012) or the phenol–chloroform method following the procedure by Grabowski et al. (2012). Amplification of COI region and enzymatic purification was done according to primers and the procedure described by Rewicz et al. (2021). Sequencing was done in Macrogen Europe either one-way or bi-directional according to the obtained quality. DNA extracts were deposited either in UniLodz, or in CCDB (See Table S1 for particular individuals).

The obtained sequences were first identified using BLAST (Altschul et al. 1990) searches on the NCBI nucleotide database (https://www.ncbi.nlm.nih.gov/). Then, sequences were manually edited, aligned, and trimmed as well as checked for the absence of frameshifts, double peaks, and stop codons in the Geneious 10.2.6 (Kearse et al. 2012).

Dataset assemblage

Relevant voucher information, photos, and newly generated DNA barcodes are publicly accessible through the dataset DS-MOHET (https://doi.org/10.5883/DS-MOHET) in BOLD (Ratnasingham and Hebert 2007). Additional publicly available COI sequences from European Gerromorpha and Nepomorpha were obtained from BOLD (public dataset DS-HETEUR). The dataset DS-HETEUR consists of 1168 sequences that are publicly available (accessed 16-06-2023) with 220 sequences (18.8%) developed during this study (DS-MOHET). Sequences stored in BOLD as private records are not included in the analysis and are not present on BIN distribution maps. Analysis are based only on published records with GPS coordinates. The data were used for constructing the distance-based NJ tree and the haplotype networks (see below). Newly obtained sequences were deposited in GenBank under accession numbers ON406644-ON406861.

DNA barcode analysis

We calculated the sequence divergences (mean and maximum intraspecific variation and minimum genetic distance to the nearest-neighbour species) using ‘Barcode Gap Analysis’ and ‘Distance Summary’ tools in BOLD (boldsystems.org), employing the Kimura-2-Parameter (K2P), and p-distance metrics. Unique BIN (Barcode Index Number) identifiers, which group DNA sequences based on the genetic distance (entative equivalents of species) were assigned for sequences deposited in BOLD (Ratnasingham and Hebert 2013).

We provided a Neighbour-Joining cluster analysis (NJ; (Saitou and Nei 1987)) for all the studied COI barcode sequences, based on the K2P distance (Kimura 1980), with a bootstrap test (1000 replicates) (Felsenstein 1985), using MEGA X software (Kumar et al. 2018). Separate trees were provided for Nepomorpha and Gerromorpha.,

Species delimitation

In order to explore the molecular threshold between species and verify clustering of species having more than one BIN, we performed species delimitation analyses. The delimitation methods were based on genetic distances and phylogeny reconstructions. Firstly, we downloaded COI sequences from DS-HETEUR BOLD dataset, only in case of sequences > 500 bp long, excluding contaminants and sequences containing stop codons. The downloaded sequences were realigned using MAFFT v. 7 with an automatic search for the best algorithm (Katoh et al. 2013). Sequences that would shorten the alignment below 500 bp, but had longer representation for particular species, were removed. The final dataset of Heteroptera from Europe consisted of 1091 sequences, each 513 bp long. We used the distance-based species delimitation method based on the barcode gap detection, i.e. ASAP: assemble species by automatic partitioning software (Puillandre et al. 2021). ASAP was run with the Kimura (K80) distance method. For the phylogeny based delimitation method, we used Bayesian implementation of the multi-rate PTP (mPTP, Kapli et al. 2017). The method implements MCMC sampling that provides a fast and comprehensive evaluation of the inferred delimitation. We decided to use this method as other methods like bPTP or GMYC have a tendency to over-split species (e.g. Wattier et al. 2020; Mamos et al. 2021). Two runs of 500 M MCMC generations-long chain with burn-in of 10% were performed on the local server. As an input tree for mPTP, we reconstructed the maximum-likelihood phylogeny using W-IQ-Tree (Trifinopoulos et al. 2016), with 1000 bootstrap replicates and automatic substitution model selection (Fig S1).

Additionally, for species in which more than one BIN was detected, we displayed the haplotype relationship through a Minimum Spanning Network using PopART (Leigh and Bryant 2015).

Results

We present 220 DNA barcode sequences of 24 species of aquatic bugs (see Table S1 for details). We failed to obtain the sequence only from Lethocerus patruelis (Stål, 1854) as this specimen was found in the field, it was already dead and dry. In Montenegro, we collected 191 specimens from 24 species (25 BINs), while in Albania we caught 30 specimens, from 11 species (12 BINs). Our findings added the first records of 4 species for Montenegro (Micronecta poweri (Douglas & Scott, 1869), Hesperocorixa sahlbergi (Fieber, 1848), Notonecta viridis Delcourt, 1909 and Gerris lateralis Schummel, 1832). In Albania, Micronecta pusilla (Horvát, 1895) was recorded for the first time. Even, though we focused mainly on the Skadar Lake catchment area, we covered 63% of Montenegrin (24 from 38, 4 species new) and 27% of Albanian (11 from 40, 1 species new) aquatic Heteroptera fauna. According to the list provided by Pešić et al. (2018a, b), our findings added the first records for Micronecta pusilla, Hesperocorixa sahlbergi, Aquarius paludum (Fabricius, 1794) and Velia affinis Kolenati, 1857 to the 15 previously known species from the transboundary Skadar Lake area itself.

Lengths of the analyzed DNA barcode fragment ranged from 402 to 658 base pairs (bp). In two sequences of Ilyocoris cimicoides (Linnaeus, 1758), we detected putative pseudogenes (NUMT) and removed them from further analysis as such NUMTs often co-amplify with the targeted COI region but lack the proper phylogenetic information (Jordal and Kambestad 2014; Hawlitschek et al. 2017; Zhao et al. 2022). All the sequences were characterized by high AT content, with mean values of G = 16%, C = 17.5%, A = 32.2%, and T = 34.4%. The number of BINs per species ranged from one (21 species, 87.5%) to two, for three species: Hydrometra stagnorum (Linnaeus, 1758), Nepa cinerea Linnaeus, 1758, and Notonecta maculata Fabricius, 1794. We did not detect BIN sharing between species, and seven BINs from the detected 27 were unique for BOLD (sequences deposited for the first time to BOLD and clustered to the new BIN). Intraspecific K2P distance varied from null to a maximum of 6.08%, 3.45%, and 1.25% for the aforementioned species for which double BINs were detected. Interspecific distances varied from 5.61% (Notonecta viridis; Notonecta glauca, 1758) to 20.15% (Micronecta pusilla; Ilyocoris cimicoides) (Table 1). We found non-overlapping clusters in the case of all analyzed species. In all cases, the distance at which barcode gap (i.e. distance to the nearest neighbor) could be observed exceeded the maximum intraspecific distance. For a better presentation of the phylogenetic relationships between species, we showed them on two figures (Nepomorpha: Fig. 3; Gerromorpha: Fig. 4). Wider overview of molecular identification showed that our data provided the first COI sequences for Sigara scripta (Rambur, 1840) BOLD:AEI1642, Micronecta pusilla BOLD:AEG9578, Velia affinis BOLD:AEG0715, and Velia sp. BOLD:AEF9274 (Fig. S1). Moreover, sequences from the Skadar Lake region added a formerly unknown BINs to Nepa cinerea BOLD:AEH5788, Hesperocorixa sahlbergi BOLD:AEF2957, Notonecta maculata BOLD:AEH5379, BOLD:AEF8405 and Micronecta poweri BOLD:AET2410 (Fig. S1). Minimum Spanning Networks shown for the species with multiple BINs (Fig. 5) revealed presence of geographical patterns. In Hydrometra stagnorum, BIN AEC2693 contained most of the sequences (6 haplotypes) from Montenegro, plus one sequence from Portugal. Single indvidual from Norway was assigned to BIN ACX 7898. BIN AAK5632 consists of sequences from Montenegro and Albania and also sequences from Central and Western Europe. The most common haplotype is shared between France, Montenegro, and Germany. A complex of five BINs can be observed within Nepa cinerea, four of which are restricted to Norway (ADP8747), Portugal (AEC3215), China (ADL0058), and Montenegro (AEH5788), respectively. Sequences from Finland, Germany, Poland, and Montenegro share BIN AAK8359. Notonecta maculata sequences were assigned to three BINs. Individuals from the Lake Skadar area are within BIN AEH5379 (singleton from Albania), and BIN AEF8405, together with individuals from Albania and Montenegro. The last BIN (AAN1703) consists of individuals from Western Europe (Fig. 5).

Table 1 BOLD distance analysis for the studied Gerromorpha and Nepomorpha
Fig. 3
figure 3

Neighbour-joining topology of the analysed Nepomorpha species based on Kimura 2-parameter distances. Triangles indicate the relative number of individuals sampled (height) and sequence divergence (width). Dashed rectangles encircled species, where double BINs were detected. Numbers next to nodes represent bootstrap values > 50% (1000 replicates)

Fig. 4
figure 4

Neighbour-joining topology of the analysed Gerromorpha species based on Kimura 2-parameter distances. Triangles indicate the relative number of individuals sampled (height) and sequence divergence (width). Dashed rectangle encircles species, where double BINs were detected. Numbers next to nodes represent bootstrap values > 50% (1000 replicates)

Fig. 5
figure 5

Minimum Spanning Network showing phylogenetic relationships within three species: Hydrometra stagnorum, Nepa cinerea, Notonecta maculata, for which multiple BINs were detected. Each bar represents a single mutational change The diameter of the circles is proportional to the number of individuals in each haplotype sampled (see open circles with numbers). The lines illustrate the results of the species delimitation analysis: solid line—mPTP, dotted line—ASAP, dashed line—BINs

The ASAP delimitation method provides similar results as the BIN clustering, except for two BINs being fused (AEH5788 and ADP8747) in the case of N. cinerea, and merging all BINs of N. maculata. (Fig. 5, Table S3). The mPTP results for the three BIN discordant species (H. stagnorum, N. cinerea, N. maculata) show an unambiguous overlap of the molecular units and morphological species (Fig. 5). Using the barcode gap analysis, we estimated the threshold distance separating species at 2.2% K2-p (Fig. 6, Table S3).

Fig. 6
figure 6

Barcode gap histogram (molecular distance vs. frequency) based on COI K2P distance generated by the ASAP software. We used a dataset (DS-HETEUR) composed of our data and public sequences available in BOLD for European water bugs.

Rivers and springs were the richest in terms of species number. We noticed 19 and 20 species of aquatic true bugs in such water bodies, respectively. Sigara dorsalis dominated in both. Fewer species (11) were found in Lake Skadar itself. However, species composition varied between the sites. Sublacustrine springs showed the greatest diversity (8 species), with the dominance of larvae of the genus Micronecta (only the imagines of M. pusilla were found in the Skadar Lake), followed by Sigara dorsalis and Ilyocoris cimicoides. The littoral zone of Lake Skadar was less diverse. Heteropteran communities in this habitat consisted of five species, and the dominant was Sigara dorsalis. The benthic zone was least diverse in a number of species. Only Corixinae larvae (the same species as detected in the other sampling sites) and specimens of Micronecta pusilla (larvae and imagines) were found there.

Discussion

The true aquatic bugs have neither been frequently targeted by studies involving DNA barcoding (i.e. Havemann et al. 2018; Meyin A Ebong et al. 2016) nor included in wider true-bugs DNA-oriented papers (i.e. Park et al. 2011; Raupach et al. 2014; Sousa et al. 2021). In our study focusing on the Lake Skadar basin and adjacent areas, we found 20 species (83% of our dataset) of which each was associated with only a single BIN, congruent with our morphological identification. An identity of one species of Velia remains unknown and needs further morphological evaluation. Three species were represented by multiple BINs (two in each case). Surprisingly, we did not detect BIN sharing between species, as reported in other studies regarding heteropterans (Havemann et al. 2018; Raupach et al. 2014) and other insect orders (i.e. Raupach et al. 2020a, b; Rewicz et al. 2021). Interestingly, the estimated 2.2% K2-p threshold distance separating species in our data (Fig. 6), being identical with the initial distance threshold used in the BIN-defining algoritm (Ratnasingham and Hebert 2013), provides some suggestion for existence of species complexes in the species characterised by presence of multiple BINs.

Species with high intraspecific variability

Three species in which we found multiple BINs were characterised by relatively high intraspecific genetic distances: Notonecta maculata (1.25%), Hydrometra stagnorum (6.08%), Nepa cinerea (3.45%). In the case of of three other species with high intraspecific distances (Aquarius paludum, Gerris lacustris, Ilyocoris cimicoides), a continuum of differentiation without any clear barcoding gap could be observed (Table 1). All delimitation methods (BINs, ASAP, mPTP) also clearly pointed out that each represents a single species/lineage (Fig S1, Table S3).

Notonecta maculata is a medium size (ca. 16 mm) predatory water bug and one of the nine congeneric species occurring in Europe. It inhabits various habitats with an emphasis on small lotic water bodies with well-developed aquatic vegetation (Strauss and Niedringhaus 2014). In the analysed material, it was represented by 19 barcoded individuals assigned to two BINs, BOLD:AEF8405 (18 individuals, 11 in Montenegro, 7 in Albania) and BOLD:AEH5379 (1 individual, Albania). The latter BIN is unique and a singleton, while BIN AEF8405 also contains private sequences from the Mediterranean region (3 from Malta, 1 from Cyprus) deposited in BOLD. Both BINs differ from the BIN BOLD:AAN1703, already known from Western and Central Europe (Germany, Austria, Italy, Portugal) (Fig. 5, Fig. S1, Fig. 7), by 1.38% K2P distance. Such spatial segregation and split into ‘southern-Mediterranean’ haplogroup (including two BINs revealed in this study) and Central-Western ‘continental haplogroup’ may reflect the complex evolutionary history of N. maculata in Europe. Regarding BIN AEH5379, we cannot exclude that it may be an artifact generated by sampling bias. Broader sampling in the southern part (Albania, Greece), and north (Croatia, Slovenia, Bosnia and Hercegovina) of the Balkan Peninsula may provide more individuals to fill this gap. A further morphological examination is necessary, as this group (Hebsgaard et al. 2004) is known for complicated taxonomy, and it is possible that potential cryptic (or pseudo-cryptic) species can be revealed.

Nepa cinerea is a widespread and ubiquitous water bug, one of three species within the genus, occurring in Europe. It may reach a total length of 15–23 mm, plus an elongated breathing tube which may be as long as the body. This sneaky predator feeds on various aquatic invertebrates and fish fry. It inhabits various flowing and stagnant waters, often among the submerged vegetation and detritus (Strauss and Niedringhaus 2014). We barcoded 5 individuals, which were assigned to two BINs: AEH5788, a new unique BIN with 3 individuals from Montenegro, and AAK8359 (2 individuals from Montenegro), which is definitely the most widespread in Europe and recorded from Germany, Poland, Finland, Netherlands (BOLD private data, accessed on 16-06-2023), Austria (BOLD private data, accessed on 16-06-2023), Romania (BOLD private data, accessed on 16-06-2023), and the United Kingdom (BOLD private data, accessed on 16–06-2023) (Fig. 8). The phylogeographic structure within this species seems to be intricate, with at least 5 haplogroups, of which 4 are restricted to limited areas (countries) like China, Norway, Montenegro, and Portugal, and one is widespread in Europe (Fig. 5). The occurrence of two lineages in sympatry within the Skadar Lake area serves as a definitive confirmation of the area's uniqueness and emphasizes the pressing need for its protection. An integrative approach involving morphology, ultrastructure, and nuclear markers is needed to solve this putative species complex.

Fig. 7
figure 7

Distribution of Notonecta maculata: A records published in GBIF database; B DNA barcode data published in BOLD database; BIN: AAN1703—blue, AEF8405—green, AEH5379—grey. Circle indicates records obtained during this study

Hydrometra stagnorum is one of two Hydrometra species widely distributed in Europe. Associated with the water surface, it occurs in the coastal zone of various types of stagnant and flowing waters. Its length reaches 7–9 mm, and the species predates and scavenges on organisms falling on the water surface (Strauss and Niedringhaus 2014). We revealed the presence of three haplogroups equivalent to BINs BOLD:AEC2693 (6 individuals from Montenegro), BOLD ACX7898 (1 individual from Norway), and BOLD:AAK5632 (5 individuals from Montenegro, 1 from Albania). Two BINs were sympatric, even in the same exact localities, in the Skadar Lake region. Nevertheless, there are differences in their distribution at the scale of Europe. BIN AEC2693 is limited to its southern part, recorded so far from Montenegro, France (BOLD private data), and Portugal, ACX7898 is a single individual from Norway, while AAK5632 is much more widespread and found in Montenegro, Albania, Germany, Austria, Poland, France, Romania (BOLD private data, accessed on 16-06-2023), and Netherlands (BOLD private data, accessed on 16-06-2023) (Fig. 9). The high 6.08% K2P intraspecific distance may indicate the presence of cryptic species and requires further attention. All three lineages seem to be separate species according to ASAP and BIN delimitation. mPTP merges them into one, but this method is known to provide the most conservative/lumping approach combining even well-recognized different species into single MOTU (Silva et al. 2023; Parslow et al. 2021). Taking into account limited dispersal abilities and geographic distribution of Hydrometra, we can assume the existence of some local (pseudo)cryptic species. An integrative approach involving morphology and more molecular markers is needed to elucidate the taxonomy of this species.

Fig. 8
figure 8

Distribution of Nepa cinerea: A records published in GBIF database; B DNA barcode data published in BOLD database; BIN: AAK8359—blue, ADP8747—red, AEC3215—orange, AEH5788—grey. Circle indicates records obtained during this study

Cryptic diversity within the Skadar Lake area

Due to the turbulent geological history of the Balkan Peninsula and Skadar Lake Lake itself (details by (Grabowski et al. 2018)), the aquatic fauna of the basin is rich in endemics (Pešić and Glöer 2013, 2018; Jabłońska et al. 2018, 2020). Most of the endemicity was detected in obligatory aquatic invertebrates (the whole life cycle in water), such as amphipods (Jabłońska et al. 2020), shrimps (Jabłońska et al. 2018), snails (Pešić and Glöer 2013, 2018; Pešić et al. 2019), leeches (Grosser et al. 2015; Marinković et al. 2019; Jovanović et al. 2021), and water mites (Pešić 2022; Pešić et al. 2021a, 2021b, 2020; Pešić and Smit 2020). Recent morpho-molecular studies on Lake Skadar Chironomidae done by Gadawski et al. (2022a) revealed the presence of new BINs, which could be identified only at the genus level. Thus, we cannot exclude the presence of endemic or non-described species there, yet it needs further work integrating morphological and molecular data. Our findings revealed new BINs within Nepa cinerea and Hesperocorixa sahlbergi. First DNA barcodes obtained for Velia affinis, Micronecta pusilla, and Sigara scripta need urgent comparison with specimens from other parts of their broad occurrence range (Eastern-Mediterranean, Pontic and Caucasus), to determine their taxonomic and phylogenetic position (Aukema and Rieger 1995). The finding of Velia sp. and Notonecta maculata representing the same BINs in the Lake Skadar area as in Cyprus, and Malta, respectively (BOLD private data) provides a perspective for the description of new species with not strictly Balkan, but the broader Southern European-Mediterranean range. This area and the Mediterranean islands, in particular, are known for their high level of diversity and endemicity (Bisconti et al. 2016; Grković et al. 2015; Hupało et al. 2018, 2021; Ivković et al. 2021). On the other hand, Mediterranean fresh waters and wetlands are scattered, and heavily vulnerable to both, anthropogenic pressure and habitat loss (Cuttelod et al. 2009; Darwall et al. 2014; Hermoso and Clavero 2011; Leberger et al. 2020; Milano et al. 2013; Tachos et al. 2022), as well as threatened by invasive species (Deidun et al. 2018; Hermoso et al. 2011; Naselli-Flores and Marrone 2019). As mentioned in the study area's description, various anthropogenic threats also apply to Lake Skadar, and the need for documenting its biodiversity's 'reference condition' becomes apparent (Pešić et al. 2018b).

Fig. 9
figure 9

Distribution of Hydrometra stagnorum: A records published in GBIF database; B DNA barcode data published in BOLD database; BIN: AAK5632—blue, AEC2693—green, ACX7898—red. Circle indicates records obtained during this study

Heteroptera of Lake Skadar—perspective for further findings

Our data confirmed the dependence of the obtained species composition on the sampling technique. Light trapping, which we conducted on the Skadar Lake shores, and spring systems, revealed only one species—Sigara dorsalis. Surprisingly, we did not detect any other representatives of Corixinae among more than 5000 invertebrates lured to light and barcoded (Grabowski et al. unpublished data). Such low effectiveness of collecting water boatmen at light is surprising, given that they are reported to be highly phototactic (Weigelhofer et al. 1993). Light traps set at the bottom of the lake bring us only representatives of Corixinae and Micronectinae, mostly juvenile specimens. Using classic hydrobiological equipment (kick net, dredge) was the most effective and resulted in collecting 24 species.

The fauna of true water bugs in the Skadar Lake area is definitely understudied. So far, including our findings, 24 species are known from the area and, in a broader view, only 38 and 40 species are known from Montenegro and Albania, respectively, at the country scale (Aukema and Rieger 1995; Aukema et al. 2013; Protić 2016). The majority of our records represent species that have already been reported from the region. Still, we were able to add nine new species and increase the number of species known from Lake Skadar area by 60%. An obvious reason for that is scarce literature data with only two original papers published recently on aquatic heteropterans in the Skadar Lake area (Kment et al. 2005; Gligorović et al. 2016) and one review paper including this group (Pešić et al. 2018a). The general diversity of water bugs fauna of the Balkans (83 species) and Europe (147 species) together with a mosaicity of various aquatic habitats present in the Lake Skadar basin suggest that many more true water bugs can still be found in this area. However, there is no data available on the diversity of this group from other big Balkan lakes, such as Ohrid, Prespa or Doiran, that would allow any prediction or comparison regarding that matter. Interestingly, only one species generally considered rare in Europe, i.e. Sigara scripta, was discovered in Lake Skadar. However, it should be emphasized that our study is a preliminary survey. Even as such, being also the first insight into the molecular diversity of the local aquatic bugs, it has already pinpointed several taxonomic issues that need to be tackled in future research and provided valuable contribution to documentation of the “reference condition” of local biodiversity. Further studies, including a more systematic approach to various habitats and elevation gradients, are needed to complete the DNA barcode reference library. Having such a library will enable eDNA-based surveys on larger geographic scales that may be a very effective way of estimating biodiversity patterns and studying biogeographical affiliation of species, particularly in standing waters (Carraro et al. 2020).

Conclusions

Answering the need for employment of novel and efficient, DNA-based methods, in surveying the threatened biodiversity of Lake Skadar, we provided the first comprehensive DNA barcode library for aquatic Heteroptera from its basin and adjacent regions, a unique and diverse part of the Balkan Peninsula. At the same time, we extended the list of species known from the area by 60%. Out of the 24 barcoded species, we detected multiple highly divergent, and some new, BINs for Notonecta maculata, Hesperocorixa sahlbergi, Hydrometra stagnorum and Nepa cinerea, indicating possible taxonomic inconsistencies, the potential for (pseudo)cryptic diversity and intricate phylogeographic patterns. Our study showed that the presumably well-known and threatened diversity hotspots, such as the Lake Skadar region are heavily understudied regarding even the usually large and common insect taxa, such as true water bugs. We also pinpointed the value of simple DNA-barcoding-based surveys, not only for providing reference barcode libraries for effective biomonitoring but also for signaling taxonomic and biogeographic issues.