DNA barcoding and cryptic diversity of deep-sea scavenging amphipods in the Clarion-Clipperton Zone (Eastern Equatorial Pacific)

The Clarion-Clipperton Zone (CCZ), located in the abyssal equatorial Pacific, has been subject to intensive international exploration for polymetallic nodule mining over the last four decades. Many studies have investigated the potential effects of mining on deep-sea ecosystems and highlighted the importance of defining environmental baseline conditions occurring at potential mining sites. However, current information on biodiversity and species distributions in the CCZ is still scarce and hampers the ability to effectively manage and reduce the potential impacts of mining activities. As part of the regulatory regimes adopted by the International Seabed Authority, concession holders are required to conduct an environmental impact assessment and gather baseline data on biodiversity and community structure in relation to their license areas. In the present study, we used an integrative molecular and morphological approach to assess species richness and genetic variation of deep-sea scavenging amphipods collected in two nodule-mining exploration areas (UK-1 and OMS-1 areas) and one Area of Particular Environmental Interest (APEI-6) in the eastern part of the CCZ. We analyzed the DNA sequences of the cytochrome c oxidase subunit I gene of 645 specimens belonging to ten distinct morphospecies. Molecular data uncover potential cryptic diversity in two investigated species, morphologically identified as Paralicella caperesca Shulenberger & Barnard, 1976 and Valettietta cf. anacantha (Birstein & Vinogradov, 1963). Our study highlights the importance of using molecular tools in conjunction with traditional morphological methods for modern biodiversity assessment studies, particularly to evaluate morphologically similar individuals and incomplete specimens. The results of this study can help determine species identity and ranges, information which can feed into environmental management.


Introduction
Polymetallic nodules, commonly referred to as manganese nodules, are probably one of the most attractive mineral resources as they contain economically relevant metals, such as copper, nickel, and cobalt, and can be found in large quantities on the ocean floor (Halbach et al. 1975;Halbach and Fellerer 1980;Clark et al. 2013). The Clarion-Clipperton Zone (CCZ), situated in the central equatorial Pacific basin between two geological submarine fracture zones, holds major portions of manganese nodule deposits and is therefore subject to intense exploration for future deep-sea mining activities (ISA 2010). However, there is great concern about the potential effects of nodule recovery on the abyssal seabed. Major effects of mining disturbances, such as the removal of nodules, the generation of sediment plumes, and the discharge of mine tailings, will, without a doubt, considerably alter habitat characteristics (Rolinski et al. 2001;Sharma et al. 2001) and harm the associated fauna in the area of immediate impact and beyond (Thiel 2001;Miljutin et al. 2011;Vanreusel et al. 2016). Hence, systematic conservation planning processes and the associated establishment of marine protected areas (MPAs) and adjacent buffer zones across the CCZ are inevitable (Wedding et al. 2013).
The CCZ lies beyond the exclusive economic zones (EEZs), which extend up to 200 nautical miles from the coastline. The International Seabed Authority (ISA), established under the 1982 United Nations Convention on the Law of the Sea and the 1994 Agreement relating to the Implementation of Part XI of the United Nations Convention on the Law of the Sea, is responsible for administering and protecting this "heritage of mankind" (Lodge et al. 2014; Wedding et al. 2015). At present, the ISA has issued sixteen nodule exploration licenses within the CCZ (Fig. 1), with each license covering an area of up to 75,000 km² (https://www.isa.org.jm/deep-seabed-mineralscontractors). Exploration and exploitation of these areas may only be undertaken under regulatory regimes for the exploitation of manganese nodules that have been adopted by the ISA (ISA 2013a). Consequently, contractors are required to acquire environmental baseline data on biodiversity as well as community structure and standing stocks in relation to their license area (ISA 2013b). For biological communities, these baseline data requirements (section 15 (e)) include the obligation to "… collect data on the sea floor communities specifically relating to megafauna, macrofauna, meiofauna, microfauna, demersal scavengers and fauna associated directly with the resource, both in the exploration area and in areas that may be impacted by operations (e.g. the operational and discharge plumes)" (ISA 2013b, p. 5).
As part of the ABYSSal baseLINE (ABYSSLINE) project, this study aims to gather baseline information on diversity and genetic variation of scavenging amphipods collected in two exploration areas, UK's contract area (hereafter referred to as UK-1), licensed to UK Seabed Resources (UKSR), and Singapore's contract area (hereafter referred to as OMS-1), licensed to Ocean Mineral Singapore (OMS), plus one Area of Particular Environmental Interest (APEI-6, formerly known as APEI-4), in the eastern part of the CCZ (Fig. 1). Demersal scavengers, particularly lysianassoid and alicelloid amphipods, are an important element of the deep-sea ecosystem (e.g., reintroducing organic carbon from large food falls into the normally food-limited deep sea) (Christiansen and Diel-Christiansen 1993;Kamenskaya 1995;Duffy et al. 2012). Given their ubiquity and high mobility, they represent an ideal group to examine patterns of distribution, the range of isolation, and gene flow across the CCZ (Hessler et al. 1978;Blankenship and Levin 2007;Duffy et al. 2012;Fujii et al. 2013;Havermans 2016). In the deep sea, scavenging amphipods are easily sampled using baited traps (Duffy et al. 2012;Horton et al. 2013;Thurston 2014, 2015;Horton et al. 2020b). The species collected in such traps usually include members of the superfamilies Lysianassoidea and Alicelloidea (including species from the families Cyclocaridae, Eurytheneidae, Hirondelleidae, Lysianassidae, Scopelocheiridae, Uristidae, Alicellidae and Valettiopsidae) (Horton et al. 2020a).
In the age of the biodiversity crisis, there is a growing need to quantify diversity as well as to recognize the communities and habitats that are particularly vulnerable to human activities. DNA barcoding offers a powerful tool for improving species diversity estimates (e.g., Plaisance et al. 2011;Leray and Knowlton 2015). However, many such studies lack the morphological taxonomic identification of the delimited Molecular Operational Taxonomic Units (MOTUs). This hinders the comparison of the results with other datasets, which, for a variety of reasons, are unavailable for genetic studies. The reverse taxonomic approach, using molecular information to recognize genetically divergent clusters that are later identified by taxonomists, may speed up species-level discrimination and help to detect potential cryptic diversity (Markmann and Tautz 2005;Smith et al. 2005;Janssen et al. 2015). The goal of the present research was therefore to study the identification and diversity of the deep-sea scavenging Amphipoda from the eastern part of the Clarion-Clipperton Zone. The results will be useful for integrating similar datasets from baseline biological studies of other CCZ contractors.

Study area and sample collection
The research was conducted in the high seas in areas beyond national jurisdiction (known as "the Area" in UNCLOS [United Nations Convention on the Law of the Sea] terminology), which are controlled by the United Nations International Seabed Authority (ISA) (Lodge et al. 2014; Wedding et al. 2015). Samples were collected during two scientific cruises: ABYSSLINE 1 or AB01 (R/V Melville, MV1313) in October 2013, to the UK-1 license area (Stratum A), and ABYSSLINE 2 or AB02 (R/V Thomas G. Thompson, TN319) in February-March 2015, targeting two contract areas, the UK-1 (Stratum B) and OMS-1 (Stratum A) areas, plus one of the nine Areas of Particular Environmental Interest (APEI-6, designated by the ISA for preservation) (Wedding et al. 2013; Lodge et al. 2014; Wedding et al. 2015) (Fig. 2). The study areas are located in the eastern part of the abyssal Pacific nodule belt between the Clarion and Clipperton fracture zones (CCZ, 6°N and 20°N, 120°W and 160°W). Specimens were retrieved from ten randomly located stations at depths ranging from 4057 to 4221 m using a freefall baited trap incorporating small baited mesh funnel traps. Their design and deployment are described in Smith et al. (2013). Specific sample locations are given in Table 1. Upon recovery of the baited trap, captured specimens were transferred immediately into DESS (solution of dimethyl sulfoxide [DMSO], disodium EDTA and saturated sodium chloride) (Yoder et al. 2006) or 96% ethanol for preservation.

Taxa identification and databasing
A representative subset of 645 scavenging amphipods was chosen for the barcoding study (Table 1). Subsequently, a reverse taxonomic approach based on DNA barcoding was applied to speed up species-level discrimination and to detect potential cryptic diversity (Markmann and Tautz 2005; Janssen et al. 2015). First, the so-called "gold standard" barcode gene, cytochrome c oxidase subunit I (COI) (Hebert et al. 2003), was amplified from the captured specimens, and the generated sequences were grouped into genetically divergent clusters to facilitate comparison (as in Smith et al. 2005; Janssen et al. 2015). Morphological analysis was then undertaken to verify DNA-based species discrimination and to examine whether the presumed cryptic species were distinguishable based on morphological characters that may have been previously overlooked.
Prior to DNA extraction, each specimen was given a unique registration number, photographed using a Leica binocular microscope, and the resulting image file was linked to the registration number. The registration number and all metadata (scientific name, collection data, etc.) were entered into a spreadsheet database and submitted to the Barcode of Life Data Systems (BOLD) (Ratnasingham and Hebert 2007) under the project SCAP "Scavenging amphipods of the Clarion Clipperton Zone."

DNA extraction, polymerase chain reaction (PCR), and sequencing
Total genomic DNA was isolated from either a pereopod or from the whole specimen for small individuals using the Chelex extraction method (Walsh et al. 1991). A fragment of the mitochondrial COI (~645 bp) was amplified using universal primers LCO1490 and HCO2198 (Folmer et al. 1994). Polymerase chain reactions (PCR) were performed in 25 μl volumes using Illustra PureTaq PCR beads from GE Healthcare Life Science (Buckinghamshire, UK). PCRs contained 20 μl of sterile molecular-grade H2O, 0.5 μl of each primer (10 pmol/μl), and 4 μl of DNA template. Amplification was conducted using an Eppendorf Mastercycler pro S thermocycler (Hamburg, Germany) with the following parameters: initial denaturation at 94°C for 5 min followed by 38 cycles repeating the sequence of 94°C for 45 s (denaturation), 42°C for 45 s (annealing), and 72°C for 80 s (elongation). Final extension was performed at 72°C for 7 min. PCR products were confirmed by size with electrophoresis on a 1% agarose gel with GelRed (Biotium, Hayward, USA) using commercial DNA size standards. PCR products that produced light bands after electrophoresis were outsourced for purification and Sanger sequencing to a contract sequencing facility (Macrogen Europe Laboratory, Amsterdam, Netherlands) using the same primer sets as for PCR.

Species delimitation
Sequencing reads were assembled and checked for the presence of mitochondrial pseudogenes (numts) with the software Geneious version 7.1.9 (created by Biomatters, Ltd., available from www.geneious.com) (Kearse et al. 2012) by translating all nucleotide sequences into amino acid sequences. In total, 645 high-quality COI sequences with a length of at least 500 base pairs (bp) were selected for further analysis. Sequences, the primer pairs used, and trace files are publicly accessible through the public dataset SCAP "Scavenging amphipods of the Clarion-Clipperton Zone" on BOLD (Ratnasingham and Hebert 2007) (dx.doi.org/10.5883/DS-SCAP). Intra- and interspecific variability and base frequencies were calculated using the sequence analysis tools provided on BOLD Systems (distance model: K2P; alignment options: BOLD aligner; ambiguous base/gap handling: pairwise deletion). Sequence alignments were automatically generated using MUSCLE (Edgar 2004) as implemented in Geneious (Kearse et al. 2012), and the nonoverlapping sequence regions at the 5′- and 3′-ends were trimmed manually, leaving a 645-bp alignment. Neighbor-joining (NJ) analysis (Saitou and Nei 1987) was conducted based on pairwise deletion and Kimura 2-parameter (K2P) distances (Kimura 1980) using MEGA version 7.0.14 (Kumar et al. 2016), as K2P is used as a standard model for COI barcoding studies and allows direct comparison with other studies (e.g., Havermans et al. 2011). Branch support for the NJ topology was evaluated by non-parametric bootstrapping (Felsenstein 1985) with 1000 replicates. Resulting trees were initially processed in MEGA and later prepared as graphics in Adobe Illustrator CS6. On the BOLD Workbench, all sequences were automatically subjected to the Barcode Index Number (BIN) system, i.e., sequences were clustered into molecular operational taxonomic units (MOTUs; Floyd et al. 2002) independent of taxonomic assignment and then assigned to a unique alphanumeric code or BIN.
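The K2P distance with pairwise deletion referred to above can be illustrated with a minimal Python sketch. It is not the BOLD or MEGA implementation; the function name k2p_distance is hypothetical, and the input is assumed to be two already aligned sequences of equal length. Sites at which either sequence carries a gap or an ambiguous base are skipped, and the distance is computed as d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are
    ignored (pairwise deletion). Hypothetical helper for illustration.
    """
    valid = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # pairwise deletion of gaps and ambiguity codes
        valid += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if valid == 0:
        raise ValueError("no comparable sites")
    p = transitions / valid
    q = transversions / valid
    # math.log raises ValueError for saturated pairs (argument <= 0)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For closely related sequences the value is close to the uncorrected proportion of differing sites; the correction grows as transitions and transversions accumulate.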
The BIN assignment is based on refined single linkage (RESL) analysis that couples single linkage with a threshold of 2.2%. If sequences are more divergent than two times this threshold (or 4.4%), a new BIN is generated. Sequences with a low genetic divergence (<4.4%) are submitted to a second refined step using Markov clustering that assigns sequences to a new or a preexisting BIN (Ratnasingham and Hebert 2013). The number of BINs in our dataset was counted as they appeared in BOLD on August 4, 2017. For comparative purposes, two other clustering approaches based on heuristic search algorithms were implemented. First, sequences were assigned to MOTUs using the CD-HIT-Suite (http://weizhongli-lab.org/cdhit_suite/cgi-bin/index.cgi), a web server for comparing biological sequences and quickly identifying MOTUs calculated by pairwise alignment at a user-defined similarity threshold (Huang et al. 2010). CD-HIT first sorts sequences in decreasing length order. The longest sequence becomes the representative of the first cluster. Then each remaining sequence is compared pairwise to the existing representative sequences. If the similarity with any representative is above a given threshold, the sequence is grouped into that cluster. Otherwise, a new cluster is defined with that sequence as the representative. Second, sequences were assigned to MOTUs using BLASTclust as integrated into the Bioinformatics Toolkit platform (https://toolkit.tuebingen.mpg.de/) (Alva et al. 2016). BLASTclust clusters unaligned sequences using a single-linkage algorithm based on megablast similarity scores. The threshold value for species delimitation was set at 0.93 or 93%, corresponding to a genetic distance of ten times the mean intraspecific divergence value (here 7% when excluding species complexes) as proposed by Hebert et al. (2004).
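The difference between the greedy, representative-based procedure described for CD-HIT and a single-linkage procedure such as that of BLASTclust can be sketched in Python as follows. This is a toy illustration only: the real tools use short-word filtering, alignment, and megablast scores, whereas the hypothetical identity function below simply compares position by position and assumes pre-aligned input. The default threshold of 0.93 follows from the rule stated above (ten times a mean intraspecific divergence of 0.7% gives 7% distance, i.e., 93% similarity).

```python
def identity(a, b):
    """Fraction of matching positions over the shorter sequence (toy measure)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def greedy_cluster(seqs, threshold=0.93):
    """Greedy clustering: longest sequence first, compare to representatives.

    Each cluster is a list whose first element is its representative.
    """
    clusters = []
    for s in sorted(seqs, key=len, reverse=True):
        for cluster in clusters:
            if identity(cluster[0], s) >= threshold:
                cluster.append(s)  # joins the first matching representative
                break
        else:
            clusters.append([s])  # becomes representative of a new cluster
    return clusters


def single_linkage(seqs, threshold=0.93):
    """Single linkage: sequences joined by any chain of similar pairs cluster together."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if identity(seqs[i], seqs[j]) >= threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i, s in enumerate(seqs):
        groups.setdefault(find(i), []).append(s)
    return list(groups.values())
```

Because single linkage allows chaining through intermediate sequences, it tends to merge groups that a representative-based method keeps apart, which is one plausible reason why the two approaches can return different numbers of MOTUs at the same threshold.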
Once the number of genetic clusters or MOTUs within the dataset was determined, specimens from each group were blasted against publicly available sequence data in GenBank (Zhang et al. 2004; Morgulis et al. 2008) and BOLD (Ratnasingham and Hebert 2007) to confirm the identity of all consensus sequences. In order to visualize the relationships among COI haplotypes from two species morphologically identified as Paralicella caperesca and Valettietta cf. anacantha (Birstein & Vinogradov, 1963), a Minimum Spanning Network (MSN) was constructed using the PopART software (http://popart.otago.ac.nz) (Bandelt et al. 1999; Leigh and Bryant 2015).
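The idea behind a haplotype network can be illustrated with a short Python sketch that collapses identical sequences into haplotypes and connects them by a minimum spanning tree weighted by the number of differing sites. This is a simplification and not the PopART algorithm: a minimum spanning network may retain alternative equally short links, whereas the sketch keeps a single tree. All function names are hypothetical, and aligned sequences of equal length are assumed.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def collapse(seqs):
    """Map each distinct sequence (haplotype) to its frequency."""
    return Counter(seqs)


def minimum_spanning_tree(haplotypes):
    """Kruskal's algorithm on the complete haplotype graph.

    Returns a list of (i, j, steps) edges, where i and j index into
    `haplotypes` and steps is the number of mutational steps.
    """
    parent = list(range(len(haplotypes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (hamming(haplotypes[i], haplotypes[j]), i, j)
        for i, j in combinations(range(len(haplotypes)), 2)
    )
    tree = []
    for steps, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # edge joins two components, so keep it
            parent[ri] = rj
            tree.append((i, j, steps))
    return tree
```

In such a tree, haplotypes joined by one or a few steps form tight groups, while long edges mark the breaks between divergent haplogroups.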

Morphological species delimitation
Traditional morphological approaches were used to confirm MOTUs in order to provide a valid species-level identification. Up to five representative specimens of each cluster were morphologically identified to species level by one of the authors (TH) using original descriptions and keys, where available. Where specimens were preserved in DESS, this was more difficult, and sometimes only part specimens were available, in which case the specimens were identified to the lowest possible taxonomic resolution, e.g., genus level. Names were matched with the World Register of Marine Species (WoRMS Editorial Board 2020).

Species breakdown
According to morphological analysis, ten amphipod species, belonging to six genera and six families, were found within the UK-1 and OMS-1 license areas and in the northern part of APEI-6. Although the full species composition was not analyzed in detail as part of this study, the breakdown of specimens barcoded from each site can be seen in Fig. 3. The greatest number of species (9 species, 424 individuals) was recorded in the 2013 survey site, UK-1 Stratum A. In 2015, at survey sites UK-1 Stratum B, OMS-1, and APEI-6, where a lower number of individuals was studied (143, 55, and 23 individuals, respectively), fewer species were recorded (3-4 species). In UK-1 Stratum A, the uristid species Abyssorchomene distinctus (Birstein & Vinogradov, 1960) was the most common species comprising >50% of the assemblage, whereas other taxa were present in lower numbers (Abyssorchomene chevreuxi (Stebbing, 1906), Abyssorchomene gerulicorbis, Paracallisoma sp. Chevreux, 1903, and V. cf. anacantha).

Molecular species delimitation
Single-locus DNA sequences of the cytochrome c oxidase subunit I (COI) gene were obtained for 645 amphipod specimens. A list of the analyzed specimens is presented in Electronic Supplementary Material (ESM 1). The sequence length of the analyzed individuals ranged from 506 to 658 bp. No stop codons, deletions, or insertions were observed. Mean nucleotide frequency distributions of the analyzed sequences were A, 0.28; C, 0.15; T, 0.39; and G, 0.18.
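Two of the checks reported here, base composition and the absence of stop codons, are simple to express in code. The following Python sketch is an illustration under stated assumptions rather than the procedure implemented in Geneious or BOLD: frequencies are pooled over all sequences instead of averaged per sequence, the reading frame is not known in advance so all three forward frames are scanned, and the stop codons are those of the invertebrate mitochondrial code (TAA and TAG). The function names are hypothetical.

```python
from collections import Counter

# Stop codons of the invertebrate mitochondrial genetic code (NCBI table 5)
STOP_CODONS = {"TAA", "TAG"}


def base_frequencies(seqs):
    """Pooled frequencies of A, C, G and T over a non-empty set of sequences."""
    counts = Counter(b for s in seqs for b in s.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"}


def stop_free_frames(seq):
    """Forward reading frames (0, 1, 2) that contain no stop codon.

    An empty result means every frame is interrupted, which would be
    a warning sign for a pseudogene (numt) or a sequencing error.
    """
    seq = seq.upper()
    frames = []
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        if not any(c in STOP_CODONS for c in codons):
            frames.append(offset)
    return frames
```

A functional COI fragment is expected to have at least one stop-free frame; the AT-rich composition reported above is typical of arthropod mitochondrial genes.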
For the majority of species, multiple specimens (mean = 80.37 specimens per species) were analyzed to investigate genetic variation. Two species (Cyclocaris sp. and E. magellanicus) were represented by a single individual only (Table 2, Electronic Supplementary Material 2). Mean intraspecific K2P divergence was 2.08%, ranging from zero to a maximum value of 15.25% (Table 2). Most of the analyzed species revealed average intraspecific distances lower than 2%. Extremely low intraspecific variation was found for three species: A. distinctus, A. gerulicorbis, and E. maldoror. A. distinctus showed a mean K2P distance of 0.04% between individuals collected from the UK-1 (4082-4221 m depth) and the OMS-1 (4071-4183 m depth) claim areas. Specimens of A. gerulicorbis that were collected exclusively from Stratum A in the UK-1 area (4082-4203 m depth) had a mean intraspecific divergence of 0.15%. No intraspecific divergence was noted between four specimens of E. maldoror from two strata (A and B) in the UK-1 claim area. Moderately high intraspecific divergence (4.51%) was observed for P. tenuipes, while exceptionally deep intraspecific variation (>7% maximum distance) was noted within P. caperesca and V. cf. anacantha.
Fig. 3 Species breakdown of samples collected from trap deployments within the UK-1 and OMS-1 license areas in the CCZ and in the northern part of APEI-6, with indication of the year sampled and the number of individuals used for molecular study from each site. Note that the studied individuals represented only a subset of all amphipods collected
For P. caperesca, intraspecific divergence ranged from 0 to 15.25%, with a mean value of 7.60%. Here, the maximum intraspecific distance was greater than the minimum interspecific distance (9.67%). Intraspecific distances of V. cf. anacantha that were collected from Stratum A in the UK-1 license area exclusively ranged from 0 to 9.23%, with a mean value of 4.4%. The exclusion of P. caperesca and V. cf. anacantha from distance analysis would lead to a decrease of mean intraspecific divergence from 2.08 to 0.7%. The mean K2P distance between species was 20.4%, ranging from 9.67 (between the congeneric species A. gerulicorbis and A. chevreuxi) to 25.40% (between the sister species P. caperesca and P. tenuipes). The frequency distribution of K2P distances for intraspecific variation and interspecific distances is shown with (Fig. 4a) and without (Fig. 4b) putative species complexes. No clear barcoding gap was observed when P. caperesca and V. cf. anacantha were included in the analysis, but there was also little overlap between intraspecific and interspecific distances (Fig. 4a). In Fig. 4b (excluding the two species complexes), interspecific variation clearly exceeds intraspecific variation, thereby producing a barcoding gap with a size of about 2.5% (Fig. 4b) (Hebert et al. 2003;Meyer and Paulay 2005).
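The barcoding gap discussed above is the difference between the smallest interspecific and the largest intraspecific distance. A minimal Python sketch of that comparison is given below; it is illustrative only, the function names are hypothetical, and an uncorrected p-distance is used as a stand-in so that the example is self-contained (any distance function, such as K2P, could be passed instead). The input is assumed to be a list of (species label, aligned sequence) pairs containing at least two species.

```python
from itertools import combinations


def p_distance(a, b):
    """Uncorrected proportion of differing sites (gaps and ambiguities skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    return sum(x != y for x, y in pairs) / len(pairs)


def barcoding_gap(records, distance=p_distance):
    """Return (max intraspecific, min interspecific, gap) for labeled sequences.

    `records` is a list of (species, sequence) tuples. A negative gap
    means intra- and interspecific distances overlap.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(distance(s1, s2))
    max_intra = max(intra) if intra else 0.0
    min_inter = min(inter)  # raises ValueError if only one species is present
    return max_intra, min_inter, min_inter - max_intra
```

Applied to the values reported here, a maximum intraspecific distance of 15.25% against a minimum interspecific distance of 9.67% gives a negative gap, whereas excluding the two putative species complexes yields the positive gap of about 2.5% shown in Fig. 4b.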
The neighbor-joining topology (Fig. 5) revealed that conspecific individuals based on morphological identification always cluster together supported by high bootstrap values of >99%. However, within P. caperesca and V. cf. anacantha, several distinct clusters separated by high genetic divergence were observed. Depending on whether BLASTclust, CD-Hit, Barcode Index Number (BIN) clustering or subjective evaluation of the neighbor-joining tree were used (Fig. 5), the sequences cluster into 12-24 MOTUs. BIN clustering split the sequences into 24 distinct BINs, while CD-Hit and BLASTclust indicated the existence of 14 and 12 MOTUs, respectively. Of the ten morphologically identified species, seven could be separated using the BIN system.
In three cases, a single species contained several BINs: Paralicella tenuipes Chevreux 1908 = 4 BINs, P. caperesca = 11 BINs, and V. cf. anacantha = 2 BINs. CD-Hit and BLASTclust were able to identify eight distinct species clusters, while two morphospecies (P. caperesca and V. cf. anacantha) were assigned to 4 and 2 entities (CD-Hit) and 2 entities each (BLASTclust), respectively. When comparing the results obtained by morphology-based and DNA-based (neighbor-joining, BIN, BLASTclust, and CD-Hit) approaches, 14 putative species were identified. In this context, CD-Hit showed the best clustering performance with eight "well-defined" species clusters and six putative species within the two species complexes, 4 in P. caperesca and 2 in V. cf. anacantha. A detailed topology and the haplotype diversity of P. caperesca and V. cf. anacantha are presented in Figs. 6 and 7. The analysis of 177 sequences of P. caperesca from four sampling areas (UK-1 Stratum A, UK-1 Stratum B, OMS-1, APEI-6) revealed high haplotype diversity of the COI gene (Fig. 6). Of 50 haplotypes, 33 were unique and 17 were shared by more than one individual. Many of these are separated by only one or a few nucleotide substitutions. However, there are at least four major haplogroups, designated as Pc1-4, separated by 17-38 nucleotide changes (Fig. 6). The first group Pc1 included 14 haplotypes that were separated from each other by a minimum of one mutational step. However, two haplotype clusters within Pc1 are separated by 14 and 16 nucleotide changes from their respective nearest neighbors (NNs). This is consistent with the high level of genetic divergence within Pc1, ranging from 0 to 6.8% (mean = 2.43%) (Table 3). The second group Pc2 formed a group of four closely related haplotypes separated by only one nucleotide substitution. Here intraclade divergence ranged from 0 to 0.47% (mean = 0.18%). Haplogroup Pc3 consisted of 14 haplotypes separated by one to 18 mutational steps, suggesting the presence of at least four subclusters. The genetic variability within Pc3 ranged from 0 to 5.96% (mean = 3.74%). Cluster Pc4 groups 18 closely related haplotypes, many of these separated by only one substitution. K2P divergence calculated for Pc4 ranged from 0 to 2.64% (mean = 0.55%). The lowest value of interclade divergence (5.97%) was observed between specimens from Pc3 and Pc4. The highest values (>12%) were calculated between Pc1 and Pc3 and between Pc2 and Pc3 (Table 3). Interestingly, there is no significant geographic pattern to this network, since distinct clusters comprise several specimens caught at the same sample locations.
The seven records for V. cf. anacantha from UK-1 Stratum A included two haplotypes (Fig. 7), clearly forming two distinct lineages, separated by 55 nucleotide changes. No intraspecific divergence was noted within these two groups, whereas interclade divergence was 9.19% (Table 3), implying that there may be two species within V. cf. anacantha. The entity identified in this study has been named Valettietta cf. anacantha since we cannot be certain at this stage if it represents the true V. anacantha, which was described from specimens collected in the Philippine Trench at depths between 0 and 5300 m. V. cf. anacantha closely resembles the Atlantic species Valettietta gracilis Lincoln & Thurston, 1983, particularly in the form of the gnathopods. However, without access to further specimens and the holotype of V. anacantha, it is at the moment not possible to determine whether the entity in this study is V. anacantha or a new species. Five individuals of Valettietta were available for morphological study, two of which (forming the clade Va2 [Fig. 7]) were considerably damaged (lacking head and gnathopods), preventing species-level morphological identification.
Fig. 5 Neighbor-joining topology for COI amphipod sequences based on K2P and results of species delimitation analyses. Bootstrap supports over 90% (1000 replicates) are shown on the branches. The number of the analyzed specimens collapsed into a single node is provided in parentheses following the species name. Triangle length corresponds to the sequence divergence for each species. Red triangles indicate species complexes with intraspecific maximum pairwise distances > 7%. Colored bars represent Molecular OTUs or putative species generated by Barcode Index Number (BIN) clustering, CD-Hit, and BLASTclust algorithm. Orange bars, BIN; green bars, CD-HIT; blue bars, BLASTclust
Fig. 6 Neighbor-joining topology based on K2P and 1000 bootstrap and COI haplotype network for P. caperesca. Neighbor-joining topology for P. caperesca with bootstrap supports shown on the branches. The number of the analyzed specimens is provided in parentheses following clade designation (Pc1-Pc4). Minimum spanning network showing haplotype diversity of P. caperesca is represented next to the tree. Distinct haplotypes are depicted as circles (indicating haplotypes found at a single locality) and pie charts (haplotype found at different localities) with a diameter proportional to their frequency among all haplotypes found. Each color indicates a sampling area and the frequency distribution (pie charts) of haplotypes for each locality. Lines with small black slashes provide a relative estimate of genetic divergence between haplotypes

Discussion
Accurate, standardized approaches for quickly establishing a baseline of biodiversity and collecting information on key biological unknowns are fundamental for effective environmental management and the assessment of short-, medium-, or long-term impacts of deep-sea mining. As part of the regulatory regimes for the exploitation of manganese nodules adopted by the ISA (2013a), the present study represents the first comprehensive molecular assessment of scavenging amphipods from two adjacent nodule-mining exploration areas (OMS-1 and UK-1) and one Area of Particular Environmental Interest (APEI-6) in the deep waters of the eastern CCZ. Overall, our study assembled mitochondrial COI sequences for 645 specimens that were grouped into 12-24 MOTUs or putative species using different species delimitation approaches. Morphological verification revealed the presence of just ten amphipod species, reflecting the discordance between DNA-based and morphological species assignment seen in previous studies on amphipods (e.g., Havermans et al. 2011; Ritchie et al. 2015).
Fig. 7 Neighbor-joining topology based on K2P and 1000 bootstrap and COI haplotype network for V. cf. anacantha. Neighbor-joining topology for V. cf. anacantha with bootstrap supports shown on the branches. The number of the analyzed specimens is provided in parentheses following clade designation (Va1-Va2). Minimum spanning network showing haplotype diversity of V. cf. anacantha is represented next to the tree. Distinct haplotypes are depicted as circles with a diameter proportional to their frequency among all haplotypes found. The red color indicates the sampling area (UK-1 Stratum A). Lines with small black slashes provide a relative estimate of genetic divergence between haplotypes

Species breakdown
The species identified in our samples comprise a common and abundant element of the scavenging amphipod community of the Pacific deep sea (Shulenberger and Barnard 1976; Ingram and Hessler 1983; Patel et al. 2020; Jones et al. submitted). According to morphological analysis, species composition was observed to vary between 2013 and 2015. Species richness was higher in 2013, despite more stations being sampled in 2015. A number of species were common to all the traps but were recorded at differing abundances according to site or year of sampling. For example, in UK-1 Stratum A, sampled in 2013, the uristid species A. distinctus was the most common species comprising >50% of the assemblage, whereas only eight specimens belonged to A. gerulicorbis (Fig. 3). In 2015, the samples (UK-1 Stratum B, OMS-1 and APEI-6) were dominated by the alicellid species P. tenuipes and the P. caperesca species complex, making up >87% of the individuals, whereas other taxa were present in lower numbers or represented by single individuals only (e.g., E. maldoror and Cyclocaris sp.). The species composition of the samples collected in 2015 resembled that reported by Ingram and Hessler (1983) from baited traps set in the abyss north of Hawaii. The numerical dominance of A. distinctus noted in 2013 was also observed by Patel et al. (2020), who studied scavenging Amphipoda from the eastern part of the CCZ. In the latter study, the number of individuals of A. distinctus diminished westward (from >300 ind. in the easternmost stations to less than 50 ind. in the westernmost part of the study area). The studies of the abyssal amphipod scavenging community north of Hawaii did not report this species (Shulenberger and Barnard 1976; Ingram and Hessler 1983). The observed compositional differences, however, should be treated with caution, since only a subset of all collected individuals was studied.
These preliminary findings may reflect temporal and/or spatial variability in scavenger populations, possibly linked to differences in environmental conditions and the populations for which they select. In the North Atlantic, shifts in abyssal scavenging amphipod communities driven by changes in organic flux were observed (Horton et al. 2020b). These authors noted that the community formed by obligate necrophages (two Paralicella species) changed in the years with increased input of organic matter and became dominated by species with more opportunistic feeding habits (the representatives of the genus Abyssorchomene). Moreover, our results support the finding that locally less abundant species are not necessarily endemic but can be widely distributed. This is the case for several species analyzed from the ABYSSLINE samples, which have been shown in earlier studies to have a widespread or even cosmopolitan distribution (e.g., A. gerulicorbis, E. maldoror, and E. magellanicus) (Havermans 2016), supporting the theory that local rarity does not necessarily imply a small geographic range (Rex 2002;McClain and Hardy 2010).
All license areas in the CCZ are planned as zones of future mining programs, and this will obviously affect the deep-sea communities. To protect biodiversity, nine APEIs, situated north and south of the CCZ, have been established, where such activities are prohibited (Lodge et al. 2014). Studies of other benthic invertebrate groups have reported a similar underrepresentation of CCZ species found in APEIs: Ophiuroidea (five out of 38 spp. recorded in CCZ were present in APEI-3, Christodoulou et al. 2020) or Tanaidacea (1/3 of species from CCZ were found in APEI-3, Jakiel et al. 2019). In our study, fewer species were recorded at the single site in APEI-6 than at the sites studied within contractor areas. However, it has to be recognized that the focus of the present work was on species identity and not community structure, and the fact that only a subset of individuals was utilized could have affected the results. Scavenging amphipod communities are known to be dominated by a few species, with a number of less abundant species also commonly present, which in a smaller sub-sample may be missed altogether. Differences in species diversity between APEI and license areas have not been reported in other studies of scavenging amphipods (Patel et al. 2020; Jones et al. submitted), and since they are highly mobile crustaceans, they may be less vulnerable to local environmental heterogeneity than might be the case for epifaunal and infaunal macro- and megafaunal taxa. It remains important to continue to study whole populations of scavenging amphipods across the CCZ, as there may be differences in the distributions of some of the rarer taxa attracted to baited traps, which are not yet apparent, and to further understand the population connectivity between the areas studied. Our study supports these broader aims by providing more robust identification for the commonly found scavenging amphipod taxa.

Molecular species delimitation
The neighbor-joining topology (Fig. 5) clearly discriminated the ten morphologically determined species from their nearest neighbor, even in the case of putative species complexes. In contrast to recent work (Ritchie et al. 2015) proposing that morphological traits used to distinguish species within the alicellid genus Paralicella are not sufficiently robust to ensure accurate identification, here all analyzed amphipod taxa, including the sister species P. caperesca and P. tenuipes, could be identified and discriminated by morphological characters without any difficulties. Indeed, discrimination of Paralicella tenuipes is facilitated by its possession of a distinctive red eye and a strongly beveled pereopod seven basis, as indicated in the original description of the species by Shulenberger and Barnard (1976). However, deep intraspecific divergences were observed within the two taxa morphologically identified as P. caperesca and V. cf. anacantha, generating multiple lineages or MOTUs separated by distances higher than values of interspecific differentiation, which are known from other amphipod crustaceans (Havermans et al. 2011, 2013; Havermans 2016) (Figs. 6 and 7, Table 3). Most of these MOTUs occurred in close geographical proximity or were even caught in the same baited trap while still displaying high levels of genetic variation. Such high levels of intraspecific divergences occurring in sympatry on a small geographical scale can be associated with previously overlooked morphological characters, complexes of cryptic species, historical polymorphism, or introgression from other taxa (Held 2003; Bickford et al. 2007; Kemppainen et al. 2009; Havermans et al. 2011, 2013; Haye et al. 2014; Mamos et al. 2014; Havermans 2016).
However, past studies on scavenging amphipods have shown that several known species are actually complexes of unrecognized cryptic species with almost identical morphological characteristics and a restricted distribution (e.g., Havermans et al. 2011, 2013; Havermans 2016). In the case of P. caperesca, the species appears to be composed of at least four (Pc 1-4) genetically distinct but morphologically indistinguishable groups (at least using current morphological knowledge) (Fig. 6). Our results are concordant with the data of Ritchie et al. (2015), who also found four different genotypes based on mitochondrial 16S rDNA sequences. A more detailed study of P. caperesca populations from the whole geographic range, incorporating slower evolving nuclear genes and thorough morphological examination, will be needed in order to facilitate the robust identification of the putative species in this complex. Species discrimination using DNA barcoding is only reliable if there is no overlap between the average intraspecific and the average interspecific distance, a condition termed the barcoding gap (Hebert et al. 2003, 2004). No barcoding gap was observed when including the P. caperesca and V. cf. anacantha complexes in the analysis (Fig. 4a). However, a clear gap was generated when just considering "well-defined" species (Fig. 4b), indicating that DNA barcoding can be an efficient tool for the identification of scavenging amphipods and also for the detection of cryptic species. Hebert et al. (2004) proposed a standard sequence threshold of ten times the mean intraspecific K2P distance. For our dataset, this value would be >20% when including the possible cryptic species; accordingly, a separation of species with low genetic divergences would not be possible.
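The quantities discussed above (pairwise K2P distances, the barcoding gap, and the tenfold threshold) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the toy sequences, and the species labels are hypothetical, and the gap is assumed here to exist when the largest intraspecific distance is smaller than the smallest interspecific distance, which is a stricter criterion than comparing averages.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites with gaps or ambiguous bases are skipped. Raises ValueError
    (math domain error) if the sequences are too divergent for the formula.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def barcoding_summary(sequences, labels):
    """Summarise intra- and interspecific K2P distances for labelled sequences.

    Assumes at least one within-label pair and one between-label pair.
    """
    intra, inter = [], []
    for i, j in combinations(range(len(sequences)), 2):
        d = k2p_distance(sequences[i], sequences[j])
        (intra if labels[i] == labels[j] else inter).append(d)
    mean_intra = sum(intra) / len(intra)
    return {
        "mean_intra": mean_intra,
        "mean_inter": sum(inter) / len(inter),
        "max_intra": max(intra),
        "min_inter": min(inter),
        "has_gap": max(intra) < min(inter),   # assumed gap criterion
        "threshold_10x": 10 * mean_intra,     # tenfold rule of Hebert et al. (2004)
    }


# Hypothetical toy alignment (20 bp), not data from this study.
toy_seqs = [
    "AAAAAAAAAAAAAAAAAAAA",
    "AAAAAAAAAAAAAAAAAAAG",
    "CCCCCAAAAAAAAAAAAAAA",
    "CCCCCAAAAAAAAAAAAAAG",
]
toy_labels = ["sp1", "sp1", "sp2", "sp2"]
summary = barcoding_summary(toy_seqs, toy_labels)
print(summary)
```

Because the threshold scales with the mean intraspecific distance, pooling deeply divergent lineages under one name inflates it, which is the effect described above for the P. caperesca and V. cf. anacantha complexes.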
In order to apply a group-specific threshold for scavenging amphipods, BLASTclust and CD-Hit analyses were used with a 93% cut-off value, corresponding to the genetic distance value of ten times the mean intraspecific divergence value when excluding species complexes. The automated assignment to BINs, however, is based on a three-stage procedure, which starts with single linkage analysis using a predefined threshold of 2.2% (Ratnasingham and Hebert 2013). As a consequence, MOTU delineation revealed marked differences in performance between BIN assignment and the two other delimitation methods. Whereas BLASTclust and CD-Hit showed only slight differences in MOTU counts, 2 and 4 MOTUs for P. caperesca, respectively, the BIN method suggested the presence of 11 species-like units in P. caperesca. Another case of discrepancy involved the division of P. tenuipes among four BINs. Only for V. cf. anacantha was the same MOTU count (n = 2) generated by each method, indicating the likely presence of new species of Valettietta in the CCZ. The higher number of MOTUs produced by BIN assignment probably arose due to the delimitation thresholds associated with BINs and may lead to overestimation of species richness. Hence, our results have highlighted that the choice of analysis method, linked with a certain threshold value of sequence divergence, can considerably affect species identification accuracy.
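The sensitivity of MOTU counts to the chosen threshold can be sketched with a simplified single-linkage clustering in Python. This is only a stand-in for illustration: it does not reproduce BLASTclust, CD-Hit, or the full three-stage BIN procedure, the function name is hypothetical, and the distance matrix contains invented values rather than data from this study. A 93% similarity cut-off is treated here as roughly equivalent to a distance of 0.07, which is an approximation.

```python
def single_linkage_motus(dist, threshold):
    """Group sequences into MOTUs by single linkage.

    Two sequences share a MOTU if they are connected by a chain of
    pairwise distances that are all <= threshold.
    """
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical symmetric distance matrix for six sequences:
# two shallow lineages (0-1 and 2-3) about 4-5% apart, plus a distant pair (4-5).
D = [
    [0.000, 0.010, 0.045, 0.050, 0.150, 0.155],
    [0.010, 0.000, 0.040, 0.045, 0.150, 0.150],
    [0.045, 0.040, 0.000, 0.010, 0.160, 0.155],
    [0.050, 0.045, 0.010, 0.000, 0.155, 0.150],
    [0.150, 0.150, 0.160, 0.155, 0.000, 0.005],
    [0.155, 0.150, 0.155, 0.150, 0.005, 0.000],
]

strict = single_linkage_motus(D, 0.022)  # seed threshold used for BINs
relaxed = single_linkage_motus(D, 0.07)  # approx. 93% similarity cut-off
print(len(strict), strict)
print(len(relaxed), relaxed)
```

In this toy case the stricter threshold splits the two shallow lineages into separate MOTUs, whereas the relaxed threshold merges them, mirroring in miniature why BIN assignment can yield more units than the 93% clustering.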

Conclusion
We have demonstrated that, in large, specimen-rich datasets, combining traditional approaches with molecular data can speed up morphospecies assignment and enable cost-efficient and reliable species delimitation. Despite the fact that some taxa are currently underrepresented in our samples (e.g., representatives of the genera Cyclocaris, Paracallisoma, and Eurythenes), our study highlights the benefits of DNA sequence analysis to discriminate well-established amphipod species and also to detect distinct genetic lineages when working with morphologically indistinguishable species or incomplete specimens. These approaches enable caution to be applied when identifying certain species and highlight where further morphological taxonomic work may be needed. However, the discrepancies in MOTU delineation and the deep mitochondrial divergences observed within two species imply that additional data and taxonomic studies using supplementary markers (e.g., mitochondrial 16S rDNA, nuclear 28S rDNA) are needed to verify species delimitation in some scavenging amphipods.
In conclusion, our data represent an important step towards a highly effective identification system for macrofaunal crustaceans in the CCZ, a marine region prospected for deep-sea mining that could be severely affected by the exploitation of marine resources. The DNA barcodes generated in this study are publicly available and can be used by other contractors to aid in the identification of deep-sea amphipod communities in their license areas using standard molecular techniques.
Acknowledgments We thank Jeffrey Drazen, Clifton Nunnaly, and Astrid Leitner for operating the baited traps. We thank Seafloor Investigations Ltd. for creating maps from the bathymetry collected. We would like to express our gratitude to both reviewers, whose comments on our manuscript allowed us to improve it. This is publication 73 of the Senckenberg am Meer Metabarcoding and DNA Laboratory.
Funding This work was financed by UK Seabed Resources Ltd. under the project ABYSSLINE.

Declarations
Conflict of interest The authors declare that they have no conflict of interest.

Ethical approval Not applicable.
Sampling and field studies Sampling was conducted outside the exclusive economic zone of any country so the Nagoya protocol does not apply to the study area.
Data availability The datasets generated and analyzed during the current study are available in the Barcode of Life Data Systems under the project SCAP "Scavenging amphipods of the Clarion Clipperton Zone" (dx.doi.org/10.5883/DS-SCAP). The sequences were submitted to GenBank and received the following Accession Numbers: MW056513-MW057157. A summary of the data used is also provided in the Electronic Supplementary Material (ESM1.xlsx, ESM2.docx).
Author contribution PMA and IM conceived and designed the experiments. IM performed the experiments and first analyses. IM, TH, and AMJ analyzed the data. All authors wrote and discussed the paper; IM and AMJ prepared graphics.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.