Combination of nuclear and mitochondrial markers as a useful tool to identify Ctenophthalmus species and subspecies (Siphonaptera: Ctenophthalmidae)

Ctenophthalmus is considered the largest genus within the Order Siphonaptera. From a morphological point of view, only males of this genus can be identified at species and subspecies levels using morphological keys, whereas there are no morphological criteria in order to classify females at these taxonomical levels. Furthermore, the amount of available molecular and phylogenetic data for this genus is quite scarce so far. The main objective of this work was to assess the utility of the combination of nuclear and mitochondrial markers with respect to their ability to differentiate among different subspecies within the Ctenophthalmus genus. With this purpose, we carried out a comparative morphological and molecular study of three different subspecies (Ctenophthalmus baeticus arvernus, Ctenophthalmus nobilis dobyi, and Ctenophthalmus andorrensis catalaniensis) in order to clarify and discuss its taxonomic status. In addition, our study complemented the molecular data previously provided for Ctenophthalmus baeticus boisseauorum and Ctenophthalmus apertus allani subspecies. We sequenced five different molecular markers: EF1-α, ITS1, ITS2, cox1, and cytb. Our results confirmed that morphological data by themselves are not able to discriminate among Ctenophthalmus female taxa; however, the combination of the nuclear marker EF1-α together with mtDNA markers cytb and cox1 constituted a useful taxonomical and phylogenetic tool to solve this issue. Based on these results, we consider that the use of this molecular approach should be gradually used within Ctenophthalmus genus in order to complement its classical taxonomy and clarifying the complex taxonomy of other congeneric species of fleas.


Introduction
The family Ctenophthalmidae has been considered a "catchall" for a wide range of divergent taxa within the Order Siphonaptera (Whiting et al. 2008;Keskin 2019). In this sense, some authors recently reported that some subfamilies included within Ctenophthalmidae such as Stenoponiinae should be considered a monophyletic family (Stenoponiidae) (Zurita et al. 2015). Ctenophthalmus is considered the largest genus within the Order Siphonaptera; thus, this genus comprises fifteen different subgenera distributed throughout the world out of which the Ctenophthalmus subgenus is the largest one being distributed along the Palearctic region (Hopkins and Rothschild 1966;Beaucournu and Loverlec 2014). Only in the Western Palearctic Zone (European region) 143 taxa (57 species and 86 subspecies) have been described so far (Beaucournu and Loverlec 2014).
From a taxonomical point of view, it is noteworthy that Ctenophthalmus genus has been intensively studied using morphological features. Nevertheless, specialized entomologists have not been able to find standard criteria in order to discriminate among Ctenophthalmus female species so far; thus, there is no specific taxonomical key to identify Ctenophthalmus females based on morphological traits (Beaucournu and Launay 1990;Zurita et al. 2020). On the other hand, males of Ctenophthalmus can be easily distinguishable based on their complex genitalia. In this context, it is usual to find some taxonomical keys for this genus providing a deep level of classification to subspecies level exclusively based on males (Beaucournu and Launay 1990). For example, Beaucournu and Loverlec (2014) reported eight different for C. baeticus boisseauorum and C. a. allani subspecies specimens provided by Zurita et al. (2020). Although these five subspecies by themselves do not represent the whole Ctenophthalmus genus, this study expects to provide, for the first time, useful molecular tools in order to clarify the complex taxonomy of this flea group.

Collection of samples
All the samples were collected from rodents Apodemus sylvaticus and Crocidura russula from Py (eastern Pyrenees, south of France) (42°29′ 45″ N, 2°21′ 03″ E) during two different periods (July 2016 and July 2019) ( Table 1). Rodents were trapped using live traps. Afterward, rodents were exhaustively examined for fleas by combing through an inspection of head, neck, body, sides, tail, and ventral regions of each animal. Fleas obtained were kept in Eppendorf tubes with 96% ethanol for subsequent identification and DNA extraction.

Morphological identification
For morphological analysis, all whole specimens were examined and photographed under an optical microscope. Subsequently, eighteen samples (13 males and 5 females) corresponding to three different subspecies (C. b. arvernus, C. n. dobyi, and C. a. catalaniensis) were put away for molecular purposes (only males could be classified at subspecies level). On the other hand, one male for each species and five Ctenophthalmus sp. females were cleared with 10% KOH, prepared, and mounted on glass slides using conventional procedures with EUKITT mounting medium (O. Kindler GmbH & Co., Freiburg, Germany) (Lewis 1993). Once mounted, they were examined and photographed again for a deeper morphological analysis using a CX21 microscope (Olympus, Tokyo, Japan). Diagnostic morphological characters of all the samples were studied by comparison with figures, keys, and descriptions reported by Hopkins and Rothschild (1966) and Beaucournu and Launay (1990).

Molecular and phylogenetic study
As well as the three just cited Ctenophthalmus subspecies, we amplified EF1-α sequences from individuals of C. b. boisseauorum, C. a. allani, Ctenocephalides felis, Ctenocephalides canis, Nosopsyllus barbarus, Nosopsyllus fasciatus, Archaeopsylla erinacei, Pulex irritans, Stenoponia tripectinata tripectinata, and Panorpa meridionalis, only for phylogenetic comparative purposes. Genomic DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen) according to the manufacturer's protocol and was checked using an electrophoresis in a 0.8% agarose gel infused with SYBR Safe.
All molecular markers sequenced in the present study (EF1-α, ITS1 and ITS2 rDNA, cox1, and cytb mtDNA) were amplified by a polymerase chain reaction (PCR) using a thermal cycler (Eppendorf AG; Eppendorf, Hamburg, Germany). PCR mix, PCR conditions, and PCR primers are summarized in the Supporting information (Table S1).
Initially, we amplified a 658-bp fragment of cox1, the socalled barcoding fragment that can serve as the core of a global bioidentification system for animals (Hebert et al. 2003). For this purpose, we used the generic invertebrate amplification primers LC01490 and HC02198 (Folmer et al. 1994); however, we did not obtain reliable results owing to co-amplification of nonspecific products. For that reason, we finally used Kmt6 primer (Zhu et al. 2015) as a forward to amplify the cox1 partial gene (453 pb), whereas HC02198 remained as reverse primer for this partial gene. The EF1-α, ITS1, ITS2, cox1, and cytb partial gene sequences obtained from all specimens analyzed were deposited in the GenBank database (Table 1).
The PCR products were checked on SYBR Safe stained 2% Tris-borate-ethylenediaminetetraacetic acid agarose gels. Bands were eluted and purified from the agarose gel using the QWizard SV Gel and PCR Clean-Up System Kit (Promega, Madison, WI, USA). Once purified, the products were sequenced by Stab Vida (Lisbon, Portugal). To obtain a nucleotide sequence alignment file, the MUSCLE alignment method (Edgar 2004) was used in MEGA, version 5.2 (Tamura et al. 2011). To assess the similarity among all marker sequences of all specimens analyzed in the present study and other flea species, the number of base differences per sequence was assessed using the number of differences method of MEGA, version 5.2 (Tamura et al. 2011).
Phylogenetic trees were inferred using nucleotide data and performed using two methods: maximum likelihood (ML) and Bayesian inferences (BI). Maximum likelihood trees were generated using the PHYML package from Guindon and Gascuel (2003), whereas Bayesian inferences were generated using MRBAYES, version 3.2.6 (Ronquist and Huelsenbeck 2003). JMODELTEST (Posada 2008) was used to determinate the best-fit substitution model for the parasite data (EF1-α, ITS2, cox1, and cytb). Models of evolution were chosen for subsequent analyses according to the Akaike information criterion (Huelsenbeck and Rannala 1997;Posada and Buckley 2004). To investigate the dataset containing the concatenation of four markers (EF1-α, ITS2, cox1, and cytb), analyses based on BI were partitioned by gene and models for individual genes within partitions were those selected by JMODELTEST. For ML inference, best-fit nucleotide substitution models included a general time-reversible model with gamma-distributed rate variation GTR+G (ITS2), a Tamura- Nei model with gamma-distributed rate variation and a proportion of invariable sites, TrN+I+G (cox1) and a general time-reversible model with gamma-distributed rate variation and a proportion of invariable sites, GTR+I+G (EF1-α, cytb). Support for the topology was examined using bootstrapping (heuristic option) (Felsenstein 1985) over 1000 replications to assess the relative reliability of clades. The commands used in MRBAYES, version 3.2.6 for Bayesian inference, were nst =6 with gamma rates (ITS2) and nst =6 with invgamma rates (EF1-α, cox1, and cytb). For BI, the standard deviation of split frequencies was used to determine whether the number of generations completed was enough; the chain was sampled every 500 generations and each dataset was run for 10 million generations. Adequacy of sampling and run convergence was assessed using the effective sample size diagnostic in tracer, version 1.6 (Rambaut and Drummond 2007). Trees from the first million generations were discarded based on an assessment of convergence. Burn-in was determined empirically by examination of the log-likelihood values of the chains. The Bayesian posterior probabilities (BPP) comprise the significance scores for the nodes.
The phylogenetic analyses, based on single markers EF1-α, ITS2, cox1, and cytb sequences, were carried out using our sequences and those obtained from GenBank database (see Table S2). Phylogenetic trees were rooted including Panorpa meridionalis (Mecoptera: Panorpidae) as outgroup. This choice was based on the combination of morphological and molecular data obtained in previous studies, which provided compelling evidence for a sister group relationship between Mecoptera and Siphonaptera (Whiting 2002;Whiting et al. 2008). The ITS1 sequence of P. meridionalis or other species of Mecoptera was not available either by amplification of different individuals or in any public database. Thus, no phylogenetic tree with other Siphonaptera species based on ITS1 sequences was constructed, discarding this marker for the concatenated dataset.

Morphological and biometrical results
All the specimens studied in this work showed morphological characteristics expected for the genera Ctenophthalmus. All these characters were described in detail by Zurita et al. (2020).
Males of C. a. catalaniensis showed specific morphological characters: • Apex of the distal arm of IX sternum without an apical slot (Fig. 1A).
• Dorsal process basimere is slightly longer than wide with two long setae with the same length each other (Fig. 1B).
• Ventral process basimere is as longer as wide showing a small apical slot close to the beginning of the dorsal margin (Fig. 1C).
• Telomere with a narrow apex, carrying one curved long seta located near to the apex in dorsal margin. (Fig. 1D).
Males of C. b. arvernus showed specific morphological characters: • Arm of the distal branch of IX sternum without an apical slot ( Fig. 2A).
• Apical part of distal arm of IX sternum with parallel margins (Fig. 2A).
• Dorsal process basimere is significantly longer than wide with only one long seta (Fig. 2B).
• A narrow ventral process basimere, longer than dorsal processus basimere and with an apical slot on the apex (Fig.  2C).
Males of C. n. dobyi showed specific morphological characters: • Distal arm of IX sternum thumb-shaped without an apical slot in the apex (Fig. 2D).
• Dorsal process basimere is significantly longer than wide with three setae two of them long with the same length (Fig.  2E).
• Ventral process basimere is as longer as wide with a big posterior lobe with a triangular shape (Fig. 2F).
The morphology of females was studied focusing on the spermatheca's shape and the chaetotaxy and shape of the margin of the sternum VII; however, we did not find any morphological criteria to discriminate among females of Ctenophthalmus sp. collected for this study. The spermatheca always showed a hilla shorter and narrower than bulga showing a small prominence at the end in some specimens. According to chaetotaxy, all specimens showed the presence of six setae with different degree of development in sternum VII. In addition, the presence of three strong setae was common, longer than the other ones, which appeared very close to each other. With all these morphological results, we were not able to set up any taxonomical differentiation for female discrimination. All these female morphological traits were described in detail by Zurita et al. (2020) (see pictures of female in this paper).

EF1-α, ITS1, and ITS2 analysis
The length of the ITS1 sequences of Ctenophthalmus specimens ranged from 888 base pairs (bp) (C. b. arvernus and C. n. dobyi males and some Ctenophthalmus sp. females) to 889 bp (C. a. catalaniensis males and some Ctenophthalmus sp. females). On the other hand, the length of the ITS2 and EF1-α fragment was 492 bp and 976 bp, respectively, for all the specimens (Table 1). The intrageneric similarity ranged from 99.6 to 100% for ITS1 and ITS2 with a maximum of two different base pairs among all the sequences analyzed for both markers, including C. b. boisseauorum and C. a. allani (table not shown). According to EF1-α, the similarity observed for each Ctenophthalmus subspecies ranged from 96.7% (C. a. allani) to 99.5% (C. a. allani and C. n. dobyi). Furthermore, if we compare all the EF1-α sequences from Ctenophthalmus spp. provided in this study, they showed high level of similarity with each other with values about 97-99%. On the other hand, when we compared our sequences with those Ctenophthalmus species and subspecies retrieved from GenBank database, these values appeared quite lower (about 91-92%) (see Table 2).
Partial cox1 and cytb mtDNA gene analysis The partial gene cox1 and cytb mtDNA sequences of Ctenophthalmus male and female specimens amplified in our study were 453 bp and 374 bp in length, respectively (Table 1). Intra-subspecific percentage of similarity observed for both markers ranged from 97.4% (cox1 of C. n. dobyi) to 100% (cytb of all subspecies excepting C. b. arvernus) (see Tables 3 and 4). Based on the comparison of cox1 and cytb sequences, we observed that sequences of those taxa belong to Ctenopthalmus subgenus (C. n. dobyi, C. a. catalaniensis, C. b. arvernus C. a. allani, and C. b. boisseauorum) showed each other high percentages of inter-subspecific similarity, ranging from 95.8 to 100% (Tables 3 and 4). In contrast to that, when these five subspecies were compared with another  (Tables 3 and 4).
The concatenated dataset of EF1-α, ITS2, partial cytb and cox1 gene sequences included 2200 aligned sites and 41 taxa, including five different Ctenophthalmus subspecies (C. n. dobyi, C. b. boisseauorum, C. a. allani C. a. catalaniensis, and C. b. arvernus) and outgroups (A. erinacei). Phylogenetic analysis of the concatenated dataset yielded a tree with strongly supported nodes (Fig. 3). This analysis was not in concordance with all trees constructed based on the single markers since Ctenophthalmus subspecies appeared clearly separated in different clades (Fig. 3). Therefore, male specimens of C. n. dobyi clustered together and phylogenetically distant from the other four subspecies, whereas C. b. boisseauorum and C. a. allani appeared closely related comprising two main subclades (100/73 BPP and bootstrap values). Nevertheless, two specimens of C. b. boisseauorum clustered within "C. a. allani subclade." On the other hand, C. b. arvernus taxa set up an independent clade phylogenetically close to "C. b. boisseauorum and C. a. allani complex" (Fig. 3). C. a. catalaniensis specimens did not set up any specific clade neither appear phylogenetically related with any Ctenophthalmus subspecies. Lastly, Ctenophthalmus female specimens clustered separately each other appearing dispersed in the different clades throughout concatenated phylogenetic tree (Fig. 3).

Discussion
The accurate taxonomic identification of an organism constitutes the cornerstone of most aspects of arthropods science. The lack of molecular data in the taxonomic characterization of arthropods can result in many difficulties, especially in certain groups as Siphonaptera that have demonstrated both cryptic and polymorphic species (Lawrence et al. 2014;Zurita et al. 2018a;Zurita et al. 2019). The necessity to study, comparatively, morphological and molecular approaches within Siphonaptera Order has been reported by several authors in the last years (Whiting et al. 2008;Marrugal et al. 2013;Lawrence et al. 2014;Zurita et al. 2018b). Ctenophthalmus genus is considered the largest one within Siphonaptera Order (Beaucournu and Loverlec 2014); however, only morphological data are available in order to identify and discriminate species and subspecies of this group. Recently, Zurita et al. Table 4 Intra-subspecific* and inter-subspecific similarity observed among all the partial cytb mtDNA sequences of different species and subspecies belonging to Ctenophthalmus sp. obtained in this work and retrieved from GenBank database. Values are given in percentages Cytb C. a. allani L R 5 9 4 4 6 4 -LR594467 C. b. boisseauorum L R 5 9 4 4 6 8 -LR594477 (2020) claimed for further molecular and phylogenetic studies for this genus, especially aimed to discriminate among female specimens, which could not be morphologically identified at species and subspecies levels. Furthermore, morphological keys so far available for Ctenophthalmus male identification are difficult to use, remaining problematic or impossible for non-specialists (Beaucournu and Launay 1990). Our results agree with Beaucournu and Loverlec (2014) in relation to the geographical distribution and main hosts of C. n. dobyi, C. a. catalaniensis, and C. b. arvernus. Thus, these three subspecies were found parasitizing Apodemus sp. in the eastern Pyrenees, demonstrating that these taxa are well settled in this geographical area. Furthermore, it is important to highlight that the distribution of C. n. dobyi is enclosed only to the eastern Pyrenees region whereas the other two subspecies (C. a. catalaniensis and C. b. arvernus) could be also found parasitizing rodents from any part of the Iberian Peninsula or the south of France (C. b. arvernus) (Beaucournu and Launay 1990;Beaucournu and Loverlec 2014).
From a morphological point of view, our results confirmed the high levels of morphological variation observed in Ctenophthalmus male genitalia previously observed by several authors (Hopkins and Rothschild 1966;Beaucournu and Launay 1990;Gómez et al. 2003;Zurita et al. 2020). Thus, we identified three different subspecies (C. a. catalaniensis, C. b. arvernus, and C. n. dobyi); however, morphological study of females did not show any morphological specific pattern based on the spermatheca and the chaetotaxy and shape of the sternum VII. The lack of variability found in these regions agrees with Zurita et al. (2020) who also provided biometrical data for males of C. b. boisseauorum and C. a. allani even for Ctenophthalmus females, but this tool was not useful in order to identify female species. For this reason,   Table 1). This analysis was based on concatenated sequences of Elongation Factor 1 alpha (EF1-α), Internal Transcribed Spacer 2 (ITS2), partial cytochrome c-oxidase subunit 1 (cox1), and cytochrome b (cytb) gene of mitochondrial DNA inferred using the Bayesian inference (BI) and maximum likelihood (ML) methods and Bayesian topology. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown on the branches (BPP/Bootstrap). The Bayesian Posterior Probabilities (BPP) are percentage converted been demonstrated that the presence of certain geographical barriers can induce lower levels of genetic variation in an isolated population of fleas (Lin et al. 2014;Zurita et al. 2015).
To summarize, this work constitutes the first study that provides a combination of morphological, molecular, and phylogenetic comparative data of three subspecies (C. n. dobyi, C. a. catalaniensis, and C. b. arvernus) in order to assess their taxonomic and phylogenetic relationships. In concordance with previous authors (Hopkins and Rothschild 1966;Beaucournu and Launay 1990;Zurita et al. 2020), we conclude that just the study of morphological traits in Ctenophthalmus genus is not able to discriminate at species and subspecies levels in females; therefore, molecular and phylogenetic data are needed in order to solve this taxonomical issue. We evaluated the efficiency of five different molecular markers (ITS1, ITS2, EF1-α, cox1, and cytb) to discriminate among Ctenophthalmus subspecies. In this sense, although ITS1 and ITS2 have demonstrated their utility to discriminate among congeneric species in fleas (Vobis et al. 2004;Marrugal et al. 2013), we suggest that these two rDNA markers should be discarded for specific identification within Ctenophthalmus species. We based this idea on the lack of nucleotide divergence observed for these two markers for all the subspecies analyzed in this study (with a maximum of two different base pairs). On the other hand, the use of these five molecular markers individually could not discriminate among all these subspecies; however, this paper demonstrates, by the first time, that the combination of the nuclear marker EF1-α together with mtDNA markers cytb and cox1 could constitute a useful taxonomical and phylogenetic tool in order to discriminate among different Ctenophthalmus species and subspecies. This molecular approach, provided in this study, is especially relevant for female specimens belonging to this genus, which cannot be classified with morphological techniques or even for a possible discrimination between C. a. allani and C. b. boisseauroum, which had recently been suggested as synonymous taxa (Zurita et al. 2020). Lastly, although we just could include five different Ctenophthalmus taxa, we consider that the use of these molecular markers together should be gradually used within Ctenophthalmus genus where most taxa have been only described from a morphological point of view. Thus, it would be necessary to complement these classic taxonomical data with phylogenetic studies based on molecular data in order to clarify the complex taxonomy of not only Ctenophthalmus genus but also the whole Ctenophthalmidae family.
Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
Data availability All data generated or analyzed during this study are included in this published article and its supplementary information files. The datasets generated during and/or analyzed during the current study are also available from the corresponding author on reasonable request.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.