DNA barcoding for identification of fish species from freshwater in Enugu and Anambra States of Nigeria

Within Enugu and Anambra States, Nigeria, identification of fishes has been based on morphological traits and does not account for existing biodiversity. For DNA barcoding, assessment of biodiversity, conservation and fishery management, DNA from 44 fish sampled from Enugu and Anambra States was isolated, amplified and sequenced using the mitochondrial cytochrome oxidase subunit I (COI) marker. Twenty groups clustering at 100% bootstrap value, including monophyletic ones, were identified. The phylogenetic diversity (PD) ranged from 0.0397 (Synodontis obesus) to 0.2147 (Parachanna obscura). The highest percentage of genetic distance based on Kimura 2-parameter was 37.00 ± 0.0400. Intergeneric distances ranged from 15.8000 to 37.0000%. Congeneric distances were 6.9000 ± 0.0140–28.1000 ± 0.0380, with Synodontis as the shared genus. Confamilial distances in percentage were 16.0000 ± 0.0140 and 25.7000 ± 0.0300. Forty-two haplotypes and haplotype diversity of 0.9990 ± 0.0003 were detected. Nucleotide diversity was 0.7372, while Fu and Li’s D* test statistic was 2.1743 (P < 0.02). Tajima’s D was 0.2424 (P > 0.10) and nucleotide frequencies were C (17.70%), T (29.40%), A (24.82%), G (18.04%) and A + T (54.22%). Transitions outnumbered transversions. Twenty species were identified at 99–100% similarity, with e-value, maximum coverage and bit-score of 1e−43, 99–100 and 185–1194, respectively. Seventeen genera and 12 families were found, and Clariidae (n = 14) was the most dominant family. The fish species resolution, diversity assessment and phylogenetic relationships were successfully obtained with the COI marker. Clariidae had the highest number of genera and families. Phylogenetic diversity analysis identified Parachanna obscura as the most evolutionarily divergent species. This study will contribute to fishery management and conservation of freshwater fishes in Enugu and Anambra States, Nigeria.


Introduction
Fishes are vital aquatic animals of great morphological diversity, and there are more than 35,000 species globally, which make up a significant share of the existing vertebrates (Zhang and Hanner 2012; Bingpeng et al. 2018). Numerically, valid scientific descriptions exist for more than 27,977 species of fishes in approximately 62 orders and 515 families (Nelson 2006). These organisms play significant roles in income generation and as protein and mineral dietary supplements for human use, and they serve as major components of biodiversity (Ward et al. 2005; Rasmussen et al. 2009; Ugwumba and Ugwumba 2003). Fishes possess remarkable morphological features that pose great challenges to identification when only morphological and morphometric descriptors are used (Triantafyllidis et al. 2011; Zhang and Hanner 2011). Furthermore, convergence and divergence in fishes alter the morpho-based features, which leads to controversial classification, distinguishability and identification of fishes (Keskin and Atar 2013). Characterization, identification and assessment of biodiversity are the ingredients of fishery investigations and assessment of natural reserves (Ardura et al. 2013; Vartak et al. 2015). The challenges posed by the morpho-based identification procedure, coupled with the dwindling number of experienced taxonomists, have necessitated the use of an informative molecular method (Steinke et al. 2009). Unlike the morpho-based method, which suffers from inaccurate identification due to similar external morphological features and variation across developmental stages, DNA barcoding is free from these barriers and can accurately identify species and also discover cryptic ones (Bingpeng et al. 2018).
Proper identification of fishes, which has traditionally been based on morphological attributes, requires a better, reliable, sensitive and affordable alternative to understand and obtain basic knowledge of fish identities and species biodiversity within given geographical locations. Identification of fishes using a morphology-based approach poses a great challenge because of high diversity across developmental stages and morphological plasticity (Victor et al. 2009). The DNA barcoding identification approach has been developed and noted to be potentially efficacious due to its sensitivity, reproducibility, reliability and environmental friendliness (Zhang et al. 2004; Comi et al. 2005; Teletchea 2009). This method, if well utilized, will eliminate existing misidentification and address the presence of cryptic species that mimic others and compromise the accurate identification of fishes in research, fishery management and conservation (Vecchione et al. 2000; Bortolus 2008). A number of countries and regions, including Australia (Ward et al. 2005); Antarctic Scotia Sea (Rock et al. 2008); Alaska and Pacific Arctic (Mecklenburg et al. 2011); Canada (Hubert et al. 2008); Mexico and Guatemala (Valdez-Moreno et al. 2009); Amazon (Ardura et al. 2010); India (Lakra et al. 2011); North America (April et al. 2011); Eastern Nigeria covering only Ebonyi and Anambra States (Nwani et al. 2011) and Japan, have had DNA barcoding done on some of the fish species sourced from freshwater and marine environments. Nigeria is a country of above 170 million people with abundant water bodies for fishery. Nigerian fishes need to be studied for adequate knowledge of genetic diversity and possible identification of new species, especially in Enugu and Anambra States, which harbor many freshwater bodies.
Application of informative molecular markers will provide information on the molecular structure of fish species that will be useful in identification of unique stocks, stock enhancement, breeding programs for sustainable yield and preservation of genetic diversity (Tripathi 2011; Dinesh et al. 1993). DNA barcoding has been judged highly discriminatory for fish species from different water sources, including some cryptic ones (Hubert et al. 2008; Carvalho et al. 2011; Pereira et al. 2013; Benzaquem et al. 2015).
COI has become the marker of choice for DNA barcoding within the animal kingdom (Hebert et al. 2003). It has been extensively applied for identification of invasive species (Wilson-Wilde et al. 2010), food adulteration analysis (Cohen et al. 2009; Murugaiah et al. 2009; Rojas et al. 2010), in forensic cases (Eaton et al. 2009), ecological discrimination (Berry et al. 2017), biomaterial collections (Cooper et al. 2007) and evaluation and documentation of new species through the use of phylogenetic diversity (PD) (the summation of the phylogenetic tree lengths of all the branches that are members of the corresponding minimum spanning routes for assessing ancestral relationships and conservation) (Faith 1992, 2008). This DNA barcoding technique has also been utilized in the identification of different organisms to their respective species levels as reported in nematodes (Elsasser et al. 2009), fish parasites (Locke et al. 2010), bats (Clare et al. 2007), mosquitoes (Cywinska et al. 2006), fungi (Stockinger et al. 2010), earthworms (Chang et al. 2008), bacteria (Sogin et al. 2006), protists (Chantangsi et al. 2007), spiders (Barret and Hebert 2005), fish (Ward et al. 2005) and crustaceans (Costa et al. 2007). DNA barcoding has become universally important in both animals and plants, but plants use chloroplast loci (matK, rbcL, rpoB and rpoC1) targeting coding regions and nuclear genes (ITS) (Baldwin and Markos 1998; Mort et al. 2007; Dong et al. 2012), while COI is applied in DNA barcoding of animals due to hyper mutation, maternal inheritance, absence of introns, absence of recombination, high substitution rates and lack of fast nucleotide substitution within the mitochondrial genome where the marker is located (Ballard and Whitlock 2004; Ballard and Rand 2005; Nabholz et al. 2009; Bernt et al. 2013; Hoque et al. 2013).
COI is a useful tool in different biological studies and has been used by the Barcode of Life Data Systems (BOLD) as a potential approach for identification of fishes to the species level (Ward et al. 2005; Wong and Hanner 2008). Within the eastern zone of Nigeria, especially Enugu and Anambra States, which maintain abundant freshwater bodies, identification and classification of fish have been based on morphological traits, which are prone to errors. Also, there is no record of existing biodiversity of fishes within these States because only a morpho-based method has been used. Application of a modern and informative molecular technique such as DNA barcoding has become necessary to address the challenges of inappropriate identification. Therefore, we investigated the utility of the COI marker gene for species identification and assessment of genetic diversity within and among fish species collected from different freshwater bodies in Enugu and Anambra States of Nigeria.

Sample collection
Forty-four (44) fish samples were collected from different locations in Enugu and Anambra States of Nigeria ( Fig. 1; Table 1). The freshwater bodies that were easily accessible in the two States included the locations of Nike, Ugwuonwu, Ezu, and Obinna's farm cutting across different lakes and rivers. Twelve (12), 17, 13, and 2 fishes were respectively collected from Nike Lake, Ugwuonwu Lake, Ezu River, and Obinna's farm through local farmers who caught the fishes with fishing nets. The collected samples (the cut caudal fin or muscle part from whole fish species already caught by the local farmers) were preserved in 75% ethanol prior to DNA extraction.

DNA extraction
DNA was extracted following the method of Marizzi et al. (2018) with modifications. Briefly, a tissue (caudal fin or muscle) of 0.01 to 0.015 g was cut from each of the ethanol-preserved fish samples and transferred to a sterile 1.5 mL microcentrifuge tube with addition of 300 μL of lysis solution for homogenization using a sterile mortar and pestle. The mixture was incubated in a heat block at 65 °C for 10 min. Next, samples were centrifuged in a balanced configuration at maximum speed (13,000 rev/min) for 1 min to pellet debris, followed by transfer of 150 μL of supernatant into a new 1.5 mL microcentrifuge tube with care not to disturb the pellet. The mixture was well mixed after addition of 3 μL silica resin, followed by incubation at 57 °C for 5 min and centrifugation at maximum speed for 30 s to pellet the resin. The supernatant was transferred to a new 1.5 mL tube, and 500 μL of ice-cold wash buffer was added to the pellet, followed by centrifugation at maximum speed for 30 s. After this, the supernatant was transferred and a further 500 μL of ice-cold wash buffer was added, followed by thorough mixing by vortexing, resuspension of the silica resin and centrifugation at maximum speed for 30 s. The wash buffer removes contaminants from the samples while nucleic acids remain bound to the resin. A dry spin step after the wash was performed, and any remnant drops of supernatant were removed with a micropipette. Finally, 100 μL of distilled water was added to the silica resin, mixed well by vortexing and incubated at 57 °C for 5 min. Samples were then centrifuged for 30 s at maximum speed to pellet the resin. Later, 90 μL of the supernatant was transferred from the resin to new tubes. The eluted DNA was stored at −20 °C prior to the PCR step. The extracted DNA was verified by loading 2 μL on a 1.0% agarose gel for electrophoresis.
Table 1
Sample code | Location | Common name | Local name
[not recovered] | Nike Lake | Buffalo | Obuu
Dogfish4 | Nike Lake | Dogfish | Ogazingu
AfricanCat5 | Nike Lake | Catfish | Alila
ObscueSH6 | Nike Lake | Sleeping fish | Evi
NileTilapia9 | Nike Lake | Tilapia | Ikpokpo
Catfish10 | Nike Lake | Catfish | Alila
Catfish11 | Nike Lake | Catfish | Alila
Catfish12 | Nike Lake | Catfish | Alila
ElectricF13 | Ugwuomu Lake | Electric fish | Elulu/Ntuji
Catfish14 | Ugwuomu Lake | Catfish | Alila
Catfish17 | Ugwuomu Lake | Catfish | Alila
Catfish18 | Ugwuomu Lake | Catfish | Alila
Catfish19 | Ugwuomu Lake | Catfish | Alila
Catfish20 | Ugwuomu Lake | Catfish | Alila
Catfish21 | Ugwuomu Lake | Catfish | Alila
UpDoCat22 | Ugwuomu Lake | Upside down catfish | Okpuu (Igagu)
UpDoCat23 | Ugwuomu Lake | Upside down catfish | Okpuu (Nchaba)
CoastalUD24 | Ugwuomu Lake | Upside down catfish | Okpuu (Mmanu)
UpDoCat25 | Ugwuomu Lake | Upside down catfish | Okpuu
UpDoCat26 | Ugwuomu Lake | Upside down catfish | Okpuu
UpDoCat27 | Ugwuomu Lake | Upside down catfish | Okpuu
Dogfish28 | Nike Lake | Dogfish | Ogazingu/Nkuta Azu
Tilapia29 | Nike Lake | Tilapia | Ikpokpo
Dogfish30 | Nike Lake | Dogfish | Ogazingu/Nkuta Azu
UpDoCat37 | Ugwuomu Lake | Upside down catfish | Okpuu (Igagu) Noise
UpDoCat38 | Ugwuomu Lake | Upside down catfish | Okpuu
Catfish46 | Ugwuomu Lake | Catfish | Alila
Dogfish47 | Ugwuomu Lake | Dog fish | Ogazingu/Nkuta Azu
MoonFish55 | Ezu River | Moon fish | Aghali
Trunkfish56 | Ezu River | Trunk fish | Obu
AfricanKnF60 | Ezu River | African knife fish | Uchulu (Akarakara)
GrassEater72 | Ezu River | Grass eater fish | Ejo
AfricanJeF73 | Ezu River | African jewelfish | Anyamme
Cichlid74 | Ezu River | Damselfish | Ikputu
AfricanButCf76 | Ezu River | African Butterfly fish | Adaala
Tilapia77 | Ezu River | Tilapia | Onyeoma
UpDoCat79 | Ezu River | Upside down catfish | Okpuu (Isinkita)
RedTailedSy81 | Ezu River | Red Tailed fish | Okpuu (Elo)
WiheadCf83 | Ezu River | Wide head fish | Okpuu (Utu)
Dogfish84 | Ezu River | Dogfish | Ogazingu
Catfish87 | Ezu River | Catfish | Alila
CfHybrid90 | Obinna's Farm | Tropical catfish (hybrid) | Alila
TropCfish91 | Obinna's Farm | Tropical catfish | Alila

Polymerase chain reaction and agarose gel electrophoresis and DNA sequencing
Polymerase chain reaction (PCR) amplification was performed in Ready-To-Go PCR beads in a total volume of 25 µL, which consisted of 2 µL of ~100 ng DNA and 23 µL of primer/loading dye mix for the fish cocktail with pairs of mitochondrial cytochrome oxidase I (COI) primers [forward primer, VF2_t1: 5′-TGT AAA ACG ACG GCC AGT CAA CCA ACC ACA AAG ACA TTG GCA C-3′; forward primer, FishF2_t1: 5′-TGT AAA ACG ACG GCC AGT CGA CTA ATC ATA AAG ATA TCG GCA C-3′; reverse primer, FishR2_t1: 5′-CAG GAA ACA GCT AGT ACA CTT CAG GGT GAC CGA AGA ATC AGA A-3′ and reverse primer, FR1d_t1: 5′-CAG GAA ACA GCT ATG ACA CCT CAG GGT GTC CGA ARA AYC ARA A-3′]. The PCR tubes were placed in a thermal cycler programmed with an initial step at 94 °C for 1 min, 35 cycles of 94 °C for 15 s, 54 °C for 15 s and 72 °C for 30 s, and a final extension at 72 °C for 8 min. The PCR products or amplicons were electrophoresed on 1.5% agarose gel containing 0.5 mg/mL ethidium bromide and photographed using a UV transilluminator (Omega G) to confirm that the PCR was successful and yielded the expected amplicon size. The generated PCR amplicons were prepared and sent to Genewiz LLC, New Jersey, USA, for DNA sequencing.
To limit sequencing errors, each sample was sequenced bidirectionally and the sequencing was performed twice.
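The reverse primer FR1d_t1 listed above contains the IUPAC ambiguity codes R (A or G) and Y (C or T), so it denotes a small pool of concrete oligonucleotides rather than a single sequence. The following minimal sketch enumerates that pool; the function name `expand_degenerate` and the restricted code table are illustrative assumptions and are not part of the laboratory protocol.

```python
from itertools import product

# IUPAC codes needed for the primers in this section (R and Y only);
# the table would have to be extended for other ambiguity codes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


# FR1d_t1 as written in the text (5' to 3'), with its three ambiguous positions.
FR1D_T1 = "CAG GAA ACA GCT ATG ACA CCT CAG GGT GTC CGA ARA AYC ARA A"
variants = expand_degenerate(FR1D_T1)
print(len(variants))  # three two-fold ambiguous positions give 2 * 2 * 2 = 8
```

Each of the eight variants is 43 nucleotides long and differs from the others only at the three ambiguous positions.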

Data analyses
A total of 44 sequences out of the 92 samples collected were validated and used for analyses. The sequencing results generated from the Applied Biosystems Genetic automated sequencer were carefully trimmed, edited, filtered and assembled using DNA Subway (Merchant et al. 2016). Sequences were translated to amino acids and examined for stop codons to ensure there was no pseudogene amplification. Also, multiple and pairwise alignments were done using ClustalW in BioEdit (Hall 1999; Bousalem et al. 2000; Chenna et al. 2003). The aligned sequences were subjected to phylogenetic tree reconstruction using Maximum Likelihood (ML), Kimura 2-parameter (K2P) (Kimura 1980) and p-distance procedures with a bootstrap test of 1000 replicates (Felsenstein 1981; Nei and Kumar 2000). The tree was drawn to scale, with branch lengths in the same units as those of the genetic distances used to construct the phylogenetic tree, with a sequence of Pentalonia nigronervosa as an outgroup. Codon positions included were 1st + 2nd + 3rd + Noncoding. All positions with less than 95% site coverage were eliminated. Also, phylogenetic diversity (PD), a measure of the relative feature diversity of different subsets of taxa from a phylogeny which supports the broad goal of biodiversity, conservation and evolutionary heritage (Faith 2015), was computed using Molecular Evolutionary Genetics Analysis version X (MEGA X) (Kumar et al. 2018). Genetic diversity distances based on K2P were also analysed to obtain intergeneric, congeneric and confamilial genetic distances using MEGA X. Other parameters including haplotypes, Fu and Li's D* test statistic and Tajima's D were computed using DnaSP version 5.10.01 (Librado and Rozas 2009). Tajima's D statistic was applied to test the neutrality of the haplotypes.
The statistic uses the nucleotide diversity (π) and the number of segregating sites (S) observed in a sample of DNA sequences to make two estimates of the scaled mutation rate, θ(S) and θ(π). A Tajima's D < 0 (θ(π) < θ(S)) indicates populations that have experienced a recent bottleneck effect. Multiple and pairwise alignments for detection of transitions and transversions were done using ClustalW in BioEdit software (Hall 1999; Bousalem et al. 2000; Chenna et al. 2003). Percentage similarity searches were run against the GenBank databases using the BLASTn option on the NCBI website.
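To make the relationship between θ(π), θ(S) and D concrete, the sketch below computes Tajima's D from a set of aligned sequences using the standard constants from Tajima (1989). It is an illustration only and is not the DnaSP implementation used in this study; the function name `tajimas_d` and the four toy sequences are hypothetical, and gaps or ambiguous bases would simply be treated as extra character states.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Return (theta_pi, theta_s, D) for equal-length aligned sequences."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least 4 sequences are needed for a non-zero variance")

    # S: alignment columns that show more than one character state.
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")

    # theta(pi): mean number of differences over all sequence pairs.
    pairs = list(combinations(sequences, 2))
    theta_pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # theta(S): Watterson's estimator, S divided by the harmonic number a1.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_s = s / a1

    # Variance constants of Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    d = (theta_pi - theta_s) / sqrt(e1 * s + e2 * s * (s - 1))
    return theta_pi, theta_s, d


# Toy alignment in which every variable site is a singleton.
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
theta_pi, theta_s, d = tajimas_d(toy)
print(theta_pi, theta_s, d)
```

In this toy alignment θ(π) is smaller than θ(S), so D is negative, which is the case described in the text.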

Phylogenetic reconstruction
Phylogenetic reconstruction had a branch length of 1.9896, and the percentages of replicate trees in which the associated sequences clustered together in the bootstrap test of 1000 replicates were shown next to the branches, with 658 positions in the final dataset (Fig. 2). Twenty major groups were identified with each species clustering at 100% bootstrap value, followed by an outgroup that was included to ensure accurate and distinct grouping. Group I consisted of upside-down catfish including UpDoCat22, UpDoCat23, CoastalUD24, UpDoCat25, UpDoCat26, UpDoCat27, UpDoCat37 and UpDoCat38. This group had species of fish with 100% bootstrap replications that were monophyletic (a group containing the most recent common ancestor of a given set of sequence taxa and all the descendants of that ancestor) at a subclade of 74% and clustered with Synodontis obesus.

Phylogenetic diversity
Phylogenetic diversity of each group without its respective reference sequences was computed (Fig. 3). The PD ranged from 0.0397 (group I) to 0.2147 (group XVIII). Group I consisted of UpDoCat22, UpDoCat23, CoastalUD24, UpDoCat25, UpDoCat26, UpDoCat27, UpDoCat37 and UpDoCat38, with a PD value of 0.0397. In group II, only RedTailedSy81 was detected, with 0.0397. Groups III, IV and V comprised AfricanButCf76, WiheadCf83 and UpDoCat79, with PD values of 0.1089, 0.1058 and 0.1276, respectively. Group VI consisted of seven fish sequences (Catfish12, Catfish14, Catfish17, Catfish18, Catfish19, Catfish20 and Catfish21) at 0.0661. Group VII further clustered seven sequences (AfricanCat5, Catfish10, Catfish11, Catfish46, Catfish87, CfHybrid90 and TropCfish91) that were resolved at a PD value of 0.0628. At 0.13544, ElectricF13 clustered in group VIII, while Tilapia29 and NileTilapia9 clustered in groups IX and X with the same PD value of 0.0812. In groups XI and XII, Cichlid74 and AfricanJeF73 produced 0.1053 and 0.1169, respectively. In group XIII, Tilapia1 and Tilapia77 had the same value of 0.1053. Groups XIV and XV consisted of Trunkfish2 and Trunkfish56 with the same PD value of 0.1426, while group XVI had AfricanKnF60 with 0.1679. In group XVII, five fishes (Dogfish4, Dogfish28, Dogfish30, Dogfish47 and Dogfish84) had a PD value of 0.1649. Also, groups XIX and XX had MoonFish55 and GrassEater72, with the same PD value of 0.1393, while ObscureSH6 in XVIII had 0.2147. Some of the groups shared PD values. For instance, groups I and II had the same PD value of 0.0397, where group I contained Synodontis obesus and group II contained Synodontis clarias. In groups IX (Oreochromis aureus) and X (Tilapia guineensis), a PD value of 0.0812 was common to the two groups of species. A phylogenetic diversity value of 0.1053 was identified in groups XI and XIII with Chromidotilapia guntheri and Hemichromis fasciatus, respectively.
Groups XIV (Mormyrus tapirus) and XV (Marcusenius cyprinoides) had a similar value of 0.1426, while groups XIX and XX yielded 0.1393. The two groups, XIX and XX, contained Citharinus sp. and Distichodus rostratus, respectively.
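As an illustration of how a PD value arises from a tree, the sketch below sums branch lengths for a chosen set of taxa on a small rooted tree. It assumes the commonly used rooted convention, in which every branch on the path from a taxon to the root is counted once; whether MEGA X applies exactly this convention is not stated here, and the tree, its branch lengths and the function name `faith_pd` are invented for the example.

```python
# Toy rooted tree: node -> (parent, length of the branch leading to the parent).
# Node names and branch lengths are invented for illustration.
TREE = {
    "A": ("n1", 0.10),
    "B": ("n1", 0.20),
    "n1": ("root", 0.05),
    "C": ("root", 0.30),
}


def faith_pd(tree, taxa):
    """Sum the lengths of all branches on the paths from the taxa to the root."""
    used = set()
    for taxon in taxa:
        node = taxon
        while node in tree:       # the root has no entry, so the walk stops there
            used.add(node)        # each branch is identified by its child node
            node = tree[node][0]
    return sum(tree[node][1] for node in used)


print(faith_pd(TREE, ["A", "B"]))       # branches A, B and n1
print(faith_pd(TREE, ["A", "B", "C"]))  # all four branches
```

Because shared branches are counted only once, adding a closely related taxon increases PD by less than adding a distantly related one.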

Genetic diversity distances based on Kimura 2-parameter
The highest genetic distance between species computed based on K2P was 37.00% (standard error, SE = 0.040), between groups 12 and 18 (Additional Table S1). Intergeneric genetic distances ranged from 15.80% to 37.00%. The highest intergeneric genetic divergence (37.00%) was detected between Hemichromis and Parachanna (groups 12 and 18), while the lowest value (15.80%) was between Synodontis and Schilbe (groups 2 and 3). For the congeneric distances, the values ranged from 6.9 ± 0.014 (groups 1 and 2) to 28.1 ± 0.038 (groups 1 and 13) with Synodontis as the existing genus (Table 2). Among the groups having the same genus, there were variations in their respective congeneric genetic distances. For instance, groups 1 and 12 had the same genus, Synodontis, but the congeneric distance of 26.10 ± 0.030 was higher than the one (6.90 ± 0.014) obtained from groups 1 and 2 but lower than the 28.10 ± 0.032 that was generated from groups 1 and 13. Groups 6 and 7, which shared the genus Clarias, yielded 11.9 ± 0.019, while groups 12 and 13, with Hemichromis as genus, produced 18.60 ± 0.025 as congeneric genetic distance. For confamilial genetic distances in percentages, the values ranged from 16.00 ± 0.014 (groups 9 and 10) to 25.7 ± 0.031 (groups 2 and 10) (Table 3). Each of the combined groups had a different confamilial genetic distance. Differently combined groups sharing the family Cichlidae had variable values. For instance, among the combined groups 1, 2, 9, 10, 11, 12 and 13 containing the same family of Cichlidae, the highest confamilial value was identified in groups 2 and 10. Groups 4 and 5 shared the family Claroteidae, with a confamilial genetic distance of 21.90 ± 0.027, while groups 14 and 15 shared the family Mormyridae, with a confamilial distance of 23.00 ± 0.030. Mean diversity in the entire population was 22.7 ± 0.019.
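For reference, the K2P distance between two aligned sequences follows from the proportions of transitional (P) and transversional (Q) differences as d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q) (Kimura 1980). The sketch below is a minimal pairwise implementation under that formula; it is not the MEGA X routine used for the values above, the function name `k2p_distance` and the toy sequences are hypothetical, and sites containing gaps or ambiguous bases are skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue                      # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1              # A<->G or C<->T
        else:
            transversions += 1            # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# One transition in ten sites: the corrected distance exceeds the raw 0.10.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(100 * d, 2))  # expressed as a percentage, as in the text
```

The distance is undefined when the arguments of the logarithms are not positive, which can happen for very divergent sequence pairs; the sketch lets `math.log` raise in that case.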

Haplotype analysis and nucleotide frequencies
A total of 42 haplotypes, H, with haplotype (gene) diversity, Hd of 0.999 ± 0.00003, and 115 mutations were identified among the sequences. Only two sequences (Catfish18 and Catfish21) were found in haplotype 13 (Hap_13), while the remaining sequences had a separate haplotype. Also, 389, 0.73721 and 286.776 were detected as parsimony informative sites, nucleotide diversity, Pi, and average number of pairwise nucleotide difference, K, respectively. Fu and Li's D* test statistic was 2.17427 and it was found statistically significant (P < 0.02). Computation of Fu and Li's F* test statistic yielded 1.7450 and was statistically determined (P < 0.005). Also, Tajima's D of 0.2424 was identified but not statistically significant at P > 0.10.
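Haplotype (gene) diversity is commonly computed with Nei's estimator, Hd = n/(n − 1) × (1 − Σ p_i²), where p_i is the frequency of haplotype i among n sequences. The sketch below applies it to a hypothetical sample of 44 labels in which exactly two are identical; the labels stand in for aligned sequences, the function name `haplotype_diversity` is an assumption, and this is not the DnaSP routine used in the study.

```python
from collections import Counter


def haplotype_diversity(sequences):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(sequences)           # identical sequences share a haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical sample: 42 singletons plus one haplotype carried by two sequences.
sample = [f"hap{i}" for i in range(42)] + ["shared", "shared"]
hd = haplotype_diversity(sample)
print(round(hd, 4))
```

A sample in which almost every sequence is unique therefore gives a value close to 1, in line with the order of magnitude reported above.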
In the present study, the COI amplified DNA fragments generated sequences with no presence of insertions, deletions or stop codons. Also, there was no identification of nuclear DNA sequences originating from mitochondrial DNA sequences. Average nucleotide frequencies detected were C

Generated fish sequence alignments for identification of variable sites
From the sequence alignments, there were genetic variations at a nucleotide level as determined at different consensus positions of the representative sequences (Additional file Fig. S1). Almost

Discussion
Use of the DNA barcoding approach with the COI gene for species identification has been well acknowledged and documented, especially in fishery (Kochzius et al. 2010; Ward 2012; Knebelsberger et al. 2014). DNA barcoding was demonstrated to be efficient in species identification, with a 100% success rate recorded in this study, and this corroborates other reports on DNA barcoding of fishes (Shen et al. 2016). Other studies revealed success rates from 90 to 95.60% (Hubert et al. 2008; April et al. 2011; Iyiola et al. 2018). The different species clustered into 20 groups at 100% bootstrap value, thereby demonstrating the unambiguous resolution and diagnostic utility of the COI gene as earlier reported (Shen et al. 2016; Persis et al. 2009). The congeneric and confamilial species were well resolved by the phylogeny. Ward et al. (2009) had earlier pointed out that the COI gene delineates boundaries of different species, and that there was an indication of distinct phylogeny resolution in COI sequences that was linked to the clustering of congeneric and confamilial species. Generally, all the sequences pertaining to each species were correctly grouped together, thereby demonstrating the potential of the COI gene in DNA barcoding for fishery identification and management (Tripathi 2011). Some of the identified fish species in our study have been previously reported in Nigeria (Nwani et al. 2011; Persis et al. 2009; Nwakanma et al. 2015; Falade et al. 2016). Phylogenetic diversity, a crucial diversity index that assesses community phylogenetic richness, is obtained by summing the branch lengths of the evolutionary tree that connect a set of taxa or individuals, that is, the branches on the corresponding minimum spanning path (Faith 1992; Faith and Baker 2006).
Applying rbcL DNA barcoding marker, comparison of species abundance for preservation of feature diversity through the use of PD has been documented in plants (Forest et al 2007) and also in the ecology of species to measure their richness using COI gene (Smith and Fisher 2009;Machac et al. 2011). In the present study, PD ranged from 0.0397 (Synodontis obesus) to 0.2147 (Parachanna obscura). Some of the tree branches of the fish species had similar values of PD, while some yielded variable values. There were different groups of fish species that exhibited identical values of PD (Synodontis obesus and Synodontis clarias, PD = 0.0397; Oreochromis aureus and Tilapia guineensis, PD = 0.0812; Chromidotilapia guntheri and Hemichromis fasciatus, PD = 0.1053; Mormyrus tapirus and Marcusenius cyprinoides, PD = 0.1426; Citharinus sp and Distichodus rostratus, PD = 0.1393), while other groups yielded variable values. This further illustrates the efficacy of this COI marker gene in distinguishing species and identifying relatedness based on their ancestral lineages. In mammals, the PD has been shown to be unevenly distributed across the globe (Davies et al. 2008;Schipper et al. 2008), and that hotspots of species richness might capture more PD than expected by chance (Sechrest et al. 2002;Spathelf and Waite (Nwani et al. 2011) and these differences could be attributable to the discrepancies in the number of individuals analyzed. There were variations within the intergeneric, congeneric and confamilial genetic distances thereby exposing the potential effectiveness of this marker in resolving species even within genus and family. We identified different ranges in genetic distances within the genera and families as 6.90-28.1% and 16.00-25.70%, respectively and these values are in agreement with the work of Bingpeng et al. (2018). 
It has been reported that DNA barcoding is a standardized approach that depends on the assumption that interspecies genetic distance or variability is greater than intraspecies variability (Hebert et al. 2003; Meyer and Paulay 2005). The highest genetic distance (interspecific divergence) obtained from the sequences was 37.00% and this is ten times higher than the one identified by Bingpeng et al. (2018) but supports the work of Iyiola et al. (2018). This implies that the genetic distance within the species is more than the one obtained from among them. This finding is in agreement with the earlier report where genetic variation within the population was found to be higher (Ren et al. 2017). The congeneric distance range (6.9–28.1%) obtained in this study is higher than the 8% identified in 35 freshwater fishes in Cuba (Lara et al. 2010) and the 10.29% from Ebonyi and Anambra States of Southeastern Nigeria (Nwani et al. 2011). The K2P intergeneric COI sequence divergence identified in this study ranged from 15.80 to 37.00%, and this range is slightly higher than the one (0.30–31.40%) reported by Iyiola et al. (2018), but it is in agreement with a previous report from COI (14.6–25.7%) and the internal transcribed spacer, ITS (32.8–35.0%) (Petrov et al. 2016). We obtained the highest intergeneric genetic divergence (37.00%) between Hemichromis and Parachanna, while the lowest value (15.80%) was found between Synodontis and Schilbe. This is in contrast with previous research that identified the highest value (31.30%) between Parachanna and Malapterurus, while the lowest (0.30%) was between Hyperopisus and Brienomyrus (Iyiola et al. 2018). This difference could possibly be linked to the nature of the fish species analyzed in the two studies. The identified range of percentage confamilial genetic distances (16.00 ± 0.014–25.7 ± 0.031) corroborates earlier reports of 20.4% from Cuban freshwater fishes (Lara et al. 2010), 15.38% from Canadian freshwater (Hubert et al. 2008) and 15.46% from Australian freshwater (Ward et al. 2005). The mean diversity of 22.7 ± 0.019 that was generated from the entire population is lower than the one (87.5 ± 0.089) reported by Persis et al. (2009), and this could be due to the heterogeneous nature of the latter.

Conclusion
This work has successfully demonstrated the utility of the COI gene in distinguishing even closely related species of fishes. The use of phylogeny, PD, BLAST analysis, and congeneric, intergeneric and confamilial K2P-based distance computations contributed to identifying and characterizing closely related species with much efficiency. Clariidae had the highest number of genera and families, and PD discriminated them on the basis of genetic divergence and ancestral linkages. Parachanna obscura in group XVIII was identified to be the most evolutionarily divergent, and PD further captured the shared ancestry of the fish species. Our results provided good insights into the phylogeny, genetic diversity, haplotypes and nature of the identified fish species in Enugu and Anambra States of Nigeria. The results obtained in this study can facilitate decision making and selection for biodiversity, breeding and conservation in fishery management.
Author contributions All authors were involved in the project design. GNU, DOI, CB, MJ, AB, OO, OCI, OC, MO, CE and VC did the literature search process, extracted data elements, and carried out study compilation. Data analyses were performed by DOI, MU and CO and reviewed by GNU, GA, JO and AD. DOI developed the first draft of the manuscript. All authors read the manuscript and approved the final copy of it.
Funding Author GNU received research grants (with Grant Number HRD-1438902) from National Science Foundation (NSF) to conduct this study for Targeted Infusion in Historically Black Colleges and Universities Undergraduate Program (HBCU_UP). The funder, NSF, provided fund to principal investigator (PI) and three undergraduate students to travel to Enugu and Anambra States of Nigeria for sample collection and DNA extraction. Also, reagents and DNA sequencing were paid for by the funder.
Data availability All data generated in this study are included in this published article [and its supplementary information files] and in GenBank (NCBI) [https://www.ncbi.nlm.nih.gov/genbank] where novel sequences with accession numbers MH746731 and MH776981-MH776988 were deposited.

Compliance with ethical standards
Conflict of interest The authors declare that they have no competing interests.
Ethical approval and consent to participate This work did not involve living animals and required no ethics approval. The fish samples were obtained dead, with the verbal consent of the local fish farmers, who are permitted by the local chiefs to use fishing nets to operate within the studied freshwater bodies.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.