Introduction

Taxonomically, fishes are regarded as only one of the vertebrate classes, but phylogenetically, they almost correspond to vertebrates except that the latter also includes tetrapods (Fig. 1). Molecular synapomorphy of this paraphyletic group has been sought in relation to water-to-land transition (Falcon et al. 2023), and this transition was accompanied by molecular factors secondarily gained in the tetrapod lineage, including genes encoding receptors for airborne odors (Wang et al. 2021) as well as gene regulation enabling fin-to-limb transition (Meyer et al. 2021). The fin-to-limb transition involved the loss of the group of genes responsible for fin ray formation, actinodin, unique to the tetrapod lineage (Zhang et al. 2010; Biscotti et al. 2016).

Fig. 1

Phylogenetic overview of different fish lineages. Phylogenetic relationships shown in this figure are based on OneZoom explorer (Wong and Rosindell 2022), and divergence times are based on the Timetree of Life project (Hedges et al. 2006). The Elopomorpha–Osteoglossomorpha clade is based on recent literature (Parey et al. 2023). Diamonds indicate whole-genome duplication(s) (WGD); those on gray triangles apply to only a subset of the taxon. The timings of the WGDs before 400 million years ago, as well as those in the sturgeon/paddlefish lineage, are still controversial (see Kuraku et al. 2023; Redmond et al. 2023). Parentheses include the numbers of described species as of July 2023 based on Eschmeyer’s Catalog of Fishes (Fricke et al. 2023)

When limited to extant species, fishes consist of several distinct evolutionary lineages, namely cyclostomes, chondrichthyans, actinopterygians, and sarcopterygians (coelacanths and lungfishes), which diverged more than 500 million years ago. Among these extant fish lineages, Actinopterygii comprises more than 27,000 described species, including sunfishes and sturgeons that exceed 3 meters in total length and the bigmouth buffalo Ictiobus cyprinellus that lives more than 120 years. Despite this deep divergence and the exceptional species richness of actinopterygian fishes, only a limited number of ‘model’ species are routinely used in life science research as highly accessible, laboratory-friendly systems (e.g., Braasch et al. 2015). Results from such laboratory systems are sometimes regarded as widely applicable to diverse ‘fishes’, in comparison with mammals or other vertebrates. The most heavily used fish models have traditionally been zebrafish and medaka, both of which are freshwater, short-lived, small-sized, annually reproducing species (Lleras-Forero et al. 2020). They cannot represent the whole of fish diversity, since species in other fish lineages sometimes exhibit dissimilar life history characters, such as enhanced longevity, viviparity or placentation, and elongated gestation time.

The phylogenetic distinction between the different ‘fish’ lineages should also be recognized widely in life science. For this purpose, this review summarizes the current understanding of how molecular-level properties of these fish lineages differ from each other based on emerging genome sequence data.

Emerging genome sequence resources: how close to ‘T2T’?

Recent efforts with massively parallel DNA sequencing methods have reached an increasing number of fish lineages. One recent landmark study with genome assemblies of several Elopomorpha species suggested a new teleost fish phylogeny proposing a new monophyletic group, Eloposteoglossocephala (Fig. 1; Parey et al. 2023). Even before the genome contents are analyzed, sequencing efforts and the subsequent quality control of their output already encounter the nonuniform characters of fish genomes, such as genome size, GC content, and repetitive element content (‘repetitiveness’). A scaffolding step, such as one employing Hi-C data, also informs us about the species’ karyotype (Yamaguchi et al. 2021a).

Telomere-to-telomere (T2T) sequencing stands for the complete determination of genomic DNA sequences from one telomeric end to the other for the individual chromosomes of a particular species. T2T sequencing of complex eukaryotic genomes was reported initially for a human cell line (Nurk et al. 2022), followed by chicken (Huang et al. 2023). Finishing a genome assembly to the T2T grade requires a suite of sequencing and scaffolding approaches, and it is instructive to recognize how close the existing resources are to this grade. Traditionally, researchers have relied on N50 lengths and ‘completeness scores’, the latter quantified by detecting evolutionarily conserved 1-to-1 orthologs, as metrics for evaluating genome assembly quality. However, the former can be inflated when erroneous assembly (‘overassembly’) is introduced, and the latter becomes saturated even with a genome assembly of suboptimal continuity and does not provide enough resolution for evaluation (Yamaguchi et al. 2021a). Instead, in the T2T era, the number and length of individual chromosomal sequences and the number and length of undetermined regions (‘gaps’) matter. It is also challenging to complete the sequencing of telomeric and centromeric regions; of these, telomeric sequences in particular can be validated by computationally scanning the ends of chromosome-scale sequences for the canonical telomeric repeat (TTAGGG)n.
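The scan of chromosome ends for the canonical telomeric repeat can be illustrated with a minimal Python sketch. The function names, the 1 kb window, and the threshold of ten consecutive copies are assumptions chosen for illustration only and are not taken from any of the cited studies; the reverse complement (CCCTAA)n is also searched because the repeat orientation depends on the strand and the chromosome end.

```python
import re

# Canonical vertebrate telomeric repeat and its reverse complement.
TELOMERE_FWD = "TTAGGG"
TELOMERE_REV = "CCCTAA"


def count_tandem_repeats(seq, unit, min_copies=2):
    """Return the largest number of consecutive copies of `unit` in `seq`."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    best = 0
    for match in pattern.finditer(seq):
        best = max(best, len(match.group(0)) // len(unit))
    return best


def telomere_status(chrom_seq, window=1000, min_copies=10):
    """Report whether each end of a chromosome-scale sequence carries a
    telomeric repeat array.

    `window` (bases inspected at each end) and `min_copies` are
    illustrative thresholds (assumptions), not published values.
    """
    seq = chrom_seq.upper()
    ends = {"start": seq[:window], "end": seq[-window:]}
    status = {}
    for name, end_seq in ends.items():
        copies = max(
            count_tandem_repeats(end_seq, TELOMERE_FWD),
            count_tandem_repeats(end_seq, TELOMERE_REV),
        )
        status[name] = copies >= min_copies
    return status
```

With a synthetic sequence such as `"CCCTAA" * 20 + "ACGT" * 500 + "TTAGGG" * 3`, the sketch flags only the start as telomere-bearing, since three copies at the other end fall below the chosen threshold. Real assemblies would call for tolerance of degenerate repeat units, which this sketch does not attempt.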

In fact, complete sequencing of fish genomes has been hindered for various reasons. One confounding factor in cyclostome genome sequencing is the somatic reorganization of the genome, termed programmed genome rearrangement (PGR) (in lampreys; Sémon et al. 2012) or chromosome elimination (in hagfishes; Nagao et al. 2023; reviewed in Smith et al. 2021). In addition, lamprey genomes exhibit high GC-content heterogeneity with GC-rich protein-coding regions (Smith et al. 2013) as well as a high interspersed repeat content derived partly from horizontal transfers of Tc1 transposable elements (Kuraku et al. 2012). Another typical difficulty is enlarged genome size, mainly for chondrichthyans and lungfishes (see below). Among fish lineages, the highest chance of complete genome sequencing lies in the teleost fish lineage. The quality metrics for currently available genome assemblies of laboratory ‘models’ and aquaculture targets in this lineage generally show high continuity and completeness (Fig. 2). Still, the latest genome assembly of the zebrafish Danio rerio (version GRCz11; released in 2017) contains far more than 10,000 gaps, including the longest one of precisely 800 kb (Fig. 2). The currently available medaka genome assembly contains far fewer (491) gaps (which are consistently 1 kb long), but nearly 900 sequences remain to be joined into the 24 chromosomal sequences (Fig. 2; available at https://utgenome.org/medaka_v2/#!Assembly.md).

Fig. 2

Statistics of selected teleost fish genome assemblies. Red bars show nuclear DNA contents estimated by Feulgen densitometry or flow cytometry analysis and karyotypes of individual species, wherever available (Gregory et al. 2007; Arai 2011). ‘Gaps’ denote runs of five or more undetermined bases. Gene-level completeness was evaluated by BUSCO v5 with the ortholog dataset Actinopterygii_odb10 (Seppey et al. 2019). The assemblies analyzed: medaka (Assembly version UT v2.2.4); zebrafish (GRCz11); fugu (fTakRub1.2); threespine stickleback (GAculeatus_UGA_version5); chub mackerel (fScoJap1.pri); gilthead seabream (fSpaAur1.1); channel catfish (Coco_2.0); Nile tilapia (O_niloticus_UMD_NMBU); Japanese eel (ASM2516954v1); common carp (ASM1834038v1); rainbow trout (USDA_OmykA_1.1); Atlantic salmon (Ssal_v3.1); and sterlet (ASM1064508v2)
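The gap criterion stated in this legend (at least five consecutive undetermined bases) can be expressed as a short Python sketch. The function names and the dictionary-based input are hypothetical choices for illustration; this is not the procedure actually used to produce Fig. 2, which is not described in the text.

```python
import re


def find_gaps(seq, min_len=5):
    """Return (start, length) for each run of at least `min_len` N bases."""
    pattern = re.compile("[Nn]{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq)]


def gap_summary(sequences, min_len=5):
    """Summarize gaps over a dict of {sequence name: sequence string}.

    The input format is an assumption; sequences would normally be
    parsed from a FASTA file beforehand.
    """
    lengths = []
    for seq in sequences.values():
        lengths.extend(length for _, length in find_gaps(seq, min_len))
    return {
        "n_gaps": len(lengths),
        "total_gap_bases": sum(lengths),
        "longest_gap": max(lengths, default=0),
    }
```

Counting gaps together with their lengths, rather than counting alone, distinguishes assemblies with many placeholder gaps of fixed size from those with a few long unresolved regions.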

Genome size

The ‘C-value enigma’ has attracted many researchers who expect an implicit association with some phenotypic traits, but such an association remains elusive (Gregory 2005). Among all extant fish lineages, lungfishes have the largest known genomes, exceeding 30 Gb (Vervoort 1980), followed by chondrichthyans, whose genome sizes sometimes exceed 10 Gb (reviewed in Kuraku 2021). The difficulty in obtaining live tissues, which provide single cells as materials, has prevented the measurement of nuclear DNA content, especially for elusive shark and ray species, although it can be circumvented by applying a quantitative PCR-based method that does not require live cells (Kadota et al. 2023). It was not until long-read DNA sequencers became popular that giant (e.g., >10 Gb) vertebrate genomes were sequenced. This began with the genome of the Mexican salamander Ambystoma mexicanum, whose resulting sequence assembly amounted to 32 Gb in length (Licht and Lowcock 1991; Nowoshilow et al. 2018). The Australian and West African lungfish genomes, whose currently available sequence assemblies amount to 40 and 35 Gb, respectively, were also sequenced to provide semi-chromosomal scaffolds, some of which exceed 1 Gb (Meyer et al. 2021; Wang et al. 2021). It should be noted that for these species with particularly enlarged genomes, genome size estimates independent of DNA sequence length have not been unequivocally obtained (e.g., by employing a reliable reference from other species) and remain to be consolidated as a control for the genome sequencing products.

Among actinopterygians, some of the sturgeon species that underwent successive whole-genome duplications (Du et al. 2020), as well as polypterids, have genome sizes larger than 3 Gb, while many of the others have genomes smaller than 2 Gb. Although more thorough cross-species comparisons are awaited, genome sizes among vertebrates are correlated with intron lengths (Hara et al. 2018). In particular, teleost fishes with small genomes have dramatically reduced intron lengths (Jakt et al. 2022). Short introns are thought to accelerate transcription and splicing (Swinburne and Silver 2008; Heyn et al. 2015; Keane and Seoighe 2016), which is implicated in rapid cellular activity, possibly leading to a short life (see below).

Whole-genome duplication

The recent accumulation of whole-genome sequences has finally encompassed all the extant vertebrate lineages, delivering an overview of which lineages experienced whole-genome duplication (WGD; Fig. 1). The non-teleost actinopterygian genomes have consolidated the phylogenetic position of the so-called teleost-specific genome duplication (TSGD; Thompson et al. 2021), whereas emerging chondrichthyan genomes support no additional WGD in this lineage over more than 450 million years (reviewed in Kuraku 2021). The most controversial issue is the mode of genome duplication close to the origin of vertebrates (reviewed in Kuraku et al. 2023). In the last century, hagfish and lamprey were regarded as having diverged before genome expansion occurred (Sidow 1996; Escriva et al. 2002). Later studies suggested more abundant gene repertoires in those jawless fish lineages even in the absence of whole cyclostome genome sequences (Putnam et al. 2008; Kuraku et al. 2009; Mehta et al. 2013). A series of genome analyses based on synteny conservation have suggested one round of WGD in the stem vertebrate lineage and a subsequent allotetraploidization in the stem gnathostome lineage (Simakov et al. 2020; Nakatani et al. 2021). Separately from these events, an additional WGD in the cyclostome lineage has been suggested by lamprey genome analysis (Nakatani et al. 2021), which should be ascertained by analyzing the hagfish genomes (Marlétaz et al. 2023b; Yu et al. 2023).

Karyotype

Another contrast between teleost fishes and other fish lineages is manifested in karyotypes and in the technical amenability to karyotyping studies. The karyotypes of teleost fishes usually harbor 44–54 chromosomes in diploid genomes, and their lengths are not dramatically variable (Arai 2011). For example, the medaka has 24 pairs of chromosomes (2n = 48) whose lengths vary only between 23 and 38 Mb, while zebrafish has 25 chromosome pairs (2n = 50) between 37 and 78 Mb. On the other hand, the karyotypes of non-teleost fishes often harbor more chromosomes, including those shorter than 20 Mb (see below for those called ‘microchromosomes’) and/or longer than 100 Mb (summarized in Yamaguchi et al. 2023b). Those short chromosomes tend to be GC rich and have high gene density. As a whole, such karyotypes, which exhibit a gradual continuum of chromosome lengths, accommodate intragenomic heterogeneity of genomic, transcriptomic, and epigenomic properties (Hara and Kuraku 2023). Remarkably, the different fish lineages exhibit distinct modes of implementing this intragenomic heterogeneity.

The common ancestor of teleost fishes experienced the TSGD (Fig. 1). A comparative genomic study between fishes, including the spotted gar, which is a non-teleost bony fish, and tetrapods indicates that fusions between microchromosomes and macrochromosomes resulted in 13 pairs of ancestral chromosomes that subsequently duplicated to 26 pairs (Braasch et al. 2016). After a WGD, sequence divergence between the duplicated genes resulting from the WGD (‘ohnologs’) gradually stops the recombination between homeologous chromosomes (Li et al. 2021). This evolutionary transition is known as the rediploidization step. The rediploidization step after the TSGD was coupled with loss of ohnologs and chromosome rearrangements that resulted in large karyotype differences between early diverged teleost lineages (Parey et al. 2022). For example, the chromosome rearrangements observed between zebrafish and medaka (Fig. 3) were introduced shortly after the divergence between Otomorpha and Euteleosteomorpha (Fig. 1). Later, inside the taxa Ostariophysi (including zebrafish) and Euteleostei (including medaka), the karyotypes became largely stable. Therein, not only the number of chromosomes but also the number of visible major chromosome arms (fundamental number) is highly conserved (Yoshida and Kitano 2021). Genome-wide comparative synteny analyses within these groups confirmed minimal chromosome fusion/fission events (Kasahara et al. 2007; Star et al. 2011; Kakioka et al. 2013; Chen et al. 2014). Additional independent WGDs are observed at least in Salmoniformes and some Cyprinidae fishes (Lien et al. 2016; Xu et al. 2019). They have exceptionally diverse karyotypes, presumably due to the lineage-specific rediploidization step after the WGD (Robertson et al. 2017). Thus, teleost fishes generally have stable karyotypes, and WGD has transiently driven their karyotype evolution.

Fig. 3

Cross-species karyotypic similarity in actinopterygian and chondrichthyan lineages. Similarity of genomic sequences for a pair of teleost fishes that diverged 224 million years ago (left) is visualized with diagonal lines, together with an elasmobranch species pair that diverged 270 million years ago (right). The divergence time estimates were obtained from the Timetree of Life project (Hedges et al. 2006). Chromosome-scale sequences are sorted by length from top to bottom and left to right. Diagonal lines are colored in accordance with sequence divergence levels (dark green 75–100%, light green 50–75%, orange 25–50%, yellow 0–25%). Note that some regions in the zebrafish or zebra shark genome sequences used as queries are highly repetitive and therefore cause horizontal arrays of similarity signals. In the parentheses are the total sequence lengths of the individual genome assemblies

Among non-teleost fishes, the chromosome organization in chondrichthyan genomes long remained unexplored because of the lack of chromosome-scale sequence information. One recent study based on whole-genome sequencing characterized the chromosome organization of the little skate Leucoraja erinacea and divided its chromosomes into three length categories, macro-, meso-, and microchromosomes (Marlétaz et al. 2023a). Notably, this species lacks a karyotype report, and these sequence-based findings remain to be verified. The comparison of the little skate genome with that of the zebra shark exhibited intrachromosomal breaks and few interchromosomal rearrangements (Fig. 3). The overall karyotypic organization of elasmobranchs has been maintained well since the shark–ray split, which traces back to approximately 270 million years ago (Fig. 3). In comparison, much lower cross-species similarity, with a considerable number of interchromosomal rearrangements, is observed between distantly related teleost fish lineages despite a relatively short divergence time (Fig. 3).

Karyotype reports are also missing for several other chondrichthyan species whose chromosome-scale genome sequences have been made available [smalltooth sawfish Pristis pectinata (Jarva et al. 2023), shortfin mako Isurus oxyrinchus (Stanhope et al. 2023), and elephant fish Callorhinchus milii (Nakatani et al. 2021)]. The lack of karyotype reports is attributed mainly to the difficulty in obtaining fresh tissue materials from which cell culture is performed for repeated experiments to consolidate reproducible results. Another critical obstacle is the unique body fluid composition of chondrichthyans, which is overcome by adding urea, NaCl, and TMAO to the culture medium for chondrichthyan cells to mimic the body fluid. This culture medium adaptation enabled karyotyping for four shark species (Uno et al. 2020), and the resulting karyotypes serve as indispensable references for validating whole-genome sequences. Karyotypic references are particularly important for chondrichthyans, which show high karyotypic variation and for which, unlike teleost fishes, no versatile substitute such as genetic linkage mapping is usually accessible. Despite its importance, this sort of effort has rarely been demonstrated (see Součková et al. 2023).

According to the karyotypes obtained in the above-mentioned cell culture-based study, as well as other existing reports (summarized in Stingo and Rocco 2001; Arai 2011), the chromosome numbers of sharks and rays vary dramatically from 2n = 28 (for Narcine brasiliensis; Donahue 1974) to 106 (for Chiloscyllium punctatum; Uno et al. 2020). Of those, even with a relatively deep divergence of more than 250 million years, the red stingray Hemitrygon akajei and the brownbanded bamboo shark C. punctatum share similar numbers of metacentric and subtelocentric chromosomes (Asahida et al. 1987; Uno et al. 2020). Karyotypes of elasmobranchs tend to show a continuum of chromosome lengths spanning from >100 Mb to <10 Mb. Within Chondrichthyes, holocephalans (chimaeras and ratfish) instead tend to have a dichotomy of chromosome length ranges between (1) nearly 100 Mb or longer (only 4–5 chromosomes) and (2) shorter than 50 Mb (Nakatani et al. 2021). Interestingly, there are closely related species whose chromosome numbers are similar but whose chromosome morphology and length differ significantly. For instance, the whale shark Rhincodon typus and the brownbanded bamboo shark have chromosome numbers of 102 and 106, respectively. The former species has eight meta- or submetacentric chromosome pairs, with the largest chromosome 3.6 times longer than the smallest one, while the latter species has 26 meta- or submetacentric chromosome pairs, with a largest-to-smallest length ratio of approximately 7 (Uno et al. 2020).

Many non-teleost karyotypes including those of birds and turtles exhibit a substantial variation of chromosome lengths (Burt 2002; International Chicken Genome Sequencing Consortium 2004; Waters et al. 2021). This observation prompted researchers to call their large and small components ‘macro-’ and ‘microchromosomes’, respectively (Ohno et al. 1969; Ohno 1970), but there is no consensus on their definition. For example, the smallest Australian lungfish chromosome (818 Mb) is much larger than the ‘macrochromosomes’ defined in the study of the little skate genome (>40 Mb) (Marlétaz et al. 2023a). Conversely, microchromosomes defined therein (2.5 to 20 Mb) are comparable to or sometimes larger than some of the largest chromosomes of Takifugu rubripes. In practice, the terms macrochromosome and microchromosome should thus be used to categorize chromosomes with different lengths within a karyotype, but not across different species’ karyotypes.
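The consequence of applying fixed length cutoffs across species can be made concrete with a small Python sketch. The default cutoffs follow the little skate values quoted above (macrochromosomes >40 Mb, microchromosomes up to 20 Mb); treating everything in between as ‘meso’ is an assumption made here for illustration, and the function names are hypothetical.

```python
def classify_fixed(length_mb, macro_min=40.0, micro_max=20.0):
    """Assign a length category using fixed cutoffs in Mb.

    Defaults follow the little skate cutoffs quoted in the text;
    labeling the intermediate range as 'meso' is an assumption.
    """
    if length_mb > macro_min:
        return "macro"
    if length_mb <= micro_max:
        return "micro"
    return "meso"


def length_span(lengths_mb):
    """Return the largest-to-smallest length ratio within one karyotype."""
    return max(lengths_mb) / min(lengths_mb)
```

Under these cutoffs, the smallest Australian lungfish chromosome (818 Mb) is labeled ‘macro’, whereas both extremes of the medaka chromosome length range (23 and 38 Mb) fall into the intermediate class, even though neither karyotype is being compared against its own length distribution. A within-karyotype measure such as `length_span` is one possible alternative that does not depend on cutoffs borrowed from another species.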

Sex chromosome organization

Sexes of most vertebrate species are determined genetically, and the determination mechanism, intensively explored in mammals, is triggered during embryonic development primarily by a master sex-determining (SD) gene (reviewed in Capel 2017). SD genes are harbored on sex chromosomes, which are derived from autosomal chromosome pairs and undergo turnovers at different frequencies in different evolutionary lineages. In Mammalia and Aves, the XY and ZW systems were established approximately 150 and 110 million years ago, respectively, and are the oldest among the vertebrates investigated so far (Graves 2016).

Recent years have witnessed elaborate studies on teleost fishes reporting a variety of SD genes (Kitano et al. 2024). Those teleost fish SD genes are often orthologous to sex determination genes identified in other vertebrates, including Dmrt1- and Sox3-related transcription factors, known as the SD genes in birds and mammals, respectively, or components of the TGF-β signaling pathway, such as Amh, Amhr2, Bmpr1b, Gsdf, and Gdf6 (Bertho et al. 2021; Rajendiran et al. 2021). The high variety of SD genes observed even within evolutionarily young taxa, for example in the genera Takifugu (Kabir et al. 2022) and Oryzias (Tanaka et al. 2007; Myosho et al. 2015), shows rapid SD gene turnover. In such species, sex chromosome pairs have accumulated few sequence changes, and recombination suppression, which is a hallmark of sex-determining regions, should be operating only in a small genomic segment harboring the sex-determining region.

Of approximately 1300 chondrichthyan species, sex chromosomes have been reported for only eight species, if limited to solid study cases with multiple individuals of both sexes (summarized in Uno et al. 2020). They are all batoids, namely, Hypanus sabinus, Platyrhinoidis triseriata, Potamotrygon aff. motoro, Potamotrygon falkneri, Potamotrygon motoro, Potamotrygon orbignyi, Potamotrygon wallacei, and Rhinobatos productus. The other existing reports are limited to sharks and include the large sex chromosomes of the leopard shark Triakis semifasciata, although this report was based on a single sex (Maddock and Schwartz 1996). We recently reported a heteromorphic sex chromosome pair in the brownbanded bamboo shark Chiloscyllium punctatum and the white-spotted bamboo shark Chiloscyllium plagiosum (Uno et al. 2020). Mass sequencing projects organized under the Earth BioGenome Project (Lewin et al. 2022) have released several genome assemblies of chondrichthyan species that include chromosomes labeled as X (and sometimes Y) in the NCBI Genome database as of July 2023; these labels, however, need to be verified by comparing male and female genomes.

So far, the only study that has associated a chondrichthyan sex chromosome with its DNA sequences was performed for the zebra shark Stegostoma tigrinum and the whale shark Rhincodon typus (Yamaguchi et al. 2023b). This study identified a chromosome with a conspicuously low relative sequencing depth (approximately 0.5-fold) for male versus female and designated it as the X chromosome. The zebra shark X chromosome has, on one end, a pseudoautosomal region (PAR) that shows a male–female ratio of sequencing depth comparable to that of autosomes, and this chromosome is homologous to the whale shark chromosome separately identified as the X chromosome. This co-occurrence suggests that this X chromosome originated before the divergence between these two shark species, as early as around 50 million years ago. Further studies on species in other shark lineages are awaited.
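The depth-based reasoning in this paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline of the cited study: the function names, the median-based normalization, and the 0.4–0.6 acceptance window are all hypothetical choices, and the input is assumed to be a mean read depth per chromosome for one male and one female sample.

```python
from statistics import median


def depth_ratio(male_depth, female_depth):
    """Return male/female ratios of normalized sequencing depth.

    Both inputs map chromosome name -> mean read depth. Each sample is
    scaled by its own median depth so that differences in sequencing
    effort cancel out (an assumed normalization for this sketch).
    """
    male_norm = median(male_depth.values())
    female_norm = median(female_depth.values())
    return {
        chrom: (male_depth[chrom] / male_norm) / (female_depth[chrom] / female_norm)
        for chrom in male_depth
        if chrom in female_depth
    }


def candidate_x(ratios, low=0.4, high=0.6):
    """List chromosomes whose male/female depth ratio lies near 0.5.

    The window [low, high] is an illustrative threshold, not a
    published value.
    """
    return sorted(c for c, r in ratios.items() if low <= r <= high)
```

With made-up depths in which one chromosome has half the male depth of the others, only that chromosome is returned as a candidate. Because a PAR shows an autosome-like ratio, delimiting it would require computing the same ratio in windows along the chromosome rather than as a whole-chromosome mean.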

Gene family evolution

A profound question is whether the variation of basic genomic characteristics, such as karyotype, genome size, and ploidy, is associated with the phenotypic traits. Some studies suggested the association of genome size with longevity (Griffith et al. 2003; also see Gregory 2004), body size, and/or habitat depth (Ebeling et al. 1971). Nonetheless, one solid line of examples for the association with phenotypes can be provided by studies focusing on particular gene families.

In the opsin gene family, teleost fishes tend to accumulate more gene copies through tandem gene duplications as well as the above-mentioned TSGD (Lin et al. 2017; Yamaguchi et al. 2021b). One striking example is the silver spinyfin Diretmus argenteus. This deep-sea dweller has unique visual opsin gene repertoires, that is, 38 rhodopsin (RHO or Rh1) gene duplicates, which are thought to cooperatively function for sensing dim light in the deep sea (Musilova et al. 2019). In contrast, chondrichthyans generally exhibit few duplications of opsin gene repertoires and instead experience more frequent secondary gene loss of visual opsins (Hart 2020; Yamaguchi et al. 2021b). Utilizing the reduced gene repertoires, visual adaptation has been achieved by spectral tuning sometimes involving amino acid substitutions that have not been described in other fish lineages (Yamaguchi et al. 2023a).

A possible link with varying phenotypes can also be sought in gene repertoires responsible for modulating water permeability. Herein, we focus on aquaporins (AQPs), which function mainly as water channels, and scrutinize the variation of their gene repertoires based on previous studies (Cerdà and Finn 2010; Tingaud-Sequeira et al. 2010; Finn et al. 2014). This gene family also exhibits an expansive nature of gene repertoires in the teleost fish lineage due to the TSGD and a conservative nature in the chondrichthyan lineage, except for a few tandem duplications (AQP3 and AQP10) in the latter lineage (Fig. 4). Also, AQP8, which is presumed to be involved in ammonia transport (Saparov et al. 2007), has not been identified in the genome sequences of chondrichthyan species except the spiny dogfish Squalus acanthias (Cutler et al. 2022).

Fig. 4

Aquaporin (AQP) gene repertoires in selected species in different fish lineages. Orthology, shown with a vertical positioning of the boxes, is supported by existing literature as well as our phylogenetic inference that will be reported elsewhere. Colored boxes with no number indicate the existence of only one ortholog. The numbers in the boxes indicate the multiplicity of the orthologs generated by lineage-specific gene duplications, while white boxes with dotted lines show the absence of possible ortholog in the currently available genome assemblies. At the top is the classification into different subfamilies including glpAQPs (aquaglyceroporins) based on existing literature (King et al. 2004; Finn et al. 2014). Red boxes indicate the lamprey genes that are potentially orthologous to multiple jawed vertebrate AQP subtypes

A more remarkable distinction is observed in the organization of Hox gene clusters, namely, the genomic regions harboring tandem copies of homeobox-containing genes responsible for regional specification of body segments along the anteroposterior axis. In general, a vertebrate Hox gene cluster harbors up to ten Hox genes that were tandemly duplicated before or around the time of bilaterian radiation, and the clusters are found on different chromosomes as a result of whole-genome duplications. Teleost fish Hox genes are usually contained in seven or eight clusters because of the TSGD (Kuraku and Meyer 2009), and the non-teleost species that underwent additional whole-genome duplication (acipenseriforms and cyclostomes) also have more than four clusters (Mehta et al. 2013; Pascual-Anaya et al. 2018; Du et al. 2020). Chondrichthyans tend to have conservative gene repertoires with a decreased molecular evolutionary rate throughout the genome (Fig. 1), which is also reflected in the high conservation of the organization of the Hox A, B, and D clusters (Hara et al. 2018). Exceptionally, the elasmobranch Hox C cluster underwent frequent gene loss and invasion of massive repeats (Hara et al. 2018), which increased its length beyond the 100 kb within which a Hox cluster typically fits (reviewed in Kuraku 2021). Interestingly, the eroding Hox C cluster is located in the PAR of the X chromosome in the zebra shark (Yamaguchi et al. 2023b). This Hox C cluster erosion is not observed in holocephalans, as indicated by as many as eleven persistent Hox C genes in a 100 kb-long cluster in the Callorhinchus milii genome (Ravi et al. 2009), and is thus confined to Elasmobranchii (Fig. 1; reviewed in Kuraku 2021). To the authors’ knowledge, the shark Hox C cluster is the only documented case of a vertebrate Hox cluster residing on a sex chromosome (Yamaguchi et al. 2023b). Its localization in the PAR, which should still maintain recombination between maternal and paternal chromatids, likely cancels the sex-biased dosage of Hox C gene expression. The possible relevance of the decreased constraint on maintaining the Hox cluster structure unique to this lineage needs to be explored in more detail.

Conclusions

Recent years have witnessed the arrival of modern sequencing technologies and genome informatic analyses at some previously missing lineages, including elasmobranchs and lungfishes, which has permitted more thorough among-lineage comparisons. Fish lineages, phylogenetically divided primarily into jawless, cartilaginous, actinopterygian, and sarcopterygian fishes, have distinct genomic organizations characterized by the size and ploidy level of the genome and by the size and number of chromosomes. In particular, variable contents of repetitive sequences and functional genes have been revealed to be associated with genomic organization, namely genome size and ploidy level, respectively. A phylogenetic understanding of the molecular-level variation among these different fish lineages is critical to addressing innumerable questions concerning the phenotypic variation of vertebrates and its evolutionary origins, such as what in the genome of their last common ancestor permitted the advent of vertebrates and what in the sarcopterygian lineage later led to the emergence of tetrapods.