Introduction

Fifteen years have passed since the publication of a novel approach for sequencing the whole mitochondrial genome (mitogenome) of fishes (ca. 16,500 bp) using a combination of the long PCR technique (Cheng et al. 1994) and a number of fish-versatile primers (Miya and Nishida 1999). During this period (1999–2014), our research group has assembled nucleotide sequences for ≈1,340 whole mitogenomes from fishes using this method (including unpublished ones) and published 83 peer-reviewed papers on mitochondrial genomics (mitogenomics) of fishes (including those papers based on partial mitogenome sequences >2,000 bp; Fig. 1). Most of those papers (76 papers) are more or less phylogeny oriented; the exceptions are five reports of single mitogenomes from commercially or zoologically important species (e.g., Japanese sardine, Inoue et al. 2000b), a paper addressing the highly conservative gene arrangement in vertebrate mitogenomes based on a comparative genomic approach (Satoh et al. 2010), and a paper describing a database specifically designed for fish mitogenomes (MitoFish) with a novel gene annotation pipeline (MitoAnnotator) (Iwasaki et al. 2013). This orientation towards phylogenetics apparently stems from a successful demonstration, at an early stage of the project, of the utility of mitogenomic data for resolving higher-level relationships of fishes (Miya and Nishida 2000b). Consequently, the “mitogenomic era” was born (Broughton 2010) and an unprecedented number of contributions to molecular phylogenetics and evolution of fishes have been published from this single research group. Although members of this research group comprise mainly Japanese professional scientists, postdocs and graduate students, the affiliations of the 151 coauthors of these papers are distributed across 17 different countries, representing truly international collaborative efforts.

Fig. 1

Approximate phylogenetic coverage of the 83 mitogenomic papers published from our research group. More taxonomically comprehensive studies, such as higher teleostean phylogenies of Miya et al. (2003), are placed on internal nodes, while those studies dealing with specific taxa (e.g., lophiiform, Miya et al. 2010) are placed on terminal nodes. Topology of the tree is based mostly on our recent studies (Miya et al. 2005; Nakatani et al. 2011; Campbell et al. 2013b; Chen et al. 2013; Miya et al. 2013) supplemented by those studies based on multiple nuclear genes (Near et al. 2012; Wainwright et al. 2012; Betancur et al. 2013a; Broughton et al. 2013; Near et al. 2013). Terminal taxa are ordinal names taken from Nelson (2006) for actinopterygians supplemented by our recent studies (Miya et al. 2005; Miya et al. 2007; Lavoué et al. 2008b; Miya et al. 2013). Exceptions are those of the Percomorpha, in which all terminal taxa are taken from a revised classification of bony fishes adopted by DeepFin (http://deepfin.org). Numerals indicate those 83 studies according to the temporal sequence of the publications: 1 Miya and Nishida (1999); 2 Inoue et al. (2000b); 3 Miya and Nishida (2000b); 4 Inoue et al. (2000a); 5 Inoue et al. (2001a); 6 Ishiguro et al. (2001); 7 Inoue et al. (2001c); 8 Kawaguchi et al. (2001); 9 Inoue et al. (2001b); 10 Miya et al. (2001); 11 Lee et al. (2001); 12 Inoue et al. (2003a); 13 Miya et al. (2003); 14 Saitoh et al. (2003); 15 Ishiguro et al. (2003); 16 Inoue et al. (2003b); 17 Simmons et al. (2004); 18 Simmons and Miya (2004); 19 Inoue et al. (2004); 20 Yamanoue et al. (2004); 21 Mabuchi et al. (2004); 22 Minegishi et al. (2005); 23 Inoue et al. (2005); 24 Miya et al. (2005); 25 Ishiguro et al. (2005); 26 Lavoué et al. (2005); 27 Miya et al. (2006); 28 Yamanoue et al. (2006); 29 Mabuchi et al. (2006); 30 Satoh et al. (2006); 31 Saitoh et al. (2006); 32 Mabuchi et al. (2007); 33 Lavoué et al. (2007); 34 Mayden et al. (2007); 35 Yamanoue et al. (2007); 36 Miya et al. (2007); 37 Kawahara et al. (2008); 38 Mayden et al. (2008); 39 Lavoué et al. (2008b); 40 Yamanoue et al. (2008); 41 Azuma et al. (2008); 42 Setiamarga et al. (2008); 43 Lavoué et al. (2008a); 44 Yamanoue et al. (2009b); 45 Kawahara et al. (2009); 46 Yamanoue et al. (2009a); 47 Johnson et al. (2009); 48 Inoue et al. (2009); 49 Mayden et al. (2009); 50 Yagishita et al. (2009); 51 Poulsen et al. (2009); 52 Setiamarga et al. (2009); 53 Takada et al. (2010); 54 Miya et al. (2010); 55 Doosey et al. (2010); 56 Hirayama et al. (2010); 57 Inoue et al. (2010c); 58 Lavoué et al. (2010); 59 Tang et al. (2010); 60 Yang et al. (2010); 61 Inoue et al. (2010b); 62 Satoh et al. (2010); 63 Yamanoue et al. (2011); 64 Saitoh et al. (2011); 65 Lavoué et al. (2011); 66 Nakatani et al. (2011); 67 Tang et al. (2011); 68 Johnson et al. (2012); 69 Aschliman et al. (2012); 70 Lavoué et al. (2012b); 71 Lavoué et al. (2012a); 72 Yang et al. (2012a); 73 Yang et al. (2012b); 74 Imoto et al. (2013); 75 Lavoué et al. (2013); 76 Poulsen et al. (2013); 77 Tang et al. (2013); 78 Miya et al. (2013); 79 Campbell et al. (2013b); 80 Iwasaki et al. (2013); 81 Song et al. (2014); 82 Lavoué et al. (2014); 83 Campbell et al. (2014)

Taxonomic coverage of those studies is extensive (Fig. 1), encompassing three of the four major fish lineages from chondrichthyans (e.g., Inoue et al. 2010b; Aschliman et al. 2012) to various actinopterygians (e.g., Inoue et al. 2003a; Ishiguro et al. 2003; Miya et al. 2003; Saitoh et al. 2003) and even to sarcopterygians (coelacanths, Inoue et al. 2005). Notably, our research group has never performed de novo sequencing for mitogenomes from agnathans; however, Inoue et al. (2010b) used two agnathans as outgroups in their study of holocephalan evolutionary history. Thus, the temporal dimension of our studies is well over 450 million years, considering that a common ancestor of those four vertebrate lineages emerged in the Paleozoic (Benton et al. 2009). In terms of the spatial dimensions, our phylogenetic studies cover the global habitat diversity, including freshwaters on major continental landmasses (e.g., Otophysi, see Saitoh et al. 2003; Nakatani et al. 2011), coral reefs in tropical and subtropical waters (e.g., Labridae, see Mabuchi et al. 2004, 2007), open oceans (e.g., Pelagia, Miya et al. 2013), deep-sea midwaters (e.g., Myctophiformes, see Poulsen et al. 2013), demersal waters (e.g., Macrouridae, see Satoh et al. 2006), and combinations across those habitats (e.g., diadromous Anguillidae, see Minegishi et al. 2005).

As of 30 September 2014, those 83 mitogenomic papers have been cited 5,303 times in total, with the most frequently cited paper (Miya et al. 2003) reaching 543 citations, followed by 14 papers with ≥100 citations (Google Scholar: http://scholar.google.co.jp; Table 1). A number of the mitogenomic papers have been cited in standard textbooks in ichthyology, such as “Fishes of the World” (Nelson 2006) and “The Diversity of Fishes” (Helfman et al. 2009), and even in textbooks of other disciplines, such as “Vertebrate Palaeontology” (Benton 2009) and “Evolution” (Futuyma 2013). In the 4th edition of “Fishes of the World,” Nelson (2006) mentioned our research activities, stating “… Some of the new, exciting molecular work is being done in the laboratories of … Masaki Miya in Chiba, Japan; and Mutsumi Nishida in Tokyo, Japan. These workers and their colleagues as well as many others are finding support for new clades that must be taken seriously, especially in those areas where agreement is found when different approaches are taken.” In addition to those citations in the textbooks, our article demonstrating a sister group relationship between pike and salmon (Ishiguro et al. 2003) was featured in a review article of Trends in Ecology & Evolution for discussing the evolutionary origin of anadromous migrations of salmons (Ramsden et al. 2003); our recent discovery of “three deep-sea fish families into one” (Johnson et al. 2009) was highlighted in a “News & Views” section in Nature (Howlett 2009); and our discovery of deep-ocean origin of the freshwater eels was highlighted in a “Feature” section in Current Biology (Williams 2010).
Beyond those scientific impacts, some of our recent papers have attracted considerable attention from various media outlets, such as popular science magazines, newspapers, radio, and television [for a list of the media coverage of our studies, see the laboratory website of one of the authors (MM): https://sites.google.com/site/masakimiyalab/media-coverage].

Table 1 The top 15 mitogenomic papers with ≥100 citations as of 30 September 2014

This review summarizes the achievements and novel perspectives that our mitogenomic project has brought to molecular phylogenetics and evolution of fishes. Although the review is necessarily a highly personal retrospective on our own studies, we have attempted to describe those achievements and perspectives in an objective manner, with several examples of criticisms from other studies. Finally, we briefly comment on the potential directions of future mitogenomics of fishes in the nuclear genomic age.

Phylogenetic background

Before reviewing the mitogenomics of fishes, we summarize a phylogenetic background prior to the inception of our project following Miya and Nishida (2000b). We use phylogenetic history of teleosts as an example because it is well documented and the teleosts cover a major portion of fish diversity, comprising over 26,840 extant species (about 96 % of all extant fishes) placed in 40 orders, 448 families, and 4,278 genera (Nelson 2006).

Earlier classifications of the teleosts (e.g., Berg 1940) simply reflected the recognition of a series of evolutionary “grades,” starting from a morphologically primitive group (e.g., Isospondyli), through intermediate steps, toward the most advanced forms (e.g., Percomorpha, see Nelson 1989). Following the publication of the seminal work by Greenwood et al. (1966) and the advent of cladistic theory (Hennig 1966; Wiley 1981), those evolutionary grades have been replaced by the monophyletic groups (“clades”), which are explicitly recognized through searches for the shared derived characters (“synapomorphies”). One early example of an important synapomorphy for a group of fishes was the recognition of the possession of leptocephalus larvae among members of the Elopomorpha (Greenwood et al. 1966). The influence of cladistics on fish systematics has been profound, with numerous comparative anatomical studies having been conducted in attempts to circumscribe monophyletic groups for both the basal (Rosen 1973, 1974, 1985; Fink and Weitzman 1982; Lauder and Liem 1983; Fink 1984; Begle 1991, 1992; Johnson 1992; Johnson and Patterson 1996; Lecointre and Nelson 1996) and the derived (Lauder and Liem 1983; Stiassny 1986; Stiassny and Moore 1992; Johnson 1993; Johnson and Patterson 1993) groups of teleosts based on nested sets of synapomorphies used to resolve their relationships. Despite the efforts of these early phylogeneticists, there remained much controversy over higher-level relationships among the major teleostean lineages. This controversy was especially evident in the transitions of the classification systems adopted in the first through fourth editions of “Fishes of the World” (Nelson 1976, 1984, 1994, 2006).

As independent lines of evidence, molecular characters were expected to be decisive in resolving persistent controversies in the teleostean phylogeny (Nelson 1989), although they had not fulfilled their promise before the late 1990s (Stepien and Kocher 1997). Indeed, an early study using partial amino acid sequences from three mitochondrial protein-coding genes (Normark 1991) suggested an unorthodox tree (e.g., nonmonophyletic teleosts) that Patterson et al. (1993) criticized as “goofy” from the morphologist’s point of view. Later, Bernardi et al. (1993) and Rubin and Dores (1995) analyzed amino acid sequences of growth hormone on the basis of different data sets including 25 and 24 teleosts, respectively, comparing them with those from two outgroups, and reported that the resulting maximum parsimony tree agreed well with the morphology-based tree. Lê et al. (1993) analyzed nuclear 28S rRNA gene sequences from 31 gnathostomes (including 18 teleosts) and found two highly supported nodes within the teleosts (Osteoglossomorpha + Elopomorpha and Clupeomorpha + Ostariophysi) that were incongruent with the morphology-based tree (Nelson 1989). Using entire mitochondrial cytochrome b (cyt b) gene sequences from 31 fishes (including 30 teleosts), Lydeard and Roe (1997) reported that the resulting maximum parsimony trees were largely congruent with the morphology-based tree, although some incongruities (e.g., paraphyletic Neoteleostei) were observed. However, with the exception of the sister relationship between Clupeomorpha and Ostariophysi (see Lê et al. 1993), no novel molecular phylogenetic hypothesis had a significant impact on those studies dealing with higher-level relationships among major teleostean lineages in the late 1990s (Johnson and Patterson 1996; Lecointre and Nelson 1996).
Furthermore, some workers demonstrated the limits of partial mitochondrial rRNA (Ortí and Meyer 1997) and cyt b (Meyer 1994) genes in resolving higher-level relationships of teleosts.

It appears that adequate resolution of higher-level relationships in any organism requires longer DNA sequence reads, because historical signals are directional (= additive) by definition and expected to overcome nondirectional (= random) noise as more data are collected (Miya and Nishida 2000b). If so, what types of genes or genomes should we first explore for their usefulness in reconstructing higher-level relationships of teleosts? We chose the mitogenome, because it has many advantages as a phylogenetic marker such as maternal inheritance and haploidy, and the resulting short coalescence time (Avise 1994; Moore 1995; Curole and Kocher 1999; Springer et al. 1999). In mammals, for instance, the transition from an apparently unsolvable to a solvable problem came about mainly from the availability of complete mitochondrial DNA (mtDNA) sequences (Penny et al. 1999). There was, however, no simple and easy approach for sequencing entire mitogenomes in the late 1990s, and we needed to develop a novel method ourselves.

Development of the novel sequencing method for fish mitogenomes

In an earlier stage of molecular phylogenetic studies of fishes, we published three papers on the deep-sea stomiiform fish genera Cyclothone (see Miya and Nishida 1996, 1997) and Sternoptyx (see Miya and Nishida 1998) based on partial sequences from the mitochondrial ribosomal genes (12S and 16S rRNAs). Concurrently, we attempted to expand those three studies to obtain the entire stomiiform phylogeny by sampling more molecular characters from additional taxa distributed across five families and 53 genera with about 400 species (Nelson 2006). Thus, our explorations of molecular phylogenetics and evolution of fishes had a very limited scope at the initial stage, involving only a single group of deep-sea fishes (Stomiiformes).

For the character sampling, we chose the mitochondrial cyt b gene sequences and attempted to amplify those sequences (approximately 1,150 bp) from seven gonostomatid species of Sigmops and Gonostoma (see Miya and Nishida 2000a) using two primers designed on the two flanking transfer RNA genes (tRNAGlu and tRNAThr). The PCR product from one species (Sigmops gracile), however, was unexpectedly small (approximately 250 bp), and subsequent direct sequencing revealed that tRNAPro, instead of cyt b, lay between these two tRNA genes (Fig. 2). Further PCR and sequencing experiments demonstrated that the cluster of these three tRNAs (tRNAGlu, tRNAPro, tRNAThr) was adjacent to the 3′ end of the cyt b gene, suggesting that tRNA rearrangements relative to the latter have occurred in S. gracile (see Miya and Nishida 1999). Such gene rearrangements had not been previously reported for any vertebrates at that time (see Macey 1997) and prompted us to sequence the whole mitogenome for this species (Miya and Nishida 1999).

Fig. 2

Schematic representation of the mitogenome from Sigmops gracile (see Miya and Nishida 1999). The entire mitogenome is amplified using two long PCRs (gray lines with arrowheads 16S rRNA–cyt b and cyt b–16S rRNA gene regions) and the two long PCR products are used as templates for subsequent short PCRs using 30 pairs of fish-versatile primers that amplify contiguous, overlapping segments of the entire mitogenome (denoted by broken lines inside the gene map)

For sequencing the whole mitogenome (ca. 16,500 bp), purification of the intact mtDNA is an essential step for avoiding erroneous sequencing of mitochondrial pseudogenes in the nuclear genome (Dowling et al. 1996). We found that the long PCR technique (Cheng et al. 1994) was an easier and more effective method of genomic purification from such small deep-sea fish (<120 mm in standard length) than traditional methods, such as physical isolation of mitochondria from the fish tissue (Miya and Nishida 1999). After a period of trial and error, we finally obtained excellent amplifications of the entire mitogenome from Sigmops gracile in late 1997, with two long PCR reactions using two sets of species-specific primers on the 16S rRNA and cyt b genes (Fig. 2). Those two long PCR products (approximately 12 kb and 6 kb in length) overlapped each other; together they were regarded as the purified whole mitogenome and used as templates for subsequent DNA sequencing.

For massive sequencings from across various fish taxa, traditional methods, such as digestion with a restriction enzyme and subsequent cloning, were labor-intensive and time-consuming. Accordingly, we designed 30 sets of fish-versatile PCR primers with reference to the aligned whole mitogenome sequences from seven fish species available at that time (loach, Tzeng et al. 1992; carp, Chang et al. 1994; trout, Zardoya et al. 1995; cod, Johansen and Bakke 1996; bichir, Noack et al. 1996; lungfish, Zardoya and Meyer 1996; coelacanth, Zardoya and Meyer 1997), which cover the entire mitogenome in a series of partially overlapping fragments. Fortunately, those newly designed PCR primers worked very well for the species (Sigmops gracile) and successfully amplified contiguous, overlapping segments of the entire genome using the two long PCR products as templates (Fig. 2). We then assembled the whole mitogenome from direct sequencings for those 30 short PCR products.
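The tiling logic described above (two overlapping long PCR products, then a series of partially overlapping short amplicons) can be illustrated with a small coverage check on a circular genome. The following Python sketch is only an illustration under stated assumptions: the amplicon coordinates, the 650 bp amplicon size, and the function names are hypothetical and are not taken from the original protocol; only the genome size (ca. 16,500 bp), the approximate long PCR product sizes (12 kb and 6 kb), and the number of short amplicons (30) come from the text.

```python
def coverage_depth(amplicons, genome_length):
    """Count how many amplicons cover each base of a circular genome.

    amplicons: list of (start, length) pairs; start is a 0-based position and
    an amplicon may run past the last base and wrap around to position 0.
    """
    depth = [0] * genome_length
    for start, length in amplicons:
        for offset in range(min(length, genome_length)):
            depth[(start + offset) % genome_length] += 1
    return depth


def tiles_whole_genome(amplicons, genome_length):
    """True if every base is covered by at least one amplicon."""
    return min(coverage_depth(amplicons, genome_length)) >= 1


GENOME_LENGTH = 16_500  # approximate mitogenome size quoted in the text

# Hypothetical coordinates for two long PCR products of about 12 kb and 6 kb.
long_products = [(2_000, 12_000), (13_500, 6_000)]

# Hypothetical tiling: 30 short amplicons of 650 bp, one starting every 550 bp.
short_products = [(i * 550, 650) for i in range(30)]

long_depth = coverage_depth(long_products, GENOME_LENGTH)
overlap_bases = sum(1 for d in long_depth if d >= 2)  # bases covered twice

print(tiles_whole_genome(long_products, GENOME_LENGTH))
print(tiles_whole_genome(short_products, GENOME_LENGTH))
print(overlap_bases)
```

With these assumed coordinates, removing any one short amplicon leaves a gap, which is why each primer pair in such a design has to work for the assembly to close.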

This simple and quick approach (a combination of long and short PCRs with fish-versatile primers) has revolutionized the mitogenomics of fishes, enabling a rapid development in this field. With this novel technique, our research group has published a series of five papers describing whole mitogenomes from commercially or zoologically important fish species (Japanese sardine, Inoue et al. 2000b; Japanese eel, Inoue et al. 2000a; Japanese anchovy, Inoue et al. 2001b; ayu, Ishiguro et al. 2001; mangrove rivulus, Lee et al. 2001) and has assembled ≈1,340 whole mitogenome sequences as of 30 September 2014 (including unpublished ones). We abandoned the strategy of publishing single-mitogenome descriptions at an earlier stage of the project, because our major interests were not in the descriptions of individual fish mitogenomes, but in their utility as phylogenetic marker sets for resolving higher-level relationships of fishes. Interestingly, Sorenson et al. (1999) concurrently developed a similar PCR-based approach to bird mitogenomes, and their research group has published a number of phylogenetic studies of birds (e.g., Mindell et al. 1999). We also applied this approach to invertebrates, such as decapods (Yamauchi et al. 2004a), copepods (Machida et al. 2002), insects (Yamauchi et al. 2004b), and lancelets (Nohara et al. 2005).

Use of mitogenomic data in molecular phylogenetics of fishes

It appears that adequate resolution of higher-level relationships in any organism requires longer DNA sequence reads (Miya and Nishida 2000b). In this regard, our PCR-based approach for fish mitogenomes (Miya and Nishida 1999) has overcome the technical difficulties in obtaining a number of the longer sequences from a wide variety of taxa. However, the usefulness of the whole mitogenome sequences in molecular phylogenetics of fishes was unknown and we had to demonstrate the phylogenetic utility and limits of the mitogenomic data before further sequencing efforts with this PCR-based method.

Miya and Nishida (2000b) addressed this issue using the whole mitogenome sequences of eight teleosts (including three newly determined sequences), whose relative phylogenetic positions are noncontroversial (Fig. 3). Maximum parsimony (MP) analyses of the nucleotide and amino acid sequences of 13 protein-coding genes from eight teleosts, plus two outgroups (a bichir and a shark), indicated that all of the individual protein-coding genes, with the exception of ND5, failed to recover the expected phylogeny, although unambiguously aligned sequences from 22 concatenated transfer RNA (tRNA) genes (stem regions only) recovered the expected phylogeny successfully with moderate statistical support (Miya and Nishida 2000b), consistent with a suggestion of Kumazawa and Nishida (1993). Miya and Nishida (2000b) also observed that the combined data sets comprising nucleotide sequences from the several protein-coding genes with higher phylogenetic performance (no third codon positions) plus the 22 concatenated tRNA genes (stem regions only) best recovered the expected phylogeny with all internal branches being supported by bootstrap values ≥90 %.
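The bootstrap values mentioned above come from resampling alignment columns with replacement and asking how often a given branch is recovered. The following Python sketch shows only the resampling and counting steps; it is not the software used in the original study. The function names are hypothetical, and `clade_is_recovered` is an assumed user-supplied callback that stands in for a full tree search on each pseudoreplicate. The default of 500 replicates follows the figure captions in this review.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudoreplicate).

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    """
    taxa = list(alignment)
    n_sites = len(alignment[taxa[0]])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {t: "".join(alignment[t][i] for i in picks) for t in taxa}


def bootstrap_support(alignment, clade_is_recovered, replicates=500, seed=0):
    """Percentage of pseudoreplicates in which the callback reports the clade.

    clade_is_recovered is a hypothetical callback: it should build a tree
    from the resampled alignment and return True if the clade is present.
    """
    rng = random.Random(seed)
    hits = sum(
        bool(clade_is_recovered(bootstrap_alignment(alignment, rng)))
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates


# Toy alignment (made-up sequences) to show the shape of the data.
toy = {"taxonA": "ACGTAC", "taxonB": "ACGTTC", "taxonC": "TCGAAC"}
replicate = bootstrap_alignment(toy, random.Random(1))
print({taxon: len(seq) for taxon, seq in replicate.items()})
```

A branch reported with a bootstrap value ≥90 % is one that the callback would report in at least 90 % of such pseudoreplicates.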

Fig. 3

Basal teleostean relationships (a) following Nelson (1994). Miya and Nishida (2000b) chose eight teleosts (closed and open circles) from the four major lineages, whose relative phylogenetic positions are noncontroversial. They performed maximum parsimony analysis with various weighting schemes to evaluate the phylogenetic utility of the mitogenomic data by comparing the recovery rate of the expected phylogeny (b) among different data sets. Redrawn from Miya and Nishida (2000b)

Based on comparisons of the recovery rates of the expected phylogeny (Fig. 3) and statistical support for the internal branches among individual and concatenated genes, Miya and Nishida (2000b) concluded that judicious choice of mitochondrial genes and appropriate data weighting (including or excluding transitional and transversional changes at the third codon positions), in conjunction with purposeful taxonomic sampling to bisect long branches (Hillis 1998), are prerequisites for resolving higher-level relationships in teleosts under the maximum parsimony optimality criterion. Kawaguchi et al. (2001) reconfirmed this conclusion with an additional whole mitogenome sequence from an aulopiform, Aulopus japonicus, which occupies an important phylogenetic position between the basal and higher euteleosts. However, in our subsequent studies, we have had to concatenate all genes in our phylogenetic analyses, because there is no a priori reason to include or exclude specific genes (except for the ND6 gene owing to its distinct heterogeneous base composition).
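The two data-handling choices discussed above, excluding third codon positions and concatenating genes while leaving out ND6, can be sketched in a few lines of Python. This is a minimal illustration with made-up toy sequences, assuming each gene alignment is in frame and starts at a first codon position; the function names and data layout are hypothetical and do not reproduce the authors' actual pipeline.

```python
def drop_third_positions(aligned_cds):
    """Keep first and second codon positions of an in-frame aligned CDS."""
    return "".join(base for i, base in enumerate(aligned_cds) if i % 3 != 2)


def concatenate_genes(alignments, gene_order, excluded=()):
    """Concatenate per-gene alignments into one supermatrix.

    alignments: dict gene -> {taxon: aligned sequence}.
    Only taxa present in every retained gene are kept.
    """
    genes = [g for g in gene_order if g not in excluded]
    taxa = sorted(set.intersection(*[set(alignments[g]) for g in genes]))
    return {t: "".join(alignments[g][t] for g in genes) for t in taxa}


# Toy in-frame alignments (made-up sequences, two codons per gene).
raw = {
    "ND1": {"taxonA": "ATGGCC", "taxonB": "ATGGCT"},
    "COI": {"taxonA": "ATGTTC", "taxonB": "ATGTTT"},
    "ND6": {"taxonA": "ATGAAA", "taxonB": "ATGAAG"},
}

# Remove third codon positions gene by gene, then concatenate without ND6.
filtered = {
    gene: {taxon: drop_third_positions(seq) for taxon, seq in seqs.items()}
    for gene, seqs in raw.items()
}
supermatrix = concatenate_genes(filtered, ["ND1", "COI", "ND6"], excluded=("ND6",))
print(supermatrix)
```

In this toy case the two taxa differ only at third codon positions, so the filtered supermatrix is identical for both, which shows how this kind of weighting discards fast-evolving sites.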

It should be noted that comparisons of phylogenetic performance of the mitogenomic data had to be performed under the maximum parsimony (MP) optimality criterion in Miya and Nishida (2000b). Before the advent of the Bayesian (BA) inference implemented in MrBayes (Ronquist and Huelsenbeck 2003) and subsequent developments of a fast maximum likelihood (ML) algorithm implemented in RAxML (Stamatakis 2006), the MP method was the only character-based approach applicable to large data sets in the early 2000s. Our research group has employed those model-based phylogenetic inferences (BA and ML) for large data sets (including ≥25 species) since Inoue et al. (2003a, 2004) and Simmons and Miya (2004), and has not performed MP analysis since then, because MP is generally more sensitive to long branch attraction issues than ML (e.g., Philippe et al. 2005).
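For readers unfamiliar with the MP optimality criterion, the score being minimized is the number of character-state changes a tree requires. The following Python sketch computes that score for a fixed rooted binary tree using Fitch's algorithm with equal weights. It is a textbook illustration, not the software used in the studies reviewed here; the tree encoding (nested tuples), function names, and toy sequences are assumptions made for the example.

```python
def fitch_site(tree, states):
    """Fitch pass for one alignment column on a rooted binary tree.

    tree: a taxon name (leaf) or a (left, right) tuple of subtrees.
    states: dict taxon -> character state at this column.
    Returns (candidate state set at this node, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over all columns (equal weights)."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Toy data (made-up): the shorter tree is preferred under MP.
toy_alignment = {"A": "AAG", "B": "AAG", "C": "CTG", "D": "CTA"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

length_1 = parsimony_length(tree_1, toy_alignment)
length_2 = parsimony_length(tree_2, toy_alignment)
print(length_1, length_2)
```

An MP search repeats this scoring over many candidate topologies and keeps the shortest; weighted analyses multiply each change by a weight instead of counting it as one.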

Mitogenomic resolution of fish phylogenies

In the late 1990s, no novel molecular phylogenetic hypotheses had been considered significant in those studies addressing higher-level relationships among major teleostean lineages (Johnson and Patterson 1996; Lecointre and Nelson 1996) with the exception of the sister group relationship demonstrated for Clupeomorpha and Ostariophysi (see Lê et al. 1993). This situation was in part due to the lack of longer DNA sequences that were necessary for adequate resolution of higher-level relationships (Stepien and Kocher 1997; Miya and Nishida 2000b). The novel method for sequencing whole mitogenomes of fishes (Miya and Nishida 1999), together with explicit demonstration of phylogenetic utility of the mitogenomic data (Miya and Nishida 2000b), enabled us to address various controversial issues in fish phylogenetics.

Inoue et al. (2001c) was the first study from our research group to address these phylogenetic issues, reexamining the interrelationships of the five major, basal teleostean lineages (Osteoglossomorpha, Elopomorpha, Clupeomorpha, Ostariophysi, Protacanthopterygii) using the mitogenomic data. There were five alternative phylogenetic hypotheses on the basis of both morphological (Greenwood et al. 1966; Greenwood 1973; Patterson and Rosen 1977; Arratia 1997) and molecular analyses (Lê et al. 1993) (Fig. 4), and Inoue et al. (2001c) performed MP and ML analyses with the mitogenomic data from eight purposefully chosen teleosts and two outgroups. The resultant tree topologies from the two methods were congruent, although they differed from any of the previously proposed hypotheses, with Osteoglossomorpha being sister to other teleosts, followed by the divergences between Elopomorpha and Otophysi + Protacanthopterygii (Fig. 4). Furthermore, the mitogenomic data confidently rejected all of those proposed hypotheses with high statistical significance. Note that recent phylogenetic studies using multiple nuclear genes (Li et al. 2008; Near et al. 2012; Betancur et al. 2013a; Broughton et al. 2013; Chen et al. 2013; Faircloth et al. 2013) have consistently recovered Elopomorpha, instead of Osteoglossomorpha, as the sister group to other teleosts; none of those studies, however, performed statistical comparisons (e.g., AU-test, Shimodaira 2002) among alternative phylogenetic hypotheses and thus the issue requires further clarification.

Fig. 4

Basal teleostean relationships recovered in the maximum parsimony analysis of Inoue et al. (2001c). Numbers above branches denote bootstrap values obtained for 500 replicates. Redrawn from Inoue et al. (2001c)

Unlike the depauperate basal teleosts that have only a few alternative phylogenetic hypotheses possible, higher teleosts are enormously species rich (>18,000 species, Nelson 2006) and their phylogenetic problems are so complex that G. Nelson (1989) even described them as the “(unresolved) bush at the top of the tree” with no consensus having been reached. As a first step toward resolution of higher teleostean phylogenetics, Miya et al. (2001) attempted to circumscribe a well-supported monophyletic group encompassing the “bush” (i.e., Percomorpha) and to determine the phylogenetic position of that clade relative to other major lineages using mitogenomic sequences from 48 species of teleosts (including 38 newly determined sequences).

In their MP tree, which was statistically indistinguishable from the ML tree, Miya et al. (2001) confirmed the monophyly of the more comprehensive clades, such as Euteleostei, Neoteleostei, and Eurypterygii with high bootstrap support (93–100 %), while interrelationships among more derived Myctophiformes, Ateleopodiformes, and Lampridiformes were ambiguous within the Ctenosquamata. These lineages together formed a sister clade to other higher teleosts in their MP tree. In the latter clade, the phylogenetically problematic Polymixiiformes, Paracanthopterygii, and Zeioidei formed a sister clade to other higher teleosts. Although taxonomic sampling was still sparse, Miya et al. (2001) successfully found a highly supported monophyletic group at the top of the tree (Fig. 5), which is the sister group to a clade comprising non-reciprocally monophyletic Beryciformes and Stephanoberyciformes. This study provided an important stepping stone toward more taxonomically densely sampled analyses.

Fig. 5

First mitogenomic circumscription of the Percomorpha recovered in the maximum parsimony analysis of Miya et al. (2001). Numbers next to internal branches indicate bootstrap values (only those ≥50 %) obtained for 500 replicates. Redrawn from Miya et al. (2001)

Four mitogenomic studies encompassing the entire actinopterygian phylogenies

The year of 2003 represented a significant milestone for our research group. Following Miya et al. (2001), we published four phylogenetic papers that together encompassed whole actinopterygian diversity, beginning from the basal actinopterygians (Inoue et al. 2003a) to ostariophysans (Saitoh et al. 2003), basal euteleosts (Ishiguro et al. 2003), and the higher teleosts (Miya et al. 2003). Those four pioneering studies have invariably had profound impacts on comparative studies of fishes, being cited 1,063 times in total as of 30 September 2014 (Table 1). Interestingly, a French group led by Guillaume Lecointre in Paris concurrently began publishing significant contributions in molecular phylogenetics of fishes mainly based on nuclear genes (e.g., Zaragüeta-Bagils et al. 2002; Chen et al. 2003; Dettaï and Lecointre 2005; Li et al. 2009).

The basal actinopterygians comprise four major fish lineages (polypteriforms, acipenseriforms, lepisosteids, Amia) and have been collectively called “ancient fish.” Inoue et al. (2003a) analyzed their relationships in relation to teleosts using the mitogenomic data and compared the resulting trees to the various alternative hypotheses. Inoue et al. (2003a) resolved polypteriforms as the sister lineage to all other actinopterygians and found that the other three ancient fish groups (acipenseriforms, lepisosteids, Amia) together formed a clade, which was sister to teleosts (Fig. 6). Although the latter hypothesis was congruent with that advocated by Venkatesh et al. (2001) based on nuclear markers (patterns of insertion/deletion in the RAG1 gene), subsequent mitogenomic reanalysis based on the different alignments (Azuma et al. 2008; Setiamarga et al. 2009; Nakatani et al. 2011) supported the neopterygian clade (lepisosteids + Amia being sister to teleosts). These results are in agreement with the analyses of multiple nuclear DNA sequences (Li et al. 2008; Near et al. 2012; Betancur et al. 2013a; Broughton et al. 2013) and morphological characters (Patterson 1973; Grande 2010).

Fig. 6

Basal actinopterygian relationships recovered in the maximum parsimony analysis of Inoue et al. (2003a). Numbers above and below internal branches indicate jackknife values obtained for 500 replicates and decay indices, respectively. Redrawn from Inoue et al. (2003a)

Ostariophysi is the second largest superorder within Teleostei, comprising five mostly freshwater orders: Gonorynchiformes, Cypriniformes, Characiformes, Gymnotiformes, and Siluriformes. Saitoh et al. (2003) analyzed their higher-level relationships using whole mitogenome sequences from 11 ostariophysans and nine non-ostariophysan teleosts used as outgroups. Saitoh et al. (2003) reconfirmed clupeiforms as the closest relatives of ostariophysans and provided convincing evidence supporting monophyly and a sister-group relationship of Cypriniformes and Characiphysi (Characiformes + Siluriformes + Gymnotiformes). Saitoh et al. (2003) further argued a Pangean origin of the Otophysi (Cypriniformes + Characiphysi) based on their geographic distributions on the major continents. The publication of this article has led to our participation in the international project “Cypriniformes Tree of Life (CToL)” initiated by Dr. Richard L Mayden at Saint Louis University (see below).

Higher-level relationships of the basal Euteleostei (= Protacanthopterygii) are so complex and controversial that at least nine alternative morphology-based hypotheses were proposed before 2000. Ishiguro et al. (2003) investigated their relationships using mitogenomic data from 34 purposefully chosen species that fully represented major basal euteleostean lineages (Argentinoidea, Alepocephaloidea, Osmeriformes, Esociformes, Salmoniformes). Surprisingly, the resulting tree confidently placed alepocephaloids within the Otocephala (Fig. 7), a sister group of the euteleosts. This unexpected result has been reconfirmed by mitogenomic studies with denser taxonomic sampling from alepocephaloids (Lavoué et al. 2008b; Poulsen et al. 2009; Broughton 2010) as well as those studies based on multiple nuclear genes (Near et al. 2012; Betancur et al. 2013a; Broughton et al. 2013; Chen et al. 2013). Ishiguro et al. (2003) also convincingly demonstrated a sister-group relationship between Esociformes and Salmoniformes (Fig. 7) (see also Campbell et al. 2013b), which was featured in a review article in Trends in Ecology & Evolution (Ramsden et al. 2003) in a discussion on the evolutionary origin of the anadromous migrations of salmonids. Later, Ishiguro et al. (2005) confidently demonstrated that Sundasalanx (formerly osmeriform Sundasalangidae, see Siebert 1997) is not a member of Osmeriformes, but is a member of Clupeiformes in the mitogenomic phylogeny.

Fig. 7

Basal euteleostean relationships recovered in the maximum parsimony analysis of Ishiguro et al. (2003). Numbers above internal branches indicate jackknife values obtained for 500 replicates. Note that Alepocephaloidea falls outside the Euteleostei, and Protacanthopterygii is recovered as a non-monophyletic group. Redrawn from Ishiguro et al. (2003)

Following a successful circumscription of a well-supported monophyletic group at the top of the higher teleostean tree and explicit demonstration of the phylogenetic position of such a monophyletic group relative to other major lineages (Miya et al. 2001), Miya et al. (2003) further expanded the taxonomic sampling to 100 purposefully chosen species and performed unweighted and weighted MP analyses using mitogenomic data. Miya et al. (2003) revealed that all major, comprehensive groups above the ordinal level as currently defined in higher teleosts (with the exception of the Neoteleostei and several monotypic groups), such as the Eurypterygii, Ctenosquamata, Acanthomorpha, Paracanthopterygii, Acanthopterygii, and Percomorpha, appeared to be non-monophyletic in the resulting tree. Such incongruities largely resulted from differences in the placement and/or limits of the orders Ateleopodiformes, Lampridiformes, Polymixiiformes, Ophidiiformes, Lophiiformes, Beryciformes, Stephanoberyciformes, and Zeiformes, all of them being long-standing problematic taxa in systematic ichthyology. Of these, the resulting phylogenetic positions of the Ophidiiformes and Lophiiformes were totally unexpected, because, although they have consistently been considered relatively primitive groups within higher teleosts (Paracanthopterygii), they were confidently placed within a crown group of teleosts (Fig. 8). These unexpected results have generally been supported by recent molecular phylogenetic studies based on multiple nuclear genes (Li et al. 2008; Near et al. 2012, 2013; Wainwright et al. 2012; Betancur et al. 2013a; Broughton et al. 2013). Later, Miya et al. (2005) included two species of the Batrachoidiformes not sampled in Miya et al. (2003) and performed partitioned Bayesian analysis. Miya et al. (2005) found that batrachoidiforms are also members of the “top” of the tree (Percomorpha) and this finding was reconfirmed by those studies based on multiple nuclear genes (Li et al. 2008; Near et al. 2012; Betancur et al. 2013a; Broughton et al. 2013).

Fig. 8

Acanthopterygian relationships recovered in the maximum parsimony analysis of Miya et al. (2003). Numbers above and below internal branches indicate jackknife values obtained for 500 replicates and decay indices, respectively. Redrawn from Miya et al. (2003)

It should be noted that the large data matrices used in Miya et al. (2003) have been useful for addressing some issues in molecular phylogenetics, such as “efficient resolution of the basal clades in a phylogenetic tree” (Simmons and Miya 2004) and “overconfidence of posterior probabilities in Bayesian inferences” (Simmons et al. 2004). In particular, the latter paper has attracted considerable attention from the molecular systematics community and has been cited as many as 258 times as of 30 September 2014 (Table 1).

Gene rearrangement as a phylogenetic marker

Most vertebrates share an identical mitochondrial gene order, and deviations from this shared order have been employed as useful phylogenetic markers in various animals (Kumazawa and Nishida 1995; Boore 1999). Although the unique mitochondrial gene order that we first discovered in Sigmops gracile (see Miya and Nishida 1999) was not shared by any closely related gonostomatid species, subsequent findings of gene rearrangements in fishes have represented molecular synapomorphies to diagnose subsets of various fish taxa.

For example, Inoue et al. (2001a) found a novel gene order in a mitogenome from a congrid eel (Conger myriaster), in which the ND6, tRNAGlu, and tRNAPro genes were translocated between the control region and tRNAPhe genes that are contiguously located at the 5′ end of the 12S rRNA gene in typical vertebrates. Based on the partial sequences of this region from 11 other anguilliform species representing 11 families, Inoue et al. (2001a) found that this unique gene order was shared by four other families belonging to the suborder Congroidei. Subsequently, Inoue et al. (2010c) published anguilliform phylogenies based on whole mitogenome sequences from all of the 19 families. Inoue et al. (2010c) confirmed that the unique gene order was shared by six families of the suborder Congroidei (Colocongridae, Congridae, Derichthyidae, Nettastomidae, Ophichthidae, Muraenesocidae) and that those six families exclusively formed a monophyletic group with 99 % bootstrap support.

Inoue et al. (2003b) found identical, unusual large-scale gene rearrangements in two deep-sea gulper eels (Eurypharynx pelecanoides and Saccopharynx lavenbergi) placed in two different families (Eurypharyngidae and Saccopharyngidae) within the suborder Saccopharyngoidei (Fig. 9); the two species were recovered as sister groups in the phylogenetic analysis. Mabuchi et al. (2004) found mitochondrial gene rearrangements in a cluster of tRNAIle (I), tRNAGln (Q), and tRNAMet (M) between the two protein-coding genes (ND1 and ND2) in scarid fishes; typical vertebrates, including species closely related to scarids (labrids and odacids), exhibit a gene order of IQM, while 10 scarid species representing seven of the 10 currently recognized genera exhibit IMQ followed by a pseudogene of tRNAMet. This finding may represent a molecular synapomorphy to diagnose the entire family (Scaridae). By contrast, the unique gene orders found in two of the four deep-sea macrourid subfamilies were counterintuitive in terms of external morphology (Satoh et al. 2006). While Squalogadus modificatus (Macrouroidinae) is characterized by a round head with no noticeable snout, Trachyrincus murrayi (Trachyrincinae) has a rugged head with the most pointed snout among macrourids. Despite these morphological differences, the two subfamilies share a gene order that is unique among vertebrates, and partitioned Bayesian analysis strongly indicated monophyly of the two among the four currently recognized macrourid subfamilies (Fig. 10) (Satoh et al. 2006).
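The IQM versus IMQ comparison described above can be illustrated with a minimal sketch in Python. It is not the procedure used in the cited studies; it simply trims the genes shared at both ends of two gene-order lists and reports the block that differs. The function name and the gene labels (including "pseudo-tRNA-Met") are hypothetical and chosen only for illustration.

```python
def rearranged_block(reference, observed):
    """Trim genes shared at both ends and return the differing middle blocks.

    Both arguments are lists of gene labels in genomic order. The labels
    used in the example below are illustrative, not taken from any database.
    """
    shortest = min(len(reference), len(observed))

    # Count matching genes from the left end.
    start = 0
    while start < shortest and reference[start] == observed[start]:
        start += 1

    # Count matching genes from the right end, without crossing `start`.
    end = 0
    while end < shortest - start and reference[-1 - end] == observed[-1 - end]:
        end += 1

    return (reference[start:len(reference) - end],
            observed[start:len(observed) - end])


# Typical vertebrate order (IQM) versus the scarid order (IMQ + pseudogene).
typical = ["ND1", "tRNA-Ile", "tRNA-Gln", "tRNA-Met", "ND2"]
scarid = ["ND1", "tRNA-Ile", "tRNA-Met", "tRNA-Gln", "pseudo-tRNA-Met", "ND2"]

ref_block, obs_block = rearranged_block(typical, scarid)
print(ref_block)  # ['tRNA-Gln', 'tRNA-Met']
print(obs_block)  # ['tRNA-Met', 'tRNA-Gln', 'pseudo-tRNA-Met']
```

A comparison of this kind only localizes the rearranged region; whether a shared rearrangement is a synapomorphy still has to be judged against a phylogeny, as in the studies cited above.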

Fig. 9

Unusual gene order of the mitogenomes from the two gulper eels (below) compared with that of the typical vertebrates (center) and its derivatives (around the circular genome). Four conservative blocks of genes are colored; note the remarkable difference in their order between the two gulper eels and typical vertebrates. Redrawn from Inoue et al. (2003b)

Fig. 10

Unique gene rearrangements in the four macrourid fishes (left; A–C) and their distributions on the macrourid phylogenies (right) recovered in the Bayesian analysis of Satoh et al. (2006). Redrawn from Satoh et al. (2006)

More recently, Poulsen et al. (2013) attempted to resolve major myctophiform phylogenies from both mitogenomic sequences and the unique mitochondrial gene order rearrangements. Poulsen et al. (2013) observed a total of eight unique gene orders within the myctophids and found that mitogenomic sequences and those unique gene orders are highly congruent concerning phylogenetic resolution on several myctophiform classifications based on osteology, larval ontogeny, and photophore patterns (Fig. 11). In particular, the enigmatic Notolychnus valdiviae was placed as the sister taxon to all other myctophids and exhibited an unusual second copy of the tRNA-Met gene—a gene order rearrangement reminiscent of that found in the tribe Diaphini, although their analyses show it to be independently derived.

Fig. 11

Myctophiform relationships recovered in the Bayesian analysis of Poulsen et al. (2013). Numbers beside internal branches are the bootstrap values and posterior probabilities only if those values are less than 100 %. Ten distinct myctophid lineages are noted (A–J) with respect to the current tribal classification. Patterns of eight gene rearrangements shared by subsets (total = 5) or all of the myctophiforms (total = 3) are shown in the colored boxes. Redrawn from Poulsen et al. (2013)

Resolution of inter- and intraspecific relationships

Other than the resolution of higher-level relationships in fishes, we also applied mitogenomic data to address inter- and intraspecific relationships. For example, Minegishi et al. (2005) analyzed mitogenome sequences of all 18 species/subspecies of the freshwater eel genus Anguilla to infer their phylogenetic relationships and to evaluate hypotheses about the possible historical dispersal routes of this genus. The resultant tree clearly indicated a sister relationship between the Atlantic and Oceanian species, which now have distantly separated geographic distributions (Fig. 12) due to the collision of the Eurasian and African continents. Minegishi et al. (2005) argued that the previous hypotheses to estimate the historical dispersal route of anguillid eels into the Atlantic Ocean based on the current geographic distribution of species are unsupported by the mitogenomic analysis and should be reconsidered.

Fig. 12

Relationships among 18 species/subspecies of the freshwater eel genus Anguilla recovered in the Bayesian analysis of Minegishi et al. (2005). Numbers above internal branches indicate Bayesian posterior probabilities (shown as percentages). Redrawn from Minegishi et al. (2005)

Mabuchi et al. (2006) determined a whole mitogenome sequence from the Lake Biwa wild strain of common carp (Cyprinus carpio) as a reference sequence for further comparisons with a number of already-published mtDNA sequences of multiple mitochondrial gene regions from various strains. This approach enabled more geographically comprehensive analyses including additional strains from Southeast Asia (Vietnam and Indonesia) to evaluate the origin of the Lake Biwa strain. Accordingly, Mabuchi et al. (2006) convincingly reconfirmed a previous finding of principal phylogenetic dichotomy between the “Lake Biwa wild” and “Eurasian” strains (Mabuchi et al. 2005) from more comprehensive data sets.

Carassius auratus is a primary freshwater fish with bisexual diploid and unisexual gynogenetic triploid lineages (Kobayasi et al. 1970), distributed widely in Eurasia. As the first step in clarifying the evolutionary entity of Carassius, Takada et al. (2010) determined nucleotide sequences for four mitochondrial genes [control region (CR) + ND4 + ND5 + cyt b; 4,669 bp] from 53 individuals of C. auratus and constructed a supermatrix with additional sequences from 672 individuals. The supermatrix was subjected to phylogenetic analysis and the resulting trees revealed two lineages, one distributed mainly among the Japanese main islands and the other in various regions in and around the Eurasian continent, including the Ryukyus and Taiwan (Fig. 13). The two lineages include seven sub-lineages with high regional specificity that are composed of endemic populations indigenous to each region. Triploids of C. auratus did not form a monophyletic group, but were clustered mostly with sympatric diploids (Fig. 13). Takada et al. (2010) argued that the lack of substantial genetic separation between triploids and diploids indicates that triploids are not composed of a single independent lineage.

Fig. 13

Intraspecific relationships among the Carassius auratus complex (upper right) recovered in the maximum likelihood analysis of Takada et al. (2010). Global geographical distributions of the seven major lineages are shown (lower left). Numbers above internal branches indicate neighbor-joining and ML bootstrap values and Bayesian posterior probabilities. Redrawn from Takada et al. (2010)

Hirayama et al. (2010) investigated intraspecific variations in the mitogenomic sequences from eight local populations and four inbred strains of medaka (Oryzias latipes). They found that the number of tandemly repeated 11-nucleotide units in the control region varied greatly among local populations. They also noted that the number of repeats was more variable in the northern Japanese group (10–34) than in the southern group (7–12), while two other species of Oryzias, inhabiting tropical regions, had no such repeats. A comprehensive comparison between the number of repeat units and meteorological data indicated that the number of repeats correlated with indices of a cold environment and seasonal climatic change. Hirayama et al. (2010) argued that the repeated sequences in the CR might function in mitochondrial gene expression and that the number of tandem repeats is likely related to adaptation to a harsh habitat. Note that the northern group of medaka was recently described as a new species in Asai et al. (2011).
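Counting tandem repeat units of the kind described above can be sketched as follows. This is a minimal illustration in Python, not the method of Hirayama et al. (2010): it assumes the repeat unit is known in advance and that copies are exact and contiguous. The 11-nucleotide unit used in the example is a made-up placeholder, since the actual medaka repeat sequence is not given here.

```python
def max_tandem_copies(seq: str, unit: str) -> int:
    """Return the largest number of back-to-back exact copies of `unit` in `seq`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    n = len(unit)
    best = 0
    # Try every start position so that runs in any phase are considered.
    for i in range(len(seq) - n + 1):
        run = 0
        j = i
        while seq[j:j + n] == unit:
            run += 1
            j += n
        best = max(best, run)
    return best


# Hypothetical 11-nt unit; flanking bases are arbitrary.
unit = "ACGTACGTTAG"
control_region = "GG" + unit * 12 + "TT"
print(max_tandem_copies(control_region, unit))  # 12
```

Real control-region repeats often contain imperfect copies, so an exact-match count such as this one should be read as a lower bound on the repeat number.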

Participation in an international project (Cypriniformes Tree of Life)

In March 2004 Prof. Richard L. Mayden of Saint Louis University submitted a proposal entitled “Collaborative research: Systematics of Cypriniformes, Earth’s most diverse clade of freshwater fishes” to the AToL initiative (Assembling the Tree of Life), a large research effort sponsored by the National Science Foundation, USA. MM was nominated as one of the core participants along with many researchers from various countries. In response to this international initiative, MM submitted a separate grant proposal to the Japan Society for the Promotion of Science (JSPS) in November 2004, focusing on the mitogenomic resolution of the cypriniform phylogenies. Fortunately, those two proposals were successfully funded in 2005 and the Cypriniformes Tree of Life (CToL) project was officially initiated as an international project (for details, see http://bio.slu.edu/mayden/cypriniformes/home.html).

Before large-scale mitogenomic sequencing began, Miya et al. (2006) designed a set of PCR primers that effectively amplified the mitochondrial ND4/ND5 gene region. The two genes exhibited good phylogenetic performance in a previous study (Miya and Nishida 2000b) and we expected to use them as markers for resolving phylogenetic issues of subsets of the Cypriniformes (e.g., within tribes, subfamilies, or families). These two genes showed better phylogenetic performance than the more commonly used cyt b gene (Miya et al. 2006) and they have been widely used in subsequent studies of the cypriniform phylogenies (see below).

One of the most important steps in resolving the higher-level relationships of a species-rich clade, such as that of the Cypriniformes, is to include a number of taxa that can bisect long branches (Miya and Nishida 2000b). With this in mind, Saitoh et al. (2006) carefully chose 53 cypriniforms and assembled whole mitogenome sequences from them along with six outgroups. The unambiguously aligned sequences were subjected to partitioned Bayesian analyses and the resultant phylogenies strongly supported monophyly of the Cypriniformes as well as that of the families Cyprinidae, Catostomidae, and a clade comprising Balitoridae + Cobitidae, with the two latter loach families being reciprocally paraphyletic (Fig. 14). Saitoh et al. (2006) also demonstrated that the RY-coding (Phillips and Penny 2003; Harrison et al. 2004), which takes only transversions into account, effectively removes the “noise” from the saturated third codon positions and counters the apparent lack of signal by retaining all available positions in the data set. Such use of the RY-coding in the third codon positions has been followed in subsequent mitogenomic analyses of fishes (see below).
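The idea behind RY-coding of third codon positions can be shown with a short sketch in Python. It is an illustration under stated assumptions, not the pipeline used by Saitoh et al. (2006): the input is assumed to be an in-frame, aligned protein-coding sequence starting at codon position 1, and the function name is hypothetical. Purines (A, G) become R and pyrimidines (C, T) become Y, so transitions at third positions vanish and only transversions remain visible.

```python
# Purine/pyrimidine map; gaps, N and ambiguity codes are left unchanged.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}


def ry_code_third_positions(seq: str) -> str:
    """Recode every third codon position as R (purine) or Y (pyrimidine).

    Assumes `seq` is in frame, i.e. index 0 is codon position 1.
    """
    recoded = []
    for i, base in enumerate(seq.upper()):
        if i % 3 == 2:  # third codon position
            recoded.append(RY_MAP.get(base, base))
        else:
            recoded.append(base)
    return "".join(recoded)


# Two sequences that differ only by transitions at third codon positions
# (A<->G and C<->T) become identical after recoding.
a = ry_code_third_positions("ATGCTAGGC")
b = ry_code_third_positions("ATGCTGGGT")
print(a)       # ATRCTRGGY
print(a == b)  # True
```

Because first and second positions are kept as they are, all sites stay in the data set while the saturated transition signal at third positions is discarded.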

Fig. 14

Cypriniform relationships recovered in the Bayesian analysis of Saitoh et al. (2006). Note that both the loach families (Cobitidae and Balitoridae) are non-monophyletic. Numbers above internal branches indicate Bayesian posterior probabilities (shown as percentages). Redrawn from Saitoh et al. (2006)

Following the pioneering work of Saitoh et al. (2006), a number of papers have been published to resolve the relationships of the entire Cypriniformes (see Mayden et al. 2008; Mayden et al. 2009) and subsets of the order, such as Catostomidae (see Doosey et al. 2010), Danioninae (see Tang et al. 2010), Cyprinini (see Yang et al. 2010, 2012b), Gobioninae (see Tang et al. 2011), Labeonini (see Yang et al. 2012a), Oxygastrinae (see Tang et al. 2013), and Leuciscinae (see Imoto et al. 2013), and even to specific genera, such as Danio (see Mayden et al. 2007), Carassius (see Takada et al. 2010), Cyprinus (see Mabuchi et al. 2006), and taxonomically problematic genera, such as Discherodontus, Chagunius, and Hypselobarbus (see Yang et al. 2012b), using mitogenomic data (>2,000 bp).

Further resolution of non-euteleostean relationships

Publications of the four mitogenomic papers in 2003 (Inoue et al. 2003a; Ishiguro et al. 2003; Miya et al. 2003; Saitoh et al. 2003) provided useful frameworks for the taxonomic sampling in phylogenetic analysis in actinopterygians (e.g., choice of outgroups and limits of the targeted taxa). These frameworks in turn greatly facilitated further resolution of actinopterygian phylogenies at various taxonomic levels, leading to a number of unexpected discoveries that are in striking contrast to the textbook classification (Nelson 2006).

For basal teleosts below Euteleostei (excluding Cypriniformes), we resolved the lower- and higher-level relationships of Osteoglossomorpha, such as Notopteridae (see Inoue et al. 2009) and Mormyroidea (see Lavoué et al. 2012a); Elopomorpha (see Inoue et al. 2004; Chen et al. 2013), such as Anguilliformes (see Minegishi et al. 2005; Inoue et al. 2010c; Johnson et al. 2012) and Anguillidae (see Minegishi et al. 2005); and Otophysi (see Nakatani et al. 2011), such as Gonorynchiformes (see Lavoué et al. 2005, 2012b), Clupeiformes (see Lavoué et al. 2007, 2008a, 2010, 2013), Alepocephaliformes (see Lavoué et al. 2008b; Poulsen et al. 2009), and Gymnotiformes (see Lavoué et al. 2012a). Of those studies, an explicit demonstration of the deep-ocean origin of the freshwater eels (Inoue et al. 2010c), subsequent discovery of the “living fossil” eel from an undersea cave in Palau (Johnson et al. 2012), and surprisingly long morphological stasis found in the two allopatric lineages in the African butterflyfish (Lavoué et al. 2011) should be highlighted here.

Freshwater eels spend most of their lives in freshwater during their catadromous life cycle and migrate back to their specific breeding places in the ocean (Tsukamoto et al. 2002). However, the evolutionary origin of such an enigmatic behavior remained elusive because of the uncertain phylogenetic position of freshwater eels within the principally marine anguilliforms. Inoue et al. (2010c) performed phylogenetic analysis using mitogenomic sequences from all of the 19 currently recognized anguilliform families (including the four families of Saccopharyngiformes) and found that the freshwater eels occupied an apical position within the anguilliform phylogeny, forming a highly supported monophyletic group with various oceanic midwater eel species (Fig. 15). Moreover, reconstruction of the growth habitats on the resulting tree unequivocally indicated an origination of the freshwater eels from midwaters of the deep ocean (Fig. 15). This finding showed significant concordance with the recent collection of mature adults of the Japanese eel in the upper midwater of the Pacific (Chow et al. 2009) and Inoue et al. (2010c) argued that they have retained their evolutionary origin as a behavioral trait in their spawning areas.

Fig. 15

A portion of the anguilliform relationships recovered in the maximum likelihood (ML) analysis of Inoue et al. (2010c). Numbers beside internal branches indicate bootstrap probabilities of ≥50 % based on 1000 replicates. Evolution of the adult growth habitats is reconstructed on the ML tree under an ML optimality criterion. A pie chart at each node indicates the likelihoods for these four character states. Redrawn from Inoue et al. (2010c)

Shortly after the publication of Inoue et al. (2010c), Johnson et al. (2012) reported the discovery of an enigmatic, small eel that exhibits an unusual suite of morphological characters, in a 35 m-deep fringing-reef cave in Palau (Fig. 16). Detailed morphological analysis explicitly placed this species as the most basal lineage (i.e., the sister group of extant anguilliforms). Phylogenetic analysis and divergence time estimation based on whole mitogenome sequences from various actinopterygians, including representatives of all eel families, demonstrated that this fish represents one of the most basal, independent lineages of the true eels, with a long evolutionary history comparable to that of the entire Anguilliformes (approximately 200 million years). Such a long, independent evolutionary history dating back to the early Mesozoic and retention of primitive morphological features warrant recognition of this species as a “living fossil” of the true eels and Johnson et al. (2012) described it as Protanguilla palau, a new genus and species of the new family Protanguillidae.

Fig. 16

Phylogenetic position of the “living fossil” eel Protanguilla palau recovered in the time-calibrated phylogenies of Johnson et al. (2012). The tree topology is one of the three best scoring ML trees that is consistent with the morphology-based hypothesis. Note that the best scoring ML tree supports a sister-group relationship between Protanguilla and synaphobranchids (Johnson et al. 2012). Inset photo courtesy of Jiro Sakaue. Redrawn from Johnson et al. (2012)

Finally, Lavoué et al. (2011) studied the African freshwater butterflyfish (Pantodon buchholzi), whose distinctive morphology earns it recognition as a monotypic family. Phylogenetic analysis using mitogenomic sequences showed that the two allopatric populations from the Congo and Niger basins formed a monophyletic group (Fig. 17), but they differed by 15.2 % in their coding sequences with no morphological divergences as shown by 15 morphometric measurements compared with those of other osteoglossomorphs. The mitogenomic divergence time between these populations was estimated to be greater than 50 million years (Fig. 17), and the deep genetic divergence was confirmed by nuclear sequence data. Thus, morphological stasis in these two allopatric lineages of Pantodon offers a living vertebrate model for investigating phenotypic stability over millions of generations in tropical Africa, which has experienced major climatic oscillations resulting in repeated cycles of forest expansion and fragmentation (Lavoué et al. 2011).

Fig. 17

Geographic distribution (left) and time-calibrated phylogenies (right) of the two allopatric lineages of Pantodon and associated osteoglossomorphs recovered in the Bayesian divergence time analysis of Lavoué et al. (2011). Redrawn from Lavoué et al. (2011)

Further resolution of euteleostean relationships below percomorphs

For euteleostean relationships below percomorphs, we resolved the lower- and higher-level relationships of Esociformes + Salmoniformes (see Campbell et al. 2013b), Myctophiformes (see Poulsen et al. 2013), Stylephoriformes (see Miya et al. 2007), and Cetomimidae and associated families (Johnson et al. 2009).

Of those studies, two should be highlighted here because of their remarkable impact on systematic ichthyology. One study includes the rare, monotypic deep-sea fish family Stylephoridae, which has long been considered a member of the order Lampridiformes. Miya et al. (2007) demonstrated that the family was not a lampridiform, but that it was a sister to Gadiformes based on mitogenomic and nuclear data sets (Fig. 18). This result was totally unexpected, because it had been demonstrably placed within the Lampridiformes based on four morphological synapomorphies and it was deeply nested within the order in the cladogram (Olney et al. 1993). Subsequent studies based on multiple molecular markers corroborated this unusual phylogenetic position with confidence (Near et al. 2012; Betancur et al. 2013a; Grande et al. 2013) and a recent morphological study found that myological characters in the caudal fin of both gadiforms and stylephoriforms show uniquely derived states (Borden et al. 2013).

Fig. 18

Unexpected phylogenetic position of Stylephorus chordatus within the higher teleosts as recovered in the Bayesian analysis of Miya et al. (2007). The species has long been placed in the Lampridiformes, but it was resolved as the sister group of the Gadiformes. Redrawn from Miya et al. (2007)

Another example also includes rare deep-sea fishes, which had long been misclassified into three different families. In one of the four mitogenomic studies published in 2003, Miya et al. (2003) found that the mitogenomic sequence from a mirapinnid specimen (Parataeniophorus sp. cf. gulosus) was almost identical to that of a whalefish (Cetostoma regani), differing at only seven of the 16,508 bp sequenced. Paxton and Johnson (2005) questioned these results because of the striking morphological differences between these two families and the absence of a voucher specimen for the mirapinnid (totally lost during DNA extraction), even stating “Our conclusion is that it is virtually impossible anatomically for Parataeniophorus gulosus to transform into Cetostoma regani” (Paxton and Johnson 2005). The availability of excellent new Gulf of Mexico megalomycterid specimens with closing-net data led Johnson et al. (2009) to re-examine the issue, and they finally showed, based on morphology and mitogenomic sequences, that fishes currently assigned to three families with greatly differing morphologies, Mirapinnidae, Megalomycteridae and Cetomimidae (Fig. 19), are larvae, males and females, respectively, of a single family Cetomimidae. These striking results were featured in a “News & Views” section in Nature (Howlett 2009), which was entitled “Three into one will go.”
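To put the seven differences in perspective, the proportion of differing sites (the uncorrected p-distance) can be computed with a minimal Python sketch. The function below is illustrative and hypothetical in name; it assumes two pre-aligned sequences of equal length and skips any column where either sequence has a gap or an ambiguous base.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        # Only count columns where both bases are unambiguous.
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy alignment: 1 difference over 8 sites.
print(p_distance("ACGTACGT", "ACGTACGA"))  # 0.125

# The figure quoted in the text: 7 differences over 16,508 bp.
print(round(7 / 16508 * 100, 3))  # 0.042 (percent)
```

A divergence of about 0.04 % across the whole mitogenome is far below what would be expected between members of different families.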

Fig. 19

Three misclassified deep-sea fish families shown to be a single family, Cetomimidae, by Johnson et al. (2009): a Eutaeniophorus festivus postlarva (Mirapinnidae), photo courtesy of Donald Hughes; b Ataxolepis apus adult male, 58 mm in standard length (Megalomycteridae), photo courtesy of G.D. Johnson; c Gyrinomimus sp., juvenile female (Cetomimidae), photo courtesy of Bruce Robison

Further resolution of percomorph relationships

We have successfully circumscribed the limits of percomorphs in a series of papers in the early 2000s (Miya et al. 2001, 2003, 2005) and those studies have hinted at previously unrecognized close relationships between some groups of percomorphs. For example, Gasterosteus aculeatus (Gasterosteiformes) and Helicolenus hilgendorfi (Scorpaeniformes) were recovered as a strongly supported monophyletic group with a bootstrap value of 100 % in Miya et al. (2001) (Fig. 5). The novel clade was also recovered in Miya et al. (2003) with additional groups, such as zoarcoids (Perciformes); however, it did not contain a gasterosteiform (Indostomus) and scorpaeniforms (Dactyloptena spp.).

In addition to this unexpected clade, one of the most striking findings in Miya et al. (2003) was a close relationship between carangids and pleuronectiforms, which had never been suggested but was supported by a jackknife value of 100 % (Fig. 8d). The placement is currently well established and the clade has been subsequently expanded to include those perciform taxa with diverse morphologies, such as Toxotidae, Carangidae, Centropomidae, Latidae, Xiphiidae, Istiophoridae, Polynemidae, Echeneidae, Coryphaenidae, Rachycentridae, Sphyraenidae, and Menidae (see Chen et al. 2003; Smith and Wheeler 2006; Smith and Craig 2007; Little et al. 2010; Near et al. 2012; Wainwright et al. 2012; Betancur et al. 2013a, b; Miya et al. 2013).

A close affinity of lophiiforms, caproids, and tetraodontiforms represents another example of the striking findings in Miya et al. (2003) (Fig. 8c). It was reconfirmed by Holcroft (2004) and Yamanoue et al. (2007) and later supported by those studies based on multiple nuclear genes (Dettaï and Lecointre 2005; Li et al. 2009; Near et al. 2012; Wainwright et al. 2012; Betancur et al. 2013a) as well as morphology (Chanet et al. 2013), leading to a series of mitogenomic studies of tetraodontiforms (Yamanoue et al. 2004, 2006, 2007, 2008, 2009a, 2009b, 2011) and lophiiforms (Miya et al. 2010).

Miya et al. (2003) also found a strongly supported clade (jackknife value = 87 %), comprising perciforms (4 spp.), mugiliforms (2 spp.), and atherinomorphs (6 spp.), within the percomorphs. In particular, four small, benthic perciforms (two blennids and two gobiesocids) were nested in the clade, which was totally unexpected from the previous studies (Fig. 8d). Subsequently, the clade was reconfirmed with broader taxonomic sampling by Smith and Craig (2007) and Setiamarga et al. (2008), both of whom noticed that all members of the clade (excluding mugilids) spawn demersal eggs with filaments. The clade was later named Ovarentaria by Wainwright et al. (2012), with additional taxa, in reference to this unique reproductive mode.

The partial resolution of the percomorph “bush” in three studies (Miya et al. 2001, 2003, 2005) facilitated further resolution of the complex relationships of higher teleosts. For example, Mabuchi et al. (2007) explicitly demonstrated that the Labridae and the remaining three labroid families (Cichlidae, Pomacentridae, Embiotocidae) have diverged early within the percomorphs, and the monophyly of the suborder Labroidei was confidently rejected by statistical tests. The resultant phylogenies indicated that the complex pharyngeal jaw apparatus (PJA) shared by those four labroid families evolved independently at least twice, once in Labridae and once in the common ancestor of the remaining three families (Fig. 20), supporting the idea that the evolution of the specialized PJA provided these lineages with the morphological potential for their spectacular trophic radiations from the aspect of historical repeatability (Mabuchi et al. 2007).

Fig. 20

Diversity in the skulls of the Cichlidae (a–h) and Labridae (i–p), and diagrammatic representation of the principal components of the specifically modified pharyngeal jaw apparatus (PJA) of cichlids. Redrawn from Mabuchi et al. (2007)

Yamanoue et al. (2007), on the other hand, re-examined the phylogenetic position of the tetraodontiforms and demonstrated that they occupy the apical position in the percomorph tree together with lophiiforms and caproids. Yamanoue et al. (2008) further resolved tetraodontiform phylogenies and found the basal split into the two previously unrecognized clades, which involves ecological diversification into shallow and deep-water habitats (Fig. 21). This novel hypothesis has been at odds with both morphological and nuclear DNA-based analyses (see Yamanoue et al. 2008), but recent phylogenetic analyses based on 22 loci convincingly supported their evolutionary hypothesis on the early ecological diversification (Santini et al. 2013). Yamanoue and his colleagues further investigated subsets of tetraodontiform relationships, such as Molidae (see Yamanoue et al. 2004) and Balistoidea (see Yamanoue et al. 2009b), and revealed intriguing evolutionary events, such as explosive speciation in Takifugu (Fig. 22, Yamanoue et al. 2009a) and multiple invasions into freshwater by tetraodontids (Fig. 23, Yamanoue et al. 2011).

Fig. 21

Tetraodontiform relationships recovered in the Bayesian analysis of Yamanoue et al. (2008). Numbers near internal branches indicate Bayesian posterior probabilities. Solid, open, and double circles, and triangles indicate that main habitats of a family are deep waters, coastal waters, open sea, and brackish and freshwater, respectively. Redrawn from Yamanoue et al. (2008)

Fig. 22

Timescale for the divergence of species of Takifugu recovered in the ML analysis of Yamanoue et al. (2009a). Note that the divergence of Takifugu has occurred in a time window of 2.4 million years, which is comparable to the explosive speciation of the cichlids in Lake Malawi. Redrawn from Yamanoue et al. (2009a)

Fig. 23

Tetraodontid relationships recovered in the ML analysis of Yamanoue et al. (2011). Freshwater species are colored green (South America), red (Southeast Asia) and yellow (Central Africa). Numbers near internal branches indicate ML bootstrap probabilities. Redrawn from Yamanoue et al. (2011)

Kawahara et al. (2008) investigated another problematic issue concerning interrelationships of 11 gasterosteiform families within the larger context of percomorph relationships. Although members of this group commonly exhibit unique appearances, many of which are derived from various forms of dermal plate armor (Nelson 2006), seven synapomorphies were proposed for their monophyly (Johnson and Patterson 1993). The relationships derived from mitogenomic phylogenetic analyses indicated explicitly that previously recognized members of gasterosteiforms had diverged early within the percomorphs into three different clades with the following subgroups: Syngnathoidei, Gasterosteoidei (minus Indostomidae), and Indostomidae (see Kawahara et al. 2008). Kawahara et al. (2008) further found that Syngnathoidei forms a monophyletic group if it includes Dactylopteroidei, and that Indostomidae is nested within the Synbranchiformes, rendering the latter group paraphyletic. Subsequently, Kawahara et al. (2009) resolved gasterosteid relationships based on whole mitochondrial genomes (14,807 bp) and 11 single-copy nuclear genes (8,703 bp) from all the currently recognized genera, and proposed a novel hypothesis of their relationships (Fig. 24). Their results clearly indicated that morphological and behavioral similarities between Spinachia spinachia and two aulorhynchids (Aulichthys japonicus and Aulorhynchus flavidus) result from independent adaptation to similar marine habitats (Kawahara et al. 2009). Notably, the two trees resulting from the whole mitogenomes and the combined 11 nuclear genes exhibited complete topological congruence, yet no single nuclear gene was able to recover that topology (Kawahara et al. 2009).

Fig. 24

Stickleback relationships recovered in the maximum likelihood (ML) analyses of Kawahara et al. (2009) using mitogenomes and 11 nuclear genes. Topologies of the resulting two trees are identical and numbers beside internal branches are bootstrap values from the two datasets. Boxes on the right side of the tree indicate the habitat of the gasterosteoid species. Redrawn from Kawahara et al. (2009)

Based on the results from Kawahara et al. (2008) on the relationships of the 11 gasterosteiform families, together with those from previous molecular phylogenetic studies, Song et al. (2014) attempted to circumscribe a novel clade mainly comprising Syngnathoidei using the mitogenomic data. They sampled a wide variety of percomorphs that have been suggested to have close affinities with syngnathoids and performed Bayesian and maximum likelihood analyses. The resulting trees revealed a highly supported monophyletic group comprising seven families in Syngnathoidei (Gasterosteiformes), Dactylopteridae (Scorpaeniformes), Mullidae in Percoidei and two families in Callionymoidei (Perciformes) (Fig. 25). No previous molecular phylogenetic studies have sampled all of those families (see table 2 in Song et al. 2014), and this previously unrecognized clade found across four different percomorph suborders with diverse morphologies (Fig. 25) represents another example of unique patterns of diversification in percomorphs with no trace of morphological synapomorphies.

Fig. 25

“Syngnathiform” relationships recovered in the Bayesian analysis of Song et al. (2014). Numbers beside branches are posterior probabilities and bootstrap values in the Bayesian and maximum likelihood analyses, respectively. Redrawn from Song et al. (2014)

Table 2 Eight major clades within the Percomorpha consistently recovered in the previous mitogenomic studies (A–H, Kawahara et al. 2008) with the corresponding clades in the two latest studies based on multiple nuclear genes (Betancur et al. 2013a; Near et al. 2013)

Patterns of facial nerves, such as those of the ramus lateralis accessorius (RLA), have been considered as one of the candidate characters to delimit a monophyletic group within the percomorphs (Freihofer 1963). Six families of the suborder Percoidei (Arripidae, Dichistiidae, Kyphosidae, Terapontidae, Kuhliidae, Oplegnathidae) and suborder Stromateoidei (including six families) demonstrably share the unique pattern 10 of RLA (Fig. 26). Yagishita et al. (2009) investigated its evolutionary origin using mitogenomic data and found that there were at least two independent origins of the unique facial nerve pattern: one in the common ancestor of Kyphosidae, Terapontidae, Kuhliidae, and Oplegnathidae and another in the common ancestor of the percoids, Arripidae and Stromateoidei (Fig. 26), showing that the morphological character complex can evolve independently in different lineages within the percomorphs.

Fig. 26

Ramus lateralis accessorius (RLA) pattern 10 (lower left) in Kuhlia and phylogenetic positions of those perciform families with RLA10 as shown in Yagishita et al. (2009). Redrawn from Yagishita et al. (2009)

The order Lophiiformes, commonly known as the anglerfishes, contains a diverse array of marine fishes, comprising 321 living species placed in 68 genera, 18 families and five suborders, and ranging from benthic shallow-water dwellers to highly modified deep-sea midwater species (Pietsch 2009). Miya et al. (2010) assembled whole mitogenome sequences from 39 lophiiforms, representing 17 of the 18 currently recognized families, and analyzed their evolutionary history based on the time-calibrated phylogenies, paying special attention to that of the deep-sea ceratioids. The resultant trees revealed previously unappreciated phylogenetic relationships among the lophiiform suborders and ceratioid families and Miya et al. (2010) concluded that the acquisition of novel features, such as male dwarfism (Fig. 27), bioluminescent lures, and unique reproductive modes allowed the deep-sea ceratioids to diversify rapidly in a largely unexploited, food-poor bathypelagic zone (200–2,000 m depth) relative to the other lophiiforms occurring in shallow coastal areas.

Fig. 27

Evolution of the male sexual parasitism in the ceratioid anglerfishes recovered in the maximum likelihood reconstruction of Miya et al. (2010). Four discrete character states were assigned to each terminal and the ancestral character states were reconstructed on the ML tree. Redrawn from Miya et al. (2010)

One of the most striking findings in our mitogenomic studies of percomorphs was the discovery of an enigmatic clade comprising 15 (or possibly 16) pelagic fish families (Miya et al. 2013). Previous molecular phylogenetic studies have sporadically recovered a portion of this clade based on the three to seven families mostly with strong statistical support (91–100 %, see table 3 in Miya et al. 2013). Miya et al. (2013) found that the 15 families (Fig. 28) share a common ancestry based on bioinformatic analyses using partial mitochondrial and nuclear gene sequences from all percomorphs deposited in GenBank (10,733 sequences) and subsequent mitogenomic analysis based on 57 species from those targeted 15 families and 67 outgroup taxa. Morphological heterogeneity among these 15 families is so extraordinary that they have been placed in six different perciform suborders; however, members of the 15 families are either coastal or oceanic pelagic in their ecology with diverse modes of life, suggesting that they represent a previously undetected adaptive radiation in the pelagic realm. The time-calibrated phylogenies implied that those 15 families originated from a deep-ocean ancestor and began to radiate after the end-Cretaceous when large predatory epipelagic fishes were selective victims of the Cretaceous–Paleogene mass extinction (Friedman 2009). Miya et al. (2013) named this clade of open-ocean fishes “Pelagia” in reference to the common habitat preference that links the 15 families.

Fig. 28

Timetree of the Pelagia recovered in the Bayesian relaxed-molecular clock analysis of Miya et al. (2013). Horizontal bars indicate 95 % credible intervals of the divergence time estimates. Redrawn from Miya et al. (2013)

While this review was being written, two mitogenomic papers on percomorph relationships were accepted for publication. The first is that of Lavoué et al. (2014), who addressed phylogenetic issues of the enigmatic family Percichthyidae within the context of the Centrarchiformes. The Centrarchiformes is a newly (but informally) recognized order based on multiple nuclear genes (Near et al. 2012), and Lavoué et al. (2014) found that the mitogenomic data support the monophyly of the Percichthyidae only when the catadromous genus Percalates is excluded from the family. The second is that of Campbell et al. (2014), who addressed the phylogenetic affinities of the flatfishes (Pleuronectiformes) based on the mitogenomic data. Campbell et al. (2014) found that the monophyly of the more comprehensive clade called Carangimorpha is strongly supported, but that of the flatfishes is not, which is congruent with recent molecular studies based on multiple nuclear genes (Betancur et al. 2013b; Campbell et al. 2013a).

Finally, it should be noted that our mitogenomic studies have consistently recovered eight major clades within the Percomorpha with the exception of Ophidiiformes and Batrachoidiformes despite differences in taxonomic coverage (Miya et al. 2003, 2005; Mabuchi et al. 2007; Yamanoue et al. 2007; Kawahara et al. 2008; Yagishita et al. 2009; Song et al. 2014). Kawahara et al. (2008) labeled those eight clades A to H and these clades are labeled (or named) in the latest two studies based on multiple nuclear genes as shown in Table 2. Interrelationships among those clades, however, are not congruent among those studies and should be clarified in future studies.

Divergence time estimation

The advent of Bayesian inference in phylogenetic analyses has provided a powerful framework for integrating different sources of information (molecules and fossils) to estimate divergence times on molecular phylogenies (Inoue et al. 2010a). Inoue et al. (2005) was the first study in our research group to employ this approach, estimating the divergence time between Indonesian and Comoran coelacanths based on the whole mitogenome sequences. Using the sarcopterygian–actinopterygian split as a calibration point [450 million years ago (Mya)], they estimated that the two species diverged 30–40 Mya (Fig. 29). The estimated age was in agreement with the hypothesis that their diversification is linked to the collision of India with Eurasia (50 Mya) and the subsequent siltation caused by the formation of major rivers that resulted in disjunction of the ancestral coelacanth habitat (Springer 1999). Inoue et al. (2005) argued that this geological event allowed the ancestral coelacanth populations on either side of the Indian subcontinent to diverge, resulting in speciation into two species.

Fig. 29

Timetree of the two coelacanth species recovered in the Bayesian relaxed-molecular clock analysis of Inoue et al. (2005). Horizontal bars indicate 95 % credible intervals of the divergence time estimates. Redrawn from Inoue et al. (2005)

Subsequent studies using the whole mitogenomes (Peng et al. 2006; Yamanoue et al. 2006) provided similar node ages for the major actinopterygian divergences based on the same calibration points as those used in Inoue et al. (2005). Hurley et al. (2007), on the other hand, re-investigated the timing of the evolutionary origins of basal actinopterygians using morphological, and nuclear and mitochondrial DNA data with additional fossil evidence. Their results indicated that the node age of the crown-group Neopterygii, including the teleosts, Lepisosteus and Amia, was at least 40 million years (Myr) older than the existing fossil-based estimate and that the nuclear estimates were always younger than the mitochondrial ones. Also their estimates were generally much younger than the previous molecular estimates based on the whole mitogenomes (Inoue et al. 2005; Peng et al. 2006; Yamanoue et al. 2006), because the three most basal nodes were constrained with both minimum and maximum ages with narrow, younger confidence intervals based on the revised fossil records. Accordingly, we employed those revised fossil records from Hurley et al. (2007) as the more reliable time constraints in subsequent studies and provided much younger node ages across the trees (Azuma et al. 2008; Inoue et al. 2009, 2010b; Setiamarga et al. 2009; Miya et al. 2010; Nakatani et al. 2011; Saitoh et al. 2011; Johnson et al. 2012).

One of the most critical problems in divergence time estimation in fishes is the lack of adequate fossil records that can be used as the time constraints at multiple nodes, especially those of percomorphs. Azuma et al. (2008) explored an alternative source of time constraints in the higher teleostean phylogenies by evaluating a biogeographic hypothesis concerning freshwater fishes from the family Cichlidae (Perciformes). Azuma et al. (2008) conducted phylogenetic analyses and divergence time estimations using mitogenomic sequences from 10 cichlids and 44 outgroups. The resulting timetree indicated that the phylogenetic relationships and timing of diversifications in continental cichlids were much more congruent with Gondwanaland origin and Cretaceous vicariant divergences than with Cenozoic transmarine dispersal between major continents (Azuma et al. 2008) (Fig. 30). On the basis of such a remarkable congruence between the cichlid phylogenies and geological events, Azuma et al. (2008) proposed adding the biogeographic assumption of cichlid divergences by continental fragmentation as effective time constraints in dating teleostean divergence times. Their proposal, however, has been criticized by Friedman et al. (2013) who provided independent palaeontological and relaxed-molecular-clock estimates for the time of cichlid origin that collectively reject the antiquity of the group required by the Gondwanan vicariance scenario. It appears that congruence between paleontological and molecular time scales for cichlid evolution of Friedman et al. (2013) is better than that of Azuma et al. (2008), but the results from the former study demand what can only be considered a series of highly unlikely trans-oceanic dispersal events.

Fig. 30

Timetree of the 10 continental cichlids recovered in the Bayesian relaxed-molecular clock analysis of Azuma et al. (2008). Horizontal bars indicate 95 % credible intervals of the divergence time estimates. Cichlid distributions are indicated by blue (India), green (Madagascar), orange (South America) and red (Africa) circles, which correspond to those of paleogeographic maps. Redrawn from the original data in Azuma et al. (2008)

In general, our mitogenomic estimations of the evolutionary history of actinopterygians (see below) are more consistent with older vicariance scenarios than with the conservative and literal ages of the fossil record (Inoue et al. 2005, 2009; Setiamarga et al. 2009; Nakatani et al. 2011; Saitoh et al. 2011). The resulting “ghost range” is extremely large, sometimes reaching >100 million years (Myr) without fossil evidence. Nakatani et al. (2011) argued that this formidable ghost range partially reflects a genuine difference between the estimated ages of stem group origin (molecular divergence time) and crown group morphological diversification (fossil divergence time) (Brown et al. 2008) and the ghost range may be filled with future discoveries of older fossils that can be used as more reasonable time constraints as well as with the development of more realistic models that accurately capture the divergence rates of molecular sequences.
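
The ghost range discussed above is simply the gap between a molecular divergence estimate and the age of the oldest known fossil of the lineage. A minimal Python sketch of that arithmetic follows; the function name and the ages used in the example are hypothetical placeholders chosen for illustration and are not values taken from any of the cited studies.

```python
def ghost_range(molecular_age_myr: float, oldest_fossil_age_myr: float) -> float:
    """Return the ghost range in million years (Myr).

    The ghost range is the part of a lineage's inferred history that is
    not documented by fossils: the molecular divergence estimate minus
    the age of the oldest known fossil. It is floored at zero for the
    case where the fossil is as old as, or older than, the estimate.
    """
    return max(0.0, molecular_age_myr - oldest_fossil_age_myr)


# Hypothetical ages (NOT from the cited studies): a clade dated to
# 250 Mya by molecules whose oldest fossil is 140 Myr old.
example_gap = ghost_range(250.0, 140.0)
print(f"ghost range: {example_gap:.0f} Myr")
```

With these placeholder ages the gap exceeds 100 Myr, which is the order of magnitude referred to in the text.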

Historical biogeography

The disjunct distributions of the osteoglossomorph family Notopteridae across Africa, India, and Southeast Asia constitute a long-standing issue of freshwater fish biogeography (Fig. 31). Inoue et al. (2009) investigated historical biogeography of the notopterids based on whole mitogenome sequences and the resultant time-calibrated phylogenies suggested that the Asian notopterids diverged from the African notopterids in Gondwanaland and migrated into Eurasia on the Indian subcontinent from the Cretaceous to the Tertiary. However, Inoue et al. (2009) could not exclude an alternative explanation that the African and Asian notopterids diverged in Pangaea before complete separation into Laurasia and Gondwanaland, to which these two lineages were later confined, respectively.

Fig. 31

Geographic distributions of the African and Asian members of the family Notopteridae. Redrawn from Inoue et al. (2009)

The southern and northern Japanese populations of the medaka (Oryzias spp., currently recognized as two distinct species; see Asai et al. 2011) provide useful tools to gain insights into the comparative genomics and speciation of vertebrates. These fishes can breed to produce healthy and fertile offspring despite genetic backgrounds that are highly divergent compared with those of humans and chimpanzees (Ishikawa 2000). On the basis of Bayesian relaxed molecular-clock analyses of whole mitogenome sequences from 72 ray-finned fishes (including 14 medakas), Setiamarga et al. (2009) demonstrated that the divergence time between the two populations (18 Myr; Fig. 32) is about four times older than that of the previous study (Takehana et al. 2003). This remarkably older estimate, however, can be reconciled with the vicariant events of the Japanese archipelago, and the resulting rates of molecular evolution are found to be almost identical between the medaka and hominid lineages (Setiamarga et al. 2009).

Fig. 32

Divergence times of the medaka fishes recovered in the Bayesian relaxed-molecular clock analysis of Setiamarga et al. (2009). Horizontal bars indicate 95 % credible intervals of the divergence time estimates. a–c Plate tectonic process of the Japanese archipelago and d the present distribution range of the Southern and Northern populations. Redrawn from Setiamarga et al. (2009)

Clupeoidei includes 397 extant species classified into five families and their distribution follows two general patterns of species richness, the longitudinal and latitudinal gradients. To test historical hypotheses explaining the formation of these two gradients, Lavoué et al. (2013) examined the early biogeography of the Clupeoidei by reconstructing the evolution of their habitat preferences (Fig. 33) along with their ancestral range distributions on a time-calibrated mitogenomic phylogeny. The resulting timetree and the ancestral range reconstruction suggested that the probable region of origin and diversification of the Clupeoidei was the tropical marine precursor to the present-day Indo-West Pacific region during the Cretaceous, favoring the hypotheses of “region of origin” (Briggs 2007) and “tropical conservatism” (Mittelbach et al. 2007) to explain the origins of the longitudinal and latitudinal gradients of clupeoid species richness, respectively.

Fig. 33

Geographic distribution patterns of the Clupeoidei. a Distribution and species diversity of Clupeoidei. Number of species per grid cell (4 by 4 degree latitude–longitude resolution) is represented by cool (low diversity) to warm (high diversity) colors. b The 12 biogeographical units used in the ancestral ranges reconstruction analysis; each unit was delimited by land masses, vast expanses of open ocean and water temperature. Redrawn from Lavoué et al. (2013)

Freshwater fishes can expand their distributions only through connection of freshwater habitats owing to their inability to acclimate to the sea (Saitoh et al. 2011). Saitoh et al. (2011) analyzed phylogenetic relationships and divergence times of the Cypriniformes, one of the largest clades of freshwater teleosts. Based on the resulting timetree, they performed historical biogeography analysis and found the lower Mesozoic of South Asia as the time and place of basal divergence of cypriniforms. Concurrently, Nakatani et al. (2011) investigated the evolutionary history of the more comprehensive clade called Otophysi, which includes the Cypriniformes and Characiphysi (Gymnotiformes, Characiformes, Siluriformes), together accounting for 43 % of all freshwater fish species. Nakatani et al. (2011) performed phylogenetic analyses and divergence time estimations, and the resulting timetree suggested that survival of the ancestral lineages through the two consecutive mass extinctions on Pangaea, and subsequent radiations during the Jurassic through early Cretaceous, shaped the modern familial diversity of otophysans (Fig. 34).

Fig. 34

Timetree of the otophysans recovered in the Bayesian relaxed-molecular clock analysis of Nakatani et al. (2011). Horizontal bars indicate 95 % credible intervals of the divergence time estimates. ML reconstruction of ancestral habitats is indicated on selected nodes with pie charts showing the likelihoods for two character states (blue, freshwater; light green, saltwater). All marine species are indicated by asterisks. Dagger symbols indicate the three big mass extinction events after 300 million years ago. Paleocoastline maps are shown below the timetree with moist zones schematically illustrated in green. Redrawn from Nakatani et al. (2011)

Evolutionary origin

One of the most remarkable examples of convergent evolution among vertebrates is illustrated by the independent origins of an active electric sense in weakly electric fishes from South America and Africa, the Gymnotiformes and Mormyroidea, respectively (Lavoué et al. 2012a) (Fig. 35). Lavoué et al. (2012a) examined the timing of the origins of the Gymnotiformes and the Mormyroidea using whole mitogenome sequences and estimated similar ages for the independent origins of the Mormyroidea and Gymnotiformes, which slightly postdate, or just predate, the final separation of Africa and South America by continental drift. Lavoué et al. (2012a) argued that such a convergent evolution with similar age enhances the comparative value of the weakly electric fish system for investigating pathways to evolutionary novelty, as well as the influences of key innovations in communication on the process of species radiation.

Fig. 35

Morphological convergences between African and South American electric fishes. Mormyroid African electric fishes (left column) are facing gymnotiform South American electric fishes (right column) with similar aspects of morphology. Electric organ discharge waveform shown for every species (each trace 5 ms in total duration with head-positivity plotted upward). a Mormyrops zanclirostris; b Sternarchorhynchus oxyrhynchus; c Mormyrus proboscirostris; d Rhamphichthys sp.; e Mormyrops anguilloides; f Gymnotus sp.; g Petrocephalus sullivani; h Eigenmannia sp. Reproduced from Lavoué et al. (2012a)

The chondrichthyans, comprising holocephalans (chimaeras) and elasmobranchs (sharks, skates, rays), are the oldest living group of jawed vertebrates that diverged from the common ancestor of bony vertebrates in the early Silurian about 420 Mya (Benton 2009). The modern holocephalans (a previously successful and diverse group currently represented by only 39 extant species) and their relationships with elasmobranchs and other jawed vertebrates have been poorly documented, largely owing to a lack of well-preserved fossil materials after the end-Permian about 250 Mya (Didier 2004). Inoue et al. (2010b) investigated their evolutionary history based on whole mitogenome sequences from eight representatives from all three families, finding a single origin of the modern holocephalans and a sister-group relationship with elasmobranchs. The mitogenomic tree recovered the callorhinchids as the most basal lineage within the chimaeriforms, sister to a clade comprising the remaining two families (rhinochimaerids and chimaerids). The timetree suggests that the holocephalans originated in the Silurian about 420 Mya, survived the end-Permian (250 Mya) mass extinction, and underwent familial diversifications during the late Jurassic to early Cretaceous (170–120 Mya) (Fig. 36). This postulated evolutionary scenario agrees well with that based on the paleontological observations (Grogan and Lund 2004).

Fig. 36

Timetree of the holocephalans and outgroup vertebrates recovered in the Bayesian relaxed-molecular clock analysis of Inoue et al. (2010b). Horizontal bars indicate 95 % credible intervals of the divergence time estimates. Redrawn from Inoue et al. (2010b)

Aschliman et al. (2012), on the other hand, investigated evolutionary history of the batoids (skates and rays)—one of the three major lineages of chondrichthyans—using mitogenomes, nuclear genes, and fossils, sampling densely across taxa. Aschliman et al. (2012) found widespread body plan convergence in batoids; for example, a depressed, rounded pectoral disk supported to the snout tip by fin radials, common to skates and stingrays, is indicated to have been derived independently by each group, while the long, spiny rostrum of sawfishes similarly appears to be convergent with that of sawsharks, which are not batoids. The major extant batoid lineages are inferred to have arisen relatively rapidly from the Late Triassic into the Jurassic, with long stems followed by subsequent radiations in each group around the Cretaceous/Paleogene boundary.

Comparative mitogenomics

Accumulation by our research group of the whole mitogenome sequences from a diverse array of fishes allowed us to compare the structural properties of the mitogenomes and we found that mitochondrial gene arrangement has been highly conserved among the major lineages of fishes (including tetrapods) for more than 500 Myr with a few exceptions (see Inoue et al. 2003b). It remained unclear whether high conservation of gene order is a consequence of some constraints, or whether it results only from shared common ancestry. To address this question, Satoh et al. (2010) analyzed codon usage and tRNA gene arrangements of the vertebrate mitogenomes to examine possible constraints on the gene order of vertebrate mitogenomes. They found that the tRNA genes specifying the hydrophobic residues were positioned close to the control region (CR), where the transcription efficiency is estimated to be relatively high. Using 47 vertebrate mitogenome sequences representing jawless fishes to mammals, Satoh et al. (2010) further found a correlation between codon usage and tRNA gene positions. They suggested the existence of translational constraint acting on the vertebrate gene arrangement of the mitogenome and concluded that such translational constraints, together with the deamination-related constraint, may have contributed to long-term maintenance of gene order.
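
As an illustration of the kind of codon usage tabulation on which such analyses rest, the following Python sketch counts relative codon frequencies in an in-frame coding sequence. It is not the procedure of Satoh et al. (2010); the function name and the toy sequence are assumptions made purely for illustration.

```python
from collections import Counter


def codon_usage(cds: str) -> dict[str, float]:
    """Return the relative frequency of each codon in an in-frame sequence."""
    seq = cds.upper().replace("U", "T")
    # Keep complete codons only. A trailing partial codon is dropped, which
    # also covers incomplete stop codons (T-- or TA-) seen in mitogenomes.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Skip codons that contain ambiguous bases such as N.
    codons = [c for c in codons if set(c) <= set("ACGT")]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


# Toy sequence (hypothetical, not from any real mitogenome):
# ATG CTA CTA CTT GCC + trailing "TA" (incomplete stop codon, ignored)
toy_cds = "ATGCTACTACTTGCCTA"
usage = codon_usage(toy_cds)
for codon, freq in sorted(usage.items()):
    print(codon, round(freq, 2))
```

Frequencies tabulated per gene in this way could then be compared with the positions of the corresponding tRNA genes, which is the type of relationship examined in the study summarized above.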

Developments of the database and annotation pipeline

In 2004 we launched a publicly accessible, specialized nucleotide sequence database for fish mitogenomes named MitoFish (http://mitofish.aori.u-tokyo.ac.jp). Since then, MitoFish has received an average of more than 30,000 unique hits annually, and as of 2 August 2014 it holds 1,489 whole and 315,214 partial mitogenome sequences from fishes (Fig. 37). Although older versions of MitoFish had only a few simple functions, such as a homology search, it was recently updated by Iwasaki et al. (2013) to provide an automatic annotation pipeline called MitoAnnotator. The latter function is particularly important, because mitogenomic annotation is a severe bottleneck: it is time consuming and involves difficulties in annotating sequences with degenerating tRNA structures, divergent start/stop codons of the coding elements, and overlaps between adjacent elements (Iwasaki et al. 2013). In practice, MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately five minutes (Iwasaki et al. 2013). It should be noted that as many as 38 papers have already used this novel annotation system since the publication of Iwasaki et al. (2013), less than a year ago.

Fig. 37

MitoFish home page. A vertical menu bar on the right-hand side allows users to access the main functions of MitoFish including MitoAnnotator. The fish species/taxonomy search and sequence similarity searches can also be performed directly from the home page

Education

We have jointly supervised portions of the M.S. or Ph.D. studies of nine graduate students in Japan who earned those degrees with the mitogenomics of fishes at MN’s home institutes (Fukui Prefectural University and Ocean Research Institute, The University of Tokyo). There we regularly held monthly discussion meetings with them up to MN’s move to the University of the Ryukyus (April 2013) after his retirement from UT (March 2012). Among the many graduate students whom we advised, J.G. Inoue, N.B. Ishiguro, A. Kawaguchi, T.P. Satoh, Y. Minegishi, Y. Yamanoue, R. Kawahara, D.H.E. Setiamarga, and M. Nakatani have greatly contributed to the development of the mitogenomics of fishes (see the reference section below).

In addition, we have hosted a number of overseas graduate students who wanted to apply the mitogenomic data to their own studies and who learned the DNA experimental methods and data analysis in our laboratories. Those overseas graduate students include N.C. Aschliman, M.A. Campbell, S.C. DeVaney, N.J. Lang, M.H. Doosey, J.J.D. Egge, and K.M. Laumann, all from the U.S., and J.Y. Poulsen from Denmark. The U.S. graduate students came to our labs through the EAPSI (East Asia and Pacific Summer Institute for U.S. Graduate Students) program jointly sponsored by JSPS (Japan Society for the Promotion of Science) and NSF (National Science Foundation), which not only provides them with great educational opportunities, but also helps them initiate professional relationships that enable collaborations with us. Some of them have published their own papers based on the mitogenomic data taken during their stay (Doosey et al. 2010; Aschliman et al. 2012; Campbell et al. 2013b) and those papers helped decide the future directions of their independent studies.

Concluding remarks

We have summarized the major achievements and novel perspectives that our project has brought to molecular phylogenetics and evolution of fishes from 1999 to 2014. During this period, two major international projects, which are targeted to address large-scale resolution of organismal phylogenies and taxonomy, have been launched: 1) Assembling the Tree of Life (AToL; currently replaced by Genealogy of Life “GoLife”), a phylogeny-oriented initiative (Cracraft 2004) sponsored by U.S. National Science Foundation (NSF); and 2) Barcode of Life (BOL), a molecular taxonomy-oriented project (Hebert et al. 2003) mainly sponsored by Genome Canada. As we stated earlier in this review, we have played significant roles in one of the AToL projects “Cypriniformes Tree of Life” (CToL) using the mitogenomic data and actually published a number of papers on cypriniform relationships. For the latter project, we were appointed as members of a regional working group (Northeast Asia) in one of the initiatives (FISH-BOL), although we were unable to apply the mitogenomic data to taxonomy and species identification. This is largely because we lacked various resources deemed necessary for development of this burgeoning field in biodiversity research while performing the phylogenetic studies at the same time.

Concerning phylogenetic issues addressed by the former project (AToL), our pioneering work based on mitogenomic data has provided a number of novel phylogenetic hypotheses across multiple taxonomic levels, most of which have been verified by subsequent studies based on multiple nuclear genes with intensive taxonomic sampling (e.g., see Table 2). We acknowledge, however, that more definite resolution of the higher-level relationships of fishes should be expected from nuclear genomes, where large amounts of phylogenetic information spanning hundreds or thousands of megabase pairs (Mbp) remain unexplored. This plethora of data is in contrast to the 16.5 kilobase pairs (kbp) maximally available in mitogenomes. Also, Lavoué et al. (2014) recently observed that the phylogenetic signal contained in the mitogenomes is lower than that of the concatenated multiple nuclear genes in resolving the percomorph relationships, where most of the unresolved phylogenetic questions still persist across the “bush at the top” (Nelson 1989) or the “new bush at the top” (Betancur et al. 2013a).

Nevertheless, caution should be exercised when using nuclear genes in phylogenetic analysis, in particular for those lineages undergoing polyploidization or whole genome duplication events, where paralogous comparisons of the amplified genes from different species are likely to mislead the phylogenetic analysis (Martin and Burg 2002). For example, mitogenomic analysis supported a sister-group relationship between Coregoninae and Thymallinae among three salmoniform subfamilies (Li et al. 2010; Campbell et al. 2013b), while those studies based on multiple nuclear genes (Near et al. 2012) and combined partial mitochondrial and nuclear genes (Crête-Lafrenière et al. 2012) supported a sister-group relationship between Coregoninae and Salmoninae. With strict assessment of orthology, however, Macqueen and Johnston (2014) convincingly demonstrated strong support for the former sister-group relationship (Coregoninae and Thymallinae) based on 36 nuclear genes, which is congruent with the results from mitogenomic analyses (Li et al. 2010; Campbell et al. 2013b) and reanalysis of the 13 mitochondrial protein-coding genes by themselves (Macqueen and Johnston 2014). Although it remains unclear whether the different subfamilial relationships in salmoniforms resulted from paralogous comparisons of nuclear genes, such polyploidization is commonly found in fishes of the Cypriniformes and broadly observed across fish lineages from the Lepidosireniformes (lungfish) to the Perciformes (perches) (Comber and Smith 2004), and it is likely to complicate the phylogenetic analysis. Considering the supposed whole genome duplication event in a common ancestor of the extant teleosts (Sato and Nishida 2010), putatively orthologous genes from different species should be inspected more carefully before and after phylogenetic analysis.

The utility of mitogenome sequences in fish phylogenetics is not over in this nuclear genomics age (Li et al. 2008; Near et al. 2012, 2013; Wainwright et al. 2012; Betancur et al. 2013a; Broughton et al. 2013). The mitogenomic data will continue to contribute to the resolution of the higher-level relationships of fishes, such as those shown in the novel findings from recent studies (Miya et al. 2013; Song et al. 2014), and they will still be useful molecular markers as independent lines of evidence. We envisage that a more promising use of the mitogenomic data would be at the interface between species and populations. Whole mitogenome sequences (ca. 16,500 bp) should be able to provide more informative data than the partial sequences widely used in the fish barcode (655 bp from the COI gene, Ward et al. 2005). For example, COI sequences are unable to unequivocally distinguish the eight currently recognized species of tunas (Thunnus spp.) (Ward et al. 2005), but whole mitogenome sequences are demonstrably able to do so (Miya et al. 2013). In insects, Timmermans et al. (2010) even claim that the mitogenomes would be alternative molecular markers to delimit species boundaries by using next-generation sequencing (NGS) technologies, which produced 21 nearly complete and 7 partial sets of protein-coding mitochondrial genes from beetles in a single run. Mitogenome sequencing has also revealed the shallow evolutionary history of fishes, as seen in complete sampling from all the 18 species/subspecies of the freshwater eels (Minegishi et al. 2005), which eventually led to the discovery of a new species in conjunction with morphological analysis (Watanabe et al. 2009). More recently, NGS technologies made it possible to quickly and economically generate whole mitogenome sequences (Morin et al. 2010), and Jacobsen et al. (2012) successfully demonstrated recent divergences between morphologically and ecologically distinct European whitefish using 106 mitogenome sequences from the Coregonus species complex. Intensive taxonomic sampling from the interface between species and populations, together with the massive character sampling from mitogenome sequences using NGS technologies, would enable simultaneous attempts to delimit species boundaries (plus the resulting discoveries of cryptic species) and to reconstruct their evolutionary relationships at much finer resolution, eventually unraveling the fish part of the Tree of Life in a bottom-up manner with more accurate estimations of species diversity.
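
The difference in character sampling between the 655-bp COI barcode and a whole mitogenome (ca. 16,500 bp) can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not taken from any of the cited studies: it assumes a single uniform per-site divergence across the whole mitogenome (real genes and regions evolve at different rates), and the divergence value of 0.2% is purely hypothetical, chosen only to represent a recent split between closely related species.

```python
# Rough illustration only: assumes one uniform per-site divergence for every
# mitochondrial region, which is a simplification.

BARCODE_BP = 655          # COI barcode length cited in the text
MITOGENOME_BP = 16_500    # approximate whole mitogenome length cited in the text
ILLUSTRATIVE_DIVERGENCE = 0.002  # hypothetical 0.2% pairwise divergence


def expected_differences(length_bp: int, divergence: float) -> float:
    """Expected number of differing sites between two aligned sequences."""
    return length_bp * divergence


if __name__ == "__main__":
    for label, length in (("COI barcode", BARCODE_BP),
                          ("whole mitogenome", MITOGENOME_BP)):
        diffs = expected_differences(length, ILLUSTRATIVE_DIVERGENCE)
        print(f"{label}: {length} bp, about {diffs:.1f} expected differences")
    print(f"length ratio: about {MITOGENOME_BP / BARCODE_BP:.0f}-fold")
```

Under these assumptions, a pair of recently diverged species would be expected to differ at only about one site in the barcode region but at a few dozen sites across the whole mitogenome, which is one simple way to see why the longer sequences may resolve species that the barcode alone cannot.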