Introduction

The animal mitochondrial genome is a single, small, double-stranded circular DNA molecule (Attardi, 1985). It is typically about 16.5 kb in size and usually contains 13 protein-coding genes, 2 ribosomal RNAs (rRNAs; small 12 S and large 16 S), 22 transfer RNAs (tRNAs), and a major noncoding region in which mitochondrial replication and transcription are initiated (Boore, 1999). Because of its compact size, multiple copies per cell, rapid evolutionary rate, and lack of recombination, mitochondrial DNA (mtDNA) has been used extensively as a marker in evolutionary and population genetic studies (Curole and Kocher, 1999). Because of its maternal inheritance, mtDNA has some advantages over nuclear DNA markers in studies of population subdivision, gene flow, hybridization, and introgression (Jiggins et al., 1997; Chubb et al., 1998; McLean and Taylor, 2001; Stepien et al., 2001). Hence, the mitochondrial genome has received much attention in fisheries science. Many earlier studies of mtDNA concentrated on polymorphism in the control region (CR) or protein-coding genes detected by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) (e.g., Hiendleder, 1996). More recent studies have increasingly moved to direct sequencing of genes (Katsares et al., 2003), the CR (Lee et al., 1995), and complete mtDNA (Inoue et al., 2001; Machida et al., 2004).

Asian seabass (Lates calcarifer), called barramundi in Australia, is one of the nine Lates species of the family Centropomidae and is widely distributed in the coastal waters and fresh waters of the tropical Indo-West Pacific, from the Persian Gulf to India and Northern Australia (Nelson, 1994). Cultured seabass is a commercially important fish species and is considered to have high potential in the tropical Asia Pacific region (Chou and Lee, 1997). Whether sold live or freshly frozen, its large size and delicately flavored flesh command a premium price in the market. The fry is produced commercially in hatcheries and can be obtained throughout the year at reasonable prices because of its high stocking density and survival rate as well as the ease of managing the stock. Morphometric analysis indicated a slight difference between seabass from Australia and seabass from Southeast Asia. Although some populations of seabass have been studied using microsatellites (Yue et al., 2002), protein polymorphism, and partial mtDNA sequences (Chenoweth et al., 1998), the complete nucleotide sequence of the mitochondrial genome has not yet been determined.

In this study, we determined the complete mitochondrial genome (mtDNA) sequence of Lates calcarifer by shotgun sequencing of overlapping PCR products, and characterized single nucleotide polymorphisms (SNPs) in the CR by comparing sequences from 15 seabass individuals from Australia and 10 from Singapore. The information reported in this article may facilitate studies on population structure, genetic diversity, broodstock management, and phylogenetic reconstruction, and may supply useful new markers for identifying the origin of fish and for authenticating processed Asian seabass products.

Materials and Methods

Fish Sample and DNA Isolation.

A 25-day-old Asian seabass fingerling was obtained from the Marine Aquaculture Centre, Agri-Food and Veterinary Authority, Singapore, and the whole fingerling was immediately preserved in 100% ethanol. Total genomic DNA was extracted using a traditional phenol–chloroform method (Sambrook and Russell, 2001) and stored at −20°C.

PCR Amplification, Shotgun Sequencing, and Assembly of mtDNA.

Partial mtDNA sequences were first obtained using four universal primer pairs: L14841+H15149, L1091+H1478, L7450+H8055, and L9225+H9407 (Meyer, 1993). PCR was conducted on a PTC-100 thermal cycler (MJ Research, Waltham, MA, USA) using the following program: 94°C for 2 min, followed by 35 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 1 min, and a final extension at 72°C for 5 min. Each 25-μl PCR reaction contained 30 ng of DNA, 1 × PCR buffer (Finnzymes, Espoo, Finland), 200 nM of each primer, 50 μM of each dNTP, and 1 U of DNA polymerase (Finnzymes). Three of the four primer pairs (L14841+H15149, L1091+H1478, and L9225+H9407) yielded PCR products. The PCR products were cloned into a pGEM-T vector (Promega, San Luis Obispo, CA, USA) and sequenced using Bigdye v3.0 chemicals (Applied Biosystems, Foster City, CA, USA) as described by Yue et al. (2004). The sequences obtained were aligned using Sequencher (GeneCodes, Ann Arbor, MI, USA), and three specific primer pairs, LcamtA1+B1, LcamtA2+B2, and LcamtA3+B3 (for sequences, see Appendix 1), were designed using PrimerSelect (DNASTAR, Madison, WI, USA). The expected lengths of the three PCR products were approximately 2.5 kb, 7.5 kb, and 7.5 kb, respectively. Long-distance PCR was carried out using the Expand Long Template PCR system (Roche, Basel, Switzerland) on a PTC-100 thermal cycler (MJ Research). Each 50-μl reaction contained 200 μM of dNTPs, 1 × buffer 1 with 1.75 mM MgCl2, 200 nM of each primer, 3.75 U of Taq Polymerase Enzyme Mix (Roche), and 50 ng of total DNA. The thermal cycling profile was pre-denaturation at 94°C for 2 min; 8 cycles of 94°C for 10 s, 60°C for 30 s, and 68°C for 6 min; 19 cycles of 94°C for 10 s, 60°C for 30 s, and 68°C for 6 min with the extension time increased by 20 s per cycle; and a final extension at 68°C for 10 min.
Long-distance PCR products were electrophoresed on a 1% agarose gel (Bio-Rad, Hercules, CA, USA) and later stained with ethidium bromide for visualization via ultraviolet transillumination.

Long-distance PCR products of 2.5 kb, 7.5 kb, and 7.5 kb were sonicated using the Branson Digital Sonifier 450 (Labequip, Ontario, Canada) at 20% amplitude for 4 s, 6 s, and 6 s, respectively. The sonicated DNA was separated on a 2% agarose gel, and fragments between 0.5 and 1.5 kb were excised, concentrated, and purified using the GFX PCR DNA and Gel Band Purification Kit (Amersham, Piscataway, NJ, USA) according to the manufacturer's recommendations. The sonicated long-distance PCR products were treated with the DNATerminator End Repair Kit (Lucigen, Middleton, WI, USA) according to the manufacturer's protocol: the treatment was carried out at room temperature for 30 min in a 50-μl reaction volume containing 18 μl of sterile distilled water, 10 μl of 5 × end repair buffer, 2 μl of end repair enzymes (Lucigen), and 20 μl (approximately 500 ng) of sonicated DNA template. The treatment was terminated by incubation at 70°C for 10 min. The treated PCR products were purified and concentrated with the GFX PCR DNA and Gel Band Purification Kit (Amersham) and ligated into pBluescript II KS (−) (Stratagene, La Jolla, CA, USA) with T4 DNA Ligase (Stratagene). The ligation was performed at room temperature for approximately 2 to 3 hours in an 11.1-μl reaction volume containing 1.1 μl of 10 × ligation buffer, 1 μl of rATP (10 mM), 0.5 μl of pBluescript II KS (−) (Stratagene) containing approximately 25 ng of plasmid, 1 μl of T4 DNA ligase (4 U/μl), and 7.5 μl (approximately 50 ng) of treated DNA template.
The ligated DNA was cloned into XL-1 Blue competent cells (Stratagene) and cultured on Luria Bertani (LB) plates containing 100 μg/L of ampicillin, 200 μg/μl of X-galactoside, and 200 μl of 0.5 M isopropyl-β-D-thiogalactopyranoside (IPTG). White colonies were picked and cultured in LB medium containing 125 μg/μl of ampicillin overnight at 37°C, with shaking at 250 rpm. Colony PCR was performed on a PTC-100 thermal cycler (MJ Research) as described by Yue et al. (2000). Sequencing of the PCR products was carried out using M13/M13 reverse primers, Bigdye v3.0 (Applied Biosystems), and the DNA sequencer ABI3730 xl (Applied Biosystems) according to the manufacturer's recommendations.

The DNA sequences were analyzed using EditSeq (DNASTAR). Flanking vector sequences were removed manually, or automatically using Sequencher (GeneCodes) followed by manual correction. Insert sequences were assembled using the same software. Following the assembly, tRNA genes were identified using the method described by Lowe and Eddy (1997). The locations of the 13 protein-coding genes were identified by comparison with the DNA or amino acid sequences of the mitochondrial genomes of other bony fish, whereas the two rRNAs were determined by sequence homology and secondary structure as described by Johansen and Bakke (1996). The 5′ ends of protein genes were inferred to be located at the first legitimate in-frame start codon (ATN, GTG, TTG, and GTT) that did not overlap with the preceding gene, except that overlap with an upstream tRNA gene was limited to the 3′-most nucleotide of the tRNA. Protein gene termini were inferred to be located at the first in-frame stop codon unless that codon was located within the sequence of a downstream gene. In that case, a truncated stop codon (T or TA) adjacent to the beginning of the downstream gene was designated as the termination codon and was assumed to be completed by polyadenylation after transcript cleavage.
Base composition and codon usage were analyzed by using EditSeq and GeneQuest (DNASTAR). The complete mtDNA sequence of Lates calcarifer was deposited in GenBank under Accession No. DQ010541.
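The rule for placing protein gene termini can be illustrated with a minimal Python sketch. It is not the software used in this study (annotation was done with the programs named above); the function name `infer_gene_end` and the toy sequences are hypothetical. As a simplifying assumption, only the two complete stop codons reported in this article (TAA and TAG) are recognized, and genes that genuinely overlap a downstream gene are not handled.

```python
# Illustrative sketch only; names and test sequences are hypothetical.
# Assumption: only TAA and TAG are treated as complete stop codons.
COMPLETE_STOPS = {"TAA", "TAG"}


def infer_gene_end(genome, start, downstream_start):
    """Infer the 3' end of a protein gene that begins at index `start`.

    `downstream_start` is the index of the first base of the next gene.
    Returns (end_index_exclusive, stop_signal), or None if neither a
    complete nor a truncated stop can be placed.
    """
    pos = start
    # Walk codon by codon, accepting only codons that lie entirely
    # upstream of the next gene.
    while pos + 3 <= downstream_start:
        codon = genome[pos:pos + 3]
        if codon in COMPLETE_STOPS:
            return pos + 3, codon
        pos += 3
    # No complete in-frame stop: look for a truncated stop (T or TA)
    # directly adjacent to the downstream gene.
    leftover = genome[pos:downstream_start]
    if leftover in ("T", "TA"):
        return downstream_start, leftover
    return None


if __name__ == "__main__":
    print(infer_gene_end("ATGAAATAAGGG", 0, 12))  # complete stop
    print(infer_gene_end("ATGAAACCCTGCCC", 0, 10))  # truncated stop
```

A truncated T or TA returned by this sketch corresponds to a stop signal that is assumed to be completed to UAA by polyadenylation of the transcript.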

SNP Detection and Characterization in the CR.

The complete CR of 15 Australian seabass and 10 Singapore seabass individuals was amplified by PCR using the primer pair Dloop-A1 and Dloop-B1 (for sequences, see Appendix 1), located in the tRNAThr and 12 S rRNA regions flanking the whole CR. PCR was conducted in a 25-μl reaction volume containing 40 ng of total DNA, 200 nM of each primer, 200 μM dNTPs, 1 U of Taq DNA polymerase (Finnzymes), and 1 × buffer with 1.5 mM MgCl2 on a PTC-100 (MJ Research). The PCR program was 94°C for 2 min, then 34 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 48 s, followed by a final extension at 72°C for 5 min. The PCR products were separated on 1% agarose gels, and the strong band from each sample was excised and cleaned using glassmilk as described by Yue and Orban (2001). The cleaned PCR products were directly sequenced using the two primers Dloop-A1 and Dloop-B1 and Bigdye v3.0 chemicals (Applied Biosystems) as described earlier. Forward and reverse sequences were assembled with Sequencher (Gene Codes) and checked manually by two people independently. Sequences of the 25 individuals were aligned using Clustal_X 1.85 (Thompson et al., 1997) and then analyzed with MEGA (Kumar et al., 2001). The CR nucleotide sequences of the 25 fish have been deposited in GenBank under Accession Nos. DQ012409–DQ012433. Phylogenetic trees (NJ, ME, MP, and UPGMA) were constructed on the basis of the complete CR nucleotide sequences using MEGA (Kumar et al., 2001). Conserved sequences were detected by comparing seabass CR sequences with those of other fish species (Lee et al., 1995).

Construction of Phylogenetic Trees.

To determine the evolutionary position of the Asian seabass, the mtDNA sequences of 30 fish species belonging to 14 suborders (for details, see the legend to Figure 3) were selected and used. The analyses were based on concatenated sequences of the 12 heavy-strand-encoded protein-coding genes. The sequence of ND6, which is encoded by the light strand, was excluded from the analyses because the nucleotide and amino acid composition of this gene deviates from that of the genes encoded by the heavy strand. Amino acid and nucleotide sequences of the 12 protein-coding genes were aligned with Clustal_X 1.85 (Thompson et al., 1997). The data set was analyzed with maximum parsimony (MP), neighbor-joining (NJ), unweighted pair group method with arithmetic mean (UPGMA), and minimum evolution (ME) as implemented in the program MEGA (Kumar et al., 2001).
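The concatenation step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually run in this study: the gene labels in `H_STRAND_GENES` follow common usage, the function name `concatenate_alignments` and the input layout (one already-aligned sequence per species per gene) are hypothetical, and ND6 is excluded simply by leaving it out of the gene list.

```python
# Illustrative sketch only; gene labels and data layout are assumptions.
H_STRAND_GENES = ["ND1", "ND2", "COX1", "COX2", "ATP8", "ATP6",
                  "COX3", "ND3", "ND4L", "ND4", "ND5", "CYTB"]


def concatenate_alignments(per_gene, genes=H_STRAND_GENES):
    """Join per-gene alignments into one sequence per species.

    per_gene: {gene_name: {species_name: aligned_sequence}}
    Only species present for every requested gene are kept, and each
    gene alignment must have a single common length.
    """
    species = sorted(set.intersection(*(set(per_gene[g]) for g in genes)))
    if not species:
        raise ValueError("no species is present in every gene alignment")
    for g in genes:
        lengths = {len(per_gene[g][s]) for s in species}
        if len(lengths) != 1:
            raise ValueError(f"{g}: sequences are not aligned to equal length")
    # Genes are joined in a fixed order so that columns stay comparable.
    return {s: "".join(per_gene[g][s] for g in genes) for s in species}
```

Any gene present in the input but absent from the gene list, such as ND6, is ignored, so the same input can be reused if a different gene set is wanted.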

Results and Discussion

Long-Distance PCR and Shotgun Sequencing.

The long-distance PCR method described here was used to obtain the template for sequencing Lates calcarifer mtDNA. The whole mtDNA was amplified by long-distance PCR with 3 pairs of specific primers. In comparison with the traditional method for sequencing mtDNA (i.e., extraction of mtDNA from total DNA; e.g., Arnason and Johnsson, 1992), long-distance PCR is quicker and simpler. Before specific primer pairs can be designed, however, the sequences of some conserved regions of the mtDNA must be determined. By sequencing 96 clones with inserts between 0.5 and 1.5 kb in both directions, the whole mtDNA was covered about 6 times without any gaps. Using the ABI3730 xl sequencer, the 96 clones could be sequenced within 2 hours. Long-distance PCR combined with shotgun sequencing is therefore a simple and quick approach to obtaining the complete sequence of mtDNAs.
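The roughly 6-fold coverage follows from simple arithmetic, sketched below in Python. The clone count (96), the two reads per clone, and the genome length (16,535 bp, reported in the next section) come from this article; the mean usable read length of 520 bp is an assumption chosen for illustration, since it is not stated here, and the function names are hypothetical.

```python
# Illustrative sketch only; the 520-bp mean read length is an assumption.
def fold_coverage(n_clones, reads_per_clone, mean_read_len_bp, genome_len_bp):
    """Average number of times each base is sequenced."""
    return n_clones * reads_per_clone * mean_read_len_bp / genome_len_bp


def read_length_for_coverage(target, n_clones, reads_per_clone, genome_len_bp):
    """Mean read length (bp) needed to reach a target fold coverage."""
    return target * genome_len_bp / (n_clones * reads_per_clone)


if __name__ == "__main__":
    print(round(fold_coverage(96, 2, 520, 16535), 2))
    print(round(read_length_for_coverage(6, 96, 2, 16535), 1))
```

Under these assumptions, 192 reads averaging a little over 500 bp are enough to cover the genome about 6 times, which is consistent with the figure reported above.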

Genome Content and Base Composition.

The gene content of the Lates calcarifer mitochondrial genome was consistent with that of other vertebrates [for a review, see Boore (1999)], comprising 2 rRNA genes, 22 tRNA genes, 13 protein-coding genes, and a noncoding CR. The total length of the Asian seabass mitochondrial genome was 16,535 bp (Table 1). The G+C content (46.1%) of the mtDNA of Lates calcarifer was higher than that of other fish species, such as Takifugu rubripes (44.3%) (Elmerot et al., 2002) and zebrafish (39.9%) (Broughton et al., 2001), and that of the crustacean Daphnia pulex (37.7%) (Crease, 1999). It was also higher than that of invertebrate mtDNAs (G+C content ranges: chordates, 44.4% to 37.8%; echinoderms, 41.1% to 38.7%; annelids, 38.4%; cnidarians, 37.5% to 35.5%) (Noguchi et al., 2000), but similar to that of chicken (46.04%) (Nishibori et al., 2003) and Japanese quail (44.52%) (Nishibori et al., 2001). The high G+C content may be associated with the higher temperatures of tropical waters. In bacteria, previous studies showed a strong positive correlation between the G+C content of stem regions of rRNA and environmental growth temperature (Dalgaard and Garrett, 1993; Galtier and Lobry, 1997). A more recent study demonstrated that the relationship between the nucleotide content of structural RNAs and environmental growth temperature was not due entirely to phylogenetic history, but reflected a repeated selective response to elevated environmental temperature (Wang and Hickey, 2002).
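For reference, G+C content is the percentage of G and C among the unambiguous bases of a sequence. The values in this study were obtained with EditSeq and GeneQuest; the short Python function below is only a sketch of the same calculation, with the function name `gc_content` chosen for illustration.

```python
# Illustrative sketch only; not the software used in this study.
def gc_content(seq):
    """Return G+C content in percent, ignoring gaps and ambiguous bases."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous bases in sequence")
    return 100.0 * (counts["G"] + counts["C"]) / total
```

Applied to a complete mtDNA sequence read from a sequence file, this would give the whole-genome value; applied to the concatenated protein-coding genes or to the CR alone, it would give the region-specific values discussed in later sections.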

Table 1 Location of features in the mitochondrial genome of Lates calcarifer.

Gene Order.

Although rearrangements of gene order in mtDNA have been seen in several taxa, such as Balanoglossus carnosus (Castresana et al., 1998) and Branchiostoma lanceolatum (Spruyt et al., 1998; Boore et al., 1999), the gene order of the Asian seabass was identical to that of typical vertebrates. The 768-bp noncoding CR was located between the tRNA genes for proline (tRNAPro) and phenylalanine (tRNAPhe), and the origin of light-strand replication (OL; 34 bp) was located within a cluster of 5 tRNA genes (WANCY). Small intergenic sequences of 1 to 8 nucleotides were present between some genes (Table 1).

Protein-Coding Genes.

As is typical of bony fishes (Boore, 1999), all protein-coding genes had a methionine (ATG) start codon, with the exception of COX1 and ND4L, which started with GTG (Table 1). The Asian seabass had 1 TAG and 7 TAA stop codons, and the remaining reading frames had an incomplete termination codon of either T or TA (Table 1). For genes with an incomplete stop codon, the transcripts would be modified to form the complete termination signal UAA via post-transcriptional polyadenylation. The overlapping reading frames ATP8–ATP6, ND4L–ND4, and ND5–ND6 shared 10, 7, and 4 nucleotides, respectively, which is similar to the situation in Atlantic cod (Johansen and Bakke, 1996), floating goby (Kim et al., 2004), and walleye pollock (Yanagimoto et al., 2004). The G+C content of the protein-coding genes (47%) was higher than that in zebrafish (39%) (Broughton et al., 2001), Antarctic krill, Euphausia superba (32.2%) (Machida et al., 2004), and Takifugu rubripes (45.3%) (Elmerot et al., 2002). The high G+C content in protein-coding genes might reflect a selective response to high environmental temperature.

tRNA Genes and rRNA Genes.

The Lates calcarifer mitochondrial genome contained 22 tRNA genes interspersed between the rRNA and protein-coding genes, which is typical of mtDNAs of other vertebrates (Boore, 1999). The size of tRNA genes ranged from 66 to 75 nucleotides (Table 1), most of which were large enough for the encoded tRNAs to fold into cloverleaf secondary structures. The 12 S and 16 S rRNA genes were 965 and 1,754 nucleotides long, respectively. Their locations in the genome were identical to those of other teleosts (Elmerot et al., 2002), separated by the tRNAVal gene and positioned between the genes of tRNAPhe and tRNALeu.

Noncoding Sequence.

As is typical, the light-strand replication origin (OL) was positioned in a cluster of 5 tRNA genes known as the WANCY region, between the tRNAAsn and tRNACys genes. In Asian seabass, the OL comprised 34 nucleotides and, interestingly, the characteristic sequence motif 5′-GCCGG-3′ at the base of the stem within tRNACys (Elmerot et al., 2002) was changed to GGCGG. The major noncoding CR in the Asian seabass mitochondrial genome was located between the tRNAPro and tRNAPhe genes. This region was 768 nucleotides long, shorter than that of other bony fishes (Lee et al., 1995), but similar to that of Emmelichthys struhsakeri (838 bp) (Miya et al., 2003), which belongs to the order Perciformes. The CR had an average A+T content of 64.32%. It had several conserved blocks: TAS, CSB-1, CSB-2, CSB-3, and CSB-D (Figure 1). An AT-repeat sequence was located at the 5′ end of the CR, similar to that of the common carp (Chang et al., 1994). The tandemly repeated sequences present in the CR of some teleosts (Lee et al., 1995) were not observed in the Asian seabass.

Fig. 1

The complete nucleotide sequence of the mtDNA control region of Lates calcarifer. CSB-D: conserved sequence block D; CSB-1, -2, -3: conserved sequence blocks 1, 2, and 3; TAS: termination-associated sequence.

SNP Characterization of the CR and Evolutionary Relationship of Australian and Asian Seabass.

The length of the CR of the 25 individual fish ranged from 767 to 772 bp. The length difference among individuals was caused mainly by length polymorphism of the AT-repeat sequence located at the 5′ end of the CR. A total of 68 SNPs were detected in the CR among the 25 seabass individuals from Australia and Singapore. Most SNPs were found in the first part of the 5′ end (1–300 bp: 39 SNPs) and in the region between 500 and 600 bp (16 SNPs), whereas only 7 and 9 SNPs were found in the regions between 301 and 499 bp and between 601 and 768 bp, respectively (for details, see Appendix 2), suggesting that the 5′ end of the CR of Asian seabass is the most polymorphic. Interspecies comparison of CR sequences has detected the same trend in other fish species (Lee and Kocher, 1995). At 18 positions, the SNPs were fixed for alternative nucleotides in the two populations; they can therefore be used to determine the origin of individuals, as well as to authenticate processed Asian seabass products. Among the 25 seabass individuals, a total of 22 haplotypes were detected. The seabass from Singapore were more diverse: each of the 10 fish represented a distinct haplotype, whereas the 15 seabass from Australia represented only 12 haplotypes. The number of AT repeats at the 5′ end of the CR ranged from 4 to 6 among the 25 fish.

Among the 68 variable sites, 65 were parsimony-informative. Based on the nucleotide sequences of the CR, 4 phylogenetic trees (NJ, ME, MP, and UPGMA) were constructed. All 4 trees showed the same genetic relationship among the 25 individuals. An NJ tree is shown in Figure 2. The seabass individuals from Australia and Singapore clustered into two distinct branches with strong bootstrap support (100%), suggesting distinct genetic differences between seabass from the two countries, which also corresponds to their phenotypic differences.
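The site categories used in this section (variable sites, parsimony-informative sites, sites fixed for alternative nucleotides in the two populations, and haplotypes) can be made concrete with a short Python sketch. The analyses in this study were carried out with MEGA; the functions below are hypothetical illustrations, the toy sequences are invented and are not the seabass data, and positions are reported as 0-based column indices of an existing alignment.

```python
# Illustrative sketch only; toy sequences are invented, not seabass data.
from collections import Counter


def column_states(alignment, i):
    """Count nucleotides in column i, ignoring gaps and ambiguous bases."""
    return Counter(s[i] for s in alignment if s[i] in "ACGT")


def variable_sites(alignment):
    """Columns in which more than one nucleotide is observed."""
    return [i for i in range(len(alignment[0]))
            if len(column_states(alignment, i)) > 1]


def parsimony_informative_sites(alignment):
    """Columns with at least two states, each seen in two or more sequences."""
    sites = []
    for i in range(len(alignment[0])):
        shared = [n for n, c in column_states(alignment, i).items() if c >= 2]
        if len(shared) >= 2:
            sites.append(i)
    return sites


def fixed_differences(pop_a, pop_b):
    """Columns where each population is monomorphic but the two differ."""
    sites = []
    for i in range(len(pop_a[0])):
        a, b = column_states(pop_a, i), column_states(pop_b, i)
        if len(a) == 1 and len(b) == 1 and set(a) != set(b):
            sites.append(i)
    return sites


def count_haplotypes(alignment):
    """Number of distinct aligned sequences (gaps count as differences)."""
    return len(set(alignment))


# Two toy populations of three aligned sequences each.
australia = ["ACGTAC", "ACGTAT", "ACGTAC"]
singapore = ["GCGAAC", "GCGTAC", "GCGAAC"]
pooled = australia + singapore
```

In this toy example, only the first column would qualify as a diagnostic marker of origin, because it is the only column in which both populations are monomorphic for different nucleotides.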

Fig. 2

An NJ tree generated with MEGA for 15 seabass individuals from Australia and 10 seabass individuals from Singapore, based on the complete nucleotide sequence of the mtDNA control region. The scale bar is shown under the tree, and the bootstrap proportions from 1,000 replicates are indicated on each branch. BA: individuals from Australia; PPD: individuals from Singapore. GenBank Accession Nos. of the control region sequences: DQ012409–DQ012433.

Evolutionary Position of Lates calcarifer.

Phylogenetic analyses were used to examine the evolutionary position of Lates calcarifer. The trees included 30 fish species representing a total of 14 suborders. Trees constructed using 4 different methods showed a similar topology. Figure 3 shows a neighbor-joining (NJ) tree based on phylogenetic analysis of the concatenated amino acid sequences of 12 mitochondrial protein-coding genes. The tree was rooted with 4 ancient fish species: Raja radiata, Polypterus ornatipinnis, Squalus acanthias, and Mustelus manazo. In agreement with the traditional systematic classification (Nelson, 1994), most fish orders and higher-level systematic relationships were reconstructed as monophyletic. The tree constructed in this study is quite similar to the one reported by Elmerot et al. (2002). Lates calcarifer was located in the cluster of fish species from the order Perciformes, supporting the traditional taxonomic classification. However, the branch lengths in the Perciformes cluster were much longer than those in the Salmoniformes cluster, suggesting that the divergence time among the species in Perciformes is longer than that among the species in Salmoniformes. Because, according to Nelson (1994), the order Perciformes comprises more than 2,815 fish species, it is necessary to sequence the mtDNAs of more fish species to resolve the fine evolutionary relationships among those species.

Fig. 3

An NJ tree reconstructed with MEGA, showing the evolutionary position of Lates calcarifer among 30 fish species belonging to 14 suborders. The tree is based on the concatenated amino acid sequences of 12 protein-coding genes. The scale bar is shown under the tree, and bootstrap values (>50%) from 1,000 replicates are shown. The analyses included the following 30 species: Pterocaesio tile (NC_004408), Pagrus auriga (NC_005146), Etheostoma radiosum (NC_005254), Scomber scombrus (NC_006398), Chlorurus sordidus (NC_006355), Acanthogobius hasta (NC_006131), Takifugu rubripes (AJ421455), Stephanolepis cirrhifer (NC_003177), Gadus morhua (X99772), Hime japonica (AB047821), Saurida undosquamis (NC_003162), Oncorhynchus mykiss (L29771), Salmo salar (U12143), Salvelinus alpinus (AF154851), Salvelinus fontinalis (AF154850), Sardinops melanostictus (AB032554), Ictalurus punctatus (AF482987), Cyprinus carpio (X61010), Danio rerio (AC024175), Conger myriaster (AB038381), Anguilla japonica (AB038556), Hiodon alosoides (AP004356), Pantodon buchholzi (AB043068), Scleropages formosus (DQ023143), Osteoglossum bicirrhosum (AB043025), Mustelus manazo (AB015962), Squalus acanthias (Y18134), Raja radiata (AF106038), Polypterus ornatipinnis (U62532), and Lates calcarifer (DQ010541).