Introduction

At least 19 species of fish have populations in Mexico’s Gulf of California (Sea of Cortez) that appear to be partially or fully disjunct from their respective Pacific coast populations (Present 1987; Bernardi et al. 2003). These “Baja California disjunct” species fall into 14 families and display a wide range of life history characteristics. The Baja California region therefore serves as an excellent natural laboratory for exploring connections between a marine species’ life history characteristics and its evolutionary response to vicariance events (see Jacobs et al. 2004 and Dawson et al. 2006 for discussion of taxon distributions and the geologic history of the Gulf of California). The degree of genetic subdivision exhibited across the Baja California peninsula in these species ranges from undetectable in some species to very strong in others (Crabtree 1983; Orton and Buth 1984; Present 1987; Waples 1987; Tranah and Allen 1999; Terry et al. 2000; Stepien et al. 2001; Huang and Bernardi 2001; Bernardi et al. 2003; Bernardi and Lape 2005).

Traditionally, the degree of genetic subdivision in marine species has been thought to relate to a species’ ability to disperse via larval exchange (Palumbi 1992, 1994). Bernardi et al. (2003) searched for correlations between the pelagic larval duration (PLD) and genetic structure exhibited by Baja California disjunct species and found no evidence for a relationship between PLD and genetic subdivision. The authors instead suggested that adult migration and habitat availability in the Gulf of California may be more important factors in determining the level of subdivision in Baja California disjunct species. This hypothesis is supported by studies finding evidence for larval retention in the Gulf of California (Brogan 1994). In addition, the northwestern Gulf of California includes primarily soft-bottom, estuarine adult habitats that are mostly absent from the rocky, southern tip of Baja California (Walker 1960). However, few studies exist to provide detailed information regarding the migratory abilities and other life history characteristics of most Baja California disjunct fishes. The diamond turbot (Hypsopsetta guttulata) is a Baja California disjunct species for which detailed life history information is available but which has not yet been characterized genetically.

Hypsopsetta guttulata (see Cooper and Chapleau 1998 and Evseenko 2003 for a discussion of nomenclatural history) is an estuarine flatfish ranging from Cape Mendocino to Bahía Magdalena on the Pacific coast of North America, and from Bahía Concepción to Guaymas in the Gulf of California (Fig. 1) (Miller and Lea 1972; Lane 1975; Present 1987). Tagging studies and otolith trace element fingerprints suggest that individuals of the species exhibit low overall adult migration between estuaries (Lane 1975; Swearer et al. 2003). Hypsopsetta guttulata settles at the smallest size of any near-shore California flatfish (Ahlstrom et al. 1984) and exhibits a moderate PLD of 5–6 weeks (Gadomski and Peterson 1988). If larval duration were the main factor determining gene flow, diamond turbot would have the potential for substantial migration between the Gulf of California and the Pacific populations. However, if adult habitat specificity effectively limits migration, gene flow between the Gulf of California and the Pacific Ocean may be greatly reduced.

Fig. 1

Collecting sites for H. guttulata, including site abbreviations for each locality. Additional relevant geographic points are labeled in smaller text. Previously described range of H. guttulata (Present 1987) is shown as gray line parallel to the Pacific and Gulf of California coasts

The goal of this study is to estimate the genetic differentiation between Pacific and Gulf of California populations of H. guttulata, using mitochondrial and nuclear sequence data and external morphology. We hypothesized that although the diamond turbot has the potential for gene flow via pelagic larvae, the limited adult migratory ability of H. guttulata due to its restriction to soft-bottom habitats should limit gene flow and result in genetic divergence between Gulf of California and Pacific populations.

Materials and methods

Sample collection, morphometrics, and DNA extraction

We collected H. guttulata in the field using a beach seine. In total, we obtained 77 specimens from 9 sites along H. guttulata’s Pacific range and 20 specimens from 2 sites in the Gulf of California (Fig. 1; Table 1a). Note that the Yavaros sample in the Gulf of California represents a southward extension of the published range. Individual specimens of Pleuronichthys verticalis, P. ocellatus, and Paralichthys californicus were sampled as outgroups. Samples were stored in 95% ethanol and/or frozen at −20°C after collection (one H. guttulata specimen, G1, was freeze-dried for preservation). We recorded external morphological characteristics by counting dorsal and anal fin rays and measuring distances between external landmarks (Fig. 2). We augmented our sample sizes for morphological analysis by measuring additional specimens from the Scripps Institution of Oceanography Marine Vertebrate Museum (SIO_Samples). This brought our total number of specimens for morphological analysis to 50 Gulf of California samples and 90 Pacific samples. We extracted genomic DNA from muscle tissue of each non-museum specimen using standard silica capture extraction protocols.

Table 1 Genetic diversity for diamond turbot genes
Fig. 2

Morphological measurements included body length (BL), tail length (TL), body depth (BD), head length (HL), maxilla length (ML), pectoral fin length (PL), number of anal rays (AR), and number of dorsal rays (DR)

PCR amplification and sequencing

We designed primers, JSCRF: 5′-CCCTAACTCCCAAAGCTAGGATTCTAG-3′ and JSCRR: 5′-GGCCCATCTTAACATCTTCAGTGT-3′, to amplify the mitochondrial control region in H. guttulata. PCR products were sequenced using the forward primer, JSCRF, or an internal reverse primer, JSSeqCRR: 5′-CCTTACCCGCTGGAGTGAAYG-3′. Every unique haplotype was sequenced in both directions.

We used the universal primers, S7RPEX1F and S7RPEX2R (Chow and Hazama 1998), to amplify the first intron of the low-copy, ribosomal-protein-coding nuclear gene, S7 (S7 1). In order to confirm preliminary genetic patterns observed in S7 1, we also sequenced the second intron of S7 (S7 2) for 22 Pacific and 20 Gulf of California individuals (Table 1). We initially used the universal primers, S7RPEX2F and S7RPEX3R (Chow and Hazama 1998), to amplify S7 2. Due to inconsistent PCR results, we designed new primers based on preliminary sequences: JSS72F: 5′-CATCTCCAGCTCGAGCAGAG-3′ and JSS72R: 5′-AAAGCCAGACGAGTTTGAGTCT-3′. We sequenced S7 1 products using S7RPEX1F and an internal primer, JSS7SeqR: 5′- CTGGACGCCTGAATGTC-3′. We sequenced S7 2 products using JSS72F and JSS72R. Unique haplotypes were sequenced in both directions. When both S7 1 and S7 2 sequences were available for a specimen, we concatenated the sequences prior to analysis. PCR and sequencing protocols are available from the authors.

We used CHB v. 1.0 (Niu et al. 2001; Zhang et al. 2006) and PHASE v. 2.1 (Stephens et al. 2001; Stephens and Scheet 2005) to infer the gametic phases of nuclear haplotypes. We performed initial runs in each program using default parameter values then performed additional runs with different random number seeds and a greater number of preliminary burn-in steps and post-burn-in iterations (respectively, 350/175 for PHASE and 1000/700 for CHB). We compared the results of these runs for consistency between programs, and between runs within a program. Haplotypes whose gametic phases could not be consistently determined with greater than an 85% posterior probability were cloned using the TOPO TA cloning kit for sequencing (Invitrogen Corporation). We screened 4–9 colonies from each product in PCRs using M13 vector primers. We cleaned and sequenced those products as indicated earlier.

Morphological analyses

We analyzed morphological data in SPSS (v. 11.5.0) using direct logistic regression analysis with Gulf of California/Pacific origin as outcome and morphological observations as predictors. Logistic analysis permits the inclusion of data that are not normally distributed. Prior to analysis of measurements between landmarks, we performed a least squares regression between measurements taken in different geometric planes (i.e., between BL and BD as opposed to BL and HL, see Fig. 2). BL, TL, HL, PL, and ML were regressed against BD. PL and ML were also regressed against BL (Fig. 2). In order to control for allometry, we used the residuals from these regressions for further analysis (Santos et al. 2006). We initially performed the logistic regression with all variables entering the model together in the first step. We also searched for the most efficient model by using stepwise addition of predictor variables based on the likelihood ratio test. We raised the probability for stepwise entry to 0.2 to help ensure that all variables with coefficients different from 0 entered the model (Tabachnick and Fidell 2007). A 2 × 2 classification table was used to examine the success of the resulting model in assigning individuals to region of origin.
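The size-correction step can be illustrated with a minimal sketch. It assumes a simple least squares regression of one measurement on another and keeps the residuals as the size-corrected variable. The function name `ols_residuals` and the measurement values are hypothetical and are not taken from the SPSS analysis or the study data.

```python
from statistics import mean


def ols_residuals(x, y):
    """Return residuals of y after a simple least squares regression on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance; slope is undefined")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    # Residual = observed value minus the value predicted from body size.
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical measurements in mm, for illustration only.
body_depth = [20.0, 25.0, 30.0, 35.0, 40.0]
head_length = [10.5, 12.0, 15.5, 17.0, 20.0]
size_corrected_hl = ols_residuals(body_depth, head_length)
print(size_corrected_hl)
```

Residuals obtained this way sum to zero and are uncorrelated with the size variable, so they can serve as predictors that no longer track overall body size.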

Estimation of phylogenetic trees

Phylogenies were estimated separately for control region sequences and S7 sequences, using the following procedures: sequences were aligned automatically in Sequencher (v. 3.1.1), and manually adjusted to ensure parsimonious alignment around gaps. We used PAUP 4.0b10 (Swofford 2003) to estimate maximum likelihood trees. The BIC in Modeltest 3.7 (Posada and Crandall 1998) was used to determine the least parameterized substitution model. In PAUP, we initially created a neighbor-joining tree and estimated model parameters on this topology using maximum likelihood and the substitution model selected using Modeltest. We then fixed these parameter values and performed a maximum likelihood heuristic topology search. The new topology was then used to re-estimate the model parameters, and this process was repeated until model parameters and log likelihood scores stabilized. Maximum likelihood bootstrapping (100 replicates) was performed using the final estimated model parameters. We used these same parameters for Bayesian tree analysis in MrBayes (Huelsenbeck and Ronquist 2001; Ronquist and Huelsenbeck 2003). We ran 4 chains for 2,000,000 generations, sampling every 500 generations, and monitored the average deviation of split frequencies between independent runs for convergence. In calculating posterior probabilities, we used only the trees from generations after the average deviation of split frequencies between independent runs dropped below 0.05. All previous trees were deleted as burn-in. We calculated posterior probabilities by creating a 50% majority rule consensus tree in PAUP from all post-burn-in trees.
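The logic of discarding burn-in trees and computing majority rule support can be sketched as follows. This is an illustration only, not the PAUP or MrBayes implementation: each sampled tree is reduced to the set of clades it contains, and the taxon labels, the sampled trees, and the function names are all hypothetical.

```python
from collections import Counter


def clade_support(sampled_trees, burn_in):
    """Fraction of post-burn-in trees that contain each clade.

    Each tree is a set of clades; each clade is a frozenset of taxon labels.
    """
    kept = sampled_trees[burn_in:]
    if not kept:
        raise ValueError("no trees remain after discarding burn-in")
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}


def majority_rule(support, threshold=0.5):
    """Keep clades found in more than `threshold` of the retained trees."""
    return {clade: p for clade, p in support.items() if p > threshold}


def clade(*taxa):
    return frozenset(taxa)


# Hypothetical posterior sample; the first tree is treated as burn-in.
trees = [
    {clade("A", "C")},
    {clade("A", "B"), clade("A", "B", "C")},
    {clade("A", "B"), clade("A", "B", "C")},
    {clade("B", "C"), clade("A", "B", "C")},
    {clade("A", "B"), clade("D", "E")},
]
support = clade_support(trees, burn_in=1)
consensus = majority_rule(support)
print(consensus)
```

In this toy sample, the clades (A, B) and (A, B, C) each appear in 3 of the 4 retained trees and are kept, while clades seen only once fall below the 50% threshold.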

Population genetic analysis

To estimate H. guttulata population structure across the Baja peninsula and along its Pacific range, we estimated Fst, Φst (between populations), Φct (between regions) and AMOVA statistics with Arlequin v.3.01 (Schneider et al. 2000) using the TrN+G model with the gamma distribution shape parameter set to the values estimated for each dataset in PAUP.

When high levels of haplotype diversity were encountered, we created haplotype networks using TCS v.1.21 (Clement et al. 2000). (Links to certain control region haplotypes could not be established in haplotype networks due to their unusually high sequence divergence from the other haplotypes. Those haplotypes were treated as separate clusters.) Closed loops in the networks were broken by preserving evolutionary relationships between haplotypes suggested by our maximum likelihood trees. Haplotypes were clustered as in Nested Clade Analysis (Templeton et al. 1995), and Arlequin analyses were repeated at various nesting levels to examine the population subdivision for related clusters of alleles.

We used LAMARC v.2.1 (Kuhner 2006) to estimate the population parameter Θ and the number of migrants per generation between populations. We employed Bayesian searches on combined control region and S7 1 data, since recent studies suggested these strategies produce more precise, and potentially more accurate, estimates (Beerli 2006; Kuhner and Smith 2007). Only populations with greater than five individuals (O, L, U, C, B, Y, Fig. 1) were used. We set the effective population size scalar to 4 for nuclear sequences and 1 for control region sequences. We set the relative mutation rate to 1 for nuclear sequences and 7.3 for control region sequences, determined by examining relative model-corrected distances among nuclear haplotypes and among control region haplotypes. The transition/transversion ratio for each locus was set at the values estimated by PAUP. We performed iterative analyses in LAMARC, similar to maximum likelihood tree estimation. All search parameter values were left at default levels for the initial run, except that we increased the number of long chains to 3. All additional runs used only 1 long chain, and no short chains, with 3,000 burn-in steps, 20,000 samples, and a sampling interval of 40. Parameters were averaged over five independent sub-runs. In addition, adaptive heating was used with initial temperatures set to 1.0, 1.2, 1.5, and 3.0, and a swap interval of 1. After the initial run, starting values of Θ and migration for run i were set to the values estimated in run i-1. We performed 3 runs, tracking the most probable estimates and confidence intervals of parameters for consistency and convergence over runs. After 3 runs, 95% confidence intervals for estimated parameters remained relatively constant between runs, so analysis was concluded. 
We additionally plotted the contents of the curve files produced for each parameter to determine whether the program estimated a single optimum for each parameter or whether the data suggested multiple potentially optimal values. Since we set the control region effective population size scalar to 1 and the S7 scalar to 4, LAMARC estimated Θ on the mitochondrial scale (Θ = Nμ). We multiplied this value by 4 in order to report Θ estimates on the scale of an autosomal gene. We also multiplied the migration estimates [M(LAMARC) = m/μ] by the Θ estimate (4 Nμ) for the recipient population to report migration as M = 4 Nm. We calculated 95% confidence intervals for M by multiplying the upper and lower confidence values for Θ by the respective upper or lower values for M(LAMARC). We used these same LAMARC protocols to estimate migration between the Gulf of California and Pacific regions. For these analyses, the input file contained combined S7 1 and S7 2 data grouped by region, along with corresponding control region sequences for those individuals.
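The rescaling described above amounts to a few multiplications, sketched here under stated assumptions. The function name and all numeric inputs are hypothetical, and the sketch assumes that the Θ confidence limits are supplied on the same mitochondrial scale as the point estimate.

```python
import math


def rescale_lamarc(theta_mt, m_lamarc, theta_mt_ci, m_lamarc_ci):
    """Convert LAMARC output to autosomal-scale theta and M = 4Nm.

    theta_mt     : theta on the mitochondrial scale (N * mu)
    m_lamarc     : LAMARC migration estimate (m / mu)
    theta_mt_ci  : (lower, upper) 95% limits for theta_mt (assumed same scale)
    m_lamarc_ci  : (lower, upper) 95% limits for m_lamarc
    """
    theta_auto = 4.0 * theta_mt          # 4 * N * mu
    migrants = theta_auto * m_lamarc     # (4 N mu) * (m / mu) = 4 N m
    lower = 4.0 * theta_mt_ci[0] * m_lamarc_ci[0]
    upper = 4.0 * theta_mt_ci[1] * m_lamarc_ci[1]
    return theta_auto, migrants, (lower, upper)


# Hypothetical values for a recipient population, for illustration only.
theta_auto, migrants, ci = rescale_lamarc(
    theta_mt=0.01, m_lamarc=50.0,
    theta_mt_ci=(0.005, 0.02), m_lamarc_ci=(10.0, 120.0),
)
print(theta_auto, migrants, ci)
assert math.isclose(migrants, 2.0)
```

Because the mutation rate cancels in the product, M is expressed directly as the number of migrants per generation on the autosomal scale.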

Results

Morphological analyses

Average body depth was not significantly different between the Gulf of California sample [29.39 mm ± 16.75 (SD)] and the Pacific Ocean sample (30.95 mm ± 26.63), suggesting that our samples did not differ in overall body size. The initial logistic regression model with all variables entered was statistically better than a constant-only model (Chi-square 55.249, df 10, P < 0.001). This shows that the morphological variables, as a set, reliably distinguish between Gulf of California and Pacific samples. The stepwise logistic regression indicates that 5 variables (number of anal fin rays, head length, pectoral fin length, maxilla length, and tail length) contribute most to the ability of the model to predict region of origin (Table 2). Including additional morphological variables in the model does not significantly improve the model’s ability to predict the outcome. This 5-variable model correctly assigns 77.27% of Gulf of California samples and 88.16% of Pacific samples to their region of origin. Head length exhibits the largest odds ratio (4.1097), indicating that assignment to Gulf of California or Pacific is most highly influenced by changes in size-corrected head length. The values of the 5 morphological variables included in the model overlap substantially in Pacific and Gulf of California samples, so we cannot assign a strict cutoff for separating Pacific and Gulf of California specimens based on these features.

Table 2 Results of logistic regression

MtDNA sequence analysis

We obtained between 426 and 429 bps of sequence from the 5′ end of the control region in H. guttulata. This locus includes 143 variable sites and 6 single-base insertions/deletions (indels) between sequences. Sequences exhibit 10 fixed differences between Gulf of California and Pacific populations, including 1 fixed indel in Gulf of California sequences at base 349. The average, corrected pairwise genetic distance between Pacific and Gulf of California regions derived from Arlequin using the GTR+I+G model described later is 0.1211. This contrasts with an average pairwise distance of 0.0360 among Pacific sequences and 0.0230 among Gulf of California sequences (Table 1b). All control region sequences are available from GenBank with accession numbers FJ155668–FJ155748.

MtDNA phylogenetic tree

The large between-region genetic distances of control region sequences are reflected in the strong statistical support for reciprocal monophyly between Gulf of California and Pacific haplotypes in the maximum likelihood phylogenetic tree (Fig. 3). Modeltest (BIC) chose the GTR+I+G substitution model for control region sequences. The final maximum likelihood model included a ti/tv of 10.1246, a proportion of invariant sites of 0.5145, and a shape parameter of 1.1429. We did not obtain clean control region sequence for P. verticalis or P. ocellatus, and P. californicus control region exhibits an uncorrected genetic distance of 0.57 from the nearest H. guttulata haplotype. In contrast, the maximum uncorrected distance between H. guttulata haplotypes is 0.12. This extreme level of sequence divergence between H. guttulata and the outgroup prevented outgroup estimation of the proper tree root. The tree is therefore shown with midpoint rooting.
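For readers unfamiliar with the term, an uncorrected genetic distance (p-distance) is the proportion of aligned sites at which two sequences differ. The sketch below assumes pre-aligned sequences of equal length and skips any site with a gap in either sequence; that gap handling is an assumption for illustration and may differ from the settings used in PAUP. The sequences shown are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap are ignored (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical aligned fragments, for illustration only.
print(p_distance("ACGTACGT", "ACGTACGA"))  # one difference in eight sites
```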

Fig. 3

Maximum likelihood control region haplotype phylogeny. Bayesian posterior probabilities are shown above branches and maximum likelihood bootstraps are below branches. Haplotypes are labeled with sample numbers (site code + sample no. from site) of the individuals possessing the haplotype

MtDNA population genetic analyses

Clustering populations into 2 regions for AMOVA results in a Φct of 0.774 between the Pacific and the Gulf of California regions. All pairwise Φst values between Pacific and Gulf of California populations (with N > 5) were significantly different from 0, ranging from 0.740 to 0.869 (Table 3). Between-population variation was much lower within regions. Pairwise Φst ranges from 0.000 to 0.130 in the Pacific and equals 0.148 between the two Gulf of California sites. However, some significant variation was found. Significant pairwise Φst values were estimated between Oakland and 2 southern Pacific sites, and between the Gulf of California sites of Bahía de Los Ángeles and Yavaros (Table 3).

Table 3 Population structure comparisons for Diamond Turbot

Because of the large number of control region haplotypes, the high within-population variation (Table 1) could mask between-population variation in Φst analyses for the Pacific populations. Therefore, we clustered haplotypes by relationship by recoding the haplotypes for Pacific individuals as belonging to one of 4 clades: P4-1, P4-2, Q4-1, and Q4-2 (Fig. 4). The recoded data set was then used to estimate traditional (frequency only) Fst values in Arlequin. High and statistically non-zero Fst estimates between northern and southern Pacific sites support the conclusion of restricted gene flow within this region (Table 4).
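The idea of a frequency-only Fst on recoded clades can be sketched with a simple heterozygosity-based estimator, Fst = (Ht − Hs) / Ht. This is a simplified stand-in and not the AMOVA-based estimator that Arlequin computes. The clade labels follow Fig. 4, but the counts below are hypothetical and the function name is an assumption.

```python
def fst_from_counts(populations):
    """Heterozygosity-based Fst from per-population clade counts.

    populations: list of dicts mapping clade label -> number of individuals.
    """
    if len(populations) < 2:
        raise ValueError("need at least two populations")
    freqs = []
    for counts in populations:
        n = sum(counts.values())
        if n == 0:
            raise ValueError("empty population")
        freqs.append({label: c / n for label, c in counts.items()})
    labels = set().union(*freqs)
    # Hs: mean within-population gene diversity.
    h_s = sum(1 - sum(p ** 2 for p in f.values()) for f in freqs) / len(freqs)
    # Ht: gene diversity of the pooled (unweighted mean) frequencies.
    mean_p = {lab: sum(f.get(lab, 0.0) for f in freqs) / len(freqs) for lab in labels}
    h_t = 1 - sum(p ** 2 for p in mean_p.values())
    if h_t == 0:
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical clade counts for a northern and a southern site.
north = {"P4-1": 8, "P4-2": 1, "Q4-1": 1}
south = {"P4-1": 2, "P4-2": 3, "Q4-1": 2, "Q4-2": 3}
print(fst_from_counts([north, south]))
```

Recoding haplotypes into a few clades lowers within-population diversity (Hs), which is why frequency shifts between sites become easier to detect than with many rare haplotypes.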

Fig. 4

Haplotype networks for Pacific control region sequences (“P” network and “Q” network). The exact connection between these networks is unknown. Nestings are shown only for higher level clades. Each branch represents a single nucleotide difference between adjacent haplotypes. Haplotype names are shown as ovals containing sample names. Unsampled but inferred intermediate haplotypes are shown as small filled dots in branches

Table 4 Results of Fst analysis of nested Pacific haplotype data

Autosomal intron sequence analysis

As expected, rpS7 intron sequences are less variable than control region sequences. We identified 27 polymorphic sites out of 508 total bases in S7 1. One individual from Oakland additionally appears heterozygous for a 2-base deletion in S7 1. Preliminary comparisons between Pacific and Gulf of California populations suggested that only one very common allele was shared between those regions. We sequenced S7 2 in order to determine whether additional sequence data would provide evidence that this shared allele is in fact different between the Pacific and the Gulf of California regions. We obtained 280 bps of sequence from S7 2, which included 11 polymorphic sites and no indels. Concatenated S7 sequences revealed that some individuals with the shared S7 1 haplotype are indeed different when additional S7 gene sequence is added. There is still a single, shared haplotype between Pacific and Gulf of California populations, but that 2-intron haplotype (h1) is not as prevalent (occurring only twice in Pacific samples and 5 times in Gulf of California samples) as the shared S7 1 haplotype. The average, corrected pairwise genetic distance between Pacific and Gulf of California S7 sequences derived from Arlequin (S7 1 and 2 combined) is 0.0029. The within Pacific distance is 0.0051, and the within Gulf of California distance is 0.0015 (Table 1b). All rpS7 sequences are available from GenBank with accession numbers FJ155749–FJ155799.

Autosomal intron phylogenetic tree

Here, we present only the tree estimated from combined S7 1 and S7 2 data (Fig. 5). Since individual S7 1 or S7 2 trees incorporate less sequence data, they include fewer haplotypes and the resolution of clades is poor. Again, divergence between H. guttulata and outgroup S7 sequences is large. The least diverged outgroup from H. guttulata exhibits an uncorrected genetic distance of 0.19 from H. guttulata specimens. That contrasts with a maximum uncorrected distance of 0.017 between H. guttulata specimens. As explained above for control region, the phylogenetic tree for S7 is therefore midpoint rooted. Modeltest chose the HKY+I substitution model for S7 sequences. After heuristic searches, the final model included a ti/tv of 1.5402, a proportion of invariant sites of 0.9429, and a shape parameter of 3.4358. The maximum likelihood tree places the only haplotype shared between the Pacific and the Gulf of California (h1) at the base of a monophyletic clade that includes all the other Gulf of California haplotypes (h2, h4, h9, h10, h11, h16). Although the monophyly of Gulf of California haplotypes is not strongly supported by bootstrap values or Bayesian posterior probabilities, it is interesting to note that maximum likelihood analyses of both control region and S7 grouped haplotypes from the Gulf of California as monophyletic groups.

Fig. 5

Maximum likelihood S7 haplotype phylogeny. Bayesian posterior probabilities are shown above branches and maximum likelihood bootstraps are below branches. Letters next to haplotype numbers are site codes indicating sites where that haplotype was found

Autosomal intron population genetic analyses

All S7 1 pairwise Φst values between Pacific and Gulf of California populations are statistically distinguishable from 0. As mentioned above, Φst values are reported only for populations with 5 or more individuals sampled. Pairwise Φst values between the Gulf of California sites and Pacific sites range from 0.352 to 0.513 (Table 3). By comparison, pairwise Φst values among populations within the Pacific range from 0.000 to 0.177, and the pairwise Φst between the two Gulf of California sites is 0.068. Clustering populations into regions in the overall AMOVA results in a Φct of 0.345 between the Pacific and the Gulf of California. Combining S7 1 data with S7 2 data in AMOVA produces a Gulf of California-Pacific Φct of 0.456.

Migration between populations

Using combined control region and S7 data, LAMARC estimated very restricted migration between the Gulf of California and the Pacific, and relatively high levels of migration between most Pacific populations (Table 5; Fig. 6). Migration into Oakland and out of Oakland to El Cuarenta and Cancun appear relatively reduced compared to migration between other sites. Additionally, migration from Cancun to Los Alamitos and El Cuarenta (northward) was estimated at several times the rates of migration into Cancun from those sites (southward). However, the 95% confidence intervals for these migration rates are quite wide, and the differences are not statistically significant. Analysis of curve files produced by LAMARC indicates that the Θ and migration estimates for each locus are the results of single, optimal peaks in the data, and the overall estimate for combined data appears to reconcile differences in estimates between the two loci. It should be noted that interpretation of Nm estimates as migration assumes that retained ancestral polymorphism is not responsible for shared alleles or related sets of alleles. Given the reciprocal monophyly of the mtDNA, retention of ancestral polymorphism may be a better interpretation of the LAMARC results than high rates of gene flow.

Table 5 Estimates of H. guttulata migration produced by LAMARC analysis
Fig. 6

Estimates of migration rate between 4 Pacific sites (O, L, U, C), the 2 Gulf of California sites (B, Y), and between the Gulf of California and Pacific regions. Relative thickness of arrows corresponds to relative levels of migration between sites. Color differences between arrows are only to improve readability. Numbers on arrows indicate number of migrants per generation

Discussion

Gulf of California range for H. guttulata

Present (1987) included H. guttulata in the group of Baja California disjunct species whose Gulf of California range does not extend south of Guaymas Bay on the coast of mainland Mexico (Fig. 1). Since the Yavaros samples in this study originate from a site over 150 km south of Guaymas Bay, the Gulf of California range for H. guttulata should be extended to at least that point.

Gulf of California: Pacific disjunction and speciation

Strong evidence of reciprocal monophyly between Gulf of California and Pacific mitochondrial control region sequences (Fig. 3) and the implication of ongoing sorting toward monophyly in nuclear sequences (Fig. 5) both suggest that H. guttulata has significantly reduced gene flow between Pacific and Gulf of California populations. Like some other Baja California disjunct species (Bernardi et al. 2003; Bernardi and Lape 2005), Gulf of California and Pacific H. guttulata may be in the process of speciation. Given a 1:1 sex ratio, the effective population size for mitochondrial loci is ¼ that of autosomal loci because of maternal inheritance and effective haploidy. This means that allele sorting between allopatric populations is expected to proceed more rapidly for mitochondrial sequences (N generations on average, where N is the inbreeding effective population size) than for nuclear sequences (4 N generations) (Moore 1995). Our data suggest that Pacific and Gulf of California populations of H. guttulata have not been separated long enough for full reciprocal monophyly for nuclear loci. Another Baja California disjunct species, A. davidsonii, has been studied using both mitochondrial and nuclear sequence data (Bernardi and Lape 2005). Gulf of California and Pacific populations of A. davidsonii showed reciprocal monophyly for mtDNA, but a significant amount of sharing of rpS7 sequences.
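The ¼ ratio and the resulting difference in sorting times can be written out as a short worked relation. The notation is introduced here for illustration only: N is the number of breeding adults, N_f is the number of females, and the sorting times are the approximate expectations quoted above from Moore (1995).

```latex
% Gene copies transmitted per generation, assuming N breeding adults and a 1:1 sex ratio
N_{\mathrm{mt}} = N_f = \frac{N}{2}, \qquad N_{\mathrm{nuc}} = 2N
\quad\Rightarrow\quad
\frac{N_{\mathrm{mt}}}{N_{\mathrm{nuc}}} = \frac{N/2}{2N} = \frac{1}{4}

% Approximate expected time to reciprocal monophyly in allopatry
E[T_{\mathrm{mt}}] \approx N \ \text{generations}, \qquad
E[T_{\mathrm{nuc}}] \approx 4N \ \text{generations}, \qquad
\frac{E[T_{\mathrm{nuc}}]}{E[T_{\mathrm{mt}}]} \approx 4
```

Under these assumptions, a separation long enough to sort mitochondrial lineages may still be roughly four times too short to sort a typical nuclear locus, which is consistent with the pattern of monophyletic control region sequences alongside one shared S7 haplotype.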

Additional morphological and molecular analyses support the possibility of allopatric, incipient speciation. External morphological features show statistically detectable differences between Gulf of California and Pacific H. guttulata (Table 2). Φst and migration analyses of control region and S7 sequences both additionally detect significant restrictions in gene flow between the Pacific and the Gulf of California regions (Tables 3, 4, 5). Further study of H. guttulata using a wider range of more quickly evolving nuclear genes could potentially detect reciprocal monophyly for some loci.

Phylogeography within regions

Φst analyses and migration estimates suggest varying levels of reduced gene flow between sites within the Gulf of California and Pacific regions. In the Gulf of California, two analyses (control region Φst, migration estimation) detect structure between Bahía de Los Ángeles and Yavaros (Tables 3 and 5; Fig. 6). With only two populations sampled from the Gulf of California, we cannot assess details of within Gulf differentiation that are known for some species (Riginos 2005). Φst analyses, migration estimates, and control region Fst analysis of clustered haplotypes also detect genetic structure among Pacific sites. All Pacific analyses suggest restricted gene flow between northern Pacific and southern Pacific sites. Point Conception and Punta Eugenia (Fig. 1) have traditionally been recognized as important phylogeographic barriers for marine fishes (Dawson et al. 2006). Our data are consistent with the possibility of these general regions as phylogeographic barriers for H. guttulata. The S7 1 Φst analysis (Table 3) suggests genetic structure between Oakland and Los Alamitos. Additionally, when nested control region data (data not shown) are grouped as either north or south of Point Conception, the resulting frequency-based Fst is significant (Fst = 0.262). Analyses also imply genetic structure in the region of central Baja California. The S7 1 Φst, control region Φst, and migration estimates (Tables 3, 4, and 5; Fig. 6) all point to restricted gene flow between Oakland and the 2 sites south of Punta Eugenia (El Cuarenta and Cancun). The S7 1 Φst analysis shows restricted gene flow between Los Alamitos and Cancun (Table 3), and the control region Fst analysis on nested haplotypes (Table 4) detects restricted gene flow between sites north and south of Punta Eugenia. Taken together, these results seem to indicate that central Baja California represents an important area in the phylogeography of H. guttulata. 
Two population genetic analyses, control region nested Fst and S7 1 Φst, also indicate structure between the southern Pacific sites of El Cuarenta and Cancun.

Ecology/life history and trans-Baja California differentiation

Results from our study indicate that H. guttulata conforms to expectations of population genetic structure based on its adult life history. As an example of a Baja California disjunct species with a low adult migratory ability and a dependence on soft-bottom habitats, H. guttulata exhibits relatively strong genetic differentiation between Gulf of California and Pacific populations. Examination of habitat associations, life history characteristics, and museum collection records for H. guttulata and other Baja California disjunct species supports the hypothesis that adult migration and ecological factors play important roles in determining whether Baja California disjunct species are genetically distinct or homogeneous across Baja California. For example, all of the presently studied Baja California disjunct species exhibiting little genetic differentiation across Baja California (zebraperch, rock wrasse, California sheephead, Mexican rockfish) are associated with conspicuous rocky habitats (Allen and Pondella 2006). In contrast, those Baja California disjunct species associated with either soft-substratum habitats (grunion, spotted bass, diamond turbot, orangethroat pikeblenny, longjaw mudsucker) or cryptic rocky reef habitats (mussel blenny, blue banded goby) (Allen and Pondella 2006) exhibit high Gulf of California/Pacific genetic divergence (for genetic references see Introduction). This matches well with the hypothesis that the non-uniform distribution of soft-bottom habitat in the Gulf of California (Walker 1960) contributes to allopatry and incipient speciation in these species. A full review of the literature is in progress (Schinske in preparation).

Ecological studies on some Baja California disjunct fishes additionally suggest relatively limited adult migration in species with genetic differentiation across Baja California. Adult movement appears restricted for the diamond turbot (Lane 1975; Swearer et al. 2003), mussel blenny (Losey 1968; Stephens et al. 1970), spotted bass (Allen et al. 1995), blue banded goby (Steele 1997, 1998), orangethroat pikeblenny (Thresher 1997), longjaw mudsucker (Barlow 1961; Bernardi et al. 2003), and opaleye (Valle 1989). While few studies have examined migration in the panmictic Baja California disjunct species, collection records indicate the possible occasional appearance of those species near the tip of Baja California (CICIMAR Sample 1693; SIO Sample 65-346; SIO Sample 84-73; SIO Sample 59-219; 65-230; Present 1987; Bernardi et al. 2003). These studies and observations further support the concept that life history considerations restrict the populations of certain Baja California disjunct species to particular areas in the Baja California region, resulting in allopatric speciation between those geographically isolated populations. However, drawing firm conclusions will require additional long-term ecological studies of these species’ habitat associations and adult movement (especially in panmictic species). The present study demonstrates the necessity of both ecological and genetic data when considering the forces that affect marine evolutionary divergence and speciation.