Diversification and distribution of gall crabs (Brachyura: Cryptochiridae: Opecarcinus) associated with Agariciidae corals

Coral reefs are home to the greatest diversity of marine life, and many species on reefs live in symbiotic associations. Studying the historical biogeography of symbiotic species is key to unravelling (potential) coevolutionary processes and explaining species richness patterns. Coral-dwelling gall crabs (Cryptochiridae) live in obligate symbiosis with a scleractinian host, and are ideally suited to study the evolutionary history between heterogeneous taxa involved in a symbiotic relationship. The genus Opecarcinus Kropp and Manning, 1987, like its host coral family Agariciidae, occurs in both Indo-Pacific and Caribbean seas, and is the only cryptochirid genus with a circumtropical distribution. Here, we use mitochondrial and nuclear DNA gene fragments of Opecarcinus specimens sampled from 21 Indo-Pacific localities and one Atlantic (Caribbean) locality. We applied several species delimitation tests to characterise species diversity, inferred a Bayesian molecular-clock time-calibrated phylogeny to estimate divergence times and performed an ancestral area reconstruction. Time to the most recent common ancestor (tMRCA) of Opecarcinus is estimated at 15-6 Mya (middle Miocene-late Miocene). The genus harbours ~15 undescribed species as well as several potential species complexes. There are indications of strict host-specificity patterns in certain Opecarcinus species in the Indo-Pacific and Atlantic; however, a robust phylogeny reconstruction of Agariciidae corals, needed to test this further, is currently lacking. The Indo-West Pacific was inferred to be the most probable ancestral area, from where the Opecarcinus lineage colonised the Western Atlantic and subsequently speciated into O. hypostegus. Opecarcinus likely invaded from the Indo-West Pacific across the East Pacific Barrier to the Atlantic, before the full closure of the Isthmus of Panama. The subsequent speciation of O. hypostegus is possibly associated with newly available niches in the Caribbean, in combination with genetic isolation following the closure of the Panama Isthmus.


Introduction
Coral reefs are home to the greatest diversity of marine life, and many species on reefs live in symbiotic associations. Symbiosis plays a key role in maintaining the health and balance of diversity of reef systems (Stewart et al. 2006). The biodiversity of coral reefs is dominated by invertebrates, many of which rely on hosts for food, habitat, or settlement cues (Stella et al. 2011; Hoeksema et al. 2012). While the diversity, distribution, and relationships of some reef organisms are fairly well-studied, we know relatively little about coral symbionts other than zooxanthellae. The study of the historical biogeography of symbiont taxa is important for our understanding of the evolution of symbiotic relationships and their species richness gradients (Pinto-Ledezma et al. 2017).
While reefs and reef corals exist in all four tropical marine regions, they are best developed and most diverse in the Indo-West Pacific (IWP) and the West Atlantic (WA), and occur to a more limited extent in the East Pacific (EP) and East Atlantic (EA). The origin and evolution of reef biota in the two great reef regions have been complex. Within the IWP the Coral Triangle (CT) is the centre of marine biodiversity (Renema et al. 2008), and diversity of most marine organisms declines from there with both latitude (Ukuwela et al. 2016) and longitude (Miller et al. 2018). These diversity clines have long been studied and numerous hypotheses advanced to explain them (Rosen 1988; Paulay 1997; Bellwood et al. 2005; Huang et al. 2018).
Diversity in the IWP is about an order of magnitude greater than in the WA (Paulay 1997). Part of the biota of both regions has radiated in situ, while other lineages have not diversified since their arrival. In situ radiations dominate the IWP fauna, while migrant lineages that have not diversified are more common in the WA. In situ diversification is nevertheless common in the WA, and characterises much of the biota, as exemplified by several coral clades (Fukami et al. 2004, 2008), mithracid crabs (Windsor and Felder 2014), and cone snails (Kohn 2014). Other WA species represent isolated lineages that have not diversified within the basin (e.g. O'Hara et al. 2019).
WA lineages that have IWP ancestry range broadly in age. Phylogenetic analyses reveal that some species that range across the IWP and WA show little differentiation and are recently or currently connected (Collin et al. 2020). Other species that were thought to be so wide-ranging turned out to be cryptic complexes, with divergent lineages in the IWP and WA (Michonneau 2015; Dudoit et al. 2018). Many well-characterised and older WA endemics are nestled in IWP clades (O'Hara et al. 2019).
Some clades or lineages that range across the IWP and WA have attained their wide ranges by crossing the East Pacific Barrier (EPB) before the Isthmus of Panama separated the EP and WA (Glynn and Ault 2000; Lessios and Robertson 2006; Baraf et al. 2019), others have colonised the WA around the Cape of Good Hope via the Benguela Current (Rocha et al. 2005; Andrews et al. 2016), and some have done both (Bowen et al. 2001). The more species-rich IWP has typically been the source for interregional dispersal, with some notable exceptions (Levinton et al. 1996; Huang et al. 2018).
To what extent are the diversification and distribution of symbiotic groups coordinated? Here we investigate the evolutionary dynamics of a crab lineage that is obligately symbiotic with stony corals. The modern scleractinian faunas of both IWP and WA are dominated by locally diversified lineages, such as the endemic Faviidae, Meandrinidae, and Agaricia Lamarck, 1801 of the WA, and most coral clades in the IWP. In contrast, local radiations appear to be common in coral-symbiotic crabs in the IWP, but not in the WA.
Several crab lineages have evolved obligate or facultative symbioses with scleractinian corals (Castro 2015), and these symbionts are much more diverse in the IWP than in the WA. Cryptochiridae and Domeciidae (not Maldivia Borradaile, 1902, which associates with gorgonians) include representatives in both the IWP and WA, while the Tetraliidae, Trapezia Latreille, 1828 (Trapeziidae), Tanaocheles Kropp, 1984 (Tanaochelidae) and Cymo De Haan, 1833 (Xanthidae) associate with scleractinians in the Indo-Pacific (Lai et al. 2009; Castro 2015). Currently 47 cryptochirid species have been described from the IWP, and only four are known from the WA in three genera, with one of these genera endemic to the WA (Kropp and Manning 1987; Ng et al. 2008; Van der Meij 2014b; Castro 2015; WoRMS 2021). Five domeciids are known from the IWP and only one from the WA (Castro et al. 2004). Thus it appears that symbiotic crabs may not have diversified within the WA, although this needs further testing given the high diversity of undiscovered and cryptic species in these groups (as we also demonstrate below) (Van Tienderen and Van der Meij 2017).
Our goal is to explore the diversity and distributional dynamics of the cryptochirid genus Opecarcinus, obligate symbionts of the scleractinian coral family Agariciidae. These crabs are a prime example of species living in obligate symbiosis with a scleractinian coral host (Castro 1988). Van der Meij and Schubart (2014) demonstrated that the Cryptochiridae is monophyletic, and their most recent common ancestor (MRCA) is estimated at 50-23 Mya (Van der Meij and Klaus 2015). The cryptochirid MRCA was previously estimated at ~83 Mya in a study on the infraorder Brachyura by Tsang et al. (2014); however, the clade containing the cryptochirid specimen has poor support. The Agariciidae currently includes seven genera that range across the IWP, EP, and WA, although ongoing taxonomic revisions will likely lead to changes in generic classification (Terraneo et al. 2017). Agariciidae are mostly zooxanthellate reef corals, common in tropical shallow waters and also well represented in mesophotic reefs (Terraneo et al. 2017). We explore the diversity of the genus using a multimarker dataset to assess how much undiscovered and cryptic diversity exists and where these additional species live. With a time-calibrated, multigene phylogeny we then explore how the diversity of this group has evolved across the tropical reefscape, with special attention to how WA and IWP species are related. Do Opecarcinus in these regions represent sister lineages or are they nested? What is the timing and likely route of colonisation?

Sample collection and data collection
Species of Opecarcinus and Pseudohapalocarcinus ransoni Fize and Serène, 1956 (and cryptochirid outgroups) were collected from 21 localities in the IWP and WA (Fig. 1, Table S1), between 2006 and 2017. Nine species belonging to seven cryptochirid genera were chosen as outgroups according to Van der Meij and Nieman (2016). Specimens were photographed alive to document colour patterns, then fixed and stored in 80% ethanol. The material collected from the Red Sea, Maldives, Coral Triangle, Japan, New Caledonia, and Curaçao is deposited in Naturalis Biodiversity Center, Leiden, The Netherlands (RMNH), whereas specimens from the remaining localities are deposited in the Florida Museum of Natural History, University of Florida, Gainesville, USA (UF) (Table S1). Most sampled localities were extensively explored for gall crabs, with the exception of Japan, Taiwan, Hawaii and New Caledonia, for which a limited number of Opecarcinus specimens were available for analyses. DNA extractions, PCR and sequencing followed the protocol in . Specimens were identified using Kropp (1989) and Van der Meij (2014b), using morphological characters combined with host and distribution data. Provisional names were assigned to species that did not fit established described taxa, using the prefix SET (for SET van der Meij) and a numeric designation. These names will be consistently applied to these OTUs in the future until a proper name is established for each.

Phylogenetic analyses and divergence time estimation
All analyses were performed on a concatenated Opecarcinus dataset of two mitochondrial genes (Cytochrome Oxidase I (COI) and 16S rRNA) and a nuclear gene (Histone H3). The total data set consisted of 1539 bp: 658 bp for COI, 594 bp for 16S rRNA and 287 bp for H3. The sequences of each marker were aligned separately using Clustal W 2.1 (Thompson et al. 1994) and then adjusted manually. All sequences were concatenated in PhyloSuite 1.2.1 (Zhang et al. 2020); subsequently PartitionFinder 2 (Lanfear et al. 2017) was applied to find the best partition scheme for the complete dataset consisting of 230 terminals. The best-fit scheme corresponded with the markers (COI, 16S, H3) in the original dataset. PartitionFinder was also used to find the best-fit nucleotide substitution models. Bayesian Inference (BI) analyses and divergence time estimations were conducted on the concatenated data set in BEAST v1.10.4 by running the Markov chain for 100 × 10^6 iterations, sampling every 5000 iterations. The TN93 + Γ + I + X substitution model was applied to COI, while the best model for 16S and H3 was GTR + Γ + I + X. A Yule tree prior with default settings for the speciation rate and an uncorrelated relaxed clock with lognormal distribution were applied. Tracer v1. Calibration information for divergence time estimation can come from several sources, such as substitution rates, fossils, and geological data (Heath 2015). There are no known cryptochirid fossils (only trace fossils, see Klompmaker et al. 2016), hence substitution rates for each of the three gene fragments were used for calibration (Van der Meij and Klaus 2015). The priors for substitution rates were set as follows. Substitution rates of the COI locus in arthropods range between 0.7% and 2.0% per Myr (e.g. Schubart et al. 1998; Daniels et al. 2015). Here the mean rate of 1.17% per Myr for the COI locus was used with an SD of 0.9%, and 95% highest posterior density (HPD) was from 0.20 to 2.69%. 
The base substitution rate of 16S rRNA was set to 1.09 ± 0.24% (mean ± SD) per Myr and 95% HPD was from 0.63 to 1.41%. The rate of Histone H3 was set to 0.19 ± 0.04% per Myr and 95% HPD was from 0.12 to 0.26% (Van der Meij and Klaus 2015). Substitution rates for the latter two genes are derived from divergence time estimates of freshwater crabs from the Old World (Asia, Africa and Europe) based on three fossil calibration points (Klaus et al. 2010). All priors of gene fragments were calculated using a normal distribution.
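As a minimal sketch (not the BEAST configuration used in this study), the central 95% interval implied by a normal prior can be computed from its mean and SD with the Python standard library. The function name `normal_prior_interval` is hypothetical; the means and SDs are the 16S and H3 values quoted above. The intervals describe the prior density alone and need not coincide with the HPD ranges reported in the text, which may summarise a different quantity.

```python
from statistics import NormalDist


def normal_prior_interval(mean, sd, mass=0.95):
    """Return the central interval holding `mass` of a normal prior.

    Hypothetical helper for illustration; not part of BEAST or Tracer.
    """
    tail = (1.0 - mass) / 2.0
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Substitution-rate priors in % per Myr as (mean, SD), taken from the text.
priors = {"16S": (1.09, 0.24), "H3": (0.19, 0.04)}

for marker, (mean, sd) in priors.items():
    low, high = normal_prior_interval(mean, sd)
    print(f"{marker}: {low:.2f} to {high:.2f} % per Myr")
```

Because a normal density is symmetric, each interval is centred on the prior mean; a strongly asymmetric reported range would therefore indicate that it describes something other than the prior itself.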
In addition to the time-calibrated phylogenetic reconstruction, an ML analysis based on three concatenated markers (COI, 16S rRNA and H3) including Opecarcinus, Pseudohapalocarcinus ransoni and nine cryptochirid outgroups was conducted in IQ-TREE (Nguyen et al. 2015) with 10,000 ultrafast bootstrap replicates (Minh et al. 2013). The best-fit nucleotide substitution model for each marker was GTR + I + G.
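The layout of the three-marker concatenation used in both analyses can be illustrated with a short sketch. It derives the column range of each partition from the marker lengths given above and counts how many states a chain of the stated length logs at the stated sampling interval. The function names are hypothetical, the marker order (COI, 16S, H3) is an assumption, and the count ignores the initial state and any burn-in.

```python
def partition_ranges(marker_lengths):
    """Map each marker to its 1-based, inclusive column range.

    Markers are laid out in dictionary order (assumed: COI, 16S, H3).
    """
    ranges = {}
    start = 1
    for marker, length in marker_lengths.items():
        end = start + length - 1
        ranges[marker] = (start, end)
        start = end + 1
    return ranges


def logged_states(chain_length, sample_every):
    """States written to the log, excluding the initial state and burn-in."""
    return chain_length // sample_every


# Alignment lengths in bp, as reported in the text.
marker_lengths = {"COI": 658, "16S": 594, "H3": 287}

ranges = partition_ranges(marker_lengths)
for marker, (start, end) in ranges.items():
    print(f"{marker} = {start}-{end}")

print("total bp:", sum(marker_lengths.values()))
print("logged states:", logged_states(100 * 10**6, 5000))
```

Ranges in this `start-end` form are what partition files for tools such as PartitionFinder or IQ-TREE typically list, although the exact file syntax differs between programs and is not shown here.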

Ancestral area reconstruction
The Opecarcinus samples were collected from the IWP and WA, and these two regions were used as areas in the ancestral area reconstruction. To estimate ancestral ranges across the Opecarcinus phylogeny, a Maximum Clade Credibility (MCC) tree was generated with BEAST using the same process as described above for the divergence time estimation. The best-fit nucleotide substitution model for 16S was GTR based on PartitionFinder 2. However, the eigenvalues did not converge, likely because the GTR model was applied to small partitions with too few taxa (Drummond and Bouckaert 2015), so HKY was used instead for 16S. Parametric methods (e.g. DEC and its extension; Yu et al. 2015) have been developed as a response to the shortcomings in event-based methods, which focus on integrating biogeographic processes and patterns (e.g. Dispersal-Vicariance Analysis, DIVA) (Ronquist 1997). Hence, ancestral range estimation was computed using the R package 'BioGeoBEARS' under the Dispersal-Extinction-Cladogenesis model (DEC) (Ree et al. 2005; Ree and Smith 2008; R Core Team 2020). Considering the criticism of the DEC + J model (Ree and Sanmartín 2018), 'jump' speciation was not considered in our analyses.
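To make the two-area DEC setup more concrete, the following sketch builds the along-branch (anagenetic) rate matrix over geographic ranges for the areas IWP and WA. It is an illustration only: it is not the BioGeoBEARS API, it omits cladogenetic range inheritance and the likelihood calculation, and the dispersal and extinction rates (`0.1`, `0.05`) are hypothetical values rather than estimates from this study. The function name `dec_rate_matrix` is also hypothetical.

```python
from itertools import combinations


def dec_rate_matrix(areas, dispersal, extinction):
    """Build an anagenetic DEC rate matrix with uniform rates.

    States are the empty range plus every non-empty subset of `areas`.
    Returns (states, q), where q[i][j] is the rate from state i to state j.
    """
    states = [frozenset()]
    for size in range(1, len(areas) + 1):
        states.extend(frozenset(c) for c in combinations(areas, size))
    index = {state: i for i, state in enumerate(states)}

    n = len(states)
    q = [[0.0] * n for _ in range(n)]
    for state in states:
        if not state:
            continue  # the empty range (lineage extinct) is absorbing
        i = index[state]
        for area in areas:
            if area in state:
                # Local extinction removes one occupied area.
                q[i][index[state - {area}]] += extinction
            else:
                # Dispersal into an unoccupied area from each occupied area.
                q[i][index[state | {area}]] += dispersal * len(state)
    for i in range(n):
        q[i][i] = -sum(q[i])  # each row of a rate matrix sums to zero
    return states, q


# Hypothetical rates, chosen only for illustration.
states, q = dec_rate_matrix(["IWP", "WA"], dispersal=0.1, extinction=0.05)
for state, row in zip(states, q):
    label = "+".join(sorted(state)) or "empty"
    print(f"{label:>8}", [round(v, 3) for v in row])
```

With two areas there are four states (empty, IWP, WA, IWP+WA), so a lineage restricted to the IWP can reach the WA only by first widening its range to IWP+WA, which is the dispersal route the DEC model without 'jump' speciation allows.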

Phylogenetic inference and divergence time of Opecarcinus
The phylogenetic reconstruction and species delimitation tests recovered 25 species in Opecarcinus by all species delineation methods (Fig. 2, Table S1), all with high branch support. Additional species were recovered by some, but not all, delineation methods within seven species: O. hypostegus Shaw and Hopkins, 1977, O. pholeter Kropp, 1989 (Table S1). We treated each of these latter as single species.
Time to the Most Recent Common Ancestor (tMRCA) of Opecarcinus was estimated at 15-6 Mya (middle Miocene-late Miocene). Within Opecarcinus two main clades can be discerned (Fig. 2, Fig. S2). Clade II (tMRCA 10-5 Mya) contains all other Opecarcinus species. Within this clade several groupings can be discerned, and several potential species complexes are revealed (Fig. 2, S2).

Ancestral area reconstruction
The IWP was recovered as the most probable ancestral area for Opecarcinus as a whole, as well as for all nodes within the genus (Fig. 3). Opecarcinus colonised the WA from the IWP, and speciated into O. hypostegus. Our divergence  Fig. S2).

Diversity of Opecarcinus
Currently Opecarcinus contains nine described species (Van der Meij 2014b; WoRMS 2021); however, our results suggest that the genus includes at least 25 species (Figs. 2, S2, Table S1). These results will form the basis of a taxonomic revision of the genus. Moreover, substantial genetic variation in several species (e.g. O. pholeter, O. SET7, and O. SET12; see Figs. 2, S2) suggests further potential cryptic species diversity, which warrants investigation. The only Atlantic species, O. hypostegus, is also a potential species complex (Fig. 2, S2). This cryptochirid inhabits species of Agaricia (Kropp and Manning 1987; Van der Meij 2014a), and Helioseris cucullata Ellis and Solander, 1786 (Hoeksema et al. 2017). Our results are in line with those of Van Tienderen and Van der Meij (2017), who identified high levels of genetic divergence within this species, with significant genetic differentiation across its host species. The authors hypothesised that this differentiation may represent early signs of host speciation in O. hypostegus, but still considered this gall crab a single species.
Opecarcinus is strictly associated with the Agariciidae (Kropp 1989; Van der Meij 2014b). This coral family also hosts the monotypic gall crab genera Pseudohapalocarcinus and Luciades Kropp and Manning, 1996, neither of which occurs in the Atlantic. The position of P. ransoni is not fixed in the various reconstructions of the Cryptochiridae (e.g. Van der Meij and Klaus 2015; Van der Meij and Nieman 2016); however, our results show that it falls well within Opecarcinus with full support (Fig. S1), making Opecarcinus paraphyletic. Further study with additional markers, combined with morphological data, is needed to robustly place P. ransoni within Opecarcinus. No fresh material of Luciades agana Kropp and Manning, 1996 is currently available for genetic analyses, hence we cannot assess its phylogenetic position. However, the overall morphology of this species is similar to Opecarcinus (distinguished only by the lack of a distal expansion of pereopod 2), as is its association with the agariciid Leptoseris papyracea Dana, 1846, suggesting that L. agana may also fall within Opecarcinus (Kropp and Manning 1996; Komatsu and Takeda 2013).
Fig. 3 Ancestral area estimation for Opecarcinus, implemented in BioGeoBEARS under the DEC model. Each terminal clade is represented by one sequence. The most likely ancestral area is indicated by letters at nodes and corners; the latter are the immediate states after species divergence.

The origin of Opecarcinus
Van der Meij and Klaus (2015) established the first time-calibrated phylogenetic reconstruction of Cryptochiridae, including species belonging to 17 of 21 described genera. They estimated the MRCA of Cryptochiridae at 50-23 Mya, much later than the estimated Middle Ordovician origin of Scleractinia (Seiblitz et al. 2020). Such a discrepancy in diversification between host and symbiont has been observed in other taxa, such as coral-dwelling gobies (Duchene et al. 2013). A more focussed approach studying gall crab species in a single genus, allowing for the study of biogeographic and host use patterns in more detail, has been lacking.
Our divergence time estimation indicates that the tMRCA for Opecarcinus is around 15-6 Mya (middle Miocene-late Miocene), in line with the results (11-5 Mya) of Van der Meij and Klaus (2015). Our ancestral area reconstruction based on samples collected from the IWP and WA is the first analysis of the evolutionary history of Opecarcinus, and indicates the IWP as the area of origin.

Timing and route of colonisation
How did the largely endemic biotas of the major reef regions develop? Cryptochirids are most diverse in the IWP, and only four species are currently recorded from the WA. What are the origins of these WA gall crabs? Our results indicate that Opecarcinus hypostegus is a relatively recent (ca. 3.25 Mya) colonist in the WA from the IWP. Limited phylogenetic information on two other WA gall crabs (Kroppcarcinus siderastreicola Badaro, Neves, Castro and Johnsson, 2012, Troglocarcinus corallicola Verrill, 1908) suggests they also have sister taxa in the IWP (Van der Meij and Klaus 2015; Van der Meij and Nieman 2016). These studies did not aim to trace the origin and route of colonisation of the WA species. Van der Meij and Klaus (2015) estimated that K. siderastreicola and T. corallicola diverged early within their respective clades, at a time when the connection between the Atlantic/Mediterranean Sea and IWP across the Tethys seaway was still open and thus may have served as a colonisation route, in addition to the two other routes discussed below (Bialik et al. 2019).
Subsequent to the closure of the Tethys in the early Miocene, dispersal between the IWP and the Atlantic could occur around the Cape of Good Hope, or across the EPB prior to the rise of the Isthmus of Panama. Reef organisms appear to have utilised both paths. Atlantic populations that established after the closure of the Isthmus had to have dispersed around the Cape of Good Hope, as has been demonstrated in brachyuran crabs (Guinot and Castro 2007; Rahayu and Ng 2014; Shih et al. 2016) and other organisms, such as Etelis Cuvier, 1828 snappers, Gnatholepis Bleeker, 1874 gobies, Stenopus Latreille, 1819 shrimp, and the sea star Valvaster Perrier, 1875, some potentially facilitated by unusual life history strategies, such as larval cloning (Rocha et al. 2005; Andrews et al. 2016; Dudoit et al. 2018; Collin et al. 2020).
The EPB is a semipermeable biogeographic barrier as evidenced by comparisons of populations across this vast expanse of open ocean. There are some examples from corals (Glynn and Ault 2000) and molluscs (Emerson and Chaney 1995). Dispersal from the IWP to the WA across the EPB prior to the closure of the Isthmus has been put forward to account for the presence of numerous marine taxa in the Atlantic (e.g. Barber and Bellwood 2005; Baraf et al. 2019), including several crabs (Harrison and Crespi 1999; Thiercelin and Schubart 2014; Magalhães et al. 2016).
Opecarcinus appears to have crossed the EPB, one of the world's most potent marine biogeographic barriers, multiple times. Two Opecarcinus species are recorded from both the IWP and EP. Opecarcinus crescentus Edmondson, 1925 has been recorded from Vietnam, Palau and Johnston Island in the IWP (Garth 1965), and from Clipperton Island to the Gulf of California in the EP (Garth and Hopkins 1968). Opecarcinus lobifrons is known from the Red Sea to French Polynesia in the IWP, and Clipperton Atoll off the American mainland in the EP (Kropp 1989). Unfortunately, we lack samples from the EP, hence we have not been able to assess the origin and diversity of Opecarcinus from this region directly.
Agariciids have a well-established fossil record in the Caribbean, indicating that suitable hosts were available at the time Opecarcinus colonised the region. Pavona and Gardineroseris, currently restricted to the Indo-Pacific, have a Caribbean fossil record from the late Miocene to the Middle Pleistocene (Budd et al. 1994). The Caribbean endemic Agaricia first appeared in the Early to Middle Miocene (Budd 2000). Trace fossils (dwellings) of gall crabs are recorded from late Pliocene-Pleistocene corals from the WA, including Agaricia (Klompmaker et al. 2016). Given Opecarcinus' high levels of host specificity, we hypothesise that gall crabs diverged over closely related coral species, and subsequently speciated through host-switching to newly available niches (i.e. Agaricia) in the Atlantic. This result is in contrast with a study on coral-associated hydrozoans of the genus Zanclea Gegenbaur, 1856, where the Caribbean harbours the same generalist hydrozoan species as the Indo-Pacific (Maggioni et al. 2020), highlighting the suitability of Cryptochiridae crabs for co-evolutionary studies.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.