Introduction

Mesophotic coral ecosystems (MCEs, 30–150 m) and rariphotic habitats (150–300 m) are widespread in tropical and subtropical regions (Pyle and Copus 2019). These "deeper reefs" host biodiverse communities of interest to conservation (Soares et al. 2020) and provide key ecosystem functions and services, including food provision and refuge for some depth-generalist reef species affected by rising sea temperatures induced by climate change and/or by shallow-water disturbances (Holstein et al. 2019).

Although deeper reef research is increasing globally, for many tropical regions distribution and biological data from these habitats are unavailable due to the financial, technological and operational constraints of underwater surveys beyond the depth limits of most airborne sensors and conventional SCUBA diving (> 30–50 m depth) (Woodall et al. 2018). Consequently, the environmental factors shaping deeper reef assemblages remain poorly described for most areas (Turner et al. 2017). This gap is particularly acute for the Western Indian Ocean (WIO) region, where there is very little information on deeper reefs (Pyle and Copus 2019; Osuka et al. 2021). Improving our understanding of the environmental factors that drive deeper reef ecosystems is important for ecosystem-based management approaches aiming to identify seascape features important for ecosystem health and functioning (Hinderstein et al. 2010; Bridge et al. 2020).

Seabed topography influences habitat suitability and is a demonstrated driver of benthic species distributions (Wilson et al. 2007). Geomorphology influences deeper reef occurrence and distribution through its effects on habitat structure and substrate characteristics and by mediating biotic and abiotic processes such as hydrodynamic flows, turbidity and sedimentation (Kahng et al. 2010; Locker et al. 2010; Sherman et al. 2019). The quantification of seabed structural patterns from bathymetric surveys combined with data from visual surveys of deeper reefs has enabled the application of predictive distribution modelling as a tool to address biogeographical and ecological knowledge gaps (Guisan et al. 2013; Costa et al. 2015). Predictive distribution models that result in maps of expected habitat suitability have become critical tools to inform the design of effective conservation measures (Bridge et al. 2020). The majority of deeper reef distribution modelling studies have focused on a dominant mesophotic species or taxon (Costa et al. 2015; Veazey et al. 2016; Silva and MacDonald 2017). These efforts might not always be representative, as deeper reefs often comprise diverse benthic communities, including corals, sponges and algae with no single dominant group or taxon (Kahng et al. 2010; Bridge et al. 2011a; Turner et al. 2017). Instead, assemblage-based predictive models can incorporate multiple species, including low abundance species in a single model (Ferrier and Guisan 2006; Piechaud et al. 2015).

Few studies globally have examined the occurrence of both MCE and rariphotic megabenthos at the assemblage level (e.g. Stefanoudis et al. 2019), and even fewer relate assemblages to local geomorphology. To explain and predictively map the spatial distribution of deeper reefs around remote WIO atolls, this study classifies and predictively models the distribution of benthic assemblages using digital bathymetric models and underwater video. The study objective was to determine whether seabed terrain structure quantified at multiple scales can act as a reliable spatial proxy for predicting deeper reef assemblage distribution. We addressed this objective by: (1) clustering biological observations to categorize assemblage composition at each site; (2) testing assemblage associations with multiscale terrain metrics and their relative importance in driving assemblage occurrence and distribution; (3) defining habitat preferences for these assemblages based on metric ranges identified; and (4) identifying geomorphological features that may represent priority habitats for conservation by predicting the spatial distribution of deeper reef assemblages.

Methods

Study area

Seychelles is an archipelagic state in the Western Indian Ocean. This study focused on four coral atolls: three in the Amirantes Group, namely Desroches (DES), St. Joseph (StJ) and Poivre (POI), and one in the Aldabra Group, Aldabra (ALD) (Fig. 1a). Biological ground-truthing and multibeam echosounder data were collected at every site (Fig. 1b–e) between March and April 2019 during Seychelles First Descent (Woodall and Rivers 2019).

Fig. 1 Research locations within the Seychelles archipelago (a) and bathymetry and submersible video transect tracks at each study site (b–e). White and black dots indicate absences and presences of deeper reef assemblages, respectively. Area surveyed at each site was approximately 2 km²

Biological data collection and preparation

At each site, three replicate 250 m long video transects were collected at four target depths (~ 30 m, 60 m, 120 m and 250 m) using stereo-sets of video cameras (Paralenz DiveCamera +) mounted on a submersible (Triton 1000/2). This depth-stratified sampling design was chosen to balance various science objectives of the expedition. Photo frames were extracted every 30 s from video data, and each photo frame was linked to a geographical position (latitude, longitude and depth) corresponding to submersible location. The best quality photograph of each stereopair was then selected for biological and substrate data annotation. Photo frames were excluded from the analysis if no ultra-short baseline (USBL) positioning data were available, if photo quality was inadequate for annotation (low light, blurry, obstructed) or if the submersible was too far from the seabed to allow positive identification of benthos (i.e. photos covering an area > 8 m²).

Retained photo frames were annotated with the image analysis software TransectMeasure from SeaGIS (Seager 2014), using twenty random points per m², each classified as either a substrate type or a living organism. Sessile megabenthos was placed into morphotypes (= morphologically similar specimens) identified to the lowest possible taxonomic level, which in most cases was genus or family (Fassbender et al. 2021). For each photo frame, substrate classifications were averaged and merged according to the majority rule (> 50% of points) to hard (bedrock, rubble, boulders, cobbles; class 1), mixed (when substratum types occurred in equal proportions; class 2) or soft (mud, sand, sediment; class 3) substratum. Using this method, 350–500 annotated photographs were extracted per site. When no organisms were visible in the photo, an absence was recorded. The total surface area annotated averaged 749 m² per survey site (Aldabra: 931 m², St Joseph: 595 m², Poivre: 628 m² and Desroches: 843 m²).
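The majority rule for substrate classes can be illustrated with a short Python sketch. This is not the authors' annotation workflow (which used TransectMeasure); the function name `classify_substrate`, the label sets and the handling of frames without a strict majority are assumptions made for illustration.

```python
from collections import Counter

# Substrate labels grouped as in the text; the exact label strings are assumed.
HARD = {"bedrock", "rubble", "boulders", "cobbles"}
SOFT = {"mud", "sand", "sediment"}


def classify_substrate(point_labels):
    """Return 1 (hard), 2 (mixed) or 3 (soft) for one photo frame.

    Assumptions: points that are not substrate labels (e.g. organisms) are
    ignored, and any frame without a strict > 50% majority counts as mixed.
    Returns None when the frame has no substrate points at all.
    """
    counts = Counter()
    for label in point_labels:
        if label in HARD:
            counts["hard"] += 1
        elif label in SOFT:
            counts["soft"] += 1
    total = counts["hard"] + counts["soft"]
    if total == 0:
        return None
    if counts["hard"] / total > 0.5:
        return 1
    if counts["soft"] / total > 0.5:
        return 3
    return 2
```

For example, a frame annotated as two hard points and one soft point would be assigned class 1, while an even split would be assigned class 2.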

Hierarchical agglomerative clustering based on Bray–Curtis dissimilarity and Ward's minimum variance method was conducted to identify clusters of co-occurring taxonomic groups at each site, using the package vegan (Oksanen et al. 2018) in R version 3.5.1 (R Core Team 2018). The optimum number of clusters for each site was determined by maximizing Dunn's index (Dunn 1973). Clustering results were visualized using non-metric multidimensional scaling (nMDS) plots, and PERMANOVA was used to confirm their statistical significance (p < 0.05). The contribution of recorded taxa to clusters per site was assessed using permutational Dufrene-Legendre indicator species analysis (Dufrene and Legendre 1997) in R package indicspecies (Cáceres and Legendre 2009). Each photo was assigned an assemblage class, as defined by the clustering analysis.
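To make the two quantities used in this step concrete, the sketch below computes Bray–Curtis dissimilarity between abundance vectors and Dunn's index for a given cluster assignment. It is a plain Python illustration, not the vegan-based analysis used in the study; the Ward clustering itself is not reproduced, and the function names and the toy data in the usage note are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    # Assumption: two empty samples are treated as identical.
    return numerator / denominator if denominator else 0.0


def dissimilarity_matrix(samples):
    """Full symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    return [[bray_curtis(a, b) for b in samples] for a in samples]


def dunn_index(dist, labels):
    """Smallest between-cluster distance divided by largest within-cluster distance."""
    min_between = float("inf")
    max_within = 0.0
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                max_within = max(max_within, dist[i][j])
            else:
                min_between = min(min_between, dist[i][j])
    # Assumption: clusters with zero diameter give an infinite index.
    return float("inf") if max_within == 0 else min_between / max_within
```

With four hypothetical photo frames `[[10, 0, 0], [8, 2, 0], [0, 0, 10], [0, 1, 9]]` split into clusters `[0, 0, 1, 1]`, the within-cluster dissimilarities are small and the between-cluster ones close to 1, so Dunn's index is high. Repeating the calculation for different numbers of clusters and keeping the maximum mirrors the selection criterion described above.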

Environmental predictors

Bathymetry and acoustic backscatter data were collected at each site between 0 and 300 m depth, using a Teledyne Reson Extended range SeaBat T50-P multibeam echosounder and the Teledyne PDS acquisition software. Qimera (QPS) software was used to process and grid the bathymetric data to 2 × 2 m resolution. Backscatter data were processed using the FMGeocoderToolbox (Fledermaus™ software, QPS). Despite post-processing, bathymetry contained some small artefacts at Desroches and Poivre, and backscatter data were oversaturated (artificially inflated backscatter intensity) in areas beyond 150 m depth at all sites. As oversaturation was only a problem at these greater depths, backscatter was nevertheless included as a predictor variable.

Ecologically relevant terrain derivatives were extracted in ArcMap 10.6 using Benthic Terrain Modeller (BTM) (Walbridge et al. 2018) and in R using the raster (Hijmans 2017) and spatialEcology (Evans 2017) packages. Terrain derivatives were chosen to represent four broad geomorphological classes: slope, orientation, curvature and terrain variability (Wilson et al. 2007, Table 1). Parameters describing substrate type and geographic position were also quantified (Table 1). Although correlation between assemblage occurrence and sampling depth is expected as a result of the sampling design, depth was included as a continuous predictor to enable meaningful spatial predictions outside of sampling depths. Where variables exhibited high collinearity (Pearson correlation >|0.7|), the variable allowing for more intuitive ecological interpretation was retained (Dormann et al. 2013). The substrate classes identified in the annotation process (soft, mixed, hard) were used as ground-truthing for a random forest classification (Breiman 2001) using the non-collinear predictors excluding backscatter.
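The collinearity screening can be sketched as follows. This is a minimal Python illustration under stated assumptions: predictors are supplied in order of preference (most ecologically interpretable first), which stands in for the authors' judgement-based choice, and a variable is kept only if its absolute Pearson correlation with every already-retained variable does not exceed 0.7. The function names and the example values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumption: neither sequence is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_collinear(predictors, threshold=0.7):
    """Keep predictors, in dict order, whose |r| with all kept ones is <= threshold."""
    kept = []
    for name, values in predictors.items():
        if all(abs(pearson(values, predictors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

Because the dictionary order encodes the preference, moving a variable earlier in the dictionary is how one would express that it allows a more intuitive ecological interpretation.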

Table 1 Predictor variables explored and retained to quantify seascape structure. The table indicates their ecological relevance, method of extraction and spatial scales retained for analysis

To investigate the influence of spatial scale on the relationships between terrain morphology and assemblage structure, terrain derivatives were extracted from bathymetry resampled to three spatial resolutions (2 m, 10 m, 25 m). The highest gridding resolution of bathymetry data determined the finest spatial scale (2 m). The appropriate broad scale was determined using Dragut's Estimation of Scale Parameter (ESP) Tool (Drǎguţ et al. 2010), where a decline in Rate of Change of Local Variance (ROC-LV) curves was observed at a resolution of 25 m. Derivatives were also calculated at 10 m resolution to account for intermediate-scale effects and following resolutions used in other mesophotic reef studies (Costa et al. 2015; Sterne et al. 2020). Although variables at multiple scales exhibited collinearity, these were retained to account for possible multiscale effects of predictors on assemblage occurrence and distribution. Machine learning algorithms used in this study are less sensitive to collinearity and can be parameterized to avoid overfitting while dealing with several variables and interactions (Costa et al. 2015).
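As an illustration of how a terrain derivative changes with scale, the sketch below aggregates a 2 m bathymetry grid to 10 m by block averaging and computes slope from central differences. It is a simplified stand-in for the ArcMap and R tools named above, not their algorithms: the function names are hypothetical, only integer aggregation factors are handled (so the 25 m layer, which is not an integer multiple of 2 m, would need proper resampling), and slope is computed for interior cells only.

```python
import math


def aggregate(grid, factor):
    """Block-average a 2D grid (list of rows) by an integer factor."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [
                grid[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def slope_degrees(grid, r, c, cell_size):
    """Slope in degrees at interior cell (r, c) from central differences."""
    dz_dx = (grid[r][c + 1] - grid[r][c - 1]) / (2 * cell_size)
    dz_dy = (grid[r + 1][c] - grid[r - 1][c]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
```

On a uniformly dipping plane the slope is the same at 2 m and 10 m, whereas on rough terrain the coarser grid smooths out small features, which is one reason fine-scale and broad-scale predictors can behave differently in a model.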

Model selection, building and evaluation

Random forests (RF) (Breiman 2001) and boosted regression trees (BRT) (Friedman 2001) were used to generate spatial predictive models. RFs and BRTs are machine learning algorithms adept at modelling complex nonlinear responses, allowing interactions between predictor variables and effectively filtering out irrelevant predictors (Elith et al. 2008).

R packages randomForest (Liaw and Wiener 2002) for RF models and gbm (Ridgeway 2017) for BRT models were implemented through the package caretEnsemble (Deane-Mayer and Knowles 2016). The model building process was identical for each assemblage cluster and site. We created a presence-absence dataset for each assemblage, where observations assigned to that assemblage through clustering were scored as presences (Table 2). We subsequently extracted the values of predictor variables from the raster cell corresponding to the geographic x, y, z location of each photo frame. We optimized the number of trees and number of variables available for splitting at each node (mtry) in RF models, and the number of trees, interaction depth (complexity) and shrinkage (learning rate) for BRT models. The best model was selected using the smallest value for the mean absolute error (MAE). Our evaluation dataset was obtained by structured sub-sampling of the training data set during the tenfold cross-validation process. We conducted a performance assessment by running ten resamples of the produced model.
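The data preparation described above can be sketched in a few lines of Python. This is an illustration only, not the caretEnsemble workflow: `presence_absence` and `kfold_indices` are hypothetical helper names, absences are represented here by `None`, and the folds are plain random splits, which may differ from the structured sub-sampling used in the study.

```python
import random


def presence_absence(cluster_labels, assemblage):
    """Score frames assigned to the given assemblage as 1 and all others as 0."""
    return [1 if label == assemblage else 0 for label in cluster_labels]


def kfold_indices(n, k=10, seed=0):
    """Return k (train_indices, test_indices) pairs covering range(n).

    Assumption: simple random folds; no stratification by presence/absence.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    all_indices = set(indices)
    return [(sorted(all_indices - set(fold)), sorted(fold)) for fold in folds]
```

Each assemblage gets its own binary response vector, and every observation appears in exactly one test fold, so each model prediction used for evaluation comes from a model that did not see that observation during fitting.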

Table 2 Taxa significantly (p < 0.05) associated with each cluster per site, ordered by indicator value. Taxa are described in Fassbender et al. (2021)

For each model, we evaluated: (1) model performance in (a) discrimination capacity, using the area under the curve of the receiver operating characteristic (AUC-ROC), where a value of 0.9–1 indicates excellent performance, a value of 0.7–0.9 good performance and a value of 0.5 a performance no better than that expected by chance; (b) reliability, using the mean absolute error (MAE), which ranges from 0 to 1, with 0 corresponding to a model that is a perfect output predictor; (c) spatial autocorrelation in model residuals; and (d) visual accuracy; (2) the contribution of predictors to each model using variable importance analysis; (3) habitat preferences for each assemblage with partial dependence plots (Milborrow 2020) and density plots (Bowman and Azzalini 2018); and (4) predicted assemblage occurrence and distribution over the entire site.
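The two headline performance metrics can be written out directly. The sketch below is a plain Python illustration rather than the R implementation used in the study: AUC-ROC is computed through its rank interpretation (the probability that a randomly chosen presence receives a higher predicted probability than a randomly chosen absence, with ties counted as one half), and MAE is the mean absolute difference between observed presence/absence and predicted probability. Function names are hypothetical.

```python
def auc_roc(y_true, scores):
    """AUC-ROC as the fraction of presence/absence pairs ranked correctly."""
    presences = [s for y, s in zip(y_true, scores) if y == 1]
    absences = [s for y, s in zip(y_true, scores) if y == 0]
    if not presences or not absences:
        raise ValueError("AUC-ROC needs at least one presence and one absence")
    wins = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presences) * len(absences))


def mean_absolute_error(y_true, scores):
    """Mean absolute difference between observations (0/1) and predictions."""
    return sum(abs(y - s) for y, s in zip(y_true, scores)) / len(y_true)
```

Under this definition a model that ranks every presence above every absence scores 1, and a model that ranks them at random scores about 0.5, which matches the interpretation thresholds given above.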

Results

Cluster composition

Cluster analysis resulted in four assemblage clusters for each site (Table 2, Fig. 2, Online Resource 1–1), which, despite some overlap between assemblages 1–3 (Fig. 3), all differed significantly in composition (PERMANOVA p < 0.001, Online Resource 1–2). Indicator species analysis performed on these assemblage clusters demonstrated that, although the exact taxonomic composition of each assemblage cluster differed per site, the clusters corresponded to similar biological zones linked to sampling depth (Table 2, Online Resource 1–3). Assemblage 1 (30–60 m, Fig. 2a) contained scleractinians and zooxanthellate octocorals, macroalgae and sponges. Assemblage 2 (60 m, Fig. 2b) was dominated by azooxanthellate octocorals with diverse morphologies, scleractinians showing depth adaptations (e.g. dominance of plating morphologies that enhance monodirectional light absorption) and some antipatharians. Assemblage 3 (120 m, Fig. 2c) consisted mainly of azooxanthellate octocoral fans and encrusting organisms, most notably crustose coralline algae (CCA) and Porifera. Reef-building species were absent from assemblage 4 (250 m, Fig. 2d), which consisted of sea urchins and sea stars and occasionally hydrocorals (Stylasteridae). Indicator species analysis also revealed generalist taxa that occurred across multiple assemblage clusters, most notably Demospongiae sp. 2, Hydrozoa spp. and Corallinales spp.; however, these are coarse taxonomic identifications and hence include several species (Fassbender et al. 2021).

Fig. 2 Images a, b, c and d show representative examples of assemblage 1 (30–60 m), 2 (60 m), 3 (120 m) and 4 (250 m), respectively, at Aldabra

Fig. 3 nMDS plots visualizing the dissimilarity between observations from different assemblages

Model performance

All models showed good to excellent performance, with some site- and assemblage-specific variation (Online Resource 2). With AUC-ROC values between 0.86 and 0.99 for RF (mean 0.95, sd 0.021) and between 0.86 and 0.99 for BRT models (mean 0.94, sd 0.024), discrimination capacity was good to excellent for all models (Fig. 4a, c) and comparable between algorithms. Discrimination capacity was higher for assemblage 3 than for other assemblages, with some site-specific variation without a clear pattern. Like discrimination capacity, model reliability (Fig. 4b, d) was good to excellent, with mean MAE ranging from 1.3% to 18.4% for RF (mean 9.2%, sd 1.8%) and from 1.6% to 18.2% for BRT models (mean 9.8%, sd 1.4%), and was comparable between algorithms. Model reliability was again high for assemblage 3 (mean MAE < 6%) and reduced for assemblages 1 and 2 (mean MAE up to 18%), with some site-specific variation. Moran's I indicated no significant residual spatial autocorrelation in BRT or RF models for assemblages 1, 2 and 3 (p > 0.05), but significant autocorrelation for assemblage 4 (p < 0.05) (Online Resource 2).
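For readers unfamiliar with the statistic, Moran's I for a set of model residuals can be sketched as below. This Python illustration assumes a user-supplied spatial weights matrix; the weighting scheme used in the study is not stated here, the function name is hypothetical, and the significance test (typically based on permutation or a normal approximation) is not shown.

```python
def morans_i(values, weights):
    """Moran's I for values at n locations with an n x n spatial weights matrix.

    I = (n / W) * sum_ij w_ij * d_i * d_j / sum_i d_i^2, where d_i is the
    deviation from the mean and W is the sum of all off-diagonal weights.
    Assumption: values are not all identical (non-zero variance).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    weight_sum = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    variance_term = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_term)
```

Positive values indicate that neighbouring residuals tend to be similar, values near zero indicate no spatial structure, and negative values indicate that neighbours tend to differ.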

Fig. 4 Performance statistics per model for assemblage 1–4. Discrimination capacity was measured as area under the curve for the receiver operating characteristic (AUC-ROC) for both RF (a) and BRT (c). Accuracy was measured as mean absolute error for RF (b) and BRT (d). Standard errors were calculated as a function of 10 model iterations

Predictor contributions to deeper reef distribution models

Overall, the main determinants of assemblage occurrence (measured by variable importance) were depth, distance to shore, arc-chord (AC) rugosity, curvature, slope and backscatter. Predictor variables calculated at 25 m and 10 m resolutions contributed more to the model than predictors calculated at the 2 m resolution, but were correlated. Primary predictors were consistent between model algorithms and geographical location. Variable importances showed that RF (Fig. 5a–d) models were often characterized by smaller contributions of multiple predictors extracted at different scales, whereas the relative contribution of individual predictors was greater for BRT models (Fig. 5e–h) which filtered out predictors with low model contributions. The relative importance of these main predictors also varied with assemblage type.

Fig. 5 Relative contribution of the ten main predictors per assemblage across sites, assessed for random forests (a–d) and boosted regression trees (e–h). Variable importance was calculated as the mean decrease in node impurity and standardized as a percentage

The most important predictors of assemblage 1 were depth, distance to shore and terrain complexity (AC rugosity). Predictors associated with assemblage 2 also included depth, distance to shore and terrain complexity (AC rugosity), with slope and curvature important in BRT models. A smaller set of predictors contributed to the models of assemblage 3 and included curvature, terrain complexity (AC rugosity) and slope. Finally, depth, distance to shore and backscatter, which was oversaturated at these depths, contributed most to models of assemblage 4.

Predicted MCE habitat suitability

Deeper reef assemblage habitat preferences were derived from partial dependence plots (Online Resource 3) and density plots (Fig. 6, Online Resource 4). Partial dependence plots revealed that many key predictor variables had a non-linear relationship with model predictions. They also showed that key predictors (apart from depth and slope) exhibited variable multiscale effects on model predictions, with the effects of predictors extracted at fine-scale (2 m) often different from the effects of broad-scale (10 m and 25 m) predictors. Density plots were constructed using six broad-scale predictors that consistently contributed most across assemblages and models and revealed that assemblages were segregated by depth and distance to shore but mediated by terrain drivers important for individual assemblages.

Probability maps produced with the best performing models showed potential spatial distributions of each assemblage per site (Fig. 7 and Online Resource 5) and linked potential habitat characteristics identified using the partial dependence plots and density plots to geomorphological features.

Assemblages 1 and 2 were distributed over the insular shelf and exhibited high probabilities on elevated complex terrain such as reef crests (Aldabra, Desroches), rocky ridges (Poivre) or terraces (St. Joseph). High probability densities of assemblage 1 were found closest to shore on either flat to gently sloping areas of low rugosity (AC rug < 1.01) on a mixed substrate (St. Joseph, Poivre, Fig. 6) or rugose (AC rug > 1.02) hard terrain such as ridges (Aldabra, Desroches). Across sites, high probability densities of assemblage 2 were linked to water depths extending to 75 m on slightly sloping (slope < 30°) or slightly curved terrain of hard to mixed substrate types. Assemblage 3 was environmentally distinct from other assemblages, with high probability densities on steeply sloping (slope > 40°), highly positively (St Joseph, Poivre, Aldabra) or negatively (Desroches) curved and rugose (AC rug > 1.02) terrain beyond depths of 100 m. These conditions correspond to highly complex and sloping rocky ridges, including paleo-shorelines. Probability distributions for assemblage 4 were located on insular slopes, separated from those for assemblages 1, 2 and 3, and were generally higher in low-rugosity areas beyond 200 m furthest from shore on soft to mixed substrate (substrate class = 1 or substrate class = 2) consisting of sand or bedrock covered with sand and related to high backscatter intensities. However, the occurrence of assemblage 4 was not environmentally distinct from biological absences, as high probabilities of assemblage absences were found in environmental conditions similar to those of assemblage 4 occurrences. Regardless of site and model type used, spatial predictions of assemblage 4 were unreliable and strongly linked to oversaturated backscatter data beyond 150 m depth. This was particularly pronounced at Poivre and Desroches, where assemblage 4 was predicted to occur in segregated sections that correspond to inflated values of backscatter intensity.

Fig. 6 Density plots of assemblage response to the consistently highest contributing predictors (depth, distance to shore, rugosity, curvature, slope and backscatter) extracted at the broadest resolution (25 m) at St Joseph. Density plots for Aldabra, Poivre and Desroches can be found in Online Resource 4

RF and BRT models showed similar results in predicted assemblage locations but differed in the predicted extent and probability level. Probability distributions from BRT models appeared more sensitive to data artefacts or input data uncertainty. This was demonstrated in the probability maps of assemblage 1 at Desroches and Poivre (Online Resource 5), where bathymetric data shallower than 70 m suffered from some artefacts.

Discussion

Mesophotic and rariphotic reef ecosystems are increasingly recognized as important biodiversity hotspots in the global tropics, but major knowledge gaps remain on the environmental factors that interact to influence assemblage structure, occurrence and resulting geographical distributions (Turner et al. 2019). By using an assemblage-level analysis, this study showed that deeper reefs consist of depth-separated assemblages that exhibit distinct responses to terrain variables.

Drivers and distribution of mesophotic assemblages

Across MCE assemblages, predicted distributions were strongly segregated by depth and distance to shore, both proxies of decreasing temperature and light penetration (Kleypas et al. 1999) that strongly influence MCE occurrence (Turner et al. 2017; Tamir et al. 2019). Moreover, substrate characteristics also drive and limit deeper reef occurrence: hard substrates provide settlement opportunities and may support more biodiverse deeper reef assemblages than mixed or soft substrata (Sherman et al. 2019). Assemblage–environment relationships for each individual assemblage appear related to their functional characteristics. Assemblage 1, composed of obligate zooxanthellate reef-building taxa (Table 2), is found in well-lit reefs between 30 and 60 m, which tend to be closest to island coastlines. Its occurrence was linked to complex terrain and hard substrate, but these values may be amplified as biogenic reef species found in assemblage 1 further contribute to the availability of complex terrain and hard substrate. Assemblage 2 contained both zooxanthellate and azooxanthellate species and is therefore in part associated with depths where sufficient light penetrates to enable photosynthesis. Assemblage 2 was also associated with slope and curvature which mediate hydrodynamic properties of water flow past the reef system, influencing food availability for heterotrophic suspension-feeding organisms (Locker et al. 2010; Bridge et al. 2011b; Sterne et al. 2020). Increased slope values may also shield taxa characterized by flattened morphologies found in assemblage 2 from heavy sedimentation (Ohlhorst and Liddell 1988; Kahng et al. 2010). Assemblage 3 consisted of encrusting morphotypes and azooxanthellate octocorals, and high habitat complexity strongly predicted its occurrence, mirroring results found in previous studies on mesophotic octocorals (Silva and MacDonald 2017). 
Structurally complex habitats may provide a variety of microhabitats that organisms can use for shelter and feeding, create additional settlement space for encrusting organisms and induce increased current flow that may bring food particles necessary for suspension-feeding octocorals (Pygas et al. 2020). The steep slopes and walls at ~ 120 m depth, where assemblage 3 was also found, direct downward sediment transport (Kahng et al. 2010). In this context, locally elevated features, measured by terrain curvature, might prevent communities from being covered by sediment falls (Sherman et al. 2019). Assemblage 4 was found beyond 200 m depth, but it proved challenging to define its habitat conditions. Megabenthos characterizing this assemblage occurs on patches of hard substrate interspersed among soft substrate, and the resolution at which spatial predictors were extracted likely did not capture substrate variation at this fine scale. Additionally, only limited observations of this assemblage were available.

Vertical connectivity between shallow and deeper reefs has been an important focus of recent studies and is essential to assess the potential refugia and reseeding function of MCEs (Holstein et al. 2016). With statistically distinct assemblages, this study provides evidence of limited vertical connectivity between the surveyed deeper reefs. However, the nMDS plots and habitat suitability maps revealed that the mesophotic assemblages (30 m, 60 m and 120 m) had some overlap and were distinct from rariphotic systems. These overlaps are driven to some extent by depth-generalist species found across multiple depths (Table 2). The limited structural connectivity between MCEs and rariphotic reefs is consistent with results from other regions showing that MCEs are biologically unique systems (Stefanoudis et al. 2019). Further research incorporating high-taxonomic resolution data along the depth gradient is necessary for reliable inferences on actual connectivity.

Geomorphology as a proxy for WIO deeper reef occurrence

Geomorphology is a key driver of hard bottom habitat and coral reef systems across all depths (Goreau and Goreau 1973; Yesson et al. 2012). Studies focusing on the geomorphological patterns driving deeper reef occurrence have primarily been conducted in regions other than the WIO and primarily focus on MCEs rather than rariphotic ecosystems. As our findings on drivers of rariphotic systems were less reliable, here we likewise focus on MCEs. Research in US waters, including the US Caribbean, Hawaii and the Gulf of Mexico, demonstrated that geomorphological structures resulting from past sea-level change around islands drive mesophotic reef colonization (Locker et al. 2010). Seascape structures including carbonate mounds (Silva and MacDonald 2017), paleo-shorelines, escarpments and terraces (Locker et al. 2010) and shelf edges (Smith et al. 2016) are of demonstrated importance for MCE habitat. Effects of relic topographic structures have also been demonstrated on the Great Barrier Reef (GBR) and in the Pacific, where MCE assemblages have been linked to the presence of submerged reefs (Bridge et al. 2011b, 2012) and submerged banks (Linklater et al. 2016), and in the Red Sea, where submarine terraces support biodiverse MCEs (Weinstein et al. 2020). Likewise, a first study in the WIO indicates that steep slopes and submarine walls may function as a priority habitat for MCEs (Osuka et al. 2021).

Our findings confirm that in the WIO, geomorphology exerts control on the occurrence and distribution of mesophotic reef assemblages. High MCE occurrence probabilities were linked to relief features at all sites that provide topographically complex habitat and hard substrates for colonization at suitable depths. This indicates that topographic structures formed through historical sea-level patterns function as a proxy of suitable topographic MCE habitat. Importantly, paleo-shorelines at 120 m depth at the investigated locations, likely formed during the Last Glacial Maximum about 20,000 years ago (Clark and Mix 2002), were associated with high probability occurrences of assemblage 3. Submarine terraces along the insular shelf of St Joseph and Aldabra were also associated with high occurrence probabilities of assemblages 2 and 3, reflecting a second MCE proxy feature previously reported in the literature (Sherman et al. 2010; Weinstein et al. 2020). MCE assemblages occurring on shelves (30–60 m) at all sites were linked to raised relief features (Fig. 7, Online Resource 5). For example, the insular shelf at Aldabra contained elevated ridges extending along the slope, which provided hard substrate associated with higher probability occurrences of MCE habitat. The consistency between sites indicates that identified assemblage–environment relationships might hold for other coralline atolls in the region.

Fig. 7 Spatial predictions for deeper reef assemblages at St Joseph produced using random forests (a–d) and boosted regression trees (e–h) with depth contours at 50 m intervals. Predictive maps for other sites (Aldabra, Desroches and Poivre) are available in Online Resource 5

Methodological considerations

Due to the challenges associated with extensive sampling in environments beyond SCUBA depth, our sampling scheme was spatially limited and, because of choices made in expedition sampling strategy, depth segregated. Future surveys could use a revised protocol to provide more information on species turnover and environmental conditions between our sampling depths, for example, through transects along the depth gradient. Data collection practices influenced the quality of the predictor variables. Multibeam data were acquired on a small tender, which was sensitive to motion artefacts. These artefacts were less pronounced in 25 m- and 10 m-resolution data layers than in 2 m-resolution data layers, which, in addition to the importance of the wider terrain, may contribute to their larger variable importance. Apart from the terrain variables considered in this study, other environmental factors not captured or captured only as proxies also influence habitable space. These include biological processes such as competition, predation and herbivory (Kahng et al. 2010) and oceanographic parameters such as water chemistry, upwelling, wave and current action. In the Indian Ocean, the monsoon regime has a strong influence on reef development and recruitment through these factors (Gischler et al. 2014). Future studies could benefit from incorporating biotic and oceanographic parameters in similar analyses.

The choice of model building and evaluation method is important as different algorithms fit different relationships and tolerance constraints between biological and environmental data. This study confirmed that random forests and boosted regression trees are suitable tools to quantify the effect of multiscale interacting environmental drivers on the occurrence and distribution of deeper reefs. However, although high model accuracy and consistent results were obtained, caution is required when these predictive maps are to be generalized for marine management applications as presented maps do not take the spatial distribution of model errors and uncertainty into account (Lecours 2017). RF and BRT spatial predictions were consistent in extent but differed in intensity. Upon visual inspection, BRT models appeared less reliable than RF models. This may be linked to the boosting characteristic of BRTs, as the sensitivity and effect of data artefacts can be additive over many model iterations (Sexton and Laake 2007). Results for assemblage 4 also show that, despite high traditional model performance metrics, spatial autocorrelation in model residuals and the low number of observations may have underestimated model errors and resulted in erroneous relationships, such as the unreliable relationship with backscatter data. Finally, our study highlighted the importance of considering scale effects by showing that the spatial resolution at which a predictor is calculated influences model responses and variable importance.

Conservation implications

There are few specific conservation efforts targeting mesophotic and rariphotic reefs globally, yet these reefs are of value to conservation and meet the criteria to be classified as Ecologically or Biologically Significant Marine Areas (EBSAs) under the Convention on Biological Diversity (CBD) (Soares et al. 2020). Several key knowledge gaps limit evidence-based conservation and management of deeper reefs, including information on assemblage composition, drivers of occurrence and distribution and vertical connectivity (Turner et al. 2019).

Our results contribute to filling some of those knowledge gaps for deeper reef seascapes in the WIO and have been presented to Seychellois stakeholders involved in marine management and conservation. Deeper reefs are of particular interest for marine managers in Seychelles for their possible role as refuges for fish species of socio-cultural and economic value such as groupers and snappers (Tyler et al. 2009). Key takeaways from these conversations are that further application of predictive mapping approaches offers potential to maximize the information obtained from seascape structure and geographically dispersed mesophotic observations and to produce spatial predictions of suitable habitat in the WIO. Further, our results provide baseline information on the geomorphological factors that shape mesophotic reef distribution in the WIO and confirm findings from other regions and anecdotal observations that complex geomorphological structures such as paleo-shorelines and terraces may act as proxies for mesophotic assemblages. The consistency between the results obtained per site, despite differing geomorphologies, suggests robustness of results and indicates that similar relationships may exist across other Seychelles Outer Islands where detailed biological survey data may be unavailable. This is important for a large ocean nation like Seychelles, where country size and funding availability limit extensive biological surveys to locate deeper reefs throughout the territory. Our findings provide a starting point for marine managers to investigate possible target locations for further studies and to assess whether these structures also form priority habitat or refuge for vulnerable or commercially important fish species within Seychelles. Establishing the importance of deeper reef habitats for these species, or for the species that they feed on, is a key step towards putting these habitats under consideration for a measure of protection.

Although obtaining quantitative data from these often remote systems remains a critical bottleneck in informing conservation and management, we expect data on these systems to become increasingly available as the costs of robotic tools such as AUVs decrease and interest in MCEs grows (Armstrong et al. 2019; Osuka et al. 2021). Combined with predictive mapping approaches, these tools can provide important baseline information on the character and potential distribution of unexplored MCEs.